Martin Jones, Devan Dubnyk, and Making Reasonable Assumptions About the Population

A fun case study on somewhere I went wrong

Martin Jones and Devan Dubnyk celebrating a rare win.

If you follow me on Twitter, you probably know how I feel about goaltending: It’s basically random. A quick search finds 3 instances of me Tweeting the exact string “Goalies are basically random” and you can probably find another 20 or so where I’ve shared that sentiment. This isn’t a belief I’ve always held as a hockey fan; I once freely used terms like “elite goalie,” and believed you could project future goaltending performance with a solid degree of confidence. But the overwhelming body of evidence I’ve seen has led me to change my position.

To give a quick example of this evidence: From 2013–2014 through 2019–2020, the R² between Goals Saved Above Expectation (GSAx) in year one and year two, according to my expected goal model, is a whopping 0.016. In other words, aggregate goaltending performance in year one explains a little under 2% of the variance in aggregate goaltending performance in year two. The chart below should provide an idea of why I feel comfortable making such a statement.

Fenwick save percentage above expectation, a per-shot metric, does significantly better with an R² of 0.046, but that’s still very low.
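For anyone who wants to see what that R² is measuring, here is a minimal sketch of the calculation. It assumes one GSAx value per goalie per season, lined up so the same index is the same goalie. The function name and the five sample values are made up for illustration; they are not output from my model.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Co-movement of the two seasons around their means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return (cov * cov) / (var_x * var_y)


# Hypothetical GSAx values for five goalies (placeholders, not real data).
year_one = [12.0, -5.0, 3.5, -20.0, 8.0]
year_two = [-4.0, 6.0, -10.0, 2.0, 1.5]

print(round(r_squared(year_one, year_two), 3))
```

An R² of 1.0 would mean year one perfectly predicts year two on a straight line; a value near zero means year one tells you almost nothing about year two.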

I won’t get into other models or more rudimentary metrics like raw save percentage, but I can promise you they’re all quite similar; JFresh’s studies done using data from other models would concur. If a statistic that can effectively predict future goaltending performance does exist, it’s certainly not available to the public.

I never expressed it, but my views on goaltending provided me with a good deal of relief when San Jose acquired Devan Dubnyk in October and chose to enter the 2021 season with a goaltending tandem of him and Martin Jones.

I knew the results hadn’t been good. From 2018–2019 through 2019–2020 — the last two seasons heading into 2021 — Jones and Dubnyk had ranked 2nd and 3rd, respectively, from dead last in GSAx with marks of -31.43 and -29.45. My 2021 projections had the Sharks ranked 30th in goaltending behind Chicago, who had only Malcolm Subban and an unknown backup. I’ve got the receipts right here; I totally said their goalies would be bad:

But every time I made mention of it, I still hedged my bet on my belief that goalies are basically random. I theorized that if each Sharks goalie independently had a 70% chance of being below average (a completely unscientific number that just felt right to me), that meant there was a 49% (0.7 × 0.7) chance that both would be bad, and thus a 51% chance that at least one would be good. I didn’t seriously get my hopes up over this, as I didn’t feel the team was a legitimate contender even if they had good goaltending, but I was not at all convinced that the poor decision to enter the season with this tandem would hurt them as badly as most fans thought.
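Spelled out, that back-of-the-napkin math looks like the sketch below. The 70% figure is the unscientific guess from the paragraph above, and the whole calculation leans on the two goalies being independent of each other. The function name is hypothetical.

```python
def chance_at_least_one_not_bad(p_bad: float, n_goalies: int = 2) -> float:
    """Chance that at least one of n independent goalies avoids a bad season."""
    # Independence lets us multiply the individual probabilities together.
    p_all_bad = p_bad ** n_goalies
    return 1.0 - p_all_bad


p_bad = 0.70  # assumed chance that a single goalie is below average

print(f"both bad: {p_bad ** 2:.0%}")  # both bad: 49%
print(f"at least one not bad: {chance_at_least_one_not_bad(p_bad):.0%}")  # at least one not bad: 51%
```

If the two outcomes are not actually independent, or if 70% is too generous, the 51% figure no longer holds.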

Enter 2021: Martin Jones’s 3rd straight season with a save percentage of exactly .896 and Devan Dubnyk’s 2nd straight season with a save percentage below .900. Per my model, Jones ranks 6th from last in goals saved above expectation with -11.35 and Dubnyk ranks 12th from last with -7.66. Last night, Martin Jones was pulled for the 7th time in just 30 games (per @Sheng_Peng), which ties a career high he previously set in 60 games. The aggregate goals saved above expectation aren’t as deep into the negatives as the numbers from the prior two seasons, but that’s only because the sample size is smaller; their delta Fenwick save percentages are both worse, meaning they’ve actually declined on a per-shot basis.

Now, I have to preface this by saying that both goalies still have a few games left to play, and that goalies are random and weird enough that either one of these below-replacement goalies could suddenly string together a few good games. But by all accounts, it looks like Devan Dubnyk and Martin Jones are both in the midst of completing another horrible season. It’s the 3rd year in a row that San Jose’s got this kind of goaltending. Aaron Dell’s slightly above average performance as their backup in 2019–2020 is the closest thing they’ve had to decent goaltending in the past 3 years.

Why does this keep happening to the Sharks? Have they just been really unlucky to receive three straight years of awful performances from the most volatile and random position in hockey, yet arguably the most important one? Are they cursed?

Nah. They’re definitely not cursed. And they’re not even unlucky. They’re just poorly managed. They make bad decisions. And not only do they make bad decisions, but they make such bad decisions about goaltending that some of the perfectly reasonable assumptions I make about the population no longer apply to them.

Reasonable Assumptions

When I looked at the correlation between goaltending performance in year one and year two by my models, I made an assumption that I don’t think anybody reading this is aware of. In fact, I wasn’t even aware of it until I spent some time reflecting on how something so predictable happened in a place where the only thing I’ve come to expect is the unexpected.

The assumption I made was that NHL teams are at least somewhat efficient at evaluating goaltenders, and that they try to go into each season with the two best goaltenders they can acquire. For the vast majority of the population, this is absolutely true. And it means that often when a goalie has a downright terrible season like Jimmy Howard did in 2019–2020, we just don’t see them next year. If we did, we’d probably see a lot more horrible goalies repeating their performances from year to year. This means that when I ran those regressions, I wasn’t actually comparing year one and year two performance across all goalies, but rather, I was comparing year one and year two performance across all goalies who were good enough in year one to stick in the league in year two.

This is a sampling bias (survivorship bias, specifically) that significantly shrinks the margin between the best and worst goalies in each season, and thus makes these metrics look more volatile from year to year. It’s not that goaltending really is basically random; it’s that the gaps within the small set of NHL goalies who survive two seasons are small enough, and their performances volatile enough, that the best and worst of the bunch can trade positions at the top and bottom of these lists from year to year, which makes their rankings among their peers basically random.
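To see how this kind of filtering alone can drag down a year-over-year R², here is a toy simulation. It is a sketch under loudly stated assumptions, not a model of the actual NHL: every goalie gets a fixed "true talent," each season's result is that talent plus random noise, and teams bring back only the top 60% of goalies by year-one results. All names and parameter values are hypothetical.

```python
import random


def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return (cov * cov) / (var_x * var_y)


def simulate(n_goalies=20000, talent_sd=10.0, noise_sd=10.0,
             keep_fraction=0.6, seed=1):
    """Return (R² across all goalies, R² across year-one survivors only)."""
    rng = random.Random(seed)
    talents = [rng.gauss(0.0, talent_sd) for _ in range(n_goalies)]
    # Each season is the same underlying talent plus fresh noise.
    year_one = [t + rng.gauss(0.0, noise_sd) for t in talents]
    year_two = [t + rng.gauss(0.0, noise_sd) for t in talents]
    r2_all = r_squared(year_one, year_two)

    # Assumed selection rule: only the best keep_fraction of year-one
    # performers get a year two; everyone below the cutoff is dropped.
    cutoff = sorted(year_one)[int(n_goalies * (1 - keep_fraction))]
    kept = [(a, b) for a, b in zip(year_one, year_two) if a >= cutoff]
    kept_one, kept_two = zip(*kept)
    return r2_all, r_squared(kept_one, kept_two)


r2_all, r2_survivors = simulate()
print(f"R² with every goalie included: {r2_all:.3f}")
print(f"R² with survivors only:        {r2_survivors:.3f}")
```

In this toy setup nothing about the goalies changes between the two calculations. The only difference is who is allowed into the sample, and that alone is enough to make year one look like a weaker predictor of year two.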

Goalies like Devan Dubnyk and Martin Jones often do not exist in the “year two” side of these calculations because they’re out of the league after such awful performances in year one. NHL organizations very rarely have the audacity to roll out a goaltending tandem like San Jose’s after the way each of them played in the past two seasons, especially not in a season where the organizational mandate from the top down is that making the playoffs is the bare minimum. If they did, we’d have a much clearer assortment of good and bad goaltenders. To give an example, somebody like John Gibson, whose numbers were elite prior to 2019–2020 and poor since then, would categorically be considered a good goaltender who just had two “less good” seasons, rather than the below average goalie he’s arguably become.

Because NHL organizations very rarely have the audacity to make such awful decisions, it’s okay to make these assumptions about the population. It’s okay to look at an established NHL goalie with solid results and note that the volatility of the position means he could very easily be one of the league’s best or worst full-timers going forward. If we didn’t make these assumptions, we’d be completely flabbergasted when Carter Hart and Thomas Greiss wind up being the league’s 2 worst goalies after stringing together multiple strong seasons. Instead, we can just laugh it off and say “goaltending is basically random.”

We make these assumptions because they’re almost always true. If we avoided making them, we’d actually be wrong a lot more frequently. But it’s important to be aware of these assumptions and acknowledge that they don’t always apply to edge cases like San Jose’s goaltending.