Sometimes you have a deadline that you just have to hit. The problem is that deadlines are hard to hit because people consistently underestimate how long it will take them to build something, especially software. To account for this, some people take their best estimate and double it to get the “real” time it will take. I’ve used this method myself with varying degrees of success, but I never felt very good about it. And what about Hofstadter’s Law?
Hofstadter’s Law: It always takes longer than you expect, even when you take into account Hofstadter’s Law.
Fortunately, there is a better way to make estimates, and it goes by the name of planning poker. When I first heard about planning poker I thought it was a gimmick. However, upon closer inspection, it turns out that the underpinnings of planning poker are well supported by science.
The reason planning poker results in more accurate estimates is that it reduces the effects of The Four Planning Pitfalls. Let’s take a look at what those are:
The Four Planning Pitfalls:
- People are terrible at estimating absolute numbers (hours)
- Larger tasks are more difficult to estimate than smaller tasks
- Team members are susceptible to anchoring
- Individuals are influenced by the planning fallacy
Here’s how planning poker helps mitigate the effects of The Four Planning Pitfalls:
- Estimates are in terms of the relative complexity between tasks
- Estimates are distributed on an exponential scale
- Everyone shows their estimates at the same time
- The final estimate is reached by consensus
Before we dig into the science, let me briefly describe the mechanics of planning poker so that we’re all on the same page. Each team member is given a set of cards with the following numbers on them: 0, ½, 1, 2, 3, 5, 8, 13, 20, 40, 100.
The team leader describes a work item (story) to the team and asks them to choose the card that best represents the complexity (size) of the task. Each team member places their card face down in front of them. Once everyone has placed a card on the table, the cards are turned over to reveal the choices. If the cards all show similar numbers, then you’ve successfully estimated the task and the process is complete. However, if there is a big difference between the highest and lowest card (like a 2 and an 8), then the people who selected those cards are given the floor to defend their choices. With this new information, the team repeats the estimation process until a consensus is reached. In practice, when the cards differ by only one step on the scale, let’s say a 3 and a 5, just pick one and move on.
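To make the reveal-and-discuss loop concrete, here is a minimal Python sketch of a single round. The function name `resolve_round`, the `votes` dictionary, and the choice to take the higher of two adjacent cards are all assumptions made for illustration; the process itself only says to pick one and move on.

```python
# The card deck described above.
DECK = [0, 0.5, 1, 2, 3, 5, 8, 13, 20, 40, 100]


def resolve_round(votes):
    """Return (estimate, outliers) for one round of revealed cards.

    votes maps a team member's name to the card they played.
    If the lowest and highest cards are at most one deck position apart,
    the round is settled (assumption: we settle on the higher card).
    Otherwise the estimate is None and outliers lists the people who
    played the lowest and highest cards, so they can explain their
    choices before the team votes again.
    """
    positions = {name: DECK.index(card) for name, card in votes.items()}
    low = min(positions.values())
    high = max(positions.values())
    if high - low <= 1:
        return DECK[high], []
    outliers = sorted(name for name, pos in positions.items() if pos in (low, high))
    return None, outliers


# A 3 and a 5 are neighbors on the deck, so the round is settled.
print(resolve_round({"ana": 3, "ben": 5, "cy": 5}))
# A 2 and an 8 are far apart, so ana and ben get the floor.
print(resolve_round({"ana": 2, "ben": 8, "cy": 5}))
```

Comparing positions in the deck rather than raw numbers is what makes a 3 and a 5 count as "close" while a 2 and an 8 do not.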
The reality of the world is that people are terrible at estimating how long a task will take to do. Luckily, while people are terrible at making absolute estimates, they are pretty good at choosing which task is bigger, task A or task B. To take full advantage of this fact, you should get your team to start thinking in relative terms instead of absolute hours. The common way to do this is to ask your team to give each task a relative complexity score called “points” instead of estimating how long an individual task will take to do.
Once your planning session gets rolling, it is easy to assign each task a certain number of “points” based on its complexity. However, before you get rolling, you’re going to have to arbitrarily give the very first task a certain number of points. As a rule of thumb, pick a medium-sized task and just assign it the value of 5 points. You can then use this first task as a baseline for figuring out the relative complexity of your next task, and so on. Ranking the tasks in terms of complexity will produce an ordered list that is much more true to real life than if the tasks had been ranked by hours.
Now, with all of the tasks ranked by complexity, you can figure out how long all the tasks will take to do by determining how long it takes your team to complete a “point”. While I won’t discuss the details here, you can figure it out pretty easily by tracking the total number of points your team is able to complete each week (which you can get from a velocity chart) and also how many unplanned points are added per week (which you can see on a burnup chart).
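As a rough sketch of that arithmetic, assuming you have a few weeks of history for both completed and unplanned points, the forecast could look like this in Python. The function name and the idea of subtracting average unplanned points from average velocity are illustrative assumptions, not a prescribed formula.

```python
import math
from statistics import mean


def weeks_to_finish(remaining_points, completed_per_week, unplanned_per_week):
    """Estimate how many whole weeks the remaining work will take.

    completed_per_week: points finished in each past week (velocity chart).
    unplanned_per_week: points added to scope in each past week (burnup chart).
    Returns None when scope grows at least as fast as the team delivers.
    """
    net_progress = mean(completed_per_week) - mean(unplanned_per_week)
    if net_progress <= 0:
        return None
    return math.ceil(remaining_points / net_progress)


# 20 points done per week on average, 5 unplanned points added per week,
# so the backlog shrinks by 15 points per week: 60 / 15 = 4 weeks.
print(weeks_to_finish(60, [18, 22, 20], [4, 6, 5]))
```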
As an aside, it is also worth pointing out that while it’s important to try to determine how long it will take your team to ship a new feature, it is also important to determine which tasks are the most complex, at least as far as efficiency is concerned. Teams are most efficient when they tackle tasks with the highest priority and lowest complexity first.
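One way to read "highest priority and lowest complexity first" is as a two-level sort: order by priority, then break ties with the smaller point value. The sketch below assumes each task is a `(name, priority, points)` tuple where priority 1 is the most urgent; both the tuple layout and the priority scale are hypothetical.

```python
def work_order(tasks):
    """Sort tasks by priority first (1 = most urgent), then by fewest points."""
    return sorted(tasks, key=lambda task: (task[1], task[2]))


backlog = [("export", 2, 3), ("login", 1, 8), ("signup", 1, 2)]
# signup and login share the top priority; signup is smaller, so it goes first.
print([name for name, _, _ in work_order(backlog)])
```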
How much does a car weigh? I know it weighs a lot, but I don’t know the exact number. I’d guess that a car weighs somewhere between 3,000 and 5,000 lbs. Now, how much does a semi-truck weigh? I’m not sure of that either, but I’d guess somewhere between 30,000 and 50,000 lbs.
As you can see from my example, my range of uncertainty was 2,000 lbs for the car and 20,000 lbs for the truck. Making estimates like this doesn’t feel weird to me at all. It seems natural to allow more leeway for the semi-truck. In general, when people make estimates they’re thinking “well it’s probably that amount ± 25%”. What they’re subconsciously doing is acknowledging that uncertainty grows with the size of their estimate.
To reflect this fact, your team should make their point estimates on an exponential scale. The scale that’s usually used for planning poker is a modified Fibonacci sequence, which takes into account that there is a wider range of error for larger guesses. The specific numbers are 0, ½, 1, 2, 3, 5, 8, 13, 20, 40, 100. To check if this scale makes sense, let’s use the same intuition we used above and see what ± 25% is for each of the numbers.
| Estimate | What you mean (-25% to +25%) |
|---|---|
| 0.5 | 0.375 to 0.625 |
| 1 | 0.75 to 1.25 |
| 2 | 1.5 to 2.5 |
| 3 | 2.25 to 3.75 |
| 5 | 3.75 to 6.25 |
| 8 | 6 to 10 |
| 13 | 9.75 to 16.25 |
| 20 | 15 to 25 |
| 40 | 30 to 50 |
| 100 | 75 to 125 |
Yup, looks pretty good. It makes sense that there is a real difference between 2 and 3, for example, but no meaningful difference between 12 and 13.
As an aside, I wouldn’t worry about the large numbers 40 and 100. They are really just included to indicate that a task is too large. In practice, I’ve never actually seen a task that was assigned 40 or 100. If a task is that large, then it should be broken down into smaller tasks that are easier for the team to accomplish in bite-sized chunks.
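The ± 25% check is easy to reproduce. Here is a small Python sketch that computes each range and tests whether two estimates are meaningfully different. The rule used in `distinguishable` (neither value falls inside the other's ± 25% range) is my own assumption for illustration, chosen because it matches the intuition above.

```python
DECK = [0.5, 1, 2, 3, 5, 8, 13, 20, 40, 100]


def tolerance_range(estimate, tolerance=0.25):
    """Return the (low, high) range implied by estimate +/- tolerance."""
    return estimate * (1 - tolerance), estimate * (1 + tolerance)


def distinguishable(a, b, tolerance=0.25):
    """True when neither value lies inside the other's tolerance range."""
    low_a, high_a = tolerance_range(a, tolerance)
    low_b, high_b = tolerance_range(b, tolerance)
    return not (low_a <= b <= high_a) and not (low_b <= a <= high_b)


for card in DECK:
    low, high = tolerance_range(card)
    print(f"{card:g}: {low:g} to {high:g}")

print(distinguishable(2, 3))    # True: 3 is outside 1.5 to 2.5, and 2 is outside 2.25 to 3.75
print(distinguishable(12, 13))  # False: 13 is inside 9 to 15
```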
Anchoring is a cognitive bias where people base their decisions too heavily on the first suggestion that’s made. Anchoring is fascinating because everyone is susceptible to it. Even if you are an expert software developer or PM, you’re still heavily influenced by the first suggestion you hear, even if that suggestion is clearly wrong. I highly encourage you to read the Wikipedia article on anchoring.
Fortunately, anchoring is easy to circumvent by making your choice in isolation. This is why planning poker has cards. Cards make it easy for your team members to hide their choices. There are obviously other ways to accomplish the same thing. For example, everyone could just write down their choice on a piece of paper and show it to the group at the same time. And for what it is worth, you don’t need to be in the same room to do it. You can just as easily do it over video chat or instant message. The important part is to eliminate anchoring by allowing everyone to make up their minds without being influenced by someone else’s choice.
Everyone is susceptible to the “planning fallacy”. The planning fallacy is the tendency for an individual to underestimate the complexity of a task that they are planning to do themselves. However, people don’t make the same mistake when asked to estimate how long it would take someone else to do the same task. The interesting thing about the planning fallacy is that experts are not immune from it, even when confronted with historical data showing that they had previously underestimated the time they needed to complete similar tasks.
To minimize the effects of the planning fallacy, it is important to get estimates from people who are not going to do the work themselves. A word of caution here: if your team is closely aligned, then they are still susceptible to the planning fallacy as a whole, meaning that they would give longer estimates if a different team were going to do the same work. So in reality, even when your team reaches a consensus, their estimates are still going to be too low. Over time, you’ll be able to track how much your team’s estimates differ from reality and you should be able to account for their optimism.
Planning poker is a proven technique for producing better estimates. I’d recommend giving it a try as it will most likely work for your team as well. If your team is too hip to do planning poker with the cards, then don’t use the cards. The important part is that you embrace a technique that avoids The Four Planning Pitfalls so you can make better estimates and ship your product on time.
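One simple way to account for that optimism, sketched here in Python, is to compare past forecasts with what actually happened and scale new forecasts by the same ratio. The `(forecast_weeks, actual_weeks)` history format and the single overall ratio are assumptions for illustration; your own tracking may call for something more refined.

```python
def optimism_factor(history):
    """Ratio of actual to forecast duration across past projects.

    history: list of (forecast_weeks, actual_weeks) pairs.
    A result above 1.0 means the team has tended to underestimate.
    """
    total_forecast = sum(forecast for forecast, _ in history)
    total_actual = sum(actual for _, actual in history)
    return total_actual / total_forecast


def adjusted_forecast(forecast_weeks, history):
    """Scale a new forecast by the team's historical optimism factor."""
    return forecast_weeks * optimism_factor(history)


past = [(4, 5), (6, 9), (10, 11)]  # 20 weeks forecast, 25 weeks actual
print(optimism_factor(past))       # 1.25
print(adjusted_forecast(8, past))  # 10.0
```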