Nov 29 2012

The Value of Paying Down Technical Debt

Our Engineering team has a great term, Technical Debt: the coding shortcuts and operational inefficiencies that accumulate over the years in the name of getting product out the door faster.  It weighs on the company’s code base the way debt weighs on a balance sheet.  Like debt, it’s there and you can live with it, but it is a drag on the health of the technology organization and has hard servicing costs.  Paying down technical debt is never fun: it takes time away from developing new products and features, and it is not really appreciated by anyone outside the engineering organization.

That last point is a mistake, and I can’t encourage CEOs or any leaders within a business strongly enough to view it the opposite way.  Debt may not be fun to pay off, but boy do you feel better after it’s done.  I attended an Engineering all-hands recently where one team presented its work for the past quarter.  For one of our more debt-laden features, this team quietly worked away at code revisions for a few months and drove down operational alerts by over 50%.  More important, it drove down application support costs by almost 90%, all at a time when usage probably doubled.  Wow.

I’m not sure how you can successfully scale a company rapidly without inefficiencies in technology.  But on the other side of this particular project, I’m not sure how you can afford NOT to work those inefficiencies out of your system as you grow.  Just as most Americans (political affiliation aside) are wringing their hands over the size and growth of our national debt now because they’re worried about the impact on future generations, engineering organizations of high-growth companies need to pay attention to their technical debt and keep it in check relative to the size of their business and code base.

And for CEOs, celebrate the payment of technical debt as if Congress did the unthinkable and put our country back on a sustainable fiscal path, one way or another!

As a long Post Script to this, I asked our CTO Andy and VP Engineering David what they thought of this post before I put it up.  David’s answer was very thoughtful and worth reprinting in full:

 I’d like to share a couple of additional insights as to how Andy and I manage Tech Debt in the org: we insist that it be intentional. What do I mean by “intentional”?

  • There is evidence that we should pay it
  • There is a payoff at the end

 What are examples of “evidence?”

  • Capacity plans show that we’ll run out of capacity for increased users/usage of a system in a quarter or two
  • Performance/stability trends are steadily (or rapidly) moving in the wrong direction
  • Alerts/warnings coming off of systems are steadily or rapidly increasing

 What are examples of “payoff?”

  • Increased system capacity
  • Improved performance/stability
  • Decreased support due to a reduction in alerts/warnings

 We ask the engineers to apply “engineering rigor” to show evidence and payoffs (i.e., measure, analyze, forecast).

 I bring this up because some engineers like to include “refactoring code” under the umbrella of Tech Debt solely because they don’t like the way the code is written even though there is no evidence that it’s running out of capacity, performance/stability is moving in the wrong direction, etc. This is a “job satisfaction” issue for some engineers. So, it’s important for morale reasons, and the Engineering Directors allocate _some_ time for engineers to do this type of refactoring.  But, it’s also important to help the engineer distinguish between “real” Tech Debt and refactoring for job satisfaction.
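
To make David’s “measure, analyze, forecast” a little more concrete, here is a minimal Python sketch of one kind of evidence he lists: a capacity plan showing when a system runs out of headroom.  It is an illustration only, not a description of the team’s actual tooling.  The function names, the weekly sampling, and the assumption that usage grows in a straight line are all hypothetical.

```python
def fit_line(values):
    """Least-squares straight line through (0, values[0]), (1, values[1]), ...

    Returns (slope, intercept). Needs at least two measurements.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two measurements to fit a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def weeks_until_capacity(weekly_peak_usage, capacity):
    """Estimate how many weeks remain before the usage trend reaches capacity.

    Assumes (hypothetically) one peak-usage measurement per week and linear
    growth. Returns None when usage is flat or falling, i.e. when the trend
    gives no evidence that capacity will run out.
    """
    slope, intercept = fit_line(weekly_peak_usage)
    if slope <= 0:
        return None
    crossing_week = (capacity - intercept) / slope
    latest_week = len(weekly_peak_usage) - 1
    return max(0.0, crossing_week - latest_week)


# Made-up numbers: peak usage grows by 10 units a week against a limit of 200.
print(weeks_until_capacity([100, 110, 120, 130], 200))  # 7.0
```

A result of “a quarter or two” from a forecast like this is the kind of evidence that would make paying the debt intentional; a result of `None` suggests the work is closer to refactoring for job satisfaction.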

Jun 28 2012

How Many Thermometers Do You Need to Know the Turkey’s Done?

Full credit to my colleague Jack Abbot for using this awesome phrase in an Engineering Management meeting I observed recently. It’s a gem. Filed!  The context was around spending extra cycles creating more metrics that basically measure the same thing. And in theory, sure, you don’t want or need to do that, even if you do have a cool data visualization tool that encourages metric proliferation.

But as I was thinking about it a bit more, I think there are situations where you might want multiple thermometers to tell you about the done-ness of the turkey.

First, sometimes you learn something by measuring the same thing in multiple ways.  Triangulation can be a beautiful thing.  It works for satellites, and it also works when one metric is really made up of multiple underlying metrics.  Net Promoter Score is a good example.  Aren’t you better off knowing the number of Promoters and Detractors as well as the Net?
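
The Net Promoter Score point is easy to see with a little arithmetic.  The sketch below uses the standard NPS convention (scores of 9 or 10 on a 0 to 10 scale are Promoters, 0 to 6 are Detractors, and the Net is the percentage of Promoters minus the percentage of Detractors).  The function name and both sets of survey responses are made up for illustration.

```python
def net_promoter_summary(scores):
    """Break a list of 0-10 survey scores into the pieces behind the NPS."""
    if not scores:
        raise ValueError("need at least one survey response")
    total = len(scores)
    promoters = sum(1 for s in scores if s >= 9)   # scores of 9 or 10
    detractors = sum(1 for s in scores if s <= 6)  # scores of 0 through 6
    return {
        "promoters_pct": 100 * promoters / total,
        "detractors_pct": 100 * detractors / total,
        "nps": 100 * (promoters - detractors) / total,
    }


# Two hypothetical surveys of ten customers each.
mostly_lukewarm = [10, 9, 9, 7, 7, 8, 8, 7, 8, 3]   # 3 promoters, 1 detractor
polarized = [10, 10, 9, 9, 9, 10, 2, 4, 5, 6]       # 6 promoters, 4 detractors

# Both surveys have an NPS of 20, but the underlying mix is very different.
print(net_promoter_summary(mostly_lukewarm))
print(net_promoter_summary(polarized))
```

The single Net number reads the same for both surveys; only the Promoter and Detractor thermometers show that one customer base is lukewarm and the other is split.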

Second, sometimes redundant metrics aren’t bad if there is a potential failure of one of them.  For critical systems metrics that are measured in automated ways, sometimes automation fails. The second thermometer could be thought of as a backup.  You can have an internal web performance monitoring system, but wouldn’t you feel better with Keynote or Gomez as well, just in case your internal system fails?

Finally, sometimes metrics move between “lagging” and “leading,” which are fundamentally different and useful for different purposes. For example, we talk about sales in a couple of different ways here.  There are bookings, which are forward-looking, and there is recognized revenue, which is backward-looking.  They are both about revenue.  But looking only at recognized revenue tells you nothing about the health of new business.  And looking only at bookings tells you very little about the current and next quarter.

Jack, thanks for this gem of a phrase, and for the thinking it provoked!