On 12/31/12 1:39 PM, Marcus G. Daniels wrote:
> Of course, there is rarely the time or incentive structure to do any of
> this. Productive programmers are the ones that get results and are fast
> at fixing (and creating) bugs. In critical systems, at least, that's
> the wrong incentive structure. In these situations, it's more important
> to reward people that create tests, create internal proofs, and refactor
> and simplify code. Having very dense code that requires investment to
> change is a good thing in these situations.
Programming for the broadcasting industry right now (where a few seconds
of downtime might cost millions of dollars), I especially liked your
point, Marcus. I live within this tension every day, as I imagine do, to
an even higher degree, aircraft software designers, medical system
designers, automotive software designers, and others in fields where
many lives are at risk from a bug. Certainly the more unit tests code
has, the more "dense" it might feel, and the more resistant to casual
change it can become, even as one may be ever more assured that the code
is probably doing what one expects most of the time. And the argument
goes that such denseness in terms of unit tests may actually give you
more confidence in refactoring. But I can't say I started out feeling or
programming that way.
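To make that last point concrete, here is a minimal sketch in Python. The function and its timecode format are hypothetical (loosely inspired by broadcast work, and ignoring complications like drop-frame timecode); the point is only that once a few tests pin down expected behavior, the implementation can be reworked with some confidence.

```python
import unittest

def frames_from_timecode(tc, fps=25):
    """Convert an "HH:MM:SS:FF" timecode string to a frame count.

    Hypothetical example; a real broadcast system would also need
    to handle drop-frame timecode and validate field ranges.
    """
    hours, minutes, seconds, frames = (int(part) for part in tc.split(":"))
    return ((hours * 60 + minutes) * 60 + seconds) * fps + frames

class FramesFromTimecodeTest(unittest.TestCase):
    # These tests express the intended behavior; any refactoring of
    # frames_from_timecode must keep them passing.
    def test_zero(self):
        self.assertEqual(frames_from_timecode("00:00:00:00"), 0)

    def test_one_second_at_25fps(self):
        self.assertEqual(frames_from_timecode("00:00:01:00"), 25)

    def test_one_hour(self):
        self.assertEqual(frames_from_timecode("01:00:00:00"), 90000)

if __name__ == "__main__":
    unittest.main()
```

The tests do make the code feel "denser" to change casually, but they are also exactly what lets you rewrite the arithmetic later without fear.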
The movie "The Seven Samurai" begins with the villagers having a big
conceptual problem. How do the agriculturalists know how to hire
competent Samurai, not being Samurai themselves? The villagers would
most likely be able to know the difference in a short time between an
effective and ineffective farm hand they might hire (based on their
agricultural domain knowledge) -- but what do farmers know about
evaluating swordsmanship or military planning? Likewise, end users may
know a lot about their problem domain, but how can they tell the
difference between effective and ineffective coding in the short term?
How can users distinguish between software that just barely handles
current needs and, by contrast, software that can handle a broad variety
of input data, is easily extensible, and detects unintended consequences
of code changes through unit tests? That is meant mostly rhetorically --
although maybe a more on-topic question for this list would be: how do
we create software systems that help people more easily appreciate,
understand, or visualize that difference?
Unless you know what to look for (and even sometimes if you do), it is
hard to tell whether a programmer spending a month or two refactoring or
writing tests is making the system better, making it worse, or perhaps
not doing much at all. Even worse from a bean-counter perspective, what
about the programmer who claims to be spending weeks or months just
trying to *understand* what is going on? And what if, after apparently
doing nothing for weeks, the programmer then removes lots of code? How
does one measure that level of apparent non-productivity, or even
negative productivity? A related bit of history:
http://c2.com/cgi/wiki?NegativeLinesOfCode
"A division of AppleComputer started having developers report
LinesOfCode written as a ProductivityMetric. The guru, BillAtkinson,
happened to be refactoring and tuning a graphics library at the time,
and ended up with a six-fold speedup and a much smaller library. When
asked to fill in the form, he wrote in NegativeLinesOfCode. Management
got the point and stopped using those forms soon afterwards."
If there is a systematic answer, part of it might lie in having many
different sorts of metrics for code, along the lines of projects like
"Sonar". I don't see Sonar mentioned on this list in at least the past
six or so years. Here is a link:
http://www.sonarsource.org/
"Sonar is an open platform to manage code quality. As such, it covers
the 7 axes of code quality: Architecture & Design, Comments, Coding
rules, Potential bugs, Complexity, Duplications, and Unit tests"
We tend to get what we measure. So, are these the sorts of things new
computing efforts should be measuring?
Obviously, users can generally see the value of new functionality,
especially if they asked for it. And thus there is this tension between
infrastructure and functionality. This tension is especially strong in
the context of "black swan" situations where chances are some rare thing
will never happen, and if it does, someone else will be maintaining the
code by then. How does one create incentives (and supporting metrics)
related to that? In practice, this tension may sometimes get resolved by
spending some time on refactoring and tests that users will not
appreciate directly, and some time on obvious enhancements that users
will appreciate. Of course, this will make the enhancements seem to take
longer than they otherwise might in the short term. And it can be an
organizational and personal challenge to interleave the two areas of
work (infrastructure and functionality) so the users stay happy enough.
Yet, I might argue, if you don't do the two in parallel, you may find
you are creating neither the needed infrastructure nor the needed
functionality.
"Agile" methods embrace the idea of coding only for obvious
functionality, saying "You aren't gonna need it" (YAGNI). But agile
methods may be more appropriate in some situations than others. And
agile methods also tend to celebrate and insist on refactoring and
testing on the way to creating new functionality, which, when you think
about it, is an odd-seeming way of addressing your point -- by a sort of
"cultural" approach. In that sense, agile methods start with the best
practices of testing and refactoring as givens in a programming culture,
and then measure progress in terms of function points. But there is
always going to be pressure to focus more on the function points and
less on the other things. And that might be where tools like Sonar help
keep teams from letting maintainability slide in the quest for new
function points.
Maybe we need a language where code just won't compile unless it has
tests written for it in advance? :-) Was Eiffel heading that way with
"Design by Contract"? Or maybe we need to have more projects using tools
like Sonar, and write even better such tools, as well as better
languages to use with them? The message passing paradigm in Smalltalk
(making proxying and mocking easier) along with other aspects like
dynamic strong typing, means Smalltalk historically lent itself to
creating better tools and tests. There is still a tension, though, in
that statically typed systems make some other sorts of reasoning about
programs easier. (Not to get into that discussion, which has been hashed
out on this list and in many other places before.)
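As a rough illustration of the Eiffel idea, Design by Contract can be approximated in many languages. Here is a minimal Python sketch using a hypothetical `contract` decorator (my own construction, not a standard library feature): preconditions and postconditions are checked on every call, so the function carries a kind of executable specification with it.

```python
import functools

def contract(pre=None, post=None):
    """Hypothetical decorator approximating Design by Contract:
    check a precondition on the arguments and a postcondition on
    the result, raising AssertionError on violation."""
    def decorate(fn):
        @functools.wraps(fn)
        def wrapper(*args):
            if pre is not None:
                assert pre(*args), "precondition violated"
            result = fn(*args)
            if post is not None:
                assert post(result, *args), "postcondition violated"
            return result
        return wrapper
    return decorate

@contract(pre=lambda n: n >= 0,
          post=lambda r, n: r * r <= n < (r + 1) ** 2)
def integer_sqrt(n):
    """Largest integer r with r * r <= n (naive linear search)."""
    r = 0
    while (r + 1) * (r + 1) <= n:
        r += 1
    return r

print(integer_sqrt(10))  # 3
```

Unlike a separate test suite, the contract travels with the code and is checked on every call, which is closer in spirit to "code that won't run unless its specification holds" than to "code that won't compile without tests".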
As Fred Brooks said in 1986:
http://en.wikipedia.org/wiki/No_Silver_Bullet
Or, as Kyle Wilson said in 2007:
http://gamearchitect.net/Articles/SoftwareIsHard.html
Still, we'd probably get software that was both more reliable and more
maintainable (as well as more fun to work with) if we just more commonly
used computing ideas (like message passing) that have been known for
decades. Or, for another example, Smalltalk's keyword syntax generally
makes for much more readable programs. Or, for yet another example,
Forth could support a multi-user system doing real-time data collection,
word processing, and data analysis in just 32 KB of memory on a single
1 MHz processor, whereas many systems today struggle to be responsive
for a single user or a single application with 32 GB of memory and
8 x 3 GHz processors. Indeed, do we need to go "Back to the Future"
again? :-)
--Paul Fernhout
http://www.pdfernhout.net/
====
The biggest challenge of the 21st century is the irony of technologies
of abundance in the hands of those thinking in terms of scarcity.
_______________________________________________
fonc mailing list
fonc@vpri.org
http://vpri.org/mailman/listinfo/fonc