I would claim that our current build analysis is always "smart", simply
because of the way its culprit-identification algorithm is written.
The first question is: "What counts as an acceptable integration build
failure by our standard?"
Obviously, the answer is that if the build fails in a module nobody is
working on (in other words, the failure is caused by an inter-module
dependency), then it's acceptable.
The second question is: "What is the algorithm that identifies the culprit?"
The answer is that if the build fails in a module, the algorithm tries
to find a person who has committed to that module but has not run the
tests on that module.
So, from these two answers you can see that the build analysis will
always be "smart", precisely because it does not use dependency
information during the analysis.
My hypothesis is that since inter-module dependency build failures are
rare, our current approach of requiring only local testing on the
module you are working on is a happy medium.
Cheers,
Cedric
Philip Johnson wrote:
Last night's build failure resulted in the following analysis:
--On Monday, February 13, 2006 6:00 AM -1000 Hackystat Administrator
<[EMAIL PROTECTED]> wrote:
Integration build for Hackystat-7 project on 13-Feb-2006 FAILED.
Module: hackyApp_Cgqm
Failure Types: Compilation
Plausible Culprit: Unknown
How culprit is identified: Build failure reason is unclear, and no one
has made any commit. Perhaps hackystat sensor data is incomplete, or it
is caused by an external error on the integration build box.
Failure Messages:
* C:\HackystatSource\hackyApp_Cgqm\src\org\hackystat\app\cgqm\testbase\ABaseRemoteProjectTestClass.java::63::cannot find symbol
The compilation failures revolve around the ProjectManager class,
which Hongbing committed recently, so it looks like this is something
Hongbing should investigate.
But there's a more interesting hypothesis that occurs to me.
I'm wondering if the Build Analysis mechanism has become "smart
enough" in the following sense: the build failures for which the
mechanism can identify a culprit are basically those that our standard
development process should have prevented from reaching the
integration build. Conversely, the build failures for which the
mechanism cannot identify a culprit are those that we generally allow
as "acceptable" integration build failures.
This goes back to the following basic premise of our process: the
only way to guarantee that developer-induced integration build
failures are prevented would be to force everyone to do a 'freshStart
all.junit' on _all_ modules prior to _every_ commit. We've said that
this is too heavyweight: the cost in productivity for this kind of
process (even if we could get everyone to do it) is higher than the
cost in productivity for occasional integration build failures.
On the other hand, if the nightly build fails constantly, then that
slows things down too much because people can't trust the repository
to contain a working version.
So, our goal has been to find a "happy medium": a level of process in
which people make "reasonable" efforts to test their changes before
committing, reducing integration build failures to the point where the
failures that remain are "justifiable", because the cost of the local
testing needed to prevent them would be too high.
So, what I'm starting to wonder is whether our build analysis
mechanism has actually become a valid measure for "reasonable": in
other words, if it can identify the culprit, then the culprit should
have prevented the failure, but if it can't identify the culprit, then
the failure is caused by a sufficiently indirect sequence of events
that we can view the daily build mechanism as being the most efficient
way to uncover it.
To test this, we simply need to start evaluating the daily build
analysis from the culprit/no culprit perspective. For example, I would
claim that today's failure is "reasonable", in that Hongbing would
have had to do a full test of the entire system to catch it.
Your thoughts?
Cheers,
Philip