Re: False positive on Jenkins builds. How to address.

Albert Tresens Wed, 13 Aug 2014 05:50:38 -0700

Hi Rob, 

Thanks a lot for your post. 
Your use cases are very illustrative. I am just starting to tackle the 
situation and would like to come up with a solution that can at least 
reduce the time I spend triaging and debugging false positives.


The failures are not deterministic, so it is not the case that the same 
specific set of tests keeps failing from time to time. 

The plugins you mention could be a good approach. I need to analyse their 
impact on the overall build time and whether new resources would be needed.

On Friday, August 8, 2014 6:09:21 PM UTC+2, Rob Mandeville wrote:
>
>  Ugh, intermittent test failures.  I lived in that jungle for years.  In 
> a previous life, my company used Jenkins to drive a complete homebrew 
> solution.  We wrote a tool to parse the log and write to database, wrote a 
> webapp to read the database and let you know what failed, even wrote a tool 
> to produce a subset of jobs so that you could auto-rerun all the failures.  
> Jenkins was reduced to simply running the jobs, and nobody looked at the 
> site to see what passed and what failed.
>
>  
>
> Trust me: you do not want to go there unless it’s absolutely necessary.
>
>  
>
> The first thought should be to keep these tests from being intermittent.  
> Perhaps the test code could account for strange OS or resource factors.  
> Perhaps you can limit slave nodes or something to control resource 
> contention.  Perhaps not.
>
>  
>
> If fixing the intermittency is not an option, your next defense is to 
> break up the build.  If you have an hour of tests, but only five minutes of 
> them are intermittent, isolate the intermittents into their own job so that 
> you can do a rerun in five minutes rather than an hour.
>
>  
>
> Now, for plugins.
>
>  
>
> First, there is a Naginator plugin.  This can be configured to rerun 
> failed builds, so it can keep trying those intermittent tests.
>
>  
>
> Secondly, there is a Build Flow plugin, which is more complicated and more 
> versatile.  You use it to tie a bunch of build jobs together (compile as 
> one job, unit tests over on another one, Selenium jobs on a third…).  The 
> DSL that you use has the ‘retry’ keyword, so you can say to run a certain 
> job some number of times until it succeeds.  This would allow you to have 
> one master build that is red, yellow, or (blue|green), and that could retry 
> your intermittent tests until they pass or just fail one too many times.
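
To make the retry semantics concrete, here is a minimal Python sketch of 
"run a job some number of times until it succeeds". It is not the Build Flow 
DSL and does not talk to Jenkins; `run_with_retry`, `make_flaky_job`, and 
`max_attempts` are hypothetical names used only for illustration.

```python
from typing import Callable, List, Tuple


def run_with_retry(job: Callable[[], bool], max_attempts: int = 3) -> Tuple[bool, int]:
    """Run `job` until it succeeds or `max_attempts` is reached.

    `job` is a hypothetical stand-in for "trigger a build and read its
    result": it returns True on success and False on failure.
    Returns (succeeded, attempts_used).
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        if job():
            return True, attempt
    return False, max_attempts


def make_flaky_job(outcomes: List[bool]) -> Callable[[], bool]:
    """Build a fake job that replays a fixed list of outcomes (demo only)."""
    remaining = iter(outcomes)
    return lambda: next(remaining)


if __name__ == "__main__":
    # Fails twice, then passes: reported as a success on the third attempt.
    print(run_with_retry(make_flaky_job([False, False, True])))  # (True, 3)
```

A job that fails twice and passes once still comes out green here, which is 
worth keeping in mind when choosing the number of attempts.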
>
>  
>
> Of course, all of this retrying-until-it-works is based on the assumption 
> that if you run a test three times and it fails on two of those tries, the 
> test should be considered passing.  This is not a good assumption: it could 
> mean that the code is just plain flaky.  I prefer to get the test to 
> recognize the conditions that make it flaky and thus respond to them.  If 
> the test can fail because the database is unresponsive, have the test start 
> by trying to contact the database, and retrying every minute or two until 
> it succeeds or a test-determined timeout occurs.  Handling intermittency at 
> the test level rather than the Jenkins level requires you to define the 
> things that can break your test, rather than just assuming that passing 
> once in a while is okay.  If you can name me some factors that cause your 
> tests to go intermittent, I may have some ideas as to how to make the tests 
> non-intermittent.
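
A minimal Python sketch of the "wait for the database, then give up after a 
timeout" idea follows. The names `wait_until_ready`, `timeout_s`, and 
`interval_s` are assumptions for illustration, and `check` stands in for 
whatever probe the test would really use (for example, opening a database 
connection). The `sleep` and `clock` parameters exist only so the helper can 
be exercised without real waiting.

```python
import time
from typing import Callable


def wait_until_ready(check: Callable[[], bool],
                     timeout_s: float = 600.0,
                     interval_s: float = 60.0,
                     sleep: Callable[[float], None] = time.sleep,
                     clock: Callable[[], float] = time.monotonic) -> bool:
    """Poll `check` until it returns True or the timeout would be exceeded.

    Returns True as soon as the dependency is ready, or False if the next
    poll would fall after the deadline. The caller decides what False
    means: fail the test, or mark it as skipped for environmental reasons.
    """
    deadline = clock() + timeout_s
    while True:
        if check():
            return True
        # Stop if waiting another interval would take us past the deadline.
        if clock() + interval_s > deadline:
            return False
        sleep(interval_s)


if __name__ == "__main__":
    # Trivial probe that is ready immediately, so this returns without sleeping.
    print(wait_until_ready(lambda: True))
```

Calling this at the start of a test turns "the database was down" into an 
explicit, named precondition rather than a random red build.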
>
>  
>
> --Rob
>
>  
>  
> *From:* jenkins...@googlegroups.com [mailto:jenkins...@googlegroups.com] 
> *On Behalf Of* Albert Tresens
> *Sent:* Friday, August 08, 2014 11:05 AM
> *To:* jenkins...@googlegroups.com
> *Subject:* False positive on Jenkins builds. How to address.
>  
>  
>  
> Hi, 
>
> I am trying to reduce the time spent triaging Jenkins failures caused by 
> false positives. A percentage of the failures always heal themselves in 
> subsequent builds. They mostly depend on the underlying OS or on some 
> resource factors. 
>
> Has anyone followed a specific approach to address such situations? 
> I guess it is a common problem.
>
> I thought about spawning duplicate Jenkins jobs so that I get duplicated 
> results and can discard anything that is not a double failure, or about 
> adding plugins that filter specific exceptions.  
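
The "double failure" idea can be sketched in a few lines of Python. This is 
only an illustration under assumptions: each run is represented as a 
hypothetical dict mapping test name to pass/fail, the function names are 
made up, and a test missing from a run is treated as failed in that run.

```python
from typing import Dict, List


def confirmed_failures(runs: List[Dict[str, bool]]) -> List[str]:
    """Return tests that failed in every run (candidates for real failures)."""
    if not runs:
        return []
    all_tests = set().union(*(run.keys() for run in runs))
    # Assumption: a test absent from a run counts as a failure for that run.
    return sorted(t for t in all_tests
                  if all(not run.get(t, False) for run in runs))


def suspected_flaky(runs: List[Dict[str, bool]]) -> List[str]:
    """Return tests that failed in at least one run but not in all of them."""
    if not runs:
        return []
    all_tests = set().union(*(run.keys() for run in runs))
    confirmed = set(confirmed_failures(runs))
    return sorted(t for t in all_tests
                  if t not in confirmed
                  and any(not run.get(t, False) for run in runs))


if __name__ == "__main__":
    run_a = {"test_login": False, "test_search": True, "test_db": False}
    run_b = {"test_login": False, "test_search": True, "test_db": True}
    print(confirmed_failures([run_a, run_b]))  # ['test_login']
    print(suspected_flaky([run_a, run_b]))     # ['test_db']
```

Keeping the "suspected flaky" list, rather than silently discarding it, 
gives a record of which tests need attention.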
>
> Any suggestions or alternatives?
>
> Thanks!
>  
> -- 
> You received this message because you are subscribed to the Google Groups 
> "Jenkins Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to jenkinsci-use...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
>

