Hi Rob, 

Thanks a lot for your post. 
Your use cases are very illustrative. I am just starting to tackle the 
situation and would like to come up with a solution that can at least 
reduce the time I spend triaging and debugging false positives. 

The failures are not deterministic, so it is not that a specific set of 
tests keeps failing in a consistent manner. 

The plugins you mention could be a good approach. I need to analyse the 
impact on the overall build time and whether new resources would have to 
be added.

On Friday, August 8, 2014 6:09:21 PM UTC+2, Rob Mandeville wrote:
>
>  Ugh, intermittent test failures.  I lived in that jungle for years.  In 
> a previous life, my company used Jenkins to drive a complete homebrew 
> solution.  We wrote a tool to parse the log and write to database, wrote a 
> webapp to read the database and let you know what failed, even wrote a tool 
> to produce a subset of jobs so that you could auto-rerun all the failures.  
> Jenkins was reduced to simply running the jobs, and nobody looked at the 
> site to see what passed and what failed.
>
>  
>
> Trust me: you do not want to go there unless it’s absolutely necessary.
>
>  
>
> Your first thought should be to keep these tests from being intermittent.  
> Perhaps the test code could account for strange OS or resource factors.  
> Perhaps you can limit slave nodes or something to control resource 
> contention.  Perhaps not.
>
>  
>
> If fixing the intermittency is not an option, your next defense is to 
> break up the build.  If you have an hour of tests, but only five minutes of 
> them are intermittent, isolate the intermittents into their own job so that 
> you can do a rerun in five minutes rather than an hour.
>
>  
>
> Now, for plugins.
>
>  
>
> First, there is a Naginator plugin.  This can be configured to rerun 
> failed builds, so it can keep trying those intermittent tests.
>
>  
>
> Secondly, there is a Build Flow plugin, which is more complicated and more 
> versatile.  You use it to tie a bunch of build jobs together (compile as 
> one job, unit tests over on another one, Selenium jobs on a third…).  The 
> DSL that you use has the ‘retry’ keyword, so you can say to run a certain 
> job some number of times until it succeeds.  This would allow you to have 
> one master build that is red, yellow, or (blue|green), and that could retry 
> your intermittent tests until they pass or just fail one too many times.
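
A minimal Python sketch of the retry-until-it-passes behaviour described 
above. This is not the Build Flow DSL and not the Naginator plugin's 
configuration; `run_with_retry` and `run_job` are hypothetical names used 
only to illustrate the logic, assuming a job can be modelled as a callable 
that returns True when it passes.

```python
from typing import Callable, Tuple


def run_with_retry(run_job: Callable[[], bool], max_attempts: int = 3) -> Tuple[bool, int]:
    """Run a job until it passes or the attempt budget is used up.

    `run_job` is a hypothetical stand-in for "trigger the job and report
    whether it passed". Returns (passed, attempts_used).
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        if run_job():
            # Stop at the first passing run.
            return True, attempt
    # Every attempt failed: report a real failure.
    return False, max_attempts
```

Returning the attempt count as well as the result makes it possible to 
tell a clean pass from a pass that needed retries, which is the case that 
deserves a closer look.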
>
>  
>
> Of course, all of this retrying-until-it-works is based on the assumption 
> that if you run a test three times and it fails on two of those tries, the 
> test should be considered passing.  This is not a good assumption: it could 
> mean that the code is just plain flaky.  I prefer to get the test to 
> recognize the conditions that make it flaky and thus respond to them.  If 
> the test can fail because the database is unresponsive, have the test start 
> by trying to contact the database, and retrying every minute or two until 
> it succeeds or a test-determined timeout occurs.  Handling intermittency at 
> the test level rather than the Jenkins level requires you to define the 
> things that can break your test, rather than just assuming that passing 
> once in a while is okay.  If you can name me some factors that cause your 
> tests to go intermittent, I may have some ideas as to how to make the tests 
> non-intermittent.
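
A minimal Python sketch of the test-level approach described above: probe 
a dependency before the test proper starts, retry at a fixed interval, and 
give up after a timeout chosen by the test. The names `wait_until_ready` 
and `check`, and the default timings, are assumptions for illustration; 
`check` stands for whatever probe fits the dependency, for example opening 
a database connection.

```python
import time
from typing import Callable


def wait_until_ready(
    check: Callable[[], bool],
    timeout_s: float = 600.0,
    interval_s: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True once `check` succeeds, False if `timeout_s` elapses first.

    `check` is a hypothetical probe supplied by the test. A probe that
    raises is treated as "not ready yet" (an assumption of this sketch).
    """
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            if check():
                return True
        except Exception:
            pass  # dependency not reachable yet, keep waiting
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        # Never sleep past the deadline.
        sleep(min(interval_s, remaining))
```

A test would call this first and fail with an explicit "database 
unreachable" message when it returns False, so that an environment problem 
is reported as such rather than as a failure of the code under test.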
>
>  
>
> --Rob
>
>  
>  
> *From:* [email protected] [mailto:[email protected]] *On Behalf Of* Albert Tresens
> *Sent:* Friday, August 08, 2014 11:05 AM
> *To:* [email protected]
> *Subject:* False positive on Jenkins builds. How to address.
>  
>  
>  
> Hi, 
>
> I am trying to optimize the triaging time for Jenkins failures caused by 
> false positives. There is a percentage of failures that always heal 
> themselves in subsequent builds. They mostly depend on the underlying OS 
> or on some resource factors. 
>
> Has anyone followed a specific approach to address such situations? 
> I guess it is a common problem.
>
> I thought about spawning the Jenkins jobs twice so that I get duplicated 
> results and can discard a failure if it is not a double failure, or about 
> adding plugins to filter specific exceptions.  
>
> Any suggestions or alternatives?
>
> Thanks!
>  
> -- 
> You received this message because you are subscribed to the Google Groups 
> "Jenkins Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
>
