Ugh, intermittent test failures.  I lived in that jungle for years.  In a 
previous life, my company used Jenkins to drive a complete homebrew solution.  
We wrote a tool to parse the logs and write the results to a database, wrote a 
webapp to read the database and tell you what failed, and even wrote a tool to 
produce a subset of jobs so that you could auto-rerun all the failures.  Jenkins 
was reduced to simply running the jobs, and nobody looked at the Jenkins site 
itself to see what passed and what failed.

Trust me: you do not want to go there unless it’s absolutely necessary.

Your first thought should be to keep these tests from being intermittent.  
Perhaps the test code could account for strange OS or resource factors.  
Perhaps you can limit slave nodes or something to control resource contention.  
Perhaps not.

If fixing the intermittency is not an option, your next defense is to break up 
the build.  If you have an hour of tests, but only five minutes of them are 
intermittent, isolate the intermittents into their own job so that you can do a 
rerun in five minutes rather than an hour.

Now, for plugins.

First, there is the Naginator plugin.  It can be configured to automatically 
rerun failed builds, so it can keep retrying those intermittent tests.

Secondly, there is the Build Flow plugin, which is more complicated and more 
versatile.  You use it to tie a bunch of build jobs together (compile as one 
job, unit tests in another, Selenium tests in a third…).  Its DSL has a 
‘retry’ keyword, so you can tell it to run a certain job up to some number of 
times until it succeeds.  This would allow you to have one master build that is 
red, yellow, or (blue|green), and that could retry your intermittent tests 
until they pass or fail one time too many.
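
To make the retry semantics concrete, here is a rough sketch in Python.  It is 
not the Build Flow DSL itself, just an illustration of "run up to N times 
until it succeeds"; the names retry_job and run_job are hypothetical stand-ins 
for whatever actually triggers your job.

```python
from typing import Callable


def retry_job(max_attempts: int, run_job: Callable[[], bool]) -> int:
    """Run run_job up to max_attempts times.

    Returns the 1-based attempt number that succeeded, or 0 if every
    attempt failed. run_job is a hypothetical callable that triggers one
    run of the job and returns True on success.
    """
    for attempt in range(1, max_attempts + 1):
        if run_job():
            return attempt  # stop at the first success
    return 0  # failed one time too many
```

Returning the attempt number rather than a plain pass/fail keeps a record of 
how many tries a "passing" build actually needed.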

Of course, all of this retrying-until-it-works is based on the assumption that 
if you run a test three times and it fails on two of those tries, the test 
should be considered passing.  This is not a good assumption: it could mean 
that the code is just plain flaky.  I prefer to get the test to recognize the 
conditions that make it flaky and thus respond to them.  If the test can fail 
because the database is unresponsive, have the test start by trying to contact 
the database, and retrying every minute or two until it succeeds or a 
test-determined timeout occurs.  Handling intermittency at the test level 
rather than the Jenkins level requires you to define the things that can break 
your test, rather than just assuming that passing once in a while is okay.  If 
you can name me some factors that cause your tests to go intermittent, I may 
have some ideas as to how to make the tests non-intermittent.
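
As a rough sketch of that test-level approach, in Python: poll the dependency 
before the real test starts, and give up after a timeout that the test itself 
chooses.  Everything named here (wait_until_ready, probe, the default 
intervals) is my own assumption for illustration; probe would be whatever 
check fits your setup, such as opening a database connection.

```python
import time
from typing import Callable


def wait_until_ready(
    probe: Callable[[], bool],
    timeout_s: float = 600.0,
    interval_s: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Call probe() until it returns True or timeout_s has elapsed.

    probe is a hypothetical caller-supplied check (for example, connect to
    the database and run a trivial query). Any exception it raises is
    treated as "not ready yet". sleep and clock are injectable so the
    helper can be tested without real waiting.
    """
    deadline = clock() + timeout_s
    while True:
        try:
            if probe():
                return True
        except Exception:
            pass  # dependency not reachable yet; try again after a pause
        remaining = deadline - clock()
        if remaining <= 0:
            return False  # test-determined timeout reached
        sleep(min(interval_s, remaining))
```

A test would call this first and fail with a clear "database never came up" 
message when it returns False, which is much easier to triage than a random 
failure halfway through the run.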

--Rob

From: jenkinsci-users@googlegroups.com 
[mailto:jenkinsci-users@googlegroups.com] On Behalf Of Albert Tresens
Sent: Friday, August 08, 2014 11:05 AM
To: jenkinsci-users@googlegroups.com
Subject: False positive on Jenkins builds. How to address.

Hi,

I am trying to reduce the time spent triaging Jenkins failures caused by false 
positives. A percentage of failures always heal themselves in subsequent 
builds.  They mostly depend on the underlying OS or some resource factors.

Has anyone followed a specific approach to address such situations? I guess it 
is a common problem.

I thought about spawning duplicate Jenkins jobs so I get duplicated results and 
can discard a failure if it's not a double failure, or adding plugins for 
filtering specific exceptions.

Any suggestions or alternatives?

Thanks!
--
You received this message because you are subscribed to the Google Groups 
"Jenkins Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to jenkinsci-users+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


