Re: recent failures on fucit - how to de-dupe...

Gus Heck Wed, 05 Dec 2018 15:09:52 -0800

Hi Hoss,

Thanks, this starts to clear some things up for me. Please let me be clear
that I am in no way complaining or requesting changes. The builds page on
fucit.org is quite cool. It's very likely that I am confused because I'm
lacking in knowledge about which builds are doing what. Is there anywhere
the various Jenkins builds are listed, maybe with some high level
description?

I had guessed that there was some re-running of individual tests going on
based on the repro line options, and it's definitely clear that more than
one match is occurring per file, but I wasn't seeing a consistent pattern.
What I'm trying to understand is how to pick apart what I see. It's great
that we are doing automatic "repro" runs. Till today I didn't realize that
was happening automatically. That's awesome :).

I guess I'd ideally want to be able to come up with the following numbers:

   1. How many fresh, clean, completely independent runs were conducted.
   2. How many fresh, clean, completely independent runs failed.
   3. With failures on independent runs identified, I intend to look for
   and count different types of failures (SSL, ZK session loss, assertion
   failure, carrot complaining about thread leak etc). I don't want to spend time
   sorting through reproductions for that.

Once I have that I want to understand:

   1. Which repro runs tie back to the fresh runs (seed probably tells me
   that)
   2. How many repro runs were attempted pursuant to each fresh run
   3. Failure rate for repro runs attempted pursuant to a fresh run.
   4. Finally, whether the failure has the same cause in the reproduction as
   in the initial run (flaky test bad luck vs real reproduction).

So I guess what I'm asking is: what should I be looking at to weed out and
quantify the reproduction and duplicative runs vs fresh runs so I can
understand the base failure rate?

I'd like to venture a guess that a good first pass is that anything with
tests.method in the repro line is a reproduction re-run. Do any of the
builds re-run at the top level? It appears so based on what I see for seed
2E743D2D45BF625E, but I'm not necessarily convinced that in going through
the modal dialogs I didn't wind up grabbing the same build more than once
either. Do the links in the modal dialogs potentially overlap?

-Gus

On Wed, Dec 5, 2018 at 5:09 PM Chris Hostetter <hossman_luc...@fucit.org>
wrote:

>
> : Seeing TimeRoutedAliasUpdateProcessorTest on the 7.6 bad apple list, having
> : recently been looking at that test, and waiting on a long build for other
> : work, I went to http://fucit.org/solr-jenkins-reports/failure-report.html
> : to gather recent failures, and when I started looking I began to suspect
> : there were duplicates... So I downloaded/extracted everything that comes up
>
> Duplicates of what exactly?
>
> It's not 100% clear what the subject/object is that you're asking about in
> regards to "de-dupe"ing ... it seems like maybe you are worried that
> individual failures are being "duplicate counted" but I suspect what
> you're actually seeing/confused by is either:
>
> 1) That a single failure / reproduce line might exist multiple times in
> the logs & test reports for a single jenkins build ID.  This is absolutely
> possible because of how our jenkins jobs run.  Depending on what jenkins
> server & what target it invokes, some jenkins jobs try to "repro" any
> failure that existed during the main test run.
>
> 2) That you might see the same exact reproduce line / seed in multiple
> jenkins build logs & test reports ... similar to #1, we have other jenkins
> jobs with "repro" in their name that only run that bit of logic: looking
> at some recent jenkins failures (or other jobs) and running the "reproduce
> with" lines from the logs.
>
> ...or the combination of both.
>
> : I clicked on each line and opened a tab for each line in the modal dialog
> : and then from each tab downloaded jenkins.log.txt.gz into a folder
> : corresponding to the day on the file timestamp
>
> FYI: if that modal dialog box has (XN) next to a jenkins build ID, that
> means that w/in that single jenkins build that test failed multiple times.
>
> : gus$ grep -r 'reprod' * | grep TimeRouted | perl -pe 's/(^[^[]*).*reproduce with:(.*Dtests\.seed=(\w+)\s.*)/\3 \1 \2/' | sort
>         ....
> : What I've done is sort by seed, and found a LOT of duplication, even across
> : files and some apparent running of specific test methods. I'd like to
> : understand what's happening with the build servers here... why are we seeing
> : so many duplicates? I would guess that this really boils down to 1 fail per
> : seed value seen? I'm trying to figure out how many and which of these I
> : need to consider, and I'm interested in the frequency of different failure
> : scenarios which is hard to gauge if there's duplication.
>
> The duplicated "failures" (with identical seeds) in the logs come from
> duplicated "runs" (with identical seeds) for the express purpose of trying
> to figure out if a given failure is reliably reproducible.
>
> I.e.: don't assume that because you see the same "reproduce with" line
> duplicated multiple times that the failure stats are "wrong" and the test
> isn't failing as often as it seems -- quite the opposite is true: if you
> see the "reproduce with" line failing multiple times (either in the same
> build, or in two diff builds) then that just reiterates that the failure
> is very easy to reproduce.
>
>         ...
> :  gus$ grep -r 'reprod' * | grep Time | perl -pe 's/.*reproduce with:(.*Dtests\.seed=(\w+)\s.*)/\2/' | wc -l
> :
> :       28
> :
> : So 17 failures listed in fucit.org led me to find 14 files containing 10
> : distinct seeds and 28 lines that contain "reproduce with:"
> :
> : Not quite sure how to interpret that. Even if I say each seed is a unique
> : fail, I have no idea how many total builds that relates to...
>
> I think you need to look more closely at what lines your regexes are
> matching -- that "28" is also going to include other failures in the same
> job that have nothing to do with the class you are focused on (example:
> failures for unrelated tests like TestTlogReplica.testRealTimeGet will
> match your grep for "Time")
>
>
> -Hoss
> http://www.lucidworks.com/

-- 
http://www.the111shift.com
