Re: OverseerStatusTest recent failures

2021-02-21 Thread David Smiley
Ah, that makes total sense; thanks.
~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley
On Sun, Feb 21, 2021 at 12:06 PM Ilan Ginzburg wrote:
> Searching in my jenkins folder for failures of this test (label:jenkins
> "FAILED:

Re: OverseerStatusTest recent failures

2021-02-21 Thread Ilan Ginzburg
I have fixed the issue. A PR is out: https://github.com/apache/lucene-solr/pull/2410/files. Most of the work was documenting which stats are actually returned. OverseerStatusCmd now has more comment lines than code lines. I will merge it shortly.
Ilan
On Sun, Feb 21, 2021 at 6:05 PM Ilan Ginzburg

Re: OverseerStatusTest recent failures

2021-02-21 Thread Ilan Ginzburg
Searching my Jenkins folder for failures of this test (label:jenkins "FAILED: org.apache.solr.cloud.OverseerStatusTest.test"), 26 emails match. Searching for all Jenkins master build emails since the first failure email found above (2 days ago), I see 40 messages. 26 out of 40 is not far from
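As a rough check of the numbers quoted above, the following sketch compares 26 failures in 40 builds with the 50% failure rate expected elsewhere in this thread. It assumes, purely for illustration, that each build fails independently with probability 0.5 and that every failure produced exactly one email; neither assumption is confirmed by the thread.

```python
from math import comb, sqrt

# Counts taken from the message above.
failures, builds = 26, 40

# Assumption: each build fails independently with probability 0.5
# (the share of runs said to use distributed updates).
p = 0.5

observed_rate = failures / builds            # 26 / 40 = 0.65
expected_failures = builds * p               # 20 under the 50% assumption
std_dev = sqrt(builds * p * (1 - p))         # binomial standard deviation
z_score = (failures - expected_failures) / std_dev

# Exact probability of seeing at least 26 failures in 40 fair coin flips.
tail_probability = sum(comb(builds, k) for k in range(failures, builds + 1)) / 2 ** builds

print(f"observed rate: {observed_rate:.2f}")
print(f"z-score vs 50%: {z_score:.2f}")
print(f"P(X >= {failures}): {tail_probability:.3f}")
```

Under these assumptions the observed count sits a little under two standard deviations above the expected 20, so it is on the high side but still far closer to 50% than to the ~5% figure mentioned in the thread.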

Re: OverseerStatusTest recent failures

2021-02-21 Thread Ilan Ginzburg
Yes Marcus, this is the commit. David, I would have expected 50% failures, as 50% of the runs use distributed updates. I'll try to understand better as I fix the issue.
Ilan
On Sun 21 Feb 2021 at 06:17, David Smiley wrote:
> Interesting. Do you have a guess as to why the failures there are ~5%

Re: OverseerStatusTest recent failures

2021-02-20 Thread David Smiley
Interesting. Do you have a guess as to why the failures there are ~5% and not 100% reproducible?
~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley
On Sat, Feb 20, 2021 at 6:41 PM Ilan Ginzburg wrote:
> Indeed the issue is due to my changes.
>
> In

Re: OverseerStatusTest recent failures

2021-02-20 Thread Ilan Ginzburg
Indeed, the issue is due to my changes. In OverseerStatusCmd I skipped some stat collection when running in distributed cluster state updates mode because I thought these were only stats related to cluster state updates. Obviously that was too aggressive and some of the stats are related to the
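The failure mode described above can be illustrated with a small simulation. This is only a sketch, not the actual Solr code: the stat names, the `overseer_status` function, and the 50/50 mode selection are all hypothetical stand-ins, since the thread does not list the real keys returned by OverseerStatusCmd.

```python
import random

# Hypothetical stat groups; the real keys are not given in this thread.
UPDATE_STATS = {"cluster_state_update_count"}   # only meaningful with Overseer-based updates
OTHER_STATS = {"collection_operation_count"}    # assumed to matter in both modes


def overseer_status(distributed: bool, skip_too_much: bool) -> set:
    """Return the stat keys a status call would report in this toy model."""
    if not distributed:
        return UPDATE_STATS | OTHER_STATS
    if skip_too_much:
        # Models the over-aggressive skip: nothing is collected in distributed mode.
        return set()
    # Models the intended behaviour: only the update-related stats are skipped.
    return set(OTHER_STATS)


def failure_rate(runs: int, skip_too_much: bool, seed: int = 0) -> float:
    """Fraction of simulated test runs that miss a stat the test expects."""
    rng = random.Random(seed)
    failures = 0
    for _ in range(runs):
        distributed = rng.random() < 0.5   # assumed 50/50 randomization of the mode
        stats = overseer_status(distributed, skip_too_much)
        if not OTHER_STATS <= stats:       # the test expects these keys in every mode
            failures += 1
    return failures / runs


print(failure_rate(10_000, skip_too_much=True))
print(failure_rate(10_000, skip_too_much=False))
```

In this model the over-aggressive skip fails roughly half of the runs, while skipping only the update-related stats never fails, which matches the reasoning in the thread that the failure should track the share of runs using distributed updates.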

Re: OverseerStatusTest recent failures

2021-02-20 Thread Ilan Ginzburg
Thank you, David, for reporting this. It seems to be due to my recent changes. I can reproduce the failure locally and will look at this tomorrow. With the distributed cluster state updates I've introduced a randomization for using either Overseer-based cluster state updates or distributed cluster state