Interesting.  Do you have a guess as to why the failures there are ~5% and
not 100% reproducible?

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Sat, Feb 20, 2021 at 6:41 PM Ilan Ginzburg <ilans...@gmail.com> wrote:

> Indeed the issue is due to my changes.
>
> In OverseerStatusCmd I've skipped some stat collection when running in
> distributed cluster state updates mode because I thought these were only
> stats related to cluster state updates.
> Obviously that was too aggressive and some of the stats are related to the
> Collection API.
>
> I will make sure to skip only the stats related to the cluster state
> updater and restore the Collection API stats when running in distributed
> cluster state updates mode (otherwise all stats are returned).
>
> Tomorrow...
>
> Ilan
>
> On Sun, Feb 21, 2021 at 12:22 AM Ilan Ginzburg <ilans...@gmail.com> wrote:
>
>> Thank you David for reporting this.
>>
>> This seems to be due to my recent changes. I can reproduce the failure
>> locally and will look at it tomorrow.
>>
>> With the distributed cluster state updates, I've introduced a
>> randomization for using either Overseer-based cluster state updates or
>> distributed cluster state updates in tests. This failure seems to happen in
>> the distributed state update case. I suspect it is due to Overseer
>> returning fewer stats than the test expects (which is normal: Overseer
>> cannot return stats about cluster state updates if it does not handle
>> cluster state updates).
>>
>> The following line in the logs shows that the run is using distributed
>> cluster state updates:
>> 972874 INFO  (jetty-launcher-8973-thread-2) [     ]
>> o.a.s.c.DistributedClusterStateUpdater Creating
>> DistributedClusterStateUpdater with useDistributedStateUpdate=true. Solr
>> will be using distributed cluster state updates.
>>
>> Ilan
>>
>>
>> On Sat, Feb 20, 2021 at 3:00 PM David Smiley <dsmi...@apache.org> wrote:
>>
>>> I encountered a failure from OverseerStatusTest locally.  According to
>>> our test failure trends, this guy only just recently started failing ~4-5%
>>> of the time, but previously was fine.  Only master branch.
>>>
>>>
>>> http://fucit.org/solr-jenkins-reports/history-trend-of-recent-failures.html#series/org.apache.solr.cloud.OverseerStatusTest.test
>>>
>>> ~ David Smiley
>>> Apache Lucene/Solr Search Developer
>>> http://www.linkedin.com/in/davidwsmiley
>>>
>>
