Re: Recent PRS test flakiness

Noble Paul Sun, 02 Oct 2022 14:17:18 -0700

Let's mark them as @BadApple for the time being. I shall look into them
right away.


On Mon, Oct 3, 2022, 6:46 AM Jason Gerlowski <[email protected]> wrote:

> Hey all,
>
> I noticed this week (after running into a handful of test failures locally)
> that 3 of our 5 flakiest tests (according to fucit) are trying to test
> Solr's "per-replica state" code.  The tests in question are:
> PerReplicaStatesIntegrationTest.testRestart,
> PerReplicaStatesIntegrationTest.testPerReplicaStateCollection, and
> CloudSolrClientTest.testPerReplicaStateCollection.
>
> All three of these saw a big jump in flakiness between Sept 12 and Sept
> 19.  I spent a bit of time debugging, but didn't get very far.  In most
> failures, the test times out waiting for a particular number of replicas to
> be reported in ZooKeeper.  I suspect there's a race condition in how we're
> updating our ZK state, but that's as far as I was able to get for now.
>
> So, a few questions:
>
> 1. Does anyone know what the root cause of these failures might be, or at
> least what might've caused their flakiness to spike in mid-Sept?
> 2. Should we temporarily @Ignore or @BadApple them to help the builds out a
> bit until someone with context can attend to them?
>
> Best,
>
> Jason
>
