[
https://issues.apache.org/jira/browse/SOLR-7118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14325641#comment-14325641
]
Shalin Shekhar Mangar commented on SOLR-7118:
---------------------------------------------
bq. An indexing request comes in but there is no leader in the cached state.
Every subsequent indexing request will continue to fail until the cache entry
expires i.e. 60 seconds
Reading the code again, I realized that when this condition happens, the cached
state is evicted, so the above statement isn't true. What's really happening
here is that, in the time leader election takes, many updates are rejected (as
they should be) and the test fails, saying too many updates failed.
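To illustrate the eviction behaviour described above, here is a minimal sketch. It is not Solr's client code: the class LeaderStateCache and its methods are hypothetical names invented for this example, and the only figure taken from the discussion is the 60 second expiry. It shows why a leaderless cache entry fails only one lookup rather than every lookup until the entry expires.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration only; names and structure are assumptions,
// not taken from the Solr code base.
class LeaderStateCache {
    // Cached view of one collection; leaderUrl is null while no leader is elected.
    static final class Entry {
        final String leaderUrl;
        final long cachedAtMs;

        Entry(String leaderUrl, long cachedAtMs) {
            this.leaderUrl = leaderUrl;
            this.cachedAtMs = cachedAtMs;
        }
    }

    private final Map<String, Entry> cache = new ConcurrentHashMap<>();
    private final long ttlMs;

    LeaderStateCache(long ttlMs) {
        this.ttlMs = ttlMs;
    }

    void put(String collection, String leaderUrl, long nowMs) {
        cache.put(collection, new Entry(leaderUrl, nowMs));
    }

    // Returns the cached leader if it is usable. An expired or leaderless
    // entry is evicted, so the next request has to fetch fresh state.
    Optional<String> leaderFor(String collection, long nowMs) {
        Entry e = cache.get(collection);
        if (e == null) {
            return Optional.empty();
        }
        boolean expired = nowMs - e.cachedAtMs >= ttlMs;
        if (expired || e.leaderUrl == null) {
            cache.remove(collection);
            return Optional.empty();
        }
        return Optional.of(e.leaderUrl);
    }

    boolean isCached(String collection) {
        return cache.containsKey(collection);
    }
}
```

With this shape, a request that finds no leader gets an empty result and removes the entry, so the following request is not served from the stale state.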
From the logs,
# The leader election process logs the following at time=1646937:
{code}
[junit4] 2> 1646937 T8923 oasc.ShardLeaderElectionContext.runLeaderProcess I may be the new leader - try and sync
{code}
# Then it goes to sleep for 2500 ms to wait for ongoing updates to finish (see
ShardLeaderElectionContext.runLeaderProcess)
# By the time it wakes up at time=1649437, the monkey is finished and the test
is being torn down
{code}
[junit4] 2> 1648644 T8922 oasc.ChaosMonkey.monkeyLog monkey: finished
[junit4] 2> 1648645 T8922 oasc.ChaosMonkey.monkeyLog monkey: I ran for 7.931sec. I stopped 1 and I started 0. I also expired 0 and caused 0 connection losses
[junit4] 2> added docs:123 with 24 fails deletes:44
[junit4] 2> num searches done:3 with 0 fails
[junit4] 2> ASYNC NEW_CORE C3537 name=collection1 org.apache.solr.core.SolrCore@1f81ebc url=http://127.0.0.1:38608/collection1 node=127.0.0.1:38608_ C3537_STATE=coll:collection1 core:collection1 props:{core=collection1, base_url=http://127.0.0.1:38608, node_name=127.0.0.1:38608_, state=active}
[junit4] 2> 1649437 T8923 C3537 P38608 oasc.SyncStrategy.sync Sync replicas to http://127.0.0.1:38608/collection1/
{code}
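The timing argument in the steps above can be checked with simple arithmetic. The sketch below uses only the timestamps from the quoted logs and the 2500 ms wait; the class and method names are made up for illustration and do not exist in Solr.

```java
// Hypothetical helper for checking the log timeline; not part of Solr.
class LeaderElectionTimeline {
    // Time at which the election thread resumes after its wait.
    static long wakeUpAt(long sleepStartMs, long sleepMs) {
        return sleepStartMs + sleepMs;
    }

    // True if the chaos monkey had already finished when the sync started.
    static boolean monkeyFinishedBeforeSync(long monkeyFinishedMs, long syncStartMs) {
        return monkeyFinishedMs < syncStartMs;
    }

    public static void main(String[] args) {
        long electionLoggedAt = 1646937L; // "I may be the new leader" log line
        long waitMs = 2500L;              // wait for ongoing updates to finish
        long monkeyFinishedAt = 1648644L; // "monkey: finished" log line

        long syncStartsAt = wakeUpAt(electionLoggedAt, waitMs);
        System.out.println("sync starts at " + syncStartsAt);
        System.out.println("monkey finished first: "
                + monkeyFinishedBeforeSync(monkeyFinishedAt, syncStartsAt));
    }
}
```

The computed wake-up time, 1649437, matches the SyncStrategy.sync log line, and it is later than the monkey's finish time of 1648644, so the shard had no leader for the rest of the monkey's run.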
In summary, there is no bug, just another spurious failure because the test
tolerates only an arbitrary number of update failures.
> ChaosMonkeyNothingIsSafeTest fails with too many update fails
> -------------------------------------------------------------
>
> Key: SOLR-7118
> URL: https://issues.apache.org/jira/browse/SOLR-7118
> Project: Solr
> Issue Type: Bug
> Components: SolrCloud, Tests
> Affects Versions: 5.0
> Reporter: Shalin Shekhar Mangar
> Fix For: Trunk, 5.1
>
> Attachments: SOLR-7118.patch
>
>
> There are frequent failures on both trunk and branch_5x with the following
> message:
> {code}
> java.lang.AssertionError: There were too many update fails - we expect it can happen, but shouldn't easily
>       at __randomizedtesting.SeedInfo.seed([786DB0FD42626C16:F98B3EE5353D0C2A]:0)
>       at org.junit.Assert.fail(Assert.java:93)
>       at org.junit.Assert.assertTrue(Assert.java:43)
>       at org.junit.Assert.assertFalse(Assert.java:68)
>       at org.apache.solr.cloud.ChaosMonkeyNothingIsSafeTest.doTest(ChaosMonkeyNothingIsSafeTest.java:224)
>       at org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:878)
> {code}