[ 
https://issues.apache.org/jira/browse/SOLR-7118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14324304#comment-14324304
 ] 

Shalin Shekhar Mangar commented on SOLR-7118:
---------------------------------------------

bq. But this test doesn't need at least one replica to kill like the safe 
leader test. It will kill leaders.

Look at ChaosMonkey.getRandomJetty(), which calls checkIfKillIsLegal. If fewer 
than 2 shards in a slice are active, ChaosMonkey doesn't kill a node at all.
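
To make that guard concrete, here is a minimal sketch of the kind of check 
described above. It is not the actual ChaosMonkey code: the class name 
KillGuardSketch, the method isKillLegal, and the list of boolean "active" 
flags are hypothetical stand-ins, and the threshold of 2 is taken only from 
the description in this comment.

```java
import java.util.Arrays;
import java.util.List;

public class KillGuardSketch {
    // Assumed threshold, based on the comment: fewer than 2 active means no kill.
    static final int MIN_ACTIVE = 2;

    /** Returns true only if the slice has at least MIN_ACTIVE active nodes. */
    static boolean isKillLegal(List<Boolean> activeFlags) {
        int numActive = 0;
        for (boolean active : activeFlags) {
            if (active) {
                numActive++;
            }
        }
        return numActive >= MIN_ACTIVE;
    }

    public static void main(String[] args) {
        // Two active nodes in the slice: a kill is allowed.
        System.out.println(isKillLegal(Arrays.asList(true, true, false)));  // true
        // Only one active node: the monkey must not kill anything.
        System.out.println(isKillLegal(Arrays.asList(true, false, false))); // false
    }
}
```

Under this reading, even a test that is allowed to kill leaders would skip the 
kill whenever a slice is down to a single active node.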

bq. I don't know when it started happening but it looks like this test never 
adds a document at all?

Answering my own question: FullThrottleStopableIndexingThread is only used on 
some runs, so the test does add documents, but not always.

bq. Yeah, I see many requests failing due to stale state and then "Not enough 
nodes to handle the request". I don't think we should mark node(s) as zombies 
if we exceed max retries on stale state.

This idea of mine was wrong. The node isn't being marked as a zombie, i.e. the 
failures have nothing to do with LbHttpSolrClient. There's something else at 
play here. Still digging.

> ChaosMonkeyNothingIsSafeTest fails with too many update fails
> -------------------------------------------------------------
>
>                 Key: SOLR-7118
>                 URL: https://issues.apache.org/jira/browse/SOLR-7118
>             Project: Solr
>          Issue Type: Bug
>          Components: SolrCloud, Tests
>    Affects Versions: 5.0
>            Reporter: Shalin Shekhar Mangar
>             Fix For: Trunk, 5.1
>
>         Attachments: SOLR-7118.patch
>
>
> There are frequent failures on both trunk and branch_5x with the following 
> message:
> {code}
> java.lang.AssertionError: There were too many update fails - we expect it can happen, but shouldn't easily
>       at __randomizedtesting.SeedInfo.seed([786DB0FD42626C16:F98B3EE5353D0C2A]:0)
>       at org.junit.Assert.fail(Assert.java:93)
>       at org.junit.Assert.assertTrue(Assert.java:43)
>       at org.junit.Assert.assertFalse(Assert.java:68)
>       at org.apache.solr.cloud.ChaosMonkeyNothingIsSafeTest.doTest(ChaosMonkeyNothingIsSafeTest.java:224)
>       at org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:878)
> {code}



