[ 
https://issues.apache.org/jira/browse/SOLR-11258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16305639#comment-16305639
 ] 

Steve Rowe commented on SOLR-11258:
-----------------------------------

This branch_7x nightly failure from my Jenkins reproduced for me in 5/5 iterations:

{noformat}
Checking out Revision 89344ea4c5c1c1f7da2797f0e724574751723976 
(refs/remotes/origin/branch_7x)
[...]
   [junit4]   2> NOTE: reproduce with: ant test  
-Dtestcase=ChaosMonkeySafeLeaderWithPullReplicasTest -Dtests.method=test 
-Dtests.seed=DD16DE67F3F5708A -Dtests.multiplier=2 -Dtests.nightly=true 
-Dtests.slow=true 
-Dtests.linedocsfile=/home/jenkins/lucene-data/enwiki.random.lines.txt 
-Dtests.locale=fr -Dtests.timezone=US/Indiana-Starke -Dtests.asserts=true 
-Dtests.file.encoding=UTF-8
   [junit4] FAILURE  104s J6 | ChaosMonkeySafeLeaderWithPullReplicasTest.test 
<<<
   [junit4]    > Throwable #1: java.lang.AssertionError: The Monkey ran for 
over 60 seconds and no jetties were stopped - this is worth investigating!
   [junit4]    >        at 
__randomizedtesting.SeedInfo.seed([DD16DE67F3F5708A:5542E1BD5D091D72]:0)
   [junit4]    >        at 
org.apache.solr.cloud.ChaosMonkey.stopTheMonkey(ChaosMonkey.java:589)
   [junit4]    >        at 
org.apache.solr.cloud.ChaosMonkeySafeLeaderWithPullReplicasTest.test(ChaosMonkeySafeLeaderWithPullReplicasTest.java:175)
   [junit4]    >        at 
org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsFixedStatement.callStatement(BaseDistributedSearchTestCase.java:993)
   [junit4]    >        at 
org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsStatement.evaluate(BaseDistributedSearchTestCase.java:968)
   [junit4]    >        at java.lang.Thread.run(Thread.java:748)
[...]
   [junit4]   2> NOTE: test params are: codec=Asserting(Lucene70): 
{rnd_b=TestBloomFilteredLucenePostings(BloomFilteringPostingsFormat(Lucene50(blocksize=128))),
 a_t=PostingsFormat(name=Memory), 
id=TestBloomFilteredLucenePostings(BloomFilteringPostingsFormat(Lucene50(blocksize=128)))},
 docValues:{_version_=DocValuesFormat(name=Asserting)}, 
maxPointsInLeafNode=590, maxMBSortInHeap=5.448183350014092, 
sim=RandomSimilarity(queryNorm=false): {}, locale=fr, timezone=US/Indiana-Starke
   [junit4]   2> NOTE: Linux 4.1.0-custom2-amd64 amd64/Oracle Corporation 
1.8.0_151 (64-bit)/cpus=16,threads=6,free=170276048,total=524288000
{noformat}

> ChaosMonkeySafeLeaderWithPullReplicasTest fails a lot & reproducibly:  The 
> Monkey ran for over 45 seconds and no jetties were stopped - this is worth 
> investigating!
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-11258
>                 URL: https://issues.apache.org/jira/browse/SOLR-11258
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Hoss Man
>
> Between June 21 & Aug 18, there have been 18 failures like this...
> {noformat}
>    [junit4]   2> NOTE: reproduce with: ant test  
> -Dtestcase=ChaosMonkeySafeLeaderWithPullReplicasTest -Dtests.method=test 
> -Dtests.seed=7669B63E9E4D1685 -Dtests.nightly=true -Dtests.slow=true 
> -Dtests.locale=pa-Guru -Dtests.timezone=Europe/Podgorica -Dtests.asserts=true 
> -Dtests.file.encoding=UTF-8
>    [junit4] FAILURE 82.4s | ChaosMonkeySafeLeaderWithPullReplicasTest.test <<<
>    [junit4]    > Throwable #1: java.lang.AssertionError: The Monkey ran for 
> over 45 seconds and no jetties were stopped - this is worth investigating!
>    [junit4]    >        at 
> __randomizedtesting.SeedInfo.seed([7669B63E9E4D1685:FE3D89E430B17B7D]:0)
>    [junit4]    >        at 
> org.apache.solr.cloud.ChaosMonkey.stopTheMonkey(ChaosMonkey.java:587)
>    [junit4]    >        at 
> org.apache.solr.cloud.ChaosMonkeySafeLeaderWithPullReplicasTest.test(ChaosMonkeySafeLeaderWithPullReplicasTest.java:174)
>    [junit4]    >        at 
> org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsFixedStatement.callStatement(BaseDistributedSearchTestCase.java:993)
>    [junit4]    >        at 
> org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsStatement.evaluate(BaseDistributedSearchTestCase.java:968)
>    [junit4]    >        at java.lang.Thread.run(Thread.java:748)
> {noformat}
> In my own testing, when these failures happen, the seeds reproduce - 
> suggesting the problem is a logic flaw in the test that can happen by 
> chance.
> Perhaps the ChaosMonkey needs to be changed to get more aggressive about 
> stopping nodes based on how long it's been since the last time it stopped a 
> node?
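
The idea in the description (stop nodes more aggressively the longer it has been since the last stop) could be sketched roughly as below. This is only an illustration and not the actual ChaosMonkey code: the class name EscalatingStopPolicy, its fields, and the linear ramp are all assumptions made up for this example. The chance of stopping a node starts at a base value and climbs to 1.0 once a configured quiet period has passed, so a run can no longer finish with zero stops purely by chance.

```java
import java.util.Random;

/**
 * Hypothetical policy, NOT part of Solr's ChaosMonkey: decides whether to
 * stop a node, with a probability that grows the longer no node was stopped.
 */
public class EscalatingStopPolicy {
    private final double baseProbability; // chance of a stop right after the previous stop
    private final long maxQuietMillis;    // after this much quiet time a stop is certain
    private long lastStopMillis;          // time of the most recent stop (or of the start)

    public EscalatingStopPolicy(double baseProbability, long maxQuietMillis, long startMillis) {
        if (baseProbability < 0.0 || baseProbability > 1.0) {
            throw new IllegalArgumentException("baseProbability must be in [0, 1]");
        }
        if (maxQuietMillis <= 0) {
            throw new IllegalArgumentException("maxQuietMillis must be positive");
        }
        this.baseProbability = baseProbability;
        this.maxQuietMillis = maxQuietMillis;
        this.lastStopMillis = startMillis;
    }

    /** Probability rises linearly from baseProbability to 1.0 over maxQuietMillis. */
    public double stopProbability(long nowMillis) {
        long quietMillis = Math.max(0L, nowMillis - lastStopMillis);
        if (quietMillis >= maxQuietMillis) {
            return 1.0; // quiet for too long: force a stop
        }
        double fraction = (double) quietMillis / maxQuietMillis;
        return baseProbability + (1.0 - baseProbability) * fraction;
    }

    /** Rolls the dice; on a stop, resets the quiet-time clock. */
    public boolean shouldStop(long nowMillis, Random random) {
        boolean stop = random.nextDouble() < stopProbability(nowMillis);
        if (stop) {
            lastStopMillis = nowMillis;
        }
        return stop;
    }
}
```

With something like new EscalatingStopPolicy(0.1, 45000L, startMillis), each monkey iteration would call shouldStop(now, random) using the test's seeded Random, so a failing seed would still reproduce. The 45000 here only mirrors the 45 second threshold in the assertion message; the real timings would need to come from the test.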



