[ 
https://issues.apache.org/jira/browse/SOLR-6640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14249708#comment-14249708
 ] 

Shalin Shekhar Mangar commented on SOLR-6640:
---------------------------------------------

I am looking at this failure too and I see another bug. I was wondering why the 
replica had these writes in the first place, considering that recovery on 
startup wasn't complete yet.

# RecoveryStrategy publishes the state of the replica as 'recovering' before it 
sets the update log to buffering mode, which is why the leader sends updates to 
this replica that affect the index.
# The test itself doesn't wait for a steady state e.g. by calling 
waitForRecovery or waitForThingsToLevelOut before starting the indexing 
threads. This is probably a good thing because that's what has helped us find 
this problem.
# Shouldn't the peersync also be done while the update log is set to buffering mode?
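
To make the ordering problem in the first point concrete, here is a minimal, single-threaded sketch. It is a toy model only: ToyReplica, LogMode, publishThenBuffer and bufferThenPublish are hypothetical names invented for illustration and are not Solr classes or methods. It assumes the leader forwards updates only to a replica that has published its state, and that an update arrives immediately after publishing.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the publish/buffer ordering; none of these names are Solr APIs. */
class RecoveryOrderingSketch {

    enum LogMode { ACTIVE, BUFFERING }

    /** Hypothetical stand-in for a replica that is starting recovery. */
    static class ToyReplica {
        boolean publishedRecovering = false;              // state visible to the leader
        LogMode logMode = LogMode.ACTIVE;                 // stand-in for the update log mode
        final List<String> index = new ArrayList<>();     // updates applied to the index
        final List<String> buffered = new ArrayList<>();  // updates held back for replay

        /** Assumption: the leader only forwards to replicas that have published a state. */
        void receiveFromLeader(String doc) {
            if (!publishedRecovering) {
                return;
            }
            if (logMode == LogMode.BUFFERING) {
                buffered.add(doc);
            } else {
                index.add(doc);
            }
        }
    }

    /** The order described in the comment: publish first, start buffering second. */
    static ToyReplica publishThenBuffer(String inFlightDoc) {
        ToyReplica r = new ToyReplica();
        r.publishedRecovering = true;
        r.receiveFromLeader(inFlightDoc);  // arrives in the window before buffering starts
        r.logMode = LogMode.BUFFERING;
        return r;
    }

    /** The alternative order: start buffering first, publish second. */
    static ToyReplica bufferThenPublish(String inFlightDoc) {
        ToyReplica r = new ToyReplica();
        r.logMode = LogMode.BUFFERING;
        r.publishedRecovering = true;
        r.receiveFromLeader(inFlightDoc);  // same update, now buffered
        return r;
    }

    public static void main(String[] args) {
        ToyReplica a = publishThenBuffer("doc1");
        ToyReplica b = bufferThenPublish("doc1");
        System.out.println("publish-then-buffer: index=" + a.index + " buffered=" + a.buffered);
        System.out.println("buffer-then-publish: index=" + b.index + " buffered=" + b.buffered);
    }
}
```

In this model the only difference between the two runs is the order of the two steps. With publish-then-buffer the update lands in the index of a replica that has not finished recovering; with buffer-then-publish it is held in the buffer instead.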

{quote}
So it's these files which are not getting removed when we do IW.rollback that 
were causing the problem - 
_0.cfe _0.cfs _0.si _0_1.liv _1.fdt _1.fdx
I am yet to figure out whether these files should have been removed by 
IW.rollback() or not?
{quote}

These files hang around because an IndexReader is open using the IndexWriter 
due to soft commit(s).
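
As a rough illustration of why an open reader can keep files alive across a rollback, here is a toy reference-counting model. It is an assumption-laden sketch, not Lucene's actual file deletion logic: FileRefCountSketch, incRef, decRef and liveFiles are hypothetical names, and the file names are made up rather than taken from the listing quoted above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy reference counter for index files; not a Lucene class. */
class FileRefCountSketch {
    private final Map<String, Integer> refs = new HashMap<>();

    /** Record one more holder of the file. */
    void incRef(String file) {
        refs.merge(file, 1, Integer::sum);
    }

    /** Drop one holder; the file is "deleted" when the last holder lets go. */
    void decRef(String file) {
        refs.computeIfPresent(file, (name, count) -> count == 1 ? null : count - 1);
    }

    /** Files that still have at least one holder. */
    Set<String> liveFiles() {
        return new TreeSet<>(refs.keySet());
    }

    public static void main(String[] args) {
        FileRefCountSketch files = new FileRefCountSketch();

        // The writer holds both hypothetical files.
        files.incRef("shared.seg");
        files.incRef("writerOnly.seg");

        // A reader opened from the writer (as on a soft commit) also holds one of them.
        files.incRef("shared.seg");

        // Model of a rollback: the writer releases everything it held.
        files.decRef("shared.seg");
        files.decRef("writerOnly.seg");
        System.out.println("after rollback: " + files.liveFiles());

        // Only when the reader is closed does the shared file go away.
        files.decRef("shared.seg");
        System.out.println("after reader close: " + files.liveFiles());
    }
}
```

In this model the file held by the reader survives the rollback and disappears only once the reader releases it, which matches the behaviour described above in spirit only.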

> ChaosMonkeySafeLeaderTest failure with CorruptIndexException
> ------------------------------------------------------------
>
>                 Key: SOLR-6640
>                 URL: https://issues.apache.org/jira/browse/SOLR-6640
>             Project: Solr
>          Issue Type: Bug
>          Components: replication (java)
>    Affects Versions: 5.0
>            Reporter: Shalin Shekhar Mangar
>             Fix For: 5.0
>
>         Attachments: Lucene-Solr-5.x-Linux-64bit-jdk1.8.0_20-Build-11333.txt, 
> SOLR-6640.patch, SOLR-6640.patch
>
>
> Test failure found on jenkins:
> http://jenkins.thetaphi.de/job/Lucene-Solr-5.x-Linux/11333/
> {code}
> 1 tests failed.
> REGRESSION:  org.apache.solr.cloud.ChaosMonkeySafeLeaderTest.testDistribSearch
> Error Message:
> shard2 is not consistent.  Got 62 from 
> http://127.0.0.1:57436/collection1lastClient and got 24 from 
> http://127.0.0.1:53065/collection1
> Stack Trace:
> java.lang.AssertionError: shard2 is not consistent.  Got 62 from 
> http://127.0.0.1:57436/collection1lastClient and got 24 from 
> http://127.0.0.1:53065/collection1
>         at 
> __randomizedtesting.SeedInfo.seed([F4B371D421E391CD:7555FFCC56BCF1F1]:0)
>         at org.junit.Assert.fail(Assert.java:93)
>         at 
> org.apache.solr.cloud.AbstractFullDistribZkTestBase.checkShardConsistency(AbstractFullDistribZkTestBase.java:1255)
>         at 
> org.apache.solr.cloud.AbstractFullDistribZkTestBase.checkShardConsistency(AbstractFullDistribZkTestBase.java:1234)
>         at 
> org.apache.solr.cloud.ChaosMonkeySafeLeaderTest.doTest(ChaosMonkeySafeLeaderTest.java:162)
>         at 
> org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:869)
> {code}
> Cause of inconsistency is:
> {code}
> Caused by: org.apache.lucene.index.CorruptIndexException: file mismatch, 
> expected segment id=yhq3vokoe1den2av9jbd3yp8, got=yhq3vokoe1den2av9jbd3yp7 
> (resource=BufferedChecksumIndexInput(MMapIndexInput(path="/mnt/ssd/jenkins/workspace/Lucene-Solr-5.x-Linux/solr/build/solr-core/test/J0/temp/solr.cloud.ChaosMonkeySafeLeaderTest-F4B371D421E391CD-001/tempDir-001/jetty3/index/_1_2.liv")))
>    [junit4]   2>              at 
> org.apache.lucene.codecs.CodecUtil.checkSegmentHeader(CodecUtil.java:259)
>    [junit4]   2>              at 
> org.apache.lucene.codecs.lucene50.Lucene50LiveDocsFormat.readLiveDocs(Lucene50LiveDocsFormat.java:88)
>    [junit4]   2>              at 
> org.apache.lucene.codecs.asserting.AssertingLiveDocsFormat.readLiveDocs(AssertingLiveDocsFormat.java:64)
>    [junit4]   2>              at 
> org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:102)
> {code}


