[ https://issues.apache.org/jira/browse/LUCENE-8692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hoss Man updated LUCENE-8692:
-----------------------------
    Summary: IndexWriter.getTragicException() may not reflect all corrupting exceptions (notably: NoSuchFileException)  (was: IndexWriter.getTragicException() nay not reflect all corrupting exceptions (notably: NoSuchFileException))

> IndexWriter.getTragicException() may not reflect all corrupting exceptions (notably: NoSuchFileException)
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-8692
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8692
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Hoss Man
>            Priority: Major
>         Attachments: LUCENE-8692.patch, LUCENE-8692.patch, LUCENE-8692.patch, LUCENE-8692_test.patch
>
>
> Backstory...
> Solr has a "LeaderTragicEventTest" which uses MockDirectoryWrapper's
> {{corruptFiles}} to introduce corruption into the "leader" node's index and
> then asserts that this Solr node gives up its leadership of the shard and
> another replica takes over.
> This can currently fail sporadically (but usually reproducibly - see
> SOLR-13237) due to the leader not giving up its leadership even after the
> corruption causes an update/commit to fail. Solr's leadership code makes
> this decision after encountering an exception from the IndexWriter, based on
> whether {{IndexWriter.getTragicException()}} is (non-)null.
> ----
> While investigating this, I created an isolated Lucene-Core equivalent test
> that demonstrates the same basic situation:
> * Gradually cause corruption on an index until (otherwise) valid execution
> of IW.add() + IW.commit() calls throws an exception to the IW client.
> * Assert that if an exception is thrown to the IW client,
> {{getTragicException()}} is now non-null.
> It's fairly easy to make my new test fail reproducibly -- in every situation
> I've seen, the underlying exception is a {{NoSuchFileException}} (i.e., the
> randomly introduced corruption was to delete some file).
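
The invariant the test asserts can be sketched in isolation. What follows is a toy model using only the JDK, not the attached patch and not Lucene's API: the {{Writer}} interface, {{addAndCommit()}}, {{failingWriter()}} and the file name "_0.cfs" are all hypothetical stand-ins, and only {{getTragicException()}} mirrors a method name from the report. It shows the check "if the client saw an exception, the tragic exception must be non-null", and how a writer that throws {{NoSuchFileException}} without recording it would violate that check.

```java
import java.io.IOException;
import java.nio.file.NoSuchFileException;

public class TragicCheckSketch {

    /** Hypothetical stand-in for the two IndexWriter behaviours the test relies on. */
    public interface Writer {
        void addAndCommit() throws IOException;   // models IW.add() + IW.commit()
        Throwable getTragicException();           // null if no tragedy was recorded
    }

    /**
     * The invariant from the report: if an exception reaches the client,
     * getTragicException() must be non-null. Returns true when it holds.
     */
    public static boolean invariantHolds(Writer w) {
        try {
            w.addAndCommit();
            return true; // nothing was thrown, so there is nothing to check
        } catch (IOException e) {
            return w.getTragicException() != null;
        }
    }

    /**
     * Toy writer (an assumption for illustration) that always fails with
     * NoSuchFileException and optionally records it as the tragic exception.
     */
    public static Writer failingWriter(boolean recordsTragedy) {
        return new Writer() {
            private Throwable tragedy;

            @Override
            public void addAndCommit() throws IOException {
                IOException e = new NoSuchFileException("_0.cfs"); // made-up file name
                if (recordsTragedy) {
                    tragedy = e;
                }
                throw e;
            }

            @Override
            public Throwable getTragicException() {
                return tragedy;
            }
        };
    }

    public static void main(String[] args) {
        System.out.println(invariantHolds(failingWriter(true)));  // prints "true"
        System.out.println(invariantHolds(failingWriter(false))); // prints "false"
    }
}
```

The second case corresponds to the failure mode described above: the client sees an exception, yet a caller that decides based on {{getTragicException()}} alone (as Solr's leadership code is described as doing) would conclude nothing tragic happened.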