[
https://issues.apache.org/jira/browse/LUCENE-4596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13526741#comment-13526741
]
Shai Erera commented on LUCENE-4596:
------------------------------------
Ok, that was an interesting experience. Mike and I chatted about it while I was
w/o the code: I'd ask Mike to look here, he'd paste code, then look there, he'd
paste some more ... like playing blind chess!
And then bam! Bug found (we think; Mike is still beasting). The code first
updates DTW.cache and only then updates the parents array. So what probably
happens is:
* T1 calls addCategory(123) and updates the cache.
* Context switch: T2 calls addCategory(123) and is told that 123 was found.
* T2 calls getParent(123), BOOM! parentArray has not yet been updated by T1.
Simple fix: swap the two lines in addCategoryDocument. The cache should always
be updated last!
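To make the ordering concrete, here is a minimal sketch of the idea, not the
actual DirectoryTaxonomyWriter code. The class TaxoWriterSketch, its fields,
and the String-keyed addCategory signature are all hypothetical stand-ins; the
only point is that the parent entry is written before the ordinal becomes
visible through the cache, so a thread that gets a cache hit can always call
getParent safely.

```java
import java.util.Arrays;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical, simplified stand-in for a taxonomy writer (not Lucene code). */
class TaxoWriterSketch {
    static final int NO_PARENT = -1;

    // Category name -> ordinal. Readers consult this without taking the lock.
    private final ConcurrentHashMap<String, Integer> cache = new ConcurrentHashMap<>();
    // parents[ordinal] is the parent's ordinal. Volatile so that a grown copy
    // is safely published to threads that read it without the lock.
    private volatile int[] parents = new int[4];
    private int nextOrdinal = 0; // guarded by "this"

    int addCategory(String name, int parentOrdinal) {
        Integer cached = cache.get(name); // lock-free fast path
        if (cached != null) {
            return cached;
        }
        synchronized (this) {
            cached = cache.get(name); // re-check under the lock
            if (cached != null) {
                return cached;
            }
            int ordinal = nextOrdinal++;
            int[] p = parents;
            if (ordinal >= p.length) {
                p = Arrays.copyOf(p, p.length * 2);
            }
            p[ordinal] = parentOrdinal; // 1. record the parent first
            parents = p;                //    and publish the (possibly grown) array
            cache.put(name, ordinal);   // 2. update the cache last
            return ordinal;
        }
    }

    int getParent(int ordinal) {
        return parents[ordinal];
    }

    public static void main(String[] args) throws InterruptedException {
        TaxoWriterSketch w = new TaxoWriterSketch();
        int root = w.addCategory("root", NO_PARENT);
        // Two threads add the same categories and immediately ask for the parent.
        Runnable task = () -> {
            for (int i = 0; i < 1000; i++) {
                int ordinal = w.addCategory("cat" + i, root);
                if (w.getParent(ordinal) != root) {
                    throw new IllegalStateException("wrong parent for " + ordinal);
                }
            }
        };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        System.out.println("done");
    }
}
```

With steps 1 and 2 reversed, the sketch would have the same window as above:
a second thread could get a cache hit for an ordinal whose slot in parents
does not exist yet.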
It's late and the weekend here. I'll do some beasting too and, if all goes
well, will commit the fix by Sunday!
Thanks, Mike, for guiding me in the dark! :)
> DirectoryTaxonomyWriter concurrency bug
> ---------------------------------------
>
> Key: LUCENE-4596
> URL: https://issues.apache.org/jira/browse/LUCENE-4596
> Project: Lucene - Core
> Issue Type: Bug
> Components: modules/facet
> Reporter: Shai Erera
> Assignee: Shai Erera
> Priority: Blocker
> Fix For: 4.1, 5.0
>
> Attachments: LUCENE-4596.patch
>
>
> Mike tripped this error while running some benchmarks:
> {noformat}
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 130
> at org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter.getParent(DirectoryTaxonomyWriter.java:835)
> at org.apache.lucene.facet.index.streaming.CategoryParentsStream.incrementToken(CategoryParentsStream.java:106)
> at org.apache.lucene.facet.index.streaming.CountingListTokenizer.incrementToken(CountingListTokenizer.java:63)
> at org.apache.lucene.facet.index.streaming.CategoryTokenizer.incrementToken(CategoryTokenizer.java:48)
> at org.apache.lucene.index.DocInverterPerField.processFields(DocInverterPerField.java:177)
> at org.apache.lucene.index.DocFieldProcessor.processDocument(DocFieldProcessor.java:272)
> at org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:250)
> at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:376)
> at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1455)
> at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1130)
> at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1111)
> at perf.IndexThreads$IndexThread.run(IndexThreads.java:335)
> {noformat}
> At first we thought this might be related to LUCENE-4565, but he reverted to
> before that commit and still hit the exception. I modified
> TestDirTaxoWriter.testConcurrency to index hierarchical categories, thinking
> that's the cause, but failed to reproduce.
> Eventually I realized that the test doesn't call getParent(), because it
> tests DirTaxoWriter concurrency, not concurrent indexing. As soon as I added
> a call to getParent, I hit this exception too.
> Adding 'synchronized' to DirTaxoWriter.addCategory seems to avoid that
> exception. I'll upload a patch with the modifications to the test and dig.