[ 
https://issues.apache.org/jira/browse/LUCENE-4596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13526741#comment-13526741
 ] 

Shai Erera commented on LUCENE-4596:
------------------------------------

Ok, that was an interesting experience. Mike and I chatted about it while I was 
w/o the code: I'd ask Mike to look here, he'd paste code, then look there, he'd 
paste more code ... like playing blind chess!

And then bam! Bug found (we think; Mike is still beasting). The code first 
updates DTW.cache, then updates the parents array. So what probably happens is:

* T1 calls addCategory(123) and updates the cache
* context switch: T2 calls addCategory(123) and is told that 123 is already there
* T2 calls getParent(123), BOOM! The parents array has not yet been updated by T1

Simple fix: swap the two lines in addCategoryDocument. The cache should always 
be updated last!
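
To illustrate the ordering, here is a minimal sketch. It is not the actual 
DirectoryTaxonomyWriter code: the class TinyTaxonomy, its fields and its method 
signatures are all made up for the example. It only assumes what is described 
above, i.e. a cache that threads consult first and a parents array that 
getParent() reads. The parent entry is written before the ordinal is published 
in the cache, so a thread that finds the ordinal in the cache can safely call 
getParent() on it.

```java
import java.util.Arrays;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the taxonomy writer; all names are illustrative.
final class TinyTaxonomy {
    // Consulted without a lock, so an entry must only appear once it is safe to use.
    private final ConcurrentHashMap<String, Integer> cache = new ConcurrentHashMap<>();
    // Replaced wholesale on every add; volatile so readers see the latest array.
    private volatile int[] parents = new int[0];
    // Guarded by "this".
    private int nextOrdinal = 0;

    int addCategory(String path, int parentOrdinal) {
        Integer cached = cache.get(path); // fast path, no lock
        if (cached != null) {
            return cached;
        }
        synchronized (this) {
            cached = cache.get(path); // re-check under the lock
            if (cached != null) {
                return cached;
            }
            int ordinal = nextOrdinal++;
            // Step 1: record the parent first (copy-on-write keeps the sketch simple).
            int[] grown = Arrays.copyOf(parents, ordinal + 1);
            grown[ordinal] = parentOrdinal;
            parents = grown;
            // Step 2: publish the ordinal in the cache LAST.
            cache.put(path, ordinal);
            return ordinal;
        }
    }

    int getParent(int ordinal) {
        return parents[ordinal];
    }
}
```

If steps 1 and 2 were swapped, a second thread could take the fast path, get 
the new ordinal back, and index past the end of the parents array, which is 
the kind of ArrayIndexOutOfBoundsException reported in this issue. Copying the 
array on every add is wasteful and is done here only to keep the example short.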

It's late and the weekend here. I'll do some beasting too and, if all goes 
well, will commit the fix by Sunday!

Thanks Mike for guiding me in the dark ! :)

                
> DirectoryTaxonomyWriter concurrency bug
> ---------------------------------------
>
>                 Key: LUCENE-4596
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4596
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: modules/facet
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>            Priority: Blocker
>             Fix For: 4.1, 5.0
>
>         Attachments: LUCENE-4596.patch
>
>
> Mike tripped this error while running some benchmarks:
> {noformat}
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 130
>         at org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter.getParent(DirectoryTaxonomyWriter.java:835)
>         at org.apache.lucene.facet.index.streaming.CategoryParentsStream.incrementToken(CategoryParentsStream.java:106)
>         at org.apache.lucene.facet.index.streaming.CountingListTokenizer.incrementToken(CountingListTokenizer.java:63)
>         at org.apache.lucene.facet.index.streaming.CategoryTokenizer.incrementToken(CategoryTokenizer.java:48)
>         at org.apache.lucene.index.DocInverterPerField.processFields(DocInverterPerField.java:177)
>         at org.apache.lucene.index.DocFieldProcessor.processDocument(DocFieldProcessor.java:272)
>         at org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:250)
>         at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:376)
>         at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1455)
>         at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1130)
>         at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1111)
>         at perf.IndexThreads$IndexThread.run(IndexThreads.java:335)
> {noformat}
> At first we thought this might be related to LUCENE-4565, but he reverted to 
> before that commit and still hit the exception. I modified 
> TestDirTaxoWriter.testConcurrency to index hierarchical categories, thinking 
> that's the cause, but failed to reproduce.
> Eventually I realized that the test doesn't call getParent(), because it 
> tests DirTaxoWriter concurrency, not concurrent indexing. As soon as I added 
> a call to getParent, I hit this exception too.
> Adding 'synchronized' to DirTaxoWriter.addCategory seems to avoid that exception.
> I'll upload a patch with the modifications to the test and dig.

