[ 
https://issues.apache.org/jira/browse/LUCENE-5002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13660586#comment-13660586
 ] 

Simon Willnauer commented on LUCENE-5002:
-----------------------------------------

OK, so I tried to make this work for an entire day, and the bottom line is that 
once I move DocumentsWriter#abort() out of the sync block my test still fails 
all over the place. It no longer hangs, but concurrent access to IW while 
IW#deleteAll() is called is entirely broken IMO. I don't even know where to 
start; here is a small wrap-up of the failures I saw:
 - asserts are tripped in the global field map since we clear it and 
concurrently index (remember indexing is non-blocking)
 - concurrent commits fail with a file-not-found exception (even if we fully 
sync); it seems like some state in IW is not cleared
 - updatePendingMerges fails with FNF when merges are updated concurrently.

To begin with, I doubt that the semantics of IW#deleteAll() are correct today 
if you are accessing the IW concurrently. I mean, we are basically dropping 
everything without maintaining any happens-before relationship at all: we 
delete all files that are not referenced in any seg info, wipe all the global 
field infos, etc. We should address this properly.

I agree that we have to fix this for 4.3.1!

Yet, Sergiusz, do you see any FileNotFoundExceptions or anything similar when 
you concurrently call deleteAll? I mean, this seems entirely broken to me at 
this point. For now, I suggest you use deleteDocuments(new MatchAllDocsQuery()) 
and do not lock globally. 

simon
                
> Deadlock in DocumentsWriterFlushControl
> ---------------------------------------
>
>                 Key: LUCENE-5002
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5002
>             Project: Lucene - Core
>          Issue Type: Bug
>    Affects Versions: 4.3
>         Environment: OpenJDK 64-Bit Server VM (23.7-b01 mixed mode)
> Linux Ubuntu Server 12.04 LTS 64-Bit
>            Reporter: Sergiusz Urbaniak
>            Assignee: Simon Willnauer
>             Fix For: 5.0, 4.4, 4.3.1
>
>         Attachments: LUCENE-5002_test.patch
>
>
> Hi all,
> We have an obvious deadlock between a "MaybeRefreshIndexJob" thread
> calling ReferenceManager.maybeRefresh(ReferenceManager.java:204) and a
> "RebuildIndexJob" thread calling
> IndexWriter.deleteAll(IndexWriter.java:2065).
> Lucene wants to flush in the "MaybeRefreshIndexJob" thread, which tries to 
> intrinsically lock the IndexWriter instance at 
> {{DocumentsWriterPerThread.java:563}} before notifyAll()ing the flush. 
> Simultaneously, the "RebuildIndexJob" thread, which has already intrinsically 
> locked the IndexWriter instance at IndexWriter#deleteAll, wait()s at 
> {{DocumentsWriterFlushControl.java:245}} for the flush forever, causing a 
> deadlock.
> {code}
> "MaybeRefreshIndexJob Thread - 2" daemon prio=10 tid=0x00007f8fe4006000 nid=0x1ac2 waiting for monitor entry [0x00007f8fa7bf7000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>       at org.apache.lucene.index.IndexWriter.useCompoundFile(IndexWriter.java:2223)
>       - waiting to lock <0x00000000f1c00438> (a org.apache.lucene.index.IndexWriter)
>       at org.apache.lucene.index.DocumentsWriterPerThread.sealFlushedSegment(DocumentsWriterPerThread.java:563)
>       at org.apache.lucene.index.DocumentsWriterPerThread.flush(DocumentsWriterPerThread.java:533)
>       at org.apache.lucene.index.DocumentsWriter.doFlush(DocumentsWriter.java:422)
>       at org.apache.lucene.index.DocumentsWriter.flushAllThreads(DocumentsWriter.java:559)
>       at org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:365)
>       - locked <0x00000000f1c007d0> (a java.lang.Object)
>       at org.apache.lucene.index.StandardDirectoryReader.doOpenFromWriter(StandardDirectoryReader.java:270)
>       at org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:245)
>       at org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:235)
>       at org.apache.lucene.index.DirectoryReader.openIfChanged(DirectoryReader.java:170)
>       at org.apache.lucene.search.SearcherManager.refreshIfNeeded(SearcherManager.java:118)
>       at org.apache.lucene.search.SearcherManager.refreshIfNeeded(SearcherManager.java:58)
>       at org.apache.lucene.search.ReferenceManager.doMaybeRefresh(ReferenceManager.java:155)
>       at org.apache.lucene.search.ReferenceManager.maybeRefresh(ReferenceManager.java:204)
>       at jobs.MaybeRefreshIndexJob.timeout(MaybeRefreshIndexJob.java:47)
> "RebuildIndexJob Thread - 1" prio=10 tid=0x00007f903000a000 nid=0x1a38 in Object.wait() [0x00007f9037dd6000]
>    java.lang.Thread.State: WAITING (on object monitor)
>       at java.lang.Object.wait(Native Method)
>       - waiting on <0x00000000f1c0c240> (a org.apache.lucene.index.DocumentsWriterFlushControl)
>       at java.lang.Object.wait(Object.java:503)
>       at org.apache.lucene.index.DocumentsWriterFlushControl.waitForFlush(DocumentsWriterFlushControl.java:245)
>       - locked <0x00000000f1c0c240> (a org.apache.lucene.index.DocumentsWriterFlushControl)
>       at org.apache.lucene.index.DocumentsWriter.abort(DocumentsWriter.java:235)
>       - locked <0x00000000f1c05370> (a org.apache.lucene.index.DocumentsWriter)
>       at org.apache.lucene.index.IndexWriter.deleteAll(IndexWriter.java:2065)
>       - locked <0x00000000f1c00438> (a org.apache.lucene.index.IndexWriter)
>       at jobs.RebuildIndexJob.buildIndex(RebuildIndexJob.java:102)
> {code}
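
The lock pattern in the two traces above can be reproduced with plain JDK 
monitors. What follows is only a rough sketch, not Lucene code: the class 
NestedMonitorSketch, its fields writer and flushControl, and the methods 
waitForFlush(), finishFlush() and run() are hypothetical stand-ins for the 
IndexWriter monitor, the DocumentsWriterFlushControl monitor and the two job 
threads. It shows that wait() releases only the monitor it is called on, so a 
thread that waits while still holding a second monitor blocks whoever needs 
that second monitor in order to deliver the notification. It also shows why 
waiting outside the outer sync block stops the hang; it says nothing about 
whether the resulting concurrent semantics are correct.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class NestedMonitorSketch {
    // Hypothetical stand-ins: 'writer' plays the IndexWriter monitor,
    // 'flushControl' plays the DocumentsWriterFlushControl monitor.
    private final Object writer = new Object();
    private final Object flushControl = new Object();
    private boolean flushPending = true; // guarded by flushControl

    private void waitForFlush() throws InterruptedException {
        synchronized (flushControl) {
            while (flushPending) {
                flushControl.wait(); // releases flushControl only, never 'writer'
            }
        }
    }

    private void finishFlush() {
        synchronized (flushControl) {
            flushPending = false;
            flushControl.notifyAll();
        }
    }

    /** Returns true if the "rebuild" thread finished within timeoutMillis. */
    public static boolean run(boolean waitWhileHoldingWriter, long timeoutMillis) {
        final NestedMonitorSketch s = new NestedMonitorSketch();
        final CountDownLatch rebuildStarted = new CountDownLatch(1);
        final CountDownLatch rebuildDone = new CountDownLatch(1);

        Thread rebuild = new Thread(() -> {
            try {
                if (waitWhileHoldingWriter) {
                    synchronized (s.writer) {   // like the synchronized deleteAll
                        rebuildStarted.countDown();
                        s.waitForFlush();       // waits with 'writer' still held
                    }
                } else {
                    rebuildStarted.countDown();
                    s.waitForFlush();           // wait first, no 'writer' monitor held
                    synchronized (s.writer) {
                        // the state reset would happen here
                    }
                }
                rebuildDone.countDown();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        Thread flusher = new Thread(() -> {
            try {
                rebuildStarted.await();
                synchronized (s.writer) {
                    // the flush needs the writer monitor before it can complete
                }
                s.finishFlush();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        rebuild.setDaemon(true);
        flusher.setDaemon(true);
        rebuild.start();
        flusher.start();
        try {
            return rebuildDone.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            rebuild.interrupt(); // unblocks a stuck wait() so both threads can exit
            flusher.interrupt();
        }
    }

    public static void main(String[] args) {
        // prints "wait while holding writer monitor: finished=false"
        System.out.println("wait while holding writer monitor: finished=" + run(true, 500));
        // prints "wait before taking writer monitor: finished=true"
        System.out.println("wait before taking writer monitor: finished=" + run(false, 5000));
    }
}
```

The timeout stands in for the "forever" in the report: in the first case 
nothing ever wakes the waiting thread until it is interrupted.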

