[ 
https://issues.apache.org/jira/browse/LUCENE-5885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14097149#comment-14097149
 ] 

Shai Erera commented on LUCENE-5885:
------------------------------------

I ran tests after the fix on LUCENE-5871, but I get this failure:

{noformat}
ant test  -Dtestcase=TestIndexWriter -Dtests.method=testThreadInterruptDeadlock 
-Dtests.seed=80C0F4CAF8F7E5D8 -Dtests.slow=true -Dtests.locale=bg_BG 
-Dtests.timezone=Africa/Blantyre -Dtests.file.encoding=US-ASCII
{noformat}

The test fails because a MergeThread hits AlreadyClosedException. I added some 
System.out.println calls, and I think it's caused by how this test interrupts 
the indexing thread. What I see is that the indexing thread enters 
IW.shutdown(), calls flush, then finishMerges, but never gets to call 
commitInternal, because it hits an interrupt. I therefore assume the indexing 
thread is inside finishMerges(true), which calls waitForMerges(). When the 
interrupt arrives there, it throws a ThreadInterruptedException.

With this patch, that means pending or running merges are left behind. Since 
IW.rollbackInternal no longer calls MS.close() (which for CMS meant finishing 
those merges, while ignoring InterruptedException), a merge is still running 
when the writer closes, hence the exception.

In fact, I think the way abortMerges() works isn't safe. After we notify all 
merges to abort, we wait on the running merges. But if we hit an 
InterruptedException while waiting, we just rethrow it and leave running 
merges behind. They will die at some point, since we marked their merges as 
aborted, but because this is rollback and we do EVERYTHING to close this 
writer, we proceed anyway, and so the background merge hits the 
AlreadyClosedException. I quickly tried to wait and ignore 
InterruptedException until success, but the test then failed on being 
interrupted ... I'll debug it later.
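
For reference, the "wait and ignore InterruptedException until success" idea 
can be sketched roughly as below. This is only an illustration of the general 
pattern, not the code in the patch: the class UninterruptibleJoin and its 
method joinAll are hypothetical names, and plain Thread.join() stands in for 
waiting on the running merges. As noted above, my quick attempt along these 
lines still made the test fail on the interrupt, so treat it as a starting 
point for debugging only.

```java
import java.util.List;

/** Hypothetical helper for illustration; not part of Lucene. */
final class UninterruptibleJoin {

  private UninterruptibleJoin() {}

  /**
   * Waits until every thread has terminated. An interrupt received while
   * waiting does not abort the wait; it is remembered and the interrupt
   * flag is restored before returning, so the caller can still see it.
   */
  static void joinAll(List<Thread> threads) {
    boolean interrupted = false;
    try {
      for (Thread t : threads) {
        while (true) {
          try {
            t.join();
            break; // this thread is done, move on to the next one
          } catch (InterruptedException e) {
            interrupted = true; // remember the interrupt and keep waiting
          }
        }
      }
    } finally {
      if (interrupted) {
        // Restore the flag that join() cleared when it threw.
        Thread.currentThread().interrupt();
      }
    }
  }
}
```

The point of restoring the flag at the end is that the caller (here, the test 
that interrupts the indexing thread) can still observe the interrupt, while no 
background thread is left running once the wait returns.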

> MergeScheduler should not implement Closeable
> ---------------------------------------------
>
>                 Key: LUCENE-5885
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5885
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/index
>            Reporter: Shai Erera
>         Attachments: LUCENE-5885.patch
>
>
> MergeScheduler implements Closeable and IndexWriter calls ms.close() when 
> it's closed. But MergeScheduler can be shared between several writers, which 
> means closing it by any particular writer is wrong. We should rather 
> implement some ref-counting logic such that each IW will call incRef() in the 
> ctor, and decRef() on close(), and MergeScheduler will truly close when the 
> ref-count hits 0.
> As it is now, if you share a MergeScheduler between writers and close() does 
> something terminating, I doubt if it really works.
> Also, when I look at ConcurrentMergeScheduler.close(), it calls sync() which 
> joins all MergeThreads. But if that CMS instance is shared between a few IWs, 
> doesn't it mean that a single IW calling close() waits on MergeThreads that 
> execute merges of other IWs!?!? This seems ... wrong?
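
The ref-counting idea from the description above could look roughly like the 
following sketch. The class name RefCountedScheduler and the doClose() hook 
are assumptions made for illustration; this is not MergeScheduler's actual 
API, only the incRef()/decRef() shape the description proposes.

```java
/** Hypothetical ref-counted scheduler; names are illustrative, not Lucene API. */
class RefCountedScheduler {

  private int refCount = 0;
  private boolean closed = false;

  /** Called by each writer that starts using this scheduler (e.g. in its ctor). */
  synchronized void incRef() {
    if (closed) {
      throw new IllegalStateException("scheduler is already closed");
    }
    refCount++;
  }

  /** Called by each writer on close(); the last caller really closes. */
  synchronized void decRef() {
    if (refCount <= 0) {
      throw new IllegalStateException("decRef() called more often than incRef()");
    }
    refCount--;
    if (refCount == 0) {
      closed = true;
      doClose();
    }
  }

  synchronized boolean isClosed() {
    return closed;
  }

  /** Hook for the terminating work; runs once, when the last reference is released. */
  protected void doClose() {
    // e.g. wait for or stop background threads
  }
}
```

With two writers sharing one instance, the first decRef() leaves the scheduler 
open and only the second one triggers doClose().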


