[
https://issues.apache.org/jira/browse/SOLR-12232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16441898#comment-16441898
]
Robert Muir commented on SOLR-12232:
------------------------------------
{quote}
Understood. But we don't have to use NIO.
{quote}
Yes, use another lock factory or some alternative if you want. But this is the
NIO lock factory, and, well, it uses NIO. And its behavior is correct: it's
wrong to interrupt the NIO stuff. It is definitely OK to dictate that it's
wrong to interrupt NIO stuff; we document it that way for a reason, because
it's dangerous.
Lock validation and other checks here are important because they prevent
crazy corruption-looking cases from showing up. Please don't shoot the
messenger but fix the actual bugs instead (the perp calling interrupt on Lucene
threads).
> NativeFSLockFactory loses the channel when a thread is interrupted and the
> SolrCore becomes unusable after
> ----------------------------------------------------------------------------------------------------------
>
> Key: SOLR-12232
> URL: https://issues.apache.org/jira/browse/SOLR-12232
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Affects Versions: 7.1.1
> Reporter: Jeff Miller
> Assignee: Erick Erickson
> Priority: Minor
> Labels: NativeFSLockFactory, locking
> Original Estimate: 24h
> Time Spent: 10m
> Remaining Estimate: 23h 50m
>
> The condition is rare for us and seems to be basically a race. If a running
> thread happens to have the FileChannel for NativeFSLockFactory open when it
> is interrupted, the channel is closed, since it extends
> [AbstractInterruptibleChannel|https://docs.oracle.com/javase/7/docs/api/java/nio/channels/spi/AbstractInterruptibleChannel.html].
> Unfortunately this means the Solr core has to be unloaded and reopened to
> make it usable again, as the ensureValid check throws an exception forever
> after.
> org.apache.lucene.store.AlreadyClosedException: FileLock invalidated by an external force: NativeFSLock(path=data/index/write.lock,impl=sun.nio.ch.FileLockImpl[0:9223372036854775807 exclusive invalid],creationTime=2018-04-06T21:45:11Z)
>   at org.apache.lucene.store.NativeFSLockFactory$NativeFSLock.ensureValid(NativeFSLockFactory.java:178)
>   at org.apache.lucene.store.LockValidatingDirectoryWrapper.createOutput(LockValidatingDirectoryWrapper.java:43)
>   at org.apache.lucene.store.TrackingDirectoryWrapper.createOutput(TrackingDirectoryWrapper.java:43)
>   at org.apache.lucene.codecs.compressing.CompressingStoredFieldsWriter.<init>(CompressingStoredFieldsWriter.java:113)
>   at org.apache.lucene.codecs.compressing.CompressingStoredFieldsFormat.fieldsWriter(CompressingStoredFieldsFormat.java:128)
>   at org.apache.lucene.codecs.lucene50.Lucene50StoredFieldsFormat.fieldsWriter(Lucene50StoredFieldsFormat.java:183)
>
> The proposed solution is to use AsynchronousFileChannel instead, since the
> channel is only used for the lock and the .size method.
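
For anyone following the thread, the mechanism described in the issue can be reproduced with plain JDK classes, without Lucene or Solr. The following is only a minimal sketch: the class name InterruptClosesChannel, the demo() helper and the temp file are made up for illustration and are not part of NativeFSLockFactory. It takes a lock on a FileChannel, sets the calling thread's interrupt flag, and then calls size(). Because FileChannel extends AbstractInterruptibleChannel, the call is expected to close the channel, which also invalidates the lock.

```java
import java.io.IOException;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class; not taken from Lucene or Solr.
final class InterruptClosesChannel {

    /** Returns {channel still open, lock still valid, saw ClosedByInterruptException}. */
    static boolean[] demo() throws IOException {
        Path path = Files.createTempFile("write", ".lock"); // stand-in for write.lock
        FileChannel channel = FileChannel.open(
                path, StandardOpenOption.CREATE, StandardOpenOption.WRITE);
        FileLock lock = channel.tryLock();
        if (lock == null) {
            channel.close();
            throw new IOException("could not acquire lock on " + path);
        }

        boolean sawClosedByInterrupt = false;
        // Simulate "somebody interrupted the thread that uses the channel".
        Thread.currentThread().interrupt();
        try {
            channel.size(); // interruptible I/O operation on the channel
        } catch (ClosedByInterruptException e) {
            sawClosedByInterrupt = true;
        } finally {
            Thread.interrupted(); // clear the flag so cleanup is not affected
        }

        boolean[] result = { channel.isOpen(), lock.isValid(), sawClosedByInterrupt };
        channel.close(); // no-op if the channel is already closed
        Files.deleteIfExists(path);
        return result;
    }

    public static void main(String[] args) throws IOException {
        boolean[] r = demo();
        System.out.println("channel open: " + r[0]
                + ", lock valid: " + r[1]
                + ", ClosedByInterruptException: " + r[2]);
    }
}
```

Once the lock reports itself as invalid it cannot be revived on the same channel, which is consistent with the report that ensureValid keeps throwing until the core is reloaded.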