[ https://issues.apache.org/jira/browse/LUCENE-5541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14130172#comment-14130172 ]
Sanne Grinovero commented on LUCENE-5541:
-----------------------------------------

Hi [~mikemccand],
I think I'm hitting this issue, and I am indeed still using Lucene 3.6.2.
Your comments are much appreciated, but I don't understand how {{File.exists}} is related to the exception, given that it is thrown by the {{CompoundFileReader}}.
In fact these tests were run with compound files disabled, so I'd love to put a breakpoint in the IndexWriter code where it decided this segment needed to be wrapped in a {{CompoundFileReader}}; however, it seems I can't easily reproduce the same error.
If we manage to reproduce it again I would like to provide a patch, even though I understand there won't be any more releases.

> FileExistsCachingDirectory, to work around unreliable File.exists
> -----------------------------------------------------------------
>
>                 Key: LUCENE-5541
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5541
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/store
>            Reporter: Michael McCandless
>         Attachments: LUCENE-5541.patch
>
>
> File.exists is a dangerous method in Java, because if there is a
> low-level IOException (permission denied, out of file handles, etc.)
> the method can return false when it should return true.
> Fortunately, as of Lucene 4.x, we rely much less on File.exists,
> because we track which files the codec components created, and we know
> those files then exist.
> But, unfortunately, going from 3.0.x to 3.6.x, we increased our
> reliance on File.exists, e.g. when creating CFS we check File.exists
> on each sub-file before trying to add it, and I have a customer
> corruption case where apparently a transient low level IOE caused
> File.exists to incorrectly return false for one of the sub-files.
> It results in corruption like this:
> {noformat}
> java.io.FileNotFoundException: No sub-file with id .fnm found (fileName=_1u7.cfs files: [.tis, .tii, .frq, .prx, .fdt, .nrm, .fdx])
>   org.apache.lucene.index.CompoundFileReader.openInput(CompoundFileReader.java:157)
>   org.apache.lucene.index.CompoundFileReader.openInput(CompoundFileReader.java:146)
>   org.apache.lucene.index.FieldInfos.<init>(FieldInfos.java:71)
>   org.apache.lucene.index.IndexWriter.getFieldInfos(IndexWriter.java:1212)
>   org.apache.lucene.index.IndexWriter.getCurrentFieldInfos(IndexWriter.java:1228)
>   org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:1161)
> {noformat}
> I think typically local file systems don't often hit such low level
> errors, but if you have an index on a remote filesystem, where network
> hiccups can cause problems, it's more likely.
> As a simple workaround, I created a basic Directory delegator that
> holds a Set of all created but not deleted files, and short-circuits
> fileExists to return true if the file is in that set.
> I don't plan to commit this: we aren't doing bug-fix releases on
> 3.6.x anymore (it's very old by now), and this problem is already
> "fixed" in 4.x (by reducing our reliance on File.exists), but I wanted
> to post the code here in case others hit it. It looks like it was hit
> e.g. https://netbeans.org/bugzilla/show_bug.cgi?id=189571 and
> https://issues.jboss.org/browse/ISPN-2981

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
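
For readers who do not have the attached patch at hand, the workaround described in the issue (a delegator that remembers which files were created and not yet deleted, and answers fileExists from that set first) can be sketched roughly as below. This is only an illustration of the idea, not the code in LUCENE-5541.patch: the {{FileStore}} interface and the {{ExistsCachingStore}} class are hypothetical names invented for this sketch, and they stand in for Lucene's Directory API rather than reproducing it.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, minimal stand-in for a directory abstraction.
// This is NOT Lucene's Directory class; it only has what the sketch needs.
interface FileStore {
    boolean fileExists(String name);
    void createFile(String name);
    void deleteFile(String name);
}

// Delegator that tracks created-but-not-deleted files and short-circuits
// fileExists for them, so a transient false negative from the underlying
// store cannot hide a file this wrapper knows it created.
final class ExistsCachingStore implements FileStore {
    private final FileStore delegate;
    private final Set<String> created = ConcurrentHashMap.newKeySet();

    ExistsCachingStore(FileStore delegate) {
        this.delegate = delegate;
    }

    @Override
    public boolean fileExists(String name) {
        // Trust our own bookkeeping first; only ask the delegate for
        // files this wrapper did not create itself.
        return created.contains(name) || delegate.fileExists(name);
    }

    @Override
    public void createFile(String name) {
        delegate.createFile(name);
        created.add(name); // record only after the create succeeded
    }

    @Override
    public void deleteFile(String name) {
        delegate.deleteFile(name);
        created.remove(name); // stop vouching for a deleted file
    }
}
```

Note that the set only covers files created through the same wrapper instance, so files that already existed before it was opened still go through the delegate's (possibly unreliable) existence check.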