Hi,

 

The error message is “java.io.IOException: No sub-file with id _xv.fnm found”, 
produced by CompoundFileReader. This means that the corresponding compound 
file does not contain the missing sub-file _xv.fnm. Because that file is 
supposed to live inside the CFS file, it is of course not a separate file in 
the directory. Lucene never keeps separate copies of the same data, except 
during merging or when commit points are kept for later use.

 

The CFS file seems to be corrupt. Back in the days of Lucene 3, CFS files had 
their “index” (the dictionary of the files inside) at the end of the file, 
because it was written last (and then the offset of the dictionary was written 
at the beginning of the file). You mentioned that you had disk-full issues, so 
it is almost certain that the CFS file is incomplete and the dictionary is 
completely missing. 
It is very unlikely that you can recover from that situation unless you have 
very deep knowledge on

 

Some Lucene JAR files contain an additional tool to “extract” CFS files (like 
unzip). You may try to use it, but I am not sure whether it already existed in 
Lucene 3.0.3 (you would need to search the Javadocs to check). But without 
the dictionary at the end of the file it will not work either.
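
To illustrate why a truncated container of this kind cannot be opened, here is a minimal sketch in Java. It uses a made-up toy format that only follows the layout described above (pointer at the beginning, data in the middle, dictionary at the end); it is not Lucene's actual on-disk CFS format, and the class and method names (ToyCompoundFile, write, listSubFiles) are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy container, NOT Lucene's real CFS format: an 8-byte pointer to the
// dictionary, then the sub-file data, then the dictionary itself at the end.
class ToyCompoundFile {

    // Writes all sub-files into one byte array using the toy layout.
    static byte[] write(Map<String, byte[]> subFiles) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeLong(0L); // placeholder, patched once the dictionary position is known
            Map<String, long[]> entries = new LinkedHashMap<>();
            for (Map.Entry<String, byte[]> e : subFiles.entrySet()) {
                entries.put(e.getKey(), new long[] {out.size(), e.getValue().length});
                out.write(e.getValue());
            }
            long dictionaryOffset = out.size();
            out.writeInt(entries.size());
            for (Map.Entry<String, long[]> e : entries.entrySet()) {
                out.writeUTF(e.getKey());
                out.writeLong(e.getValue()[0]); // data offset
                out.writeLong(e.getValue()[1]); // data length
            }
            out.flush();
            byte[] result = bytes.toByteArray();
            ByteBuffer.wrap(result).putLong(0, dictionaryOffset); // patch the pointer
            return result;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Lists the sub-file names; fails if the dictionary cannot be located or read.
    static List<String> listSubFiles(byte[] container) throws IOException {
        if (container.length < 8) {
            throw new IOException("header is incomplete");
        }
        long dictionaryOffset = ByteBuffer.wrap(container).getLong(0);
        if (dictionaryOffset < 8 || dictionaryOffset >= container.length) {
            throw new IOException("dictionary is missing (offset " + dictionaryOffset + ")");
        }
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(
                container, (int) dictionaryOffset, container.length - (int) dictionaryOffset));
        int count = in.readInt();
        List<String> names = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            names.add(in.readUTF());
            in.readLong(); // data offset, unused in this sketch
            in.readLong(); // data length, unused in this sketch
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        Map<String, byte[]> subFiles = new LinkedHashMap<>();
        subFiles.put("_xv.fnm", "field infos".getBytes(StandardCharsets.UTF_8));
        subFiles.put("_xv.frq", "postings".getBytes(StandardCharsets.UTF_8));
        byte[] complete = write(subFiles);
        System.out.println(listSubFiles(complete));

        // Simulate a write that stopped early, e.g. because the disk was full.
        byte[] truncated = Arrays.copyOf(complete, 13);
        try {
            listSubFiles(truncated);
        } catch (IOException e) {
            System.out.println("unreadable: " + e.getMessage());
        }
    }
}
```

In this toy layout the sub-file bytes may still be physically present in the truncated file, but without the dictionary nothing records where each sub-file starts and ends, so a reader cannot find _xv.fnm by name.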

 

Uwe

 

-----

Uwe Schindler

H.-H.-Meier-Allee 63, D-28213 Bremen

http://www.thetaphi.de

eMail: u...@thetaphi.de

 

From: Shlomit Rosen [mailto:shlom...@il.ibm.com] 
Sent: Wednesday, December 17, 2014 8:04 PM
To: <java-user@lucene.apache.org>
Subject: Index corruption with lucene 3.0.3

 

Hello, 

We have a client that is using Lucene 3.0.3. 
They are working with a NAS storage device which recently had permission 
issues, which might have generated some "out of disk space" exceptions during 
indexing. We are uncertain whether they also suffered JDK crashes in the past 
few months, as we discovered dmp files and javacores on their system. 

Consequently, they now have 3 corrupted indices. 
All of them show a similar issue: 

java.io.IOException: No sub-file with id _xv.fnm found 
        at org.apache.lucene.index.CompoundFileReader.openInput(CompoundFileReader.java:137) 
        at org.apache.lucene.index.CompoundFileReader.openInput(CompoundFileReader.java:125) 
        at org.apache.lucene.index.FieldInfos.<init>(FieldInfos.java:68) 
        at org.apache.lucene.index.SegmentReader$CoreReaders.<init>(SegmentReader.java:120) 
        at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:605) 
        at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:583) 
        at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:470) 
        at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:883) 


Looking at the file listing of the indices, I see that this file (i.e. 
_xv.fnm) is really missing, but I also see that a compound file with the same 
name exists on disk (i.e. _xv.cfs). 

My question is - 
        is there a way to "save" the collection by re-creating the fnm file 
from the cfs file (or in any other way...?) 
        Or does our client need to re-index the entire collection? (Assuming 
the checkIndex -fix option is no good, because we cannot know which documents 
are lost...) 

I'm attaching the checkIndex output as reference 

Thanks in advance! 
Shlomit 


