> Note that IndexReader has a main() that will list the contents of compound index files.
It looks like some of my index is compound and some isn't. My not very well informed guess is that an optimize() got interrupted somewhere along the line. If I try to optimize the index now, it throws exceptions.

lucli> optimize
Starting to optimize index.
java.io.IOException: Cannot overwrite: /mnt/sdb1/index/index-1/_2lhqi.fnm
        at org.apache.lucene.store.FSDirectory.createOutput(FSDirectory.java:322)
        at org.apache.lucene.index.FieldInfos.write(FieldInfos.java:255)
        at org.apache.lucene.index.SegmentMerger.mergeFields(SegmentMerger.java:176)
        at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:88)
        at org.apache.lucene.index.IndexWriter.mergeSegments(IndexWriter.java:707)
        at org.apache.lucene.index.IndexWriter.mergeSegments(IndexWriter.java:684)
        at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:543)
        at lucli.LuceneMethods.optimize(LuceneMethods.java:192)
        at lucli.Lucli.handleCommand(Lucli.java:223)
        at lucli.Lucli.<init>(Lucli.java:148)
        at lucli.Lucli.main(Lucli.java:175)

Would the smart thing to do at this point be to use IndexReader's main() method to extract the contents of the compound files and then to attempt to merge them with the unmerged indexes? [I'll need to delve further into Doug's excellent LIA to figure out how to do this.]

To recap, I have 98 GB of files in my index, with:

    51 GB devoted to field data (.fdt)
    13 GB devoted to term positions (.prx)
    13 GB devoted to term frequencies (.frq)
    12 GB devoted to compound files for the field index (.cfs)

I seem to have the same mess on a parallel system too - I'm indexing in two data centres.
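A breakdown like the one above, and a list of segments that exist in both compound and non-compound form, can be produced with a small standalone program. The following is only a sketch using plain java.nio, not any Lucene API. The class name IndexFileSurvey and both method names are made up for illustration. It assumes that segment files start with an underscore (as _2lhqi.fnm does) and that a .del file sitting next to a .cfs is normal and should not be counted as a loose segment file.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper, not part of Lucene: surveys the files in an index directory.
public class IndexFileSurvey {

    /** Total bytes per file extension ("" for files without one, e.g. "segments"). */
    static Map<String, Long> bytesByExtension(Path indexDir) throws IOException {
        Map<String, Long> totals = new TreeMap<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(indexDir)) {
            for (Path f : files) {
                if (!Files.isRegularFile(f)) continue;
                String name = f.getFileName().toString();
                int dot = name.lastIndexOf('.');
                String ext = dot < 0 ? "" : name.substring(dot);
                totals.merge(ext, Files.size(f), Long::sum);
            }
        }
        return totals;
    }

    /** Segment names that have a .cfs file and also loose per-segment files. */
    static Set<String> segmentsWithBothForms(Path indexDir) throws IOException {
        Set<String> compound = new TreeSet<>();
        Set<String> loose = new TreeSet<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(indexDir)) {
            for (Path f : files) {
                String name = f.getFileName().toString();
                int dot = name.lastIndexOf('.');
                // Assumption: segment files look like "_name.ext".
                if (!name.startsWith("_") || dot <= 0) continue;
                String segment = name.substring(0, dot);
                if (name.endsWith(".cfs")) {
                    compound.add(segment);
                } else if (!name.endsWith(".del")) {
                    // Assumption: .del files legitimately live beside a .cfs.
                    loose.add(segment);
                }
            }
        }
        compound.retainAll(loose); // keep only segments present in both forms
        return compound;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        bytesByExtension(dir).forEach((ext, bytes) ->
                System.out.printf("%-8s %,d bytes%n", ext.isEmpty() ? "(none)" : ext, bytes));
        System.out.println("Segments with both .cfs and loose files: "
                + segmentsWithBothForms(dir));
    }
}
```

Run against a copy of the index directory, e.g. `java IndexFileSurvey /mnt/sdb1/index/index-1`. A segment reported in both forms may be a leftover from an interrupted merge, but that is a guess to confirm before deleting anything.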