[ http://issues.apache.org/jira/browse/LUCENE-555?page=comments#action_12376115 ]
Chuck Williams commented on LUCENE-555:
---------------------------------------

I'm surprised that an optimize led to a corrupt index. Other than the non-atomic rename problem, there isn't anything else in Lucene that should lead to corruption. Even a failure in rename is recoverable, since some correct version of the segments file is always available.

The one recovery issue I've encountered is the buffering of recently indexed documents in memory. I have a selective journaling mechanism that saves just as many documents as might ever be buffered in RAM. This mechanism supports several logging modes: complete information, just stored fields with the ability to compute the others, and just keys with the ability to retrieve the documents externally. A mechanism like this could be extended to support unlimited journaling if you really want it.

If the optimize of an index leaves it corrupted, this is a bug that should be fixed. If Lucene is robust in the sense that it doesn't corrupt the index, as it is designed to be now, I think this is sufficient. Optional journaling facilities would be a nice add-on feature. It is something that is not too hard for applications to create now.

The mechanism I use is bundled with a number of other useful services, including updating the index by modifying field values of selected documents, managing synchronization of delete, write and update, managing the periodic refreshing of the reader used for search, etc. If I can get agreement from my company, then I'll contribute some or all of this. Maybe it would be of help.
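As a rough illustration of the "just keys" journaling mode described above, here is a minimal sketch in plain Java. It is not the mechanism from the comment and uses no Lucene API: the class name `BoundedJournal`, its methods, and the one-key-per-line file layout are all hypothetical. It assumes keys are single-line strings, and that the application calls `record` before handing a document to the indexer and `checkpoint` once the index has been safely written to disk.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/**
 * Hypothetical sketch: journals the keys of the most recent documents,
 * i.e. those that might still be buffered in RAM rather than on disk.
 */
class BoundedJournal {
    private final Path file;      // journal file, one key per line (assumed layout)
    private final int capacity;   // max number of documents that may be buffered
    private final Deque<String> pending = new ArrayDeque<>();

    /** Opens the journal and reloads any keys left over from a previous run. */
    BoundedJournal(Path file, int capacity) throws IOException {
        this.file = file;
        this.capacity = capacity;
        if (Files.exists(file)) {
            for (String key : Files.readAllLines(file, StandardCharsets.UTF_8)) {
                if (!key.isEmpty()) {
                    remember(key);
                }
            }
        }
    }

    /** Keeps only the newest `capacity` keys in memory. */
    private void remember(String key) {
        pending.addLast(key);
        while (pending.size() > capacity) {
            pending.removeFirst();
        }
    }

    /** Records a key before the document is handed to the indexer. */
    synchronized void record(String key) throws IOException {
        remember(key);
        persist();
    }

    /** Clears the journal once the index is known to be safely on disk. */
    synchronized void checkpoint() throws IOException {
        pending.clear();
        persist();
    }

    /** Keys to re-fetch externally and re-index after a crash. */
    synchronized List<String> pending() {
        return new ArrayList<>(pending);
    }

    /** Writes the window to a temp file, then renames it over the journal. */
    private void persist() throws IOException {
        Path tmp = file.resolveSibling(file.getFileName() + ".tmp");
        Files.write(tmp, pending, StandardCharsets.UTF_8);
        Files.move(tmp, file, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

After a crash, the application would construct the journal again on the same file, call `pending()`, retrieve those documents from the external store, and re-index them. Rewriting the whole file on every `record` is only reasonable because the window is bounded; an unlimited journal would append instead.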
> Index Corruption
> ----------------
>
>          Key: LUCENE-555
>          URL: http://issues.apache.org/jira/browse/LUCENE-555
>      Project: Lucene - Java
>         Type: Bug
>   Components: Index
>     Versions: 1.9
>  Environment: Linux FC4, Java 1.4.9
>     Reporter: dan
>     Priority: Critical
>
> Index Corruption
> >>>>>>>>> output
> java.io.FileNotFoundException: ../_aki.fnm (No such file or directory)
>     at java.io.RandomAccessFile.open(Native Method)
>     at java.io.RandomAccessFile.<init>(RandomAccessFile.java:204)
>     at org.apache.lucene.store.FSIndexInput$Descriptor.<init>(FSDirectory.java:425)
>     at org.apache.lucene.store.FSIndexInput.<init>(FSDirectory.java:434)
>     at org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:324)
>     at org.apache.lucene.index.FieldInfos.<init>(FieldInfos.java:56)
>     at org.apache.lucene.index.SegmentReader.initialize(SegmentReader.java:144)
>     at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:129)
>     at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:110)
>     at org.apache.lucene.index.IndexWriter.mergeSegments(IndexWriter.java:674)
>     at org.apache.lucene.index.IndexWriter.mergeSegments(IndexWriter.java:658)
>     at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:517)
> >>>>>>>>> input
> - I open an index, I read, I write, I optimize, and eventually the above happens. The index is unusable.
> - This has happened to me somewhere between 20 and 30 times now - on indexes of different shapes and sizes.
> - I don't know the reason. But, the following requirement applies regardless.
> >>>>>>>>> requirement
> - Like all modern database programs, there has to be a way to repair an index. Period.

-- 
This message is automatically generated by JIRA.