[ https://issues.apache.org/jira/browse/LUCENE-458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael Busch resolved LUCENE-458.
----------------------------------

    Resolution: Duplicate

The problem here appears to be that when the JVM crashed, not all files had been properly synced to the file system. This seems to be similar to the problem in LUCENE-1044.

> Merging may create duplicates if the JVM crashes half way through
> -----------------------------------------------------------------
>
>                 Key: LUCENE-458
>                 URL: https://issues.apache.org/jira/browse/LUCENE-458
>             Project: Lucene - Java
>          Issue Type: Bug
>    Affects Versions: 1.4
>         Environment: Windows XP SP2, JDK 1.5.0_04 (the crash occurred in this
> version. We've since updated to 1.5.0_05, but discovered this issue in an
> older text index.)
>            Reporter: Trejkaz
>
> In the past, our indexing process crashed due to a Hotspot compiler bug on
> SMP systems (although it could happen with any bad native code). Everything
> picked up and appeared to work, but now, a month later, I've discovered an
> oddity in the text index.
>
> We have two documents which are identical in the text index. I know we only
> stored the document once, for two reasons. First, we store the MD5 of every
> document into the hash, and the MD5s were the same. Second, we store a GUID
> in each document, which is generated uniquely for each document. The GUID
> and the MD5 hash on these two documents, as well as all other fields, are
> exactly the same.
>
> My conclusion is that a merge was occurring at the point the JVM crashed,
> which is consistent with the time the process crashed. Is it possible that
> Lucene copied this document to the new location and didn't get to delete
> the original?
>
> If so, I guess this issue should be prevented somehow.
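
The check the reporter describes (two index entries sharing both the GUID and the MD5) can be sketched as a simple grouping pass. This is only an illustration, not Lucene code: it assumes the stored fields have already been dumped from the index into (doc_id, guid, md5) tuples, and the function name, field layout, and sample values are all hypothetical.

```python
from collections import defaultdict


def find_duplicate_guids(records):
    """Return the GUIDs that occur more than once, with the entries that share them.

    `records` is an iterable of (doc_id, guid, md5) tuples. This layout is an
    assumption made for the sketch; it is not a Lucene data structure.
    """
    by_guid = defaultdict(list)
    for doc_id, guid, md5 in records:
        by_guid[guid].append((doc_id, md5))
    # A GUID is generated once per document, so more than one entry under the
    # same GUID means the same document is present in the index twice.
    return {guid: docs for guid, docs in by_guid.items() if len(docs) > 1}


# Hypothetical dump of three index entries; entries 0 and 7 are the same document.
records = [
    (0, "guid-a", "md5-1"),
    (1, "guid-b", "md5-2"),
    (7, "guid-a", "md5-1"),
]
duplicates = find_duplicate_guids(records)
```

Grouping on the GUID rather than the MD5 avoids flagging two distinct documents that merely happen to have identical content; comparing the MD5s within each group then confirms that the entries really are copies.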