[ 
https://issues.apache.org/jira/browse/LUCENE-1044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12547168
 ] 

Michael McCandless commented on LUCENE-1044:
--------------------------------------------

Another nuance here is ... say we do a "soft commit" (write a new
segment & segments_N but do not sync the files), and, the machine
crashes.  This is fine because there will always be an earlier commit
point (segments_M) that was a "hard commit" (sync was done).

Then, the machine comes back up and we open a reader.  The reader sees
both segments_M (the hard commit) and segments_N (the soft commit) and
chooses segments_N because it's more recent.

We have retry logic in SegmentInfos to fall back to segments_M if we
hit an IOException on opening the index described by segments_N.
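
To make the fallback concrete, here is a rough sketch of the idea in
Java.  It is not the actual SegmentInfos code: the CommitOpener
interface, the openNewestUsable method and the use of plain long
generations are all made up for illustration.  It tries the newest
commit first and only falls back to an older one when opening throws
an IOException.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class CommitFallbackSketch {

    /** Hypothetical callback: opens the index described by one commit generation. */
    interface CommitOpener<T> {
        T open(long generation) throws IOException;
    }

    /**
     * Tries commit generations from newest to oldest and returns the first
     * one that opens without an IOException. If every generation fails, the
     * failure from the newest one is rethrown.
     */
    static <T> T openNewestUsable(List<Long> generations, CommitOpener<T> opener)
            throws IOException {
        List<Long> newestFirst = new ArrayList<>(generations);
        newestFirst.sort(Collections.reverseOrder());

        IOException firstFailure = null;
        for (long gen : newestFirst) {
            try {
                return opener.open(gen);
            } catch (IOException e) {
                // Remember the failure for the most recent commit, then
                // fall back to the next older one.
                if (firstFailure == null) {
                    firstFailure = e;
                }
            }
        }
        if (firstFailure != null) {
            throw firstFailure;
        }
        throw new IOException("no commit points found");
    }

    public static void main(String[] args) throws IOException {
        // Assumption for the demo: generation 7 is the unsynced "soft commit"
        // that fails to open, generation 3 is the earlier "hard commit".
        String opened = CommitFallbackSketch.<String>openNewestUsable(
                Arrays.asList(3L, 7L),
                gen -> {
                    if (gen == 7L) {
                        throw new IOException("simulated failure opening generation " + gen);
                    }
                    return "segments_" + gen;
                });
        System.out.println("opened " + opened);
    }
}
```

Note that this only helps when the damage actually surfaces as an
IOException at open time.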

But the problem is that the extent of the "corruption" caused by the
crash could be somewhat subtle.  E.g., a given file might be the right
length but filled with zeroes.  This is a problem because we may not
then hit an IOException while opening the reader, but only later hit
some exception while searching.
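
As an illustration of why open-time errors are not a reliable signal,
the sketch below reads a zero-filled file of plausible length without
any IOException; the only way to notice is to look at the bytes.  The
isAllZeros helper is hypothetical, a diagnostic for this discussion
and not something Lucene or the patch does.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

final class ZeroFillSketch {

    /**
     * Returns true if the file is non-empty and every byte in it is zero.
     * Reading such a file succeeds without any IOException, which is why
     * this kind of damage is not caught when the file is merely opened.
     */
    static boolean isAllZeros(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buf = new byte[8192];
            boolean sawData = false;
            int n;
            while ((n = in.read(buf)) != -1) {
                for (int i = 0; i < n; i++) {
                    if (buf[i] != 0) {
                        return false;
                    }
                }
                if (n > 0) {
                    sawData = true;
                }
            }
            return sawData;
        }
    }

    public static void main(String[] args) throws IOException {
        // Simulate a file that has the right length but contains only zeros.
        Path zeroed = Files.createTempFile("segments", ".tmp");
        Files.write(zeroed, new byte[265]);
        System.out.println("all zeros: " + isAllZeros(zeroed));
        Files.delete(zeroed);
    }
}
```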

I think this means when we do a "soft commit" we should not in fact
write a new segments_N file (as we do today).  When we do a "hard
commit" we should first sync all files except the new segments_N file,
then write the segments_N file, then sync it.
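
The ordering for a "hard commit" could look roughly like the sketch
below.  This is not the patch: HardCommitSketch, sync and hardCommit
are names I made up, and it uses plain java.nio rather than Lucene's
Directory API.  It also assumes FileChannel.force is enough to get the
bytes onto stable storage, and it does not address syncing the
directory entry itself.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Collections;
import java.util.List;

final class HardCommitSketch {

    /** Asks the OS to force an existing file's contents to stable storage. */
    static void sync(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.WRITE)) {
            ch.force(true);
        }
    }

    /**
     * Hard commit in three steps:
     *   1. sync every file the new commit refers to,
     *   2. only then write the new segments file,
     *   3. sync the segments file itself.
     * A reader should therefore never find a segments file whose
     * referenced files have not been synced.
     */
    static void hardCommit(List<Path> dataFiles, Path segmentsFile, byte[] segmentsBytes)
            throws IOException {
        // Step 1: sync all files except the new segments file.
        for (Path f : dataFiles) {
            sync(f);
        }
        // Steps 2 and 3: write the segments file, then sync it.
        // CREATE_NEW makes the call fail if that commit already exists.
        try (FileChannel ch = FileChannel.open(segmentsFile,
                StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE)) {
            ByteBuffer buf = ByteBuffer.wrap(segmentsBytes);
            while (buf.hasRemaining()) {
                ch.write(buf);
            }
            ch.force(true);
        }
    }

    public static void main(String[] args) throws IOException {
        // File names and contents here are placeholders for the demo.
        Path dir = Files.createTempDirectory("hard-commit-demo");
        Path data = dir.resolve("_0.cfs");
        Files.write(data, new byte[] {1, 2, 3});
        Path segments = dir.resolve("segments_2");
        hardCommit(Collections.singletonList(data), segments, new byte[] {42});
        System.out.println("committed " + segments.getFileName());
    }
}
```

A "soft commit" in this scheme would simply skip hardCommit entirely,
so no new segments_N file appears until the files behind it are known
to be synced.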

The thing is, while we have been (and want to continue to be) vague
about exactly when a "commit" takes place as you add docs to
IndexWriter, users have presumably gotten used to every flush (when
autoCommit=true) committing a new segments_N file that an IndexReader
can then see.  So, this change (do not write segments_N file except
for a hard commit) will break that behavior.  Maybe, with the addition
of the explicit commit() method, this is OK?


> Behavior on hard power shutdown
> -------------------------------
>
>                 Key: LUCENE-1044
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1044
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Index
>         Environment: Windows Server 2003, Standard Edition, Sun Hotspot Java 
> 1.5
>            Reporter: venkat rangan
>            Assignee: Michael McCandless
>             Fix For: 2.3
>
>         Attachments: FSyncPerfTest.java, LUCENE-1044.patch, 
> LUCENE-1044.take2.patch, LUCENE-1044.take3.patch, LUCENE-1044.take4.patch
>
>
> When indexing a large number of documents, upon a hard power failure (e.g. 
> pull the power cord), the index seems to get corrupted. We start a Java 
> application as a Windows Service, and feed it documents. In some cases 
> (after an index size of 1.7GB, with 30-40 index segment .cfs files), the 
> following is observed.
> The 'segments' file contains only zeros. Its size is 265 bytes - all bytes 
> are zeros.
> The 'deleted' file also contains only zeros. Its size is 85 bytes - all bytes 
> are zeros.
> Before corruption, the segments file and deleted file appear to be correct. 
> After this corruption, the index is corrupted and lost.
> This is a problem observed in Lucene 1.4.3. We are not able to upgrade our 
> customer deployments to 1.9 or later version, but would be happy to back-port 
> a patch, if the patch is small enough and if this problem is already solved.


