Here is a relevant email on the subject (from a very long impassioned thread...)

http://www.ussg.iu.edu/hypermail/linux/kernel/0207.1/1632.html

On Nov 30, 2007, at 7:31 AM, robert engels wrote:

My reading of the Unix specification shows it should work (the behavior of _commit under Windows is less clear, and since Windows is not inode based, there may be different issues).

http://www.opengroup.org/onlinepubs/007908799/xsh/fsync.html


On Nov 30, 2007, at 7:10 AM, Michael McCandless (JIRA) wrote:


[ https://issues.apache.org/jira/browse/LUCENE-1044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12547122 ]

Michael McCandless commented on LUCENE-1044:
--------------------------------------------

{quote}
You could just queue the file names for sync, close them, and then have the background thread open, sync and close them. The close could trigger the OS to sync things faster in the background. Then the open/sync/close could mostly be a no-op. Might be worth a try.
{quote}
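The queued approach described in that quote could be sketched roughly as follows. This is only an illustration (the class name and structure are hypothetical, not Lucene's actual implementation): writers close their files and enqueue the file names, and a background thread re-opens, syncs, and closes each one.

{code}
import java.io.*;
import java.util.concurrent.*;

// Hypothetical sketch of the queued-sync idea: writers enqueue file names
// after closing them; this background thread re-opens, fsyncs, and closes.
public class BackgroundSyncer implements Runnable {
    private final BlockingQueue<String> pending = new LinkedBlockingQueue<String>();

    public void enqueue(String fileName) {
        pending.add(fileName);
    }

    public void run() {
        try {
            while (true) {
                String name = pending.take();  // blocks until a file is queued
                RandomAccessFile f = new RandomAccessFile(name, "rw");
                try {
                    f.getFD().sync();          // fsync through the fresh descriptor
                } finally {
                    f.close();
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();  // exit on shutdown
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }
}
{code}

Whether this is safe hinges on exactly the question below: does the fresh descriptor's sync cover writes made through the already-closed one?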

I am taking this approach now, but one nagging question I have is: do
we know with some certainty that re-opening a file and then sync'ing
it in fact syncs all writes that were ever done to this file in this
JVM, even writes made through previously opened and now closed
descriptors?  Versus, e.g., only sync'ing the new writes done with
that particular descriptor?

In code:

{code}
RandomAccessFile file = new RandomAccessFile(path, "rw");
// <do many writes to file>
file.close();
new RandomAccessFile(path, "rw").getFD().sync();
{code}  

Are we pretty sure that all of the "many writes" will in fact be
sync'd by that sync call, on all OSs?

I haven't been able to find convincing evidence one way or another.  I
did run a timing test comparing overall elapsed time when you sync
with the same descriptor you used for writing versus closing it,
opening a new one, and syncing with that one; on Linux at least both
approaches appear to be syncing, because the total elapsed time is
roughly the same.
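The close-then-reopen-and-sync pattern that was timed above can be written out as a small self-contained sketch (class and method names here are mine, for illustration only):

{code}
import java.io.*;

public class SyncAfterReopen {
    // Write through one descriptor, close it, then fsync through a fresh
    // descriptor on the same path -- the pattern whose timing was compared.
    static long writeThenReopenAndSync(File path, byte[] data) throws IOException {
        RandomAccessFile out = new RandomAccessFile(path, "rw");
        out.write(data);
        out.close();                     // original descriptor is now gone
        RandomAccessFile reopened = new RandomAccessFile(path, "rw");
        try {
            reopened.getFD().sync();     // fsync(2) on the new descriptor
        } finally {
            reopened.close();
        }
        return path.length();
    }

    public static void main(String[] args) throws IOException {
        File path = File.createTempFile("sync", ".tmp");
        path.deleteOnExit();
        System.out.println(writeThenReopenAndSync(path, "many writes".getBytes()));
    }
}
{code}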

Robert do you know?

I sure hope the answer is yes ... because if not, the alternative is
we must sync() before closing the original descriptor, which makes
things less flexible because eg we cannot cleanly implement
IndexWriter.commit().
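For contrast, the less flexible alternative is to sync the same descriptor before closing it, which is clearly covered by the fsync specification but forces the sync to happen at close time. A minimal sketch, using plain java.io (names are mine, not Lucene's):

{code}
import java.io.*;

public class SyncBeforeClose {
    // Sync the very descriptor used for writing, before it is closed.
    static void writeAndSync(File path, byte[] data) throws IOException {
        RandomAccessFile file = new RandomAccessFile(path, "rw");
        try {
            file.write(data);
            file.getFD().sync();   // must happen before close: the descriptor is gone afterwards
        } finally {
            file.close();
        }
    }

    public static void main(String[] args) throws IOException {
        File path = File.createTempFile("sbc", ".tmp");
        path.deleteOnExit();
        writeAndSync(path, "many writes".getBytes());
        System.out.println(path.length());
    }
}
{code}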


Behavior on hard power shutdown
-------------------------------

                Key: LUCENE-1044
                URL: https://issues.apache.org/jira/browse/LUCENE-1044
            Project: Lucene - Java
         Issue Type: Bug
         Components: Index
Environment: Windows Server 2003, Standard Edition, Sun Hotspot Java 1.5
           Reporter: venkat rangan
           Assignee: Michael McCandless
            Fix For: 2.3

Attachments: FSyncPerfTest.java, LUCENE-1044.patch, LUCENE-1044.take2.patch, LUCENE-1044.take3.patch, LUCENE-1044.take4.patch


When indexing a large number of documents, upon a hard power failure (e.g. pulling the power cord), the index seems to get corrupted. We start a Java application as a Windows Service and feed it documents. In some cases (after an index size of 1.7GB, with 30-40 index segment .cfs files), the following is observed: the 'segments' file contains only zeros (its size is 265 bytes, all zeros), and the 'deleted' file also contains only zeros (its size is 85 bytes, all zeros). Before corruption, the segments file and deleted file appear to be correct; after this corruption, the index is corrupted and lost.

This is a problem observed in Lucene 1.4.3. We are not able to upgrade our customer deployments to 1.9 or a later version, but would be happy to back-port a patch, if the patch is small enough and if this problem is already solved.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


