It is unclear whether you need to open them for writing. The Unix specs
clearly allow you to call fsync on ANY file descriptor.
The Linux docs seem to imply that a file descriptor opened for write
is required.
The Java specification allows it on ANY file descriptor as well -
this should be the only one that matters.
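One way to check this empirically on a given JVM and OS is a small probe like the one below. It is only a sketch: the class and method names (SyncModeProbe, canSync) are made up for illustration, it uses modern Java syntax (try-with-resources), and a successful call says nothing about whether the data really reached the disk. It only shows whether FileDescriptor.sync() is accepted for a descriptor opened in each mode.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical probe, not part of Lucene.
class SyncModeProbe {

    /** Opens the file in the given mode ("r" or "rw") and calls FileDescriptor.sync(). */
    static boolean canSync(File file, String mode) {
        try (RandomAccessFile raf = new RandomAccessFile(file, mode)) {
            // sync() throws SyncFailedException (an IOException) if the
            // underlying call is rejected for this descriptor.
            raf.getFD().sync();
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("syncprobe", ".bin");
        f.deleteOnExit();
        try (RandomAccessFile raf = new RandomAccessFile(f, "rw")) {
            raf.write(new byte[] {1, 2, 3});
        }
        // The read-only result is platform dependent, which is the open question here.
        System.out.println("sync via read-write descriptor: " + canSync(f, "rw"));
        System.out.println("sync via read-only descriptor:  " + canSync(f, "r"));
    }
}
```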
On Nov 30, 2007, at 11:35 AM, Michael McCandless (JIRA) wrote:
[ https://issues.apache.org/jira/browse/LUCENE-1044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12547217 ]
Michael McCandless commented on LUCENE-1044:
--------------------------------------------
From java-dev, Robert Engels wrote:
{quote}
My reading of the Unix specification shows it should work (the
_commit under Windows is less clear, and since Windows is not inode
based, there may be different issues).
http://www.opengroup.org/onlinepubs/007908799/xsh/fsync.html
{quote}
OK thanks Robert.
I think very likely this approach (let's call it "sync after close")
will work. The _commit docs (for WIN32) also seem to indicate that
the file referenced by the descriptor is fully flushed (as we want):
http://msdn2.microsoft.com/en-us/library/17618685
Also at least PostgreSQL and Berkeley DB "trust" _commit as the
equivalent of fsync (though I have no idea if they use it the same way
we want to).
Though ... I am also a bit concerned about opening files for writing
that we had already previously closed. It arguably makes Lucene "not
quite" write-once. And, we may need a retry loop on syncing because
on Windows, various tools might wake up and peek into files right
after we close them, possibly interfering w/ our reopening/syncing.
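For concreteness, "sync after close" with a retry loop could look roughly like the sketch below. This is not the code from the patch: SyncAfterClose, syncClosedFile, maxAttempts and pauseMillis are hypothetical names chosen for illustration, and the retry policy (fixed pause, fixed attempt count) is an assumption.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical helper, not Lucene's actual implementation.
class SyncAfterClose {

    /**
     * Reopens an already-closed file and forces it to stable storage.
     * Retries on IOException because another process may briefly hold the file.
     */
    static void syncClosedFile(File file, int maxAttempts, long pauseMillis)
            throws IOException, InterruptedException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        // "rw" mode would silently create a missing file, so check first.
        if (!file.isFile()) {
            throw new FileNotFoundException(file.getPath());
        }
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
                raf.getFD().sync();   // fsync on Unix; the JVM picks the Windows equivalent
                return;
            } catch (IOException e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(pauseMillis);
                }
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        File f = File.createTempFile("segment", ".bin");
        f.deleteOnExit();
        try (RandomAccessFile raf = new RandomAccessFile(f, "rw")) {
            raf.write(new byte[] {1, 2, 3, 4});
        }
        syncClosedFile(f, 5, 50L);
        System.out.println("synced " + f.length() + " bytes");
    }
}
```

The file is reopened in "rw" mode here only because it is unclear whether a read-only descriptor is enough on every platform.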
I think the alternative ("sync before close") is something like:
  * Add a new method IndexOutput.close(boolean doSync)

  * When a merge finishes, it must close all of its files with
    doSync=true; and write the new segments_N with doSync=true.

  * To implement commit() ... I think we'd have to force a merge of
    all written segments that were not sync'd. And on closing the
    writer we'd call commit(). This is obviously non-ideal because
    you can get very different sized level 1 segments out. Although
    the cost would be contained since it's only up to mergeFactor
    level 0 segments that we will merge.
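As a rough illustration of the first bullet, a close(boolean doSync) method might look like the sketch below. SyncingOutput is a hypothetical stand-in, not Lucene's IndexOutput, and it writes through an unbuffered RandomAccessFile; a buffered implementation would have to flush its buffer before syncing.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical stand-in for an output class with close(boolean doSync).
class SyncingOutput {
    private final RandomAccessFile file;

    SyncingOutput(File path) throws IOException {
        this.file = new RandomAccessFile(path, "rw");
    }

    void writeBytes(byte[] b) throws IOException {
        file.write(b);
    }

    /** Closes the file; when doSync is true, forces it to stable storage first. */
    void close(boolean doSync) throws IOException {
        try {
            if (doSync) {
                file.getFD().sync();
            }
        } finally {
            file.close();   // always release the descriptor, even if sync fails
        }
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("merged", ".bin");
        f.deleteOnExit();
        SyncingOutput out = new SyncingOutput(f);
        out.writeBytes(new byte[] {1, 2, 3, 4});
        out.close(true);    // a merge would close its files this way
        System.out.println("closed with sync, " + f.length() + " bytes");
    }
}
```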
OK ... I'm leaning towards sticking with "sync after close", so I'll
keep coding up this approach for now.
Behavior on hard power shutdown
-------------------------------
Key: LUCENE-1044
URL: https://issues.apache.org/jira/browse/LUCENE-1044
Project: Lucene - Java
Issue Type: Bug
Components: Index
Environment: Windows Server 2003, Standard Edition, Sun
Hotspot Java 1.5
Reporter: venkat rangan
Assignee: Michael McCandless
Fix For: 2.3
Attachments: FSyncPerfTest.java, LUCENE-1044.patch,
LUCENE-1044.take2.patch, LUCENE-1044.take3.patch,
LUCENE-1044.take4.patch
When indexing a large number of documents, upon a hard power
failure (e.g. pull the power cord), the index seems to get
corrupted. We start a Java application as a Windows Service, and
feed it documents. In some cases (after an index size of 1.7GB,
with 30-40 index segment .cfs files), the following is observed.
The 'segments' file contains only zeros. Its size is 265 bytes -
all bytes are zeros.
The 'deleted' file also contains only zeros. Its size is 85 bytes
- all bytes are zeros.
Before corruption, the segments file and deleted file appear to be
correct. After this corruption, the index is corrupted and lost.
This is a problem observed in Lucene 1.4.3. We are not able to
upgrade our customer deployments to 1.9 or later version, but
would be happy to back-port a patch, if the patch is small enough
and if this problem is already solved.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]