I think the 0-segment segments_1 file is expected in Lucene.Net since we changed that later, in 3.1 in Lucene (LUCENE-2386)?
Mike McCandless

http://blog.mikemccandless.com

On Thu, Jun 14, 2012 at 8:40 PM, Itamar Syn-Hershko <ita...@code972.com> wrote:
> I can confirm 2.9.4 had autoCommit, but it is gone in 3.0.3 already, so
> Lucene.Net doesn't have autoCommit.
>
> So I don't have autoCommit set to true, but I can clearly see a segments_1
> file there along with the other files. If that helps, it always keeps
> the name segments_1 with 32 bytes, never changes.
>
> And again, if I kill the process and try to open the index with Luke 3.3,
> the index folder is being wiped out.
>
> Not sure what to make of all that.
>
> On Fri, Jun 15, 2012 at 3:21 AM, Michael McCandless
> <luc...@mikemccandless.com> wrote:
>>
>> Hmm, OK: in 2.9.4 / 3.0.x, if you open IW on a new directory, it will
>> make a zero-segment commit. This was changed/fixed in 3.1 with
>> LUCENE-2386.
>>
>> In 2.9.x (not 3.0.x) there is still an autoCommit parameter,
>> defaulting to false, but if you set it to true then IndexWriter will
>> periodically commit.
>>
>> Seeing segment files created and merged is definitely expected, but
>> it's not expected to see segments_N files unless you pass
>> autoCommit=true.
>>
>> Mike McCandless
>>
>> http://blog.mikemccandless.com
>>
>> On Thu, Jun 14, 2012 at 8:14 PM, Itamar Syn-Hershko <ita...@code972.com>
>> wrote:
>> > Not what I'm seeing. I actually see a lot of segments created and merged
>> > while it operates. Expected?
>> >
>> > Reminding you, this is 2.9.4 / 3.0.3
>> >
>> > On Fri, Jun 15, 2012 at 3:10 AM, Michael McCandless
>> > <luc...@mikemccandless.com> wrote:
>> >>
>> >> Right: Lucene never autocommits anymore ...
>> >>
>> >> If you create a new index, add a bunch of docs, and things crash
>> >> before you have a chance to commit, then there is no index (not even a
>> >> 0 doc one) in that directory.
>> >>
>> >> Mike McCandless
>> >>
>> >> http://blog.mikemccandless.com
>> >>
>> >> On Thu, Jun 14, 2012 at 1:41 PM, Itamar Syn-Hershko
>> >> <ita...@code972.com> wrote:
>> >> > I'm quite certain this shouldn't happen also when Commit wasn't
>> >> > called.
>> >> >
>> >> > Mike, can you comment on that?
>> >> >
>> >> > On Thu, Jun 14, 2012 at 8:03 PM, Christopher Currens
>> >> > <currens.ch...@gmail.com> wrote:
>> >> >>
>> >> >> Well, the only thing I see is that there is no place where
>> >> >> writer.Commit() is called in the delegate assigned to
>> >> >> corpusReader.OnDocument. I know that lucene is very
>> >> >> transactional, and at least in 3.x, the writer will never auto
>> >> >> commit to the index. You can write millions of documents, but
>> >> >> if commit is never called, those documents aren't actually part
>> >> >> of the index. Committing isn't a cheap operation, so you
>> >> >> definitely don't want to do it on every document.
>> >> >>
>> >> >> You can test it yourself with this (naive) solution. Right
>> >> >> below the writer.SetUseCompoundFile(false) line, add
>> >> >> "int numDocsAdded = 0;". At the end of the
>> >> >> corpusReader.OnDocument delegate add:
>> >> >>
>> >> >> // Example only. I wouldn't suggest committing this often
>> >> >> if(++numDocsAdded % 5 == 0)
>> >> >> {
>> >> >>     writer.Commit();
>> >> >> }
>> >> >>
>> >> >> I had the application crash for real on this file:
>> >> >>
>> >> >> http://dumps.wikimedia.org/gawiktionary/20120613/gawiktionary-20120613-pages-meta-history.xml.bz2,
>> >> >> about 20% into the operation. Without the commit, the index is
>> >> >> empty. Add it in, and I get 755 files in the index after it
>> >> >> crashes.
>> >> >>
>> >> >> Thanks,
>> >> >> Christopher
>> >> >>
>> >> >> On Wed, Jun 13, 2012 at 6:13 PM, Itamar Syn-Hershko
>> >> >> <ita...@code972.com> wrote:
>> >> >>
>> >> >> > Yes, reproduced in first try. See attached program - I
>> >> >> > referenced it to current trunk.
>> >> >> >
>> >> >> > On Thu, Jun 14, 2012 at 3:54 AM, Itamar Syn-Hershko
>> >> >> > <ita...@code972.com> wrote:
>> >> >> >
>> >> >> >> Christopher,
>> >> >> >>
>> >> >> >> I used the IndexBuilder app from here
>> >> >> >> https://github.com/synhershko/Talks/tree/master/LuceneNeatThings
>> >> >> >> with a 8.5GB wikipedia dump.
>> >> >> >>
>> >> >> >> After running for 2.5 days I had to forcefully close it
>> >> >> >> (infinite loop in the wiki-markdown parser at 92%, go
>> >> >> >> figure), and the 40-something GB index I had by then was
>> >> >> >> unusable. I then was able to reproduce this.
>> >> >> >>
>> >> >> >> Please note I now added a few safe-guards you might want to
>> >> >> >> remove to make sure the app really crashes on process kill.
>> >> >> >>
>> >> >> >> I'll try to come up with a better way to reproduce this -
>> >> >> >> hopefully Mike will be able to suggest better ways than
>> >> >> >> manual process kill...
>> >> >> >>
>> >> >> >> On Thu, Jun 14, 2012 at 1:41 AM, Christopher Currens
>> >> >> >> <currens.ch...@gmail.com> wrote:
>> >> >> >>
>> >> >> >>> Mike, The codebase for lucene.net should be almost identical
>> >> >> >>> to java's 3.0.3 release, and LUCENE-1044 is included in that.
>> >> >> >>>
>> >> >> >>> Itamar, are you committing the index regularly? I only ask
>> >> >> >>> because I can't reproduce it myself by forcibly terminating
>> >> >> >>> the process while it's indexing. I've tried both 3.0.3 and
>> >> >> >>> 2.9.4.
>> >> >> >>> If I don't commit at all and terminate the process (even
>> >> >> >>> with 10,000 4K documents created), there will be no
>> >> >> >>> documents in the index when I open it in luke, which I
>> >> >> >>> expect. If I commit at 10,000 documents, and terminate it a
>> >> >> >>> few thousand after that, the index has the first ten
>> >> >> >>> thousand that were committed. I've even terminated it
>> >> >> >>> *while* a second commit was taking place, and it still had
>> >> >> >>> all of the documents I expected.
>> >> >> >>>
>> >> >> >>> It may be that I'm not trying to reproduce it correctly. Do
>> >> >> >>> you have a minimal amount of code that can reproduce it?
>> >> >> >>>
>> >> >> >>>
>> >> >> >>> Thanks,
>> >> >> >>> Christopher
>> >> >> >>>
>> >> >> >>> On Wed, Jun 13, 2012 at 9:31 AM, Michael McCandless
>> >> >> >>> <luc...@mikemccandless.com> wrote:
>> >> >> >>>
>> >> >> >>> > Hi Itamar,
>> >> >> >>> >
>> >> >> >>> > One quick question: does Lucene.Net include the fixes done
>> >> >> >>> > for LUCENE-1044 (to fsync files on commit)? Those are very
>> >> >> >>> > important for an index to be intact after OS/JVM crash or
>> >> >> >>> > power loss.
>> >> >> >>> >
>> >> >> >>> > More responses below:
>> >> >> >>> >
>> >> >> >>> > On Tue, Jun 12, 2012 at 8:20 PM, Itamar Syn-Hershko
>> >> >> >>> > <ita...@code972.com> wrote:
>> >> >> >>> >
>> >> >> >>> > > I'm a Lucene.Net committer, and there is a chance we
>> >> >> >>> > > have a bug in our FSDirectory implementation that causes
>> >> >> >>> > > indexes to get corrupted when indexing is cut while the
>> >> >> >>> > > IW is still open.
>> >> >> >>> > > As it stems from some retroactive fixes you made, I'd
>> >> >> >>> > > appreciate your feedback.
>> >> >> >>> > >
>> >> >> >>> > > Correct me if I'm wrong, but by design Lucene should be
>> >> >> >>> > > able to recover rather quickly from power failures or
>> >> >> >>> > > app crashes. Since existing segment files are read only,
>> >> >> >>> > > only new segments that are still being written can get
>> >> >> >>> > > corrupted. Hence, recovering from worst-case scenarios
>> >> >> >>> > > is done by simply removing the write.lock file. The
>> >> >> >>> > > worst that could happen then is having the last segment
>> >> >> >>> > > damaged, and that can be fixed by removing those files,
>> >> >> >>> > > possibly by running CheckIndex on the index.
>> >> >> >>> >
>> >> >> >>> > You shouldn't even have to run CheckIndex ... because (as
>> >> >> >>> > of LUCENE-1044) we now fsync all segment files before
>> >> >> >>> > writing the new segments_N file, and then removing old
>> >> >> >>> > segments_N files (and any segments that are no longer
>> >> >> >>> > referenced).
>> >> >> >>> >
>> >> >> >>> > You do have to remove the write.lock if you aren't using
>> >> >> >>> > NativeFSLockFactory (but this has been the default lock
>> >> >> >>> > impl for a while now).
>> >> >> >>> >
>> >> >> >>> > > Last week I have been playing with rather large indexes
>> >> >> >>> > > and crashed my app while it was indexing.
>> >> >> >>> > > I wasn't able to open the index, and Luke was even kind
>> >> >> >>> > > enough to wipe the index folder clean even though I
>> >> >> >>> > > opened it in read-only mode. I re-ran this, and after
>> >> >> >>> > > another crash running CheckIndex revealed nothing - the
>> >> >> >>> > > index was detected to be an empty one. I am not entirely
>> >> >> >>> > > sure what could be the cause for this, but I suspect it
>> >> >> >>> > > has been corrupted by the crash.
>> >> >> >>> >
>> >> >> >>> > Had no commit completed (no segments file written)?
>> >> >> >>> >
>> >> >> >>> > If you don't fsync then all sorts of crazy things are
>> >> >> >>> > possible...
>> >> >> >>> >
>> >> >> >>> > > I've been looking at these:
>> >> >> >>> > >
>> >> >> >>> > > https://issues.apache.org/jira/browse/LUCENE-3418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
>> >> >> >>> > >
>> >> >> >>> > > https://issues.apache.org/jira/browse/LUCENE-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
>> >> >> >>> >
>> >> >> >>> > (And LUCENE-1044 before that ... it was LUCENE-1044 that
>> >> >> >>> > LUCENE-2328 broke...).
>> >> >> >>> >
>> >> >> >>> > > And it seems like this is what I was experiencing. Mike
>> >> >> >>> > > and Mark will probably be able to tell if this is what
>> >> >> >>> > > they saw or not, but as far as I can tell this is not an
>> >> >> >>> > > expected behavior of a Lucene index.
>> >> >> >>> >
>> >> >> >>> > Definitely not expected behavior: assuming nothing is
>> >> >> >>> > flipping bits, then on OS/JVM crash or power loss your
>> >> >> >>> > index should be fine, just reverted to the last successful
>> >> >> >>> > commit.
>> >> >> >>> >
>> >> >> >>> > > What I'm looking for at the moment is some advice on
>> >> >> >>> > > what FSDirectory implementation to use to make sure no
>> >> >> >>> > > corruption can happen. The 3.4 version (which is where
>> >> >> >>> > > LUCENE-3418 was committed to) seems to handle a lot of
>> >> >> >>> > > things the 3.0 doesn't, but on the other hand
>> >> >> >>> > > LUCENE-3418 was introduced by changes made to the 3.0
>> >> >> >>> > > codebase.
>> >> >> >>> >
>> >> >> >>> > Hopefully it's just that you are missing fsync!
>> >> >> >>> >
>> >> >> >>> > > Also, is there any test in the suite checking for those
>> >> >> >>> > > scenarios?
>> >> >> >>> >
>> >> >> >>> > Our test framework has a sneaky MockDirectoryWrapper that,
>> >> >> >>> > after a test finishes, goes and corrupts any unsync'd
>> >> >> >>> > files and then verifies the index is still OK... it's good
>> >> >> >>> > because it'll catch any times we are missing calls to
>> >> >> >>> > sync, but, it's not low level enough such that if FSDir is
>> >> >> >>> > failing to actually call fsync (that was the bug in
>> >> >> >>> > LUCENE-3418) then it won't catch that...
>> >> >> >>> >
>> >> >> >>> > Mike McCandless
>> >> >> >>> >
>> >> >> >>> > http://blog.mikemccandless.com
>> >>
>> >> ---------------------------------------------------------------------
>> >> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> >> For additional commands, e-mail: dev-h...@lucene.apache.org
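
For reference, the commit ordering described in this thread (fsync the
segment files first, then write the new segments_N file, then remove the
older segments_N files) can be sketched roughly as below. This is only a
minimal illustration in plain Java and not Lucene's or Lucene.Net's actual
code: the class CommitSketch and its methods are hypothetical names, the
generation is written in decimal for simplicity, the segments file content
is just a list of names, and the sketch does not fsync the directory
itself.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not Lucene code: every name here is made up for illustration.
class CommitSketch {

    // Highest N for which a segments_N file exists; 0 means no commit has completed.
    static long latestGeneration(Path dir) throws IOException {
        long best = 0;
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir, "segments_*")) {
            for (Path p : files) {
                String suffix = p.getFileName().toString().substring("segments_".length());
                try {
                    best = Math.max(best, Long.parseLong(suffix));
                } catch (NumberFormatException ignored) {
                    // Not a commit file written by this sketch; skip it.
                }
            }
        }
        return best;
    }

    // Force one already-written file to stable storage.
    static void sync(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.WRITE)) {
            ch.force(true);
        }
    }

    // Commit: sync the data files, then publish them by writing segments_N last.
    static long commit(Path dir, List<String> segmentFiles) throws IOException {
        for (String name : segmentFiles) {
            sync(dir.resolve(name));
        }
        long gen = latestGeneration(dir) + 1;
        Path commitFile = dir.resolve("segments_" + gen);
        Files.write(commitFile, String.join("\n", segmentFiles).getBytes(StandardCharsets.UTF_8));
        sync(commitFile);
        // Only now is it safe to drop the older commit points.
        for (long old = 1; old < gen; old++) {
            Files.deleteIfExists(dir.resolve("segments_" + old));
        }
        return gen;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("commit-sketch");
        Files.write(dir.resolve("_0.cfs"), new byte[] {1, 2, 3});
        // A crash at this point leaves no segments_N file, so a reader sees no index.
        System.out.println("generation before commit: " + latestGeneration(dir));
        commit(dir, Arrays.asList("_0.cfs"));
        System.out.println("generation after commit: " + latestGeneration(dir));
    }
}
```

With this ordering, a crash before the segments_N write leaves the previous
commit (or no index at all) visible, which matches the "reverted to the
last successful commit" behavior discussed above.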