On Wed, Jun 13, 2012 at 8:45 PM, Itamar Syn-Hershko ita...@code972.com wrote:
Mike,
On Wed, Jun 13, 2012 at 7:31 PM, Michael McCandless
luc...@mikemccandless.com wrote:
Hi Itamar,
One quick question: does Lucene.Net include the fixes done for
LUCENE-1044 (to fsync files on commit)? Those
I think the 0-segment segments_1 file is expected in Lucene.Net since
we changed that later, in 3.1 in Lucene (LUCENE-2386)?
Mike McCandless
http://blog.mikemccandless.com
On Thu, Jun 14, 2012 at 8:40 PM, Itamar Syn-Hershko ita...@code972.com wrote:
I can confirm 2.9.4 had autoCommit, but it
Well, the only thing I see is that there is no place where writer.Commit()
is called in the delegate assigned to corpusReader.OnDocument. I know that
Lucene is very transactional, and at least in 3.x, the writer will never
auto commit to the index. You can write millions of documents, but if
I'm quite certain this shouldn't happen even when Commit wasn't called.
Mike, can you comment on that?
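[Editor's sketch, not part of the original messages.] To make the commit semantics under discussion concrete, here is a toy model of commit-point visibility: documents added after the last commit are buffered, and a forced kill discards them while earlier commit points stay intact. The class ToyIndexWriter, its methods, and the commit_every parameter are hypothetical names for illustration only. This is not the Lucene or Lucene.Net API, and it says nothing about whether a real index should end up corrupted in that situation.

```python
class ToyIndexWriter:
    """Toy model of commit-point visibility; NOT the Lucene/Lucene.Net API."""

    def __init__(self):
        self.committed = []  # documents a newly opened reader would see
        self.pending = []    # documents added since the last commit

    def add_document(self, doc):
        # Buffered only; invisible to readers until commit() is called.
        self.pending.append(doc)

    def commit(self):
        # Publish everything buffered so far as the new commit point.
        self.committed.extend(self.pending)
        self.pending.clear()

    def crash(self):
        # Simulate a forced process kill: uncommitted work is discarded,
        # previously committed documents are untouched.
        self.pending.clear()


def index_with_periodic_commit(writer, docs, commit_every=1000):
    """Add documents and commit after every `commit_every` additions."""
    for count, doc in enumerate(docs, start=1):
        writer.add_document(doc)
        if count % commit_every == 0:
            writer.commit()
```

In this model, committing at intervals bounds how much work a forced shutdown can lose: only the documents added since the most recent commit disappear.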
On Thu, Jun 14, 2012 at 8:03 PM, Christopher Currens
currens.ch...@gmail.com wrote:
Well, the only thing I see is that there is no place where writer.Commit()
is called in the delegate
If this is the case, 2328 probably made its way to Lucene.Net since we are
using the released sources for porting, and we now need to apply 3418 in
the current version.
Itamar: I confirmed that 2328 is in the latest code.
Thanks,
Troy
On Wed, Jun 13, 2012 at 5:45 PM, Itamar Syn-Hershko
I can confirm 2.9.4 had autoCommit, but it is gone in 3.0.3 already, so
Lucene.Net doesn't have autoCommit.
So I don't have autoCommit set to true, but I can clearly see a segments_1
file there along with the other files. If that helps, it always keeps
the name segments_1 and is 32 bytes,
Mike, the codebase for Lucene.Net should be almost identical to Java's
3.0.3 release, and LUCENE-1044 is included in that.
Itamar, are you committing the index regularly? I only ask because I can't
reproduce it myself by forcibly terminating the process while it's
indexing. I've tried both
Christopher,
I used the IndexBuilder app from here
https://github.com/synhershko/Talks/tree/master/LuceneNeatThings with a
an 8.5GB Wikipedia dump.
After running for 2.5 days I had to forcefully close it (infinite loop in
the wiki-markdown parser at 92%, go figure), and the 40-something GB index
I