[ https://issues.apache.org/jira/browse/LUCENE-992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12525377 ]
Michael McCandless commented on LUCENE-992:
-------------------------------------------

> - addDocument(Document doc, Analyzer analyzer, Term delTerm): is it
>   better to name it updateDocument?

Good, I'll make that change.

> - I didn't check all the variable accesses in DocumentsWriter, but
>   it seems abort() should lock for some of the variables it
>   accesses. Or make abort() a synchronized method.

OK, I will make abort() synchronized.

> - Observation: Large documents will block small documents from being
>   flushed if addDocument of large documents is called before that of
>   small ones. This is not the case before LUCENE-843.

Right, when multiple documents are in flight at once (because multiple
threads are adding documents), the documents must be flushed in order
of docID. Each one grabs a unique (sequential) docID at the start
(synchronized), does the indexing un-synchronized, then flushes
(synchronized), but only if it is that document's "turn" to flush
(i.e., it is the next docID to be written). So if a large doc grabs its
docID first and then a small doc comes through, the small doc can
finish indexing before the large doc does, in which case the small doc
is buffered, waiting for the large doc to flush first.

> I also slightly changed the exception semantics in IndexWriter:
> previously if a disk full (or other) exception was hit when flushing
> the buffered docs, the buffered deletes were retained but the
> partially flushed buffered docs (if any) were discarded.

> - Observation: Before LUCENE-843, both buffered docs and buffered
>   deletes were retained when such an exception occurs. Now both
>   buffered docs and buffered deletes would be discarded if an exception
>   is hit.

Right, although if the exception is hit after the commit point (e.g.,
while building the compound file), then the buffered docs & deletes are
added to the index.

I plan to commit this in a day or two.
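To make the "turn to flush" ordering described above concrete, here is a minimal sketch. It is not the actual DocumentsWriter code: the class `InOrderFlusher` and its methods are hypothetical names invented for illustration, and documents are stood in for by plain strings. It only shows the pattern of reserving a sequential docID under a lock, finishing out of order, and flushing strictly in docID order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Hypothetical illustration only; not Lucene's DocumentsWriter. */
class InOrderFlusher {
    private int nextDocID = 0;       // next docID to hand out
    private int nextFlushDocID = 0;  // docID whose "turn" it is to flush
    // Documents that finished indexing but are waiting for an earlier docID.
    private final TreeMap<Integer, String> pending = new TreeMap<>();
    private final List<String> flushed = new ArrayList<>();

    /** Step 1 (synchronized): grab a unique, sequential docID. */
    synchronized int reserveDocID() {
        return nextDocID++;
    }

    /**
     * Step 3 (synchronized): called after the un-synchronized indexing work.
     * The document is buffered, then every consecutive document starting at
     * nextFlushDocID is flushed; later docIDs stay buffered until their turn.
     */
    synchronized void finish(int docID, String indexedDoc) {
        pending.put(docID, indexedDoc);
        while (!pending.isEmpty() && pending.firstKey() == nextFlushDocID) {
            flushed.add(pending.pollFirstEntry().getValue());
            nextFlushDocID++;
        }
    }

    synchronized List<String> flushedSoFar() {
        return new ArrayList<>(flushed);
    }

    synchronized int bufferedCount() {
        return pending.size();
    }
}
```

In this sketch, if a large document reserves docID 0 and a small one reserves docID 1, calling `finish(1, ...)` first leaves the small document buffered; it is flushed only once `finish(0, ...)` arrives, which is the blocking behavior noted in the observation.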
> IndexWriter.updateDocument is no longer atomic
> ----------------------------------------------
>
>                 Key: LUCENE-992
>                 URL: https://issues.apache.org/jira/browse/LUCENE-992
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Index
>    Affects Versions: 2.2
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>            Priority: Minor
>             Fix For: 2.3
>
>         Attachments: LUCENE-992.patch
>
>
> Spinoff from LUCENE-847.
> Ning caught that as of LUCENE-843, we lost the atomicity of the delete
> + add in IndexWriter.updateDocument.
> Ning suggested a simple fix: move the buffered deletes into
> DocumentsWriter and let it do the delete + add atomically. This has a
> nice side effect of also consolidating the "time to flush" logic in
> DocumentsWriter.
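
The fix suggested in the issue description (buffering the delete and the add inside one component so they happen atomically) might look roughly like the sketch below. This is an assumption-laden illustration, not the contents of LUCENE-992.patch: `AtomicUpdateBuffer`, its fields, and the idea of recording how many buffered docs a delete applies to are all hypothetical, and terms and documents are represented as plain strings.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration only; not the code from the attached patch. */
class AtomicUpdateBuffer {
    private final List<String> bufferedDocs = new ArrayList<>();
    // For each delete term: the number of buffered docs present when the
    // delete was recorded. In this sketch the delete applies only to
    // buffered docs whose position is below that limit (an assumption).
    private final Map<String, Integer> bufferedDeleteTerms = new HashMap<>();

    /**
     * Records the delete and buffers the new document under a single lock,
     * so no other thread can observe the delete without the add.
     */
    synchronized void updateDocument(String delTerm, String doc) {
        bufferedDeleteTerms.put(delTerm, bufferedDocs.size());
        bufferedDocs.add(doc);
    }

    synchronized int bufferedDocCount() {
        return bufferedDocs.size();
    }

    /** Returns the limit recorded for a term, or null if none is buffered. */
    synchronized Integer deleteLimit(String delTerm) {
        return bufferedDeleteTerms.get(delTerm);
    }
}
```

Because both buffers live behind the same lock in this sketch, a single place also knows the total buffered state, which is one way the "time to flush" decision could be consolidated.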