[ 
https://issues.apache.org/jira/browse/LUCENE-992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12525377
 ] 

Michael McCandless commented on LUCENE-992:
-------------------------------------------


>  - addDocument(Document doc, Analyzer analyzer, Term delTerm): is it
>    better to name it updateDocument?

Good, I'll make that change.

> - I didn't check all the variable accesses in DocumentsWriter, but
>   it seems abort() should lock for some of the variables it
>   accesses. Or make abort() a synchronized method.

OK, I will make abort() synchronized.

> - Observation: Large documents will block small documents from being
>   flushed if addDocument of large documents is called before that of
>   small ones. This is not the case before LUCENE-843.

Right, when multiple documents are in flight at once (because multiple
threads are adding documents), the documents must be flushed in order
of docID.  Each one grabs a unique (sequential) docID at the start
(synchronized), does the indexing un-synchronized, then flushes
(synchronized), but only if it is that document's "turn" to flush (i.e.,
it is the next docID to be written).  So if a large doc grabs a docID
first and a small doc then comes through, the small doc can finish
indexing before the large doc does, in which case the small doc is
buffered, waiting for the large doc to flush first.
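
To make the ordering concrete, here is a minimal sketch of the "flush in
docID order" idea described above.  It is not the actual DocumentsWriter
code: the class InOrderFlusher and its methods (reserveDocID, finish,
flushedDocs, bufferedCount) are hypothetical names made up for
illustration, and a String stands in for a document's indexed state.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Hypothetical illustration only; not Lucene's DocumentsWriter. */
class InOrderFlusher {
    private int nextDocID = 0;    // next docID to hand out
    private int nextFlushID = 0;  // docID whose "turn" it is to flush
    // Docs that finished indexing but must wait for an earlier docID.
    private final TreeMap<Integer, String> pending = new TreeMap<>();
    private final List<String> flushed = new ArrayList<>();

    /** Step 1 (synchronized): grab a unique, sequential docID. */
    synchronized int reserveDocID() {
        return nextDocID++;
    }

    // Step 2, the indexing itself, would run un-synchronized in the caller.

    /** Step 3 (synchronized): flush, but only in docID order. */
    synchronized void finish(int docID, String indexed) {
        pending.put(docID, indexed);
        // Flush every consecutive doc that is ready, starting with the
        // one whose turn it is; later docs stay buffered until then.
        while (!pending.isEmpty() && pending.firstKey() == nextFlushID) {
            flushed.add(pending.pollFirstEntry().getValue());
            nextFlushID++;
        }
    }

    synchronized List<String> flushedDocs() {
        return new ArrayList<>(flushed);
    }

    synchronized int bufferedCount() {
        return pending.size();
    }
}
```

If a large doc reserves docID 0 and a small doc reserves docID 1, calling
finish(1, ...) first leaves the small doc buffered; it is only written
once finish(0, ...) has run for the large doc.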

> I also slightly changed the exception semantics in IndexWriter:
> previously if a disk full (or other) exception was hit when flushing
> the buffered docs, the buffered deletes were retained but the
> partially flushed buffered docs (if any) were discarded.

> - Observation: Before LUCENE-843, both buffered docs and buffered
>   deletes were retained when such an exception occurs. Now both
>   buffered docs and buffered deletes would be discarded if an exception
>   is hit.

Right, although if the exception is hit after the commit point (e.g.,
while building the compound file) then the buffered docs & deletes
are added to the index.

I plan to commit this in a day or two.


> IndexWriter.updateDocument is no longer atomic
> ----------------------------------------------
>
>                 Key: LUCENE-992
>                 URL: https://issues.apache.org/jira/browse/LUCENE-992
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Index
>    Affects Versions: 2.2
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>            Priority: Minor
>             Fix For: 2.3
>
>         Attachments: LUCENE-992.patch
>
>
> Spinoff from LUCENE-847.
> Ning caught that as of LUCENE-843, we lost the atomicity of the delete
> + add in IndexWriter.updateDocument.
> Ning suggested a simple fix: move the buffered deletes into
> DocumentsWriter and let it do the delete + add atomically.  This has a
> nice side effect of also consolidating the "time to flush" logic in
> DocumentsWriter.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

