[ https://issues.apache.org/jira/browse/LUCENE-600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12749656#action_12749656 ]
Chuck Williams commented on LUCENE-600:
---------------------------------------
The version attached here is more than three years old. Our version has
evolved along with Lucene, and the whole apparatus is fully functional with
the latest Lucene.
The fields in each subindex are disjoint. A logical Document is the collection
of all fields from the real Documents that share a doc-id across the real
subindexes (i.e., the model Doug started with ParallelReader). Deletion by
query or by term poses no problem, because it deletes the whole logical
Document. Field updates in our scheme don't use deletion.
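
To make the logical-Document model concrete, here is a minimal sketch in plain
Java. It does not use the Lucene API: each subindex is modeled as a list of
field maps indexed by doc-id, and the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model, not Lucene code: a subindex is a list of field maps,
// and the list position plays the role of the doc-id.
class LogicalDocumentSketch {

    // Build the logical document for docId as the union of the fields stored
    // at that doc-id in every subindex. Field names must be disjoint.
    static Map<String, String> logicalDocument(
            List<List<Map<String, String>>> subindexes, int docId) {
        Map<String, String> logical = new LinkedHashMap<>();
        for (List<Map<String, String>> subindex : subindexes) {
            for (Map.Entry<String, String> field : subindex.get(docId).entrySet()) {
                if (logical.putIfAbsent(field.getKey(), field.getValue()) != null) {
                    throw new IllegalStateException(
                            "field appears in more than one subindex: " + field.getKey());
                }
            }
        }
        return logical;
    }

    public static void main(String[] args) {
        List<Map<String, String>> bodyIndex = Arrays.asList(
                Collections.singletonMap("body", "first text"),
                Collections.singletonMap("body", "second text"));
        List<Map<String, String>> tagIndex = Arrays.asList(
                Collections.singletonMap("tags", "a"),
                Collections.singletonMap("tags", "b"));
        // Logical document 1 combines the body and tags fields stored at doc-id 1.
        System.out.println(logicalDocument(Arrays.asList(bodyIndex, tagIndex), 1));
    }
}
```

Deleting a logical Document in this model means removing position docId from
every subindex, which is why deletion by query or term stays consistent.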
Merge-by-size is only an issue if you allow it to be decided independently in
each subindex. In practice, deciding independently is not very important,
since one subindex is size-dominant (the one containing the document body
field). One can merge that subindex by size and force the others to merge
consistently.
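
One way to picture "merge the dominant subindex by size and force the others
to follow" is the sketch below. It is plain Java with no Lucene classes; the
selection rule (first run of mergeFactor adjacent small segments) and all
names are assumptions for illustration, not our actual merge policy.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model, not Lucene code.
class ConsistentMergeSketch {

    // Assumed by-size rule: pick the first run of mergeFactor adjacent
    // segments in the dominant subindex whose byte sizes are all below
    // maxBytes. Returns {start, end} with end exclusive, or null if no run
    // qualifies. mergeFactor is assumed to be at least 1.
    static int[] selectMerge(long[] dominantSegmentBytes, int mergeFactor, long maxBytes) {
        int run = 0;
        for (int i = 0; i < dominantSegmentBytes.length; i++) {
            run = dominantSegmentBytes[i] < maxBytes ? run + 1 : 0;
            if (run == mergeFactor) {
                return new int[] {i - mergeFactor + 1, i + 1};
            }
        }
        return null;
    }

    // Apply the chosen segment range to any subindex. Segments are modeled
    // only by their document counts, which correspond across subindexes.
    static List<Integer> applyMerge(List<Integer> segmentDocCounts, int[] range) {
        List<Integer> merged = new ArrayList<>(segmentDocCounts.subList(0, range[0]));
        int total = 0;
        for (int i = range[0]; i < range[1]; i++) {
            total += segmentDocCounts.get(i);
        }
        merged.add(total);
        merged.addAll(segmentDocCounts.subList(range[1], segmentDocCounts.size()));
        return merged;
    }

    public static void main(String[] args) {
        // The decision is made once, from the body subindex's byte sizes.
        int[] range = selectMerge(new long[] {900, 10, 20, 30, 800}, 3, 100);
        // The same range is then applied to every subindex.
        List<Integer> docCounts = Arrays.asList(50, 5, 5, 5, 40);
        System.out.println(applyMerge(docCounts, range));
    }
}
```

Because every subindex applies the same range, segment boundaries and doc
counts stay aligned even though only one subindex's sizes were consulted.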
The only reason for the corresponding-segment constraint is that purging
deleted documents changes doc-ids. I know some Lucene apps address this by
never purging deleted documents, which is OK in some domains where deletion
is rare. I think there are other ways to resolve it as well.
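
The doc-id shift is easy to see in a toy model. The sketch below is plain
Java, not Lucene code, and its names are hypothetical: purging renumbers the
survivors by position, so a purge in only one subindex breaks the pairing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;

// Hypothetical model, not Lucene code: list position plays the role of doc-id.
class PurgeSketch {

    // Drop the deleted entries. Survivors are renumbered 0..n-1 by position,
    // which is how purging changes doc-ids.
    static List<String> purge(List<String> docs, Set<Integer> deletedDocIds) {
        List<String> survivors = new ArrayList<>();
        for (int docId = 0; docId < docs.size(); docId++) {
            if (!deletedDocIds.contains(docId)) {
                survivors.add(docs.get(docId));
            }
        }
        return survivors;
    }

    public static void main(String[] args) {
        List<String> bodies = Arrays.asList("body0", "body1", "body2");
        List<String> tags = Arrays.asList("tags0", "tags1", "tags2");
        Set<Integer> deleted = Collections.singleton(1);

        List<String> purgedBodies = purge(bodies, deleted);
        // Only the body subindex purged: doc-id 1 now pairs body2 with tags1.
        System.out.println(purgedBodies.get(1) + " / " + tags.get(1));
        // Both subindexes purged together: doc-id 1 pairs body2 with tags2.
        System.out.println(purgedBodies.get(1) + " / " + purge(tags, deleted).get(1));
    }
}
```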
> ParallelWriter companion to ParallelReader
> ------------------------------------------
>
> Key: LUCENE-600
> URL: https://issues.apache.org/jira/browse/LUCENE-600
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Index
> Affects Versions: 2.1
> Reporter: Chuck Williams
> Priority: Minor
> Attachments: ParallelWriter.patch
>
>
> A new class ParallelWriter is provided that serves as a companion to
> ParallelReader. ParallelWriter meets all of the doc-id synchronization
> requirements of ParallelReader, subject to:
> 1. ParallelWriter.addDocument() is synchronized, which might have an
> adverse effect on performance. The writes to the sub-indexes are, however,
> done in parallel.
> 2. The application must ensure that the ParallelReader is never reopened
> inside ParallelWriter.addDocument(), else it might find the sub-indexes out
> of sync.
> 3. The application must deal with recovery from
> ParallelWriter.addDocument() exceptions. Recovery must restore the
> synchronization of doc-ids, e.g. by deleting any trailing document(s) in one
> sub-index that were not successfully added to all sub-indexes, and then
> optimizing all sub-indexes.
> A new interface, Writable, is provided to abstract IndexWriter and
> ParallelWriter. This is in the same spirit as the existing Searchable and
> Fieldable interfaces.
> This implementation uses Java 1.5. The patch applies against today's svn
> head. All tests pass, including the new TestParallelWriter.
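
The recovery step in item 3 of the description can be sketched as follows.
This is plain Java, not the patch's code: sub-indexes are modeled as mutable
lists, the names are hypothetical, and the optimize step that a real index
would need afterwards has no counterpart in the model.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model, not Lucene code: list position plays the role of doc-id.
class ResyncSketch {

    // After a failed add, some sub-indexes may hold trailing documents that
    // never reached the others. Trim every sub-index to the shortest length
    // so doc-ids line up again. Returns the common document count.
    static int resync(List<List<String>> subindexes) {
        if (subindexes.isEmpty()) {
            return 0;
        }
        int common = Integer.MAX_VALUE;
        for (List<String> subindex : subindexes) {
            common = Math.min(common, subindex.size());
        }
        for (List<String> subindex : subindexes) {
            // Remove the trailing documents beyond the common length.
            subindex.subList(common, subindex.size()).clear();
        }
        return common;
    }

    public static void main(String[] args) {
        List<String> bodies = new ArrayList<>(Arrays.asList("body0", "body1", "body2"));
        // The add of document 2 failed before it reached this sub-index.
        List<String> tags = new ArrayList<>(Arrays.asList("tags0", "tags1"));
        int count = resync(Arrays.asList(bodies, tags));
        System.out.println(count + " documents remain in each sub-index");
    }
}
```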