[ https://issues.apache.org/jira/browse/LUCENE-847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12523621 ]
Ning Li commented on LUCENE-847: -------------------------------- I include comments for both LUCENE-847 and LUCENE-870 here since they are closely related. I like the stateless approach used for refactoring merge policy. But modeling concurrent merge (ConcurrentMergePolicyWrapper) as a MergePolicy seems to be inconsistent with the MergePolicy interface: 1 As you pointed out, "the merge policy is no longer responsible for running the merges itself". MergePolicy.maybeMerge simply returns a merge specification. But ConcurrentMergePolicyWrapper.maybeMerge actually starts concurrent merge threads thus doing the merges. 2 Related to 1, cascading is done in IndexWriter in non-concurrent case. But in concurrent case, cascading is also done in merge threads which are started by ConcurrentMergePolicyWrapper.maybeMerge. MergePolicy.maybeMerge should continue to simply return a merge specification. (BTW, should we rename this maybeMerge to, say, findCandidateMerges?) Can we carve the merge process out of IndexWriter into a Merger? IndexWriter still provides the building blocks - merge(OneMerge), mergeInit(OneMerge), etc. Merger uses these building blocks. A ConcurrentMerger extends Merger but starts concurrent merge threads as ConcurrentMergePolicyWrapper does. Other comments: 1 UpdateDocument's and deleteDocument's bufferDeleteTerm are synchronized on different variables in this patch. However, the semantics of updateDocument changed since LUCENE-843. Before LUCENE-843, updateDocument, which is a delete and an insert, guaranteed the delete and the insert are committed together (thus an update). Now it's possible that they are committed in different transactions. If we consider DocumentsWriter as the RAM staging area for IndexWriter, then deletes are also buffered in RAM staging area and we can restore our previous semantics, right? 2 OneMerge.segments seems to rely on its segment infos' reference to segment infos of IndexWriter.segmentInfos. The use in commitMerge, which calls ensureContiguousMerge, is an example. However, segmentInfos can be a cloned copy because of exceptions, thus the reference broken. 3 Calling optimize of an IndexWriter with the current ConcurrentMergePolicyWrapper may cause deadlock: the one merge spec returned by MergePolicy.optimize may be in conflict with a concurrent merge (the same merge spec will be returned without changes to segmentInfos), but a concurrent merge cannot finish because optimize is holding the lock. 4 Finally, a couple of minor things: 1 LogMergePolicy.useCompoundFile(SegmentInfos infos, SegmentInfo info) and useCompoundDocStore(SegmentInfos infos): why the parameters? 2 Do we need doMergeClose in IndexWriter? Can we simply close a MergePolicy if not null? > Factor merge policy out of IndexWriter > -------------------------------------- > > Key: LUCENE-847 > URL: https://issues.apache.org/jira/browse/LUCENE-847 > Project: Lucene - Java > Issue Type: Improvement > Components: Index > Reporter: Steven Parkes > Assignee: Steven Parkes > Fix For: 2.3 > > Attachments: concurrentMerge.patch, LUCENE-847.patch.txt, > LUCENE-847.patch.txt, LUCENE-847.take3.patch, LUCENE-847.take4.patch, > LUCENE-847.txt > > > If we factor the merge policy out of IndexWriter, we can make it pluggable, > making it possible for apps to choose a custom merge policy and for easier > experimenting with merge policy variants. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]