[ https://issues.apache.org/jira/browse/LUCENE-847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ning Li updated LUCENE-847:
---------------------------
Attachment: concurrentMerge.patch
Here is a patch for concurrent merge as discussed in:
http://www.gossamer-threads.com/lists/lucene/java-dev/45651?search_string=concurrent%20merge;#45651
I put it under this issue because it helps design and verify a factored merge
policy which would provide good support for concurrent merge.
As described before, a merge thread is started when a writer is created and
stopped when the writer is closed. The merge process consists of three steps:
first, create a merge task/spec; then, carry out the actual merge; finally,
"commit" the merged segment (replace the segments it merged in segmentInfos),
but only after the appropriate deletes are applied. The first and last steps are fast
and synchronous. The second step is where concurrency is achieved. Does it make
sense to capture them as separate steps in the factored merge policy?
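To make the three steps concrete, here is a minimal sketch, not the code in the patch. The class and method names (ThreeStepMergeSketch, createMergeSpec, doMerge, commitMerge) are hypothetical, segmentInfos is modeled as a plain list of segment names, and applying deletes is only marked by a comment. It assumes "synchronous" means the first and last steps run while holding the writer's lock.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ThreeStepMergeSketch {
    // Hypothetical stand-in for segmentInfos: a list of segment names.
    private final List<String> segmentInfos = new ArrayList<>();

    // One background merge thread, started with the writer and stopped in close().
    private final ExecutorService mergeThread = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "merge-thread");
        t.setDaemon(true);
        return t;
    });

    public synchronized void addSegment(String name) { segmentInfos.add(name); }

    public synchronized List<String> segments() { return new ArrayList<>(segmentInfos); }

    // Step 1 (fast, under the lock): choose the segments to merge.
    private synchronized List<String> createMergeSpec(int count) {
        int n = Math.min(count, segmentInfos.size());
        return new ArrayList<>(segmentInfos.subList(0, n));
    }

    // Step 2 (slow, no lock held): the actual merge. New segments can be added meanwhile.
    private String doMerge(List<String> spec) {
        return "merged(" + String.join("+", spec) + ")";
    }

    // Step 3 (fast, under the lock): replace the merged segments with the result.
    // A real implementation would apply pending deletes here before committing.
    private synchronized void commitMerge(List<String> spec, String merged) {
        int at = segmentInfos.indexOf(spec.get(0));
        segmentInfos.removeAll(spec);
        segmentInfos.add(at, merged);
    }

    // Runs step 1 on the calling thread, then steps 2 and 3 on the merge thread.
    public CompletableFuture<Void> mergeInBackground(int count) {
        List<String> spec = createMergeSpec(count);
        if (spec.size() < 2) {
            return CompletableFuture.completedFuture(null); // nothing worth merging
        }
        return CompletableFuture.runAsync(() -> commitMerge(spec, doMerge(spec)), mergeThread);
    }

    public void close() { mergeThread.shutdown(); }

    public static void main(String[] args) {
        ThreeStepMergeSketch writer = new ThreeStepMergeSketch();
        writer.addSegment("_0");
        writer.addSegment("_1");
        writer.addSegment("_2");
        writer.mergeInBackground(2).join(); // wait for the merge to be committed
        System.out.println(writer.segments()); // prints [merged(_0+_1), _2]
        writer.close();
    }
}
```

The sketch does nothing to keep a second merge from selecting segments that are already being merged; a real policy would have to track that.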
As discussed in the thread linked above, documents can be buffered while
segments are being merged, but no more than maxBufferedDocs can be buffered at
any time, so this version provides only limited concurrency. The main goal is to
keep ingestion hiccups short, especially when the ingestion rate is low. Once
the merge policy is factored out, we could provide different concurrent merge
policies that offer different levels of concurrency. :-)
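The buffering limit can be illustrated with a small sketch. This is only one reading of the constraint and may not match what the patch does: it assumes a full buffer cannot be flushed while a merge is still running, so ingestion stalls at that point. The class BoundedBufferSketch, the mergeRunning flag, and tryAddDocument are all hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

public class BoundedBufferSketch {
    private final int maxBufferedDocs;
    private final List<String> buffer = new ArrayList<>();
    private boolean mergeRunning = false; // assumption: set by the merge thread
    private int flushedSegments = 0;

    public BoundedBufferSketch(int maxBufferedDocs) {
        this.maxBufferedDocs = maxBufferedDocs;
    }

    public synchronized void setMergeRunning(boolean running) { mergeRunning = running; }

    public synchronized int bufferedDocs() { return buffer.size(); }

    public synchronized int flushedSegments() { return flushedSegments; }

    // Returns true if the document was buffered, false if the caller would
    // have to wait (buffer full and, by assumption, no flush during a merge).
    public synchronized boolean tryAddDocument(String doc) {
        if (buffer.size() == maxBufferedDocs) {
            if (mergeRunning) {
                return false; // the ingestion hiccup: wait for the merge to finish
            }
            buffer.clear();   // flush the buffered docs as a new segment
            flushedSegments++;
        }
        buffer.add(doc);
        return true;
    }

    public static void main(String[] args) {
        BoundedBufferSketch writer = new BoundedBufferSketch(2);
        writer.setMergeRunning(true);
        System.out.println(writer.tryAddDocument("d1")); // prints true
        System.out.println(writer.tryAddDocument("d2")); // prints true
        System.out.println(writer.tryAddDocument("d3")); // prints false
        writer.setMergeRunning(false);
        System.out.println(writer.tryAddDocument("d3")); // prints true
    }
}
```

Under this reading, concurrency is limited to the time it takes to fill one buffer: a slow ingestion rate rarely hits the limit, while a fast one still stalls on long merges.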
All unit tests pass. If IndexWriter is replaced with
IndexWriterConcurrentMerge, all unit tests pass except the following:
- TestAddIndexesNoOptimize and TestIndexWriter*
  These tests check segment sizes on the assumption that all merges are done.
  They pass if the checks are performed after the concurrent merges finish.
  The modified tests (which wait for concurrent merges to finish) are in
  TestIndexWriterConcurrentMerge*.
- testExactFieldNames in TestBackwardCompatibility and
  testDeleteLeftoverFiles in TestIndexFileDeleter
  In both cases the file name segments_a is expected, but the actual file is
  segments_7. With concurrent merge, if the compound file format is used, only
  the compound version is "committed" (added to segmentInfos), not the
  non-compound version, hence the lower segments generation number.
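To put numbers on the two file names: the sketch below assumes the suffix of segments_N is the generation written in base 36 (Character.MAX_RADIX), which is how Lucene names these files as far as I know. The helper names generationOf and fileNameFor are made up for illustration.

```java
public class SegmentsGeneration {
    private static final String PREFIX = "segments_";

    // Decode the generation from a segments_N file name (suffix assumed base 36).
    static long generationOf(String fileName) {
        if (!fileName.startsWith(PREFIX)) {
            throw new IllegalArgumentException("not a segments_N file: " + fileName);
        }
        return Long.parseLong(fileName.substring(PREFIX.length()), Character.MAX_RADIX);
    }

    // Encode a generation back into a file name.
    static String fileNameFor(long generation) {
        return PREFIX + Long.toString(generation, Character.MAX_RADIX);
    }

    public static void main(String[] args) {
        System.out.println(generationOf("segments_a")); // prints 10
        System.out.println(generationOf("segments_7")); // prints 7
        System.out.println(fileNameFor(10));            // prints segments_a
    }
}
```

Under that assumption the tests expect generation 10 and get generation 7, i.e. three fewer commits of segmentInfos, which fits the non-compound versions not being committed.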
Cheers,
Ning
> Factor merge policy out of IndexWriter
> --------------------------------------
>
> Key: LUCENE-847
> URL: https://issues.apache.org/jira/browse/LUCENE-847
> Project: Lucene - Java
> Issue Type: Improvement
> Reporter: Steven Parkes
> Assigned To: Steven Parkes
> Attachments: concurrentMerge.patch, LUCENE-847.txt
>
>
> If we factor the merge policy out of IndexWriter, we can make it pluggable,
> making it possible for apps to choose a custom merge policy and for easier
> experimenting with merge policy variants.