Hi
Our team is looking for a way to construct (or rebuild) a deterministic sorted
index concurrently. (I know Lucene can do this sequentially, but that is
sometimes too slow for us.)
Currently we have roughly two ideas. Both assume there is a pre-built index
and that we have dumped a doc-to-segment map from it, so that IndexWriter can
know which doc belongs to which segment:
1. First build the index in the normal way (concurrently). After the index is
built, use the "addIndexes" functionality to merge documents into the correct
segments.
2. By controlling FlushPolicy and other related classes, make sure each
segment created (before merge) contains only documents that belong to a single
segment of the pre-built index. Then create a dedicated MergePolicy that
only merges segments belonging to the same pre-built segment.
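
To make the shared first step concrete: both ideas need to bucket incoming
docs by their target segment using the dumped map. Below is a minimal sketch
of that bucketing in plain Java. It does not touch any Lucene API; the class
name SegmentPlanner, the method groupBySegment, and the use of string doc IDs
and segment names are all hypothetical and only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Lucene.
final class SegmentPlanner {

    /**
     * Groups doc IDs by the segment they belonged to in the pre-built index.
     *
     * docToSegment stands in for the dumped doc-to-segment map (assumed format:
     * doc ID -> segment name). Buckets are keyed in sorted segment-name order,
     * and docs keep their input order inside each bucket.
     */
    static Map<String, List<String>> groupBySegment(
            List<String> docIds, Map<String, String> docToSegment) {
        Map<String, List<String>> plan = new TreeMap<>();
        for (String docId : docIds) {
            String segment = docToSegment.get(docId);
            if (segment == null) {
                // A doc missing from the map cannot be placed deterministically.
                throw new IllegalArgumentException(
                        "No segment recorded for doc " + docId);
            }
            plan.computeIfAbsent(segment, k -> new ArrayList<>()).add(docId);
        }
        return plan;
    }
}
```

Each bucket could then be handed to its own indexing thread (idea 2) or used
to decide which docs go into which addIndexes call (idea 1). How to wire
that into IndexWriter is exactly the open question.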

Basically, we think the first one is easier to implement and the second one is
faster. We would appreciate any ideas, suggestions, or feedback.

Thanks
Patrick Zhai
