On Mon, Jul 6, 2009 at 2:18 AM, John Wang<john.w...@gmail.com> wrote:
> Currently, addIndexesNoOptimize(Directory[] dir) is really really
> really fast! (I duplicated my index of 15k docs 200 times and created a 3M
> doc index in less than a minute) Perhaps we should handle duplicate
> directory names more gracefully? e.g. append a numeral after the segment
> name or something? (I'd be happy to work on a patch for it)

I guess we could explicitly disambiguate when adding external SegmentInfo
instances into IndexWriter's segmentInfos (add a new member to SegmentInfo
that's normally set to a default value but is set to unique values when
importing dups, and then use that member in hashCode/equals). It's
somewhat "smelly" though...

Or you could call addIndexesNoOptimize N times instead; I wonder how the
performance would compare. Is performance a real issue here? This is
just for testing, right?

> For what I need now, I think in my case addIndexesNoOptimize(IndexReader[])
> would work as well (I wouldn't know how performance would compare though).

Implementing this is rather tricky, because MergePolicy expects to
receive SegmentInfo instances, not IndexReader instances, to make its
decisions.

Mike
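
A minimal sketch of the disambiguation idea discussed above: an extra member that normally holds a default value, is set to a unique value when a duplicate is imported, and takes part in hashCode/equals. This is not Lucene code. SegmentKey, importId, distinctAfterImport, and the "/idx/source" path are hypothetical names made up for illustration, and a plain HashSet stands in for whatever equality-based bookkeeping IndexWriter does on segmentInfos.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical stand-in for SegmentInfo; NOT the real Lucene class.
final class SegmentKey {
    // Assumed default for segments that were not imported as duplicates.
    static final int DEFAULT_IMPORT_ID = 0;

    final String name;   // segment name, e.g. "_0"
    final String dir;    // identifies the source directory
    final int importId;  // the proposed extra member

    SegmentKey(String name, String dir, int importId) {
        this.name = name;
        this.dir = dir;
        this.importId = importId;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof SegmentKey)) return false;
        SegmentKey other = (SegmentKey) o;
        // importId takes part in equality, so duplicates can be told apart.
        return importId == other.importId
                && name.equals(other.name)
                && dir.equals(other.dir);
    }

    @Override
    public int hashCode() {
        return Objects.hash(name, dir, importId);
    }
}

class SegmentKeyDemo {
    // Adds the same (name, dir) pair 'copies' times and returns how many
    // entries survive in an equality-based collection.
    static int distinctAfterImport(int copies, boolean disambiguate) {
        Set<SegmentKey> infos = new HashSet<>();
        for (int i = 0; i < copies; i++) {
            int id = disambiguate ? i + 1 : SegmentKey.DEFAULT_IMPORT_ID;
            infos.add(new SegmentKey("_0", "/idx/source", id));
        }
        return infos.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctAfterImport(200, false)); // prints 1
        System.out.println(distinctAfterImport(200, true));  // prints 200
    }
}
```

With the default value every copy compares equal and collapses into one entry; with unique values all 200 copies stay distinct. The sketch says nothing about segment file naming or about how MergePolicy would treat such segments.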