Hi Mike,

Responses inline:

On Thu, Apr 5, 2012 at 11:36 AM, Michael McCandless <luc...@mikemccandless.com> wrote:

> I'm assuming this is a "build once and never change" index...? Else,
> it sounds like you should never run forceMerge...

Correct. The forceMerge was merely there to preserve the previous 2.3 behavior of using optimize.

> To preserve insertion order you just need to use one of the
> Log*MergePolicy (which you are already doing). Merge factor doesn't
> affect this...

I was never sure why the merge factor was set to 2. My experience in the past was to set a high merge factor when doing a batch index.

> For the fastest way to get to a single-segment index.... use
> NoMergePolicy while indexing the documents, and set the largest RAM
> buffer you can afford. This will create tons of segments in the index
> dir, which is fine as long as you will not open a reader on it...
> then:
>
> Open a new IW, with Log*MergePolicy, set a highish (maybe 30)
> mergeFactor, and call forceMerge(1). You may need to cutover to
> SerialMergeScheduler...

NoMergePolicy? I have never seen that class used before. RAM buffer size is not an issue. Is the limitation still 2048MB? Is the fastest way also the best way? :) There will never be a reader open on the index.

Your second solution is similar to the existing code, with the exception of the mergeFactor. Will setting the merge factor to a more reasonable number help with the merge speed? What enforces the preservation of the insertion order? The MergePolicy? How does the MergeScheduler affect things?

I have used Lucene on a few projects over the years and never had to tweak the index creation. I guess I need to reread the tuning chapter in LIA; it's been a few years.

Cheers,
Ivan
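
To make the merge factor question a bit more concrete, here is a rough back-of-the-envelope model in plain Java. It is not Lucene code and does not reproduce how Log*MergePolicy actually selects merges. It assumes equal-sized flushed segments and merges up to mergeFactor adjacent segments per step until one segment remains, counting how many segment "units" get rewritten along the way. The class and method names (MergeFactorModel, unitsCopied) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of leveled merging. NOT Lucene's implementation.
class MergeFactorModel {

    /**
     * Starts with `segments` segments of size 1 and repeatedly merges groups of
     * up to `mergeFactor` ADJACENT segments until a single segment is left.
     * Merging only neighbours keeps the original (insertion) order intact.
     * Returns the total number of units rewritten across all merges.
     */
    static long unitsCopied(int segments, int mergeFactor) {
        if (segments < 1 || mergeFactor < 2) {
            throw new IllegalArgumentException("need segments >= 1 and mergeFactor >= 2");
        }
        List<Long> level = new ArrayList<>();
        for (int i = 0; i < segments; i++) {
            level.add(1L);
        }
        long copied = 0;
        while (level.size() > 1) {
            List<Long> next = new ArrayList<>();
            for (int i = 0; i < level.size(); i += mergeFactor) {
                int end = Math.min(i + mergeFactor, level.size());
                long size = 0;
                for (int j = i; j < end; j++) {
                    size += level.get(j);
                }
                // A group of one is carried to the next pass without being rewritten.
                if (end - i > 1) {
                    copied += size;
                }
                next.add(size);
            }
            level = next;
        }
        return copied;
    }

    public static void main(String[] args) {
        for (int mergeFactor : new int[] {2, 10, 30}) {
            System.out.println("mergeFactor=" + mergeFactor
                    + " unitsCopied=" + unitsCopied(1024, mergeFactor));
        }
    }
}
```

Under these assumptions, 1024 flushed segments cost 10240 units of rewriting at mergeFactor 2 but only 3072 at mergeFactor 30, since each document is rewritten once per pass and a higher factor needs fewer passes. Real merge time also depends on I/O, segment sizes, and the scheduler, so treat this as intuition rather than a prediction.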