Hello!

Here is a puzzling experiment:

I build an index of about 1.2 million documents using SOLR 3.1. The index has
a large number of dynamic fields (about 15,000). Each document has about 100
fields.

I add the documents in batches of 20, and every 50,000 documents I optimize
the index.
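
To make the procedure concrete, here is a rough sketch of the indexing loop. It is only an illustration: `add_batch` and `optimize` are hypothetical callbacks standing in for whatever client calls actually send documents to Solr and trigger the optimize, and the function name is made up for this example.

```python
def index_in_batches(docs, add_batch, optimize,
                     batch_size=20, optimize_every=50_000):
    """Send docs in fixed-size batches and optimize at regular intervals.

    add_batch and optimize are hypothetical callbacks (assumptions for this
    sketch), not functions from any real Solr client library.
    """
    batch = []
    total = 0
    last_optimized = 0
    for doc in docs:
        batch.append(doc)
        if len(batch) == batch_size:
            add_batch(batch)
            total += len(batch)
            batch = []
            # Optimize once another optimize_every docs have been added.
            if total - last_optimized >= optimize_every:
                optimize()
                last_optimized = total
    if batch:
        # Flush the final partial batch, if any.
        add_batch(batch)
        total += len(batch)
    return total
```

With 1.2 million documents this comes to 60,000 batches and 24 optimizes, of which the first 10 are the fast ones.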

The first 10 optimizes (up to exactly 500k documents) take less than a
minute and a half.

But the 11th and all subsequent optimizes take more than 10 minutes. The
logs (in the INFOSTREAM.txt file) look identical, but what used to be

   Jun 19, 2011 4:03:59 AM IW 13 [Sun Jun 19 04:03:59 EDT 2011; Lucene Merge Thread #0]: merge: total 500000 docs

   Jun 19, 2011 4:04:37 AM IW 13 [Sun Jun 19 04:04:37 EDT 2011; Lucene Merge Thread #0]: merge store matchedCount=2 vs 2


now eats a lot of time:


   Jun 19, 2011 4:37:06 AM IW 14 [Sun Jun 19 04:37:06 EDT 2011; Lucene Merge Thread #0]: merge: total 550000 docs

   Jun 19, 2011 4:46:42 AM IW 14 [Sun Jun 19 04:46:42 EDT 2011; Lucene Merge Thread #0]: merge store matchedCount=2 vs 2


What could be happening between those two lines that takes 10 minutes at
full CPU, when the same step took well under a minute with 50k fewer docs?
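
For reference, the gap between the two log lines in each pair can be computed from the timestamps quoted above. This is just a small sketch; the helper name `elapsed_seconds` is made up for illustration, and the format string assumes an English locale for the month abbreviation.

```python
from datetime import datetime

# Timestamp format used at the start of each infostream line quoted above.
LOG_TIME_FORMAT = "%b %d, %Y %I:%M:%S %p"

def elapsed_seconds(start, end):
    """Return the number of whole seconds between two log timestamps."""
    t0 = datetime.strptime(start, LOG_TIME_FORMAT)
    t1 = datetime.strptime(end, LOG_TIME_FORMAT)
    return int((t1 - t0).total_seconds())

# Merge at 500,000 docs (fast) versus merge at 550,000 docs (slow).
fast = elapsed_seconds("Jun 19, 2011 4:03:59 AM", "Jun 19, 2011 4:04:37 AM")
slow = elapsed_seconds("Jun 19, 2011 4:37:06 AM", "Jun 19, 2011 4:46:42 AM")
print(fast, slow)  # 38 576
```

So the same step goes from 38 seconds to 9 minutes 36 seconds, roughly 15 times longer, for a 10% increase in document count.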


Thanks in advance,

Santiago
