Re: maxMergeDocs and performance tuning

2010-08-17 Thread Andrew Clegg

Okay, thanks Marc. I don't really have any complaints about performance
(yet!) but I'm still wondering how the mechanics work, e.g. when you have a
number of segments equal to mergeFactor, and each contains maxMergeDocs
documents.

The docs are a bit fuzzy on this...
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/maxMergeDocs-and-performance-tuning-tp1162695p1183064.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: maxMergeDocs and performance tuning

2010-08-16 Thread Marc Sturlese

As far as I know, the higher you set mergeFactor, the faster the indexing
process will be (because more things are kept in memory). But depending on
your needs, it may not be the best option. If you set a high
mergeFactor and you want to optimize the index once indexing is done,
the optimization will take longer than it would with a very low
mergeFactor.
This is because the optimization process compacts many segment files. If
mergeFactor is lower, there will be fewer files, so optimize will be faster.
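
The trade-off described above can be sketched with a toy model. This is not
Lucene's merge code: it assumes every flush writes one equally sized segment
and that a merge fires as soon as mergeFactor segments share a level. The
names simulate, num_flushes and merge_factor are made up for illustration.

```python
def simulate(num_flushes, merge_factor):
    """Toy model of level-based merging (assumes merge_factor >= 2).

    Returns (segments_left, merges_done) after num_flushes flushes.
    """
    levels = []  # levels[i] = number of segments currently at level i
    merges = 0
    for _ in range(num_flushes):
        if not levels:
            levels.append(0)
        levels[0] += 1  # each flush adds one small level-0 segment
        i = 0
        # When a level fills up, its segments merge into one segment
        # at the next level, which may in turn fill that level.
        while levels[i] == merge_factor:
            levels[i] = 0
            if i + 1 == len(levels):
                levels.append(0)
            levels[i + 1] += 1
            merges += 1
            i += 1
    return sum(levels), merges


for mf in (2, 10, 25):
    segments_left, merges_done = simulate(1000, mf)
    print(f"mergeFactor={mf}: {segments_left} segments left, "
          f"{merges_done} merges during indexing")
```

In this model a higher mergeFactor means fewer merges while indexing but
more segments left over for a final optimize to compact.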


maxMergeDocs and performance tuning

2010-08-15 Thread Andrew Clegg

Hi,

I'm a little confused about how the tuning params in solrconfig.xml actually
work.

My index currently has mergeFactor=25 and maxMergeDocs=2147483647.

So this means that up to 25 segments can be created before a merge happens,
and each segment can have up to 2bn docs in, right?

But this page:

http://www.ibm.com/developerworks/java/library/j-solr2/

says "Smaller values [of maxMergeDocs] (< 10,000) are best for applications
with a large number of updates." Our system does indeed have frequent
updates.

But if we set maxMergeDocs=1, what happens when we reach 25 segments
with 1 docs in each? Is the mergeFactor just ignored, so we start a new
segment anyway?
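
One way to reason about the question above is a small simulation. It
assumes the behaviour given in the LogMergePolicy javadoc, where
maxMergeDocs is the largest segment (by document count) that may still be
merged with other segments. Everything else is a simplification: the
function index and its parameters are hypothetical, flushes are all the
same size, and only equally sized neighbouring segments are merged.

```python
def index(num_flushes, docs_per_flush, merge_factor, max_merge_docs):
    """Toy model, not Lucene's implementation (assumes merge_factor >= 2).

    Returns the doc counts of the segments left, oldest first.
    """
    segments = []
    for _ in range(num_flushes):
        segments.append(docs_per_flush)
        while True:
            tail = segments[-merge_factor:]
            mergeable = (
                len(tail) == merge_factor      # enough segments to merge
                and len(set(tail)) == 1        # all at the same size level
                and tail[0] < max_merge_docs   # none has reached the cap
            )
            if not mergeable:
                break
            # Replace the newest merge_factor segments with one merged segment.
            del segments[-merge_factor:]
            segments.append(sum(tail))
    return segments


print(index(27, 10, 3, 2147483647))  # effectively no cap
print(index(27, 10, 3, 50))          # cap reached after two merge rounds
print(index(27, 10, 3, 10))          # cap equal to the flush size
```

Under these assumptions mergeFactor is not ignored. Segments that have
reached the cap are simply left alone, new small segments keep being
written and merged among themselves, and the total segment count can grow
past mergeFactor.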

More generally, what would be reasonable params for a large index consisting
of many small docs, updated frequently?

I think a few different use-case examples like this would be a great
addition to the wiki.

Thanks!

Andrew.
