On 31-Aug-07, at 7:13 AM, Yonik Seeley wrote:

On 8/30/07, Chris Hostetter <[EMAIL PROTECTED]> wrote:
example/solr/conf/solrconfig.xml:

<maxBufferedDocs>1000</maxBufferedDocs>

Does anyone else think this might be a tad high? Lucene ships with MBD==10.

A lot of the settings in the original example config/schema came from one particular index we had at CNET ... I think it would make sense to change almost any setting that has a hardcoded default in code to match the hardcoded default.

I don't think Solr should necessarily use the same defaults as Lucene.
An MBD of 10 performed much worse for the average Solr collection.
In the next release, I think the default should be to flush by memory
(prob at 32MB level) since it will give good performance at reasonable
memory usage regardless of document size.

I agree that flush by memory is the best option, but I wasn't sure whether we'd end up doing a release before Lucene 2.3.

In general I am okay with the Solr defaults being different from the Lucene defaults (I'd expect library code to be more conservative). 1000, though, is really big for decent-sized docs (like web pages with lots of metadata fields), especially since things like token filters for the buffered docs get kept around until they are flushed (WhitespaceTokenizer, for instance, allocates 1.2kB of buffers per instance). 100, perhaps (assuming it isn't moot)?

-Mike
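
For reference, the two flushing options discussed in this thread might look roughly like this in example/solr/conf/solrconfig.xml. This is only a sketch: the value 100 is the one floated above, and the <ramBufferSizeMB> element is an assumption that applies only to a Solr release built on Lucene 2.3 or later, which is not what the thread's current config uses.

```xml
<!-- Flush by document count: the value proposed in this thread
     in place of the current 1000. -->
<maxBufferedDocs>100</maxBufferedDocs>

<!-- Flush by memory instead (assumes a Solr release built on
     Lucene 2.3 or later; element name is not taken from this thread).
     32 is the level suggested above. -->
<ramBufferSizeMB>32</ramBufferSizeMB>
```

Flushing by memory keeps the buffer footprint roughly constant whether documents are small or large, which is the reason given above for preferring it over a fixed document count.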
