On 31-Aug-07, at 7:13 AM, Yonik Seeley wrote:
On 8/30/07, Chris Hostetter <[EMAIL PROTECTED]> wrote:
example/solr/conf/solrconfig.xml:
<maxBufferedDocs>1000</maxBufferedDocs>
Does anyone else think this might be a tad high? Lucene ships
with MBD==10.
A lot of the settings in the original example config/schema came from one
particular index we had at CNET ... I think it would make sense to change
almost any setting that has a hardcoded default in code to match that
hardcoded default.
I don't think Solr should necessarily use the same defaults as Lucene.
An MBD of 10 performed much worse for the average Solr collection.
In the next release, I think the default should be to flush by memory
(prob at 32MB level) since it will give good performance at reasonable
memory usage regardless of document size.
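
As a rough sketch of what flushing by memory might look like in
example/solr/conf/solrconfig.xml, assuming a Solr build on top of a Lucene
version that supports flushing by RAM usage and that exposes a
ramBufferSizeMB element (the element name and its placement under
indexDefaults are assumptions here, not something settled in this thread):

```xml
<indexDefaults>
  <!-- Assumed setting: flush buffered documents once they use about
       32 MB of RAM, instead of after a fixed number of documents. -->
  <ramBufferSizeMB>32</ramBufferSizeMB>

  <!-- The count-based trigger discussed above, left commented out so
       that only the memory-based trigger applies. -->
  <!-- <maxBufferedDocs>1000</maxBufferedDocs> -->
</indexDefaults>
```

The point of a memory-based trigger is that the flush threshold tracks
actual RAM use, so small and large documents behave about the same.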
I agree that flush by mem is the best option, but I wasn't sure if
we'd end up doing a release before Lucene 2.3 or not.
In general I am okay with the Solr defaults being different from
Lucene defaults (I'd expect library code to be more conservative).
1000, though, is really big for decent-sized docs (like web pages
with lots of metadata fields). Especially since things like token
filters for the buffered docs get kept around until they are flushed
(WhitespaceTokenizer, for instance, allocates 1.2kB of buffers per
instance). 100, perhaps (assuming it isn't moot)?
-Mike