he
fastest possible indexing time. For case 3, I'd like minimal pauses and
no noticeable degradation in search performance.
Based on reading the code (including the javadoc comments), I'm
thinking of values along these lines:
mergeFactor: 1000 during Full indexing, and during optimize (
We've found something interesting about mergeFactors.
We are indexing a million documents with a batch of 1000.
We first set the mergeFactor to 1000.
What we found is at every 10th commit, we see a significant spike in
indexing time.
The reason is that the indexer is trying to merg
In my experiments with mergeFactor I found the point of diminishing/no
returns. If I remember correctly, I hit the limit at mergeFactor of
50.
But here is something from Lucene in Action that you can use to play
with various index tuning factors and see their effect on indexing
performance
I'm wondering what values of mergeFactor, minMergeDocs and maxMergeDocs
people have found to yield the best performance for different
configurations. Is there a repository of this information anywhere?
I've got about 30k documents and have 3 indexing scenarios:
1. Full in
How do you set mergeFactor on an IndexWriter object? I tried the way it
was mentioned in this article
(http://www.onjava.com/pub/a/onjava/2003/03/05/lucene.html):
writer.mergeFactor = 1000;
This did not work for me. I then tried setting the
org.apache.lucene.mergeFactor property, and that worked for me.
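The system-property route mentioned in this message can be sketched with plain JDK calls. This is only an illustration: the property name is the one quoted above, the class name and the fallback of 10 are made up for the example, and whether a particular Lucene release reads the property (and at what point it does so) is an assumption to check against that release's source.

```java
public class MergeFactorProperty {
    // Property name as quoted in the message above.
    static final String KEY = "org.apache.lucene.mergeFactor";

    // Reads the property the way a library might, using a fallback
    // when the property is unset. The fallback value is hypothetical.
    static int configuredMergeFactor(int fallback) {
        return Integer.getInteger(KEY, fallback);
    }

    public static void main(String[] args) {
        // Equivalent to passing -Dorg.apache.lucene.mergeFactor=1000 on the
        // command line. Presumably this has to happen before the writer
        // is created.
        System.setProperty(KEY, "1000");
        System.out.println("mergeFactor = " + configuredMergeFactor(10));
    }
}
```

Setting the value on the command line with -D avoids any ordering question inside the program.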
Can somebody explain the difference between the parameters minMergeDocs
and mergeFactor in IndexWriter? When I read the documentation, it looks
like both of them represent the number of documents held in the buffer
before they are merged into a new segment.
Thanks in advance,
Ravi
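One way to see the difference between the two parameters is a toy model. The sketch below is not Lucene code. It assumes a simplified policy in which minMergeDocs is the number of documents buffered before a segment is written, and mergeFactor is the number of equal-sized segments that triggers a merge into one larger segment. All class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class MergeModel {
    // Returns the segment sizes (in documents) left after adding numDocs
    // documents under the simplified policy described above.
    static List<Integer> simulate(int numDocs, int minMergeDocs, int mergeFactor) {
        List<Integer> segments = new ArrayList<>();
        int buffered = 0;
        for (int i = 0; i < numDocs; i++) {
            buffered++;
            if (buffered == minMergeDocs) {      // buffer full: write a segment
                segments.add(buffered);
                buffered = 0;
                mergeTail(segments, mergeFactor);
            }
        }
        if (buffered > 0) {
            segments.add(buffered);              // final flush when the writer closes
        }
        return segments;
    }

    // While the last mergeFactor segments have equal size, replace them
    // with a single segment holding all of their documents.
    static void mergeTail(List<Integer> segments, int mergeFactor) {
        while (segments.size() >= mergeFactor) {
            int n = segments.size();
            int size = segments.get(n - 1);
            for (int j = n - mergeFactor; j < n; j++) {
                if (segments.get(j) != size) {
                    return;                      // tail is not uniform: nothing to merge
                }
            }
            for (int j = 0; j < mergeFactor; j++) {
                segments.remove(segments.size() - 1);
            }
            segments.add(size * mergeFactor);
        }
    }

    public static void main(String[] args) {
        System.out.println(simulate(1000, 10, 10));
        System.out.println(simulate(1000, 10, 1000));
    }
}
```

In this model a larger minMergeDocs means fewer, larger initial segments (more memory, fewer writes), while a larger mergeFactor means merges happen less often and more segments accumulate in the meantime.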
ith writer.mergeFactor==4000, but I think it was a
> specially tweaked kernel on Linux which I don't have anymore.
>
> Since I don't really understand yet how open files, segments, memory
> use, indexing time and mergeFactor interact, I would appreciate a
> good
> guess how to co
, indexing time and mergeFactor interact, I would appreciate a good
guess how to combine these indexes.
Which mergeFactor to use?
Use a different strategy than the 3 lines shown above?
Thanks,
Harald.
--
Harald Kirsc
: mergeFactor and maxMergeDocs
Obsession with indexing performance is not healthy. Before changing any
settings convince yourself that indexing performance is a real problem
for your application. How often do you re-index from scratch? Are you
really having any difficulty keeping up with the
t too high may cause out of memory problems.
You may also see some indexing speedup by increasing the mergeFactor,
but raising it too high will cause file handle problems. Calling
setUseCompoundFile() will enable higher mergeFactor settings before
encountering file handle problems.
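A back-of-envelope estimate shows why the compound-file setting matters here. The sketch assumes that up to mergeFactor segments can exist before a merge and that each segment keeps all of its files open. The figure of 8 files per non-compound segment is an illustrative assumption, since the real count depends on the Lucene version and the number of indexed fields. The class and method names are hypothetical.

```java
public class OpenFileEstimate {
    // Rough upper bound: every live segment keeps all of its files open.
    static long estimate(int segments, int filesPerSegment) {
        return (long) segments * filesPerSegment;
    }

    public static void main(String[] args) {
        int mergeFactor = 1000;       // value discussed in these messages
        int filesPerSegment = 8;      // ASSUMPTION: non-compound segment
        System.out.println("multi-file: " + estimate(mergeFactor, filesPerSegment));
        System.out.println("compound:   " + estimate(mergeFactor, 1)); // one file per segment
    }
}
```

Under these assumptions the compound format cuts the estimate by the number of files per segment, which is why it allows a higher mergeFactor before the descriptor limit is reached.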
what effect and what recommendations are valid for Lucene 1.3?
Herb
Hello,
The Lucene javadoc defines mergeFactor and maxMergeDocs as follows:
int maxMergeDocs  Determines the largest number of documents ever merged by addDocument().
int mergeFactor   Determines how often segment indexes are merged by addDocument().
void optimize Merges all segments together into a
Increasing the mergeFactor on an IndexWriter can speed up the indexing
process tremendously.
However, you can quickly run out of file descriptors, and the indexing
application will throw
java.io.FileNotFoundException:.(Too many open files)
I know perfectly how to increase the number of