How many indexed fields do you have, overall, in the index?
If you have a very large number of fields that are "sparse" (meaning
any given document would only have a small subset of the fields), then
norms could explain what you are seeing.
Norms are not stored sparsely, so when segments get merged the "holes"
get filled in: every document gets a norm byte for every field that has
norms, whether or not the document contains that field, and those bytes
take up space on disk and in RAM. Turning off norms on sparse fields
would resolve it, but you must rebuild the entire index, because if
even a single doc in the index has norms enabled for a given field, it
"spreads" to every other doc when segments are merged.
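
To put rough numbers on this, here is a back-of-the-envelope sketch. It
is not Lucene code. It assumes norms cost one byte per document for each
field that has norms, and that "Xmx1600" means 1600 megabytes; the class
and method names are made up for illustration.

```java
// Back-of-the-envelope estimate, not Lucene code. Assumes norms cost
// one byte per document for every indexed field that has norms enabled.
public class NormsEstimate {

    // Bytes needed to hold norms for all documents across the given fields.
    public static long normsBytes(long maxDoc, int fieldsWithNorms) {
        return maxDoc * fieldsWithNorms;
    }

    // Largest number of fields whose norms fit in the heap, ignoring
    // everything else the JVM needs (so this is an optimistic upper bound).
    public static long maxFieldsThatFit(long heapBytes, long maxDoc) {
        return heapBytes / maxDoc;
    }

    public static void main(String[] args) {
        long maxDoc = 400000000L;             // document count reported by Luke
        long heapBytes = 1600L * 1024 * 1024; // Xmx1600, assumed to mean megabytes

        System.out.println("norms bytes, 1 field:   " + normsBytes(maxDoc, 1));
        System.out.println("norms bytes, 10 fields: " + normsBytes(maxDoc, 10));
        System.out.println("fields that fit in heap: " + maxFieldsThatFit(heapBytes, maxDoc));
    }
}
```

Under those assumptions each field with norms costs about 400 MB across
400,000,000 documents, so roughly four such fields would be enough to
use up the whole heap once their norms are loaded.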
Mike
Utan Bisaya wrote:
Recently we upgraded our Lucene index to version 2.3.1. The index had
to be rebuilt, which took several weeks and produced an index of about
20 GB in total.
After the rebuild, our weekly Sunday optimization task ran.
The optimization failed several times with OOM errors, but after a
couple of tries it completed.
So the entire index is now a single 20 GB segment.
The problem is that any subsequent search on that index fails with
OOM errors while Lucene is reading bytes.
Our environment:
JVM Xmx1600 (this is the max we could set on the box since it's Windows)
8G memory available on the box
4G CPU (8 core), but only 12.5% is used (not sure if this would
impact it)
Hard disk space available is 120GB
mergeFactor and the other Lucene settings are left at their defaults.
We checked this 20GB index using Luke and it has 400,000,000 documents
in it. Luke was able to count the docs; however, when we run a search
in Luke it fails with OOM errors.
We also ran the check index tool, and it fails on the 20GB segment
but succeeds on the others.
We've managed to roll back to a previous, unoptimized copy of the index
(about 20GB or so) and the searches are fine now. This unoptimized
index is made up of several 8GB segments and a few smaller segments.
However, there is a big possibility that the optimization error could
happen again...
Does anybody have insights on why this is happening?