From: John Nielsen [j...@mcb.dk]:
> The index is about 35GB on disk with each register between 15k and 30k.
> (This is simply the size of a full xml reply of one register. I'm not sure
> how to measure it otherwise.)

> Our memory requirements are running amok. We have less than a quarter of
> our customers running now and even though we have allocated 25GB to the JVM
> already, we are still seeing daily OOM crashes.

That does sound a bit peculiar. I do not understand what you mean by "register", 
though. How many documents does your index hold?

> I can see from the memory dumps we've done that the field cache is by far
> the biggest sinner.

Do you sort on a lot of different fields? Each field you sort on gets its own 
entry in the Lucene FieldCache, and every entry holds a value for every document 
in the index, so sorting on many distinct fields makes the field cache grow 
quickly (see the sketch below).
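
As a rough sketch of what that means in Lucene terms (2.x/3.x-era API; the 
class name is just mine for illustration): sorting on a String field makes 
Lucene fill a per-field array with one term per document, something like

import java.io.IOException;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.search.FieldCache;

public class FieldCachePeek {
    // Roughly what sorting on a String field loads into the FieldCache:
    // one String per document, kept around for the lifetime of the reader.
    static String[] sortTermsFor(IndexReader reader, String field) throws IOException {
        return FieldCache.DEFAULT.getStrings(reader, field);
    }
}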

> We do a lot of faceting. One client facets on about 50,000 docs of approx
> 30k each on 5 fields. I understand that this is VERY memory intensive.

To get a rough approximation of the memory usage, we need the total number of 
documents, the average number of values per document for each of the 5 fields, 
and the number of unique values in each of the 5 fields. The rule of thumb I 
use for a lower bound is

#documents*log2(#references) + #references*log2(#unique_values) bits

where #references for a field is the total number of value occurrences, i.e. 
#documents times the average number of values per document.
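
In case a concrete form helps, here is the same rule of thumb as a small Java 
sketch (the class and method names are just mine, nothing Solr-specific):

public class FacetMemoryEstimate {
    /** log2(x) computed via the natural logarithm. */
    static double log2(double x) {
        return Math.log(x) / Math.log(2);
    }

    /**
     * Lower-bound estimate, in bits, for faceting on a single field:
     * #documents*log2(#references) + #references*log2(#unique_values)
     */
    static double estimateBits(long documents, long references, long uniqueValues) {
        return documents * log2(references) + references * log2(uniqueValues);
    }
}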

If your whole index has 10M documents, each with 10 values for each field, and 
each field has 50M unique values, then the memory requirement would be more 
than 10M*log2(10*10M) + 10*10M*log2(50M) bits ~= 340MB/field ~= 1.6GB for 
faceting on all 5 fields. Even if we multiply that by 4 to get a more realistic 
memory requirement, it is far from the 25GB that you are allocating. Either one 
of the numbers in this equation is surprisingly high in your setup, or 
something else is off.
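
Plugging those numbers into the sketch above (again, names are mine):

public class EstimateExample {
    public static void main(String[] args) {
        long documents = 10000000L;        // 10M documents
        long references = 10 * documents;  // 10 values per document for one field
        long uniqueValues = 50000000L;     // 50M unique values in the field
        double bits = FacetMemoryEstimate.estimateBits(documents, references, uniqueValues);
        // ~337 MB/field; 5 fields times a real-world factor of 4 is still well below 25GB
        System.out.printf("~%.0f MB/field%n", bits / 8 / (1024 * 1024));
    }
}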

Regards,
Toke Eskildsen
