bq: The type of queries that are run can return anything from 1
million to 9.5 million documents, and typically run for anything from
20 to 45 minutes.

Uhhh, are you literally setting the &rows parameter to over 9.5M and
getting that many docs all at once? Or is that just numFound and
you're _really_ returning just a relative few docs? Because if
you're returning 9.5M rows, that's really an anti-pattern for Solr.
There are other ways to do some of this (cursorMark, streaming
aggregations, the /export handler). But before we go there I want to
be sure I'm understanding the use-case.
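For the record, cursorMark paging looks roughly like this (a minimal
sketch in Python, not from the thread; it assumes the uniqueKey field
is "id" and leaves the actual HTTP call to a caller-supplied function):

```python
def cursor_params(query, page_size, cursor="*"):
    """Build the request parameters for one cursorMark page.
    cursorMark requires a total ordering, so the sort must end on
    the uniqueKey field (assumed here to be "id")."""
    return {
        "q": query,
        "rows": page_size,
        "sort": "id asc",
        "cursorMark": cursor,
    }

def fetch_all(fetch_page, query, page_size=1000):
    """Drain a large result set page by page instead of rows=9500000.
    `fetch_page` performs the HTTP request and returns
    (docs, nextCursorMark); per the cursorMark contract, iteration is
    done when the returned cursor equals the one we sent."""
    cursor = "*"
    while True:
        docs, next_cursor = fetch_page(cursor_params(query, page_size, cursor))
        yield from docs
        if next_cursor == cursor:  # no more results
            return
        cursor = next_cursor
```

Each page is a cheap, bounded request, so the server never has to
materialize millions of documents for a single response.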

Because I agree with Toke, the performance numbers you give are waaaay
out of what I would expect, so clearly I don't get something about
your setup.

Best,
Erick

On Tue, Jun 30, 2015 at 3:43 AM, Toke Eskildsen <t...@statsbiblioteket.dk> 
wrote:
> On Tue, 2015-06-30 at 16:39 +1000, Caroline Hind wrote:
>> We have very recently upgraded from SOLR 4.1 to 5.2.1, and at the same
>> time increased the physical RAM from 24Gb to 96Gb. We run multiple
>> cores on this one server, approximately 20 in total, but primarily we
>> have one that is huge in comparison to all of the others. This very
>> large core consists of nearly 62 million documents, and the index is
>> around 45Gb in size. (Is that index unreasonably large, should it be
>> sharded?)
>
> The size itself sounds fine, but your performance numbers below are
> worrying. As always it is hard to give advice on setups:
> https://lucidworks.com/blog/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/
>
>> I'm really unfamiliar with how we should be configuring our JVM.
>> Currently we have it set to a maximum of 48Gb, up until yesterday it
>> was set to 24Gb and we've been seeing the dreaded OOME messages from
>> time to time.
>
> There is a shift in pointer size when one passes the 32GB mark for JVM
> memory. Your 48GB allocation gives you about the same amount of heap as
> a 32GB allocation would:
> https://blog.codecentric.de/en/2014/02/35gb-heap-less-32gb-java-jvm-memory-oddities/
> Consider running two Solrs on the same machine instead. Maybe one for
> the large collection and one for the rest?
>
> Anyway, OOMs with ~32GB of heap for 62M documents indicate that you are
> doing heavy sorting, grouping or faceting on fields that do not have
> DocValues enabled. Could you describe what you do in that regard?
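For reference, enabling DocValues is a schema change (a sketch against
a hypothetical schema.xml with made-up field names; the affected fields
must be fully reindexed afterwards):

```xml
<!-- Hypothetical field definitions: docValues="true" stores values in a
     column-oriented, disk-resident structure, so sorting, grouping and
     faceting no longer build large on-heap caches (a common OOM cause). -->
<field name="price"    type="tlong"  indexed="true" stored="true" docValues="true"/>
<field name="category" type="string" indexed="true" stored="true" docValues="true"/>
```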
>
>> The type of queries that are run can return anything from
>> 1 million to 9.5 million documents, and typically run for anything from
>> 20 to 45 minutes.
>
> Such response times are a thousand times higher than what most people
> are seeing. There might be a perfectly fine reason for those response
> times, but I suggest we sanity check them: Could you show us a typical
> query and tell us how many concurrent queries you normally serve?
>
> - Toke Eskildsen, State and University Library, Denmark
>
>
