The reason I mention sort is that my project ran into the FieldCache->OOM problem with sort requests half a year ago. We basically reject sort requests unless they hit fewer than X documents; when they do, we fetch the documents without sorting and sort them ourselves afterwards.
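To make the workaround concrete, here is a minimal sketch of the idea, not our actual code. The names `search_sorted`, `count_hits`, `fetch_unsorted` and the threshold value are hypothetical; the two callables stand in for whatever client code talks to Solr.

```python
from typing import Callable, Dict, List

Doc = Dict[str, object]

# Hypothetical threshold standing in for "X"; the real value is not stated.
MAX_SORTABLE_HITS = 10_000


def search_sorted(
    count_hits: Callable[[str], int],
    fetch_unsorted: Callable[[str, int], List[Doc]],
    query: str,
    sort_field: str,
    max_hits: int = MAX_SORTABLE_HITS,
) -> List[Doc]:
    """Reject sorts over large result sets; otherwise sort on the client.

    count_hits and fetch_unsorted are assumed wrappers around unsorted
    Solr requests, so Solr itself never has to sort anything.
    """
    num_found = count_hits(query)
    if num_found >= max_hits:
        # Too many hits: refuse rather than risk an OOM on the Solr node.
        raise ValueError(
            f"sort rejected: {num_found} hits, limit is {max_hits}"
        )
    # Few enough hits: fetch them unsorted and sort in the client.
    docs = fetch_unsorted(query, num_found)
    return sorted(docs, key=lambda doc: doc[sort_field])
```

The cost is moved to the client, which is only acceptable because the number of hits is bounded by the threshold.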

Our current problem is that we have to do a group/distinct query (in SQL terms). We have found that we can do what we want using either group (http://wiki.apache.org/solr/FieldCollapsing) or facet; either will work for us. The problem is that both use FieldCache, and we "know" that using FieldCache will lead to OOM exceptions with the amount of data each of our Solr nodes manages. This time we really have no option of just limiting usage as we did with sort. Therefore we need group/distinct functionality that works even on huge amounts of data (and an algorithm using FieldCache will not).

I believe setting facet.method=enum will actually make facet not use the FieldCache. Is that true? Is it a bad idea?
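For reference, a minimal sketch of the kind of request I have in mind. The host, core name and field name are hypothetical; the parameters themselves (`facet`, `facet.field`, `facet.method`, `facet.mincount`) are standard Solr faceting parameters. Whether this really keeps the FieldCache out of the picture is exactly what I am asking.

```python
from urllib.parse import urlencode

# Hypothetical Solr endpoint; adjust host and core name to your setup.
SOLR_SELECT_URL = "http://localhost:8983/solr/collection1/select"


def build_enum_facet_url(field, query="*:*", base_url=SOLR_SELECT_URL):
    """Build a select URL that facets on `field` using facet.method=enum."""
    params = [
        ("q", query),
        ("rows", "0"),              # only the facet counts are wanted
        ("facet", "true"),
        ("facet.field", field),
        ("facet.method", "enum"),   # enumerate terms instead of un-inverting
        ("facet.mincount", "1"),    # list only values that actually occur
    ]
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    # "customer_id" is a made-up field name for illustration.
    print(build_enum_facet_url("customer_id"))
```

With `rows=0` and `facet.mincount=1`, the facet values returned are in effect the distinct values of the field for the matching documents.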

I do not know much about DocValues, but I do not believe that using DocValues avoids FieldCache. Please elaborate, or point me to documentation showing that I am wrong. Thanks!

Regards, Per Steffensen

On 9/11/13 1:38 PM, Erick Erickson wrote:
I don't know any more than Michael, but I'd _love_ some reports from the
field.

There are some restrictions on DocValues though, I believe one of them
is that they don't really work on analyzed data....

FWIW,
Erick
