On Mon, Sep 21, 2015 at 8:09 AM, Uwe Reh <r...@hebis.uni-frankfurt.de> wrote:
> our bibliographic index (~20M entries) runs fine with Solr 4.10.3
> With Solr 5.3 faceted searching is constantly incredibly slow (~ 20 seconds)
[...]
>
> The 'fieldValueCache' seems to be unused (no inserts nor lookups) in Solr
> 5.3. In Solr 4.10 the 'fieldValueCache' is in heavy use with a
> cumulative_hitratio of 1.


Indeed.  Use of the fieldValueCache (UnInvertedField) was secretly
removed as part of LUCENE-5666, causing these performance regressions.

This code had evolved over years to be very fast for specific use
cases.  No one facet algorithm is going to be optimal for everyone, so
it's important we have multiple.  But use of the UnInvertedField was
removed without any notification or discussion whatsoever (and
obviously no benchmarking), and Solr devs only discovered later, in
SOLR-7190, that it had become essentially dead code.


When I brought back my "JSON Facet API" work to Solr (which was based
on 4.10.x) it came with a heavily modified version of UnInvertedField
that is available via the JSON Facet API.  It might currently work
better for your use case.

On your normal (non-docValues) index, you can try something like the
following to see what the performance would be:

$ curl http://yxz/solr/hebis/query -d 'q=darwin&
json.facet={
  authors : { type:terms, field:author_facet, limit:30 },
  material_access : { type:terms, field:material_access, limit:30 },
  material_brief : { type:terms, field:material_brief, limit:30 },
  rvk : { type:terms, field:rvk_facet, limit:30 },
  lang : { type:terms, field:language, limit:30 },
  dept : { type:terms, field:department_3, limit:30 }
}'
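
If you would rather script the test than paste the curl command, here
is a rough sketch in Python that builds the same request body.  The
field names are copied from the curl example above; the helper name
build_facet_params and the FACET_FIELDS mapping are made up for
illustration, and it emits strict JSON rather than the relaxed syntax
used above.

```python
import json
from urllib.parse import urlencode

# Facet label -> index field, copied from the curl example above.
FACET_FIELDS = {
    "authors": "author_facet",
    "material_access": "material_access",
    "material_brief": "material_brief",
    "rvk": "rvk_facet",
    "lang": "language",
    "dept": "department_3",
}


def build_facet_params(query, facet_fields, limit=30):
    """Return a form-encoded POST body with one terms facet per field.

    Hypothetical helper, not part of Solr or any client library.
    """
    facets = {
        label: {"type": "terms", "field": field, "limit": limit}
        for label, field in facet_fields.items()
    }
    # urlencode escapes the JSON so it travels as a single form parameter.
    return urlencode({"q": query, "json.facet": json.dumps(facets)})


body = build_facet_params("darwin", FACET_FIELDS)
```

POST the resulting body to the same /query handler as in the curl
example (for instance with urllib.request.urlopen(url,
data=body.encode())) and compare QTime in the response against what
you see with the old-style facet.field parameters.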

There were other changes in LUCENE-5666 that will probably slow down
faceting on the single valued fields as well (so this may still be a
little slower than 4.10.x), but hopefully it would be more
competitive.

-Yonik
