Re: Unable to identify why faceting is taking so much time

Toke Eskildsen Wed, 13 May 2015 03:44:54 -0700

On Wed, 2015-05-13 at 09:22 +0000, Abhishek Gupta wrote:

> Yes we have that many documents (exact count: 522664425), but I am not
> sure why that matters because what I understood from documentation is
> that fc will only work on the documents filtered by filter query and
> query.


What the documentation does not mention explicitly is the UnInversion
that takes place on the first call. If you search your Solr log for
"UnInverted", the "time" parameter of that entry tells you how many
milliseconds it took.

For example:
UnInverted multi-valued field {field=lsubject,memSize=216343445,
tindexSize=1037315,time=36620,phase1=35868,nTerms=4440544,bigTerms=1,
termInstances=55196823,uses=0}
took 36620 milliseconds to UnInvert 4.440.544 terms.
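
If you want to pull these numbers out of a log automatically, something
like the following might do. It is only a sketch: it assumes the entry
sits on a single line in the log (it is wrapped above for email) and that
the part in braces is a plain comma-separated key=value list, as in the
example. The function name parse_uninverted is made up for illustration.

```python
import re

def parse_uninverted(line):
    """Return the key=value pairs of an 'UnInverted' log entry as a dict.

    Assumes the braces contain a simple comma-separated key=value list,
    as in the sample entry; returns None if the line does not match.
    """
    match = re.search(r"UnInverted[^{]*\{([^}]*)\}", line)
    if match is None:
        return None
    fields = {}
    for pair in match.group(1).split(","):
        key, _, value = pair.strip().partition("=")
        if not key:
            continue
        # Numeric values become ints, everything else stays a string.
        fields[key] = int(value) if value.isdigit() else value
    return fields

sample = ("UnInverted multi-valued field {field=lsubject,memSize=216343445,"
          "tindexSize=1037315,time=36620,phase1=35868,nTerms=4440544,"
          "bigTerms=1,termInstances=55196823,uses=0}")
stats = parse_uninverted(sample)
print(stats["field"], stats["time"], stats["termInstances"])
```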

The number of references from documents to terms is 55.196.823. If we
assume you have approximately 1 reference/document, you will have half a
billion references or about 10 times my number. 10 times 37 seconds is
quite close to the 300 seconds you state below. Of course our numbers
cannot be compared directly, but it means that your measurements passed
the sanity check.
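
The back-of-the-envelope estimate above can be written out as below. The
assumptions are the same as in the text and are not measured: about 1
reference per document on your side, and UnInversion time scaling
linearly with the number of references.

```python
# Numbers from the UnInverted log entry quoted above.
my_term_instances = 55_196_823   # references from documents to terms
my_time_ms = 36_620              # UnInversion time in milliseconds

# Numbers for the large index.
your_docs = 522_664_425
refs_per_doc = 1.0               # ASSUMPTION: roughly one reference/document

# ASSUMPTION: time grows linearly with the number of references.
your_refs = your_docs * refs_per_doc
scale = your_refs / my_term_instances
estimated_ms = my_time_ms * scale

print(f"scale factor: {scale:.1f}")
print(f"estimated UnInversion time: {estimated_ms / 1000:.0f} seconds")
```

The result lands in the same range as the roughly 300 seconds reported,
which is all a sanity check like this can show.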

> For my query there are only 137 documents for fc to work on and to
> make FieldCache.

The mapping structure from your 522.664.425 documents to the values in
your field (also in the higher millions, as I understand it) is
independent of your search result.

After the structure has been created, it is used to look up the terms
used by your 137 hits.
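
A toy illustration of why the size of the search result does not matter
for the first call: the structure is built over every document in the
index, and only afterwards is it consulted for the hits. This is a
conceptual sketch with made-up data, not Solr's actual implementation,
which is far more compact.

```python
from collections import Counter

def uninvert(inverted_index, num_docs):
    """Turn term -> doc ids into doc id -> terms. Touches ALL documents."""
    doc_terms = [[] for _ in range(num_docs)]
    for term, doc_ids in inverted_index.items():
        for doc_id in doc_ids:
            doc_terms[doc_id].append(term)
    return doc_terms

def facet_counts(doc_terms, hits):
    """Count terms for the hits only. Cheap once doc_terms exists."""
    counts = Counter()
    for doc_id in hits:
        counts.update(doc_terms[doc_id])
    return counts

# Hypothetical 4-document index with three terms in one field.
inverted_index = {"history": [0, 2, 3], "physics": [1, 3], "art": [2]}

doc_terms = uninvert(inverted_index, num_docs=4)  # cost depends on index size
counts = facet_counts(doc_terms, hits=[2, 3])     # cost depends on hit count
print(counts)
```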

> Also subsequent calls are not fast:
> First call time: 297572
> Second call time (made within 2 sec): 249287

Are you indexing while searching? Each time the index is changed, the
UnInversion will have to be re-done. facet.method=fcs seems a better
choice with an often-changing index of your size.
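
A request using that method could be put together as sketched below. The
host, core name and field name are placeholders, not taken from your
setup. Note that, as far as I know, fcs is intended for single-valued
string fields, so check that it fits the field in question.

```python
from urllib.parse import urlencode

# Placeholder endpoint and field: adjust to the actual installation.
base_url = "http://localhost:8983/solr/collection1/select"
params = {
    "q": "*:*",
    "rows": 0,
    "facet": "true",
    "facet.field": "my_field",   # hypothetical field name
    "facet.method": "fcs",       # per-segment field cache faceting
}

url = base_url + "?" + urlencode(params)
print(url)
```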
- Toke Eskildsen, State and University Library, Denmark
