On Thu, Oct 20, 2016 at 8:45 AM, Bastien Latard | MDPI AG
<lat...@mdpi.com.invalid> wrote:
> Hi Yonik,
>
> Thanks for your answer!
> I'm not quite sure I understood everything... please see my comments below.
>
>
>> On Wed, Oct 19, 2016 at 6:23 AM, Bastien Latard | MDPI AG
>> <lat...@mdpi.com.invalid> wrote:
>>>
>>> I just had a question about facets.
>>> *==> Is the facet run on all documents (to pre-process/cache the data) or
>>> only on returned documents?*
>>
>> Yes ;-)
>>
>> There are sometimes per-field data structures that are cached to
>> support faceting.  This can make the first facet request after a new
>> searcher take longer.  Unless you're using docValues, then the cost is
>> much less.
>
> So how do I force it to use docValues? Simply:
> <field name="my_field" type="string" indexed="false" stored="false"
> docValues="true" />
> Are there other advantages or drawbacks?

You probably still want the field indexed as well... that supports
fast filtering by specific values (fq=my_field:value1)
without having to do a complete column scan.
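
As a rough illustration of the kind of request this helps, here is a
small Python sketch that builds the query string for a filtered facet
request. The parameters fq, facet and facet.field are standard Solr
request parameters; the helper name build_facet_query, the q=*:* match-all
query and rows=0 are assumptions made for the example only.

```python
from urllib.parse import urlencode


def build_facet_query(field, value):
    """Build a query string that filters on one value and facets on the field.

    Hypothetical helper for illustration; it only assembles parameters and
    does not contact a Solr server.
    """
    params = [
        ("q", "*:*"),                 # assumption: match all documents
        ("fq", f"{field}:{value}"),   # filter on a specific value
        ("facet", "true"),
        ("facet.field", field),
        ("rows", "0"),                # assumption: only facet counts are wanted
    ]
    return urlencode(params)


query_string = build_facet_query("my_field", "value1")
print(query_string)
```

The resulting string can be appended to a collection's /select URL. The
fq clause is the part that benefits from indexed="true" on the field.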

>> Then there are per-request data structures (like a count array) that
>> are O(field_cardinality) and not O(matching_docs).
>> But then for default field-cache faceting, the actual counting part is
>> O(matching_docs).
>> So yes, at the end of the day we only facet on the matching
>> documents... but what the total field looks like certainly matters.
>
> This would only be the case if I used docValues, right?

If docValues aren't indexed, then they (or something like them) are
built in memory before they are used.
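
A toy Python model of the idea, assuming a tiny in-memory inverted index
(this is a sketch for illustration, not Solr's actual implementation; the
names uninvert and facet_counts are made up). The first function builds a
per-document column from the inverted index, which is roughly what has to
happen in memory when no docValues exist. The second allocates a count
array sized by the number of distinct terms and then visits only the
matching documents.

```python
def uninvert(inverted_index, num_docs):
    """Turn term -> [doc ids] into a column: doc id -> [term ordinals]."""
    terms = sorted(inverted_index)
    ords_per_doc = [[] for _ in range(num_docs)]
    for ord_, term in enumerate(terms):
        for doc in inverted_index[term]:
            ords_per_doc[doc].append(ord_)
    return terms, ords_per_doc


def facet_counts(terms, ords_per_doc, matching_docs):
    """Count term occurrences over the matching documents only."""
    counts = [0] * len(terms)            # O(field_cardinality) per request
    for doc in matching_docs:            # O(matching_docs) counting loop
        for ord_ in ords_per_doc[doc]:
            counts[ord_] += 1
    return {terms[i]: c for i, c in enumerate(counts) if c > 0}


# Hypothetical multi-valued author field over four documents.
inverted = {"alice": [0, 2], "bob": [1], "carol": [2, 3]}
terms, column = uninvert(inverted, num_docs=4)
result = facet_counts(terms, column, matching_docs=[0, 2])
print(result)  # {'alice': 2, 'carol': 1}
```

With docValues enabled, the column already exists on disk, so the
uninvert step in this model is the part that gets skipped.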

-Yonik

> If I have a field declaration like this (a dedicated field for faceting,
> without stemming), what would be the best setting?
> <field name="author_facet" type="text_facet" indexed="true" stored="true"
> required="false" multiValued="true" />
>
> Kind regards,
> Bastien
>
