On Thu, Oct 20, 2016 at 8:45 AM, Bastien Latard | MDPI AG <lat...@mdpi.com.invalid> wrote:
> Hi Yonik,
>
> Thanks for your answer!
> I'm not quite sure I understood everything... please see my comments below.
>
>
>> On Wed, Oct 19, 2016 at 6:23 AM, Bastien Latard | MDPI AG
>> <lat...@mdpi.com.invalid> wrote:
>>>
>>> I just had a question about facets.
>>> *==> Is the facet run on all documents (to pre-process/cache the data) or
>>> only on returned documents?*
>>
>> Yes ;-)
>>
>> There are sometimes per-field data structures that are cached to
>> support faceting. This can make the first facet request after a new
>> searcher take longer. Unless you're using docValues, then the cost is
>> much less.
>
> So how to force it to use docValues? Simply:
> <field name="my_field" type="string" indexed="false" stored="false"
> docValues="true" />
> Are there other advantages/drawbacks?

You probably still want the field indexed as well... that supports
fast filtering by specific values (fq=my_field:value1) without having
to do a complete column scan.

>> Then there are per-request data structures (like a count array) that
>> are O(field_cardinality) and not O(matching_docs).
>> But then for default field-cache faceting, the actual counting part is
>> O(matching_docs).
>> So yes, at the end of the day we only facet on the matching
>> documents... but what the total field looks like certainly matters.
>
> This would only be the case if I used docValues, right?

If docValues aren't indexed, then they (or something like them) are
built in memory before they are used.

-Yonik

> If I have such a field declaration (a dedicated field for faceting, without
> stemming), what would be the best setting?
> <field name="author_facet" type="text_facet" indexed="true" stored="true"
> required="false" multiValued="true" />
>
> Kind regards,
> Bastien
>
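
To make the cost split above concrete, here is a minimal sketch of
field-cache style counting for a single-valued field. It is a simplified
model, not Solr's actual implementation: the names facet_counts,
doc_to_ord and ord_to_term are hypothetical, and the tiny column of
ordinals is made-up illustration data. The count array is sized by the
number of distinct terms, while the counting loop only visits the
matching documents.

```python
def facet_counts(doc_to_ord, ord_to_term, matching_docs):
    """Count facet values for the matching documents only.

    doc_to_ord:    per-field column, one term ordinal per document
                   (-1 means the document has no value); this stands in
                   for docValues or the in-memory structure built when
                   docValues are absent.
    ord_to_term:   list mapping each ordinal to its term.
    matching_docs: iterable of document ids that matched the query.
    """
    # Per-request count array: one slot per distinct term,
    # so its size is O(field_cardinality).
    counts = [0] * len(ord_to_term)

    # Counting loop: one step per matching document, O(matching_docs).
    for doc_id in matching_docs:
        term_ord = doc_to_ord[doc_id]
        if term_ord >= 0:
            counts[term_ord] += 1

    # Collecting the non-zero counts walks the whole array again,
    # which is why the total field cardinality still matters.
    return {ord_to_term[i]: c for i, c in enumerate(counts) if c > 0}


# Hypothetical six-document index with three distinct authors.
ord_to_term = ["alice", "bob", "carol"]
doc_to_ord = [0, 1, 1, -1, 2, 0]

# Only documents 1, 2 and 5 matched the query.
result = facet_counts(doc_to_ord, ord_to_term, matching_docs=[1, 2, 5])
print(result)
```

The per-field column (doc_to_ord here) is the part that is either read
from docValues or built in memory on first use; the count array and the
loop are per-request work.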