[
https://issues.apache.org/jira/browse/LUCENE-7927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless updated LUCENE-7927:
---------------------------------------
Attachment: LUCENE-7927.patch
Another iteration, also adding an option to count all facets from a
{{LongValuesSource}}.
I made a simple artificial benchmark
(https://github.com/mikemccand/luceneutil/blob/master/src/main/perf/NumericValueFacetBenchmark.java),
indexing 50M docs with a numeric DV field with values 0 - 9, to test whether
special casing small values (0-1023) is worthwhile:
Counting long values for all docs takes 99.0 msec (best of 100 iters), and
153.4 msec if I turn off the opto, so ~35% faster.
The overall gains are smaller if I first run an {{IntPoint.newRangeQuery}}
matching the first 50% of the index and compute facets on just those hits:
255.3 msec, versus 279.4 msec if I turn off the optimization, so ~9% faster.
But net/net I think we should keep the opto... I think it's a common use case
to count smallish ordinals.
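
To make the idea concrete, here is a minimal sketch of what special-casing small values could look like: values in 0-1023 are counted in a plain int array, and everything else falls back to a hash map. This is an illustration only, not the code in the attached patch; the class name {{SmallValueCounter}}, its methods, and the use of {{java.util.HashMap}} for the fallback are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical sketch of counting long values with a fast path for small
 * values. Not taken from LUCENE-7927.patch; names are illustrative only.
 */
public class SmallValueCounter {
  // Values in [0, DENSE_LIMIT) are counted in a plain array (the fast path).
  private static final int DENSE_LIMIT = 1024;

  private final int[] denseCounts = new int[DENSE_LIMIT];

  // Assumed fallback for negative or large values: a boxed hash map.
  private final Map<Long, Integer> sparseCounts = new HashMap<>();

  /** Records one occurrence of the given value. */
  public void increment(long value) {
    if (value >= 0 && value < DENSE_LIMIT) {
      // Fast path: a single array increment, no hashing or boxing.
      denseCounts[(int) value]++;
    } else {
      // Slow path: hash lookup plus boxing for every occurrence.
      sparseCounts.merge(value, 1, Integer::sum);
    }
  }

  /** Returns how many times the given value was recorded. */
  public int count(long value) {
    if (value >= 0 && value < DENSE_LIMIT) {
      return denseCounts[(int) value];
    }
    return sparseCounts.getOrDefault(value, 0);
  }
}
```

With a field like the benchmark's (values 0 - 9), every increment would take the array path. Turning the optimization off would correspond roughly to sending every value through the map instead, though the fallback structure in the actual patch may well differ.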
> Add facets impl to count unique numeric values
> ----------------------------------------------
>
> Key: LUCENE-7927
> URL: https://issues.apache.org/jira/browse/LUCENE-7927
> Project: Lucene - Core
> Issue Type: Improvement
> Components: modules/facet
> Reporter: Michael McCandless
> Assignee: Michael McCandless
> Fix For: 7.1
>
> Attachments: LUCENE-7927.patch, LUCENE-7927.patch, LUCENE-7927.patch
>
>
> The facets module has multiple facet methods for counting flat and
> hierarchical fields, and also a method for counting numeric ranges. I'd like
> to also add a method that counts unique numeric (long) values, designed to be
> used for fields that have only a few, typically low valued, numbers across
> the index e.g. a "review" rating from 1 to 5.