hll stands for HyperLogLog (https://en.wikipedia.org/wiki/HyperLogLog).
You will not get the exact distinct count, but a distinct count very close
to the real number. It is very fast and memory-efficient for large numbers
of distinct values.
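For intuition, here is a toy HyperLogLog sketch in Python. This is not Solr's implementation; the hash choice and constants are the standard textbook ones, and the precision `p` is illustrative:

```python
import hashlib
import math

class HyperLogLog:
    """Toy HyperLogLog: estimates the number of distinct items seen,
    using 2**p small registers instead of storing the items themselves."""

    def __init__(self, p=14):
        self.p = p                      # precision: more registers = less error
        self.m = 1 << p                 # number of registers
        self.registers = [0] * self.m

    def add(self, item):
        # 64-bit hash of the item (first 8 bytes of SHA-1, for illustration)
        h = int.from_bytes(hashlib.sha1(str(item).encode()).digest()[:8], "big")
        idx = h >> (64 - self.p)                # top p bits pick a register
        w = h & ((1 << (64 - self.p)) - 1)      # remaining 64-p bits
        # rank = 1-based position of the leftmost 1-bit in w
        rank = (64 - self.p) - w.bit_length() + 1
        self.registers[idx] = max(self.registers[idx], rank)

    def estimate(self):
        # harmonic mean of register values, scaled by the bias constant
        alpha = 0.7213 / (1 + 1.079 / self.m)
        raw = alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:       # small-range correction
            return self.m * math.log(self.m / zeros)
        return raw

hll = HyperLogLog()
for i in range(10000):
    hll.add(f"user-{i}")
    hll.add(f"user-{i}")    # re-adding a value does not change the estimate
print(round(hll.estimate()))  # close to 10000, not exact
```

The whole sketch is 2**14 bytes of registers regardless of how many items are added, which is why the approach scales to very high cardinalities.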
> On 10.03.2020 at 00:25, Nicolas Paris wrote:
>
>
Erick Erickson writes:
> Have you looked at the HyperLogLog stuff? Here’s at least a mention of
> it: https://lucene.apache.org/solr/guide/8_4/the-stats-component.html
I am used to hll in the context of counting distinct values -- cardinality.
I have to admit that section
https://lucene.apache.org/solr/guide/8_4/the-stats-component.html
Toke Eskildsen writes:
> JSON faceting allows you to skip the fine counting with the parameter
> refine:
I also tried the facet.refine parameter, but didn't notice any improvement.
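For context, in the JSON Facet API refinement is a per-facet option rather than a plain request parameter; a sketch of such a request body (field name `tags_ss` is hypothetical) with `refine` set explicitly:

```json
{
  "query": "*:*",
  "facet": {
    "tags": {
      "type": "terms",
      "field": "tags_ss",
      "limit": 100,
      "refine": false
    }
  }
}
```

With `refine` false (the default), shards are not asked to fine-count candidate buckets, trading accuracy of distributed counts for speed; `refine: true` adds the second pass.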
>> I am wondering how I could filter the documents to get approximate
>> facets ?
>
> Clunky idea: Introduce a
Have you looked at the HyperLogLog stuff? Here’s at least a mention of it:
https://lucene.apache.org/solr/guide/8_4/the-stats-component.html
Best,
Erick
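The stats-component route mentioned above uses the `cardinality` local parameter on `stats.field`, which is backed by HyperLogLog. A request might look like this (collection and field names are hypothetical; `-g` keeps curl from interpreting the braces):

```
curl -g 'http://localhost:8983/solr/mycollection/select?q=*:*&rows=0&stats=true&stats.field={!cardinality=true}tags_ss'
```

The `cardinality` parameter can also take a value between 0 and 1 as a hint to trade memory for accuracy; see the stats-component page linked above.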
> On Mar 9, 2020, at 02:39, Nicolas Paris wrote:
>
> Hello,
>
>
> Environment:
> - SolrCloud 8.4.1
> - 4 shards with Xmx = 120GB and SSD disks
On Mon, 2020-03-09 at 10:39 +0100, Nicolas Paris wrote:
> I want to provide terms facet on a string multivalue field.
> ...
> How to improve brute performances ?
It might help to have everything in a single shard, to avoid the
secondary fine count. But your index is rather large for single-shard
setups.
Hello,
Environment:
- SolrCloud 8.4.1
- 4 shards with Xmx = 120GB and SSD disks
- 50M documents / 40GB physical per shard
- mainly large text fields, plus one multivalued/docValues/indexed string
  field of 15 values per document
Goal:
I want to provide a terms facet on a multivalued string field.
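A plain field facet over such a field would look roughly like this (collection and field names are hypothetical):

```
curl 'http://localhost:8983/solr/mycollection/select?q=*:*&rows=0&facet=true&facet.field=tags_ss&facet.limit=100'
```

It is this kind of request whose distributed fine-counting pass the rest of the thread discusses speeding up or approximating.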