In case anyone is interested: weighted collection (collecting more terms
from shards that hold more documents) does not necessarily improve
accuracy when the documents are distributed by the default hash algorithm.
There is no such correlation.
On Tuesday, September 16, 2014 5:09:51 PM UTC-4, Yifan
It seems to be a common problem that the top N results returned from an
aggregation query are inaccurate due to an uneven distribution of matching
documents across shards, because ES collects the top N buckets from each
shard no matter how many hits are actually on that shard. It is very
Hi Yifan,
Nothing dynamic, but you can increase the number of terms collected on each
shard to improve accuracy [1]. You might also want to play with the
shard_min_doc_count value if you know certain shards have a low hit count
and are throwing off the aggregations [2].
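
As a rough illustration of the two settings mentioned above, here is a minimal sketch that builds a terms aggregation request body in Python. The field name "tags", the aggregation name "top_terms", the helper function, and the numeric values are all made up for the example; only "size", "shard_size", and "shard_min_doc_count" are options of the terms aggregation itself. Suitable values depend on your data, so treat these as placeholders.

```python
import json


def terms_agg_body(field, size, shard_size, shard_min_doc_count=None):
    """Build a search body whose terms aggregation over-collects per shard.

    Hypothetical helper: `field` and the aggregation name "top_terms"
    are placeholders chosen for this example.
    """
    terms = {
        "field": field,
        "size": size,              # buckets returned to the client
        "shard_size": shard_size,  # buckets each shard sends to the coordinating node
    }
    if shard_min_doc_count is not None:
        # Per-shard threshold: a shard only returns terms that appear in
        # at least this many of its own matching documents.
        terms["shard_min_doc_count"] = shard_min_doc_count
    # Top-level "size": 0 suppresses search hits so only aggregations come back.
    return {"size": 0, "aggs": {"top_terms": {"terms": terms}}}


body = terms_agg_body("tags", size=10, shard_size=100, shard_min_doc_count=2)
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of a search request. A shard_size larger than size makes each shard return more candidate buckets than the final result needs, which reduces the chance that a globally frequent term is dropped because it fell outside the top N on one shard, at the cost of more data sent to the coordinating node.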
[1]