My instinct says this is not a proper use of aggregations, but I want to 
check with people who have actually used them. We want to bucket on a very 
high cardinality field and return **ALL** buckets (no size limit). For example, 
imagine documents representing people and their parents:

person - parent
===========
john - cindy
james - cindy
tony - mark
tim - doug

I want to bucket by parent, so the result would be:

cindy
   - john
   - james
mark
   - tony
doug
   - tim
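
To make the desired grouping concrete, here is a minimal Python sketch of the same "group by parent" over the four sample rows. It only illustrates the output shape and says nothing about how Elasticsearch computes buckets internally; the names `rows` and `group_by_parent` are made up for this example.

```python
from collections import defaultdict

# Sample (person, parent) pairs copied from the table above.
rows = [
    ("john", "cindy"),
    ("james", "cindy"),
    ("tony", "mark"),
    ("tim", "doug"),
]

def group_by_parent(pairs):
    """Collect every person under their parent, keeping input order."""
    groups = defaultdict(list)
    for person, parent in pairs:
        groups[parent].append(person)
    return dict(groups)

grouped = group_by_parent(rows)
for parent, people in grouped.items():
    print(parent)
    for person in people:
        print("   - " + person)
```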

This is a high cardinality field, which already concerns me. I want all 
buckets (setting size to zero). So if I have 10,000 documents and 5,000 
distinct parents, I want all 5,000 parent buckets. Essentially 
I'm trying to display by parent (group by parent). Moreover, I want to sort 
by the parent's age (so imagine the parent has an age on it). Or maybe I want 
to sort by the average age of the people (children) in each bucket. With 
aggregations this seems possible:

bucket by parent, sort by average age of person, bucket by person (to get 
all people for a parent bucket), set size to zero.
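
As a rough sketch of what that request body might look like, here it is built as a plain Python dict and printed as JSON. The field names `parent`, `person`, and `age`, the aggregation names, and the helper `build_group_by_parent_request` are all assumptions for illustration. Treating `"size": 0` on a terms aggregation as "return all buckets" follows the description above and depends on the Elasticsearch version (to my knowledge later releases dropped it), so check it against your version's documentation.

```python
import json

def build_group_by_parent_request(parent_field="parent",
                                  person_field="person",
                                  age_field="age"):
    """Build a search body: terms on parent, ordered by average child age.

    All field and aggregation names here are hypothetical.
    """
    return {
        "aggs": {
            "by_parent": {
                "terms": {
                    "field": parent_field,
                    # Assumption: 0 means "all buckets" (version dependent).
                    "size": 0,
                    # Order parent buckets by the avg sub-aggregation below.
                    "order": {"avg_child_age": "asc"},
                },
                "aggs": {
                    # Average age of the people in each parent bucket.
                    "avg_child_age": {"avg": {"field": age_field}},
                    # Nested terms aggregation to list the people per parent.
                    "people": {
                        "terms": {"field": person_field, "size": 0},
                    },
                },
            }
        }
    }

body = build_group_by_parent_request()
print(json.dumps(body, indent=2))
```

Sorting by the parent's own age is not covered by this sketch, since it depends on where that age is stored.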

But it feels very wrong to me, both because of the potential performance 
issues around unlimited, high cardinality buckets and the sorting of those 
buckets, and because aggregation/bucketing wasn't designed for this.

Any input/feedback would be appreciated.
-T
