Adding a 'node reduce phase' to aggregations is something I'm very 
interested in and am also investigating for the project I'm currently 
working on.

"If you introduce an extra reduction phase (for multiple shards on the same 
node) you introduce further potential for inaccuracies in the final 
results."

This is true if you only reduce the top-k items per shard, but I was 
thinking of reducing the complete set of buckets locally. This takes a bit 
more CPU and memory, but my guess is that this is negligible compared to 
the work already being done by the aggregation framework. If you reduce the 
complete bucket sets on the node before sending them to the coordinator, it 
will actually increase the accuracy of the aggregations!
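
In terms of code the local reduce itself is conceptually simple. A minimal 
sketch of the idea, with plain maps standing in for the real 
InternalAggregation objects (nothing here is actual Elasticsearch API):

    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    // Toy model of a node-local reduce for a terms aggregation: instead of
    // trimming each shard's buckets to shard_size before sending, merge the
    // *complete* bucket sets of all shards on the node into one response.
    public class NodeLocalReduce {

        // Each shard result is a full term -> count map (no top-k trimming).
        public static Map<String, Long> reduceShards(List<Map<String, Long>> shardBuckets) {
            Map<String, Long> merged = new HashMap<>();
            for (Map<String, Long> shard : shardBuckets) {
                for (Map.Entry<String, Long> bucket : shard.entrySet()) {
                    merged.merge(bucket.getKey(), bucket.getValue(), Long::sum);
                }
            }
            // Only this merged map travels to the coordinating node: one
            // response per node instead of one per shard, and the counts are
            // exact because no bucket was dropped before the merge.
            return merged;
        }
    }

The expensive part is not this merge but keeping the complete bucket sets 
around long enough to do it, which is exactly the memory trade-off 
mentioned above.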

"how many of these sorts of use cases generate sufficiently large trees of 
results where a node-level merging would be beneficial"

It is primarily beneficial for bigger installations with lots of shards per 
machine, say 40 machines with ~100 shards each. In the current strategy 
every shard sends its own result, so the coordinating node receives 4000 
responses and a lot of bandwidth is spent on them, while it could do with 
40 responses (1 per machine).
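
To put rough numbers on it, assume (purely for illustration) ~50 KB per 
serialized shard response for a decently sized terms aggregation, and a 
merged per-node response in the same ballpark because the shards hold 
largely overlapping term sets:

    without node reduce: 40 machines x 100 shards = 4000 responses ~= 200 MB into the coordinator
    with node reduce:    40 machines x  1 response =   40 responses ~=   2 MB into the coordinator

So the fan-in traffic drops by roughly the shards-per-machine factor.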

I acknowledge it is a highly specialised use case that not very many 
people run into, but it is the case I'm currently working on.

"How hard would it to be to implement such a feature?"

I have been looking into this, and it is not trivial. It needs to be 
implemented in/around the SearchService, which is where I found the 
different search strategies (e.g. DFS query then fetch) to be implemented. 
Unlike the rest of Elasticsearch, that code does not seem to consist of 
pluggable modules, one per search strategy.
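
To illustrate the kind of seam I mean, something along these lines 
(entirely hypothetical, no such interface exists in the codebase today and 
all names are made up):

    // Hypothetical seam, with placeholder types: each search strategy
    // (query then fetch, DFS query then fetch, a future node-reduce variant)
    // becomes a module behind one interface instead of branches inside
    // SearchService.
    public interface SearchStrategy {

        // Placeholder stand-ins for the real internal request/result types.
        interface SearchRequest {}
        interface SearchResult {}

        // Stable name so a request can select the strategy, the way
        // search_type selects query_then_fetch vs dfs_query_then_fetch today.
        String name();

        // Run the distributed phases for one request.
        SearchResult execute(SearchRequest request);
    }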

Regarding the accuracy of top-k lists: I think the above, both the 'node 
reduce phase' and making the search strategy pluggable, will be the 
groundwork for implementing the TJA or TPUT strategies discussed in an old 
issue[1] about the accuracy of facets.

The order of steps to take before reaching the ultimate goal would be:
1) Make the search strategies (e.g. query then fetch, DFS query then fetch) 
more modular.
2) Make a search strategy with a 'node reduce phase' for the aggregations. 
Start with a complete reduce on the node; if that takes too much 
memory/time, TJA or TPUT can be used locally on the node to get a reliable 
top-k list.
3a) Make a search strategy that executes TJA on the cluster, coordinated by 
the coordinating node.
3b) Make a separate strategy that executes TPUT on the cluster, coordinated 
by the coordinating node.

I would say that 3a and 3b are 'easy' if doing a complete reduce in step 2 
turns out not to consume too many resources.
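
To make 3b a bit more concrete, below is a rough in-memory sketch of 
TPUT's three rounds, with the 'nodes' modelled as plain maps. It only 
illustrates the algorithm, not how it would hook into the transport layer; 
all names are made up, and it assumes non-negative scores (more on that 
below):

    import java.util.ArrayList;
    import java.util.Comparator;
    import java.util.HashMap;
    import java.util.HashSet;
    import java.util.List;
    import java.util.Map;
    import java.util.Set;

    // Toy, in-memory TPUT over n "nodes", each holding a complete
    // item -> score map. Assumes all scores are non-negative.
    public class ToyTput {

        public static List<Map.Entry<String, Double>> topK(
                List<Map<String, Double>> nodes, int k) {
            int n = nodes.size();

            // Phase 1: every node ships its local top-k; the coordinator
            // builds partial sums and takes tau = k-th highest partial sum.
            Map<String, Double> partial = new HashMap<>();
            for (Map<String, Double> node : nodes) {
                node.entrySet().stream()
                    .sorted(Map.Entry.<String, Double>comparingByValue().reversed())
                    .limit(k)
                    .forEach(e -> partial.merge(e.getKey(), e.getValue(), Double::sum));
            }
            double tau = kthLargest(partial.values(), k);

            // Phase 2: uniform threshold T = tau / n; every node ships all
            // items with local score >= T. Any unseen local score is then
            // < T, which bounds each candidate's true sum from above.
            double t = tau / n;
            Map<String, Double> partial2 = new HashMap<>();
            Map<String, Integer> seenAt = new HashMap<>();
            for (Map<String, Double> node : nodes) {
                for (Map.Entry<String, Double> e : node.entrySet()) {
                    if (e.getValue() >= t) {
                        partial2.merge(e.getKey(), e.getValue(), Double::sum);
                        seenAt.merge(e.getKey(), 1, Integer::sum);
                    }
                }
            }
            tau = Math.max(tau, kthLargest(partial2.values(), k));
            Set<String> candidates = new HashSet<>();
            for (Map.Entry<String, Double> e : partial2.entrySet()) {
                double upperBound = e.getValue() + t * (n - seenAt.get(e.getKey()));
                if (upperBound >= tau) {
                    candidates.add(e.getKey());
                }
            }

            // Phase 3: fetch exact sums for the surviving candidates only.
            Map<String, Double> exact = new HashMap<>();
            for (String item : candidates) {
                double sum = 0;
                for (Map<String, Double> node : nodes) {
                    sum += node.getOrDefault(item, 0.0);
                }
                exact.put(item, sum);
            }
            List<Map.Entry<String, Double>> result = new ArrayList<>(exact.entrySet());
            result.sort(Map.Entry.<String, Double>comparingByValue().reversed());
            return result.subList(0, Math.min(k, result.size()));
        }

        private static double kthLargest(Iterable<Double> values, int k) {
            List<Double> sorted = new ArrayList<>();
            values.forEach(sorted::add);
            sorted.sort(Comparator.reverseOrder());
            return sorted.size() >= k ? sorted.get(k - 1) : 0.0;
        }

        public static void main(String[] args) {
            Map<String, Double> node1 = Map.of("a", 10.0, "b", 4.0);
            Map<String, Double> node2 = Map.of("a", 2.0, "b", 4.0);
            System.out.println(topK(List.of(node1, node2), 1)); // prints [a=12.0]
        }
    }

The same skeleton would work for a node-local TPUT in step 2, with the 
shards on a node playing the role of the nodes.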

Adding strategies for both TJA and TPUT gives ultimate control to the 
user: TPUT has better latency than TJA, but it is not suited for reliably 
sorting on sums where the field might contain a negative value, because 
its threshold-based pruning assumes that every node's contribution is 
non-negative.
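
A toy illustration of that restriction, with k=1 and two nodes: item A 
scores +10 on node 1 and -10 on node 2 (true sum 0), item B scores +4 on 
both (true sum 8). Phase 1 yields partial sums A=10 and B=4, so the 
threshold is 10 and T = 10/2 = 5. In phase 2 neither node reports B 
(4 < 5), B's upper bound of 4 + 5 = 9 stays below the threshold, and B is 
pruned, even though its true sum of 8 beats A's true sum of 0. With 
non-negative values every unseen contribution is guaranteed to lie in 
[0, T), which is exactly what the pruning relies on.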

I would love to get an opinion from Adrien concerning the feasibility of 
such an approach.

-- Nils

[1] https://github.com/elasticsearch/elasticsearch/issues/1305

On Wednesday, January 14, 2015 at 7:47:07 PM UTC+1, Elliott Bradshaw wrote:
>
> How hard would it be to implement such a feature?  Even if there are 
> only a handful of use cases, it could prove very helpful in these.  
> Particularly since very large trees are the ones that will struggle the 
> most with bandwidth issues.
>
>
> On Wednesday, January 14, 2015 at 1:36:53 PM UTC-5, Mark Harwood wrote:
>>
>> Understood, but what about cases where size is set to unlimited?  
>>> Inaccuracies are not a concern in that case, correct?
>>>
>>
>> Correct. But if we only consider the scenarios where the key sets are 
>> complete and accuracy is not put at risk by merging (i.e. there is no "top 
>> N" type filtering in play), how many of these sorts of use cases generate 
>> sufficiently large trees of results where a node-level merging would be 
>> beneficial? 
>>  
>>
>>>
>>> On Wednesday, January 14, 2015 at 1:09:48 PM UTC-5, Mark Harwood wrote:
>>>>
>>>> If you introduce an extra reduction phase (for multiple shards on the 
>>>> same node) you introduce further potential for inaccuracies in the final 
>>>> results.
>>>> Consider the role of 'size' and 'shard_size' in the "terms" aggregation 
>>>> [1] and the effects they have on accuracy. You'd arguably need a 
>>>> 'node_size' setting to also control the size of this new intermediate 
>>>> collection. All stages that reduce the volumes of data processed can 
>>>> introduce an approximation with the potential for inaccuracies upstream 
>>>> when merging.
>>>>
>>>>
>>>> [1] 
>>>> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html#_shard_size
>>>>
>>>> On Wednesday, January 14, 2015 at 5:44:47 PM UTC, Elliott Bradshaw 
>>>> wrote:
>>>>>
>>>>> Adrien,
>>>>>
>>>>> I get the feeling that you're a pretty heavy contributor to the 
>>>>> aggregation module.  In your experience, would a shard per cpu core 
>>>>> strategy be an effective performance solution in a pure aggregation use 
>>>>> case?    If this could proportionally reduce the aggregation time, would 
>>>>> a 
>>>>> node local reduce (in which all shard aggregations on a given node are 
>>>>> reduced prior to being sent to the client node) be a good follow on 
>>>>> strategy for further enhancement?
>>>>>
>>>>> On Wednesday, January 14, 2015 at 10:56:03 AM UTC-5, Adrien Grand 
>>>>> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed, Jan 14, 2015 at 4:16 PM, Elliott Bradshaw <ebrad...@gmail.com
>>>>>> > wrote:
>>>>>>
>>>>>>> Just out of curiosity, are aggregations on multiple shards on a 
>>>>>>> single node executed serially or in parallel?  In my experience, it 
>>>>>>> appears 
>>>>>>> that they're executed serially (my CPU usage did not change when going 
>>>>>>> from 
>>>>>>> 1 shard to 2 shards per node, but I didn't test this extensively).  I'm 
>>>>>>> interested in maximizing the parallelism of an aggregation without 
>>>>>>> creating 
>>>>>>> a massive number of nodes.
>>>>>>>
>>>>>>>
>>>>>> Requests are processed serially per shard, but several shards can be 
>>>>>> processed at the same time. So if you have an index that consists of N 
>>>>>> primaries, this would run on N processors of your cluster in parallel.
>>>>>>
>>>>>>
>>>>>> -- 
>>>>>> Adrien Grand
>>>>>>  
>>>>>
