[jira] [Comment Edited] (SOLR-8281) Add RollupMergeStream to Streaming API

Dennis Gove (JIRA) Wed, 18 Nov 2015 18:32:38 -0800

    [ 
https://issues.apache.org/jira/browse/SOLR-8281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15012651#comment-15012651
 ]


Dennis Gove edited comment on SOLR-8281 at 11/19/15 2:31 AM:
-------------------------------------------------------------

To be honest I think this logic should live in the ParallelStream. As a user of 
this stream I would expect it to properly merge all workers together, including 
metrics calculated in those workers. 

That said, putting it in the ReducerStream is also a good idea. I'm on the 
fence as to which would be better. Adding too much to the ParallelStream might 
end up hurting us long-term.


was (Author: dpgove):
To be honest I think this logic should live in the ParallelStream. As a user of 
this stream I would expect it to properly merge all workers together, including 
metrics calculated in those workers. 

That said, putting it in the ReducerStream is also a good idea. I'm on the 
fence as to which would be better. Adding to much to the ParallelStream might 
end up hurting us long-term.

> Add RollupMergeStream to Streaming API
> --------------------------------------
>
>                 Key: SOLR-8281
>                 URL: https://issues.apache.org/jira/browse/SOLR-8281
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Joel Bernstein
>            Assignee: Joel Bernstein
>
> The RollupMergeStream merges the aggregate results emitted by the 
> RollupStream on *worker* nodes.
> This is designed to be used in conjunction with the HashJoinStream to perform 
> rollup Aggregations on the joined Tuples. The HashJoinStream will require the 
> tuples to be partitioned on the Join keys. To avoid needing to repartition on 
> the *group by* fields for the RollupStream, we can perform a merge of the 
> rolled up Tuples coming from the workers.
> The construct would look like this:
> {code}
> mergeRollup (...
>     parallel (...
>         rollup (...
>             hashJoin (
>                 search(...),
>                 search(...),
>                 on="fieldA"
>             )
>         )
>     )
> )
> {code}
> The pseudo code above would push the *hashJoin* and *rollup* to the *worker* 
> nodes. The emitted rolled up tuples would be merged by the mergeRollup.
>  
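
To make the merge step in the description concrete: because the workers are partitioned on the join keys rather than the *group by* fields, the same group can come back from more than one worker, so the partial aggregates have to be combined. Below is a minimal sketch of that idea in plain Java. It is not the Solr implementation; the class RollupMergeSketch, the Partial type, and the merge method are hypothetical names made up for illustration, and it assumes each worker emits a sum and a count per group.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: none of these names come from Solr's Streaming API.
class RollupMergeSketch {

    // One rolled-up tuple as a worker might emit it: a group-by value
    // plus partial sum and count metrics for that group.
    static final class Partial {
        final String group;
        final double sum;
        final long count;

        Partial(String group, double sum, long count) {
            this.group = group;
            this.sum = sum;
            this.count = count;
        }

        // An average cannot be merged directly, so derive it from sum and count.
        double avg() {
            return count == 0 ? 0.0 : sum / count;
        }
    }

    // Combine the per-worker partials so that each group appears exactly once.
    static Map<String, Partial> merge(List<List<Partial>> workerOutputs) {
        Map<String, Partial> merged = new TreeMap<>();
        for (List<Partial> worker : workerOutputs) {
            for (Partial p : worker) {
                // Sums and counts are additive, so partials for a group just add up.
                merged.merge(p.group, p,
                    (a, b) -> new Partial(a.group, a.sum + b.sum, a.count + b.count));
            }
        }
        return merged;
    }
}
```

Sums and counts merge by addition, and min and max would merge the same way with Math.min and Math.max. An average has to be recomputed from the merged sum and count, which is why the sketch carries both instead of a per-worker average.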



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
