[jira] [Commented] (SOLR-8281) Add RollupMergeStream to Streaming API

2020-09-23 Thread Joel Bernstein (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-8281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17200886#comment-17200886
 ] 

Joel Bernstein commented on SOLR-8281:
--

[~gus], feel free to send me an email to discuss.

> Add RollupMergeStream to Streaming API
> --
>
> Key: SOLR-8281
> URL: https://issues.apache.org/jira/browse/SOLR-8281
> Project: Solr
> Issue Type: Bug
> Reporter: Joel Bernstein
> Assignee: Joel Bernstein
> Priority: Major
>
> The RollupMergeStream merges the aggregate results emitted by the 
> RollupStream on *worker* nodes.
> This is designed to be used in conjunction with the HashJoinStream to perform 
> rollup aggregations on the joined tuples. The HashJoinStream requires the 
> tuples to be partitioned on the join keys. To avoid repartitioning on the 
> *group by* fields for the RollupStream, we can merge the rolled-up tuples 
> coming from the workers.
> The construct would look like this:
> {code}
> mergeRollup (...
>   parallel (...
>     rollup (...
>       hashJoin (
>         search(...),
>         search(...),
>         on="fieldA"
>       )
>     )
>   )
> )
> {code}
> The pseudocode above would push the *hashJoin* and *rollup* to the *worker* 
> nodes. The emitted rolled-up tuples would then be merged by mergeRollup.
>  
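To make the data flow in the description concrete, here is a minimal sketch in plain Python, not Solr code. It assumes two hypothetical inputs joined on "fieldA" and rolled up over a hypothetical "region" field with sum and count aggregates. All function names (partition, hash_join, rollup, merge_rollup, run) are illustrative and are not part of the Streaming API. Because tuples are partitioned on the join key rather than the group-by field, the same group can show up on several workers, and the final step combines those partial aggregates.

```python
from collections import defaultdict


def partition(tuples, key, num_workers):
    """Assign each tuple to a worker by hashing the join key."""
    workers = [[] for _ in range(num_workers)]
    for t in tuples:
        workers[hash(t[key]) % num_workers].append(t)
    return workers


def hash_join(left, right, on):
    """Join two tuple lists on one field, using a hash index over `right`."""
    index = defaultdict(list)
    for r in right:
        index[r[on]].append(r)
    return [{**l, **r} for l in left for r in index.get(l[on], [])]


def rollup(tuples, over, field):
    """Per-worker aggregation: partial sum and count of `field` per group."""
    acc = {}
    for t in tuples:
        agg = acc.setdefault(t[over], {over: t[over], "sum": 0, "count": 0})
        agg["sum"] += t[field]
        agg["count"] += 1
    return sorted(acc.values(), key=lambda a: a[over])


def merge_rollup(partials, over):
    """Combine the partial aggregates emitted by all workers."""
    merged = {}
    for worker_output in partials:
        for p in worker_output:
            agg = merged.setdefault(p[over], {over: p[over], "sum": 0, "count": 0})
            agg["sum"] += p["sum"]
            agg["count"] += p["count"]
    return sorted(merged.values(), key=lambda a: a[over])


def run(left, right, num_workers=2):
    """Partition on the join key, join and roll up per worker, then merge."""
    left_parts = partition(left, "fieldA", num_workers)
    right_parts = partition(right, "fieldA", num_workers)
    partials = [
        rollup(hash_join(l, r, "fieldA"), "region", "amount")
        for l, r in zip(left_parts, right_parts)
    ]
    return merge_rollup(partials, "region")


# Hypothetical sample data: every region spans more than one join key.
orders = [
    {"fieldA": 1, "amount": 10},
    {"fieldA": 2, "amount": 20},
    {"fieldA": 3, "amount": 30},
    {"fieldA": 4, "amount": 40},
]
customers = [
    {"fieldA": 1, "region": "east"},
    {"fieldA": 2, "region": "east"},
    {"fieldA": 3, "region": "west"},
    {"fieldA": 4, "region": "west"},
]

result = run(orders, customers)
```

Sums and counts can be merged by simple addition, which is why the sketch carries both. An average would have to be derived from the merged sum and count rather than by averaging the per-worker averages.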



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-8281) Add RollupMergeStream to Streaming API

2020-09-23 Thread Gus Heck (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-8281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17200848#comment-17200848
 ] 

Gus Heck commented on SOLR-8281:


This seems related to something I wanted to do for a client. I had a reduce 
with group(), and I wanted to feed each group to an arbitrary streaming 
expression for further processing and have the result show up in the groups 
(the result would have been a matrix). The problem I got stuck on was how to 
express the stream that processes the group without a source (the source is 
the group itself).
