[jira] [Commented] (SOLR-8281) Add RollupMergeStream to Streaming API
[ https://issues.apache.org/jira/browse/SOLR-8281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17200886#comment-17200886 ]

Joel Bernstein commented on SOLR-8281:
--------------------------------------

[~gus], feel free to send me an email to discuss.

> Add RollupMergeStream to Streaming API
> --------------------------------------
>
>                 Key: SOLR-8281
>                 URL: https://issues.apache.org/jira/browse/SOLR-8281
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Joel Bernstein
>            Assignee: Joel Bernstein
>            Priority: Major
>
> The RollupMergeStream merges the aggregate results emitted by the
> RollupStream on *worker* nodes.
> This is designed to be used in conjunction with the HashJoinStream to perform
> rollup aggregations on the joined Tuples. The HashJoinStream will require the
> tuples to be partitioned on the join keys. To avoid needing to repartition on
> the *group by* fields for the RollupStream, we can perform a merge of the
> rolled up Tuples coming from the workers.
> The construct would look like this:
> {code}
> mergeRollup (...
>   parallel (...
>     rollup (...
>       hashJoin (
>         search(...),
>         search(...),
>         on="fieldA"
>       )
>     )
>   )
> )
> {code}
> The pseudocode above would push the *hashJoin* and *rollup* to the *worker*
> nodes. The emitted rolled up tuples would be merged by the mergeRollup.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
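The merge step described in the issue above can be illustrated with a small sketch. Because the workers are partitioned on the join key rather than the *group by* fields, the same group may be emitted by more than one worker, so the partial aggregates have to be combined per group. This is not Solr code: the function name `merge_rollups` and the tuple field names (`count`, `sum`, `min`, `max`, `avg`) are hypothetical and chosen only for illustration. Tuples are modelled as plain Python dicts.

```python
def merge_rollups(worker_outputs, group_by):
    """Combine partial aggregates emitted by several workers.

    worker_outputs: iterable of lists of dicts, one list per worker
                    (hypothetical stand-in for rolled up Tuples).
    group_by:       tuple of field names that identify a group.
    """
    merged = {}
    for tuples in worker_outputs:
        for t in tuples:
            key = tuple(t[f] for f in group_by)
            acc = merged.get(key)
            if acc is None:
                # First time this group is seen: copy so inputs stay untouched.
                merged[key] = dict(t)
                continue
            # Counts and sums add up; min and max combine with min and max.
            acc["count"] += t["count"]
            acc["sum"] += t["sum"]
            acc["min"] = min(acc["min"], t["min"])
            acc["max"] = max(acc["max"], t["max"])
    # An average cannot be merged directly; derive it from merged sum and count.
    for acc in merged.values():
        acc["avg"] = acc["sum"] / acc["count"]
    # Return groups in a deterministic order, sorted by group key.
    return [merged[k] for k in sorted(merged)]


# Two workers that both saw group "x" because they were partitioned on the
# join key, not on the group-by field.
worker_a = [
    {"fieldB": "x", "count": 2, "sum": 10.0, "min": 4.0, "max": 6.0},
    {"fieldB": "y", "count": 1, "sum": 3.0, "min": 3.0, "max": 3.0},
]
worker_b = [
    {"fieldB": "x", "count": 3, "sum": 9.0, "min": 1.0, "max": 5.0},
]
result = merge_rollups([worker_a, worker_b], ("fieldB",))
```

The sketch buffers every group in a dict, which keeps it short. A streaming merge over worker output sorted on the group-by fields would avoid holding all groups in memory, but that is a design choice the issue text does not settle.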
[jira] [Commented] (SOLR-8281) Add RollupMergeStream to Streaming API
[ https://issues.apache.org/jira/browse/SOLR-8281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17200848#comment-17200848 ]

Gus Heck commented on SOLR-8281:
--------------------------------

This seems related to something I wanted to do for a client... I had reduce
with group() and I wanted to then feed the groups to an arbitrary streaming
expression for further processing, and have the result show up in the groups
(the result would have been a matrix). The problem I stopped on was how to
express the stream that processes the group without having a source (the
source is the group).