[ 
https://issues.apache.org/jira/browse/SOLR-8337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-8337:
---------------------------------
    Description: 
The current ReducerStream groups all documents that share the same key(s) into 
a list and emits a single Tuple that contains this list. There is no way to 
tell the ReducerStream to do something more interesting with groups, for 
example summing a column within a group, or joining tuples. 

This ticket adds a new type of operation called a ReduceOperation which is 
passed to the ReducerStream so that the reduce behavior can be specialized.

The ReduceOperation has two methods:

1) operate(Tuple) : This is called once for each Tuple in a group. This method 
can be used to aggregate Tuples as they are added to a group. 
2) reduce() : This is called when the group keys change. This method returns a 
single Tuple which is output by the ReducerStream. The ReduceOperation must 
also clear its internal structures when reduce is called, to prepare for 
the next group.
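
The two-method contract above can be sketched roughly as follows. This is a 
minimal illustration and not the code in the attached patch: the Tuple is 
replaced by a plain Map, and the names ReduceSketch, SumOperation, run and 
tuple are hypothetical, chosen only for this example. It assumes the input 
is already sorted by the group key, so a key change marks the end of a group.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

public class ReduceSketch {

    // Simplified stand-in for the ReduceOperation described in the ticket.
    // A Map<String, Object> plays the role of a Tuple (assumption).
    interface ReduceOperation {
        void operate(Map<String, Object> tuple); // called once per tuple in a group
        Map<String, Object> reduce();            // called when the group key changes
    }

    // Hypothetical operation: sums one numeric field within each group.
    static class SumOperation implements ReduceOperation {
        private final String keyField;
        private final String sumField;
        private Object key;
        private double sum;

        SumOperation(String keyField, String sumField) {
            this.keyField = keyField;
            this.sumField = sumField;
        }

        @Override
        public void operate(Map<String, Object> tuple) {
            key = tuple.get(keyField);
            sum += ((Number) tuple.get(sumField)).doubleValue();
        }

        @Override
        public Map<String, Object> reduce() {
            Map<String, Object> out = new HashMap<>();
            out.put(keyField, key);
            out.put("sum(" + sumField + ")", sum);
            // Clear internal state so the next group starts from scratch.
            key = null;
            sum = 0;
            return out;
        }
    }

    // Hypothetical driver playing the role of the ReducerStream: it feeds
    // each tuple to the operation and emits one tuple per group.
    static List<Map<String, Object>> run(List<Map<String, Object>> sorted,
                                         String keyField,
                                         ReduceOperation op) {
        List<Map<String, Object>> out = new ArrayList<>();
        Object current = null;
        boolean groupOpen = false;
        for (Map<String, Object> t : sorted) {
            Object k = t.get(keyField);
            if (groupOpen && !Objects.equals(k, current)) {
                out.add(op.reduce()); // key changed: close the previous group
            }
            op.operate(t);
            current = k;
            groupOpen = true;
        }
        if (groupOpen) {
            out.add(op.reduce()); // flush the last group at end of input
        }
        return out;
    }

    // Helper that builds a sample tuple with hypothetical field names.
    static Map<String, Object> tuple(String dept, int salary) {
        Map<String, Object> t = new HashMap<>();
        t.put("dept", dept);
        t.put("salary", salary);
        return t;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> input = Arrays.asList(
            tuple("a", 10), tuple("a", 5), tuple("b", 7));
        for (Map<String, Object> t : run(input, "dept", new SumOperation("dept", "salary"))) {
            System.out.println(t);
        }
    }
}
```

Resetting state inside reduce() is what lets a single operation instance be 
reused across groups; if the reset is skipped, the sum from one group leaks 
into the next.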




  was:
This is a very simple ticket to create a new interface that extends the 
StreamOperation. The interface will be called the ReduceOperation.

In the near future the ReducerStream will be changed to accept a 
ReduceOperation. This will allow users to pass in the specific reduce algorithm 
to the ReducerStream, making the ReducerStream much more powerful.


> Add ReduceOperation and wire it into the ReducerStream
> ------------------------------------------------------
>
>                 Key: SOLR-8337
>                 URL: https://issues.apache.org/jira/browse/SOLR-8337
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Joel Bernstein
>         Attachments: SOLR-8337.patch, SOLR-8337.patch, SOLR-8337.patch, 
> SOLR-8337.patch, SOLR-8337.patch
>
>
> The current ReducerStream groups all documents that share the same key(s) 
> into a list and emits a single Tuple that contains this list. There is no way 
> to tell the ReducerStream to do something more interesting with groups, for 
> example summing a column within a group, or joining tuples. 
> This ticket adds a new type of operation called a ReduceOperation which is 
> passed to the ReducerStream so that the reduce behavior can be specialized.
> The ReduceOperation has two methods:
> 1) operate(Tuple) : This is called once for each Tuple in a group. This 
> method can be used to aggregate Tuples as they are added to a group. 
> 2) reduce() : This is called when the group keys change. This method returns 
> a single Tuple which is output by the ReducerStream. The ReduceOperation must 
> also clear its internal structures when reduce is called, to prepare for 
> the next group.



