[
https://issues.apache.org/jira/browse/SOLR-8337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Joel Bernstein updated SOLR-8337:
---------------------------------
Description:
The current ReducerStream groups all documents that share the same key(s) into
a list and emits a single Tuple that contains this list. There is no way to
tell the ReducerStream to do something more interesting with groups, for
example summing a column within a group, or joining tuples.
This ticket adds a new type of operation called a ReduceOperation which is
passed to the ReducerStream so that the reduce behavior can be specialized.
The ReduceOperation has two methods:
1) operate(Tuple) : This is called once for each Tuple in a group. This method
can be used to aggregate Tuples as they are added to a group.
2) reduce() : This is called when the group keys change. This method returns a
single Tuple which is output by the ReducerStream. The ReduceOperation must
also clear its internal structures when reduce is called, to prepare for
the next group.
was:
This is a very simple ticket to create a new interface that extends
StreamOperation. The interface will be called ReduceOperation.
In the near future the ReducerStream will be changed to accept a
ReduceOperation. This will allow users to pass in the specific reduce algorithm
to the ReducerStream, making the ReducerStream much more powerful.
> Add ReduceOperation and wire it into the ReducerStream
> ------------------------------------------------------
>
> Key: SOLR-8337
> URL: https://issues.apache.org/jira/browse/SOLR-8337
> Project: Solr
> Issue Type: Bug
> Reporter: Joel Bernstein
> Attachments: SOLR-8337.patch, SOLR-8337.patch, SOLR-8337.patch,
> SOLR-8337.patch, SOLR-8337.patch
>
>
> The current ReducerStream groups all documents that share the same key(s)
> into a list and emits a single Tuple that contains this list. There is no way
> to tell the ReducerStream to do something more interesting with groups, for
> example summing a column within a group, or joining tuples.
> This ticket adds a new type of operation called a ReduceOperation which is
> passed to the ReducerStream so that the reduce behavior can be specialized.
> The ReduceOperation has two methods:
> 1) operate(Tuple) : This is called once for each Tuple in a group. This
> method can be used to aggregate Tuples as they are added to a group.
> 2) reduce() : This is called when the group keys change. This method returns
> a single Tuple which is output by the ReducerStream. The ReduceOperation must
> also clear its internal structures when reduce is called, to prepare for
> the next group.
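
For illustration, the operate/reduce contract described above can be sketched
in plain Java. This is a minimal sketch, not the code in the attached patch:
the Tuple class here is a hypothetical stand-in for the Solr Tuple, and
SumOperation, reduceSorted and the "sum(price)" output field name are all
assumptions made for the example. The loop assumes the input is already
sorted by the group key, and that the last group is flushed at end of input.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

public class ReduceOperationSketch {

  // Hypothetical stand-in for the Solr Tuple: a bag of named fields.
  static class Tuple {
    final Map<String, Object> fields = new HashMap<>();
    Tuple put(String name, Object value) { fields.put(name, value); return this; }
    Object get(String name) { return fields.get(name); }
  }

  // The two methods named in the ticket description.
  interface ReduceOperation {
    void operate(Tuple tuple); // called once per Tuple in a group
    Tuple reduce();            // called when the group key changes
  }

  // Hypothetical operation: sums one numeric column within each group.
  static class SumOperation implements ReduceOperation {
    private final String keyField;
    private final String sumField;
    private Object currentKey;
    private long sum;

    SumOperation(String keyField, String sumField) {
      this.keyField = keyField;
      this.sumField = sumField;
    }

    @Override
    public void operate(Tuple tuple) {
      currentKey = tuple.get(keyField);
      sum += ((Number) tuple.get(sumField)).longValue();
    }

    @Override
    public Tuple reduce() {
      Tuple out = new Tuple()
          .put(keyField, currentKey)
          .put("sum(" + sumField + ")", sum);
      // Clear internal state to prepare for the next group.
      currentKey = null;
      sum = 0;
      return out;
    }
  }

  // Stand-in for the grouping loop. Input must be sorted by keyField.
  static List<Tuple> reduceSorted(List<Tuple> sorted, String keyField,
                                  ReduceOperation op) {
    List<Tuple> out = new ArrayList<>();
    Object groupKey = null;
    boolean inGroup = false;
    for (Tuple t : sorted) {
      Object key = t.get(keyField);
      if (inGroup && !Objects.equals(key, groupKey)) {
        out.add(op.reduce()); // key changed: emit the finished group
      }
      op.operate(t);
      groupKey = key;
      inGroup = true;
    }
    if (inGroup) {
      out.add(op.reduce()); // flush the last group at end of input
    }
    return out;
  }

  public static void main(String[] args) {
    List<Tuple> input = new ArrayList<>();
    input.add(new Tuple().put("id", "a").put("price", 2));
    input.add(new Tuple().put("id", "a").put("price", 3));
    input.add(new Tuple().put("id", "b").put("price", 10));
    for (Tuple t : reduceSorted(input, "id", new SumOperation("id", "price"))) {
      System.out.println(t.get("id") + " " + t.get("sum(price)"));
    }
    // prints "a 5" then "b 10"
  }
}
```

Because reduce() resets the running state, a single SumOperation instance can
be reused across every group in the stream, which is the reason the
description requires the operation to clear its internal structures.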