[ 
https://issues.apache.org/jira/browse/FLINK-758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14053467#comment-14053467
 ] 

Fabian Hueske commented on FLINK-758:
-------------------------------------

This is a good observation, though it is only a problem for the ungrouped 
GroupReduce.
If you have a grouped reduce (plain Reduce or GroupReduce), you cannot create 
an initial value for all possible keys that are "missing".
An ungrouped Reduce receives no data only if its input is empty, e.g. because 
all elements were filtered out. 

I think we are mixing the semantics a bit here. There is a difference between 
an initial value that is used if at least one element of a group exists and an 
initial value that is always used (only practical for ungrouped reduces) and 
serves as a default value if no data is present.

Should we separate this into 1) an initial value that is used if at least one 
element is present and 2) a default value for ungrouped Reduce and GroupReduce 
if the input is empty?
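
To make the two options concrete, here is a minimal sketch in plain Java. It 
does not use the Flink API; the class and method names (ReduceSemantics, 
reduceWithInitial, reduceWithDefault, count) are hypothetical and exist only 
to illustrate the distinction. It assumes non-null elements.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.Optional;
import java.util.function.BinaryOperator;
import java.util.stream.Collectors;

// Hypothetical illustration only; none of these methods exist in Flink.
public class ReduceSemantics {

    // Plain reduce: with an empty input the combine function never runs,
    // so there is no result (modelled as an empty Optional).
    public static <T> Optional<T> reduce(List<T> input, BinaryOperator<T> combine) {
        Iterator<T> it = input.iterator();
        if (!it.hasNext()) {
            return Optional.empty();
        }
        T acc = it.next();
        while (it.hasNext()) {
            acc = combine.apply(acc, it.next());
        }
        return Optional.of(acc);
    }

    // Option 1: the initial value is folded in only if at least one
    // element is present; an empty input still yields no result.
    public static <T> Optional<T> reduceWithInitial(
            List<T> input, T initial, BinaryOperator<T> combine) {
        if (input.isEmpty()) {
            return Optional.empty();
        }
        T acc = initial;
        for (T element : input) {
            acc = combine.apply(acc, element);
        }
        return Optional.of(acc);
    }

    // Option 2: the default value is returned only if the input is empty;
    // it does not take part in the reduction otherwise.
    public static <T> T reduceWithDefault(
            List<T> input, T defaultValue, BinaryOperator<T> combine) {
        return reduce(input, combine).orElse(defaultValue);
    }

    // count() as described in the issue: map every element to 1, then sum.
    // The default value 0 covers the empty-input case.
    public static long count(List<?> input) {
        List<Long> ones = input.stream().map(x -> 1L).collect(Collectors.toList());
        return reduceWithDefault(ones, 0L, Long::sum);
    }

    public static void main(String[] args) {
        System.out.println(count(Collections.emptyList()));
        System.out.println(count(Arrays.asList("a", "b", "c")));
        System.out.println(reduceWithInitial(Arrays.asList(1, 2), 10, Integer::sum));
        System.out.println(reduceWithInitial(Collections.<Integer>emptyList(), 10, Integer::sum));
    }
}
```

In this sketch only option 2 helps the count case, because the result for an 
empty input has to come from somewhere other than the reduce function itself.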

> Add count method to DataSet and implement CountOperator
> -------------------------------------------------------
>
>                 Key: FLINK-758
>                 URL: https://issues.apache.org/jira/browse/FLINK-758
>             Project: Flink
>          Issue Type: Improvement
>            Reporter: GitHub Import
>              Labels: github-import
>             Fix For: pre-apache
>
>         Attachments: pull-request-758-7518001488867571817.patch
>
>
> At the request of @twalthr. This is the count operator I've implemented some 
> time ago to get to know the new Java API. It introduces 
> `DataSet.count()`, which is executed as a map (to ones) and reduce (sum up 
> the ones). I initially didn't do the PR, because of the following problem: 
> empty DataSets don't work as the first map won't have any input to operate on.
> If more people think that we should include this operator we can think about 
> a possible solution to the problem.
> ---------------- Imported from GitHub ----------------
> Url: https://github.com/stratosphere/stratosphere/pull/758
> Created by: [uce|https://github.com/uce]
> Labels: enhancement, java api
> Milestone: Release 0.6 (unplanned)
> Created at: Tue May 06 10:42:33 CEST 2014
> State: open



--
This message was sent by Atlassian JIRA
(v6.2#6252)
