[ 
https://issues.apache.org/jira/browse/FLINK-758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14124957#comment-14124957
 ] 

ASF GitHub Bot commented on FLINK-758:
--------------------------------------

Github user uce commented on the pull request:

    https://github.com/apache/incubator-flink/pull/63#issuecomment-54751876
  
    Thanks for the review. The initial value for the reduce function and the 
count operator are tightly connected. The reduce with initial value is the 
general solution, of which the count operator is a special case. Therefore, I 
wouldn't say that these are independent features. The refactorings are also 
limited to files related to the initial value reduce/count operator.
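
    To illustrate why count is a special case of reduce with an initial value, 
here is a minimal plain-Java sketch. It uses `java.util.stream` as an analogy 
only, not the Flink API or the code in this PR; the class and method names 
(`CountSketch`, `countWithoutInitialValue`, `countWithInitialValue`) are 
hypothetical. Count is a map to ones followed by a sum; without an initial 
value an empty input produces no result, while an initial value of 0 gives 
the expected count.

```java
import java.util.List;
import java.util.Optional;

// Plain-Java analogy (assumption: NOT the Flink DataSet API).
final class CountSketch {

    // Map every element to 1L, then sum the ones. Without an initial value
    // the reduce has nothing to combine on empty input, so no result exists.
    static <T> Optional<Long> countWithoutInitialValue(List<T> input) {
        return input.stream().map(x -> 1L).reduce(Long::sum);
    }

    // Same pipeline, but the reduce starts from the initial value 0L,
    // so an empty input still yields a count of 0.
    static <T> long countWithInitialValue(List<T> input) {
        return input.stream().map(x -> 1L).reduce(0L, Long::sum);
    }

    public static void main(String[] args) {
        List<String> words = List.of("a", "b", "c");
        List<String> empty = List.of();

        System.out.println(countWithoutInitialValue(words)); // Optional[3]
        System.out.println(countWithoutInitialValue(empty)); // Optional.empty
        System.out.println(countWithInitialValue(words));    // 3
        System.out.println(countWithInitialValue(empty));    // 0
    }
}
```

    In this sketch the initial value plays the role of the neutral element of 
the sum, which is what lets the general reduce cover the empty-input case 
that the plain map/reduce count cannot.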
    
    The counting for grouped data sets was a quick fix after @hsaputra's 
comment. We can either fix it with this PR or open a separate issue if we want 
to merge it.
    
    I think the limitation to AllReduce was the result of a discussion with you 
and @StephanEwen.
    
    ---
    
    All in all, I think that we should wait for the upcoming changes to the 
runtime and scheduler to support the more intuitive API of simply returning the 
count to the user program. As you said, we might move some of the changes (like 
initial value reduce) to a separate issue if we find them useful.


> Add count method to DataSet and implement CountOperator
> -------------------------------------------------------
>
>                 Key: FLINK-758
>                 URL: https://issues.apache.org/jira/browse/FLINK-758
>             Project: Flink
>          Issue Type: Improvement
>            Reporter: GitHub Import
>              Labels: github-import
>             Fix For: pre-apache
>
>         Attachments: pull-request-758-7518001488867571817.patch
>
>
> At the request of @twalthr. This is the count operator I implemented some 
> time ago to get to know the new Java API. It introduces 
> `DataSet.count()`, which is executed as a map (to ones) and reduce (sum up 
> the ones). I initially didn't do the PR, because of the following problem: 
> empty DataSets don't work as the first map won't have any input to operate on.
> If more people think that we should include this operator we can think about 
> a possible solution to the problem.
> ---------------- Imported from GitHub ----------------
> Url: https://github.com/stratosphere/stratosphere/pull/758
> Created by: [uce|https://github.com/uce]
> Labels: enhancement, java api, 
> Milestone: Release 0.6 (unplanned)
> Created at: Tue May 06 10:42:33 CEST 2014
> State: open



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
