[ 
https://issues.apache.org/jira/browse/CRUNCH-57?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13450003#comment-13450003
 ] 

Kiyan Ahmadizadeh commented on CRUNCH-57:
-----------------------------------------

On a similar note, is it expected that for a PCollection<T>, T is Comparable?  
Aggregate.max and Aggregate.min contain implementations that check that the 
type class for the PCollection is Comparable.  Given the MapReduce framework's 
reliance on sorting, however, it seems like T should always be Comparable.  

If it's not expected that T be Comparable, I would argue that max, min, and 
sort functionality should live in a PCollection decorator that provides 
operations only for PCollections whose elements are Comparable.  
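
One way such a decorator could look, as a minimal sketch in plain Java rather 
than against the Crunch API: the class name ComparableCollection and its 
methods are hypothetical, and a java.util.List stands in for the PCollection. 
The bounded type parameter moves the Comparable check from run time to compile 
time.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical decorator, not part of Crunch. A List stands in for PCollection<T>.
// The bound on T means max/min/sorted only exist for Comparable element types.
final class ComparableCollection<T extends Comparable<? super T>> {
    private final List<T> elements;

    ComparableCollection(List<T> elements) {
        // Defensive copy so later changes to the caller's list are not observed.
        this.elements = new ArrayList<>(elements);
    }

    // Largest element by natural ordering; throws NoSuchElementException if empty.
    T max() {
        return Collections.max(elements);
    }

    // Smallest element by natural ordering; throws NoSuchElementException if empty.
    T min() {
        return Collections.min(elements);
    }

    // New list holding the elements in ascending natural order.
    List<T> sorted() {
        List<T> copy = new ArrayList<>(elements);
        Collections.sort(copy);
        return copy;
    }
}
```

With this shape, wrapping a collection of a non-Comparable type fails to 
compile, so no run-time type check is needed inside max or min.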
                
> Add a length function to PCollection
> ------------------------------------
>
>                 Key: CRUNCH-57
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-57
>             Project: Crunch
>          Issue Type: New Feature
>          Components: Core
>    Affects Versions: 0.3.0
>            Reporter: Kiyan Ahmadizadeh
>            Assignee: Josh Wills
>         Attachments: CRUNCH-57.patch
>
>
> Sometimes it's useful and interesting to compute the number of elements in a 
> PCollection.
>  
> For example, suppose there was an initial PCollection that was then filtered 
> into another.  If I'm interested in how many elements of the original 
> PCollection matched the filter, I'll have to write extra code to compute this.
> PCollections should have a length method that, when called, computes the 
> number of elements in the PCollection and returns the result. 
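
The filter scenario described above can be sketched in plain Java, again with 
a java.util.List standing in for the PCollection. LengthSketch and its methods 
are hypothetical names, not Crunch API. Counting is written as "map each 
element to 1, then sum", which is the usual shape of a count in a MapReduce 
setting.

```java
import java.util.List;
import java.util.function.Predicate;

// Hypothetical helper, not part of Crunch. A List stands in for a PCollection.
final class LengthSketch {
    // Number of elements, computed by mapping each element to 1 and summing.
    static <T> long length(List<T> elements) {
        return elements.stream().mapToLong(e -> 1L).sum();
    }

    // Number of elements that pass the filter, without materializing the filtered list.
    static <T> long countMatching(List<T> elements, Predicate<? super T> filter) {
        return elements.stream().filter(filter).mapToLong(e -> 1L).sum();
    }
}
```

Comparing length(original) with countMatching(original, filter) gives the 
number of elements that matched the filter and the number that were dropped.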

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
