[ 
https://issues.apache.org/jira/browse/CRUNCH-58?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13453908#comment-13453908
 ] 

Josh Wills commented on CRUNCH-58:
----------------------------------

I understand the motivation on #2, but #1 is kinda iffy for the reason you 
mentioned (i.e., in Crunch, everything is based off of PCollection). I would 
prefer to have the interface only provide the getValue method for now. write() 
is somewhat misleading, IMO, since it writes the underlying PCollection, which 
is not necessarily the same type as the value returned by the PObject, and the 
getName/getPipeline stuff shouldn't be necessary since we tend to think of 
PObjects as end points, and those methods are intended to be used when we're 
going to do some subsequent processing on a PCollection (e.g., the lib/* 
methods make extensive use of them, and I don't know that we have PObject lib/* 
related methods in mind just yet.)
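
To make the pruned-down shape concrete, here is a rough sketch of an interface 
that exposes only getValue, plus one way a value could be deferred until it is 
first asked for. This is an illustration, not the contents of CRUNCH-58.patch: 
LazyPObject and its Supplier-based constructor are hypothetical names made up 
for this example, and a real implementation would run the pipeline over the 
underlying PCollection instead of calling a local function.

```java
// Sketch only: the interface carries getValue() and nothing else
// (no write(), getName(), or getPipeline()).
interface PObject<T> {
  T getValue();
}

// Hypothetical helper, not part of the patch: defers the computation until
// getValue() is first called, then caches the result for later calls.
final class LazyPObject<T> implements PObject<T> {
  private final java.util.function.Supplier<T> computation;
  private T value;
  private boolean computed;

  LazyPObject(java.util.function.Supplier<T> computation) {
    this.computation = computation;
  }

  @Override
  public synchronized T getValue() {
    if (!computed) {
      // Stand-in for materializing the underlying PCollection.
      value = computation.get();
      computed = true;
    }
    return value;
  }
}
```

With something like this, a caller would build the PObject up front and only 
pay for the computation at the point where getValue() is actually called.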

Of course, if we have use cases later on where exposing those methods makes 
sense, adding them to the interface won't be a big deal. I'm happy to just take 
your existing patch as-is and prune off those methods, if it's alright with you.
                
> Implement PObject in Crunch/Scrunch
> -----------------------------------
>
>                 Key: CRUNCH-58
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-58
>             Project: Crunch
>          Issue Type: New Feature
>    Affects Versions: 0.3.0
>            Reporter: Kiyan Ahmadizadeh
>            Assignee: Kiyan Ahmadizadeh
>         Attachments: CRUNCH-58.patch
>
>
> FlumeJava has the concept of a PObject<T>, a container for a singleton of 
> type T.  It is meant to represent the result of a distributed computation that 
> yields a singleton value (for example max, min, and length methods on 
> PCollection<T>).  Generally speaking, the result of any computation that 
> combines/reduces a PCollection into a singleton value could be represented by 
> a PObject.  
> Like PCollection, a PObject defers distributed computation until its value is 
> actually used.  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
