[ https://issues.apache.org/jira/browse/PIG-2536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460821#comment-13460821 ]
Julien Le Dem commented on PIG-2536:
------------------------------------

Basically this is making the bag projection syntax work on relations.
If we want to do this, we should look at all the places where a relation can be used and make sure we can have a consistent syntax. This type of shortcut is useful only if it is consistent; otherwise it is confusing.

For some operators it is straightforward:
B = distinct A.x; => B = distinct (foreach A generate x);
B = limit A.x 10; => B = limit (foreach A generate x) 10;
B = sample A.x 0.1; => B = sample (foreach A generate x) 0.1;

For other operators it could be tricky:
B = ORDER A.x BY x; => B = ORDER (FOREACH A GENERATE x) BY x;
B = ORDER A.(y,z) BY x; => B = FOREACH (ORDER A BY x) GENERATE y,z;

Same for group by:
B = GROUP A.(y,z) BY x; => B = FOREACH (GROUP A BY x) GENERATE group, A.(y,z);

And filter:
B = FILTER A.(y,z) BY x == 0; => B = FOREACH (FILTER A BY x == 0) GENERATE y,z;

For split, join, and cogroup it becomes trickier.

> Extend pig to support DISTINCT x.(project)
> ------------------------------------------
>
>                 Key: PIG-2536
>                 URL: https://issues.apache.org/jira/browse/PIG-2536
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Jonathan Coveney
>            Assignee: Jonathan Coveney
>            Priority: Minor
>             Fix For: 0.11
>
>         Attachments: PIG-2436-0.patch
>
>
> Currently, Pig does not allow this syntax:
> {code}
> A = load 'thing' as (x:int, y:int, z:int);
> B = distinct A.x;
> C = distinct A.(y,z);
> D = distinct C.$0;
> {code}
> and so on. With this patch, it does. I should probably add more tests, though
> it's a simple patch... it just turns distinct rel.proj into syntactic sugar
> for distinct (foreach rel generate proj).
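
The "syntactic sugar" rewrite described in the issue can be sketched as a plain string transformation. This is only an illustration of the rule distinct rel.proj => distinct (foreach rel generate proj). It is not how the patch is implemented (Pig's parser is grammar-based), and the function name desugar_distinct and the regular expression are assumptions made for this example.

```python
import re

# Hypothetical illustration only: this is NOT Pig's parser or the PIG-2536 patch.
# Matches "distinct REL.PROJ", where PROJ is a parenthesised field list such as
# (y,z), a positional reference such as $0, or a single field name.
_DISTINCT_PROJ = re.compile(
    r"\b(?P<kw>distinct)\s+(?P<rel>[A-Za-z_]\w*)\."
    r"(?P<proj>\([^()]*\)|\$\d+|[A-Za-z_]\w*)",
    re.IGNORECASE,
)


def desugar_distinct(statement: str) -> str:
    """Rewrite `distinct rel.proj` as `distinct (foreach rel generate proj)`."""

    def expand(match: re.Match) -> str:
        proj = match.group("proj")
        # A.(y,z) projects several fields; drop the grouping parentheses.
        if proj.startswith("("):
            proj = proj[1:-1].strip()
        return f"{match.group('kw')} (foreach {match.group('rel')} generate {proj})"

    # Statements without a projection (e.g. "distinct A") do not match
    # and are returned unchanged.
    return _DISTINCT_PROJ.sub(expand, statement)


if __name__ == "__main__":
    for line in ("B = distinct A.x;", "C = distinct A.(y,z);", "D = distinct C.$0;"):
        print(line, "=>", desugar_distinct(line))
```

The same pattern would cover limit and sample, where the projection can stay inside the operator. It would not cover the ORDER, GROUP and FILTER cases from the comment, where the projection has to move outside the operator because the key (x) is not among the projected fields.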