[ https://issues.apache.org/jira/browse/PIG-1916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Daniel Dai updated PIG-1916:
----------------------------

    Attachment: PIG-1916_5.patch

Change the patch slightly to fix test-patch warnings.

> Nested cross
> ------------
>
>                 Key: PIG-1916
>                 URL: https://issues.apache.org/jira/browse/PIG-1916
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>            Reporter: Daniel Dai
>            Assignee: Zhijie Shen
>              Labels: gsoc2011
>             Fix For: 0.10
>
>         Attachments: PIG-1916_1.patch, PIG-1916_2.patch, PIG-1916_3.patch, PIG-1916_4.patch, PIG-1916_5.patch
>
>
> It is useful to support cross inside a nested foreach statement. A typical
> use case for nested foreach is that, after cogrouping two relations, we want
> to flatten the records that share a key and do some processing on them.
> Cross is the natural way to achieve this. E.g.:
> {code}
> C = cogroup user by uid, session by uid;
> D = foreach C {
>     crossed = cross user, session; -- To flatten two input bags
>     filtered = filter crossed by user::region == session::region;
>     result = foreach filtered generate processSession(user::age, user::gender, session::ip); -- Nested foreach, Jira: PIG-1631
>     generate result;
> }
> {code}
> Without cross, the user has to write a UDF that processes the bags user and
> session, which is much harder than writing a UDF that processes flattened
> tuples. This is especially true when we have nested foreach statements
> (PIG-1631).
> This is a candidate project for Google Summer of Code 2011. More information
> about the program can be found at http://wiki.apache.org/pig/GSoc2011
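
For readers less familiar with Pig, the per-key behaviour the issue asks for can be modelled in a few lines of Python. This is only a sketch of the intended semantics, not how Pig implements nested cross: the helper names (cogroup, nested_cross_filter), the dict-based records, and the sample data are all assumptions made for illustration, and the processSession UDF from the example is left out.

```python
from collections import defaultdict
from itertools import product


def cogroup(left, right, key):
    """Group two lists of dict records by `key`, roughly like Pig's COGROUP.

    Returns {key_value: (left_bag, right_bag)}.
    """
    groups = defaultdict(lambda: ([], []))
    for rec in left:
        groups[rec[key]][0].append(rec)
    for rec in right:
        groups[rec[key]][1].append(rec)
    return dict(groups)


def nested_cross_filter(users, sessions):
    """Model of the nested block: cross the two bags of each key, then filter."""
    result = {}
    for uid, (user_bag, session_bag) in cogroup(users, sessions, "uid").items():
        # crossed = cross user, session;
        crossed = product(user_bag, session_bag)
        # filtered = filter crossed by user::region == session::region;
        result[uid] = [(u, s) for u, s in crossed if u["region"] == s["region"]]
    return result


# Hypothetical sample data, not taken from the issue.
users = [
    {"uid": 1, "region": "us", "age": 30},
    {"uid": 2, "region": "eu", "age": 41},
]
sessions = [
    {"uid": 1, "region": "us", "ip": "10.0.0.1"},
    {"uid": 1, "region": "eu", "ip": "10.0.0.2"},
    {"uid": 2, "region": "eu", "ip": "10.0.0.3"},
]

matches = nested_cross_filter(users, sessions)
```

Each key keeps its own list of (user, session) pairs, so later per-key processing sees flat tuples instead of two separate bags, which is the convenience the issue describes.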