[ https://issues.apache.org/jira/browse/PIG-1916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Daniel Dai resolved PIG-1916.
-----------------------------

      Resolution: Fixed
    Release Note: Allow crossing two or more bags inside a foreach statement. For example:

user = load 'user' as (uid, age, gender, region);
session = load 'session' as (uid, region);
C = cogroup user by uid, session by uid;
D = foreach C {
    crossed = cross user, session;
    generate crossed;
}

    Hadoop Flags: [Reviewed]

TestScriptLanguage fails, but the failure is not caused by this patch. I will open a separate Jira to fix the test case.

Test-patch shows several more javac warnings, all of them in Antlr-generated code.

Patch committed to trunk. Congratulations, Zhijie!

> Nested cross
> ------------
>
>                 Key: PIG-1916
>                 URL: https://issues.apache.org/jira/browse/PIG-1916
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>            Reporter: Daniel Dai
>            Assignee: Zhijie Shen
>              Labels: gsoc2011
>             Fix For: 0.10
>
>         Attachments: PIG-1916_1.patch, PIG-1916_2.patch, PIG-1916_3.patch,
> PIG-1916_4.patch, PIG-1916_5.patch
>
>
> It is useful to support cross inside a nested foreach statement. A typical use
> case for nested foreach arises after cogrouping two relations: we want to
> flatten the records that share a key and then process them. Cross is the
> natural way to do this. E.g.:
> {code}
> C = cogroup user by uid, session by uid;
> D = foreach C {
>     crossed = cross user, session; -- To flatten two input bags
>     filtered = filter crossed by user::region == session::region;
>     result = foreach filtered generate processSession(user::age, user::gender,
>         session::ip); -- Nested foreach Jira: PIG-1631
>     generate result;
> }
> {code}
> Without cross, the user has to write a UDF that processes the bags user and
> session, which is much harder than writing a UDF that processes flattened
> tuples. This is especially true when we have a nested foreach statement
> (PIG-1631).
> This is a candidate project for Google Summer of Code 2011.
> More information about the program can be found at
> http://wiki.apache.org/pig/GSoc2011
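
For readers who want to see what the nested cross computes, here is a rough Python sketch of the per-key semantics described in the issue: cogroup two relations, pair every tuple of one bag with every tuple of the other, then filter on region. This is only an illustration, not how Pig implements it. The helper names (cogroup, nested_cross) and the sample rows are made up for this sketch, and it assumes that crossing with an empty bag yields an empty result.

```python
from collections import defaultdict
from itertools import product


def cogroup(left, right, key_left, key_right):
    """Group two relations by key; each group holds one bag per input."""
    groups = defaultdict(lambda: ([], []))
    for row in left:
        groups[key_left(row)][0].append(row)
    for row in right:
        groups[key_right(row)][1].append(row)
    return dict(groups)


def nested_cross(groups):
    """For each key, pair every tuple of the first bag with every tuple of the second."""
    return {key: list(product(bag_a, bag_b)) for key, (bag_a, bag_b) in groups.items()}


# Hypothetical sample rows: user = (uid, age, gender, region), session = (uid, region)
user = [(1, 30, "f", "us"), (2, 25, "m", "eu")]
session = [(1, "us"), (1, "eu"), (3, "us")]

C = cogroup(user, session, lambda u: u[0], lambda s: s[0])
D = nested_cross(C)

# Rough equivalent of: filter crossed by user::region == session::region
filtered = {
    key: [(u, s) for (u, s) in pairs if u[3] == s[1]]
    for key, pairs in D.items()
}
```

With these sample rows, only uid 1 has tuples in both bags, so it is the only key whose crossed bag is non-empty. A UDF applied after this step sees one flat (user, session) pair at a time instead of two whole bags, which is the simplification the issue is after.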