[ https://issues.apache.org/jira/browse/PIG-1916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053407#comment-13053407 ]
Zhijie Shen commented on PIG-1916:
----------------------------------

As Daniel suggested, I now handle LOCross differently depending on where it appears: for a top-level LOCross, I keep the original logic; for a nested LOCross, I generate a POCross instead and place it at the correct position in the physical plan. With that change, the physical and map/reduce plans are generated correctly.

However, the pipeline was still broken because the getNext() function of POCross had not been implemented. I recall you mentioning that a previous version of Pig had a POCross. I found that code and reused its getNext() function, and it almost worked. There were some bugs, which I fixed. The nested cross now basically works.

Please refer to the newest patch for the code update.

> Nested cross
> ------------
>
>                 Key: PIG-1916
>                 URL: https://issues.apache.org/jira/browse/PIG-1916
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>            Reporter: Daniel Dai
>              Labels: gsoc2011
>             Fix For: 0.10
>
>         Attachments: PIG-1916_1.patch, PIG-1916_2.patch
>
>
> It is useful to have cross inside a nested foreach statement. One typical use
> case for nested foreach is that, after cogrouping two relations, we want to
> flatten the records of the same key and do some processing. This is naturally
> achieved by cross. E.g.:
> {code}
> C = cogroup user by uid, session by uid;
> D = foreach C {
>     crossed = cross user, session; -- To flatten two input bags
>     filtered = filter crossed by user::region == session::region;
>     result = foreach crossed generate processSession(user::age, user::gender, session::ip); -- Nested foreach Jira: PIG-1631
>     generate result;
> }
> {code}
> If we don't have cross, users have to write a UDF that processes the bags user
> and session. That is much harder than writing a UDF that processes flattened
> tuples. This is especially true when we have nested foreach statements (PIG-1631).
> This is a candidate project for Google summer of code 2011.
> More information about the program can be found at http://wiki.apache.org/pig/GSoc2011

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
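
To make concrete what the nested cross in the quoted example is expected to produce for a single cogroup key, here is a minimal Java sketch of the cross product of two bags. It is an illustration only, not Pig's POCross or its getNext() logic: the class name NestedCrossSketch, the method cross, and the use of plain lists for bags and tuples are all assumptions made for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this is NOT Pig's POCross implementation.
// A "bag" is modelled as a list of tuples, and a "tuple" as a list of fields.
final class NestedCrossSketch {

    // Pairs every tuple in `left` with every tuple in `right` and
    // concatenates each pair into one flat tuple, left fields first.
    static List<List<Object>> cross(List<List<Object>> left, List<List<Object>> right) {
        List<List<Object>> out = new ArrayList<>();
        for (List<Object> l : left) {
            for (List<Object> r : right) {
                List<Object> joined = new ArrayList<>(l);
                joined.addAll(r);
                out.add(joined);
            }
        }
        return out;
    }
}
```

With two user tuples and two session tuples under the same key, cross returns four flattened tuples. If either bag is empty, the result is empty, which matches the usual definition of a cross product.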