[ 
https://issues.apache.org/jira/browse/PIG-1916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-1916:
----------------------------

    Description: 
It is useful to support cross inside a nested foreach statement. One typical use 
case for nested foreach is that, after cogrouping two relations, we want to 
flatten the records that share a key and do some processing on them. This is 
naturally achieved with cross. E.g.:
{code}
C = cogroup user by uid, session by uid;
D = foreach C {
    crossed = cross user, session; -- To flatten two input bags
    filtered = filter crossed by user::region == session::region;
    result = foreach filtered generate
        processSession(user::age, user::gender, session::ip); -- nested foreach: PIG-1631
    generate result;
}
{code}

Without cross, the user has to write a UDF that processes the bags user and 
session. That is much harder than writing a UDF that processes flattened 
tuples. This is especially true when we have the nested foreach statement 
(PIG-1631).

  was:
It is useful to have cross inside foreach nested statement. One typical for 
nested foreach is after cogroup two relations, we want to flatten the records 
of the same key, and do some processing. This is naturally to be achieved by 
cross. Eg:
{code}
C = cogroup user by uid, session by uid;
D = foreach C {
    crossed = cross user, session; -- To flatten two input bags
    filtered = filter crossed by user::region == session::region;
    result = foreach crossed generate processSession(user::age, user::gender, 
session::ip);  --Nested foreach Jira: PIG-1631
    generate result;
}
{code}

If we don't have cross, user have to write a UDF process the bag user, session. 
It is much harder than a UDF process flattened tuples. This is especially true 
when we have nested foreach statement(PIG-1631).


> Nested cross
> ------------
>
>                 Key: PIG-1916
>                 URL: https://issues.apache.org/jira/browse/PIG-1916
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>            Reporter: Daniel Dai
>             Fix For: 0.10
>
>
> It is useful to support cross inside a nested foreach statement. One typical 
> use case for nested foreach is that, after cogrouping two relations, we want 
> to flatten the records that share a key and do some processing on them. This 
> is naturally achieved with cross. E.g.:
> {code}
> C = cogroup user by uid, session by uid;
> D = foreach C {
>     crossed = cross user, session; -- To flatten two input bags
>     filtered = filter crossed by user::region == session::region;
>     result = foreach filtered generate
>         processSession(user::age, user::gender, session::ip); -- nested foreach: PIG-1631
>     generate result;
> }
> {code}
> Without cross, the user has to write a UDF that processes the bags user and 
> session. That is much harder than writing a UDF that processes flattened 
> tuples. This is especially true when we have the nested foreach statement 
> (PIG-1631).
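
As a rough illustration of what the proposed nested cross would compute for a single cogroup record, here is a minimal Python sketch. It is not Pig code and not the implementation; the function name nested_cross and the sample field values are made up for the example. It pairs every tuple in the user bag with every tuple in the session bag for one key, then keeps the pairs whose regions match, mirroring the cross and filter steps in the description.

```python
from itertools import product

def nested_cross(users, sessions):
    """Cross two bags that share a key, then keep pairs with matching regions."""
    # Equivalent in spirit to: crossed = cross user, session;
    crossed = product(users, sessions)
    # Equivalent in spirit to: filter crossed by user::region == session::region;
    return [(u, s) for u, s in crossed if u["region"] == s["region"]]

# Hypothetical contents of one cogroup record (uid = 1).
users = [{"uid": 1, "age": 30, "gender": "f", "region": "us"}]
sessions = [
    {"uid": 1, "ip": "10.0.0.1", "region": "us"},
    {"uid": 1, "ip": "10.0.0.2", "region": "eu"},
]

pairs = nested_cross(users, sessions)
```

Each resulting pair is a flattened (user, session) combination, which is the shape a per-tuple UDF such as processSession could consume directly instead of receiving two whole bags.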

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
