[ https://issues.apache.org/jira/browse/PIG-1834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Thejas M Nair updated PIG-1834:
-------------------------------

    Description:
Pig allows a relation alias to be re-used, i.e. to refer to different relations (statements). I have not seen this in the documentation, but I have seen people writing such queries.
For example -
{code}
l = load 'x' as (a,b);
l = filter l by a > 1;
l = foreach ...
store l into 'y'
{code}
At any point in the query, the alias "l" always represents the relation it was last associated with in the portion of the Pig query above it.
But in the case of the relation-as-scalar feature, the association happens with the last relation associated with the alias in the entire script.
For example -
{code}
l = load 'x' as (a,b);
A = load 'x' as (a,b);
B = foreach A generate a, l.a as la;
l = foreach l generate a+1 as a;
store B into 'b';
{code}
The alias l in the statement for alias B should refer to the load, but it refers to the foreach statement -
{code}
#--------------------------------------------------
# Map Reduce Plan
#--------------------------------------------------
MapReduce node scope-16
Map Plan
l: Store(file:/tmp/temp-953430379/tmp2006282146:org.apache.pig.impl.io.InterStorage) - scope-8
|
|---l: New For Each(false)[bag] - scope-7
    |   |
    |   Add[int] - scope-5
    |   |
    |   |---Cast[int] - scope-3
    |   |   |
    |   |   |---Project[bytearray][0] - scope-2
    |   |
    |   |---Constant(1) - scope-4
    |
    |---l: Load(file:///Users/tejas/pig_type/trunk/x:org.apache.pig.builtin.PigStorage) - scope-1--------
Global sort: false
----------------
MapReduce node scope-17
Map Plan
B: Store(file:///Users/tejas/pig_type/trunk/b:org.apache.pig.builtin.PigStorage) - scope-15
|
|---B: New For Each(false,false)[bag] - scope-14
    |   |
    |   Project[bytearray][0] - scope-9
    |   |
    |   POUserFunc(org.apache.pig.impl.builtin.ReadScalars)[int] - scope-13
    |   |
    |   |---Constant(0) - scope-11
    |   |
    |   |---Constant(file:/tmp/temp-953430379/tmp2006282146) - scope-12
    |
    |---A: Load(file:///Users/tejas/pig_type/trunk/x:org.apache.pig.builtin.PigStorage) - scope-0--------
Global sort: false
----------------
{code}

> relation-as-scalar - uses the last statement associated with the scalar alias
> -----------------------------------------------------------------------------
>
>                 Key: PIG-1834
>                 URL: https://issues.apache.org/jira/browse/PIG-1834
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.8.0
>            Reporter: Thejas M Nair
>             Fix For: 0.8.0, 0.9.0
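
The difference between the two alias lookups described in this issue can be illustrated with a small Python model. This is only a sketch and not Pig's actual planner code: the statement list and the helper names resolve_positional and resolve_last_in_script are hypothetical, introduced here purely for illustration.

```python
# Toy model of alias resolution; NOT Pig's real implementation.
# Each entry is (alias, statement text), in script order.
script = [
    ("l", "load 'x' as (a,b)"),
    ("A", "load 'x' as (a,b)"),
    ("B", "foreach A generate a, l.a as la"),
    ("l", "foreach l generate a+1 as a"),
]


def resolve_positional(statements, alias, at_index):
    """Expected behaviour: latest definition of `alias` strictly above `at_index`."""
    for i in range(at_index - 1, -1, -1):
        if statements[i][0] == alias:
            return i
    raise KeyError(alias)


def resolve_last_in_script(statements, alias):
    """Reported behaviour for scalars: last definition anywhere in the script."""
    for i in range(len(statements) - 1, -1, -1):
        if statements[i][0] == alias:
            return i
    raise KeyError(alias)


if __name__ == "__main__":
    b_index = 2  # position of the statement that defines B
    expected = resolve_positional(script, "l", b_index)
    reported = resolve_last_in_script(script, "l")
    print("expected:", script[expected][1])
    print("reported:", script[reported][1])
```

In this model the positional lookup for the scalar reference in B picks the load statement (index 0), while the whole-script lookup picks the later foreach (index 3), which mirrors the plan shown in the description.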