Using an alias within Nested Foreach causes indeterminate behaviour
--------------------------------------------------------------------

                 Key: PIG-1633
                 URL: https://issues.apache.org/jira/browse/PIG-1633
             Project: Pig
          Issue Type: Bug
    Affects Versions: 0.7.0, 0.6.0, 0.5.0, 0.4.0
            Reporter: Viraj Bhat

I have created a RANDOMINT function that generates random integers from 0 up to, but not including, the specified value. For example, RANDOMINT(4) returns random numbers between 0 and 3 (inclusive).

{code}
$ hadoop fs -cat rand.dat
f
g
h
i
j
k
l
m
{code}

The Pig script is as follows:

{code}
register math.jar;
A = load 'rand.dat' using PigStorage() as (data);
B = foreach A {
    r = math.RANDOMINT(4);
    generate data, r as random, ((r == 3)?1:0) as quarter;
};
dump B;
{code}

The results are as follows:

{code}
{color:red}
(f,0,0)
(g,3,0)
(h,0,0)
(i,2,0)
(j,3,0)
(k,2,0)
(l,0,1)
(m,1,0)
{color}
{code}

Note the row (j,3,0): random is 3, yet quarter is 0. The rows (g,3,0) and (l,0,1) are inconsistent in the same way. This appears to happen because r is defined in the nested foreach block and then referenced twice in the generate clause, and the two references yield different values.

Rewriting the script as follows solves the issue. Both scripts produce the same M/R jobs, so the nested form is only a matter of convenience.

{code}
register math.jar;
A = load 'rand.dat' using PigStorage() as (data);
B = foreach A generate data, math.RANDOMINT(4) as r;
C = foreach B generate data, r, ((r == 3)?1:0) as quarter;
dump C;
{code}

Is this issue related to PIG-747?

Viraj

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
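
One possible reading of the behaviour reported above, which the issue itself does not confirm, is that the alias r is bound to the expression math.RANDOMINT(4) rather than to its value, so each reference to r calls the UDF again. The Python sketch below only models that assumption; it is not Pig's implementation. The names randomint, foreach_inlined, foreach_materialized and is_consistent are hypothetical and exist only for this illustration.

```python
import random


def randomint(n, rng):
    """Hypothetical stand-in for the reporter's math.RANDOMINT(n): an int in [0, n-1]."""
    return rng.randrange(n)


def foreach_inlined(rows, rng):
    """Model of the suspected behaviour: every reference to r re-runs the UDF."""
    out = []
    for data in rows:
        def r():
            # The alias stands for the expression, so each use draws a new number.
            return randomint(4, rng)
        out.append((data, r(), 1 if r() == 3 else 0))
    return out


def foreach_materialized(rows, rng):
    """Model of the two-step workaround: r is computed once per row and reused."""
    out = []
    for data in rows:
        r = randomint(4, rng)
        out.append((data, r, 1 if r == 3 else 0))
    return out


def is_consistent(result):
    """True when quarter is 1 exactly on the rows where random equals 3."""
    return all(quarter == (1 if r == 3 else 0) for _, r, quarter in result)


rows = list("fghijklm")
inlined = foreach_inlined(rows, random.Random(0))
materialized = foreach_materialized(rows, random.Random(0))

for row in inlined:
    print(row)
print("materialized rows consistent:", is_consistent(materialized))
```

Under this model the materialized variant is always consistent, while the inlined variant can produce rows such as (j,3,0), because the value shown as random and the value tested against 3 come from two independent draws.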