If I am not wrong, PIG-1705 is about conflicting aliases in a join. It would be interesting to see how that applies to Jay Hacker's issue, where there is no alias reuse from what I saw.


Regards,
Mridul

On Tuesday 19 April 2011 03:11 AM, Daniel Dai wrote:
I believe it is PIG-1705.

Daniel

On 04/18/2011 12:02 PM, Jay Hacker wrote:
Thanks.  Which Jira issue number is it?



On Fri, Apr 15, 2011 at 9:07 PM, Daniel Dai <jiany...@yahoo-inc.com> wrote:
This is a known bug, and it is fixed on the 0.8 svn branch. You can check it out from
http://svn.apache.org/repos/asf/pig/branches/branch-0.8, or wait for 0.8.1,
which is coming in a few days.

Daniel

On 04/15/2011 01:45 PM, Jay Hacker wrote:
I'm trying to replace a couple of fields in a relation with values
looked up in another relation.  Here's an example; let's say I have a
relation mapping each integer to its square:

-----map.txt-----
1    1
2    4
3    9
4    16
5    25

Then I have some data, let's call the columns a and b:

-----data.txt-----
1    2
3    4
5    2

I want to replace each number in the data with its square.  My basic
approach is to join 'a' with the key, then generate the value; then
join 'b' with the key, and generate that value. Here's my pig script:

m = load 'map.txt' as (k,v);
data = load 'data.txt' as (a,b);
x = join m by k, data by a;
y = foreach x generate v as aa, b;
z = join m by k, y by b;
w = foreach z generate aa, v as bb;
dump w;

This outputs:

(4,4)
(4,4)
(16,16)

The problem is that y's version of v gets replaced with w's version.  I
expect it to output:

(1, 4)
(9, 16)
(25, 4)
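
For reference, the intended two-step lookup can be modeled outside Pig. This is only a minimal Python sketch of the semantics the script is aiming for, not Pig code and not a fix for the bug. The names square_map, rows, and lookup_both are made up for illustration, and the data is copied from map.txt and data.txt above.

```python
# Hypothetical stand-in for map.txt: key -> value (each integer to its square).
square_map = {1: 1, 2: 4, 3: 9, 4: 16, 5: 25}

# Hypothetical stand-in for data.txt: columns a and b.
rows = [(1, 2), (3, 4), (5, 2)]


def lookup_both(rows, mapping):
    """Replace a and b with their looked-up values.

    Rows whose a or b has no entry in the mapping are dropped,
    mirroring what an inner join on the key would do.
    """
    return [
        (mapping[a], mapping[b])
        for a, b in rows
        if a in mapping and b in mapping
    ]


result = lookup_both(rows, square_map)
print(result)  # [(1, 4), (9, 16), (25, 4)]
```

Each column is looked up independently, so the value found for a cannot be overwritten by the value found for b, which is the behavior the two joins are meant to have.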

What's weird is I'm pretty sure this used to work in Pig 0.7.  If
there's a better way to do this (using maps?), please let me know.
I'm using Pig 0.8 with Cloudera CDH3b4.

Thanks.


