Re: join result dataset bigger than before

2012-06-26 Thread Marco Cadetg
Hrm, this is obviously my bad. The right dataset simply had duplicate keys... Sorry if anyone took the time to read the garbage.

Cheers,
-Marco

On Tue, Jun 26, 2012 at 3:35 PM, Marco Cadetg wrote:
> Hi there,
>
> I'm doing a join like this:
>
> A = LOAD '/data/sessions' USING PigStorag
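For anyone hitting the same symptom, here is a minimal sketch of why duplicate keys on one side make a join result larger than the input. It is plain Python rather than Pig, the toy data is hypothetical (not taken from the thread), and the helper names `inner_join_count` and `duplicated_keys` are made up for illustration.

```python
from collections import Counter


def inner_join_count(left_keys, right_keys):
    """Number of rows an inner equi-join produces for the given key columns."""
    right_counts = Counter(right_keys)
    # Each left row is emitted once per matching right row.
    return sum(right_counts[k] for k in left_keys)


def duplicated_keys(keys):
    """Keys that occur more than once, mapped to their occurrence counts."""
    return {k: n for k, n in Counter(keys).items() if n > 1}


# Hypothetical toy data: three session rows, four userdb rows.
sessions_userids = ["u1", "u2", "u3"]
userdb_uids = ["u1", "u1", "u2", "u3"]  # "u1" appears twice

joined_rows = inner_join_count(sessions_userids, userdb_uids)
dups = duplicated_keys(userdb_uids)

print(joined_rows)  # more rows than the 3 on the left side
print(dups)         # the key(s) responsible for the extra rows
```

The same check can be done in Pig by grouping the right relation on its key and filtering for groups with a count greater than one.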

join result dataset bigger than before

2012-06-26 Thread Marco Cadetg
Hi there,

I'm doing a join like this:

A = LOAD '/data/sessions' USING PigStorage(',') AS (userid:chararray, client_type:chararray, flag:long);
A1 = GROUP A ALL;
A1 = FOREACH A1 GENERATE COUNT(A);
DUMP A1
(543872)
B = LOAD '/data/userdb' USING PigStorage(',') AS (uid:chararray, bi