Let me illustrate why outer join does not work for me.

Pig script:

    A = LOAD '$in1' USING PigStorage(',') AS (a:chararray, b:chararray, c:chararray);
    B = LOAD '$in2' USING PigStorage(',') AS (a:chararray, b:chararray, c:chararray);
    C = JOIN A BY (a,b), B BY (a,b);
    CProj = FOREACH C GENERATE A::a, A::b, A::c, B::c;
    DUMP CProj;

DataSet in1:

    a,b,c1
    a,,c2
    ,b,c3
    ,,c4

DataSet in2:

    a,b,c10
    a,,c11
    ,b,c12
    ,,c13

The inner join would produce this output:

    (a,b,c1,c10)

while a full outer join would produce this result:

    (a,b,c1,c10)
    (a,,c2,)
    (,,,c11)
    (,b,c3,)
    (,,,c12)
    (,,c4,)
    (,,,c13)

The desired output is:

    (a,b,c1,c10)
    (a,,c2,c11)
    (,b,c3,c12)
    (,,c4,c13)

On Sat, Sep 13, 2014 at 3:22 AM, Mona Chitnis <mona.chit...@yahoo.in> wrote:

> Why not use Outer Join instead?
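
For context on the results above: PigStorage loads the empty fields as nulls, and Pig's JOIN never matches a null key against another null, so neither join type pairs those rows. One possible workaround, offered only as a sketch, is to substitute a sentinel for null keys before an inner join. The aliases AK and BK, the key fields ka and kb, and the sentinel '__NULL__' are hypothetical names that are not part of the original script, and the sketch assumes the sentinel never occurs as a real key value.

```pig
A = LOAD '$in1' USING PigStorage(',') AS (a:chararray, b:chararray, c:chararray);
B = LOAD '$in2' USING PigStorage(',') AS (a:chararray, b:chararray, c:chararray);

-- Add join keys in which null is replaced by a sentinel (assumed never to
-- appear in the data); the original a and b are kept for the projection.
AK = FOREACH A GENERATE a, b, c,
         (a IS NULL ? '__NULL__' : a) AS ka,
         (b IS NULL ? '__NULL__' : b) AS kb;
BK = FOREACH B GENERATE c,
         (a IS NULL ? '__NULL__' : a) AS ka,
         (b IS NULL ? '__NULL__' : b) AS kb;

-- Inner join on the null-free keys, so rows whose original keys were
-- both null in the same positions can now match.
C = JOIN AK BY (ka, kb), BK BY (ka, kb);

-- Project the original (possibly null) key values, not the sentinel.
CProj = FOREACH C GENERATE AK::a, AK::b, AK::c, BK::c;
DUMP CProj;
```

With the two sample data sets this should pair each row of in1 with the row of in2 that has the same pattern of empty fields, which is the desired output listed above.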