You can just do:

join_pe_pre = JOIN page_events BY (day, session_id, page_seq_num) LEFT OUTER,
                   page_events_pre BY (day, session_id, page_seq_num + 1);
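
For context, here is a slightly fuller sketch of the same join. The input path, the tab delimiter, the field types, and the page_url column are assumptions made for illustration; only day, session_id, and page_seq_num come from the original query. As far as I know, an outer join in Pig needs a declared schema on the side that may produce nulls, so both loads use an AS clause.

```pig
-- Hypothetical path and schema; adjust to the real data.
page_events = LOAD 'page_events' USING PigStorage('\t')
    AS (day:chararray, session_id:chararray, page_seq_num:int, page_url:chararray);

-- Second alias over the same data, so the relation can be joined with itself.
page_events_pre = LOAD 'page_events' USING PigStorage('\t')
    AS (day:chararray, session_id:chararray, page_seq_num:int, page_url:chararray);

-- One JOIN on a composite key; the third key on the right side is an expression.
join_pe_pre = JOIN page_events BY (day, session_id, page_seq_num) LEFT OUTER,
                   page_events_pre BY (day, session_id, page_seq_num + 1);

-- After the join, fields are disambiguated with the alias:: prefix.
result = FOREACH join_pe_pre GENERATE
    page_events::day          AS day,
    page_events::session_id   AS session_id,
    page_events::page_seq_num AS page_seq_num,
    page_events::page_url     AS page_url,
    page_events_pre::page_url AS pre_page_url;
```

Rows of page_events with no match in page_events_pre are kept, with null in pre_page_url. Chaining three single-column joins would not give the same result, because each join would match on one column alone.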
Amit

On 2/16/11 2:09 PM, "sonia gehlot" <[email protected]> wrote:

> Hi All,
>
> I am new to Hadoop and I started exploring Pig last month. I have a few
> questions. I have to replicate some SQL queries in Pig that have a left
> join, for example:
>
> select blah, blah
> From
> page_events pe
> Left Join page_events pe_pre
> on pe.day = pe_pre.day
> And pe.session_id = pe_pre.session_id
> And pe.page_seq_num = pe_pre.page_seq_num + 1
>
> So I wanted to confirm if this is the right and only way to do a
> multi-column join in Pig? Or can we do this in some other way?
>
> join1_pe_pre = JOIN page_events BY day LEFT OUTER, page_events_pre BY day ;
>
> join2_pe_pre = JOIN join1_pe_pre BY session_id LEFT OUTER, page_events_pre
> BY session_id ;
>
> join3_pe_pre = JOIN join2_pe_pre BY page_seq_num LEFT OUTER, page_events_pre
> BY page_seq_num +1 ;
>
> Thanks for your help.
>
> Sonia
