You can just do:

join_pe_pre = JOIN page_events BY (day, session_id, page_seq_num) LEFT
OUTER, page_events_pre BY (day, session_id, page_seq_num + 1);

Amit


On 2/16/11 2:09 PM, "sonia gehlot" <[email protected]> wrote:

> Hi All,
> 
> I am new to Hadoop and I started exploring Pig since last month. I have few
> question I have to replicate some SQL query to Pig that has left join for
> example:
> 
> select blah, blah
> From
> page_events pe
> Left Join page_events pe_pre
> on pe.day = pe_pre.day
> And pe.session_id = pe_pre.session_id
> And pe.page_seq_num = pe_pre.page_seq_num + 1
> 
> So I wanted to confirm is this is the right and only way to do multi column
> join in Pig? Or we can do this in some other way?
> 
> join1_pe_pre = JOIN page_events BY day LEFT OUTER,  page_events_pre BY day ;
> 
> join2_pe_pre = JOIN join1_pe_pre BY session_id LEFT OUTER, page_events_pre
> BY session_id ;
> 
> join3_pe_pre = JOIN join2_pe_pre BY page_seq_num LEFT OUTER, page_events_pre
> BY page_seq_num +1 ;
> 
> Thanks for your help.
> 
> Sonia

Reply via email to