Re: About Multiple Join in Pig

Daniel Dai Tue, 01 Nov 2016 10:40:43 -0700

Hi, Mingda,

Pig does not do join reordering; it executes the query the way it is 
written. Note that you can join multiple relations in one join statement.
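
As a rough sketch of both forms (the paths, delimiter and trimmed-down 
schemas below are assumptions for illustration, not your real tables): a 
single JOIN statement takes one key per relation, so a join that uses 
different keys at each step is written as separate statements, which run in 
the order given.

```pig
-- Hypothetical, simplified inputs: paths, delimiter and schemas are assumptions.
catalog_sales   = LOAD 'catalog_sales'   USING PigStorage('|')
                  AS (cs_item_sk:long, cs_order_number:long);
inventory       = LOAD 'inventory'       USING PigStorage('|')
                  AS (inv_item_sk:long, inv_quantity_on_hand:int);
catalog_returns = LOAD 'catalog_returns' USING PigStorage('|')
                  AS (cr_item_sk:long, cr_order_number:long);

-- One statement, three relations: each relation contributes one key.
same_key = JOIN catalog_sales BY cs_item_sk,
                inventory BY inv_item_sk,
                catalog_returns BY cr_item_sk;

-- Different keys per step: one JOIN statement per step, in the order written.
cs_inv = JOIN catalog_sales BY cs_item_sk, inventory BY inv_item_sk;
res    = JOIN cs_inv BY (cs_item_sk, cs_order_number),
              catalog_returns BY (cr_item_sk, cr_order_number);
```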

Do you want the execution time for each join in your statement? Assuming you 
are using the regular join and running on MapReduce, every join statement 
becomes a separate MapReduce job, and the join runtime is the runtime of 
that MapReduce job.
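
A minimal sketch of how to look at this, again with made-up paths and 
schemas: EXPLAIN prints the plans (including the MapReduce plan) without 
running anything, and Pig only launches jobs once it reaches a STORE or 
DUMP. If a script returns almost immediately, it may be that no job has 
actually run yet.

```pig
-- Hypothetical, simplified inputs: paths, delimiter and schemas are assumptions.
catalog_sales = LOAD 'catalog_sales' USING PigStorage('|')
                AS (cs_item_sk:long, cs_order_number:long);
inventory     = LOAD 'inventory'     USING PigStorage('|')
                AS (inv_item_sk:long, inv_quantity_on_hand:int);

cs_inv = JOIN catalog_sales BY cs_item_sk, inventory BY inv_item_sk;

-- Show the logical, physical and MapReduce plans without executing them.
EXPLAIN cs_inv;

-- Writing the result is what actually triggers the MapReduce job.
-- 'cs_inv_out' is a hypothetical output path.
STORE cs_inv INTO 'cs_inv_out';
```

Once the STORE has run, the per-job runtime should be visible in the job 
summary Pig prints at the end and in the Hadoop job history.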

Thanks,
Daniel

On 10/31/16, 8:21 PM, "mingda li" <limingda1...@gmail.com> wrote:

>Dear all,
>
>I am working on optimizing multiple joins. I am not sure whether Pig can
>decide the join order in its optimization layer. Does anyone know about
>this? Or does Pig just execute the query the way it is written?
>
>Also, I want to do a multi-way join on different keys. Can the
>following query work?
>
>Res =
>JOIN
>(JOIN catalog_sales BY cs_item_sk, inventory BY  inv_item_sk) BY
>(cs_item_sk, cs_order_number), catalog_returns BY (cr_item_sk,
>cr_order_number);
>
>BTW, each time I run the query, it finishes in one second. Is there a
>way to see the execution time? I have set pig.udf.profile=true. Where
>can I find the time?
>
>Bests,
>Mingda
