Dear all,

I want to compare the efficiency of different multi-way join orders.
However, since Pig executes queries lazily, I need to use DUMP or STORE to
trigger the actual execution.

Currently, I use the following query to test the efficiency:

Bad_OrderIn = JOIN inventory BY inv_item_sk, catalog_sales BY cs_item_sk;
Bad_OrderRes = JOIN Bad_OrderIn BY (cs_item_sk, cs_order_number),
               catalog_returns BY (cr_item_sk, cr_order_number);
limit_data = LIMIT Bad_OrderRes 4;
DUMP limit_data;

Do you think it is OK to show just 4 of the results? Would the execution
time of this query represent the efficiency of the multi-way join? I am not
sure whether Pig will fetch just 4 rows and stop without processing the rest.
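
In case the LIMIT does allow Pig to stop early, one alternative I could try
is to count the rows of the full join result, so that every joined row has
to be produced before anything is printed. This is only a minimal sketch: it
assumes inventory, catalog_sales and catalog_returns are already loaded with
the field names used above, and the aliases all_rows and row_count are just
illustrative names I made up.

```pig
-- Assumes inventory, catalog_sales and catalog_returns are already loaded.
Bad_OrderIn  = JOIN inventory BY inv_item_sk, catalog_sales BY cs_item_sk;
Bad_OrderRes = JOIN Bad_OrderIn BY (cs_item_sk, cs_order_number),
               catalog_returns BY (cr_item_sk, cr_order_number);

-- Group the whole join result into a single bag (hypothetical alias).
all_rows  = GROUP Bad_OrderRes ALL;

-- COUNT_STAR counts every tuple in the bag, including ones with null fields.
row_count = FOREACH all_rows GENERATE COUNT_STAR(Bad_OrderRes);

-- Only one row is printed, but it depends on the complete join output.
DUMP row_count;
```

Another option would be to STORE Bad_OrderRes INTO some output directory
instead of dumping it, although the measured time would then also include
the cost of writing the full result.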

Best,
Mingda
