Re: how to optimize multiple stores

2015-01-08 Thread Marco Cadetg
Hi Rodrigo,

Thanks for your suggestion, though I don't see how the MultiStorage UDF helps.

Register UDFs etc
A = LOAD
B = LOAD
C = LOAD
-- do lots of transformations with A and B and C get intermediate result INTER_RES
result1 = FOREACH (GROUP INTER_RES BY (...
STORE result1

left join on multiple columns

2015-01-08 Thread Patcharee Thongtra
Hi,

I am new to Pig and am using Pig version 0.12. I found unexpected behaviour with a left join on multiple columns, as listed below:

-- ... ...
dump r_four_dim1;
describe r_four_dim1;
dump result_height;
describe result_height;

how to optimize multiple stores

2015-01-08 Thread Marco Cadetg
Hi there,

I have a big Pig script that first generates an expensive intermediate result, on which I then run multiple GROUP BY statements and multiple stores. Something like this:

Register UDFs etc
A = LOAD
B = LOAD
C = LOAD
-- do lots of transformations with A and B and C get

Re: how to optimize multiple stores

2015-01-08 Thread Rodrigo Ferreira
Marco,

check out this UDF:
http://pig.apache.org/docs/r0.8.1/api/org/apache/pig/piggybank/storage/MultiStorage.html

I think it can get the job done without having to group everything.

Cheers,
Rodrigo

2015-01-08 7:27 GMT-02:00 Marco Cadetg ma...@zattoo.com:
Hi there, I've a big pig script
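
For readers unfamiliar with the MultiStorage UDF linked above, here is a minimal sketch of how it is typically invoked. It is not taken from the thread: the jar location, the input path 'inter_res', the two-column schema, and the output path 'out' are all assumptions made for illustration.

```pig
-- Sketch only: jar path, input path, schema, and output path are assumptions.
REGISTER piggybank.jar;

-- Hypothetical stand-in for the intermediate result, with the split key in field 0.
INTER_RES = LOAD 'inter_res' USING PigStorage('\t') AS (key:chararray, value:long);

-- MultiStorage(parentPath, splitFieldIndex) writes one subdirectory
-- under 'out' for each distinct value of field 0.
STORE INTER_RES INTO 'out'
    USING org.apache.pig.piggybank.storage.MultiStorage('out', '0');
```

Note that MultiStorage only partitions rows by a field value; it does not aggregate. Whether it can stand in for several GROUP BY plus STORE statements therefore depends on what each result actually computes.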

Re: left join on multiple columns

2015-01-08 Thread David Warshaw
Hi Patcharee,

I wasn't able to reproduce either issue on Pig 0.14.0.

1: --
grunt> dump join_height;
(1,1,2009,0,559,447,1,-4.964739,1,1,2009,0,559,447,1,109.71929)
grunt> describe join_height;
join_height: {r_four_dim1::date: