Re: How to FLATTEN hive column in Pig with ARRAY data type

2014-06-02 Thread Pradeep Gollakota
FOREACH A GENERATE cust_id, cust_name, FLATTEN(cust_address), cust_email; On Sun, Jun 1, 2014 at 5:54 PM, Rahul Channe drah...@googlemail.com wrote: Hi All, I have imported hive table into pig having a complex data type (ARRAY<String>). The alias in pig looks as below grunt> describe A;

Re: How to FLATTEN hive column in Pig with ARRAY data type

2014-06-02 Thread Pradeep Gollakota
Disregard last email. Sorry... didn't fully understand the question. On Mon, Jun 2, 2014 at 8:44 AM, Pradeep Gollakota pradeep...@gmail.com wrote: FOREACH A GENERATE cust_id, cust_name, FLATTEN(cust_address), cust_email; On Sun, Jun 1, 2014 at 5:54 PM, Rahul Channe

Re: How to FLATTEN hive column in Pig with ARRAY data type

2014-06-02 Thread Pradeep Gollakota
There was a similar question to this on StackOverflow a while back. The suggestion was to write a custom BagToTuple UDF. http://stackoverflow.com/questions/18544602/how-to-flatten-a-group-into-a-single-tuple-in-pig On Mon, Jun 2, 2014 at 8:46 AM, Pradeep Gollakota pradeep...@gmail.com wrote:

Re: How to FLATTEN hive column in Pig with ARRAY data type

2014-06-02 Thread Rahul Channe
Thank you Pradeep, it worked to a certain extent, but I am having the following difficulty in separating fields as $0,$1 for the customer_address. Example - grunt> describe A; A: {cust_id: int,cust_name: chararray,cust_address: {innertuple: (innerfield: chararray)},cust_email: chararray} grunt> dump A;

Re: How to FLATTEN hive column in Pig with ARRAY data type

2014-06-02 Thread Pradeep Gollakota
If you're using the built-in BagToTuple UDF, then you probably don't need the FLATTEN operator. I suspect that your output looks as follows: 2200 benjamin avenue philadelphia ... Can you confirm that this is what you're seeing? On Mon, Jun 2, 2014 at 9:52 AM, Rahul Channe

Re: How to FLATTEN hive column in Pig with ARRAY data type

2014-06-02 Thread Rahul Channe
grunt> B = foreach A generate BagToTuple(cust_address); grunt> describe B; B: {org.apache.pig.builtin.bagtotuple_cust_address_24: (innerfield: chararray)} grunt> dump B; ((2200,benjamin franklin,philadelphia)) ((44,atlanta franklin,florida)) On Mon, Jun 2, 2014 at 12:59 PM, Pradeep Gollakota

Re: How to FLATTEN hive column in Pig with ARRAY data type

2014-06-02 Thread Rahul Channe
I tried changing the hive column datatype from ARRAY to STRUCT for cust_address, then I imported the table in pig. Now I am able to separate the fields, as below grunt> Z = load 'cust_info' using org.apache.hcatalog.pig.HCatLoader(); grunt> describe Z; Z: {cust_id: int,cust_name:
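
For readers following the STRUCT approach described in this message, here is a minimal sketch of how the fields might be separated. The struct field names (houseno, street, city) and the Hive DDL in the comments are assumptions made for illustration; the thread does not show the actual table definition. The sketch relies on HCatLoader presenting a Hive STRUCT to Pig as a tuple.

```pig
-- Hypothetical Hive table (field names are assumed, not taken from the thread):
--   CREATE TABLE cust_info (
--     cust_id      INT,
--     cust_name    STRING,
--     cust_address STRUCT<houseno:STRING, street:STRING, city:STRING>,
--     cust_email   STRING);

Z = LOAD 'cust_info' USING org.apache.hcatalog.pig.HCatLoader();

-- A STRUCT column is loaded as a tuple, so individual fields can be
-- projected by name (or by position, e.g. cust_address.$0).
by_name = FOREACH Z GENERATE cust_id,
                             cust_address.street AS street,
                             cust_address.city   AS city;

-- FLATTEN on a tuple promotes its fields to top-level columns of the
-- same row, which is the "separate columns" result asked for here.
flat = FOREACH Z GENERATE cust_id, cust_name, FLATTEN(cust_address), cust_email;
```

By contrast, a Hive ARRAY is loaded as a bag of single-field tuples, so FLATTEN on it produces one output row per array element rather than one column per element. That difference is why switching the column to STRUCT makes positional or named access straightforward.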

Re: How to FLATTEN hive column in Pig with ARRAY data type

2014-06-02 Thread Pradeep Gollakota
Awesome... that's the way I would have done it as well. On Mon, Jun 2, 2014 at 10:14 AM, Rahul Channe drah...@googlemail.com wrote: I tried changing the hive column datatype from ARRAY to STRUCT for cust_address, then I imported the table in pig. Now I am able to separate the fields, as

How to FLATTEN hive column in Pig with ARRAY data type

2014-06-01 Thread Rahul Channe
Hi All, I have imported hive table into pig having a complex data type (ARRAY<String>). The alias in pig looks as below grunt> describe A; A: {cust_id: int,cust_name: chararray,cust_address: {innertuple: (innerfield: chararray)},cust_email: chararray} grunt> dump A; (123,phil abc,{(2200),(benjamin