FOREACH A GENERATE cust_id, cust_name, FLATTEN(cust_address), cust_email;
On Sun, Jun 1, 2014 at 5:54 PM, Rahul Channe drah...@googlemail.com wrote:
Hi All,
I have imported a Hive table into Pig that has a complex data type
(ARRAY<STRING>). The alias in Pig looks as below:
grunt> describe A;
Disregard last email.
Sorry... didn't fully understand the question.
On Mon, Jun 2, 2014 at 8:44 AM, Pradeep Gollakota pradeep...@gmail.com
wrote:
FOREACH A GENERATE cust_id, cust_name, FLATTEN(cust_address), cust_email;
On Sun, Jun 1, 2014 at 5:54 PM, Rahul Channe
There was a similar question to this on Stack Overflow a while back. The
suggestion was to write a custom BagToTuple UDF.
http://stackoverflow.com/questions/18544602/how-to-flatten-a-group-into-a-single-tuple-in-pig
On Mon, Jun 2, 2014 at 8:46 AM, Pradeep Gollakota pradeep...@gmail.com
wrote:
Thank you Pradeep, it worked to a certain extent, but I am having the
following difficulty separating the fields of cust_address as $0, $1.
Example -
grunt> describe A;
A: {cust_id: int,cust_name: chararray,cust_address: {innertuple: (innerfield: chararray)},cust_email: chararray}
grunt> dump A;
If you're using the built-in BagToTuple UDF, then you probably don't need
the FLATTEN operator.
I suspect that your output looks as follows:
2200
benjamin avenue
philadelphia
...
Can you confirm that this is what you're seeing?
On Mon, Jun 2, 2014 at 9:52 AM, Rahul Channe
grunt> B = foreach A generate BagToTuple(cust_address);
grunt> describe B;
B: {org.apache.pig.builtin.bagtotuple_cust_address_24: (innerfield: chararray)}
grunt> dump B;
((2200,benjamin franklin,philadelphia))
((44,atlanta franklin,florida))
On Mon, Jun 2, 2014 at 12:59 PM, Pradeep Gollakota
I tried changing the Hive column data type from ARRAY to STRUCT for
cust_address, then I imported the table into Pig.
Now I am able to separate the fields, as below:
grunt> Z = load 'cust_info' using org.apache.hcatalog.pig.HCatLoader();
grunt> describe Z;
Z: {cust_id: int,cust_name:
Awesome... that's the way I would have done it as well.
On Mon, Jun 2, 2014 at 10:14 AM, Rahul Channe drah...@googlemail.com
wrote:
I tried changing the Hive column data type from ARRAY to STRUCT for
cust_address, then I imported the table into Pig.
Now I am able to separate the fields, as
Hi All,
I have imported a Hive table into Pig that has a complex data type
(ARRAY<STRING>). The alias in Pig looks as below:
grunt> describe A;
A: {cust_id: int,cust_name: chararray,cust_address: {innertuple: (innerfield: chararray)},cust_email: chararray}
grunt> dump A;
(123,phil abc,{(2200),(benjamin