Thank you, Pradeep. It worked to a certain extent, but I am having the
following difficulty separating the fields of cust_address as $0, $1, and so on.


Example -

grunt> describe A;
A: {cust_id: int,cust_name: chararray,cust_address: {innertuple:
(innerfield: chararray)},cust_email: chararray}

grunt> dump A;

(123,phil abc,{(2200),(benjamin avenue),(philadelphia)},t...@gmail.com)
(124,diego arty,{(44),(atlanta franklin),(florida)},o...@gmail.com)

grunt> B = foreach A generate FLATTEN(BagToTuple(cust_address));
grunt> dump B;
(2200,benjamin avenue,philadelphia)
(44,atlanta franklin,florida)

grunt> describe B;
B: {org.apache.pig.builtin.bagtotuple_cust_address_34::innerfield:
chararray}



I am not able to separate the fields in B as $0, $1 and $2. I tried using
STRSPLIT, but it didn't work.
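
One possible way around this, in the spirit of the custom UDF suggested in the
quoted reply below, is a small Jython UDF that declares a fixed three-field
output schema, since the schema reported by describe B carries only a single
field. This is only a sketch under assumptions: the function name
address_fields, the field names street_no, street and city, and the schema
alias addr are all hypothetical. It also assumes every address has at most
three parts and that the bag keeps the order of the Hive array, which Pig does
not guarantee for bags in general.

```python
# Pig makes the `outputSchema` decorator available when this file is
# registered as a Jython UDF. The fallback below is an assumption-free
# stand-in so the file can also be run and tested with plain Python.
try:
    outputSchema
except NameError:
    def outputSchema(schema):
        def wrap(func):
            func.outputSchema = schema  # keep the schema string for inspection
            return func
        return wrap


# Hypothetical field names; adjust to the real layout of cust_address.
@outputSchema("addr:tuple(street_no:chararray,street:chararray,city:chararray)")
def address_fields(bag):
    """Turn a bag of single-field tuples into one fixed-width tuple."""
    # A bag reaches a Jython UDF as a list of tuples; take field 0 of each.
    values = [t[0] for t in (bag or [])]
    # Pad with None or truncate so exactly three fields are returned,
    # matching the declared schema.
    values = (values + [None, None, None])[:3]
    return tuple(values)
```

Assuming the file is saved as address_udf.py (a hypothetical name), it could be
registered with REGISTER 'address_udf.py' USING jython AS addr; and called as
B = FOREACH A GENERATE FLATTEN(addr.address_fields(cust_address));
after which describe B should list three chararray fields that can be
referenced positionally or by name.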



On Mon, Jun 2, 2014 at 11:50 AM, Pradeep Gollakota <pradeep...@gmail.com>
wrote:

> There was a similar question as this on StackOverflow a while back. The
> suggestion was to write a custom BagToTuple UDF.
>
>
> http://stackoverflow.com/questions/18544602/how-to-flatten-a-group-into-a-single-tuple-in-pig
>
>
> On Mon, Jun 2, 2014 at 8:46 AM, Pradeep Gollakota <pradeep...@gmail.com>
> wrote:
>
> > Disregard last email.
> >
> > Sorry... didn't fully understand the question.
> >
> >
> > On Mon, Jun 2, 2014 at 8:44 AM, Pradeep Gollakota <pradeep...@gmail.com>
> > wrote:
> >
> > >> FOREACH A GENERATE cust_id, cust_name, FLATTEN(cust_address), cust_email;
> >>
> >> ​
> >>
> >>
> >> On Sun, Jun 1, 2014 at 5:54 PM, Rahul Channe <drah...@googlemail.com>
> >> wrote:
> >>
> >>> Hi All,
> >>>
> >>> I have imported hive table into pig having a complex data type
> >>> (ARRAY<String>). The alias in pig looks as below
> >>>
> >>> grunt> describe A;
> >>> A: {cust_id: int,cust_name: chararray,cust_address: {innertuple:
> >>> (innerfield: chararray)},cust_email: chararray}
> >>>
> >>> grunt> dump A;
> >>>
> > >>> (123,phil abc,{(2200),(benjamin avenue),(philadelphia)},t...@gmail.com)
> >>> (124,diego arty,{(44),(atlanta franklin),(florida)},o...@gmail.com)
> >>>
> >>> The cust_address is the ARRAY field from hive. I want to FLATTEN the
> >>> cust_address into different fields.
> >>>
> >>>
> >>> Expected output
> >>> (2200,benjamin avenue,philadelphia)
> >>> (44,atlanta franklin,florida)
> >>>
> >>> please help
> >>>
> >>> Regards,
> >>> Rahul
> >>>
> >>
> >>
> >
>
