If you're using the built-in BagToTuple UDF, then you probably don't need
the FLATTEN operator.

I suspect that your output looks as follows:

2200
benjamin avenue
philadelphia
...

Can you confirm that this is what you're seeing?
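
In case it helps, here is a minimal sketch of what I mean by calling
BagToTuple without FLATTEN. It assumes A has the schema from your
describe output. The alias name "address" is one I made up for
illustration, and I have not run this against your data.

```pig
-- Assumption: A is the relation from the quoted "describe A" output:
-- A: {cust_id: int, cust_name: chararray,
--     cust_address: {innertuple: (innerfield: chararray)},
--     cust_email: chararray}

-- BagToTuple collapses the bag into one tuple. Without FLATTEN, that
-- tuple stays nested as a single field ("address" is a hypothetical name).
B = FOREACH A GENERATE cust_id, BagToTuple(cust_address) AS address;

-- Inspect the schema and the rows to compare against the FLATTEN version.
DESCRIBE B;
DUMP B;
```

If the address then shows up as one nested tuple per row, the remaining
problem is the schema: your describe B still reports a single innerfield,
presumably because Pig cannot know in advance how many elements the bag
holds.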


On Mon, Jun 2, 2014 at 9:52 AM, Rahul Channe <drah...@googlemail.com> wrote:

> Thank you Pradeep, it worked to a certain extent, but I am having the
> following difficulty separating the fields as $0, $1 for cust_address.
>
>
> Example -
>
> grunt> describe A;
> A: {cust_id: int,cust_name: chararray,cust_address: {innertuple:
> (innerfield: chararray)},cust_email: chararray}
>
> grunt> dump A;
>
> (123,phil abc,{(2200),(benjamin avenue),(philadelphia)},t...@gmail.com)
> (124,diego arty,{(44),(atlanta franklin),(florida)},o...@gmail.com)
>
> grunt> B = foreach A generate FLATTEN(BagToTuple(cust_address));
> grunt> dump B;
> (2200,benjamin avenue,philadelphia)
> (44,atlanta franklin,florida)
>
> grunt> describe B;
> B: {org.apache.pig.builtin.bagtotuple_cust_address_34::innerfield:
> chararray}
>
>
>
> I am not able to separate the fields in B as $0, $1 and $2. I tried using
> STRSPLIT, but it didn't work.
>
>
>
> On Mon, Jun 2, 2014 at 11:50 AM, Pradeep Gollakota <pradeep...@gmail.com>
> wrote:
>
> > There was a similar question to this one on Stack Overflow a while back.
> > The suggestion was to write a custom BagToTuple UDF.
> >
> >
> >
> > http://stackoverflow.com/questions/18544602/how-to-flatten-a-group-into-a-single-tuple-in-pig
> >
> >
> > On Mon, Jun 2, 2014 at 8:46 AM, Pradeep Gollakota <pradeep...@gmail.com>
> > wrote:
> >
> > > Disregard last email.
> > >
> > > Sorry... didn't fully understand the question.
> > >
> > >
> > > On Mon, Jun 2, 2014 at 8:44 AM, Pradeep Gollakota <pradeep...@gmail.com>
> > > wrote:
> > >
> > >> FOREACH A GENERATE cust_id, cust_name, FLATTEN(cust_address), cust_email;
> > >>
> > >>
> > >>
> > >> On Sun, Jun 1, 2014 at 5:54 PM, Rahul Channe <drah...@googlemail.com>
> > >> wrote:
> > >>
> > >>> Hi All,
> > >>>
> > >>> I have imported a Hive table with a complex data type
> > >>> (ARRAY<String>) into Pig. The alias in Pig looks as below:
> > >>>
> > >>> grunt> describe A;
> > >>> A: {cust_id: int,cust_name: chararray,cust_address: {innertuple:
> > >>> (innerfield: chararray)},cust_email: chararray}
> > >>>
> > >>> grunt> dump A;
> > >>>
> > >>> (123,phil abc,{(2200),(benjamin avenue),(philadelphia)},t...@gmail.com)
> > >>> (124,diego arty,{(44),(atlanta franklin),(florida)},o...@gmail.com)
> > >>>
> > >>> cust_address is the ARRAY field from Hive. I want to FLATTEN
> > >>> cust_address into separate fields.
> > >>>
> > >>>
> > >>> Expected output
> > >>> (2200,benjamin avenue,philadelphia)
> > >>> (44,atlanta franklin,florida)
> > >>>
> > >>> please help
> > >>>
> > >>> Regards,
> > >>> Rahul
> > >>>
> > >>
> > >>
> > >
> >
>
