grunt> B = foreach A generate BagToTuple(cust_address);

grunt> describe B;
B: {org.apache.pig.builtin.bagtotuple_cust_address_24: (innerfield: chararray)}

grunt> dump B;
((2200,benjamin avenue,philadelphia))
((44,atlanta franklin,florida))
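
For reference, the custom-UDF route suggested further down the thread could look roughly like the sketch below. It is a Python (Jython) UDF rather than the built-in BagToTuple, and it assumes every cust_address bag holds at most three single-field tuples. The file name address_udf.py, the function name address_to_tuple, the width of 3, and the output field names addr_0..addr_2 are all hypothetical; the outputSchema fallback is only there so the file can also be loaded outside Pig.

```python
# address_udf.py (hypothetical file name)
# Sketch of a UDF that turns a bag of single-field tuples into one
# fixed-width tuple with a declared schema, so each element can be
# addressed by name or position after FLATTEN.

try:
    outputSchema  # provided by Pig's Jython script engine
except NameError:
    # Fallback so the module also loads outside Pig (e.g. for testing).
    def outputSchema(schema):
        def wrap(func):
            return func
        return wrap

# Assumption: an address never has more than three parts.
WIDTH = 3


@outputSchema("addr:tuple(addr_0:chararray,addr_1:chararray,addr_2:chararray)")
def address_to_tuple(bag):
    """Return the first field of each inner tuple as one tuple of WIDTH items.

    Short bags are padded with None (null in Pig); extra items are dropped.
    """
    if bag is None:
        return None
    values = [t[0] for t in bag if t is not None and len(t) > 0]
    values = values[:WIDTH] + [None] * (WIDTH - len(values))
    return tuple(values)
```

If the file is registered with something like REGISTER 'address_udf.py' USING jython AS addr; then FLATTEN(addr.address_to_tuple(cust_address)) should produce three chararray fields that can be referenced as $0, $1 and $2. This is untested against the data in this thread.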




On Mon, Jun 2, 2014 at 12:59 PM, Pradeep Gollakota <pradeep...@gmail.com>
wrote:

> If you're using the built-in BagToTuple UDF, then you probably don't need
> the FLATTEN operator.
>
> I suspect that your output looks as follows:
>
> 2200
> benjamin avenue
> philadelphia
> ...
>
> Can you confirm that this is what you're seeing?
>
>
> On Mon, Jun 2, 2014 at 9:52 AM, Rahul Channe <drah...@googlemail.com>
> wrote:
>
> > Thank you Pradeep, it worked to a certain extent, but I am having
> > difficulty separating the fields of cust_address as $0, $1.
> >
> >
> > Example -
> >
> > grunt> describe A;
> > A: {cust_id: int,cust_name: chararray,cust_address: {innertuple:
> > (innerfield: chararray)},cust_email: chararray}
> >
> > grunt> dump A;
> >
> > (123,phil abc,{(2200),(benjamin avenue),(philadelphia)},t...@gmail.com)
> > (124,diego arty,{(44),(atlanta franklin),(florida)},o...@gmail.com)
> >
> > grunt> B = foreach A generate FLATTEN(BagToTuple(cust_address));
> > grunt> dump B;
> > (2200,benjamin avenue,philadelphia)
> > (44,atlanta franklin,florida)
> >
> > grunt> describe B;
> > B: {org.apache.pig.builtin.bagtotuple_cust_address_34::innerfield:
> > chararray}
> >
> >
> >
> > I am not able to separate the fields in B as $0, $1 and $2. I tried using
> > STRSPLIT but it didn't work.
> >
> >
> >
> > On Mon, Jun 2, 2014 at 11:50 AM, Pradeep Gollakota <pradeep...@gmail.com>
> > wrote:
> >
> > > There was a similar question as this on StackOverflow a while back. The
> > > suggestion was to write a custom BagToTuple UDF.
> > >
> > > http://stackoverflow.com/questions/18544602/how-to-flatten-a-group-into-a-single-tuple-in-pig
> > >
> > >
> > > On Mon, Jun 2, 2014 at 8:46 AM, Pradeep Gollakota <pradeep...@gmail.com>
> > > wrote:
> > >
> > > > Disregard last email.
> > > >
> > > > Sorry... didn't fully understand the question.
> > > >
> > > >
> > > > On Mon, Jun 2, 2014 at 8:44 AM, Pradeep Gollakota <pradeep...@gmail.com>
> > > > wrote:
> > > >
> > > >> FOREACH A GENERATE cust_id, cust_name, FLATTEN(cust_address), cust_email;
> > > >>
> > > >>
> > > >>
> > > >> On Sun, Jun 1, 2014 at 5:54 PM, Rahul Channe <drah...@googlemail.com>
> > > >> wrote:
> > > >>
> > > >>> Hi All,
> > > >>>
> > > >>> I have imported a Hive table into Pig that has a complex data type
> > > >>> (ARRAY<String>). The alias in Pig looks like this:
> > > >>>
> > > >>> grunt> describe A;
> > > >>> A: {cust_id: int,cust_name: chararray,cust_address: {innertuple:
> > > >>> (innerfield: chararray)},cust_email: chararray}
> > > >>>
> > > >>> grunt> dump A;
> > > >>>
> > > >>> (123,phil abc,{(2200),(benjamin avenue),(philadelphia)},t...@gmail.com)
> > > >>> (124,diego arty,{(44),(atlanta franklin),(florida)},o...@gmail.com)
> > > >>>
> > > >>> The cust_address is the ARRAY field from hive. I want to FLATTEN the
> > > >>> cust_address into different fields.
> > > >>>
> > > >>>
> > > >>> Expected output
> > > >>> (2200,benjamin avenue,philadelphia)
> > > >>> (44,atlanta franklin,florida)
> > > >>>
> > > >>> please help
> > > >>>
> > > >>> Regards,
> > > >>> Rahul
> > > >>>
> > > >>
> > > >>
> > > >
> > >
> >
>
