Good to hear that my inputs helped you.
All the best!
On Jul 24, 2015 1:34 AM, "Sunilmanohar Kancharlapalli -X (sunkanch - ZENSAR
TECHNOLOGIES INC at Cisco)" <[email protected]> wrote:
> Thanks for your inputs.
>
> I have written a UDF, which I use in addition to the previous
> YES_MULTILINE argument to piggybank's CSVExcelStorage(), and it worked
> perfectly. That particular column now has an extra pair of parentheses
> around it, but that’s OK and I can deal with it.
>
> Here is how I did it:
>
> A = LOAD '/path/to/file/location/filename.csv' USING
> org.apache.pig.piggybank.storage.CSVExcelStorage(',', 'YES_MULTILINE',
> 'UNIX', 'SKIP_INPUT_HEADER');
> -- clean is the UDF that replaces the \n character in the string
> B = FOREACH A GENERATE $0, $1, $2, clean($3);
> STORE B INTO '/desired/file/location' USING PigStorage('\t');
>
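For anyone reading this thread later: the source of the clean UDF is not shown above, so here is only a rough sketch of the same idea in plain Python, not the actual UDF. It parses CSV text in which a quoted field spans two physical lines, skips the header, and replaces the embedded line breaks in the fourth column. The sample data, the replacement character (a space), and the helper name load_and_clean are all assumptions made up for illustration.

```python
import csv
import io


def clean(value, replacement=" "):
    """Replace embedded line breaks in a field with `replacement`.

    Hypothetical stand-in for the thread's `clean` UDF; None passes through.
    """
    if value is None:
        return None
    # Handle \r\n first so a Windows line break becomes one replacement.
    return (
        value.replace("\r\n", replacement)
        .replace("\n", replacement)
        .replace("\r", replacement)
    )


# Made-up sample: the fourth column of the first record contains a line
# break inside a quoted field, so the record spans two physical lines.
SAMPLE = (
    "id,name,city,notes\n"
    '1,Ann,Pune,"line one\nline two"\n'
    "2,Bob,Austin,ok\n"
)


def load_and_clean(text):
    """Parse CSV text, skip the header, and clean the fourth column."""
    reader = csv.reader(io.StringIO(text, newline=""))
    next(reader)  # skip the header row, similar in spirit to SKIP_INPUT_HEADER
    return [row[:3] + [clean(row[3])] for row in reader]


if __name__ == "__main__":
    # Write tab-separated output, one record per line.
    for row in load_and_clean(SAMPLE):
        print("\t".join(row))
```

The point of the sketch is only that the line break has to be handled in two places: the reader must treat the quoted field as one record, and the field must be cleaned before it is written to a line-oriented, tab-separated output.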
>
> Sunilmanohar Kancharlapalli
> Engineer - IT
> [email protected]
> Phone:
> Cisco Systems Limited
> US
> Cisco.com
>
> Think before you print.
> This email may contain confidential and privileged material for the sole
> use of the intended recipient. Any review, use, distribution or disclosure
> by others is strictly prohibited. If you are not the intended recipient (or
> authorized to receive for the recipient), please contact the sender by
> reply email and delete all copies of this message.
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/index.html
>
>
>
> -----Original Message-----
> From: Divya Gehlot [mailto:[email protected]]
> Sent: Wednesday, July 22, 2015 11:49 PM
> To: [email protected]
> Subject: Re: Loading data from a CSV file which has '\n' character in a
> field
>
> You can try this:
> http://pig.apache.org/docs/r0.7.0/udf.html#Load%2FStore+Functions
>
>
> On 23 July 2015 at 09:24, Sunilmanohar Kancharlapalli -X (sunkanch -
> ZENSAR TECHNOLOGIES INC at Cisco) <[email protected]> wrote:
>
> > I am trying to load a CSV file that has a ‘\n’ character in a field,
> > and Pig treats it as the start of a new record. I am losing the data in
> > that column and getting additional records in the output table.
> >
> > I am using
> >
> > d = LOAD '/location/of/the/file/name_of_the_file.csv' USING
> > org.apache.pig.piggybank.storage.CSVExcelStorage(',', 'YES_MULTILINE',
> > 'UNIX', 'SKIP_INPUT_HEADER');
> >
> > to allow multi-line fields, but I still face the same issue: the data
> > shifts into the next row.
> >
> >
> >
> > Appreciate any help.
> >
> >
> >
> >
> >
> > Thanks
> >
> > Sunil Kancharlapalli
>