Thanks for your input.

I wrote a UDF and used it together with the YES_MULTILINE argument to
piggybank's CSVExcelStorage() that I was already passing, and it worked
perfectly. The affected column now comes out wrapped in an extra pair of
parentheses, but that's OK and I can deal with it.

Here is how I did it:

A = LOAD '/path/to/file/location/filename.csv' USING
    org.apache.pig.piggybank.storage.CSVExcelStorage(',', 'YES_MULTILINE', 'UNIX',
    'SKIP_INPUT_HEADER');
-- clean is the UDF that replaces the \n character in the string
B = FOREACH A GENERATE $0, $1, $2, clean($3);
STORE B INTO '/desired/file/location' USING PigStorage('\t');
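The body of the clean UDF is not shown in this message. As a rough illustration only, a function with that job might look like the Python sketch below. The replacement character (a single space), the handling of '\r', and the null handling are all assumptions rather than the actual UDF used here. To call something like it from Pig, it would still have to be registered as a UDF (for example as a Jython script with an output schema), which is not shown.

```python
def clean(value, replacement=' '):
    """Hypothetical sketch: replace embedded line breaks in a field value."""
    if value is None:
        # Assumption: a null field should stay null rather than become ''.
        return None
    # Replace '\r\n' first so a Windows line break becomes one replacement,
    # then handle any remaining bare '\r' or '\n'.
    return (value.replace('\r\n', replacement)
                 .replace('\r', replacement)
                 .replace('\n', replacement))


if __name__ == '__main__':
    # A field that spans two lines in the CSV ends up on a single line.
    print(clean('first line\nsecond line'))
```

Replacing with a space rather than an empty string keeps words on either side of the line break from running together; which one is right depends on the data.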


Sunilmanohar Kancharlapalli
Engineer - IT
[email protected]
Cisco Systems Limited
US
Cisco.com

Think before you print.
This email may contain confidential and privileged material for the sole use of 
the intended recipient. Any review, use, distribution or disclosure by others 
is strictly prohibited. If you are not the intended recipient (or authorized to 
receive for the recipient), please contact the sender by reply email and delete 
all copies of this message.
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/index.html



-----Original Message-----
From: Divya Gehlot [mailto:[email protected]] 
Sent: Wednesday, July 22, 2015 11:49 PM
To: [email protected]
Subject: Re: Loading data from a CSV file which has '\n' character in a field

You can try this:
http://pig.apache.org/docs/r0.7.0/udf.html#Load%2FStore+Functions


On 23 July 2015 at 09:24, Sunilmanohar Kancharlapalli -X (sunkanch - ZENSAR 
TECHNOLOGIES INC at Cisco) <[email protected]> wrote:

>  I am trying to load a CSV file that has a '\n' character inside a field,
> and Pig treats it as the start of a new record. I lose the data in
> that particular column and get additional records in the output table.
>
>
>
> I am using
>
>     d = LOAD '/location/of/the/file/name_of_the_fiel.csv' USING
>         org.apache.pig.piggybank.storage.CSVExcelStorage(',', 'YES_MULTILINE',
>         'UNIX', 'SKIP_INPUT_HEADER');
>
> to allow multi-line fields, but I still face the same issue: the data
> shifts into the next row.
>
>
>
> I would appreciate any help.
>
>
>
>
>
> Thanks
>
> Sunil Kancharlapalli
>
>
>