Pig has CSVExcelStorage [1] and CSVLoader [2] in PiggyBank. Both parse quoted fields that contain commas, so they may help here.
[1] http://pig.apache.org/docs/r0.9.2/api/org/apache/pig/piggybank/storage/CSVExcelStorage.html
[2] http://pig.apache.org/docs/r0.9.2/api/org/apache/pig/piggybank/storage/CSVLoader.html

CCed pig user-list also.

On Wed, Jun 27, 2012 at 8:22 AM, Sandeep Reddy P <sandeepreddy.3...@gmail.com> wrote:

> Thanks Michael Sorry i didnt get that soon. I'll try that and reply you
> back.
>
> On Tue, Jun 26, 2012 at 10:13 PM, Michel Segel <michael_se...@hotmail.com> wrote:
>
> > Sorry,
> > I was saying that you can write a python script that replaces the
> > delimiter with a | and ignore the commas within quotes.
> >
> > Sent from a remote device. Please excuse any typos...
> >
> > Mike Segel
> >
> > On Jun 26, 2012, at 8:58 PM, Sandeep Reddy P <sandeepreddy.3...@gmail.com> wrote:
> >
> > > If i do that my data will be d|"abc|def"|abcd my problem is not solved
> > >
> > > On Tue, Jun 26, 2012 at 6:48 PM, Michel Segel <michael_se...@hotmail.com> wrote:
> > >
> > >> Yup. I just didnt add the quotes.
> > >>
> > >> Sent from a remote device. Please excuse any typos...
> > >>
> > >> Mike Segel
> > >>
> > >> On Jun 26, 2012, at 4:30 PM, Sandeep Reddy P <sandeepreddy.3...@gmail.com> wrote:
> > >>
> > >>> Thanks for the reply.
> > >>> I didnt get that Michael. My f2 should be "abc,def"
> > >>>
> > >>> On Tue, Jun 26, 2012 at 4:00 PM, Michael Segel <michael_se...@hotmail.com> wrote:
> > >>>
> > >>>> Alternatively you could write a simple script to convert the csv to a pipe
> > >>>> delimited file so that "abc,def" will be abc,def.
> > >>>>
> > >>>> On Jun 26, 2012, at 2:51 PM, Harsh J wrote:
> > >>>>
> > >>>>> Hive's delimited-fields-format record reader does not handle quoted
> > >>>>> text that carry the same delimiter within them. Excel supports such
> > >>>>> records, so it reads it fine.
> > >>>>>
> > >>>>> You will need to create your table with a custom InputFormat class
> > >>>>> that can handle this (Try using OpenCSV readers, they support this),
> > >>>>> instead of relying on Hive to do this for you. If you're successful in
> > >>>>> your approach, please also consider contributing something back to
> > >>>>> Hive/Pig to help others.
> > >>>>>
> > >>>>> On Wed, Jun 27, 2012 at 12:37 AM, Sandeep Reddy P
> > >>>>> <sandeepreddy.3...@gmail.com> wrote:
> > >>>>>>
> > >>>>>> Hi all,
> > >>>>>> I have a csv file with 46 columns but i'm getting error when i do some
> > >>>>>> analysis on that data type. For simplification i have taken 3 columns and
> > >>>>>> now my csv is like
> > >>>>>> c,zxy,xyz
> > >>>>>> d,"abc,def",abcd
> > >>>>>>
> > >>>>>> i have created table for this data using,
> > >>>>>> hive> create table test3(
> > >>>>>>     > f1 string,
> > >>>>>>     > f2 string,
> > >>>>>>     > f3 string)
> > >>>>>>     > row format delimited
> > >>>>>>     > fields terminated by ",";
> > >>>>>> OK
> > >>>>>> Time taken: 0.143 seconds
> > >>>>>> hive> load data local inpath '/home/training/a.csv'
> > >>>>>>     > into table test3;
> > >>>>>> Copying data from file:/home/training/a.csv
> > >>>>>> Copying file: file:/home/training/a.csv
> > >>>>>> Loading data to table default.test3
> > >>>>>> OK
> > >>>>>> Time taken: 0.276 seconds
> > >>>>>> hive> select * from test3;
> > >>>>>> OK
> > >>>>>> c zxy xyz
> > >>>>>> d "abc def"
> > >>>>>> Time taken: 0.156 seconds
> > >>>>>>
> > >>>>>> When i do select f2 from test3;
> > >>>>>> my results are,
> > >>>>>> OK
> > >>>>>> zxy
> > >>>>>> "abc
> > >>>>>> but this should be abc,def
> > >>>>>> When i open the same csv file with Microsoft Excel i got abc,def
> > >>>>>> How should i solve this error??
> > >>>>>>
> > >>>>>> --
> > >>>>>> Thanks,
> > >>>>>> sandeep
> > >>>>>
> > >>>>> --
> > >>>>> Harsh J
> > >>>
> > >>> --
> > >>> Thanks,
> > >>> sandeep
> > >
> > > --
> > > Thanks,
> > > sandeep
>
> --
> Thanks,
> sandeep
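
For reference, the conversion script suggested in the quoted thread could look roughly like the sketch below. It is only an illustration, not something anyone in the thread posted: it assumes Python 3, the function name csv_to_pipe is made up, and it assumes no field contains a | or a line break (it raises an error if one does, since Hive's plain delimited format could not read that back). It uses the standard csv module, which treats the comma inside "abc,def" as part of the field rather than as a delimiter, so it avoids the d|"abc|def"|abcd result of a blind search-and-replace.

```python
import csv
import io


def csv_to_pipe(text):
    """Re-emit comma-separated text as pipe-delimited text.

    Hypothetical helper for illustration. Quoted fields are parsed by the
    csv module, so commas inside quotes stay inside the field and the
    surrounding quotes are dropped.
    """
    reader = csv.reader(io.StringIO(text, newline=""))
    out = []
    for row in reader:
        for field in row:
            # Assumption: the data has no pipes or embedded newlines.
            # Fail loudly instead of writing a row Hive would mis-split.
            if "|" in field or "\n" in field or "\r" in field:
                raise ValueError("field not safe for pipe output: %r" % field)
        out.append("|".join(row))
    return "".join(line + "\n" for line in out)


if __name__ == "__main__":
    sample = 'c,zxy,xyz\nd,"abc,def",abcd\n'
    print(csv_to_pipe(sample), end="")
    # prints:
    # c|zxy|xyz
    # d|abc,def|abcd
```

With a file converted this way, the test3 table from the original question would presumably be declared with fields terminated by "|" instead of ",", so that f2 comes back as abc,def.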