Hi,
I wrote a shell script to clean up the CSV data, but it takes a long time
when I run it on a 12GB CSV file. Would a Python script be faster?
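
For reference, here is a minimal sketch of the kind of Python script suggested in the quoted message below. It assumes each record sits on one line, ends with a trailing '|', and that no field contains a tab or an embedded newline. The function names are made up for illustration. It streams the file line by line, so memory use stays flat on a 12GB input; whether it beats the shell script depends on what that script does.

```python
import csv


def strip_terminator(lines):
    """Yield each non-empty line without its line ending and trailing '|'."""
    for line in lines:
        line = line.rstrip("\r\n")
        if line.endswith("|"):
            line = line[:-1]
        if line:
            yield line


def csv_to_tsv(src, dst):
    """Read quoted CSV records from src and write tab-separated lines to dst.

    src and dst are open text file objects. Returns the number of rows written.
    """
    rows = 0
    # csv.reader removes the surrounding double quotes from each field.
    for fields in csv.reader(strip_terminator(src)):
        dst.write("\t".join(fields) + "\n")
        rows += 1
    return rows
```

It could be called as `csv_to_tsv(open("input.csv"), open("output.tsv", "w"))`, where both file names are placeholders.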

On Fri, Sep 7, 2012 at 10:39 AM, Connell, Chuck <[email protected]> wrote:

> How about a Python script that changes it into plain tab-separated text?
> So it would look like this…
>
> 174969274<tab>14-mar-2006<tab>3522876<tab><tab>14-mar-2006<tab>500000308<tab>65<tab>1<newline>
> etc…
>
> Tab-separated with newlines is easy to read and works perfectly on import.
>
> Chuck Connell
> Nuance R&D Data Team
> Burlington, MA
> 781-565-4611
>
> *From:* Sandeep Reddy P [mailto:[email protected]]
> *Subject:* How to load csv data into HIVE
>
> Hi,
> Here is the sample data
> "174969274","14-mar-2006","3522876","","14-mar-2006","500000308","65","1"|
> "174969275","19-jul-2006","3523154","","19-jul-2006","500000308","65","1"|
> "174969276","31-dec-2005","3530333","","31-dec-2005","500000308","65","1"|
> "174969277","14-apr-2005","3531470","","14-apr-2005","500000308","65","1"|
>
> How do I load this kind of data into Hive?
> I'm using a shell script to strip the double quotes and the trailing '|',
> but it takes a very long time on each CSV file, and the files are 12GB
> each. What is the best way to do this?
>



-- 
Thanks,
sandeep
