Hi Mark,

On Oct 21, 2011, at 12:12am, Mark Roddy wrote:

> I'm moving free-form data that contains a lot of \n, \r\n, and \t
> characters out of an RDBMS.
> 
> I used "--escaped-by \\" (the extra \ is because of bash), but I'm a little
> confused about what to do with this data now.  I can't seem to find
> any tools that will honor the '\' escape char.  TextInputFormat does
> not seem to.
> 
> I'm working on replacing an existing in-house tool with Sqoop.  That tool
> replaces newlines with the literal string '\n'.  I'd be happy to do the
> same, but I don't see any way of doing so.
> 
> I'm sure I'm not the first person to run into this so I appreciate any
> suggestions.

We use CSVParser (au.com.bytecode.opencsv.CSVParser) to parse escaped text that 
we're importing using Sqoop.

We're using a custom Cascading Function to do this processing during our 
workflow.
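
A minimal sketch of what honoring the escape character during parsing involves, written in plain Java with no dependencies. It is not the opencsv CSVParser API and not the Cascading Function mentioned above; the class name EscapedFieldSplitter and its split method are hypothetical. It assumes that any character following the escape character is literal data, and that the whole record has already been read into one string (a line-based reader would likely cut a record at an escaped newline before this code ever sees it).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of Sqoop, opencsv, or Cascading.
final class EscapedFieldSplitter {
    private EscapedFieldSplitter() {}

    /**
     * Splits one record into fields. A character preceded by the escape
     * character is kept as data (including the delimiter, a newline, or the
     * escape character itself); the escape character is dropped.
     */
    static List<String> split(String record, char delimiter, char escape) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean escaped = false;

        for (int i = 0; i < record.length(); i++) {
            char c = record.charAt(i);
            if (escaped) {
                // Previous character was the escape: take this one literally.
                current.append(c);
                escaped = false;
            } else if (c == escape) {
                escaped = true;
            } else if (c == delimiter) {
                // Unescaped delimiter ends the current field.
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        if (escaped) {
            // Assumption: a trailing lone escape is kept as data rather than dropped.
            current.append(escape);
        }
        fields.add(current.toString());
        return fields;
    }
}
```

For example, EscapedFieldSplitter.split("a\\,b,c", ',', '\\') returns the two fields "a,b" and "c", since the first comma is escaped and the second is not.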

-- Ken

--------------------------
Ken Krugler
+1 530-210-6378
http://bixolabs.com
custom big data solutions & training
Hadoop, Cascading, Mahout & Solr


