Hi Mark,

On Oct 21, 2011, at 12:12am, Mark Roddy wrote:
> I'm moving free form data out of a RDBMS that has a lot of \n, \r\n,
> and \t characters.
>
> I used "--escaped-by \\" (extra \ cause of bash), but I'm a little
> confused about what to do with this data now. I can't seem to find
> any tools that will honor the '\' escape char. TextInputFormat does
> not seem to.
>
> I'm working on replacing an existing in house tool w/sqoop that
> replace newlines with the literal string '\n'. I'd be happy to do as
> such but I don't see any way of doing so.
>
> I'm sure I'm not the first person to run into this so I appreciate any
> suggestions.

We use CSVParser (au.com.bytecode.opencsv.CSVParser) to parse escaped text that we're importing using Sqoop.

We're using a custom Cascading Function to do this processing during our workflow.

-- Ken

--------------------------
Ken Krugler
+1 530-210-6378
http://bixolabs.com
custom big data solutions & training
Hadoop, Cascading, Mahout & Solr
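
To illustrate what "honoring the escape char" means when reading such data back, here is a minimal Java sketch. It is not opencsv's CSVParser and not Sqoop code. The class EscapedLineSplitter and its split method are hypothetical names made up for this example. It assumes each record sits on one physical line, that fields are separated by a single delimiter character, that the escape character followed by n, r or t stands for the matching control character, and that quoted fields are not used.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of Sqoop or opencsv.
class EscapedLineSplitter {

    // Splits one line on `delimiter`, treating `escape` + char as a literal.
    // Assumption: n, r and t after the escape char mean newline, carriage
    // return and tab; any other escaped char is kept unchanged.
    // An escape char at the very end of the line is kept as a literal.
    static List<String> split(String line, char delimiter, char escape) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == escape && i + 1 < line.length()) {
                char next = line.charAt(++i); // consume the escaped char
                switch (next) {
                    case 'n': current.append('\n'); break;
                    case 'r': current.append('\r'); break;
                    case 't': current.append('\t'); break;
                    default:  current.append(next);
                }
            } else if (c == delimiter) {
                // Unescaped delimiter: close the current field.
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString()); // last field has no trailing delimiter
        return fields;
    }

    public static void main(String[] args) {
        // The raw line is: 1,first\nsecond,a\,b
        List<String> fields = split("1,first\\nsecond,a\\,b", ',', '\\');
        System.out.println(fields.size()); // prints 3
    }
}
```

In a Hadoop job, logic like this would run on each line handed over by TextInputFormat, for example inside a mapper or a Cascading Function. Whether the assumptions above match what a given Sqoop import actually writes should be checked against real output first.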
