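Hey man,

Maybe regexp_replace + a free-form query (--query)? That should strip the \n and \r characters on the Oracle side, before Sqoop ever reads the rows.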
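Roughly what I have in mind, as a small Python sketch that only builds the query string and the Sqoop argument list (it does not run Sqoop). The table name `applicants`, the columns `id` and `essay`, the JDBC URL, and the target directory are all made up for illustration; credentials and any Hive options are left out and would stay as you already have them. I'm assuming a single mapper (`-m 1`) so no `--split-by` is needed.

```python
# Sketch only: build a Sqoop free-form query that replaces \n, \r and \01
# on the Oracle side. All table/column/connection names are hypothetical.

def clean_expr(column, replacement=" "):
    """Wrap a column in Oracle REGEXP_REPLACE, replacing LF, CR and \\01."""
    # CHR(10) = \n, CHR(13) = \r, CHR(1) = \01 (Hive's default field delimiter)
    char_class = "'[' || CHR(10) || CHR(13) || CHR(1) || ']'"
    # Double any single quotes so the replacement is a valid SQL literal.
    safe = replacement.replace("'", "''")
    return "REGEXP_REPLACE({0}, {1}, '{2}') AS {0}".format(column, char_class, safe)


def build_query(table, plain_columns, text_columns, replacement=" "):
    """Return a free-form query; Sqoop requires the $CONDITIONS token."""
    select_list = list(plain_columns)
    select_list += [clean_expr(c, replacement) for c in text_columns]
    return "SELECT {0} FROM {1} WHERE $CONDITIONS".format(
        ", ".join(select_list), table
    )


def sqoop_args(connect, query, target_dir):
    """Argument list for `sqoop import` with a free-form query."""
    return [
        "import",
        "--connect", connect,
        "--query", query,
        "--target-dir", target_dir,  # required when --query is used
        "-m", "1",                   # one mapper, so no --split-by
    ]


if __name__ == "__main__":
    q = build_query("applicants", ["id"], ["essay"])
    for arg in sqoop_args("jdbc:oracle:thin:@//dbhost:1521/ORCL", q, "/tmp/applicants"):
        print(arg)
```

Each element of the list corresponds to one <arg> entry like the ones in your message. If you pass the query on a shell command line instead, keep it in single quotes so the shell does not expand $CONDITIONS.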
http://sqoop.apache.org/docs/1.4.5/SqoopUserGuide.html#_free_form_query_imports
http://docs.oracle.com/cd/B19306_01/server.102/b14200/functions130.htm

-Abe

On Tue, Mar 10, 2015 at 9:34 AM, Glen Hein <[email protected]> wrote:

> I have an Oracle db where some of the records contain \n and \r characters.
> I'm trying to use --hive-delims-replacement to convert them to spaces. I've
> had mixed results. It seems that when the \n and \r appear at the beginning
> of the field's data, they are converted as expected. But when they are
> embedded in the middle of the field data, they are not being replaced.
> Here's a hexdump showing the unconverted \n and \r characters:
>
> 000000c0  6e 75 6c 6c 01 48 65 6c  6c 6f 2c 0d 0a 20 20 20  |null.Hello,..   |
> 000000d0  20 20 20 4d 79 20 6e 61  6d 65 20 69 73 20 4d 61  |   My name is Ma|
> 000000e0  72 79 20 48 75 6e 74 2c  20 49 20 61 6d 20 31 39  |ry Hunt, I am 19|
> 000000f0  20 79 65 61 72 73 20 6f  6c 64 20 61 6e 64 20 6c  | years old and l|
> 00000100  69 76 65 20 69 6e 20 43  61 6c 69 66 6f 72 6e 69  |ive in Californi|
> 00000110  61 2e 20 49 20 67 72 61  64 75 61 74 65 64 20 68  |a. I graduated h|
> 00000120  69 67 68 20 73 63 68 6f  6f 6c 20 6f 6e 20 32 30  |igh school on 20|
> 00000130  31 31 20 61 6e 64 20 64  65 63 69 64 65 64 20 74  |11 and decided t|
> 00000140  6f 20 74 61 6b 65 20 61  20 79 65 61 72 20 6f 66  |o take a year of|
> 00000150  66 20 74 6f 20 73 65 65  20 69 66 20 77 68 61 74  |f to see if what|
> 00000160  20 49 20 63 68 6f 73 65  20 61 73 20 61 20 66 75  | I chose as a fu|
> 00000170  74 75 72 65 20 63 61 72  65 65 72 20 69 73 20 77  |ture career is w|
> 00000180  68 61 74 20 49 20 72 65  61 6c 79 20 77 61 6e 74  |hat I realy want|
> 00000190  65 64 20 74 6f 20 64 6f  2e 20 4d 79 20 79 65 61  |ed to do. My yea|
> 000001a0  72 20 69 73 20 75 70 20  61 6e 64 20 49 20 73 74  |r is up and I st|
> 000001b0  69 6c 6c 20 77 61 6e 74  20 74 6f 20 62 65 20 61  |ill want to be a|
> 000001c0  20 74 65 61 63 68 65 72  2e 20 49 20 6c 6f 6f 6b  | teacher. I look|
> 000001d0  20 66 6f 77 61 72 64 20  74 6f 20 6c 65 61 72 6e  | foward to learn|
> 000001e0  69 6e 67 20 61 6c 6c 20  74 68 61 74 20 49 20 63  |ing all that I c|
> 000001f0  61 6e 20 61 6e 64 20 62  65 69 6e 67 20 61 6e 20  |an and being an |
> 00000200  65 6c 65 6d 65 6e 74 72  79 20 73 63 68 6f 6f 6c  |elementry school|
> 00000210  20 74 65 61 63 68 65 72  2e 01 6e 75 6c 6c 01 31  | teacher..null.1|
>
> You can see in the first row of data there is a "01" that starts the
> field, and then a few characters later the 0d and 0a. I'm passing the
> following option to sqoop:
>
> <arg>--hive-delims-replacement</arg>
> <arg>ABCDEFG</arg>
>
> I've tried several different values for the actual replacement string, but
> the results are always the same.
>
> Is there a better way to replace the unwanted characters?
>
> Thanks,
> Glen
