I can absolutely try! I was just hoping to get a read on whether this would be considered a worthwhile change to pursue or "working as intended". Regardless, I'll open an issue in JIRA and see where it goes from there.
On Fri, Dec 18, 2015 at 1:25 AM, Jarek Jarcec Cecho <[email protected]> wrote:

> Can you create a JIRA, Marcus?
>
> Jarcec
>
> > On Dec 17, 2015, at 6:49 PM, Marcus Truscello <[email protected]> wrote:
> >
> > This isn't so much a bug report as a feature request.
> >
> > With sqoop, one can specify a --fields-terminated-by value greater than 127 using octal notation and it will work correctly. The resulting file will have the correct delimiter.
> >
> > However, if you include the --hive-import option, the delimiter will result in an error when being imported into Hive, even though the file retains the correct delimiter. This is the region of code responsible for the error:
> >
> > https://github.com/apache/sqoop/blob/f19e2a523579db8c28a96febfd3cf35a5d58adc6/src/java/org/apache/sqoop/hive/TableDefWriter.java#L278-L300
> >
> > However, Hive supports delimiters with ASCII values between 128 and 255, just not in the octal escape form. Instead, they must be specified as negative values (two's complement, signed char). For example, ASCII 254 in octal would normally be FIELDS TERMINATED BY '\0376', which is an error in Hive, but FIELDS TERMINATED BY '-2' works correctly.
> >
> > I believe that sqoop's --hive-import function should convert the --fields-terminated-by value into a form usable by Hive even if the value is greater than 127. Values greater than 255 should probably still be an error.
> >
> > Thanks for your time and consideration.
> > -Marcus
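
For reference, the conversion proposed in the quoted message could be sketched roughly as follows. This is only an illustration, not Sqoop's actual code: the class HiveDelimiterSketch and the method toHiveDelimiter are hypothetical names. It assumes that values up to 127 are written as a three-digit octal escape, that values from 128 to 255 are written as the signed-byte number described above, and that anything outside 0..255 stays an error.

```java
// Hypothetical sketch; these names do not exist in Sqoop.
final class HiveDelimiterSketch {

    private HiveDelimiterSketch() {
    }

    /**
     * Returns the text that would go between the quotes of
     * FIELDS TERMINATED BY '...' for the given delimiter value.
     */
    static String toHiveDelimiter(int charNum) {
        if (charNum < 0 || charNum > 255) {
            // Assumption: values outside one byte remain an error.
            throw new IllegalArgumentException(
                "Delimiter must be between 0 and 255, got: " + charNum);
        }
        if (charNum <= 127) {
            // Assumption: the low range keeps a three-digit octal escape,
            // for example 1 becomes \001.
            return String.format("\\%03o", charNum);
        }
        // Narrowing to byte reinterprets 128..255 as -128..-1
        // (two's complement), so 254 becomes -2.
        return Integer.toString((byte) charNum);
    }
}
```

With this sketch, toHiveDelimiter(254) yields -2, matching the FIELDS TERMINATED BY '-2' example, while toHiveDelimiter(256) throws an IllegalArgumentException.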
