Thanks for getting back to the mailing list and figuring this out!

-Abe

On Tue, Jul 21, 2015 at 7:32 PM, Nikolay Christov <[email protected]>
wrote:

> Hey Abe, thanks for the response. I figured out what the issue is and will
> post it here in case someone else bumps into this or has a better solution.
>
> The data I was reading was badly formatted UTF-8 (not valid UTF-8), but
> PostgreSQL 9.3 wasn't handling this correctly and was returning stripped or
> altered characters depending on your pg client's encoding. PostgreSQL 9.4
> fixed the issue: now when the data is not valid UTF-8 but the client
> encoding is set to UTF8, it throws an error. To handle this I am converting
> the bad UTF-8 into a byte array and then to base64 during the sqoop import:
> encode(name::bytea, 'base64')
> Since Hive has native support for base64, I can use unbase64() and get my
> original non-UTF-8 string back. It would be nice to skip this encoding yoga
> and dump those non-UTF-8 strings directly into Hive.
>
> I am on pgsql 9.4
> sqoop 1.4.5.2.2.0.0-2041
> byte representation of the bad UTF-8:
> \x695465787420ae2c2061204a4156412d504446206c696272617279
> (iText ®, a JAVA-PDF library)
> command: sqoop import --verbose --username xxxxx --password xxxxx
> --connect "jdbc:postgresql://xxxxx:5432/db_name" --split-by id --target-dir
> /tmp/oozietemp --num-mappers 20 --direct --hive-table=metadata
> --hive-import --hive-overwrite --query "select name, id from table where
> \$CONDITIONS"
>
> some stack trace:
> 15/07/21 17:43:44 INFO mapreduce.Job: Task Id :
> attempt_1433455729125_6011_m_000011_2, Status : FAILED
> Error: java.io.IOException: SQLException in nextKeyValue
>         at
> org.apache.sqoop.mapreduce.db.DBRecordReader.nextKeyValue(DBRecordReader.java:277)
>         at
> org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:553)
>         at
> org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
>         at
> org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
>         at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>         at
> org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
>         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>         at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: org.postgresql.util.PSQLException: ERROR: invalid regular
> expression: invalid escape \ sequence
>         at
> org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2270)
>         at
> org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1998)
>         at
> org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:255)
>         at
> org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:570)
>         at
> org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:420)
>         at
> org.postgresql.jdbc2.AbstractJdbc2Statement.executeQuery(AbstractJdbc2Statement.java:305)
>         at
> org.apache.sqoop.mapreduce.db.DBRecordReader.executeQuery(DBRecordReader.java:111)
>         at
> org.apache.sqoop.mapreduce.db.DBRecordReader.nextKeyValue(DBRecordReader.java:235)
>         ... 12 more
>
> On Tue, Jul 21, 2015 at 4:49 PM, Abraham Elmahrek <[email protected]>
> wrote:
>
>> Hey man,
>>
>> Could you send over the following:
>>
>>    - Command that you're running
>>    - Version of Sqoop you're using
>>    - Run the command with --verbose and send the full output?
>>
>> -Abe
>>
>> On Tue, Jul 21, 2015 at 12:43 PM, Nikolay Christov <[email protected]>
>> wrote:
>>
>>> co
>>
>>
>>
>>
>
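The encode/unbase64 workaround in the thread works because base64 treats the column as opaque bytes, so nothing in transit ever has to be valid UTF-8. Here is a minimal Python sketch of the same round-trip (not the Sqoop/Hive code path itself, just an illustration), using the exact byte string quoted above:

```python
import base64

# Byte representation of the bad "UTF-8" from the thread. 0xae is not a
# valid UTF-8 start byte (in Latin-1 it would be the (R) sign).
raw = bytes.fromhex("695465787420ae2c2061204a4156412d504446206c696272617279")
assert raw.startswith(b"iText")

# raw.decode("utf-8") would raise UnicodeDecodeError on the 0xae byte.

# What encode(name::bytea, 'base64') does on the PostgreSQL side:
b64 = base64.b64encode(raw).decode("ascii")

# What Hive's unbase64() does on the other side — the original bytes
# come back unchanged, invalid UTF-8 and all:
roundtrip = base64.b64decode(b64)
assert roundtrip == raw
```

The base64 string is plain ASCII, so it passes safely through any client-encoding check along the way.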
