Hi there,
I'm trying to export some data from Amazon S3 to PostgreSQL. The data comes
from Redshift, via the UNLOAD command.
It works well with 20 million entries. However, with 800 million, I get
errors right after the MapReduce job starts. I also tried different
delimiters. The data is escaped with the backslash character, and the
fields are enclosed in double quotes.
My command is:
sqoop export --connect jdbc:postgresql://uri/db --username [] --table [] \
    --export-dir s3://[] --password '7dzeqwYb?WaqKMGPz8NA(y' \
    --fields-terminated-by ',' --enclosed-by '\"' --null-string "" \
    --null-non-string "" --escaped-by \\
Is there a way to find the entries causing this? I searched through all the
logs and couldn't find anything that identifies the failing records.
Am I missing anything? Is there a proper way to solve this? Thank you!
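For reference, here is a minimal sketch of the kind of check I have in mind
for locating suspect records in the unloaded files before exporting. It
assumes comma-separated fields, optionally enclosed in double quotes, with
backslash as the escape character, and one record per line. The function
names are hypothetical, and this only approximates the "Expected delimiter"
condition; it is not Sqoop's actual RecordParser logic.

```python
def find_bad_position(record, delim=",", quote='"', escape="\\"):
    """Return the index where a delimiter was expected but not found, else None."""
    i, n = 0, len(record)
    while i < n:
        if record[i] == quote:
            # Enclosed field: skip to the matching unescaped closing quote.
            i += 1
            while i < n and record[i] != quote:
                i += 2 if record[i] == escape else 1
            if i >= n:
                return n  # record ended inside a quoted field
            i += 1  # step past the closing quote
            if i < n and record[i] != delim:
                return i  # closing quote not followed by a delimiter
        else:
            # Unenclosed field: skip to the next unescaped delimiter.
            while i < n and record[i] != delim:
                i += 2 if record[i] == escape else 1
        i += 1  # step past the delimiter
    return None


def scan(lines):
    """Yield (line_number, position) for every record that fails the check."""
    for lineno, line in enumerate(lines, start=1):
        pos = find_bad_position(line.rstrip("\n"))
        if pos is not None:
            yield lineno, pos
```

Running scan() over an open file handle for each unloaded part file would
list candidate line numbers and positions. Records that contain embedded
newlines would be reported as false positives, since the sketch treats every
line as a complete record.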
The output:
Error: java.io.IOException: Can't export data, please check failed map task logs
    at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)
    at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:152)
    at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:773)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:175)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:170)
Caused by: com.cloudera.sqoop.lib.RecordParser$ParseError: Expected delimiter at position 235
    at org.apache.sqoop.lib.RecordParser.parseRecord(RecordParser.java:319)
    at org.apache.sqoop.lib.RecordParser.parseRecord(RecordParser.java:108)
    at org.apache.sqoop.lib.RecordParser.parseRecord(RecordParser.java:125)
    at updates_backfill.parse(updates_backfill.java:1498)
    at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:83)