[
https://issues.apache.org/jira/browse/SQOOP-1283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13894385#comment-13894385
]
Hari Sekhon commented on SQOOP-1283:
------------------------------------
Thanks Harsh! I'd prefer that Sqoop detect the file type regardless of the
file extension; it's one less thing for users to worry about. If your backing
files already exist without the .avro extension, having to transform a large
table just to export it is annoying...
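
For illustration, content-based detection could look at the first bytes of
each file instead of its name: Avro object container files start with the
four-byte magic 'O', 'b', 'j', 0x01, which matches the "Obj" at the start of
the unparsed data in the trace below. A minimal sketch, assuming a
hypothetical helper class AvroMagicSniffer that is not part of Sqoop and that
reads from a plain InputStream rather than HDFS:

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper (not Sqoop code): sniffs the Avro container magic. */
final class AvroMagicSniffer {
    // Avro object container files begin with the bytes 'O', 'b', 'j', 0x01.
    private static final byte[] AVRO_MAGIC = {'O', 'b', 'j', 1};

    /** Returns true if the given leading bytes start with the Avro magic. */
    static boolean hasAvroMagic(byte[] firstBytes) {
        if (firstBytes == null || firstBytes.length < AVRO_MAGIC.length) {
            return false; // too short to be an Avro container file
        }
        for (int i = 0; i < AVRO_MAGIC.length; i++) {
            if (firstBytes[i] != AVRO_MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    /** Reads the first four bytes of a stream and checks them for the magic. */
    static boolean looksLikeAvro(InputStream in) throws IOException {
        byte[] header = new byte[AVRO_MAGIC.length];
        try {
            new DataInputStream(in).readFully(header);
        } catch (EOFException e) {
            return false; // fewer than four bytes available
        }
        return hasAvroMagic(header);
    }

    public static void main(String[] args) throws IOException {
        // Stand-ins for file contents; real code would open each input file.
        byte[] avroLike = {'O', 'b', 'j', 1, 4, 20};
        byte[] text = "1,foo,bar\n".getBytes(StandardCharsets.US_ASCII);
        System.out.println(looksLikeAvro(new ByteArrayInputStream(avroLike)));
        System.out.println(looksLikeAvro(new ByteArrayInputStream(text)));
    }
}
```

With a check along these lines, a file named 000000_0 could be treated as
Avro based on its content, and the extension would only serve as a hint.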
> Export doesn't detect Avro files without .avro extension (ie created by Hive)
> -----------------------------------------------------------------------------
>
> Key: SQOOP-1283
> URL: https://issues.apache.org/jira/browse/SQOOP-1283
> Project: Sqoop
> Issue Type: Bug
> Components: connectors/postgresql, hive-integration
> Affects Versions: 1.4.3
> Environment: CDH 4.5
> Reporter: Hari Sekhon
>
> When exporting to PostgreSQL, Sqoop doesn't detect Avro files unless they
> have the .avro extension (i.e. files named 000000_0 in HDFS because they were
> created by Hive). It falls back to the unknown file type in the code and then
> uses the text export mapper, which fails with a parse exception:
> java.io.IOException: Can't export data, please check failed map task logs
>     at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)
>     at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:140)
>     at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:672)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330)
>     at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:396)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
>     at org.apache.hadoop.mapred.Child.main(Child.java:262)
> Caused by: java.lang.RuntimeException: Can't parse input data:
> 'Objavro.codecdeflateavro.schema�{"type":"record","name":"<scrubbed>","namespace":"<scrubbed>.avro","fields":[{"name":"pane
>
> 14/02/03 17:13:52 INFO mapred.JobClient: Task Id : attempt_201312101527_93532_m_000000_0, Status : FAILED
> java.io.IOException: Can't export data, please check failed map task logs
>     at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)
>     at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:140)
>     at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:672)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330)
>     at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:396)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
>     at org.apache.hadoop.mapred.Child.main(Child.java:262)
> Thanks
> Hari Sekhon
> http://www.linkedin.com/in/harisekhon