[ https://issues.apache.org/jira/browse/SQOOP-3147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sandish Kumar HN reassigned SQOOP-3147:
---------------------------------------

    Assignee: Sandish Kumar HN

> Import data to Hive Table in S3 in Parquet format
> -------------------------------------------------
>
>                 Key: SQOOP-3147
>                 URL: https://issues.apache.org/jira/browse/SQOOP-3147
>             Project: Sqoop
>          Issue Type: Bug
>    Affects Versions: 1.4.6
>            Reporter: Ahmed Kamal
>            Assignee: Sandish Kumar HN
>
> Running the import command below succeeds only if the Hive table's location is on HDFS. If the table is backed by S3, the job fails with an exception while trying to move the data from the HDFS temp directory to S3:
> Job job_1486539699686_3090 failed with state FAILED due to: Job commit failed: org.kitesdk.data.DatasetIOException: Dataset merge failed
>       at org.kitesdk.data.spi.filesystem.FileSystemDataset.merge(FileSystemDataset.java:333)
>       at org.kitesdk.data.spi.filesystem.FileSystemDataset.merge(FileSystemDataset.java:56)
>       at org.kitesdk.data.mapreduce.DatasetKeyOutputFormat$MergeOutputCommitter.commitJob(DatasetKeyOutputFormat.java:370)
>       at org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.handleJobCommit(CommitterEventHandler.java:285)
>       at org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.run(CommitterEventHandler.java:237)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>       at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: Dataset merge failed during rename of hdfs://hdfs-path/tmp/dev_kamal/.temp/job_1486539699686_3090/mr/job_1486539699686_3090/0192f987-bd4c-4cb7-836f-562ac483e008.parquet to s3://bucket_name/dev_kamal/address/0192f987-bd4c-4cb7-836f-562ac483e008.parquet
>       at org.kitesdk.data.spi.filesystem.FileSystemDataset.merge(FileSystemDataset.java:329)
>       ... 7 more
> The command used:
> sqoop import --connect "jdbc:mysql://connectionUrl" --table "tableName" \
>     --as-parquetfile --verbose --username=uname --password=pass \
>     --hive-import --delete-target-dir --hive-database dev_kamal \
>     --hive-table tableName --hive-overwrite -m 150
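>
> The rename in FileSystemDataset.merge cannot succeed here because the source and destination live on different filesystems (hdfs:// vs s3://), and Hadoop's FileSystem.rename only works within a single filesystem. A minimal sketch of the copy-then-delete fallback such a merge would need, using the standard Hadoop FileSystem API; the paths below are hypothetical stand-ins for the ones in the trace:
>
>     import java.io.IOException;
>     import org.apache.hadoop.conf.Configuration;
>     import org.apache.hadoop.fs.FileSystem;
>     import org.apache.hadoop.fs.FileUtil;
>     import org.apache.hadoop.fs.Path;
>
>     public class CrossFsMerge {
>         /** Move src to dst, falling back to copy-then-delete across filesystems. */
>         static void move(Path src, Path dst, Configuration conf) throws IOException {
>             FileSystem srcFs = src.getFileSystem(conf);
>             FileSystem dstFs = dst.getFileSystem(conf);
>             if (srcFs.getUri().equals(dstFs.getUri())) {
>                 // Same filesystem: a plain rename is cheap and atomic.
>                 if (!srcFs.rename(src, dst)) {
>                     throw new IOException("rename failed: " + src + " -> " + dst);
>                 }
>             } else {
>                 // Different filesystems (e.g. HDFS -> S3): rename cannot work,
>                 // so copy the bytes and delete the source.
>                 FileUtil.copy(srcFs, src, dstFs, dst,
>                         true /* deleteSource */, true /* overwrite */, conf);
>             }
>         }
>
>         public static void main(String[] args) throws IOException {
>             Configuration conf = new Configuration();
>             // Hypothetical paths mirroring the failing rename above.
>             move(new Path("hdfs://hdfs-path/tmp/part-0.parquet"),
>                  new Path("s3://bucket_name/dev_kamal/address/part-0.parquet"),
>                  conf);
>         }
>     }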
> Another issue I noticed: Sqoop stores the Avro schema in the table's TBLPROPERTIES under the avro.schema.literal key. If the table has many columns, the schema string gets truncated, which later causes a confusing exception like the one below.
> *Exception:*
> 17/03/07 12:13:13 INFO hive.metastore: Trying to connect to metastore with URI thrift://url:9083
> 17/03/07 12:13:13 INFO hive.metastore: Opened a connection to metastore, current connections: 1
> 17/03/07 12:13:13 INFO hive.metastore: Connected to metastore.
> 17/03/07 12:13:17 DEBUG util.ClassLoaderStack: Restoring classloader: sun.misc.Launcher$AppClassLoader@3e9b1010
> 17/03/07 12:13:17 ERROR sqoop.Sqoop: Got exception running Sqoop: org.apache.avro.SchemaParseException: org.codehaus.jackson.JsonParseException: Unexpected end-of-input: was expecting closing quote for a string value
>  at [Source: java.io.StringReader@3fb42ec7; line: 1, column: 6001]
> org.apache.avro.SchemaParseException: org.codehaus.jackson.JsonParseException: Unexpected end-of-input: was expecting closing quote for a string value
>  at [Source: java.io.StringReader@3fb42ec7; line: 1, column: 6001]
>       at org.apache.avro.Schema$Parser.parse(Schema.java:929)
>       at org.apache.avro.Schema$Parser.parse(Schema.java:917)
>       at org.kitesdk.data.DatasetDescriptor$Builder.schemaLiteral(DatasetDescriptor.java:475)
>       at org.kitesdk.data.spi.hive.HiveUtils.descriptorForTable(HiveUtils.java:154)
>       at org.kitesdk.data.spi.hive.HiveAbstractMetadataProvider.load(HiveAbstractMetadataProvider.java:104)
>       at org.kitesdk.data.spi.filesystem.FileSystemDatasetRepository.load(FileSystemDatasetRepository.java:192)
>       at org.kitesdk.data.Datasets.load(Datasets.java:108)
>       at org.kitesdk.data.Datasets.load(Datasets.java:165)
>       at org.kitesdk.data.Datasets.load(Datasets.java:187)
>       at org.apache.sqoop.mapreduce.ParquetJob.configureImportJob(ParquetJob.java:78)
>       at org.apache.sqoop.mapreduce.DataDrivenImportJob.configureMapper(DataDrivenImportJob.java:108)
>       at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:260)
>       at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:673)
>       at org.apache.sqoop.manager.MySQLManager.importTable(MySQLManager.java:118)
>       at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:497)
>       at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:605)
>       at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
>       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>       at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
>       at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
>       at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
>       at org.apache.sqoop.Sqoop.main(Sqoop.java:236)
> Caused by: org.codehaus.jackson.JsonParseException: Unexpected end-of-input: was expecting closing quote for a string value
>  at [Source: java.io.StringReader@3fb42ec7; line: 1, column: 6001]
>       at org.codehaus.jackson.JsonParser._constructError(JsonParser.java:1433)
>       at org.codehaus.jackson.impl.JsonParserMinimalBase._reportError(JsonParserMinimalBase.java:521)
>       at org.codehaus.jackson.impl.JsonParserMinimalBase._reportInvalidEOF(JsonParserMinimalBase.java:454)
>       at org.codehaus.jackson.impl.ReaderBasedParser._finishString2(ReaderBasedParser.java:1342)
>       at org.codehaus.jackson.impl.ReaderBasedParser._finishString(ReaderBasedParser.java:1330)
>       at org.codehaus.jackson.impl.ReaderBasedParser.getText(ReaderBasedParser.java:200)
>       at org.codehaus.jackson.map.deser.std.BaseNodeDeserializer.deserializeObject(JsonNodeDeserializer.java:203)
>       at org.codehaus.jackson.map.deser.std.BaseNodeDeserializer.deserializeArray(JsonNodeDeserializer.java:224)
>       at org.codehaus.jackson.map.deser.std.BaseNodeDeserializer.deserializeObject(JsonNodeDeserializer.java:200)
>       at org.codehaus.jackson.map.deser.std.JsonNodeDeserializer.deserialize(JsonNodeDeserializer.java:58)
>       at org.codehaus.jackson.map.deser.std.JsonNodeDeserializer.deserialize(JsonNodeDeserializer.java:15)
>       at org.codehaus.jackson.map.ObjectMapper._readValue(ObjectMapper.java:2704)
>       at org.codehaus.jackson.map.ObjectMapper.readTree(ObjectMapper.java:1344)
>       at org.apache.avro.Schema$Parser.parse(Schema.java:927)
>       ... 21 more
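>
> For what it's worth, the parse failure at column 6001 suggests the schema literal was cut off at a fixed length when it was written into TBLPROPERTIES; metastore databases typically store property values in a bounded VARCHAR column, though the exact limit varies. A small sketch of a defensive loader, assuming a hypothetical length bound; for wide tables the usual escape hatch is avro.schema.url, which points the table at a schema file instead of inlining the JSON:
>
>     import java.io.File;
>     import java.io.IOException;
>     import org.apache.avro.Schema;
>
>     public class SchemaLiteralGuard {
>         // Hypothetical bound; the real cap depends on the metastore database
>         // (the truncation above happened near column 6001).
>         static final int SUSPECTED_PARAM_VALUE_LIMIT = 6000;
>
>         /** Parse avro.schema.literal, falling back to a schema file for wide tables. */
>         static Schema load(String literal, File schemaFile) throws IOException {
>             if (literal != null && literal.length() < SUSPECTED_PARAM_VALUE_LIMIT) {
>                 // Short literal: safe to parse directly.
>                 return new Schema.Parser().parse(literal);
>             }
>             // A literal at or past the bound may be truncated mid-string, which
>             // is exactly the JsonParseException above. Read the full schema from
>             // the file that avro.schema.url would reference instead.
>             return new Schema.Parser().parse(schemaFile);
>         }
>     }
>
> With avro.schema.url set, the metastore stores only a short path, so no truncation can occur.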



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)