[
https://issues.apache.org/jira/browse/SQOOP-2907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17186551#comment-17186551
]
Ram edited comment on SQOOP-2907 at 8/28/20, 2:04 PM:
------------------------------------------------------
[~yuan_zac] [[email protected]] [~vasas] [~stanleyxu2005]
We are using *sqoop 1.4.7* to export parquet data stored in HDFS -
*plain parquet files, NOT a Hive table*.
We are still facing the same issue:
{code:java}
20/08/28 13:37:02 ERROR sqoop.Sqoop: Got exception running Sqoop:
org.kitesdk.data.DatasetIOException: Cannot access descriptor location:
hdfs:///<location>/part-00000-f9f92493-36a1-4714-bcc6-291c118cf599-c000/snappy/parquet/.metadata
org.kitesdk.data.DatasetIOException: Cannot access descriptor location:
hdfs:///<location>/part-00000-f9f92493-36a1-4714-bcc6-291c118cf599-c000/snappy/parquet/.metadata{code}
The command we're running:
{code:java}
/sqoop-1.4.7.bin__hadoop-2.6.0/bin/sqoop export --connect
jdbc:postgresql://<postgres_db_details> --username <username> --password
<password> --table <table_name> --export-dir
hdfs:///<location>/part-00000-f9f92493-36a1-4714-bcc6-291c118cf599-c000.parquet
{code}
Postgres JDBC driver: postgresql-42.2.11.jar
Please suggest a solution as soon as possible.
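A possible workaround (our own assumption, not something confirmed in this ticket): if the parquet data can first be registered as a Hive table, Sqoop's HCatalog export path does not go through the Kite dataset reader, so the .metadata directory should not be needed. A hedged sketch, with placeholder database/table names:
{code:bash}
# Sketch only: assumes the parquet files are registered as the Hive table
# <hive_db>.<hive_table>; the HCatalog path bypasses Kite's .metadata lookup.
sqoop export \
  --connect jdbc:postgresql://<postgres_db_details> \
  --username <username> --password <password> \
  --table <table_name> \
  --hcatalog-database <hive_db> \
  --hcatalog-table <hive_table>
{code}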
> Export parquet files to RDBMS: don't require .metadata for parquet files
> ------------------------------------------------------------------------
>
> Key: SQOOP-2907
> URL: https://issues.apache.org/jira/browse/SQOOP-2907
> Project: Sqoop
> Issue Type: Improvement
> Components: metastore
> Affects Versions: 1.4.6
> Environment: sqoop 1.4.6
> export parquet files to Oracle
> Reporter: Ruslan Dautkhanov
> Assignee: Sandish Kumar HN
> Priority: Major
> Labels: sqoop
> Attachments: SQOOP-2907-3.patch, SQOOP-2907.patch, SQOOP-2907.patch1,
> SQOOP-2907.patch2
>
>
> Kite currently requires a .metadata directory.
> Parquet files have their own metadata stored alongside the data files.
> It would be great if the export operation for parquet files to an RDBMS did
> not require .metadata.
> Most of our files are created by Spark and Hive, which do not create
> .metadata; only Kite does.
> This makes sqoop export of parquet files very limited in usability.
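The point that parquet files carry their own metadata can be checked directly: per the parquet format, every file ends with a serialized footer, a 4-byte little-endian footer length, and the magic bytes {{PAR1}}. A minimal stdlib-only sketch (the helper name is ours, not from this ticket):

```python
import struct

def parquet_footer_length(path):
    """Return the length of a parquet file's embedded footer metadata.

    Parquet files end with: <footer bytes><4-byte little-endian footer
    length><magic b"PAR1">, so the schema travels inside the data file
    itself -- no external .metadata directory is required.
    """
    with open(path, "rb") as f:
        f.seek(-8, 2)          # last 8 bytes: footer length + magic
        tail = f.read(8)
    if tail[4:] != b"PAR1":
        raise ValueError("not a parquet file: missing PAR1 magic")
    return struct.unpack("<I", tail[:4])[0]
```

Running this against one of the part files mentioned above would report the embedded footer size, i.e., the metadata ships with the file and Kite's separate .metadata is redundant for reading the schema.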
--
This message was sent by Atlassian Jira
(v8.3.4#803005)