Hi Mark,

The Hive (Parquet) export has a limitation for Avro-format content. It works 
only with a Hive table created by the Kite CLI or by a previous Sqoop import 
into Hive (which uses Kite internally). Concretely, the metadata is stored in a 
hidden directory called “.metadata”, so when exporting a Hive table, Sqoop 
looks for that directory. That is why your export fails.
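
If it helps, here is a rough sketch of how you could check whether a table 
directory carries that “.metadata” directory before running the export. The 
function name has_kite_metadata is made up for illustration, and I am assuming 
the hdfs client is on your PATH; this only tests for the directory and does 
not reproduce what Sqoop does internally.

```shell
#!/bin/sh
# Hypothetical helper (not part of Sqoop or Kite): succeeds when the given
# dataset directory contains a ".metadata" subdirectory.
has_kite_metadata() {
    # "hdfs dfs -test -d PATH" exits 0 when PATH exists and is a directory.
    hdfs dfs -test -d "$1/.metadata"
}

# Example call, using the export directory from this thread:
#   has_kite_metadata /data/movielens/aggregated_ratings \
#       && echo "found .metadata" || echo "no .metadata"
```

A table created directly in Hive, without Kite, would be expected to report 
that no “.metadata” directory is present.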

--Stanley (Qian) Xu


From: [email protected] [mailto:[email protected]] On Behalf Of Mark Grover
Sent: Monday, March 02, 2015 8:30 AM
To: [email protected]
Subject: Re: How does sqoop export detect Avro schema?

Forgot to mention, here's the error I am getting:
https://gist.github.com/markgrover/113196fecd1ec5bd0b38

And, please include me on cc. I am not on the list. Thanks again!

On Sun, Mar 1, 2015 at 4:29 PM, Mark Grover <[email protected]> wrote:
Hi Sqoop folks,
I am trying to better understand how sqoop export works.

In the sqoop export command, we don't put any information about the metadata of 
the HDFS data being exported. So, how does sqoop figure out the avro schema of 
the data being exported?

Does it use Kite's .metadata directory for this? If so, that'd mean you can't 
export data not populated by Kite. I don't think that's the case.
Does it parse out the file header or look at file extensions? If so, that 
doesn't work: I just populated a Hive table that stores data in Avro, and 
its file extension is not avro.
Does it do something else that I am missing?

I created a Hive avro table using some new syntax supported in Hive 0.14+:

CREATE EXTERNAL TABLE avg_movie_rating2(movie_id INT, rating DOUBLE)
STORED AS AVRO
LOCATION '/data/movielens/aggregated_ratings'

However, I haven't been able to get Sqoop to export that data. 
Here's the sqoop export command that I ran:

sqoop export \
--connect jdbc:mysql://mgrover-haa-2.vpc.cloudera.com:3306/movie_dwh \
--username root --table avg_movie_rating \
--export-dir /data/movielens/aggregated_ratings -m 16 \
--update-key movie_id --update-mode allowinsert

Any thoughts/insights would be much appreciated!

Thanks!
Mark
