Thanks, Stanley. Appreciate the feedback.

On Sun, Mar 1, 2015 at 6:23 PM, Xu, Qian A <[email protected]> wrote:
> Hi Mark,
>
> The Hive (Parquet) export has a limitation for Avro format contents. It
> works only with a Hive table created by the Kite CLI or a Hive table
> previously imported by Sqoop (which uses Kite internally). Concretely,
> metadata will be stored in a hidden directory called ".metadata". So when
> exporting a Hive table, Sqoop will look for that directory. This is the
> reason why it fails.
>
> --Stanley (Qian) Xu
>
> *From:* [email protected] [mailto:[email protected] <[email protected]>]
> *On Behalf Of *Mark Grover
> *Sent:* Monday, March 02, 2015 8:30 AM
> *To:* [email protected]
> *Subject:* Re: How does sqoop export detect Avro schema?
>
> Forgot to mention, here's the error I am getting:
>
> https://gist.github.com/markgrover/113196fecd1ec5bd0b38
>
> And, please include me on cc. I am not on the list. Thanks again!
>
> On Sun, Mar 1, 2015 at 4:29 PM, Mark Grover <[email protected]> wrote:
>
> Hi Sqoop folks,
>
> I am trying to better understand how sqoop export works.
>
> In the sqoop export command, we don't put any information about the
> metadata of the HDFS data being exported. So, how does sqoop figure out
> the Avro schema of the data being exported?
>
> Does it use Kite's .metadata directory for this? If so, that'd mean you
> can't export data not populated by Kite. I don't think that's the case.
>
> Does it parse out the file header or look at file extensions? If so, that
> doesn't work. I just populated a Hive table which stores data in Avro, and
> its file extension is not avro.
>
> Does it do something else that I am missing?
>
> I created a Hive Avro table using some new syntax supported in Hive 0.14+:
>
> CREATE EXTERNAL TABLE avg_movie_rating2(movie_id INT, rating DOUBLE)
> STORED AS AVRO
> LOCATION '/data/movielens/aggregated_ratings'
>
> And, I just haven't been able to get Sqoop to export that data.
>
> Here's the sqoop export command that I ran:
>
> sqoop export --connect \
> jdbc:mysql://mgrover-haa-2.vpc.cloudera.com:3306/movie_dwh \
> --username root --table avg_movie_rating --export-dir \
> /data/movielens/aggregated_ratings -m 16 \
> --update-key movie_id --update-mode allowinsert
>
> Any thoughts/insights would be much appreciated!
>
> Thanks!
>
> Mark
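
For reference, the condition Stanley describes (a hidden ".metadata" directory under the export directory) can be checked by hand before running the export. The following is only a rough sketch, not what Sqoop itself runs: the function name check_kite_metadata and the HDFS_CMD override are made up for illustration, and it only tests that the directory exists, not that its contents are valid Kite metadata. It relies on the standard "hdfs dfs -test -d" command.

```shell
# Hypothetical helper: reports whether an export directory carries a
# Kite-style ".metadata" directory. HDFS_CMD is an illustrative override
# (defaults to "hdfs dfs") so the function can be dry-run without a cluster.
check_kite_metadata() {
  export_dir="$1"
  # Strip a trailing slash, then test for the hidden directory.
  if ${HDFS_CMD:-hdfs dfs} -test -d "${export_dir%/}/.metadata"; then
    echo "kite-metadata-present"
  else
    echo "kite-metadata-missing"
  fi
}
```

Against the directory from the command above, that would be "check_kite_metadata /data/movielens/aggregated_ratings". A table created directly in Hive with STORED AS AVRO would be expected to report "kite-metadata-missing", which matches the failure described in this thread.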
