Solved by switching the Hive metastore from Derby to PostgreSQL.

On Fri, Mar 6, 2015 at 8:16 AM, Jack Arenas <[email protected]> wrote:
> Abe et al,
>
> How do you mean? Isn't that the point of the --hive-table flag? Based on
> the schema, add the table to the proper schema.db folder in <path>/Hive/Lab
> for each sqoop job? I'm not sure what you mean... I tried setting
> --target-dir as <path>/Hive/Lab/<schema>.db/<table> and yes, it's able to
> ingest the data into HDFS into that folder, but Hive doesn't recognize that
> the tables are there. It's like the step that actually links the data to
> Hive breaks when parallelized.
>
> Hope this info helps.
>
> Best,
> Jack
>
> On Mar 3, 2015, at 8:46 PM, Abraham Elmahrek <[email protected]> wrote:
>
> Jack,
>
> Just a thought... but have you tried using --target-dir?
>
> -Abe
>
> On Mon, Mar 2, 2015 at 12:24 PM, Jack Arenas <[email protected]> wrote:
>
>> Hi team,
>>
>> I'm building an ETL tool that requires me to pull in a bunch of tables
>> from a db into HDFS and I'm currently doing this sequentially using Sqoop.
>> I figured it might be faster to submit the Sqoop jobs in parallel, that
>> is with a predefined thread pool (currently trying 8), because it took about
>> two hours to ingest 150 tables of various sizes, frankly not very big
>> tables as this is a POC. So sequentially this works fine, but as soon as I
>> add parallelism, roughly 75% of my Sqoop jobs fail, and I'm not saying that
>> they don't ingest any data, simply that the data gets stuck in the staging
>> area (i.e. /user/username) as opposed to the proper Hive table (i.e.
>> /user/username/Hive/Lab). Has anyone experienced this before? I figure I
>> may be able to shoot a separate process that moves the Hive tables from the
>> staging area into the Hive table area, but I'm not sure if that process
>> would simply be to move the tables or if there is more involved.
>>
>> Thanks!
>>
>> Specs: HDP 2.1, Sqoop 1.4.4.2
>>
>> Cheers,
>> Jack
>>
>>
>
-- 
Jack Arenas
Data Engineer & Web Developer
[email protected]
+1.805.259.8059
<http://www.linkedin.com/in/jackarenas>
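
For anyone landing on this thread later, here is a minimal sketch of the parallel setup described above: one `sqoop import` per table, submitted through a fixed pool of 8 threads. It is not the original ETL tool. The function names, the `runner` hook, and the JDBC URL in the usage note are hypothetical, connection credentials are left out, and passing `<schema>.<table>` to `--hive-table` is an assumption about how the target table is named. An embedded Derby metastore generally accepts only one client process at a time, which would be consistent with the Hive load step failing once jobs overlap, so a sketch like this presumably needs a metastore backed by a server database such as PostgreSQL, as reported in the fix.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_sqoop_command(jdbc_url, schema, table):
    """Build the argument list for one `sqoop import` into Hive.

    Assumption: the Hive table is addressed as <schema>.<table>.
    Credentials and any other site-specific options are omitted.
    """
    return [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--table", table,
        "--hive-import",
        "--hive-table", f"{schema}.{table}",
    ]


def run_command(cmd):
    """Run one command and return its exit code (non-zero means failure)."""
    return subprocess.run(cmd).returncode


def ingest_tables(jdbc_url, schema, tables, workers=8, runner=run_command):
    """Submit one Sqoop job per table on a fixed-size thread pool.

    Returns a dict mapping each table name to the exit code of its job.
    `runner` is a hypothetical hook so the pool logic can be exercised
    without a Sqoop installation.
    """
    commands = [build_sqoop_command(jdbc_url, schema, t) for t in tables]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in the same order as `commands`.
        exit_codes = list(pool.map(runner, commands))
    return dict(zip(tables, exit_codes))
```

Calling `ingest_tables("jdbc:postgresql://dbhost/source", "lab", ["orders", "customers"])` would launch both imports concurrently and return their exit codes, so any table with a non-zero code can be retried or reported. The metastore switch itself is configured separately in `hive-site.xml`, through properties such as `javax.jdo.option.ConnectionURL` and `javax.jdo.option.ConnectionDriverName`.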
