Thanks for all the help and replies. I tracked this error down to the fact
that I had pointed Sqoop's --warehouse-dir option at the Hive warehouse
directory itself. That meant the Hive import step in Sqoop was trying to
overwrite the source of the import, namely the data that had just been
produced by the initial Sqoop import into HDFS.
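
In case anyone else runs into this: pointing --warehouse-dir at a staging
directory outside the Hive warehouse avoids the clash, because the Hive
load step then moves the data out of the staging directory and into the
warehouse instead of on top of itself. A command along these lines (the
staging path below is just an example) avoids the overlap:

# the staging directory is an arbitrary example; any HDFS path outside
# /user/hive/warehouse will do
sqoop import --connect jdbc:mysql://localhost/db --username root \
  --password foobar --table sometable --hive-import \
  --warehouse-dir /user/jurgen/sqoop-staging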

Thanks again for the help.

Jurgen

On Tue, Nov 29, 2011 at 12:47 PM, Miguel Cabero <miguel.cab...@gmail.com> wrote:

> Hi Jurgen,
>
> Maybe you can find some hints in
> http://www.slideshare.net/kate_ting/habits-of-effective-sqoop-users
>
> Regards,
>
> Miguel
>
> On 29 Nov 2011, at 00:44, arv...@cloudera.com wrote:
>
> Hi Jurgen,
>
> What version of Hive and Sqoop are you using? Also, please look at the
> /tmp/${USER}/hive.log file, which will have more detailed information on
> what may be going wrong.
>
> Thanks,
> Arvind
>
> On Mon, Nov 28, 2011 at 3:17 PM, Jurgen Van Gael <jur...@rangespan.com> wrote:
>
>> Hi,
>> I am running the Cloudera CDH3 Hive distribution in pseudo-distributed
>> mode on my local Mac OS Lion laptop. Hive generally works fine except
>> when I use it together with Sqoop. A command like
>>
>> sqoop import --connect jdbc:mysql://localhost/db --username root
>> --password foobar --table sometable --warehouse-dir
>> /user/hive/warehouse
>>
>> completes successfully and generates part files, a _logs directory and
>> a _SUCCESS file in the Hive warehouse directory on HDFS. However, when
>> I add the --hive-import flag to the Sqoop command (the full command is
>> shown after the log excerpt below), the import still works but Hive
>> seems to get into an infinite loop. Looking at the logs I find entries
>> like:
>>
>> 2011-11-28 22:54:57,279 WARN org.apache.hadoop.hdfs.StateChange: DIR*
>> FSDirectory.unprotectedRenameTo: failed to rename
>> /user/hive/warehouse/sometable/_SUCCESS to
>> /user/hive/warehouse/sometable/_SUCCESS_copy_2 because source does not
>> exist
>> 2011-11-28 22:54:57,281 WARN org.apache.hadoop.hdfs.StateChange: DIR*
>> FSDirectory.unprotectedRenameTo: failed to rename
>> /user/hive/warehouse/sometable/_SUCCESS to
>> /user/hive/warehouse/sometable/_SUCCESS_copy_3 because source does not
>> exist
>> 2011-11-28 22:54:57,282 WARN org.apache.hadoop.hdfs.StateChange: DIR*
>> FSDirectory.unprotectedRenameTo: failed to rename
>> /user/hive/warehouse/sometable/_SUCCESS to
>> /user/hive/warehouse/sometable/_SUCCESS_copy_4 because source does not
>> exist
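>>
>> For reference, the failing invocation is essentially the command shown
>> above with the Hive import flag added (connection details unchanged,
>> reproduced here only for illustration):
>>
>> sqoop import --connect jdbc:mysql://localhost/db --username root \
>>   --password foobar --table sometable --hive-import \
>>   --warehouse-dir /user/hive/warehouse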
>>
>> I started digging into the source code and traced it back to
>> ql/metadata/Hive.java:checkPaths, which tries to find a non-conflicting
>> destination name for the _SUCCESS file during the actual Hive load but
>> somehow fails because the Sqoop import MR job has already created a
>> _SUCCESS file. I tried disabling MR creation of _SUCCESS files, but
>> Hive seems to wait for that file before kicking off the Hive load, so
>> that fails as well.
>>
>> Does anyone have any suggestions on where to search next?
>>
>> Thanks! Jurgen
>>
>
>
>


-- 
___________________
*Jurgen Van Gael*
Data Scientist at rangespan.com <http://www.rangespan.com>
Mobile: +44 (0) 794 3407 007
