[ 
https://issues.apache.org/jira/browse/SQOOP-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15970334#comment-15970334
 ] 

Eric Lin commented on SQOOP-3150:
---------------------------------

Hi Ankit,

I reviewed the issue you raised. The --target-dir option does not control 
where the Hive table is created or where the partition data is finally 
stored. It controls ONLY where Sqoop writes the imported data before that 
data is loaded into the Hive table.

For example, you specified --target-dir as 
"/user/hdfs/landing/staging/Hive/partitioned/EMPLOYEES", so the data is first 
written to this directory, and the final Hive query that loads it into Hive 
will look something like this:

LOAD DATA INPATH 
'hdfs://localhost:9000/user/hdfs/landing/staging/Hive/partitioned/EMPLOYEES' 
OVERWRITE INTO TABLE `employees_p` PARTITION (date='10-03-2017');

You have no control over the final directory in which Hive stores the 
partition.
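
To make the role of --target-dir concrete, here is a minimal shell sketch that 
only prints a statement of the shape shown above. The helper name 
build_load_stmt is hypothetical and is not part of Sqoop or Hive; the exact 
statement Sqoop generates may differ. The point is that the --target-dir 
value appears only as the INPATH source, and the statement has no clause 
naming a destination directory.

```shell
#!/bin/sh
# Sketch only: build_load_stmt is a hypothetical helper, not a Sqoop or Hive API.
# It prints a LOAD DATA statement of the shape described in this comment.
build_load_stmt() {
  target_dir=$1   # value given to --target-dir (staging location only)
  table=$2        # value given to --hive-table
  part_key=$3     # value given to --hive-partition-key
  part_value=$4   # value given to --hive-partition-value
  # OVERWRITE is included here because the example mirrors a run with --hive-overwrite.
  printf "LOAD DATA INPATH '%s' OVERWRITE INTO TABLE \`%s\` PARTITION (%s='%s');\n" \
    "$target_dir" "$table" "$part_key" "$part_value"
}

# The staging path is the source of the load; Hive decides where the
# partition directory finally lives.
build_load_stmt \
  "hdfs://localhost:9000/user/hdfs/landing/staging/Hive/partitioned/EMPLOYEES" \
  "employees_p" "date" "10-03-2017"
```

Changing the first argument changes only where the data is read from, not 
where the partition ends up.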

I hope that makes sense. So this is not a bug; it works as expected.

> issue with sqoop hive import with partitions
> --------------------------------------------
>
>                 Key: SQOOP-3150
>                 URL: https://issues.apache.org/jira/browse/SQOOP-3150
>             Project: Sqoop
>          Issue Type: Bug
>          Components: hive-integration
>    Affects Versions: 1.4.6
>         Environment: Cent-Os
>            Reporter: Ankit Kumar
>            Assignee: Eric Lin
>              Labels: features
>
> Sqoop Command:
>       sqoop import \
>       ...
>   --hive-import  \
>   --hive-overwrite  \
>   --hive-table employees_p  \
>   --hive-partition-key date  \
>   --hive-partition-value 10-03-2017  \
>   --target-dir ..\
>   -m 1  
>   
>   Hive table:
>   employees_p is a table partitioned on the date (string) column
>   
>   Issue:
>   Case 1: With --target-dir 
> /user/hdfs/landing/staging/Hive/partitioned/EMPLOYEES
>   the above sqoop command fails with the error "directory already 
> exists".
>   
>   Case 2: With --target-dir 
> /user/hdfs/landing/staging/Hive/partitioned/EMPLOYEES/anyname 
>   the above sqoop command creates a hive partition (date=10-03-2017) and 
> the directory
>       '/user/hdfs/landing/staging/Hive/partitioned/EMPLOYEES/date=10-03-2017'
>       
> Expected Behaviour: Because --hive-partition-key and 
> --hive-partition-value are present in the sqoop command, it should 
> automatically create the partitioned directory inside EMPLOYEES, 
> i.e. '/user/hdfs/landing/staging/Hive/partitioned/EMPLOYEES/date=10-03-2017'



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
