[ https://issues.apache.org/jira/browse/SQOOP-489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283859#comment-13283859 ]
Cheolsoo Park commented on SQOOP-489:
-------------------------------------
Hi Jarcec, you're right. My patch breaks the Hive view of the data when I do a Hive import.
Here is the result of my experiment:
1. Table foo in Oracle
{code}
SQL> select * from foo;
         I          J          K
---------- ---------- ----------
         1          2          3
{code}
2. Sqoop command
{code}
sqoop import --verbose ... --table SQOOPTEST.FOO -m 1 --hive-import
--hive-table FOO --hive-partition-key I --hive-partition-value 1.0
{code}
3. Output file in HDFS
{code}
1,2,3
{code}
4. Hive view
{code}
hive> select * from foo;
OK
1.0 2.0 1.0
{code}
5. Hive table definition
{code}
hive> SHOW TABLE EXTENDED LIKE foo;
OK
tableName:foo
owner:cheolsoo
location:hdfs://localhost/user/hive/warehouse/foo
inputformat:org.apache.hadoop.mapred.TextInputFormat
outputformat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
columns:struct columns { double j, double k}
partitioned:true
partitionColumns:struct partition_columns { string i}
totalNumberFiles:2
totalFileSize:6
maxFileSize:6
minFileSize:0
lastAccessTime:1337993171274
lastUpdateTime:1337993176838
{code}
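One way to read the result above: the HDFS file still holds all three fields (1,2,3), but the table now declares only j and k as data columns, so the first two fields are read as j and k, the third field is dropped, and the partition value for i is shown last. The sketch below simulates that positional mapping in Python. It is an illustration under that assumption, not Hive's actual code, and the function name read_row is hypothetical.

```python
def read_row(line, data_columns, partition_value, delimiter=","):
    """Simulate mapping a delimited text row onto a partitioned table.

    Fields are assigned to data columns by position. Fields beyond the
    declared data columns are ignored, and the partition value (a string,
    as in the table definition) is appended as the last column.
    """
    fields = line.split(delimiter)
    # Only as many fields as there are declared data columns are used.
    row = [float(f) for f in fields[:len(data_columns)]]
    row.append(partition_value)
    return row


# File content "1,2,3", data columns (j, k), partition i = "1.0"
print(read_row("1,2,3", ["j", "k"], "1.0"))  # [1.0, 2.0, '1.0']
```

Under this reading j and k receive the values that belong to I and J in Oracle, and the value of K (3) is lost.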
> Cannot define partition keys for Hive tables created through Sqoop
> ------------------------------------------------------------------
>
> Key: SQOOP-489
> URL: https://issues.apache.org/jira/browse/SQOOP-489
> Project: Sqoop
> Issue Type: Bug
> Affects Versions: 1.4.1-incubating
> Reporter: Kathleen Ting
> Attachments: SQOOP-489.patch
>
>
> By enabling the table option, Sqoop includes every column in the table in the
> create table query, and by enabling the hive-partition-key option, Sqoop
> blindly appends the "partitioned by" clause. If you specify one of the
> table's columns as the hive-partition-key, this causes a syntax
> error in Hive.
> For example, if we have a table 'FOO' that has columns 'I' and 'J':
> sqoop create-hive-table --table FOO ...
> will generate the following Hive query:
> CREATE TABLE IF NOT EXISTS `FOO` ( `I` STRING, `J` STRING)
> Now if we add "--hive-partition-key I" to the command, Sqoop generates the
> following query:
> CREATE TABLE IF NOT EXISTS `FOO` ( `I` STRING, `J` STRING) PARTITIONED BY (I
> STRING)
> The problem is that since 'I' is defined twice (once in CREATE TABLE and once
> in PARTITIONED BY), this is a syntax error in Hive.
> The correct query would be something like:
> CREATE TABLE IF NOT EXISTS `FOO` (`J` STRING) PARTITIONED BY (I STRING)
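
The query generation described above can be sketched in Python as follows. This is a minimal illustration, not Sqoop's actual implementation: the function name build_create_table is hypothetical, and every column is typed STRING only to match the example queries in the description.

```python
def build_create_table(table, columns, partition_key=None):
    """Build a Hive CREATE TABLE statement with all columns typed STRING.

    If partition_key names one of the table's columns, that column is left
    out of the column list so it appears only in PARTITIONED BY.
    """
    kept = [c for c in columns
            if partition_key is None or c.upper() != partition_key.upper()]
    column_list = ", ".join("`%s` STRING" % c for c in kept)
    ddl = "CREATE TABLE IF NOT EXISTS `%s` (%s)" % (table, column_list)
    if partition_key is not None:
        ddl += " PARTITIONED BY (%s STRING)" % partition_key
    return ddl


print(build_create_table("FOO", ["I", "J"]))
print(build_create_table("FOO", ["I", "J"], partition_key="I"))
```

Dropping the column from the DDL addresses only the syntax error. As the experiment in the comment shows, the imported data file still contains the partition column, so the remaining fields shift unless that column is also excluded from the import.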