[ 
https://issues.apache.org/jira/browse/HIVE-9481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-9481:
---------------------------------

added note to 
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML#LanguageManualDML-Synopsis.1

> allow column list specification in INSERT statement
> ---------------------------------------------------
>
>                 Key: HIVE-9481
>                 URL: https://issues.apache.org/jira/browse/HIVE-9481
>             Project: Hive
>          Issue Type: Bug
>          Components: Parser, Query Processor, SQL
>    Affects Versions: 0.14.0
>            Reporter: Eugene Koifman
>            Assignee: Eugene Koifman
>              Labels: TODOC1.2
>             Fix For: 1.2.0
>
>         Attachments: HIVE-9481.2.patch, HIVE-9481.4.patch, HIVE-9481.5.patch, 
> HIVE-9481.6.patch, HIVE-9481.patch
>
>
> Given a table FOO(a int, b int, c int), ANSI SQL supports insert into 
> FOO(c,b) select x,y from T.  The expectation is that 'x' is written to column 
> 'c' and 'y' is written column 'b' and 'a' is set to NULL, assuming column 'a' 
> is NULLABLE.
> Hive does not support this.  In Hive one has to ensure that the data 
> producing statement has a schema that matches target table schema.
> Since Hive doesn't support DEFAULT value for columns in CREATE TABLE, when 
> target schema is explicitly provided, missing columns will be set to NULL if 
> they are NULLABLE, otherwise an error will be raised.
> If/when DEFAULT clause is supported, this can be enhanced to set default 
> value rather than NULL.
> Thus, given {noformat}
> create table source (a int, b int);
> create table target (x int, y int, z int);
> create table target2 (x int, y int, z int);
> {noformat}
> {noformat}insert into target(y,z) select * from source;{noformat}
> will mean 
> {noformat}insert into target select null as x, a, b from source;{noformat}
> and 
> {noformat}insert into target(z,y) select * from source;{noformat}
> will meant 
> {noformat}insert into target select null as x, b, a from source;{noformat}
> Also,
> {noformat}
> from source 
>   insert into target(y,z) select null as x, * 
>   insert into target2(y,z) select null as x, source.*;
> {noformat}
> and for partitioned tables, given
> {noformat}
> Given:
> CREATE TABLE pageviews (userid VARCHAR(64), link STRING, "from" STRING)
>   PARTITIONED BY (datestamp STRING) CLUSTERED BY (userid) INTO 256 BUCKETS 
> STORED AS ORC;
> INSERT INTO TABLE pageviews PARTITION (datestamp = '2014-09-23')(userid,link) 
>  
>    VALUES ('jsmith', 'mail.com');
> {noformat}
> And dynamic partitioning
> {noformat}
> INSERT INTO TABLE pageviews PARTITION (datestamp)(userid,datestamp,link) 
>     VALUES ('jsmith', '2014-09-23', 'mail.com');
> {noformat}
> In all cases, the schema specification contains columns of the target table 
> which are matched by position to the values produced by VALUES clause/SELECT 
> statement.  If the producer side provides values for a dynamic partition 
> column, the column should be in the specified schema.  Static partition 
> values are part of the partition spec and thus are not produced by the 
> producer and should not be part of the schema specification.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to