OK Thanks. I believe the correct format is 

INSERT INTO TABLE t1 partition(p1='x', p2='y')
select a, b from (select a, b from t2 union all select a, b from t1 where 
p1=’x’ and p2=’y’) tmp2;

since direct union is not supported, only in subqueries. (per the error msg I 
got).

Thanks again.

Also I believe the numbers reported (xx Rows loaded into t1) were way off.
Doing a count on the file afterwards yielded correct results though. I wonder 
why?

Thanks,
Vinay




________________________________
From: Namit Jain <[email protected]>
To: "[email protected]" <[email protected]>
Sent: Tuesday, October 6, 2009 3:10:32 PM
Subject: RE: INSERT OVERWRITE restriction

 
Currently, the only way to do this is to select the existing
data and do a union:
 
 
INSERT INTO
TABLE t1 partition(p1='x', p2='y') 
select a, b
from t2 union all select * from t1 where p1=’x’ and p2=’y’
 
 
 
From:vinay gupta
[mailto:[email protected]] 
Sent: Tuesday, October 06, 2009 2:43 PM
To: [email protected]
Subject: INSERT OVERWRITE restriction
 
Hi,
 
http://wiki.apache.org/hadoop/Hive/LanguageManual/DML
 
From the wiki
above.... " Currently the OVERWRITE keyword is
mandatory and implies that the contents of the chosen table or partition are
replaced with the output of corresponding select statement."
 
How would one
insert (more data) into an existing table(with some data already in it).
 
If I am reading
the correct documentation, then how do people workaround this restriction. I am
sure this is common enough use-case.
 
I am trying to
do
INSERT INTO
TABLE t1 partition(p1='x', p2='y') select a, b from t2;
 
and I need to
do this repeatedly when more data arrives into t2(or some other table);
 
 
Thanks
Vinay


      

Reply via email to