OK Thanks. I believe the correct format is INSERT INTO TABLE t1 partition(p1='x', p2='y') select a, b from (select a, b from t2 union all select a, b from t1 where p1=’x’ and p2=’y’) tmp2;
since direct union is not supported, only in subqueries. (per the error msg I got). Thanks again. Also I believe the numbers reported (xx Rows loaded into t1) were way off. Doing a count on the file afterwards yielded correct results though. I wonder why? Thanks, Vinay ________________________________ From: Namit Jain <[email protected]> To: "[email protected]" <[email protected]> Sent: Tuesday, October 6, 2009 3:10:32 PM Subject: RE: INSERT OVERWRITE restriction Currently, the only way to do this is to select the existing data and do a union: INSERT INTO TABLE t1 partition(p1='x', p2='y') select a, b from t2 union all select * from t1 where p1=’x’ and p2=’y’ From:vinay gupta [mailto:[email protected]] Sent: Tuesday, October 06, 2009 2:43 PM To: [email protected] Subject: INSERT OVERWRITE restriction Hi, http://wiki.apache.org/hadoop/Hive/LanguageManual/DML From the wiki above.... " Currently the OVERWRITE keyword is mandatory and implies that the contents of the chosen table or partition are replaced with the output of corresponding select statement." How would one insert (more data) into an existing table(with some data already in it). If I am reading the correct documentation, then how do people workaround this restriction. I am sure this is common enough use-case. I am trying to do INSERT INTO TABLE t1 partition(p1='x', p2='y') select a, b from t2; and I need to do this repeatedly when more data arrives into t2(or some other table); Thanks Vinay
