You are correct. Just opened https://issues.apache.org/jira/browse/HIVE-1002 This is a highly wanted feature from a lot of users.
Please comment on the JIRA. Let's figure out how we want to do it. Zheng On Wed, Dec 16, 2009 at 6:36 PM, <[email protected]> wrote: > Correct me if I’m wrong, but it looks like I’ll need an INSERT statement for > each (DS, TYPE) partition: > > > > hive> INSERT OVERWRITE TABLE SYSLOG_SEQUENCE > PARTITION(ds='2009-12-07',type='user') select month, day, time, host, > logline from syslog where syslog.ds='2009-12-07' and syslog.type='user'; > > Loading data to table syslog_sequence partition {ds=2009-12-07, type=user} > > OK > > > > Since I’m programmatically loading the data from files into the syslog table > in the first place, I just need to add another statement to that program > that adds the data to syslog_sequence every time it adds a new partition to > syslog. > > > > From: [email protected] [mailto:[email protected]] > Sent: Wednesday, December 16, 2009 6:16 PM > To: [email protected] > Subject: How do I INSERT OVERWRITE into a new table if it's partitioned? > > > > Hello, > > > > I’d like to store my log file data that’s imported into Hive in compressed > format. I was following some steps outlined by Zheng on how to do that, > where he says: > > > > CREATE TABLE texttable (...) STORED AS TEXTFILE; > > LOAD DATA ... OVERWRITE INTO texttable; > > CREATE TABLE seqtable (...) STORED AS SEQUENCEFILE; > > set hive.exec.compress.output=true; > > set mapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec; > > set mapred.output.compression.type=BLOCK; > > INSERT OVERWRITE TABLE seqtable SELECT * FROM texttable; > > > > but I get stuck on the last step. > > > > I can’t write to my new SYSLOG_SEQUENCE table because the tables are > partitioned: > > > > hive> INSERT OVERWRITE TABLE SYSLOG_SEQUENCE SELECT * FROM SYSLOG; > > FAILED: Error in semantic analysis: need to specify partition columns > because the destination table is partitioned. > > > > What syntax can I use to get the data in the new table? It has DS and TYPE > as partition columns: > > > > hive> describe syslog; > > OK > > month string from deserializer > > day string from deserializer > > time string from deserializer > > host string from deserializer > > logline string from deserializer > > ds string > > type string > > > > I took a stab at it like below, but that only gives me two partitions in > total, and of course what I want is the same partitions as exist in the > original SYSLOG table. > > > > Any pointers welcome- > > Thanks > > Ken > > > > hive> INSERT OVERWRITE TABLE SYSLOG_SEQUENCE PARTITION(ds='*',type='*') > select month, day, time, host, logline from syslog; > > OK > > Time taken: 94.885 seconds > > -- Yours, Zheng
