How do I INSERT OVERWRITE into a new table if it's partitioned?

Ken.Barclay Wed, 16 Dec 2009 18:16:08 -0800

Hello,

I'd like to store my log file data that's imported into Hive in compressed 
format. I was following some steps outlined by Zheng on how to do that, where 
he says:


CREATE TABLE texttable (...) STORED AS TEXTFILE;
LOAD DATA ... OVERWRITE INTO texttable;
CREATE TABLE seqtable (...) STORED AS SEQUENCEFILE;
set hive.exec.compress.output=true;
set mapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec;
set mapred.output.compression.type=BLOCK;
INSERT OVERWRITE TABLE seqtable SELECT * FROM texttable;

but I get stuck on the last step.

I can't write to my new SYSLOG_SEQUENCE  table because the tables are 
partitioned:

hive> INSERT OVERWRITE TABLE SYSLOG_SEQUENCE  SELECT * FROM SYSLOG;
FAILED: Error in semantic analysis: need to specify partition columns because 
the destination table is partitioned.

What syntax can I use to get the data in the new table? It has DS and TYPE as 
partition columns:

hive> describe syslog;
OK
month   string  from deserializer
day     string  from deserializer
time    string  from deserializer
host    string  from deserializer
logline string  from deserializer
ds      string
type    string

I took a stab at it like below, but that only gives me two partitions in total, 
and of course what I want is the same partitions as exist in the original 
SYSLOG table.

Any pointers welcome-
Thanks
Ken

hive> INSERT OVERWRITE TABLE SYSLOG_SEQUENCE PARTITION(ds='*',type='*') select 
month, day, time, host, logline from syslog;
OK
Time taken: 94.885 seconds

How do I INSERT OVERWRITE into a new table if it's partitioned?

Reply via email to