Hi Carla, I assume you are using dynamic partitioning for this, correct?
If so, I have the same question and am trying to figure it out; I will
let you know if I do.
If you are using static partitions, you just need to specify the location on
the 'alter table' command when the partitions are added:
alter table my_table add if not exists partition(year=2012,month=10,day=02)
location '2012/10/02';
Again, I have not yet figured out if I can get this to occur with dynamic
partitions.
- Original Message -
From: "carla staeben"
To: user@hive.apache.org, "bejoy ks"
Sent: Tuesday, October 2, 2012 8:56:50 AM
Subject: RE: File Path and Partition names
Thanks Bejoy, I was kind of hoping to avoid all of the ‘extra’ work…it would be
nice if hive didn’t include the partition name in the path creation…I was
hoping that there was a ‘set’ parameter/config I was missing.
Thanks
Carla
From: ext Bejoy KS [mailto:bejoy...@yahoo.com]
Sent: Tuesday, October 02, 2012 08:54
To: user@hive.apache.org
Subject: Re: File Path and Partition names
Hi Carla
If you would like a custom directory structure for your partitions, you
can create directories of your choice in HDFS and load data into them (if the
data comes from another Hive table, you can use 'Insert Overwrite Directory ...'
to populate an HDFS directory). You then need to register each directory as a
new partition on the required table using
'Alter Table Add Partition ...'
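
A minimal sketch of those two steps in HiveQL, for one hour of data. The source table name (source_table), its partition columns, and the /data/Dataset root are hypothetical and only there for illustration; new_table and its columns come from the original question. The ROW FORMAT clause on INSERT OVERWRITE DIRECTORY needs Hive 0.11 or later; on older versions the output is written with the default Ctrl-A delimiter, so the table definition would have to match that instead.

```sql
-- Step 1: write the query result into the directory layout you want.
-- source_table and the /data/Dataset root are assumed names.
INSERT OVERWRITE DIRECTORY '/data/Dataset/2012/10/02/07'
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'   -- Hive 0.11+ only
SELECT field1, field2, field3
FROM source_table
WHERE reg_yr = '2012' AND reg_mon = '10' AND reg_day = '02' AND reg_hour = '07';

-- Step 2: register that directory as a partition of the external table,
-- so Hive does not derive a key=value path for it.
ALTER TABLE new_table ADD IF NOT EXISTS
PARTITION (reg_yr = '2012', reg_mon = '10', reg_day = '02', reg_hour = '07')
LOCATION '/data/Dataset/2012/10/02/07';
```

This has to be repeated (or scripted) for every partition, which is the extra work compared with a single dynamic-partition insert.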
Regards
Bejoy KS
Sent from handheld, please excuse typos.
From: < carla.stae...@nokia.com >
Date: Tue, 2 Oct 2012 10:55:19 +
To: < user@hive.apache.org >
ReplyTo: user@hive.apache.org
Subject: File Path and Partition names
Quick question about using hive to create new hdfs file paths.
Generally speaking, we like to keep our data files with a path similar to
Dataset/year/month/day/hour
I need to create a new table in hive and populate it with data from a different
dataset, using a HiveQL query. If I do this:
CREATE EXTERNAL TABLE IF NOT EXISTS new_table
(field1 string
,field2 string
,field3 string
)
partitioned by (reg_yr string, reg_mon string, reg_day string, reg_hour string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS TEXTFILE ;
And then do an insert overwrite into, I end up with this path in hdfs:
Dataset/reg_yr=2012/reg_mon=10/reg_day=02/reg_hour=07
Is there an *easy* way to remove the partition name from the creation of the
hdfs path?
Thanks
Carla