'ALTER TABLE .. ADD PARTITION..' would just add a partition entry for the table
in the Hive metastore. It doesn't perform any data loading; instead, it expects
the data to already be present in the directory pointed to by LOCATION.
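
To make the metadata-only behaviour concrete, here is a minimal Python sketch that just builds the statement text. The helper name add_partition_sql is hypothetical (it is not part of Hive or of any client library), the table, partition and path values are the ones from the original question, and no quoting or escaping is attempted. The point is that the statement only names a location; nothing in it reads or moves files.

```python
def add_partition_sql(table, spec, location):
    """Build an ALTER TABLE ... ADD PARTITION statement as a string.

    Hypothetical helper for illustration only. The statement registers
    `location` in the metastore; the data files must already be there.
    Values are interpolated as-is, with no escaping.
    """
    parts = ", ".join(f"{col} = '{val}'" for col, val in spec.items())
    return f"ALTER TABLE {table} ADD PARTITION ({parts}) LOCATION '{location}'"


stmt = add_partition_sql(
    "partitioned_table_test",      # table name from the question
    {"pcode": "123"},              # partition column and value from the question
    "/path/to/parquet/files",      # files are assumed to be in place already
)
print(stmt)
```

The resulting string would then be submitted through whatever Hive client is in use, after the files have been written under the location.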
On Tue, Jul 15, 2014 at 5:39 AM, Raymond Lau r...@ooyala.com wrote:
I've
Thanks for this clarification. I've revised the Add Partitions section
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-AddPartitions
in the wiki accordingly.
-- Lefty
On Fri, Jul 18, 2014 at 12:45 AM, Satish Mittal satish.mit...@inmobi.com
wrote:
'ALTER
I've created an external table partitioned by a field and am attempting to
load in the data via the command 'ALTER TABLE partitioned_table_test ADD
PARTITION (pcode = '123') LOCATION '/path/to/parquet/files';' using a
custom Parquet SerDe.
Does loading in the data this way call the serializer()
Hi,
1) My requirement is to load a file (a tar.gz file which has multiple
tab-separated values files, one of which is the main file with huge data,
about 10 GB per day) into an external partitioned Hive table.
2) What I am doing is I have automated the process by extracting
Hi,
I am planning for a Hive External Partition Table based on a date.
Which one of the below yields a better performance or both have the same
performance?
1) Partition based on one folder per day
LIKE date INT
2) Partition based on one folder per year / month / day (so it has three
folders
number
of files is typically preferred, but partitions will help when restricting
by date.
Thx,
Brad
On Thu, Oct 31, 2013 at 3:34 PM, Raj Hadoop hadoop...@yahoo.com wrote:
Hi,
I am planning for a Hive External Partition Table based on a date.
Which one of the below yields a better performance
because Hive is still
selecting the same number of input paths in both scenarios, one just
happens to be a little deeper.
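
A rough way to see this is to list the directories each layout would contain for the same date range. The sketch below is only an illustration in Python: the folder naming (date=YYYYMMDD versus year=/month=/day=) and the root path /data/t are assumptions, not taken from the thread. Both layouts yield one leaf directory per day; the second is simply nested two levels deeper.

```python
from datetime import date, timedelta


def days(start, end):
    """Yield each date from start to end, inclusive."""
    d = start
    while d <= end:
        yield d
        d += timedelta(days=1)


def flat_paths(root, start, end):
    # Layout 1 (assumed naming): one folder per day, e.g. root/date=20131031
    return [f"{root}/date={d:%Y%m%d}" for d in days(start, end)]


def nested_paths(root, start, end):
    # Layout 2 (assumed naming): root/year=2013/month=10/day=31
    return [
        f"{root}/year={d.year}/month={d.month:02d}/day={d.day:02d}"
        for d in days(start, end)
    ]


flat = flat_paths("/data/t", date(2013, 10, 30), date(2013, 11, 1))
nested = nested_paths("/data/t", date(2013, 10, 30), date(2013, 11, 1))
print(len(flat), len(nested))  # same number of input paths in both layouts
```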
Cheers,
Tim
On Thu, Oct 31, 2013 at 4:34 PM, Raj Hadoop hadoop...@yahoo.com wrote:
Hi,
I am planning for a Hive External Partition Table based on a date.
Which one