Re: [Hive] Slow Loading Data Process with Parquet over 30k Partitions

2015-04-17 Thread Chris Roblee
Hi Slava, We would be interested in reviewing your patch. Can you please provide more details? Is there any other way to disable the partition creation step? Thanks, Chris On 4/13/15 10:59 PM, Slava Markeyev wrote: This is something I've encountered when doing ETL with hive and having it

Re: [Hive] Slow Loading Data Process with Parquet over 30k Partitions

2015-04-17 Thread Slava Markeyev
I've created HIVE-10385 and attached a patch. Unit tests to come. -Slava On Fri, Apr 17, 2015 at 1:34 PM, Chris Roblee chr...@unity3d.com wrote: Hi Slava, We would be interested in reviewing your patch. Can you please provide more details? Is there any other way to disable the partition

Re: [Hive] Slow Loading Data Process with Parquet over 30k Partitions

2015-04-14 Thread Slava Markeyev
This is something I've encountered when doing ETL with Hive and having it create tens of thousands of partitions. The issue is that each partition needs to be added to the metastore, and this is an expensive operation to perform. My workaround was adding a flag to Hive that optionally disables the

Re: [Hive] Slow Loading Data Process with Parquet over 30k Partitions

2015-04-14 Thread Edward Capriolo
...@upsight.com] *Sent:* Monday, April 13, 2015 11:00 PM *To:* user@hive.apache.org *Cc:* Sergio Pena *Subject:* Re: [Hive] Slow Loading Data Process with Parquet over 30k Partitions This is something I've encountered when doing ETL with Hive and having it create tens of thousands of partitions

RE: [Hive] Slow Loading Data Process with Parquet over 30k Partitions

2015-04-13 Thread Xu, Cheng A
Hi Tianqi, Can you attach hive.log so we have more detailed information? +Sergio Yours, Ferdinand Xu From: Tianqi Tong [mailto:tt...@brightedge.com] Sent: Friday, April 10, 2015 1:34 AM To: user@hive.apache.org Subject: [Hive] Slow Loading Data Process with Parquet over 30k Partitions Hello Hive, I'm a