Re: Metastore performance on HDFS-backed table with 15000+ partitions

2014-02-27 Thread Norbert Burger
Thanks everyone for the feedback. Just to follow up in case someone else runs into this: I can confirm that local client works around the OOMEs, but it's still very slow. It does seem like we were hitting some combination of HIVE-4051 and HIVE-5158. We'll try reducing partition count first, and

RE: Metastore performance on HDFS-backed table with 15000+ partitions

2014-02-27 Thread java8964
are using IBM BigInsights, which uses Derby as the Hive metastore, not MySQL, which is what most of my experience is with. Yong From: norbert.bur...@gmail.com Date: Thu, 27 Feb 2014 07:57:05 -0500 Subject: Re: Metastore performance on HDFS-backed table with 15000+ partitions To: user@hive.apache.org Thanks

Re: Metastore performance on HDFS-backed table with 15000+ partitions

2014-02-22 Thread Terje Marthinussen
The query optimizer in Hive is awful on memory consumption. 15k partitions sounds a bit early for it to fail, though. What is your heap size? Regards, Terje On 22 Feb 2014, at 12:05, Norbert Burger norbert.bur...@gmail.com wrote: Hi folks, We are running CDH 4.3.0 Hive (0.10.0+121) with a

Re: Metastore performance on HDFS-backed table with 15000+ partitions

2014-02-22 Thread Edward Capriolo
Don't make tables with that many partitions. It is an anti-pattern. I have tables with 2000 partitions a day and that is really too many. Hive needs to load that information into memory to plan the query. On Saturday, February 22, 2014, Terje Marthinussen tmarthinus...@gmail.com wrote: Query

Re: Metastore performance on HDFS-backed table with 15000+ partitions

2014-02-22 Thread Stephen Sprague
Yeah. That traceback pretty much spells it out: it's metastore related, and that's where the partitions are stored. I'm with the others on this. HiveServer2 is still a little janky on memory management. I bounce mine once a day at midnight just to play it safe (and because I can). Again, for

Metastore performance on HDFS-backed table with 15000+ partitions

2014-02-21 Thread Norbert Burger
Hi folks, We are running CDH 4.3.0 Hive (0.10.0+121) with a MySQL metastore. In Hive, we have an external table backed by HDFS which has a 3-level partitioning scheme that currently has 15000+ partitions. Within the last day or so, queries against this table have started failing. A simple

Re: Metastore performance on HDFS-backed table with 15000+ partitions

2014-02-21 Thread Stephen Sprague
Most interesting. We had an issue recently with querying a table with 15K columns and running out of heap storage, but not 15K partitions. 15K partitions shouldn't be causing a problem in my humble estimation. Maybe a million, but not 15K. :) So is there a traceback we can look at? Or it's not