Thanks everyone for the feedback. Just to follow up in case someone else
runs into this: I can confirm that the local client works around the OOMEs, but
it's still very slow.
It does seem like we were hitting some combination of HIVE-4051 and
HIVE-5158. We'll try reducing the partition count first. We are
using IBM BigInsight, which uses Derby as the Hive metastore rather than
MySQL, which is what most of my experience is with.
Yong
From: norbert.bur...@gmail.com
Date: Thu, 27 Feb 2014 07:57:05 -0500
Subject: Re: Metastore performance on HDFS-backed table with 15000+ partitions
To: user@hive.apache.org
Thanks
The query optimizer in Hive is awful on memory consumption. 15k partitions sounds a
bit early for it to fail, though...
What is your heap size?
Regards,
Terje
On 22 Feb 2014, at 12:05, Norbert Burger norbert.bur...@gmail.com wrote:
Hi folks,
We are running CDH 4.3.0 Hive (0.10.0+121) with a
Don't make tables with that many partitions. It is an anti-pattern. I have
tables with 2000 partitions a day and that is really too many. Hive needs to
load that information into memory to plan the query.
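One way to limit how much partition metadata Hive pulls into memory at plan time is to filter on the partition columns themselves. A sketch only — the table and partition-column names here are assumptions, not from this thread:

```sql
-- Sketch, assuming a table partitioned by (dt, hr): a predicate on the
-- partition columns lets Hive fetch only the matching partition entries
-- from the metastore, instead of materializing all 15000+ of them.
SELECT COUNT(*)
FROM events
WHERE dt = '2014-02-21'
  AND hr = '12';
```

An unfiltered query over the same table would force the planner to enumerate every partition, which is where the heap pressure comes from.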
On Saturday, February 22, 2014, Terje Marthinussen tmarthinus...@gmail.com
wrote:
Query
Yeah. That traceback pretty much spells it out - it's metastore-related and
that's where the partitions are stored.
I'm with the others on this. HiveServer2 is still a little janky on memory
management. I bounce mine once a day at midnight just to play it safe (and
because I can.)
Again, for
Hi folks,
We are running CDH 4.3.0 Hive (0.10.0+121) with a MySQL metastore.
In Hive, we have an external table backed by HDFS which has a 3-level
partitioning scheme that currently has 15000+ partitions.
Within the last day or so, queries against this table have started failing.
A simple
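For context on how a 3-level scheme reaches 15000+ partitions so quickly: the counts multiply across levels. A sketch of such a table — the column and partition names are hypothetical, not from the original post:

```sql
-- Sketch only: a 3-level partitioned external table. With, say,
-- 365 days x 24 hours x 2 sources, the metastore already holds
-- roughly 17,500 partition entries for this one table.
CREATE EXTERNAL TABLE events (
  id BIGINT,
  payload STRING
)
PARTITIONED BY (dt STRING, hr STRING, source STRING)
LOCATION '/data/events';
```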
Most interesting. We had an issue recently with querying a table with 15K
columns and running out of heap storage, but not 15K partitions.
15K partitions shouldn't be causing a problem, in my humble estimation.
Maybe a million, but not 15K. :)
So is there a traceback we can look at? Or it's not