Dealing with monstrous hive startup overhead

Edward Capriolo Thu, 10 Jul 2014 08:47:57 -0700

So everyone is running around saying "hive is slow" and "x is faster". I think
hive's biggest issue is that the entire mr2 process of acquiring containers
and then launching a job in them is super overkill. I see it result in 40
seconds of startup time for what amounts to a 2 second job. In the old hadoop
0.20.2 days these queries were much faster. Honestly, I know everyone is in
the ballpark that (tez/spark) is some magical answer... but how about we
make a yarn service that just keeps N nodes open and ready for action?
Cut out the entire "ask the manager for nodes for every job" part.
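
To put rough numbers on that, here is a toy cost model in Python. It is only a sketch: it does not talk to YARN or Hive, the function names cold_total and warm_total are made up for illustration, and it assumes queries run one after another on containers that are already open. The 40 second and 2 second figures are the ones quoted above.

```python
# Toy cost model only; nothing here calls YARN or Hive.
STARTUP_S = 40.0  # container acquisition + launch, figure from the post
WORK_S = 2.0      # actual query work, figure from the post


def cold_total(num_queries, startup_s=STARTUP_S, work_s=WORK_S):
    """Every query pays the full container startup before doing its work."""
    return num_queries * (startup_s + work_s)


def warm_total(num_queries, startup_s=STARTUP_S, work_s=WORK_S):
    """A long-running service pays startup once; each query then pays only for work.

    Assumption: queries run sequentially on containers that are already open.
    """
    return startup_s + num_queries * work_s


if __name__ == "__main__":
    for n in (1, 10, 100):
        print(n, cold_total(n), warm_total(n))
```

Under these assumptions, 10 small queries cost 420 seconds when each one asks for containers and 60 seconds when the containers are kept warm.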
