Re: Impact of partitioning on certain queries

Gopal Vijayaraghavan Fri, 08 Jan 2016 01:34:40 -0800

> Ok we hope that partitioning improves performance where the predicate is
> on partitioned columns


Nope.

Partitioning *only* improves performance if your queries run with:

set hive.mapred.mode=strict;

That's the easy "use strict" way to make sure you're writing good queries.
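
A minimal sketch of what strict mode checks for partitioned tables. The
table and column names (page_views, dt, url) are made up for illustration
and are not from this thread; the behaviour described in the comments is
what strict mode is documented to do, not output copied from a real run.

```sql
-- Hypothetical table, partitioned by a date string column.
CREATE TABLE page_views (user_id BIGINT, url STRING)
PARTITIONED BY (dt STRING);

SET hive.mapred.mode=strict;

-- No predicate on the partition column dt: strict mode should refuse
-- to compile this, since it would have to scan every partition.
SELECT count(*) FROM page_views WHERE url = '/index.html';

-- Predicate on dt: Hive can prune down to a single partition, so this
-- passes the strict check and reads only that partition's data.
SELECT count(*) FROM page_views
WHERE dt = '2016-01-08' AND url = '/index.html';
```

The point is that the partition layout only helps when the filter is
actually on the partition column; strict mode turns a silent full scan
into an error at compile time.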

Even then, schema design in Hive is something you need to learn with the
assumption that neither the storage layer nor the compute layer is part
of "Hive".

Hive sits in an "access" layer above both. I'm not sure there's any
legacy tech to draw parallels with.

If you haven't seen this before, here's an example of the problem:

http://www.slideshare.net/Hadoop_Summit/hive-at-yahoo-letters-from-the-trenches/24


Cheers,
Gopal
