Christoph Lipka created HIVE-10891:
--------------------------------------
Summary: Limited fetch on partitioned table can eat up all heap
Key: HIVE-10891
URL: https://issues.apache.org/jira/browse/HIVE-10891
Project: Hive
Issue Type: Bug
Components: Physical Optimizer
Affects Versions: 1.1.0
Reporter: Christoph Lipka
When running a query such as
{code}
select *
from partitioned_table
where not_the_partition_key_column = "xyz"
limit 100
{code}
Hive executes it in memory instead of starting an MR job. For all but the
smallest tables, this quickly consumes the entire heap and crashes the server.
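One way to check which execution path Hive picks, without running the query, is
to inspect the plan. This is only a sketch using the placeholder table and
column names from above. On Hive with the MapReduce engine, a plan that contains
only a Fetch Operator stage is run without a job, while a plan with a Map Reduce
stage launches one.
{code}
-- Print the plan only; the query itself is not executed.
explain
select *
from partitioned_table
where not_the_partition_key_column = "xyz"
limit 100;
{code}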
If the limit clause is omitted, an MR job is started and the query runs
without memory issues. The problem can also be worked around by extending the
query to filter on the partition key column as well, for example
{code}
select *
from partitioned_table a
where a.not_the_partition_key_column = "xyz"
and a.partition_key_column =
    (select b.partition_key_column from partitioned_table b)
limit 100
{code}
In this case Hive also creates an MR job.
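Another possible workaround, not verified for this issue, is to disable fetch
task conversion for the session. This assumes the in-memory execution comes from
Hive's fetch task conversion (the hive.fetch.task.conversion setting), which
this report does not confirm.
{code}
-- Assumption: the in-memory path is Hive's fetch task conversion.
-- 'none' turns the conversion off for this session, so the query
-- should be planned as a regular job instead of a local fetch.
set hive.fetch.task.conversion=none;

select *
from partitioned_table
where not_the_partition_key_column = "xyz"
limit 100;
{code}
The setting applies to every query in the session, so simple lookups that would
otherwise be answered by a fetch task will then start a job as well.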
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)