Jesus Camacho Rodriguez created HIVE-14866:
----------------------------------------------
Summary: Set hive.limit.optimize.enable to true
Key: HIVE-14866
URL: https://issues.apache.org/jira/browse/HIVE-14866
Project: Hive
Issue Type: Improvement
Affects Versions: 2.1.0
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez
Currently, we set up the global limit for the query in two different places
through two different variables: in SemanticAnalyzer and in the optimization
rule GlobalLimitOptimizer (the latter is off by default).
This leads to several problems that I have observed:
- Global limit might not be set for very simple queries (e.g., if the query
does not contain an RS). GlobalLimitOptimizer would set the limit in this case,
but as stated above, it is off by default.
- Some other optimizations do not check both variables and thus miss
opportunities.
- The variable set by SemanticAnalyzer does not take into account the offset of
the query, which I think might lead to incorrect results if FetchOptimizer kicks
in (not verified yet). GlobalLimitOptimizer does take the offset into account.
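
The offset concern in the last item can be illustrated with a toy model. This
is only a sketch and does not use any Hive code: the function
rows_with_offset and the parameter fetch_cap are hypothetical names, and the
model assumes that an upstream fetch stops after a fixed number of rows
before the limit operator skips the offset and takes the limit. Whether Hive
actually behaves this way is, as noted above, not verified.

```python
def rows_with_offset(rows, limit, offset, fetch_cap):
    """Toy model of `LIMIT limit OFFSET offset` behind an early-stopping fetch.

    fetch_cap is a hypothetical stand-in for the global limit variable:
    the upstream fetch stops after reading that many rows.
    """
    fetched = rows[:fetch_cap]              # upstream stops early at the cap
    return fetched[offset:offset + limit]   # skip the offset, then take the limit


rows = list(range(10))
limit, offset = 3, 2

# Cap that ignores the offset: only `limit` rows are ever fetched.
ignoring_offset = rows_with_offset(rows, limit, offset, fetch_cap=limit)

# Cap that accounts for the offset: `offset + limit` rows are fetched.
with_offset = rows_with_offset(rows, limit, offset, fetch_cap=offset + limit)

print(ignoring_offset)  # [2]
print(with_offset)      # [2, 3, 4]
```

In this model, a cap that ignores the offset returns fewer rows than the
query asks for, while a cap of offset + limit returns the expected rows.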
This issue is to set hive.limit.optimize.enable to _true_ by default, i.e., use
GlobalLimitOptimizer, and thus get rid of the variable set by
SemanticAnalyzer. There may be some gaps (cases covered by the SemanticAnalyzer
alternative but not by GlobalLimitOptimizer) that we will need to work
on.