Krisztian Kasa created HIVE-24479:
-------------------------------------
Summary: Introduce setting to set lower bound of hash aggregation reduction.
Key: HIVE-24479
URL: https://issues.apache.org/jira/browse/HIVE-24479
Project: Hive
Issue Type: Improvement
Components: Physical Optimizer
Affects Versions: 4.0.0
Reporter: Krisztian Kasa
Assignee: Krisztian Kasa
Fix For: 4.0.0
* The default minimum reduction ratio for hash group by is 0.99.
* During compilation, we estimate the achievable reduction from the statistics
(product of NDVs vs. number of rows) and lower the setting if the estimate is
smaller, in {{SetHashGroupByMinReduction}}:
{code}
float defaultMinReductionHashAggrFactor = desc.getMinReductionHashAggr();
float minReductionHashAggrFactor = 1f - ((float) ndvProduct / numRows);
if (minReductionHashAggrFactor < defaultMinReductionHashAggrFactor) {
  desc.setMinReductionHashAggr(minReductionHashAggrFactor);
}
{code}
For certain queries, this computation yields 0 (for example, when ndvProduct
equals numRows). This forces the operator to skip hash aggregation completely
and always choose streaming mode.
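
A minimal sketch of how a lower bound could keep the adjusted factor from collapsing to 0. It reuses the computation quoted above. The class name, method names and the {{lowerBound}} parameter are hypothetical illustrations, not the actual Hive change, and the sketch assumes numRows > 0.

```java
// Illustrative only: names below are assumptions, not Hive APIs.
class MinReductionSketch {

    // Same arithmetic as the quoted snippet: 1 - ndvProduct / numRows.
    static float estimate(long ndvProduct, long numRows) {
        return 1f - ((float) ndvProduct / numRows);
    }

    // Lower the default factor to the estimate, but never below lowerBound
    // (hypothetical setting) and never above the configured default.
    static float adjust(float defaultFactor, float lowerBound,
                        long ndvProduct, long numRows) {
        float estimated = estimate(ndvProduct, numRows);
        float clamped = Math.max(estimated, lowerBound);
        return Math.min(clamped, defaultFactor);
    }

    public static void main(String[] args) {
        // ndvProduct == numRows: the raw estimate is 0, the bound applies.
        System.out.println(adjust(0.99f, 0.4f, 1000L, 1000L));
        // Estimate lies between the bound and the default: estimate is used.
        System.out.println(adjust(0.99f, 0.4f, 100L, 1000L));
        // Estimate exceeds the default: the default is kept.
        System.out.println(adjust(0.99f, 0.4f, 1L, 1000L));
    }
}
```

With a bound of this kind, a statistics estimate of 0 would no longer disable hash aggregation outright; the operator would still attempt it and fall back to streaming only if the observed reduction stays under the bound.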
--
This message was sent by Atlassian Jira
(v8.3.4#803005)