Michael Smith has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/20379 )
Change subject: IMPALA-12383: Fix SingleNodePlanner aggregation limits
......................................................................

IMPALA-12383: Fix SingleNodePlanner aggregation limits

IMPALA-2581 added enforcement of the limit when adding entries to the
grouping aggregation. It would stop adding new entries if the number of
entries in the grouping aggregation was >= the limit. If the grouping
aggregation never contains more entries than the limit, then it would
not output more entries.

However, this limit was not enforced exactly when adding. It would add a
whole batch before checking the limit, so it can go past the limit. In
practice the exchange in a distributed aggregation would enforce limits,
so this would only show up when num_nodes=1. As a result, the following
query incorrectly returns 16 rows, not 10:

  set num_nodes=1;
  select distinct l_orderkey from tpch.lineitem limit 10;

One option is to be exact when adding items to the group aggregation,
which would require testing the limit on each row (we don't know which
are duplicates). This is awkward. Removing the limit on the output of
the aggregation also is not really needed for the original change
(stopping the children early once the limit is reached). Instead, we
restore the limit on the output of the grouping agg (which is already
known to work).

Testing:
- added a test case where we assert number of rows returned by an
  aggregation node (rather than an exchange or top-n).
- restores definition of ALL_CLUSTER_SIZES and makes it simpler to
  enable for individual test suites. Filed IMPALA-12394 to generally
  re-enable testing with ALL_CLUSTER_SIZES. Enables ALL_CLUSTER_SIZES
  for aggregation tests.
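The overshoot described in the commit message can be illustrated with a toy model. This is only a sketch, not Impala's implementation: the function name `distinct_batchwise`, the `cap_output` flag, and the batch size of 8 are all hypothetical, chosen so that the numbers echo the 16-versus-10 example in the commit message. It models a grouping aggregation that checks the limit only between batches, with and without a limit applied to the output.

```python
from typing import Hashable, Iterable, List


def distinct_batchwise(
    batches: Iterable[List[Hashable]], limit: int, cap_output: bool
) -> List[Hashable]:
    """Toy grouping aggregation (hypothetical; not Impala code).

    The limit is checked once per batch, before the batch is added,
    so a whole batch can push the group count past the limit.
    """
    groups = {}  # dict keeps insertion order, standing in for the hash table
    for batch in batches:
        if len(groups) >= limit:
            break  # stop pulling from the child once the limit is reached
        for key in batch:
            groups.setdefault(key, None)  # duplicates do not add groups
    rows = list(groups)
    # Applying the limit on the output is what keeps the result exact.
    return rows[:limit] if cap_output else rows


# Illustrative input: batches of 8 distinct keys (batch size is an assumption).
batches = [list(range(0, 8)), list(range(8, 16)), list(range(16, 24))]

overshoot = distinct_batchwise(batches, limit=10, cap_output=False)
exact = distinct_batchwise(batches, limit=10, cap_output=True)
print(len(overshoot), len(exact))  # prints "16 10"
```

In this model the early exit still saves work (the third batch is never read), while the output cap guarantees the row count, which mirrors the approach the commit message describes of keeping early termination and restoring the output limit.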

Change-Id: Ic5eec1190e8e182152aa954897b79cc3f219c816
Reviewed-on: http://gerrit.cloudera.org:8080/20379
Tested-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Reviewed-by: Joe McDonnell <joemcdonn...@cloudera.com>
---
M be/src/exec/aggregation-node-base.cc
M be/src/exec/grouping-aggregator.cc
M be/src/exec/grouping-aggregator.h
M tests/common/impala_test_suite.py
M tests/common/test_dimensions.py
M tests/query_test/test_aggregation.py
6 files changed, 31 insertions(+), 22 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Joe McDonnell: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/20379
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ic5eec1190e8e182152aa954897b79cc3f219c816
Gerrit-Change-Number: 20379
Gerrit-PatchSet: 11
Gerrit-Owner: Michael Smith <michael.sm...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <joemcdonn...@cloudera.com>
Gerrit-Reviewer: Michael Smith <michael.sm...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qfc...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <huangquanl...@gmail.com>