Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20379 )

Change subject: IMPALA-12383: Fix SingleNodePlanner aggregation limits
......................................................................


Patch Set 9:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/20379/9//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/20379/9//COMMIT_MSG@9
PS9, Line 9: When IMPALA-2581 was implemented, it assumed all aggregation nodes would
           : have a pre-aggregation step that limits could be pushed to and therefore
           : removed limits on the final aggregation step. All distributed
           : aggregations have an exchange, and in practice the exchange would
           : enforce limits.
Here's what I think the thought process was:
IMPALA-2581 added enforcement of the limit when adding entries to the grouping
aggregation. It would stop adding new entries once the number of entries in the
grouping aggregation was >= the limit. If the grouping aggregation never
contained more entries than the limit, it could never output more entries than
the limit.

However, this limit was not enforced exactly when adding. The aggregation adds
a whole batch before checking the limit, so the number of entries can exceed
the limit.
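
To illustrate the overshoot, here is a minimal Python sketch. It is a toy model,
not Impala's actual code: the function name add_batches and the dict-based
grouping are assumptions made for illustration only. The limit is checked once
per batch, so a single batch with many distinct keys pushes the group count
past the limit.

```python
def add_batches(batches, limit):
    """Toy model (hypothetical): a grouping agg that checks the limit per batch."""
    groups = {}
    for batch in batches:
        # The limit is only tested between batches, never between rows.
        if len(groups) >= limit:
            break
        for key in batch:
            # Duplicate keys update an existing group; new keys create one.
            groups[key] = groups.get(key, 0) + 1
    return groups


groups = add_batches([[1, 2, 3, 4], [5, 6]], limit=2)
print(len(groups))  # 4: the first batch alone creates more groups than the limit
```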

One option is to enforce the limit exactly when adding items to the grouping
agg, which would require testing the limit on each row (we don't know in
advance which rows are duplicates). This is awkward. Removing the limit on the
output of the aggregation is also not really needed for the original change,
whose purpose was to stop the children early once the limit is reached.

Instead, we restore the limit on the output of the grouping agg (which is 
already known to work).
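
A rough Python sketch of that combination follows, again as a toy model rather
than Impala's implementation. The names grouping_agg and output_with_limit are
hypothetical. The per-batch check on the input side still stops the children
early, and the limit on the output side guarantees that no more than `limit`
groups are returned even if the table overshot.

```python
from itertools import islice


def grouping_agg(batches, limit):
    """Toy model (hypothetical): early stop at batch granularity, may overshoot."""
    groups = {}
    for batch in batches:
        if len(groups) >= limit:
            break  # stop consuming input once the limit is reached
        for key in batch:
            groups[key] = groups.get(key, 0) + 1
    return groups


def output_with_limit(groups, limit):
    """Apply the limit again on output so at most `limit` groups are emitted."""
    return list(islice(groups.items(), limit))


groups = grouping_agg([[1, 2, 3, 4], [5, 6]], limit=2)
rows = output_with_limit(groups, limit=2)
print(len(groups), len(rows))  # 4 2
```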



--
To view, visit http://gerrit.cloudera.org:8080/20379
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic5eec1190e8e182152aa954897b79cc3f219c816
Gerrit-Change-Number: 20379
Gerrit-PatchSet: 9
Gerrit-Owner: Michael Smith <michael.sm...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <joemcdonn...@cloudera.com>
Gerrit-Reviewer: Michael Smith <michael.sm...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qfc...@hotmail.com>
Gerrit-Reviewer: Quanlong Huang <huangquanl...@gmail.com>
Gerrit-Comment-Date: Thu, 07 Sep 2023 16:34:14 +0000
Gerrit-HasComments: Yes
