Fix rowcount estimate for gather (merge) paths

In the case of a parallel plan, when computing the number of tuples
processed per worker, we divide the total number of tuples by the
parallel_divisor obtained from get_parallel_divisor(), which accounts
for the leader's contribution in addition to the number of workers.
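
To make the arithmetic concrete, here is a minimal standalone sketch of
the divide-then-multiply round trip. It is not the patch itself: the
function names are hypothetical, and the leader-contribution formula
(1.0 minus 0.3 per worker, ignored once it drops to zero or below) is an
assumption about how get_parallel_divisor() weighs the leader.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model of the parallel divisor: the worker count plus the
 * leader's fractional contribution.  The 0.3-per-worker discount is an
 * assumption made for illustration, not taken from this commit message.
 */
double
sketch_parallel_divisor(int parallel_workers, bool leader_participates)
{
    double divisor = parallel_workers;

    assert(parallel_workers > 0);

    if (leader_participates)
    {
        double leader_contribution = 1.0 - 0.3 * parallel_workers;

        /* The leader only counts while its share is still positive. */
        if (leader_contribution > 0)
            divisor += leader_contribution;
    }
    return divisor;
}

/* Per-worker estimate: total rows divided by the divisor. */
double
sketch_rows_per_worker(double total_rows, int workers, bool leader)
{
    return total_rows / sketch_parallel_divisor(workers, leader);
}

/* Multiplying back by the worker count drops the leader's share. */
double
sketch_gather_rows_by_workers(double rows_per_worker, int workers)
{
    return rows_per_worker * workers;
}

/* Multiplying back by the same divisor reverses the division. */
double
sketch_gather_rows_by_divisor(double rows_per_worker, int workers, bool leader)
{
    return rows_per_worker * sketch_parallel_divisor(workers, leader);
}
```

Under these assumptions, 2400 rows with two workers and a participating
leader give a divisor of about 2.4 and about 1000 rows per worker.
Multiplying back by the worker count yields about 2000 rows, while
multiplying by the divisor recovers about 2400. With four workers the
leader's share in this model is no longer positive, so the two results
coincide.
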
Accordingly, when estimating the number of tuples for gather (merge)
nodes, we should multiply the number of tuples per worker by the same
parallel_divisor to reverse the division.

However, currently we use parallel_workers rather than parallel_divisor
for the multiplication. This could result in an underestimation of the
number of tuples for gather (merge) nodes, especially when there are
fewer than four workers.

This patch fixes this issue by using the same parallel_divisor for the
multiplication. There is one ensuing plan change in the regression
tests, but it looks reasonable and does not compromise its original
purpose of testing parallel-aware hash join.

In passing, this patch removes an unnecessary assignment for path.rows
in create_gather_merge_path, and fixes an uninitialized-variable issue
in generate_useful_gather_paths.

No backpatch as this could result in plan changes.

Author: Anthonin Bonnefoy
Reviewed-by: Rafia Sabih, Richard Guo
Discussion: https://postgr.es/m/CAO6_Xqr9+51NxgO=XospEkUeAg-p=ejawmtpdczwjrggkj5...@mail.gmail.com

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/581df2148737fdb0ba6f2d8fda5ceb9d1e6302e6

Modified Files
--------------
src/backend/optimizer/path/allpaths.c   |  7 +++----
src/backend/optimizer/path/costsize.c   | 18 ++++++++++++++++++
src/backend/optimizer/plan/planner.c    |  7 ++-----
src/backend/optimizer/util/pathnode.c   |  1 -
src/include/optimizer/cost.h            |  1 +
src/test/regress/expected/join_hash.out | 19 +++++++++----------
6 files changed, 33 insertions(+), 20 deletions(-)