Re: MergeAppend could consider sorting cheapest child path

Andrei Lepikhov Tue, 22 Jul 2025 04:14:03 -0700

On 4/6/2025 00:41, Alexander Korotkov wrote:

On Tue, Jun 3, 2025 at 5:35 PM Andrei Lepikhov <lepi...@gmail.com> wrote:

For me, it seems like a continuation of the 7d8ac98 discussion. We may
charge a small fee for MergeAppend to adjust the balance, of course.
However, I think this small change requires a series of benchmarks to
determine how it affects the overall cost balance. Without examples it
is hard to say how important this issue is and its worthiness to
commence such work.


Yes, I think it's fair to charge the MergeAppend node.  We currently
cost it similarly to Sort merge stage, but it's clearly more
expensive.  It dealing on the executor level dealing with Slot's etc,
while Sort node have a set of lower level optimizations.

After conducting additional research, I concluded that you are correct, and the current cost model doesn't allow the optimiser to detect the best option. A simple test with a full scan and sort of a partitioned table (see attachment) shows that the query plan prefers small sortings merged by the MergeAppend node. I have got the following results for different numbers of tuples to be sorted (in the range from 10 tuples to 1E8):


EXPLAIN SELECT * FROM test ORDER BY y;

1E1: Sort  (cost=9.53..9.57 rows=17 width=8)
1E2: Sort  (cost=20.82..21.07 rows=100)
1E3: Merge Append  (cost=56.19..83.69 rows=1000)
1E4: Merge Append  (cost=612.74..887.74 rows=10000)
1E5: Merge Append  (cost=7754.19..10504.19 rows=100000)
1E6: Merge Append  (cost=94092.25..121592.25 rows=1000000)
1E7: Merge Append  (cost=1106931.22..1381931.22 rows=10000000)
1E8: Merge Append  (cost=14097413.40..16847413.40 rows=100000000)

That happens because both total costs lie within the fuzzy factor gap, and the optimiser chooses the path based on the startup cost, which is obviously better for the MergeAppend case.

At the same time, execution, involving a single Sort node, dominates the MergeAppend case:


1E3: MergeAppend: 1.927 ms, Sort: 0.720 ms
1E4: MergeAppend: 10.090 ms, Sort: 7.583 ms
1E5: MergeAppend: 118.885 ms, Sort: 88.492 ms
1E6: MergeAppend: 1372.717 ms, Sort: 1106.184 ms
1E7: MergeAppend: 15103.893 ms, Sort: 13415.806 ms
1E8: MergeAppend: 176553.133 ms, Sort: 149458.787 ms

Looking inside the code, I found the only difference we can employ to rationalise the cost model change: re-structuring of a heap. The siftDown routine employs two tuple comparisons to find the proper position for a tuple. So, we have objections to changing the constant in the cost model of the merge operation:


@@ -2448,7 +2448,7 @@ cost_merge_append(Path *path, PlannerInfo *root,
        logN = LOG2(N);

        /* Assumed cost per tuple comparison */
-       comparison_cost = 2.0 * cpu_operator_cost;
+       comparison_cost = 4.0 * cpu_operator_cost;

        /* Heap creation cost */
        startup_cost += comparison_cost * N * logN;

The exact change also needs to be made in the cost_gather_merge function, of course. At this moment, I'm not sure that it should be multiplied by 2 - it is a subject for further discussion. However, it fixes the current issue and adds a little additional cost to the merge operation, which, as I see it, is quite limited.

Please see the new version of the patch attached.

--
regards, Andrei Lepikhov

sort-vs-mergeappend.sql
Description: application/sql

From 588607223f21622f1b0225c95b8fa200165c62eb Mon Sep 17 00:00:00 2001
From: "Andrei V. Lepikhov" <lepi...@gmail.com>
Date: Tue, 3 Jun 2025 11:37:23 +0200
Subject: [PATCH v7] Consider an explicit sort of the MergeAppend subpaths.

Broaden the optimiser's search scope slightly: when retrieving optimal subpaths
that match pathkeys for the planning MergeAppend, also consider the case of
an overall optimal path that includes an explicit Sort node at the top.

It may provide a more effective plan in both full and fractional scan cases:
1. The Sort node may be pushed down to subpaths under a parallel or async 
Append.
2. The case when a minor set of subpaths doesn't have a proper index, and it
is profitable to sort them instead of switching to plain Append.

Having implemented that strategy, it became clear that the cost of multiple
small sortings merged by a single MergeAppend node exceeds that of a single Sort
operation over a plain Append. The code and benchmarks demonstrate that such
an assumption is incorrect because the Sort operator has optimisations that work
faster than a MergeAppend.
To arrange the cost model, change the merge cost multiplier, considering that
heap rebuilding needs two comparison operations.
---
 src/backend/optimizer/path/allpaths.c         |  47 ++--
 src/backend/optimizer/path/costsize.c         |   6 +-
 src/backend/optimizer/path/pathkeys.c         | 115 ++++++++++
 src/include/optimizer/paths.h                 |  10 +
 src/test/regress/expected/inherit.out         |  34 +--
 .../regress/expected/partition_aggregate.out  |  32 ++-
 src/test/regress/expected/partition_join.out  | 203 ++++++++++--------
 src/test/regress/expected/partition_prune.out |  16 +-
 src/test/regress/expected/union.out           |  69 +++---
 src/test/regress/sql/inherit.sql              |   4 +-
 10 files changed, 334 insertions(+), 202 deletions(-)

diff --git a/src/backend/optimizer/path/allpaths.c 
b/src/backend/optimizer/path/allpaths.c
index 6cc6966b060..65ea9d477b0 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -1855,29 +1855,18 @@ generate_orderedappend_paths(PlannerInfo *root, 
RelOptInfo *rel,
 
                        /* Locate the right paths, if they are available. */
                        cheapest_startup =
-                               
get_cheapest_path_for_pathkeys(childrel->pathlist,
-                                                                               
           pathkeys,
-                                                                               
           NULL,
-                                                                               
           STARTUP_COST,
-                                                                               
           false);
+                               get_cheapest_path_for_pathkeys_ext(root, 
childrel, pathkeys,
+                                                                               
                   NULL, STARTUP_COST, false);
                        cheapest_total =
-                               
get_cheapest_path_for_pathkeys(childrel->pathlist,
-                                                                               
           pathkeys,
-                                                                               
           NULL,
-                                                                               
           TOTAL_COST,
-                                                                               
           false);
+                               get_cheapest_path_for_pathkeys_ext(root, 
childrel, pathkeys,
+                                                                               
                   NULL, TOTAL_COST, false);
 
                        /*
-                        * If we can't find any paths with the right order just 
use the
-                        * cheapest-total path; we'll have to sort it later.
+                        * In accordance to current planning logic there are no
+                        * parameterised paths under a merge append.
                         */
-                       if (cheapest_startup == NULL || cheapest_total == NULL)
-                       {
-                               cheapest_startup = cheapest_total =
-                                       childrel->cheapest_total_path;
-                               /* Assert we do have an unparameterized path 
for this child */
-                               Assert(cheapest_total->param_info == NULL);
-                       }
+                       Assert(cheapest_startup != NULL && cheapest_total != 
NULL);
+                       Assert(cheapest_total->param_info == NULL);
 
                        /*
                         * When building a fractional path, determine a cheapest
@@ -1904,21 +1893,17 @@ generate_orderedappend_paths(PlannerInfo *root, 
RelOptInfo *rel,
                                        path_fraction /= childrel->rows;
 
                                cheapest_fractional =
-                                       
get_cheapest_fractional_path_for_pathkeys(childrel->pathlist,
-                                                                               
                                          pathkeys,
-                                                                               
                                          NULL,
-                                                                               
                                          path_fraction);
+                                       
get_cheapest_fractional_path_for_pathkeys_ext(root,
+                                                                               
                                                  childrel,
+                                                                               
                                                  pathkeys,
+                                                                               
                                                  NULL,
+                                                                               
                                                  path_fraction);
 
                                /*
-                                * If we found no path with matching pathkeys, 
use the
-                                * cheapest total path instead.
-                                *
-                                * XXX We might consider partially sorted paths 
too (with an
-                                * incremental sort on top). But we'd have to 
build all the
-                                * incremental paths, do the costing etc.
+                                * In accordance to current planning logic 
there are no
+                                * parameterised fractional paths under a merge 
append.
                                 */
-                               if (!cheapest_fractional)
-                                       cheapest_fractional = cheapest_total;
+                               Assert(cheapest_fractional != NULL);
                        }
 
                        /*
diff --git a/src/backend/optimizer/path/costsize.c 
b/src/backend/optimizer/path/costsize.c
index 1f04a2c182c..d22eb8468a4 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -512,7 +512,7 @@ cost_gather_merge(GatherMergePath *path, PlannerInfo *root,
        logN = LOG2(N);
 
        /* Assumed cost per tuple comparison */
-       comparison_cost = 2.0 * cpu_operator_cost;
+       comparison_cost = 4.0 * cpu_operator_cost;
 
        /* Heap creation cost */
        startup_cost += comparison_cost * N * logN;
@@ -1965,7 +1965,7 @@ cost_tuplesort(Cost *startup_cost, Cost *run_cost,
                 * factor is a bit higher than for quicksort.  Tweak it so that 
the
                 * cost curve is continuous at the crossover point.
                 */
-               *startup_cost = comparison_cost * tuples * LOG2(2.0 * 
output_tuples);
+               *startup_cost = 2.0 * comparison_cost * tuples * LOG2(2.0 * 
output_tuples);
        }
        else
        {
@@ -2474,7 +2474,7 @@ cost_merge_append(Path *path, PlannerInfo *root,
        logN = LOG2(N);
 
        /* Assumed cost per tuple comparison */
-       comparison_cost = 2.0 * cpu_operator_cost;
+       comparison_cost = 4.0 * cpu_operator_cost;
 
        /* Heap creation cost */
        startup_cost += comparison_cost * N * logN;
diff --git a/src/backend/optimizer/path/pathkeys.c 
b/src/backend/optimizer/path/pathkeys.c
index 8b04d40d36d..3a13b3d02ee 100644
--- a/src/backend/optimizer/path/pathkeys.c
+++ b/src/backend/optimizer/path/pathkeys.c
@@ -19,11 +19,13 @@
 
 #include "access/stratnum.h"
 #include "catalog/pg_opfamily.h"
+#include "miscadmin.h"
 #include "nodes/nodeFuncs.h"
 #include "optimizer/cost.h"
 #include "optimizer/optimizer.h"
 #include "optimizer/pathnode.h"
 #include "optimizer/paths.h"
+#include "optimizer/planner.h"
 #include "partitioning/partbounds.h"
 #include "rewrite/rewriteManip.h"
 #include "utils/lsyscache.h"
@@ -648,6 +650,64 @@ get_cheapest_path_for_pathkeys(List *paths, List *pathkeys,
        return matched_path;
 }
 
+/*
+ * get_cheapest_path_for_pathkeys_ext
+ *       Calls get_cheapest_path_for_pathkeys to obtain cheapest path that
+ *       satisfies defined criterias and сonsiders one more option: choose
+ *       overall-optimal path (according the criterion) and explicitly sort its
+ *       output to satisfy the pathkeys.
+ *
+ *       Caller is responsible to insert corresponding sort path at the top of
+ *       returned path if it will be chosen to be used.
+ *
+ *       Return NULL if no such path.
+ */
+Path *
+get_cheapest_path_for_pathkeys_ext(PlannerInfo *root, RelOptInfo *rel,
+                                                                  List 
*pathkeys, Relids required_outer,
+                                                                  CostSelector 
cost_criterion,
+                                                                  bool 
require_parallel_safe)
+{
+       Path            sort_path;
+       Path       *base_path = rel->cheapest_total_path;
+       Path       *path;
+
+       /* In generate_orderedappend_paths() all childrels do have some paths */
+       Assert(base_path);
+
+       path = get_cheapest_path_for_pathkeys(rel->pathlist, pathkeys,
+                                                                               
  required_outer, cost_criterion,
+                                                                               
  require_parallel_safe);
+
+       /*
+        * Stop here if the cheapest total path doesn't satisfy necessary
+        * conditions
+        */
+       if ((require_parallel_safe && !base_path->parallel_safe) ||
+               !bms_is_subset(PATH_REQ_OUTER(base_path), required_outer))
+               return path;
+
+       if (path == NULL)
+
+               /*
+                * Current pathlist doesn't fit the pathkeys. No need to check 
extra
+                * sort path ways.
+                */
+               return base_path;
+
+       /* Consider the cheapest total path with extra sort */
+       if (path != base_path)
+       {
+               cost_sort(&sort_path, root, pathkeys, base_path->disabled_nodes,
+                                 base_path->total_cost, base_path->rows,
+                                 base_path->pathtarget->width, 0.0, work_mem, 
-1.0);
+
+               if (compare_path_costs(&sort_path, path, cost_criterion) < 0)
+                       return base_path;
+       }
+       return path;
+}
+
 /*
  * get_cheapest_fractional_path_for_pathkeys
  *       Find the cheapest path (for retrieving a specified fraction of all
@@ -690,6 +750,61 @@ get_cheapest_fractional_path_for_pathkeys(List *paths,
        return matched_path;
 }
 
+/*
+ * get_cheapest_fractional_path_for_pathkeys_ext
+ *       obtain cheapest fractional path that satisfies defined criterias 
excluding
+ *       pathkeys and explicitly sort its output to satisfy the pathkeys.
+ *
+ *       Caller is responsible to insert corresponding sort path at the top of
+ *       returned path if it will be chosen to be used.
+ *
+ *       Return NULL if no such path.
+ */
+Path *
+get_cheapest_fractional_path_for_pathkeys_ext(PlannerInfo *root,
+                                                                               
          RelOptInfo *rel,
+                                                                               
          List *pathkeys,
+                                                                               
          Relids required_outer,
+                                                                               
          double fraction)
+{
+       Path            sort_path;
+       Path       *base_path = rel->cheapest_total_path;
+       Path       *path;
+
+       /* In generate_orderedappend_paths() all childrels do have some paths */
+       Assert(base_path);
+
+       path = get_cheapest_fractional_path_for_pathkeys(rel->pathlist, 
pathkeys,
+                                                                               
                         required_outer, fraction);
+
+       /*
+        * Stop here if the cheapest total path doesn't satisfy necessary
+        * conditions
+        */
+       if (!bms_is_subset(PATH_REQ_OUTER(base_path), required_outer))
+               return path;
+
+       if (path == NULL)
+
+               /*
+                * Current pathlist doesn't fit the pathkeys. No need to check 
extra
+                * sort path ways.
+                */
+               return base_path;
+
+       /* Consider the cheapest total path with extra sort */
+       if (path != base_path)
+       {
+               cost_sort(&sort_path, root, pathkeys, base_path->disabled_nodes,
+                                 base_path->total_cost, base_path->rows,
+                                 base_path->pathtarget->width, 0.0, work_mem, 
-1.0);
+
+               if (compare_fractional_path_costs(&sort_path, path, fraction) 
<= 0)
+                       return base_path;
+       }
+
+       return path;
+}
 
 /*
  * get_cheapest_parallel_safe_total_inner
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index 8410531f2d6..26cd4585c97 100644
--- a/src/include/optimizer/paths.h
+++ b/src/include/optimizer/paths.h
@@ -222,10 +222,20 @@ extern Path *get_cheapest_path_for_pathkeys(List *paths, 
List *pathkeys,
                                                                                
        Relids required_outer,
                                                                                
        CostSelector cost_criterion,
                                                                                
        bool require_parallel_safe);
+extern Path *get_cheapest_path_for_pathkeys_ext(PlannerInfo *root,
+                                                                               
                RelOptInfo *rel, List *pathkeys,
+                                                                               
                Relids required_outer,
+                                                                               
                CostSelector cost_criterion,
+                                                                               
                bool require_parallel_safe);
 extern Path *get_cheapest_fractional_path_for_pathkeys(List *paths,
                                                                                
                           List *pathkeys,
                                                                                
                           Relids required_outer,
                                                                                
                           double fraction);
+extern Path *get_cheapest_fractional_path_for_pathkeys_ext(PlannerInfo *root,
+                                                                               
                                   RelOptInfo *rel,
+                                                                               
                                   List *pathkeys,
+                                                                               
                                   Relids required_outer,
+                                                                               
                                   double fraction);
 extern Path *get_cheapest_parallel_safe_total_inner(List *paths);
 extern List *build_index_pathkeys(PlannerInfo *root, IndexOptInfo *index,
                                                                  ScanDirection 
scandir);
diff --git a/src/test/regress/expected/inherit.out 
b/src/test/regress/expected/inherit.out
index 5b5055babdc..bb6125090bd 100644
--- a/src/test/regress/expected/inherit.out
+++ b/src/test/regress/expected/inherit.out
@@ -1665,11 +1665,13 @@ insert into matest2 (name) values ('Test 3');
 insert into matest2 (name) values ('Test 4');
 insert into matest3 (name) values ('Test 5');
 insert into matest3 (name) values ('Test 6');
-set enable_indexscan = off;  -- force use of seqscan/sort, so no merge
+set enable_indexscan = off;  -- force use of seqscan/sort
+set enable_sort = off; -- since merge append may employ sort in children we 
need to disable sort
 explain (verbose, costs off) select * from matest0 order by 1-id;
                          QUERY PLAN                         
 ------------------------------------------------------------
  Sort
+   Disabled: true
    Output: matest0.id, matest0.name, ((1 - matest0.id))
    Sort Key: ((1 - matest0.id))
    ->  Result
@@ -1683,7 +1685,7 @@ explain (verbose, costs off) select * from matest0 order 
by 1-id;
                      Output: matest0_3.id, matest0_3.name
                ->  Seq Scan on public.matest3 matest0_4
                      Output: matest0_4.id, matest0_4.name
-(14 rows)
+(15 rows)
 
 select * from matest0 order by 1-id;
  id |  name  
@@ -1719,6 +1721,7 @@ select min(1-id) from matest0;
 (1 row)
 
 reset enable_indexscan;
+reset enable_sort;
 set enable_seqscan = off;  -- plan with fewest seqscans should be merge
 set enable_parallel_append = off; -- Don't let parallel-append interfere
 explain (verbose, costs off) select * from matest0 order by 1-id;
@@ -1844,16 +1847,20 @@ order by t1.b limit 10;
          Merge Cond: (t1.b = t2.b)
          ->  Merge Append
                Sort Key: t1.b
-               ->  Index Scan using matest0i on matest0 t1_1
+               ->  Sort
+                     Sort Key: t1_1.b
+                     ->  Seq Scan on matest0 t1_1
                ->  Index Scan using matest1i on matest1 t1_2
          ->  Materialize
                ->  Merge Append
                      Sort Key: t2.b
-                     ->  Index Scan using matest0i on matest0 t2_1
-                           Filter: (c = d)
+                     ->  Sort
+                           Sort Key: t2_1.b
+                           ->  Seq Scan on matest0 t2_1
+                                 Filter: (c = d)
                      ->  Index Scan using matest1i on matest1 t2_2
                            Filter: (c = d)
-(14 rows)
+(18 rows)
 
 reset enable_nestloop;
 drop table matest0 cascade;
@@ -1867,17 +1874,16 @@ analyze matest0;
 analyze matest1;
 explain (costs off)
 select * from matest0 where a < 100 order by a;
-                          QUERY PLAN                           
----------------------------------------------------------------
- Merge Append
+                QUERY PLAN                 
+-------------------------------------------
+ Sort
    Sort Key: matest0.a
-   ->  Index Only Scan using matest0_pkey on matest0 matest0_1
-         Index Cond: (a < 100)
-   ->  Sort
-         Sort Key: matest0_2.a
+   ->  Append
+         ->  Seq Scan on matest0 matest0_1
+               Filter: (a < 100)
          ->  Seq Scan on matest1 matest0_2
                Filter: (a < 100)
-(8 rows)
+(7 rows)
 
 drop table matest0 cascade;
 NOTICE:  drop cascades to table matest1
diff --git a/src/test/regress/expected/partition_aggregate.out 
b/src/test/regress/expected/partition_aggregate.out
index 5f2c0cf5786..9245506279b 100644
--- a/src/test/regress/expected/partition_aggregate.out
+++ b/src/test/regress/expected/partition_aggregate.out
@@ -1380,28 +1380,26 @@ SELECT x, sum(y), avg(y), count(*) FROM pagg_tab_para 
GROUP BY x HAVING avg(y) <
 -- When GROUP BY clause does not match; partial aggregation is performed for 
each partition.
 EXPLAIN (COSTS OFF)
 SELECT y, sum(x), avg(x), count(*) FROM pagg_tab_para GROUP BY y HAVING avg(x) 
< 12 ORDER BY 1, 2, 3;
-                                        QUERY PLAN                             
            
--------------------------------------------------------------------------------------------
+                                     QUERY PLAN                                
      
+-------------------------------------------------------------------------------------
  Sort
    Sort Key: pagg_tab_para.y, (sum(pagg_tab_para.x)), (avg(pagg_tab_para.x))
-   ->  Finalize GroupAggregate
+   ->  Finalize HashAggregate
          Group Key: pagg_tab_para.y
          Filter: (avg(pagg_tab_para.x) < '12'::numeric)
-         ->  Gather Merge
+         ->  Gather
                Workers Planned: 2
-               ->  Sort
-                     Sort Key: pagg_tab_para.y
-                     ->  Parallel Append
-                           ->  Partial HashAggregate
-                                 Group Key: pagg_tab_para.y
-                                 ->  Parallel Seq Scan on pagg_tab_para_p1 
pagg_tab_para
-                           ->  Partial HashAggregate
-                                 Group Key: pagg_tab_para_1.y
-                                 ->  Parallel Seq Scan on pagg_tab_para_p2 
pagg_tab_para_1
-                           ->  Partial HashAggregate
-                                 Group Key: pagg_tab_para_2.y
-                                 ->  Parallel Seq Scan on pagg_tab_para_p3 
pagg_tab_para_2
-(19 rows)
+               ->  Parallel Append
+                     ->  Partial HashAggregate
+                           Group Key: pagg_tab_para.y
+                           ->  Parallel Seq Scan on pagg_tab_para_p1 
pagg_tab_para
+                     ->  Partial HashAggregate
+                           Group Key: pagg_tab_para_1.y
+                           ->  Parallel Seq Scan on pagg_tab_para_p2 
pagg_tab_para_1
+                     ->  Partial HashAggregate
+                           Group Key: pagg_tab_para_2.y
+                           ->  Parallel Seq Scan on pagg_tab_para_p3 
pagg_tab_para_2
+(17 rows)
 
 SELECT y, sum(x), avg(x), count(*) FROM pagg_tab_para GROUP BY y HAVING avg(x) 
< 12 ORDER BY 1, 2, 3;
  y  |  sum  |         avg         | count 
diff --git a/src/test/regress/expected/partition_join.out 
b/src/test/regress/expected/partition_join.out
index d5368186caa..601b951d1e7 100644
--- a/src/test/regress/expected/partition_join.out
+++ b/src/test/regress/expected/partition_join.out
@@ -65,31 +65,34 @@ SELECT t1.a, t1.c, t2.b, t2.c FROM prt1 t1, prt2 t2 WHERE 
t1.a = t2.b AND t1.b =
 -- inner join with partially-redundant join clauses
 EXPLAIN (COSTS OFF)
 SELECT t1.a, t1.c, t2.b, t2.c FROM prt1 t1, prt2 t2 WHERE t1.a = t2.a AND t1.a 
= t2.b ORDER BY t1.a, t2.b;
-                          QUERY PLAN                           
----------------------------------------------------------------
- Sort
+                       QUERY PLAN                        
+---------------------------------------------------------
+ Merge Append
    Sort Key: t1.a
-   ->  Append
-         ->  Merge Join
-               Merge Cond: (t1_1.a = t2_1.a)
-               ->  Index Scan using iprt1_p1_a on prt1_p1 t1_1
-               ->  Sort
-                     Sort Key: t2_1.b
-                     ->  Seq Scan on prt2_p1 t2_1
-                           Filter: (a = b)
+   ->  Merge Join
+         Merge Cond: (t1_1.a = t2_1.a)
+         ->  Index Scan using iprt1_p1_a on prt1_p1 t1_1
+         ->  Sort
+               Sort Key: t2_1.b
+               ->  Seq Scan on prt2_p1 t2_1
+                     Filter: (a = b)
+   ->  Sort
+         Sort Key: t1_2.a
          ->  Hash Join
                Hash Cond: (t1_2.a = t2_2.a)
                ->  Seq Scan on prt1_p2 t1_2
                ->  Hash
                      ->  Seq Scan on prt2_p2 t2_2
                            Filter: (a = b)
+   ->  Sort
+         Sort Key: t1_3.a
          ->  Hash Join
                Hash Cond: (t1_3.a = t2_3.a)
                ->  Seq Scan on prt1_p3 t1_3
                ->  Hash
                      ->  Seq Scan on prt2_p3 t2_3
                            Filter: (a = b)
-(22 rows)
+(25 rows)
 
 SELECT t1.a, t1.c, t2.b, t2.c FROM prt1 t1, prt2 t2 WHERE t1.a = t2.a AND t1.a 
= t2.b ORDER BY t1.a, t2.b;
  a  |  c   | b  |  c   
@@ -643,52 +646,41 @@ EXPLAIN (COSTS OFF)
 SELECT a, b FROM prt1 FULL JOIN prt2 p2(b,a,c) USING(a,b)
   WHERE a BETWEEN 490 AND 510
   GROUP BY 1, 2 ORDER BY 1, 2;
-                                                   QUERY PLAN                  
                                  
------------------------------------------------------------------------------------------------------------------
+                                                QUERY PLAN                     
                            
+-----------------------------------------------------------------------------------------------------------
  Group
    Group Key: (COALESCE(prt1.a, p2.a)), (COALESCE(prt1.b, p2.b))
-   ->  Merge Append
+   ->  Sort
          Sort Key: (COALESCE(prt1.a, p2.a)), (COALESCE(prt1.b, p2.b))
-         ->  Group
-               Group Key: (COALESCE(prt1.a, p2.a)), (COALESCE(prt1.b, p2.b))
-               ->  Sort
-                     Sort Key: (COALESCE(prt1.a, p2.a)), (COALESCE(prt1.b, 
p2.b))
-                     ->  Merge Full Join
-                           Merge Cond: ((prt1.a = p2.a) AND (prt1.b = p2.b))
-                           Filter: ((COALESCE(prt1.a, p2.a) >= 490) AND 
(COALESCE(prt1.a, p2.a) <= 510))
-                           ->  Sort
-                                 Sort Key: prt1.a, prt1.b
-                                 ->  Seq Scan on prt1_p1 prt1
-                           ->  Sort
-                                 Sort Key: p2.a, p2.b
-                                 ->  Seq Scan on prt2_p1 p2
-         ->  Group
-               Group Key: (COALESCE(prt1_1.a, p2_1.a)), (COALESCE(prt1_1.b, 
p2_1.b))
-               ->  Sort
-                     Sort Key: (COALESCE(prt1_1.a, p2_1.a)), 
(COALESCE(prt1_1.b, p2_1.b))
-                     ->  Merge Full Join
-                           Merge Cond: ((prt1_1.a = p2_1.a) AND (prt1_1.b = 
p2_1.b))
-                           Filter: ((COALESCE(prt1_1.a, p2_1.a) >= 490) AND 
(COALESCE(prt1_1.a, p2_1.a) <= 510))
-                           ->  Sort
-                                 Sort Key: prt1_1.a, prt1_1.b
-                                 ->  Seq Scan on prt1_p2 prt1_1
-                           ->  Sort
-                                 Sort Key: p2_1.a, p2_1.b
-                                 ->  Seq Scan on prt2_p2 p2_1
-         ->  Group
-               Group Key: (COALESCE(prt1_2.a, p2_2.a)), (COALESCE(prt1_2.b, 
p2_2.b))
-               ->  Sort
-                     Sort Key: (COALESCE(prt1_2.a, p2_2.a)), 
(COALESCE(prt1_2.b, p2_2.b))
-                     ->  Merge Full Join
-                           Merge Cond: ((prt1_2.a = p2_2.a) AND (prt1_2.b = 
p2_2.b))
-                           Filter: ((COALESCE(prt1_2.a, p2_2.a) >= 490) AND 
(COALESCE(prt1_2.a, p2_2.a) <= 510))
-                           ->  Sort
-                                 Sort Key: prt1_2.a, prt1_2.b
-                                 ->  Seq Scan on prt1_p3 prt1_2
-                           ->  Sort
-                                 Sort Key: p2_2.a, p2_2.b
-                                 ->  Seq Scan on prt2_p3 p2_2
-(43 rows)
+         ->  Append
+               ->  Merge Full Join
+                     Merge Cond: ((prt1_1.a = p2_1.a) AND (prt1_1.b = p2_1.b))
+                     Filter: ((COALESCE(prt1_1.a, p2_1.a) >= 490) AND 
(COALESCE(prt1_1.a, p2_1.a) <= 510))
+                     ->  Sort
+                           Sort Key: prt1_1.a, prt1_1.b
+                           ->  Seq Scan on prt1_p1 prt1_1
+                     ->  Sort
+                           Sort Key: p2_1.a, p2_1.b
+                           ->  Seq Scan on prt2_p1 p2_1
+               ->  Merge Full Join
+                     Merge Cond: ((prt1_2.a = p2_2.a) AND (prt1_2.b = p2_2.b))
+                     Filter: ((COALESCE(prt1_2.a, p2_2.a) >= 490) AND 
(COALESCE(prt1_2.a, p2_2.a) <= 510))
+                     ->  Sort
+                           Sort Key: prt1_2.a, prt1_2.b
+                           ->  Seq Scan on prt1_p2 prt1_2
+                     ->  Sort
+                           Sort Key: p2_2.a, p2_2.b
+                           ->  Seq Scan on prt2_p2 p2_2
+               ->  Merge Full Join
+                     Merge Cond: ((prt1_3.a = p2_3.a) AND (prt1_3.b = p2_3.b))
+                     Filter: ((COALESCE(prt1_3.a, p2_3.a) >= 490) AND 
(COALESCE(prt1_3.a, p2_3.a) <= 510))
+                     ->  Sort
+                           Sort Key: prt1_3.a, prt1_3.b
+                           ->  Seq Scan on prt1_p3 prt1_3
+                     ->  Sort
+                           Sort Key: p2_3.a, p2_3.b
+                           ->  Seq Scan on prt2_p3 p2_3
+(32 rows)
 
 SELECT a, b FROM prt1 FULL JOIN prt2 p2(b,a,c) USING(a,b)
   WHERE a BETWEEN 490 AND 510
@@ -1383,28 +1375,32 @@ SELECT t1.a, t1.c, t2.b, t2.c, t3.a + t3.b, t3.c FROM 
(prt1 t1 LEFT JOIN prt2 t2
 -- This should generate a partitionwise join, but currently fails to
 EXPLAIN (COSTS OFF)
 SELECT t1.a, t2.b FROM (SELECT * FROM prt1 WHERE a < 450) t1 LEFT JOIN (SELECT 
* FROM prt2 WHERE b > 250) t2 ON t1.a = t2.b WHERE t1.b = 0 ORDER BY t1.a, t2.b;
-                        QUERY PLAN                         
------------------------------------------------------------
- Incremental Sort
+                           QUERY PLAN                            
+-----------------------------------------------------------------
+ Sort
    Sort Key: prt1.a, prt2.b
-   Presorted Key: prt1.a
-   ->  Merge Left Join
-         Merge Cond: (prt1.a = prt2.b)
-         ->  Sort
-               Sort Key: prt1.a
-               ->  Append
-                     ->  Seq Scan on prt1_p1 prt1_1
-                           Filter: ((a < 450) AND (b = 0))
-                     ->  Seq Scan on prt1_p2 prt1_2
-                           Filter: ((a < 450) AND (b = 0))
-         ->  Sort
-               Sort Key: prt2.b
-               ->  Append
+   ->  Merge Right Join
+         Merge Cond: (prt2.b = prt1.a)
+         ->  Append
+               ->  Sort
+                     Sort Key: prt2_1.b
                      ->  Seq Scan on prt2_p2 prt2_1
                            Filter: (b > 250)
+               ->  Sort
+                     Sort Key: prt2_2.b
                      ->  Seq Scan on prt2_p3 prt2_2
                            Filter: (b > 250)
-(19 rows)
+         ->  Materialize
+               ->  Append
+                     ->  Sort
+                           Sort Key: prt1_1.a
+                           ->  Seq Scan on prt1_p1 prt1_1
+                                 Filter: ((a < 450) AND (b = 0))
+                     ->  Sort
+                           Sort Key: prt1_2.a
+                           ->  Seq Scan on prt1_p2 prt1_2
+                                 Filter: ((a < 450) AND (b = 0))
+(23 rows)
 
 SELECT t1.a, t2.b FROM (SELECT * FROM prt1 WHERE a < 450) t1 LEFT JOIN (SELECT 
* FROM prt2 WHERE b > 250) t2 ON t1.a = t2.b WHERE t1.b = 0 ORDER BY t1.a, t2.b;
   a  |  b  
@@ -1424,25 +1420,33 @@ SELECT t1.a, t2.b FROM (SELECT * FROM prt1 WHERE a < 
450) t1 LEFT JOIN (SELECT *
 -- partitionwise join does not apply
 EXPLAIN (COSTS OFF)
 SELECT t1.a, t2.b FROM prt1 t1, prt2 t2 WHERE t1::text = t2::text AND t1.a = 
t2.b ORDER BY t1.a;
-                                       QUERY PLAN                              
          
------------------------------------------------------------------------------------------
+                            QUERY PLAN                            
+------------------------------------------------------------------
  Merge Join
-   Merge Cond: ((t1.a = t2.b) AND (((((t1.*)::prt1))::text) = 
((((t2.*)::prt2))::text)))
-   ->  Sort
-         Sort Key: t1.a, ((((t1.*)::prt1))::text)
-         ->  Result
-               ->  Append
-                     ->  Seq Scan on prt1_p1 t1_1
-                     ->  Seq Scan on prt1_p2 t1_2
-                     ->  Seq Scan on prt1_p3 t1_3
-   ->  Sort
-         Sort Key: t2.b, ((((t2.*)::prt2))::text)
-         ->  Result
-               ->  Append
+   Merge Cond: (t1.a = t2.b)
+   Join Filter: ((((t2.*)::prt2))::text = (((t1.*)::prt1))::text)
+   ->  Append
+         ->  Sort
+               Sort Key: t1_1.a
+               ->  Seq Scan on prt1_p1 t1_1
+         ->  Sort
+               Sort Key: t1_2.a
+               ->  Seq Scan on prt1_p2 t1_2
+         ->  Sort
+               Sort Key: t1_3.a
+               ->  Seq Scan on prt1_p3 t1_3
+   ->  Materialize
+         ->  Append
+               ->  Sort
+                     Sort Key: t2_1.b
                      ->  Seq Scan on prt2_p1 t2_1
+               ->  Sort
+                     Sort Key: t2_2.b
                      ->  Seq Scan on prt2_p2 t2_2
+               ->  Sort
+                     Sort Key: t2_3.b
                      ->  Seq Scan on prt2_p3 t2_3
-(16 rows)
+(24 rows)
 
 SELECT t1.a, t2.b FROM prt1 t1, prt2 t2 WHERE t1::text = t2::text AND t1.a = 
t2.b ORDER BY t1.a;
  a  | b  
@@ -4508,9 +4512,10 @@ EXPLAIN (COSTS OFF)
 SELECT t1.a, t1.c, t2.a, t2.c, t3.a, t3.c FROM plt1_adv t1 LEFT JOIN plt2_adv 
t2 ON (t1.a = t2.a AND t1.c = t2.c) LEFT JOIN plt1_adv t3 ON (t1.a = t3.a AND 
t1.c = t3.c) WHERE t1.b < 10 ORDER BY t1.a;
                                    QUERY PLAN                                  
 
 
--------------------------------------------------------------------------------
- Sort
+ Merge Append
    Sort Key: t1.a
-   ->  Append
+   ->  Sort
+         Sort Key: t1_1.a
          ->  Hash Right Join
                Hash Cond: ((t3_1.a = t1_1.a) AND (t3_1.c = t1_1.c))
                ->  Seq Scan on plt1_adv_p1 t3_1
@@ -4521,6 +4526,8 @@ SELECT t1.a, t1.c, t2.a, t2.c, t3.a, t3.c FROM plt1_adv 
t1 LEFT JOIN plt2_adv t2
                            ->  Hash
                                  ->  Seq Scan on plt1_adv_p1 t1_1
                                        Filter: (b < 10)
+   ->  Sort
+         Sort Key: t1_2.a
          ->  Hash Right Join
                Hash Cond: ((t3_2.a = t1_2.a) AND (t3_2.c = t1_2.c))
                ->  Seq Scan on plt1_adv_p2 t3_2
@@ -4531,6 +4538,8 @@ SELECT t1.a, t1.c, t2.a, t2.c, t3.a, t3.c FROM plt1_adv 
t1 LEFT JOIN plt2_adv t2
                            ->  Hash
                                  ->  Seq Scan on plt1_adv_p2 t1_2
                                        Filter: (b < 10)
+   ->  Sort
+         Sort Key: t1_3.a
          ->  Hash Right Join
                Hash Cond: ((t3_3.a = t1_3.a) AND (t3_3.c = t1_3.c))
                ->  Seq Scan on plt1_adv_p3 t3_3
@@ -4541,15 +4550,19 @@ SELECT t1.a, t1.c, t2.a, t2.c, t3.a, t3.c FROM plt1_adv 
t1 LEFT JOIN plt2_adv t2
                            ->  Hash
                                  ->  Seq Scan on plt1_adv_p3 t1_3
                                        Filter: (b < 10)
-         ->  Nested Loop Left Join
-               Join Filter: ((t1_4.a = t3_4.a) AND (t1_4.c = t3_4.c))
-               ->  Nested Loop Left Join
-                     Join Filter: ((t1_4.a = t2_4.a) AND (t1_4.c = t2_4.c))
+   ->  Nested Loop Left Join
+         Join Filter: ((t1_4.a = t3_4.a) AND (t1_4.c = t3_4.c))
+         ->  Merge Left Join
+               Merge Cond: ((t1_4.a = t2_4.a) AND (t1_4.c = t2_4.c))
+               ->  Sort
+                     Sort Key: t1_4.a, t1_4.c
                      ->  Seq Scan on plt1_adv_extra t1_4
                            Filter: (b < 10)
+               ->  Sort
+                     Sort Key: t2_4.a, t2_4.c
                      ->  Seq Scan on plt2_adv_extra t2_4
-               ->  Seq Scan on plt1_adv_extra t3_4
-(41 rows)
+         ->  Seq Scan on plt1_adv_extra t3_4
+(50 rows)
 
 SELECT t1.a, t1.c, t2.a, t2.c, t3.a, t3.c FROM plt1_adv t1 LEFT JOIN plt2_adv 
t2 ON (t1.a = t2.a AND t1.c = t2.c) LEFT JOIN plt1_adv t3 ON (t1.a = t3.a AND 
t1.c = t3.c) WHERE t1.b < 10 ORDER BY t1.a;
  a  |  c   | a |  c   | a |  c   
diff --git a/src/test/regress/expected/partition_prune.out 
b/src/test/regress/expected/partition_prune.out
index d1966cd7d82..43e2962009e 100644
--- a/src/test/regress/expected/partition_prune.out
+++ b/src/test/regress/expected/partition_prune.out
@@ -4768,9 +4768,10 @@ select min(a) over (partition by a order by a) from 
part_abc where a >= stable_o
                Window: w1 AS (PARTITION BY part_abc.a ORDER BY part_abc.a)
                ->  Append
                      Subplans Removed: 1
-                     ->  Index Scan using part_abc_2_a_idx on part_abc_2 
part_abc_1
-                           Index Cond: (a >= (stable_one() + 1))
-                           Filter: (d <= stable_one())
+                     ->  Sort
+                           Sort Key: part_abc_1.a
+                           ->  Seq Scan on part_abc_2 part_abc_1
+                                 Filter: ((d <= stable_one()) AND (a >= 
(stable_one() + 1)))
                      ->  Merge Append
                            Sort Key: part_abc_3.a
                            Subplans Removed: 1
@@ -4785,9 +4786,10 @@ select min(a) over (partition by a order by a) from 
part_abc where a >= stable_o
                Window: w1 AS (PARTITION BY part_abc_5.a ORDER BY part_abc_5.a)
                ->  Append
                      Subplans Removed: 1
-                     ->  Index Scan using part_abc_2_a_idx on part_abc_2 
part_abc_6
-                           Index Cond: (a >= (stable_one() + 1))
-                           Filter: (d >= stable_one())
+                     ->  Sort
+                           Sort Key: part_abc_6.a
+                           ->  Seq Scan on part_abc_2 part_abc_6
+                                 Filter: ((d >= stable_one()) AND (a >= 
(stable_one() + 1)))
                      ->  Merge Append
                            Sort Key: a
                            Subplans Removed: 1
@@ -4797,7 +4799,7 @@ select min(a) over (partition by a order by a) from 
part_abc where a >= stable_o
                            ->  Index Scan using part_abc_3_3_a_idx on 
part_abc_3_3 part_abc_9
                                  Index Cond: (a >= (stable_one() + 1))
                                  Filter: (d >= stable_one())
-(35 rows)
+(37 rows)
 
 drop view part_abc_view;
 drop table part_abc;
diff --git a/src/test/regress/expected/union.out 
b/src/test/regress/expected/union.out
index 96962817ed4..e6ba4880b9c 100644
--- a/src/test/regress/expected/union.out
+++ b/src/test/regress/expected/union.out
@@ -1207,12 +1207,14 @@ select event_id
 ----------------------------------------------------------
  Merge Append
    Sort Key: events.event_id
-   ->  Index Scan using events_pkey on events
+   ->  Sort
+         Sort Key: events.event_id
+         ->  Seq Scan on events
    ->  Sort
          Sort Key: events_1.event_id
          ->  Seq Scan on events_child events_1
    ->  Index Scan using other_events_pkey on other_events
-(7 rows)
+(9 rows)
 
 drop table events_child, events, other_events;
 reset enable_indexonlyscan;
@@ -1334,24 +1336,22 @@ select distinct q1 from
    union all
    select distinct * from int8_tbl i82) ss
 where q2 = q2;
-                        QUERY PLAN                        
-----------------------------------------------------------
- Unique
-   ->  Merge Append
-         Sort Key: "*SELECT* 1".q1
+                     QUERY PLAN                     
+----------------------------------------------------
+ HashAggregate
+   Group Key: "*SELECT* 1".q1
+   ->  Append
          ->  Subquery Scan on "*SELECT* 1"
-               ->  Unique
-                     ->  Sort
-                           Sort Key: i81.q1, i81.q2
-                           ->  Seq Scan on int8_tbl i81
-                                 Filter: (q2 IS NOT NULL)
+               ->  HashAggregate
+                     Group Key: i81.q1, i81.q2
+                     ->  Seq Scan on int8_tbl i81
+                           Filter: (q2 IS NOT NULL)
          ->  Subquery Scan on "*SELECT* 2"
-               ->  Unique
-                     ->  Sort
-                           Sort Key: i82.q1, i82.q2
-                           ->  Seq Scan on int8_tbl i82
-                                 Filter: (q2 IS NOT NULL)
-(15 rows)
+               ->  HashAggregate
+                     Group Key: i82.q1, i82.q2
+                     ->  Seq Scan on int8_tbl i82
+                           Filter: (q2 IS NOT NULL)
+(13 rows)
 
 select distinct q1 from
   (select distinct * from int8_tbl i81
@@ -1370,24 +1370,25 @@ select distinct q1 from
    union all
    select distinct * from int8_tbl i82) ss
 where -q1 = q2;
-                       QUERY PLAN                       
---------------------------------------------------------
+                          QUERY PLAN                          
+--------------------------------------------------------------
  Unique
-   ->  Merge Append
+   ->  Sort
          Sort Key: "*SELECT* 1".q1
-         ->  Subquery Scan on "*SELECT* 1"
-               ->  Unique
-                     ->  Sort
-                           Sort Key: i81.q1, i81.q2
-                           ->  Seq Scan on int8_tbl i81
-                                 Filter: ((- q1) = q2)
-         ->  Subquery Scan on "*SELECT* 2"
-               ->  Unique
-                     ->  Sort
-                           Sort Key: i82.q1, i82.q2
-                           ->  Seq Scan on int8_tbl i82
-                                 Filter: ((- q1) = q2)
-(15 rows)
+         ->  Append
+               ->  Subquery Scan on "*SELECT* 1"
+                     ->  Unique
+                           ->  Sort
+                                 Sort Key: i81.q1, i81.q2
+                                 ->  Seq Scan on int8_tbl i81
+                                       Filter: ((- q1) = q2)
+               ->  Subquery Scan on "*SELECT* 2"
+                     ->  Unique
+                           ->  Sort
+                                 Sort Key: i82.q1, i82.q2
+                                 ->  Seq Scan on int8_tbl i82
+                                       Filter: ((- q1) = q2)
+(16 rows)
 
 select distinct q1 from
   (select distinct * from int8_tbl i81
diff --git a/src/test/regress/sql/inherit.sql b/src/test/regress/sql/inherit.sql
index 699e8ac09c8..c58beebbd1e 100644
--- a/src/test/regress/sql/inherit.sql
+++ b/src/test/regress/sql/inherit.sql
@@ -641,12 +641,14 @@ insert into matest2 (name) values ('Test 4');
 insert into matest3 (name) values ('Test 5');
 insert into matest3 (name) values ('Test 6');
 
-set enable_indexscan = off;  -- force use of seqscan/sort, so no merge
+set enable_indexscan = off;  -- force use of seqscan/sort
+set enable_sort = off; -- since merge append may employ sort in children we 
need to disable sort
 explain (verbose, costs off) select * from matest0 order by 1-id;
 select * from matest0 order by 1-id;
 explain (verbose, costs off) select min(1-id) from matest0;
 select min(1-id) from matest0;
 reset enable_indexscan;
+reset enable_sort;
 
 set enable_seqscan = off;  -- plan with fewest seqscans should be merge
 set enable_parallel_append = off; -- Don't let parallel-append interfere
-- 
2.50.1

Re: MergeAppend could consider sorting cheapest child path

Reply via email to