On Wed, Jan 14, 2026 at 11:11 PM Robert Haas <[email protected]> wrote:

Hi Robert,

> On Wed, Jan 14, 2026 at 6:02 AM Jakub Wartak
> <[email protected]> wrote:
> > a) q4.sql (please see attached file for repro). More or less: right
> > after import I get a hard failure if the earlier recommended advice is
> > enabled (smells like a bug to me: we shouldn't get any errors even if
> > advice is bad). This can be solved by ANALYZE, but brought up back by
> > truncating pg_statistics
> > ERROR:  unique semijoin found for relids (b 3) but not observed during 
> > planning
> > STATEMENT:  explain (timing off, costs off, settings off, memory off)
>
> Hmm, so the plan tree walker thinks that we did a semijoin between
> lineitem and orders by making lineitem unique on the join column and
> then performing a regular join. That appears to be correct. But
> pgpa_join_path_setup never created a pgpa_join_path_setup for that
> possibility, or created one that doesn't actually match up properly to
> what was found in the plan tree. Can you check whether a
> pgpa_sj_unique_rel gets created in pgpa_join_path_setup, and with what
> contents?

OK, so today, on just barebone v9 (even without any fixes from this $subthread),
I couldn't get it to reproduce right out of the box right on the fresh
cluster. It
appears to another missing piece of the puzzle was to have
max_parallel_workers_per_gather=0  (in addition to TRUNCATING pg_statistic),
because otherwise it did not want to generate advice out of the box that would
generate this specific ERROR.

Maybe I'm big rookie here (OR just dumb), but it took me some time to realize
why we emit SEMIJOIN_UNIQUE() there, clearly the plan without parallelism
has "Nested Loop", not like "Nested Loop Semi Join"
(with max_parallel_workers_per_gather = 2). Yet it emits that and somehow
later the query feature walker->query_features[PGPAQF_SEMIJOIN_UNIQUE]
also is there, so that "unique semijoin found for.." error could be thrown.

As per VERBOSE explain, one can spot this SemiJoin transformation is being
applied (Nested Loop/Inner Unique: true), however sj_unique_rtis is empty (?!):

[..]
NOTICE:  jointype=outer pps_NULL?=0
NOTICE:  added SEMIJOIN_UNIQUE
NOTICE:  pgpa_plan_walker: walking over SEMIJOIN_UNIQUE features: 3,
sj_unique_rtis=<> sj_unique_rels=<>
ERROR:  unique semijoin found for relids (b 3) but not observed during planning

Only *after* this, I've realized how hard all of that is reading comments nearby
`typedef struct pgpa_sj_unique_rel`.

Anyway it appears that pgpa_plan_walker()/pgpa_planner_walker() is not
having proper
input information to begin with about SJs? It looks there is just one
single place that
sets pps->sj_unique_rels (lappend() in pgpa_join_path_setup()), but
that's code path is
only being launched when requesting explain is asking for advice:

explain (costs off, plan_advice) SELECT
[..]
NOTICE:  jointype=inner pps_NULL?=0
NOTICE:  found=0
NOTICE:  not a duplicate, appending "(b 3)" to pps->sj_unique_rels
NOTICE:  jointype=outer pps_NULL?=0
NOTICE:  found=true! (ur->plan_name=(null) bms_ur_relids=3)
NOTICE:  found=1
NOTICE:  jointype=inner pps_NULL?=0
NOTICE:  found=true! (ur->plan_name=(null) bms_ur_relids=3)
NOTICE:  found=1
NOTICE:  jointype=outer pps_NULL?=0
NOTICE:  found=true! (ur->plan_name=(null) bms_ur_relids=3)
NOTICE:  found=1
NOTICE:  added SEMIJOIN_UNIQUE
WARNING:  could not dump unrecognized node type: 0 // ignore?
NOTICE:  pgpa_plan_walker: walking over SEMIJOIN_UNIQUE features: 3,
sj_unique_rtis=((b 3)) sj_unique_rels=({})
(+ no error!)

while with basic EXPLAIN (and advices planner/advises touching SJ
transforms), I'm getting:

dbt3=# explain (costs off) SELECT
[..]
NOTICE:  jointype=inner pps_NULL?=0
NOTICE:  jointype=outer pps_NULL?=0
NOTICE:  jointype=inner pps_NULL?=0
NOTICE:  jointype=outer pps_NULL?=0
NOTICE:  added SEMIJOIN_UNIQUE
NOTICE:  pgpa_plan_walker: walking over SEMIJOIN_UNIQUE features: 3,
sj_unique_rtis=<> sj_unique_rels=<>
ERROR:  unique semijoin found for relids (b 3) but not observed during planning

So we have started v9 with:
  if (pps->generate_advice_string) { -- but that's wrong due to
potential crash in -02 builds + asan complaints

we fixed that above bug with:
  if (pps != NULL && pps->generate_advice_string) {  -- but that's
wrong due to not initializing SJ for normal explains

so we end up doing simply this?
  if (pps != NULL) {

The last one seems to pass all my tests (with already provided fixup
from yesterday), but I'm absolutely not sure if that's the proper way to
address that).

> > a) q8.sql (please see attached file for demo). It is even more
> > bizarre, happens right after import , fixed by ANALYZE, but even
> > TRUNCATING pg_statistic doesnt bring back the problem. Pinpointed that
> > additional pg_clear_relation_stats() triggers the problem back.
>
> I found this one. I now think that
> pgpa_planner_apply_join_path_advice() shouldn't added anything to
> jo_permit_indexes when a join method hint implicitly permits a join
> order. I simplified your test case to this:
>
> set pg_plan_advice.advice = 'JOIN_ORDER(n1 region customer)
> NESTED_LOOP_PLAIN(region)';
> explain (costs off, plan_advice)
>     SELECT
>         n1.n_name AS nation
>     FROM
>         customer,
>         nation n1,
>         region
>     WHERE
>         c_nationkey = n1.n_nationkey
>         AND n1.n_regionkey = r_regionkey
>         AND r_name = 'AMERICA';
>
> What was happening here is that when we considered a join between
> {customer, nation} and region, pgpa_planner_apply_join_path_advice()
> said, well, according to the JOIN_ORDER advice, this join order is not
> allowed, which is correct. And, according to the NESTED_LOOP_PLAIN
> advice, this join order is allowed, which is also correct, because
> NESTED_LOOP_PLAIN(region) denies join orders where region is the
> driving table, since those would make it impossible to respect the
> advice, and this join order doesn't do that. Then, it concludes that
> because one piece of advice says the join order is OK and the other
> says it isn't, the advice conflicts. This is where I think it's going
> off the rails: the NESTED_LOOP_PLAIN() advice should only be allowed
> to act as a negative constraint, not a positive one. So what I did is:

Yes! 3 three lines patches seems to help (and causes no other problems to
best of my knowledge). The simplified test case results seem to be only
changing like below, but it really fixes the Q8 NL->HJ.

  Supplied Plan Advice:
-   JOIN_ORDER(n1 region customer) /* matched, conflicting */
-   NESTED_LOOP_PLAIN(region) /* matched, conflicting */
+   JOIN_ORDER(n1 region customer) /* matched */
+   NESTED_LOOP_PLAIN(region) /* matched */

BTW: I have found also that it fixes another (not yet here disclosed bug,
because I've found it just today :): when running without stats, and
with enable_nestloop=OFF (globally) Q5 was failing too due some
sequence of HJ/Parallel HJ
and slightly different Hash Cond) - nvm, wIth this patch it does NOT misbehave.

Maybe it would be good to include that into tests inside 0005, for that
small tiny query above?

> > 3b) XXX - marker:I was looking for a solution and apparently cfbot
> > farm has those options, so they should be testing it anyway. And this
> > brings me to a fact, that it maybe could be detected by cfbot, however
> > the $thread is not registered so cfbot had no chance to see what's
> > more there? (I'm mainly thinking about any cross-platform issues, if
> > any).
>
> I mean, there is https://commitfest.postgresql.org/patch/6184/

Whoops, mea culpa, I was looking for PG-4 commitfest for some reason
(so I should
be looking on https://cfbot.cputube.org/next.html not just under "/"
[main] one).

-J.
--
-- step by step, demo that it doesnt work:
--
dbt3=# set pg_plan_advice.advice = 'JOIN_ORDER(lineitem orders) 
NESTED_LOOP_PLAIN(orders) SEQ_SCAN(lineitem) INDEX_SCAN(orders 
public.pk_orders) GATHER(lineitem) SEMIJOIN_UNIQUE(lineitem) NO_GATHER(orders)';
SET
dbt3=# explain (costs off) SELECT
    o_orderpriority,
    count(*) AS order_count
FROM
    orders
WHERE
    o_orderdate >= DATE '1993-07-01'
    AND o_orderdate < DATE '1993-07-01' + INTERVAL '3' MONTH
    AND EXISTS (
        SELECT
            *
        FROM
            lineitem
        WHERE
            l_orderkey = o_orderkey
            AND l_commitdate < l_receiptdate
    )
GROUP BY
    o_orderpriority
ORDER BY
    o_orderpriority;
ERROR:  unique semijoin found for relids (b 3) but not observed during planning
dbt3=#


-- so with max_parallel_workers_per_gather=2 works (-> NO ERROR)
set pg_plan_advice.advice = 'JOIN_ORDER(orders lineitem)
   NESTED_LOOP_PLAIN(lineitem)
   SEQ_SCAN(orders)
   INDEX_ONLY_SCAN(lineitem public.lineitem_l_orderkey_l_suppkey_idx)
   GATHER_MERGE((orders lineitem))
   SEMIJOIN_NON_UNIQUE(lineitem)';

-- when set to 0, this fails:
set pg_plan_advice.advice = 'JOIN_ORDER(lineitem orders)
   NESTED_LOOP_PLAIN(orders)
   SEQ_SCAN(lineitem)
   INDEX_SCAN(orders public.pk_orders)
   SEMIJOIN_UNIQUE(lineitem)
   NO_GATHER(orders lineitem)';

-- note the SEMIJOIN_UNIQUE(lineitem) vs SEMIJOIN_NON_UNIQUE(lineitem), as it 
is going to be important later

---
--- FRESH
---
dbt3=# explain (costs off,plan_advice) SELECT
    o_orderpriority,
    count(*) AS order_count
FROM
    orders
WHERE
    o_orderdate >= DATE '1993-07-01'
    AND o_orderdate < DATE '1993-07-01' + INTERVAL '3' MONTH
    AND EXISTS (
        SELECT
            *
        FROM
            lineitem
        WHERE
            l_orderkey = o_orderkey
            AND l_commitdate < l_receiptdate
    )
GROUP BY
    o_orderpriority
ORDER BY
    o_orderpriority;
NOTICE:  jointype=inner pps_NULL?=0
NOTICE:  found=0
NOTICE:  not a dupllicate, appending "(b 3)" to pps->sj_unique_rels
NOTICE:  jointype=outer pps_NULL?=0
NOTICE:  found=true! (ur->plan_name=(null) bms_ur_relids=3)
NOTICE:  found=1
NOTICE:  jointype=inner pps_NULL?=0
NOTICE:  found=true! (ur->plan_name=(null) bms_ur_relids=3)
NOTICE:  found=1
NOTICE:  jointype=outer pps_NULL?=0
NOTICE:  found=true! (ur->plan_name=(null) bms_ur_relids=3)
NOTICE:  found=1
                                                                QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------
 GroupAggregate
   Group Key: orders.o_orderpriority
   ->  Sort
         Sort Key: orders.o_orderpriority
         ->  Nested Loop
               ->  HashAggregate
                     Group Key: lineitem.l_orderkey
                     ->  Seq Scan on lineitem
                           Filter: (l_commitdate < l_receiptdate)
               ->  Index Scan using pk_orders on orders
                     Index Cond: (o_orderkey = lineitem.l_orderkey)
                     Filter: ((o_orderdate >= '1993-07-01'::date) AND 
(o_orderdate < '1993-10-01 00:00:00'::timestamp without time zone))
 Generated Plan Advice:
   JOIN_ORDER(lineitem orders)
   NESTED_LOOP_PLAIN(orders)
   SEQ_SCAN(lineitem)
   INDEX_SCAN(orders public.pk_orders)
   SEMIJOIN_UNIQUE(lineitem)
   NO_GATHER(orders lineitem)
(19 rows)

--
-- quick sanity check if we really see the error (copy/paste from above):
--
dbt3=# set pg_plan_advice.advice = 'JOIN_ORDER(lineitem orders)
   NESTED_LOOP_PLAIN(orders)
   SEQ_SCAN(lineitem)
   INDEX_SCAN(orders public.pk_orders)
   SEMIJOIN_UNIQUE(lineitem)
   NO_GATHER(orders lineitem)';
SET
dbt3=# explain (costs off) SELECT
    o_orderpriority,
    count(*) AS order_count
FROM
    orders
WHERE
    o_orderdate >= DATE '1993-07-01'
    AND o_orderdate < DATE '1993-07-01' + INTERVAL '3' MONTH
    AND EXISTS (
        SELECT
            *
        FROM
            lineitem
        WHERE
            l_orderkey = o_orderkey
            AND l_commitdate < l_receiptdate
    )
GROUP BY
    o_orderpriority
ORDER BY
    o_orderpriority;
NOTICE:  jointype=inner pps_NULL?=0
NOTICE:  jointype=outer pps_NULL?=0
NOTICE:  jointype=inner pps_NULL?=0
NOTICE:  jointype=outer pps_NULL?=0
ERROR:  unique semijoin found for relids (b 3) but not observed during planning
dbt3=#

-- yes , we do so "b 3" seems to be HashAggregate/Group Key: lineitem.l_orderkey
-- so how's that NOT observed during planning if it is there?

dbt3=# LOAD 'pg_overexplain';
LOAD
dbt3=# explain (costs off, DEBUG) SELECT
    o_orderpriority,
    count(*) AS order_count
FROM
    orders
WHERE
    o_orderdate >= DATE '1993-07-01'
    AND o_orderdate < DATE '1993-07-01' + INTERVAL '3' MONTH
    AND EXISTS (
        SELECT
            *
        FROM
            lineitem
        WHERE
            l_orderkey = o_orderkey
            AND l_commitdate < l_receiptdate
    )
GROUP BY
    o_orderpriority
ORDER BY
    o_orderpriority;
NOTICE:  jointype=inner pps_NULL?=0
NOTICE:  jointype=outer pps_NULL?=0
NOTICE:  jointype=inner pps_NULL?=0
NOTICE:  jointype=outer pps_NULL?=0
                                                                QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------
 GroupAggregate
   Group Key: orders.o_orderpriority
   Disabled Nodes: 0
   Parallel Safe: false
   Plan Node ID: 0
   ->  Sort
         Sort Key: orders.o_orderpriority
         Disabled Nodes: 0
         Parallel Safe: false
         Plan Node ID: 1
         ->  Nested Loop
               Disabled Nodes: 0
               Parallel Safe: false
               Plan Node ID: 2
               ->  HashAggregate
                     Group Key: lineitem.l_orderkey
                     Disabled Nodes: 0
                     Parallel Safe: false
                     Plan Node ID: 3
                     ->  Seq Scan on lineitem
                           Filter: (l_commitdate < l_receiptdate)
                           Disabled Nodes: 0
                           Parallel Safe: false
                           Plan Node ID: 4
               ->  Index Scan using pk_orders on orders
                     Index Cond: (o_orderkey = lineitem.l_orderkey)
                     Filter: ((o_orderdate >= '1993-07-01'::date) AND 
(o_orderdate < '1993-10-01 00:00:00'::timestamp without time zone))
                     Disabled Nodes: 0
                     Parallel Safe: false
                     Plan Node ID: 5
                     extParam: 0
                     allParam: 0
 PlannedStmt:
   Command Type: select
   Flags: canSetTag
   Subplans Needing Rewind: none
   Relation OIDs: 17089 17061
   Executor Parameter Types: 20
   Parse Location: 0 to end
(39 rows)

dbt3=# explain (costs off, RANGE_TABLE) SELECT
    o_orderpriority,
    count(*) AS order_count
FROM
    orders
WHERE
    o_orderdate >= DATE '1993-07-01'
    AND o_orderdate < DATE '1993-07-01' + INTERVAL '3' MONTH
    AND EXISTS (
        SELECT
            *
        FROM
            lineitem
        WHERE
            l_orderkey = o_orderkey
            AND l_commitdate < l_receiptdate
    )
GROUP BY
    o_orderpriority
ORDER BY
    o_orderpriority;
NOTICE:  jointype=inner pps_NULL?=0
NOTICE:  jointype=outer pps_NULL?=0
NOTICE:  jointype=inner pps_NULL?=0
NOTICE:  jointype=outer pps_NULL?=0
                                                                                
                          QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 GroupAggregate
   Group Key: orders.o_orderpriority
   ->  Sort
         Sort Key: orders.o_orderpriority
         ->  Nested Loop
               ->  HashAggregate
                     Group Key: lineitem.l_orderkey
                     ->  Seq Scan on lineitem
                           Filter: (l_commitdate < l_receiptdate)
                           Scan RTI: 3
               ->  Index Scan using pk_orders on orders
                     Index Cond: (o_orderkey = lineitem.l_orderkey)
                     Filter: ((o_orderdate >= '1993-07-01'::date) AND 
(o_orderdate < '1993-10-01 00:00:00'::timestamp without time zone))
                     Scan RTI: 1
 RTI 1 (relation, in-from-clause):
   Eref: orders (o_orderkey, o_custkey, o_orderstatus, o_totalprice, 
o_orderdate, o_orderpriority, o_clerk, o_shippriority, o_comment)
   Relation: orders
   Relation Kind: relation
   Relation Lock Mode: AccessShareLock
   Permission Info Index: 1
 RTI 2 (group):
   Eref: "*GROUP*" (o_orderpriority)
 RTI 3 (relation, in-from-clause):
   Eref: lineitem (l_orderkey, l_partkey, l_suppkey, l_linenumber, l_quantity, 
l_extendedprice, l_discount, l_tax, l_returnflag, l_linestatus, l_shipdate, 
l_commitdate, l_receiptdate, l_shipinstruct, l_shipmode, l_comment)
   Relation: lineitem
   Relation Kind: relation
   Relation Lock Mode: AccessShareLock
   Permission Info Index: 2
 Unprunable RTIs: 1 3
(29 rows)

dbt3=#

--
-- 3x conflicts (still with using SEMIJOIN_UNIQUE):
--
dbt3=# explain (costs off,plan_advice) SELECT
    o_orderpriority,
    count(*) AS order_count
FROM
    orders
WHERE
    o_orderdate >= DATE '1993-07-01'
    AND o_orderdate < DATE '1993-07-01' + INTERVAL '3' MONTH
    AND EXISTS (
        SELECT
            *
        FROM
            lineitem
        WHERE
            l_orderkey = o_orderkey
            AND l_commitdate < l_receiptdate
    )
GROUP BY
    o_orderpriority
ORDER BY
    o_orderpriority;
NOTICE:  jointype=inner pps_NULL?=0
NOTICE:  found=0
NOTICE:  not a dupllicate, appending "(b 3)" to pps->sj_unique_rels
NOTICE:  jointype=outer pps_NULL?=0
NOTICE:  found=true! (ur->plan_name=(null) bms_ur_relids=3)
NOTICE:  found=1
NOTICE:  jointype=inner pps_NULL?=0
NOTICE:  found=true! (ur->plan_name=(null) bms_ur_relids=3)
NOTICE:  found=1
NOTICE:  jointype=outer pps_NULL?=0
NOTICE:  found=true! (ur->plan_name=(null) bms_ur_relids=3)
NOTICE:  found=1
                                                                QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------
 GroupAggregate
   Group Key: orders.o_orderpriority
   ->  Sort
         Sort Key: orders.o_orderpriority
         ->  Nested Loop
               ->  HashAggregate
                     Group Key: lineitem.l_orderkey
                     ->  Seq Scan on lineitem
                           Filter: (l_commitdate < l_receiptdate)
               ->  Index Scan using pk_orders on orders
                     Index Cond: (o_orderkey = lineitem.l_orderkey)
                     Filter: ((o_orderdate >= '1993-07-01'::date) AND 
(o_orderdate < '1993-10-01 00:00:00'::timestamp without time zone))
 Supplied Plan Advice:
   SEQ_SCAN(lineitem) /* matched */
   INDEX_SCAN(orders public.pk_orders) /* matched */
   JOIN_ORDER(lineitem orders) /* matched, conflicting */
   NESTED_LOOP_PLAIN(orders) /* matched, conflicting */
   SEMIJOIN_UNIQUE(lineitem) /* matched, conflicting */
   NO_GATHER(orders) /* matched */
   NO_GATHER(lineitem) /* matched */
 Generated Plan Advice:
   JOIN_ORDER(lineitem orders)
   NESTED_LOOP_PLAIN(orders)
   SEQ_SCAN(lineitem)
   INDEX_SCAN(orders public.pk_orders)
   SEMIJOIN_UNIQUE(lineitem)
   NO_GATHER(orders lineitem)
(27 rows)

dbt3=# explain (costs off) SELECT
    o_orderpriority,
    count(*) AS order_count
FROM
    orders
WHERE
    o_orderdate >= DATE '1993-07-01'
    AND o_orderdate < DATE '1993-07-01' + INTERVAL '3' MONTH
    AND EXISTS (
        SELECT
            *
        FROM
            lineitem
        WHERE
            l_orderkey = o_orderkey
            AND l_commitdate < l_receiptdate
    )
GROUP BY
    o_orderpriority
ORDER BY
    o_orderpriority;
NOTICE:  jointype=inner pps_NULL?=0
NOTICE:  jointype=outer pps_NULL?=0
NOTICE:  jointype=inner pps_NULL?=0
NOTICE:  jointype=outer pps_NULL?=0
ERROR:  unique semijoin found for relids (b 3) but not observed during planning
dbt3=#


-- still if I remove SEMIJOIN_UNIQUE(liteitem) it complains:
dbt3=# set pg_plan_advice.advice = 'JOIN_ORDER(lineitem orders)
   NESTED_LOOP_PLAIN(orders)
   SEQ_SCAN(lineitem)
   INDEX_SCAN(orders public.pk_orders)
   NO_GATHER(orders lineitem)';
SET
dbt3=# explain (costs off) SELECT
    o_orderpriority,
    count(*) AS order_count
FROM
    orders
WHERE
    o_orderdate >= DATE '1993-07-01'
    AND o_orderdate < DATE '1993-07-01' + INTERVAL '3' MONTH
    AND EXISTS (
        SELECT
            *
        FROM
            lineitem
        WHERE
            l_orderkey = o_orderkey
            AND l_commitdate < l_receiptdate
    )
GROUP BY
    o_orderpriority
ORDER BY
    o_orderpriority;
NOTICE:  jointype=inner pps_NULL?=0
NOTICE:  jointype=outer pps_NULL?=0
NOTICE:  jointype=inner pps_NULL?=0
NOTICE:  jointype=outer pps_NULL?=0
ERROR:  unique semijoin found for relids (b 3) but not observed during planning
dbt3=#


-- not shown here but also using SEMIJOIN_NON_UNIQUE was failing , which made 
me struggle


-- so back to parallel plan:

dbt3=# set pg_plan_advice.advice = '';
SET
dbt3=# set max_parallel_workers_per_gather to 2;
SET
dbt3=# explain (costs off) SELECT
    o_orderpriority,
    count(*) AS order_count
FROM
    orders
WHERE
    o_orderdate >= DATE '1993-07-01'
    AND o_orderdate < DATE '1993-07-01' + INTERVAL '3' MONTH
    AND EXISTS (
        SELECT
            *
        FROM
            lineitem
        WHERE
            l_orderkey = o_orderkey
            AND l_commitdate < l_receiptdate
    )
GROUP BY
    o_orderpriority
ORDER BY
    o_orderpriority;
NOTICE:  jointype=inner pps_NULL?=0
NOTICE:  jointype=outer pps_NULL?=0
NOTICE:  jointype=inner pps_NULL?=0
NOTICE:  jointype=outer pps_NULL?=0
                                                                      QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------
 Finalize GroupAggregate
   Group Key: orders.o_orderpriority
   ->  Gather Merge
         Workers Planned: 2
         ->  Partial GroupAggregate
               Group Key: orders.o_orderpriority
               ->  Sort
                     Sort Key: orders.o_orderpriority
                     ->  Nested Loop Semi Join
                           ->  Parallel Seq Scan on orders
                                 Filter: ((o_orderdate >= '1993-07-01'::date) 
AND (o_orderdate < '1993-10-01 00:00:00'::timestamp without time zone))
                           ->  Index Only Scan using 
lineitem_l_orderkey_l_suppkey_idx on lineitem
                                 Index Cond: (l_orderkey = orders.o_orderkey)
(13 rows)

dbt3=# explain (costs off, plan_advice) SELECT
    o_orderpriority,
    count(*) AS order_count
FROM
    orders
WHERE
    o_orderdate >= DATE '1993-07-01'
    AND o_orderdate < DATE '1993-07-01' + INTERVAL '3' MONTH
    AND EXISTS (
        SELECT
            *
        FROM
            lineitem
        WHERE
            l_orderkey = o_orderkey
            AND l_commitdate < l_receiptdate
    )
GROUP BY
    o_orderpriority
ORDER BY
    o_orderpriority;
NOTICE:  jointype=inner pps_NULL?=0
NOTICE:  found=0
NOTICE:  not a dupllicate, appending "(b 3)" to pps->sj_unique_rels
NOTICE:  jointype=outer pps_NULL?=0
NOTICE:  found=true! (ur->plan_name=(null) bms_ur_relids=3)
NOTICE:  found=1
NOTICE:  jointype=inner pps_NULL?=0
NOTICE:  found=true! (ur->plan_name=(null) bms_ur_relids=3)
NOTICE:  found=1
NOTICE:  jointype=outer pps_NULL?=0
NOTICE:  found=true! (ur->plan_name=(null) bms_ur_relids=3)
NOTICE:  found=1
                                                                      QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------
 Finalize GroupAggregate
   Group Key: orders.o_orderpriority
   ->  Gather Merge
         Workers Planned: 2
         ->  Partial GroupAggregate
               Group Key: orders.o_orderpriority
               ->  Sort
                     Sort Key: orders.o_orderpriority
                     ->  Nested Loop Semi Join
                           ->  Parallel Seq Scan on orders
                                 Filter: ((o_orderdate >= '1993-07-01'::date) 
AND (o_orderdate < '1993-10-01 00:00:00'::timestamp without time zone))
                           ->  Index Only Scan using 
lineitem_l_orderkey_l_suppkey_idx on lineitem
                                 Index Cond: (l_orderkey = orders.o_orderkey)
 Generated Plan Advice:
   JOIN_ORDER(orders lineitem)
   NESTED_LOOP_PLAIN(lineitem)
   SEQ_SCAN(orders)
   INDEX_ONLY_SCAN(lineitem public.lineitem_l_orderkey_l_suppkey_idx)
   GATHER_MERGE((orders lineitem))
   SEMIJOIN_NON_UNIQUE(lineitem)



-- back to max_parallel_workers_per_gather=0 and explain VERBOSE
-- this shows that indeed we have Nested Loop // Inner Unique: true
dbt3=# explain (costs off, VERBOSE, plan_advice) SELECT
    o_orderpriority,
    count(*) AS order_count
FROM
    orders
WHERE
    o_orderdate >= DATE '1993-07-01'
    AND o_orderdate < DATE '1993-07-01' + INTERVAL '3' MONTH
    AND EXISTS (
        SELECT
            *
        FROM
            lineitem
        WHERE
            l_orderkey = o_orderkey
            AND l_commitdate < l_receiptdate
    )
GROUP BY
    o_orderpriority
ORDER BY
    o_orderpriority;
NOTICE:  jointype=inner pps_NULL?=0
NOTICE:  found=0
NOTICE:  not a dupllicate, appending "(b 3)" to pps->sj_unique_rels
NOTICE:  jointype=outer pps_NULL?=0
NOTICE:  found=true! (ur->plan_name=(null) bms_ur_relids=3)
NOTICE:  found=1
NOTICE:  jointype=inner pps_NULL?=0
NOTICE:  found=true! (ur->plan_name=(null) bms_ur_relids=3)
NOTICE:  found=1
NOTICE:  jointype=outer pps_NULL?=0
NOTICE:  found=true! (ur->plan_name=(null) bms_ur_relids=3)
NOTICE:  found=1
                                                                                
                                                                                
                          QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 GroupAggregate
   Output: orders.o_orderpriority, count(*)
   Group Key: orders.o_orderpriority
   ->  Sort
         Output: orders.o_orderpriority
         Sort Key: orders.o_orderpriority
         ->  Nested Loop
               Output: orders.o_orderpriority
               Inner Unique: true
               ->  HashAggregate
                     Output: lineitem.l_orderkey
                     Group Key: lineitem.l_orderkey
                     ->  Seq Scan on public.lineitem
                           Output: lineitem.l_orderkey, lineitem.l_partkey, 
lineitem.l_suppkey, lineitem.l_linenumber, lineitem.l_quantity, 
lineitem.l_extendedprice, lineitem.l_discount, lineitem.l_tax, 
lineitem.l_returnflag, lineitem.l_linestatus, lineitem.l_shipdate, 
lineitem.l_commitdate, lineitem.l_receiptdate, lineitem.l_shipinstruct, 
lineitem.l_shipmode, lineitem.l_comment
                           Filter: (lineitem.l_commitdate < 
lineitem.l_receiptdate)
               ->  Index Scan using pk_orders on public.orders
                     Output: orders.o_orderkey, orders.o_custkey, 
orders.o_orderstatus, orders.o_totalprice, orders.o_orderdate, 
orders.o_orderpriority, orders.o_clerk, orders.o_shippriority, orders.o_comment
                     Index Cond: (orders.o_orderkey = lineitem.l_orderkey)
                     Filter: ((orders.o_orderdate >= '1993-07-01'::date) AND 
(orders.o_orderdate < '1993-10-01 00:00:00'::timestamp without time zone))
 Generated Plan Advice:
   JOIN_ORDER(lineitem orders)
   NESTED_LOOP_PLAIN(orders)
   SEQ_SCAN(lineitem)
   INDEX_SCAN(orders public.pk_orders)
   SEMIJOIN_UNIQUE(lineitem)
   NO_GATHER(orders lineitem)


-----
----- finally to realize that sj_unique_rtis is NULL there?
-----

dbt3=# set pg_plan_advice.advice = 'JOIN_ORDER(lineitem orders)
   NESTED_LOOP_PLAIN(orders)
   SEQ_SCAN(lineitem)
   INDEX_SCAN(orders public.pk_orders)
   SEMIJOIN_UNIQUE(lineitem)
   NO_GATHER(orders lineitem)';
SET
dbt3=# explain (costs off, verbose) SELECT
    o_orderpriority,
    count(*) AS order_count
FROM
    orders
WHERE
    o_orderdate >= DATE '1993-07-01'
    AND o_orderdate < DATE '1993-07-01' + INTERVAL '3' MONTH
    AND EXISTS (
        SELECT
            *
        FROM
            lineitem
        WHERE
            l_orderkey = o_orderkey
            AND l_commitdate < l_receiptdate
    )
GROUP BY
    o_orderpriority
ORDER BY
    o_orderpriority;
NOTICE:  jointype=inner pps_NULL?=0
NOTICE:  jointype=outer pps_NULL?=0
NOTICE:  jointype=inner pps_NULL?=0
NOTICE:  jointype=outer pps_NULL?=0
NOTICE:  added SEMIJOIN_UNIQUE
NOTICE:  pgpa_plan_walker: walking over SEMIJOIN_UNIQUE features: 3, 
sj_unique_rtis=<>
ERROR:  unique semijoin found for relids (b 3) but not observed during planning


----------
----------
----------
dbt3=# set pg_plan_advice.advice = 'JOIN_ORDER(lineitem orders)
   NESTED_LOOP_PLAIN(orders)
   SEQ_SCAN(lineitem)
   INDEX_SCAN(orders public.pk_orders)
   SEMIJOIN_UNIQUE(lineitem)
   NO_GATHER(orders lineitem)';
SET
dbt3=#
dbt3=#
dbt3=# explain (costs off) SELECT
    o_orderpriority,
    count(*) AS order_count
FROM
    orders
WHERE
    o_orderdate >= DATE '1993-07-01'
    AND o_orderdate < DATE '1993-07-01' + INTERVAL '3' MONTH
    AND EXISTS (
        SELECT
            *
        FROM
            lineitem
        WHERE
            l_orderkey = o_orderkey
            AND l_commitdate < l_receiptdate
    )
GROUP BY
    o_orderpriority
ORDER BY
    o_orderpriority;
NOTICE:  jointype=inner pps_NULL?=0
NOTICE:  jointype=outer pps_NULL?=0
NOTICE:  jointype=inner pps_NULL?=0
NOTICE:  jointype=outer pps_NULL?=0
NOTICE:  added SEMIJOIN_UNIQUE
NOTICE:  pgpa_plan_walker: walking over SEMIJOIN_UNIQUE features: 3, 
sj_unique_rtis=<> sj_unique_rels=<>
ERROR:  unique semijoin found for relids (b 3) but not observed during planning
dbt3=#
dbt3=#
dbt3=# explain (costs off, plan_advice) SELECT
    o_orderpriority,
    count(*) AS order_count
FROM
    orders
WHERE
    o_orderdate >= DATE '1993-07-01'
    AND o_orderdate < DATE '1993-07-01' + INTERVAL '3' MONTH
    AND EXISTS (
        SELECT
            *
        FROM
            lineitem
        WHERE
            l_orderkey = o_orderkey
            AND l_commitdate < l_receiptdate
    )
GROUP BY
    o_orderpriority
ORDER BY
    o_orderpriority;
NOTICE:  jointype=inner pps_NULL?=0
NOTICE:  found=0
NOTICE:  not a dupllicate, appending "(b 3)" to pps->sj_unique_rels
NOTICE:  jointype=outer pps_NULL?=0
NOTICE:  found=true! (ur->plan_name=(null) bms_ur_relids=3)
NOTICE:  found=1
NOTICE:  jointype=inner pps_NULL?=0
NOTICE:  found=true! (ur->plan_name=(null) bms_ur_relids=3)
NOTICE:  found=1
NOTICE:  jointype=outer pps_NULL?=0
NOTICE:  found=true! (ur->plan_name=(null) bms_ur_relids=3)
NOTICE:  found=1
NOTICE:  added SEMIJOIN_UNIQUE
WARNING:  could not dump unrecognized node type: 0
NOTICE:  pgpa_plan_walker: walking over SEMIJOIN_UNIQUE features: 3, 
sj_unique_rtis=((b 3)) sj_unique_rels=({})
                                                                QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------
 GroupAggregate
   Group Key: orders.o_orderpriority
   ->  Sort
         Sort Key: orders.o_orderpriority
         ->  Nested Loop
               ->  HashAggregate
                     Group Key: lineitem.l_orderkey
                     ->  Seq Scan on lineitem
                           Filter: (l_commitdate < l_receiptdate)
               ->  Index Scan using pk_orders on orders
                     Index Cond: (o_orderkey = lineitem.l_orderkey)
                     Filter: ((o_orderdate >= '1993-07-01'::date) AND 
(o_orderdate < '1993-10-01 00:00:00'::timestamp without time zone))
 Supplied Plan Advice:
   SEQ_SCAN(lineitem) /* matched */
   INDEX_SCAN(orders public.pk_orders) /* matched */
   JOIN_ORDER(lineitem orders) /* matched, conflicting */
   NESTED_LOOP_PLAIN(orders) /* matched, conflicting */
   SEMIJOIN_UNIQUE(lineitem) /* matched, conflicting */
   NO_GATHER(orders) /* matched */
   NO_GATHER(lineitem) /* matched */
 Generated Plan Advice:
   JOIN_ORDER(lineitem orders)
   NESTED_LOOP_PLAIN(orders)
   SEQ_SCAN(lineitem)
   INDEX_SCAN(orders public.pk_orders)
   SEMIJOIN_UNIQUE(lineitem)
   NO_GATHER(orders lineitem)
(27 rows)

diff --git a/contrib/pg_plan_advice/pgpa_planner.c 
b/contrib/pg_plan_advice/pgpa_planner.c
index fdc53da9742..a5b784fd151 100644
--- a/contrib/pg_plan_advice/pgpa_planner.c
+++ b/contrib/pg_plan_advice/pgpa_planner.c
@@ -457,7 +457,11 @@ pgpa_join_path_setup(PlannerInfo *root, RelOptInfo 
*joinrel,
 
                uniquerel = jointype == JOIN_UNIQUE_OUTER ? outerrel : innerrel;
                pps = GetPlannerGlobalExtensionState(root->glob, 
planner_extension_id);
-               if (pps->generate_advice_string)
+               elog(NOTICE, "jointype=%s pps_NULL?=%d", jointype == 
JOIN_UNIQUE_OUTER ? "outer" : "inner", pps == NULL);
+               // JW: pps NULL fix
+               //if (pps != NULL && pps->generate_advice_string)
+               // JW: pps NULL fix with generating SJ also for normal EXPLAIN
+               if (pps != NULL)
                {
                        bool            found = false;
 
@@ -471,19 +475,27 @@ pgpa_join_path_setup(PlannerInfo *root, RelOptInfo 
*joinrel,
                                if (root->plan_name == ur->plan_name &&
                                        bms_equal(uniquerel->relids, 
ur->relids))
                                {
+                                       elog(NOTICE, "found=true! 
(ur->plan_name=%s bms_ur_relids=%s)", 
+                                               ur->plan_name, 
pgpa_bms_to_cstring(ur->relids));
                                        found = true;
                                        break;
                                }
                        }
+                       elog(NOTICE, "found=%d", found);
 
                        /* If not a duplicate, append to the list. */
                        if (!found)
                        {
+                               StringInfoData buf;
+
                                pgpa_sj_unique_rel *ur = 
palloc_object(pgpa_sj_unique_rel);
 
                                ur->plan_name = root->plan_name;
                                ur->relids = uniquerel->relids;
                                pps->sj_unique_rels = 
lappend(pps->sj_unique_rels, ur);
+                               initStringInfo(&buf);
+                               outBitmapset(&buf, uniquerel->relids);
+                               elog(NOTICE, "not a dupllicate, appending 
\"%s\" to pps->sj_unique_rels", buf.data);
                        }
                }
        }
@@ -982,7 +994,8 @@ pgpa_planner_apply_join_path_advice(JoinType jointype, 
uint64 *pgs_mask_p,
                                jo_deny_indexes = 
bms_add_member(jo_deny_indexes, i);
                        else if (restrict_method)
                        {
-                               jo_permit_indexes = 
bms_add_member(jo_permit_indexes, i);
+                               //JW
+                               //jo_permit_indexes = 
bms_add_member(jo_permit_indexes, i);
                                jm_indexes = bms_add_member(jo_permit_indexes, 
i);
                                if (join_mask != 0 && join_mask != my_join_mask)
                                        jm_conflict = true;
@@ -1038,8 +1051,9 @@ pgpa_planner_apply_join_path_advice(JoinType jointype, 
uint64 *pgs_mask_p,
                                }
                                else if (advice_unique != jt_unique)
                                        jo_deny_indexes = 
bms_add_member(jo_deny_indexes, i);
-                               else
-                                       jo_permit_indexes = 
bms_add_member(jo_permit_indexes, i);
+                               //JW
+                               //else
+                               //      jo_permit_indexes = 
bms_add_member(jo_permit_indexes, i);
                        }
                        continue;
                }
diff --git a/contrib/pg_plan_advice/pgpa_walker.c 
b/contrib/pg_plan_advice/pgpa_walker.c
index 29973c93b0b..07db28630a8 100644
--- a/contrib/pg_plan_advice/pgpa_walker.c
+++ b/contrib/pg_plan_advice/pgpa_walker.c
@@ -59,6 +59,31 @@ static Index pgpa_walker_get_rti(Index rtable_length,
                                                                 
pgpa_identifier *rt_identifiers,
                                                                 
pgpa_identifier *rid);
 
+
+/*
+ * Convert a bitmapset to a C string of comma-separated integers.
+ */
+static char *
+pgpa_bms_to_cstring(Bitmapset *bms)
+{
+       StringInfoData buf;
+       int                     x = -1;
+
+       if (bms_is_empty(bms))
+               return "none";
+
+       initStringInfo(&buf);
+       while ((x = bms_next_member(bms, x)) >= 0)
+       {
+               if (buf.len > 0)
+                       appendStringInfo(&buf, ", %d", x);
+               else
+                       appendStringInfo(&buf, "%d", x);
+       }
+
+       return buf.data;
+}
+
 /*
  * Top-level entrypoint for the plan tree walk.
  *
@@ -135,6 +160,9 @@ pgpa_plan_walker(pgpa_plan_walker_context *walker, 
PlannedStmt *pstmt,
        foreach_ptr(pgpa_query_feature, qf,
                                
walker->query_features[PGPAQF_SEMIJOIN_NON_UNIQUE])
        {
+               elog(NOTICE, "pgpa_plan_walker: walking over 
SEMIJOIN_NON_UNIQUE features: %s",
+                       pgpa_bms_to_cstring(qf->relids));
+
                if (list_member(sj_unique_rtis, qf->relids))
                        sj_nonunique_qfs = lappend(sj_nonunique_qfs, qf);
        }
@@ -148,6 +176,8 @@ pgpa_plan_walker(pgpa_plan_walker_context *walker, 
PlannedStmt *pstmt,
        foreach_ptr(pgpa_query_feature, qf,
                                walker->query_features[PGPAQF_SEMIJOIN_UNIQUE])
        {
+               elog(NOTICE, "pgpa_plan_walker: walking over SEMIJOIN_UNIQUE 
features: %s, sj_unique_rtis=%s sj_unique_rels=%s",
+                       pgpa_bms_to_cstring(qf->relids), 
nodeToString(sj_unique_rtis), nodeToString(sj_unique_rels));
                if (!list_member(sj_unique_rtis, qf->relids))
                {
                        StringInfoData buf;
@@ -479,6 +509,11 @@ pgpa_add_future_feature(pgpa_plan_walker_context *walker,
 
        walker->future_query_features =
                lappend(walker->future_query_features, qf);
+
+       if(type == PGPAQF_SEMIJOIN_NON_UNIQUE)
+               elog(NOTICE, "added SEMIJOIN_NON_UNIQUE");
+       else if(type == PGPAQF_SEMIJOIN_UNIQUE)
+               elog(NOTICE, "added SEMIJOIN_UNIQUE");
 }
 
 /*

Reply via email to