Re: should we have a fast-path planning for OLTP starjoins?

Tomas Vondra Sat, 08 Nov 2025 12:15:25 -0800

On 9/23/25 21:46, Tom Lane wrote:
> [ sorry for ridiculously slow response to this ]
> 
> Tomas Vondra <[email protected]> writes:
>> Here's a patch trying to do it more like this - by manipulating the
>> lists describing the join problems, before it's passed the the actual
>> join search algorithm (which is where the PoC patch did this).
>> I wonder if this is roughly the place where you imagined this would be
>> done, or if you envision some other issue with this approach.
> 
> Cool.  This is proof-of-concept that manipulating the joinlist can
> do what we need done.  So we can move on to what heuristics we need
> to use.
>


Thanks. Good to hear this seems like a reasonable place.

>> I initially tried to manipulate the joinlist much earlier - pretty much
>> right at the end of deconstruct_jointree. But that turned out to be way
>> too early. To identify dimensions etc. we need to check stuff about
>> foreign keys, join clauses, ... and that's not available that early.
> 
>> So I think this needs to happen much later in query_planner(), and the
>> patch does it right before the make_one_rel() call. Maybe that's too
>> late, but it needs to happen after match_foreign_keys_to_quals(), as it
>> relies on some of the FK info built by that call. Maybe we could call
>> match_foreign_keys_to_quals() earlier, but I don't quite see any
>> benefits of doing that ...
> 
> I don't have a problem with doing it where you did it, but the comment
> should explain the placement.  What you do have in the comment mostly
> belongs with the code, too; it's not the business of the caller.  So
> in planmain.c something like
> 
> +     /*
> +      * Try to simplify the join search problem for starjoin-like joins.
> +      * This step relies on info about FK relationships, so we can't do it
> +      * till after match_foreign_keys_to_quals().
> +      */
> 
> would be more appropriate IMO.

I agree. I've adopted your wording, and moved the original comment to
starjoin_adjust_joins (with some changes).

> I'd be slightly inclined to put the GUC test there, too:
> 
> +     if (enable_starjoin_join_search)
> +             joinlist = starjoin_adjust_joins(root, joinlist);
> 

I'm not sure I like this very mcuh. No other call in query_planner()
does it like that. Most don't have such GUC, but at least
remove_useless_self_joins does, and it still doesn't check it here.

> 
> I agree that you need to worry about join order restrictions,
> and that it's not immediately clear how to do that.  join_is_legal
> would work if we could call it, but the problem is that at this
> stage we'll only have RelOptInfos for base rels not join rels.
> If we have a joinlist (A B C D) and we are considering treating
> C as a dimension table, then the questions we have to ask are:
> (a) is it okay to build the (A B D) join without C?
> (b) is it okay to join (A B D) to C?
> 
> In this simple case, I think (b) must be true if (a) is, but
> I'm not quite sure that that's so in more complex cases with
> multiple candidates for dimension tables.  In any case,
> join_is_legal won't help us if we don't have join RelOptInfos.
> 
> I'm inclined to start by using has_join_restriction: if that
> says "false" for a candidate dimension table, it should be safe
> to postpone the join to the dimension table.  We might be able
> to refine that later.
> 

Thanks. I agree has_join_restriction seems like a good start, I'll give
that a try in the next patch version.

When thinking about this, I realized the has_join_restriction() is only
ever used in join_search_one_level(), i.e. when dealing with each small
join order problem. Doesn't this mean the deconstructed jointree must
already consider the restrictions in some way? I don't see any explicit
mentions of such join order restrictions in deconstruct_recurse. It must
not violate any ordering restrictions by splitting the joins in a
"wrong" way, right? If I set join_collapse_limit=1 it still needs to
satisfy all the rules.

I was wondering if maybe we could piggy-back on that, somehow. But maybe
that's not very practical, and has_join_restriction() is the way to go.

It's been a while since I looked at this patch, so it's possible I
already concluded that wouldn't work, and forgot about it :-/

>> The second example (create-2/select-2) is quite different, in that it's
>> nor a starjoin schema. It still joins one "main" table with multiple
>> "dimensions", but the FKs go in the other direction (to a single column
>> in the main table). But it has a very similar bottleneck - the order of
>> the joins is expensive, but it often does not matter very much, because
>> the query matches just one row anyway. And even if it returns more rows,
>> isn't the join order determined just by the selectivity of each join?
>> Maybe the starjoin optimization could be made to work for this type too?
> 
> Yeah, I'm slightly itchy about relying on FKs in this heuristic at
> all; it doesn't seem like quite the right thing.  I think we do want
> one side of the join to be joining to a PK or at least a unique index,
> but I'm not sure we need to insist on there being an FK relationship.
> 

True, requiring the FK may be unnecessary in this case. We do need to
guarantee the cardinality does not change, but a UNIQUE + LEFT JOIN
seems to be enough for that.

BTW the v3 lost the patch/ directory. I assume that wasn't intentional,
so I added it back into v4.

> A couple of minor coding notes:
> 
> * There's no point in doing anything (except recursing) if the joinlist
> contains fewer than 3 items, and maybe as a further heuristic
> this shouldn't kick in till later yet, like 5 or so items.
> When there are just a few items, the possibility of missing the
> best plan seems to outweigh the minimal savings in plan time.
> 
> * Joinlists never contain anything but RangeTblRefs and sub-lists.
> See make_rel_from_joinlist.
> 
> * Your reconstructed joinlist is overly complicated.  Instead of
> 
> +             newlist = list_make2(newlist, list_make1(lfirst(lc)));
> 
> you could just do
> 
> +             newlist = list_make2(newlist, lfirst(lc));
> 
> because a single-element subproblem is useless.
> 
> I notice that the patch doesn't apply cleanly anymore because of
> the introduction of guc_parameters.dat.  So here's a v3 that
> rebases over that, and I took the liberty of fixing the joinlist
> construction as above, but I didn't do anything else.
> 

Thanks. I agree with all of these comments, and updated v4 accordingly.
cfbot started complaining about guc_parameters.dat and a couple more
things, so v4 fixes that too.


regards

-- 
Tomas Vondra

From b8473cd5c93bf594896d2be40a7f093e90aeca16 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <[email protected]>
Date: Sat, 8 Nov 2025 18:16:30 +0100
Subject: [PATCH v4] Simplify planning of starjoin queries

---
 patch/create-1.sql                            |  18 +
 patch/create-2.sql                            |  11 +
 patch/select-1.sql                            |   9 +
 patch/select-2.sql                            |   9 +
 src/backend/optimizer/plan/analyzejoins.c     | 443 ++++++++++++++++++
 src/backend/optimizer/plan/planmain.c         |   7 +
 src/backend/utils/misc/guc_parameters.dat     |   7 +
 src/backend/utils/misc/postgresql.conf.sample |   1 +
 src/include/optimizer/planmain.h              |   2 +
 src/test/regress/expected/sysviews.out        |   3 +-
 10 files changed, 509 insertions(+), 1 deletion(-)
 create mode 100644 patch/create-1.sql
 create mode 100644 patch/create-2.sql
 create mode 100644 patch/select-1.sql
 create mode 100644 patch/select-2.sql

diff --git a/patch/create-1.sql b/patch/create-1.sql
new file mode 100644
index 00000000000..8df04fd0677
--- /dev/null
+++ b/patch/create-1.sql
@@ -0,0 +1,18 @@
+create table dim1 (id int primary key, val1 text);
+create table dim2 (id int primary key, val2 text);
+create table dim3 (id int primary key, val3 text);
+create table dim4 (id int primary key, val4 text);
+create table dim5 (id int primary key, val5 text);
+create table dim6 (id int primary key, val6 text);
+create table dim7 (id int primary key, val7 text);
+
+create table t (id serial primary key,
+                id1 int references dim1(id),
+                id2 int references dim2(id),
+                id3 int references dim3(id),
+                id4 int references dim4(id),
+                id5 int references dim5(id),
+                id6 int references dim6(id),
+                id7 int references dim7(id));
+
+vacuum analyze;
diff --git a/patch/create-2.sql b/patch/create-2.sql
new file mode 100644
index 00000000000..cdf612dde8f
--- /dev/null
+++ b/patch/create-2.sql
@@ -0,0 +1,11 @@
+create table t (id serial primary key, a text);
+
+create table dim1 (id1 int primary key references t(id), val1 text);
+create table dim2 (id2 int primary key references t(id), val2 text);
+create table dim3 (id3 int primary key references t(id), val3 text);
+create table dim4 (id4 int primary key references t(id), val4 text);
+create table dim5 (id5 int primary key references t(id), val5 text);
+create table dim6 (id6 int primary key references t(id), val6 text);
+create table dim7 (id7 int primary key references t(id), val7 text);
+
+vacuum analyze;
diff --git a/patch/select-1.sql b/patch/select-1.sql
new file mode 100644
index 00000000000..1535ddcdc8f
--- /dev/null
+++ b/patch/select-1.sql
@@ -0,0 +1,9 @@
+--set join_collapse_limit = 1;
+select * from t
+    join dim1 on (dim1.id = id1)
+    join dim2 on (dim2.id = id2)
+    join dim3 on (dim3.id = id3)
+    join dim4 on (dim4.id = id4)
+    join dim5 on (dim5.id = id5)
+    join dim6 on (dim6.id = id6)
+    join dim7 on (dim7.id = id7);
diff --git a/patch/select-2.sql b/patch/select-2.sql
new file mode 100644
index 00000000000..4e1d2a7b0e7
--- /dev/null
+++ b/patch/select-2.sql
@@ -0,0 +1,9 @@
+-- set join_collapse_limit = 1;
+select * from t
+    left join dim1 on (id = id1)
+    left join dim2 on (id = id2)
+    left join dim3 on (id = id3)
+    left join dim4 on (id = id4)
+    left join dim5 on (id = id5)
+    left join dim6 on (id = id6)
+    left join dim7 on (id = id7);
diff --git a/src/backend/optimizer/plan/analyzejoins.c b/src/backend/optimizer/plan/analyzejoins.c
index e592e1ac3d1..66b1c8c540c 100644
--- a/src/backend/optimizer/plan/analyzejoins.c
+++ b/src/backend/optimizer/plan/analyzejoins.c
@@ -51,6 +51,7 @@ typedef struct
 } SelfJoinCandidate;
 
 bool		enable_self_join_elimination;
+bool		enable_starjoin_join_search;
 
 /* local functions */
 static bool join_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo);
@@ -2511,3 +2512,445 @@ remove_useless_self_joins(PlannerInfo *root, List *joinlist)
 
 	return joinlist;
 }
+
+/*
+ * starjoin_match_to_foreign_key
+ *		Try to match a join to a FK constraint.
+ *
+ * For a relation to be a dimension (for the starjoin heuristics), it needs
+ * to be joined through a FK constraint. The dimension is expected to be
+ * on the PK side of the join. The relation must not have any additional
+ * join clauses, beyond those matching the foreign key.
+ *
+ * We already have a list of relevant foreign keys, and we use that info
+ * for selectivity estimation in get_foreign_key_join_selectivity(). And
+ * we're actually doing something quite similar here.
+ *
+ * XXX Do we need to worry about the join type, e.g. inner/outer joins,
+ * semi/anti? get_foreign_key_join_selectivity() does care about it, and
+ * ignores some of those cases. Maybe we should too?
+ *
+ * XXX Check there are no other join clauses, beyond those matching the
+ * foreign key. But do we already have the joininfo at this point? Some
+ * of this stuff gets build only during the join order search later.
+ * The match_foreign_keys_to_quals() probably needs to be aware of all
+ * this, so how does it do that?
+ */
+static bool
+starjoin_match_to_foreign_key(PlannerInfo *root, RelOptInfo *rel)
+{
+	ListCell   *lc;
+
+	/* Consider each FK constraint that is known to match the query */
+	foreach(lc, root->fkey_list)
+	{
+		ForeignKeyOptInfo *fkinfo = (ForeignKeyOptInfo *) lfirst(lc);
+		int			nmatches = 0;
+
+		/* rel is not the referenced table of the FK */
+		if (fkinfo->ref_relid != rel->relid)
+			continue;
+
+		/*
+		 * Do we have a match for each key of the FK?
+		 *
+		 * XXX get_foreign_key_join_selectivity checks EquivalenceClasses,
+		 * we should probably (definitely) do that here too.
+		 *
+		 * XXX We should check that all the clauses have the same relation
+		 * on the other side (for multi-column keys). And that there are
+		 * no other join clauses other than those matching the FK.
+		 *
+		 * XXX Do we need to check that the FK side of the join (i.e. the fact
+		 * table) has the columns referenced as NOT NULL? Otherwise we could
+		 * have a FK join that reduces the cardinality, which is one of
+		 * the arguments why it's fine to move the join (that it doesn't
+		 * change the cardinality). But if the join is LEFT JOIN, this
+		 * should be fine too - but do we get here with LEFT JOINs?
+		 *
+		 * XXX Do we need to check if the other side of the FK is in the
+		 * current join list? Maybe it's in some later one?
+		 */
+		for (int i = 0; i < fkinfo->nkeys; i++)
+		{
+			bool		has_matching_clause = false;
+
+			/*
+			 * Is there a clause matching this FK key?
+			 *
+			 * XXX We need to make sure it's a valid match, e.g. that the
+			 * same referencing table matches all keys in a composite FK,
+			 * and so on.
+			 *
+			 * XXX Do we need to check some relationship to other rels in
+			 * the same jointree item? E.g. the referencing table should
+			 * not be a dimensions we already removed.
+			 */
+			if ((fkinfo->rinfos[i] != NULL) || (fkinfo->eclass[i] != NULL))
+			{
+				has_matching_clause = true;
+				nmatches++;
+				continue;
+			}
+
+			/* found a FK key without a matching join clause, ignore the FK */
+			if (has_matching_clause)
+				break;
+		}
+
+		/* matched all FK keys */
+		if (nmatches == fkinfo->nkeys)
+		{
+			return true;
+		}
+	}
+
+	return false;
+}
+
+
+/*
+ * starjoin_is_dimension
+ *		Determine if a range table entry is a dimension in a starjoin.
+ *
+ * To be considered a dimension for the simplified join order search, the
+ * join must not affect the cardinality of the join. We ensure that by
+ * requiring a couple things:
+ *
+ * 1) the join clause has to match a FK (that is, the fact does have at
+ *    most one matching row in the dimension)
+ *
+ * 2) the FK side (the fact table) should be marked as NOT NULL, so that
+ *    we know there's exactly one dimension row for each fact table row
+ *
+ * 3) there must be no additional restrictions on the relation (that
+ *    might eliminate some of the rows, reducing the cardinality)
+ *
+ * XXX The Implementation is incomplete. It probably needs more thought,
+ * considering some join types would allow relaxing some of the checks
+ * (e.g. outer joins may not require checking (2) or possibly even (3),
+ * depending on where the condition is, what varnullingrels it has).
+ *
+ * XXX I wonder if we could handle (3) by ordering the dimensions by the
+ * selectivity of the restriction. There are no join clauses between the
+ * dimensions (ignoring the snowflake joins, but even there the clauses
+ * don't go between branches), so the selectivity could be treated as
+ * a measure of how much it shrinks the join result. So we could just
+ * sort the dimensions by this, starting with the lowest selectivity
+ * (close to 0.0), and ending with dimensions without restrictions (in
+ * which case the selectivity is 1.0).
+ *
+ * XXX If the join in INNER, and the fact side has NULL values in the
+ * join key, we might consider nullfrac as restriction.
+ *
+ * XXX I'm not sure how careful this needs to be about join order
+ * restrictions. Maybe it should call have_relevant_joinclause and
+ * have_join_order_restriction, to ensure the join order is OK?
+ *
+ * The optimizer/README is not very clear about this, but maybe it's
+ * a too specific question. It seems to say the relations in those
+ * lists can be joined in any order (lines 94 and 106). Maybe that's
+ * not what it means, or I'm misunderstanding it.
+ *
+ * It however seems has_join_restrictions() in join_search_one_level()
+ * forces the code to look only at "earlier" rels in the list
+ *
+ *     first_rel = foreach_current_index(r) + 1
+ *
+ * So maybe we just need to stop once we find a rel with a restriction,
+ * as determined byhas_join_restrictions()?
+ *
+ * But there's also join_is_legal() to check legality of joins, with
+ * LEFT/RIGHT joins, and IN/EXISTS clauses. See README line 188. And it
+ * also looks-up the SpecialJoinInfo for the join. So maybe we should
+ * lookup RelOptInfo for both sides of the join, and call join_is_legal
+ * on that? Might be too expensive, though. Maybe do that only when
+ * has_join_restrictions already says yes?
+ *
+ * Maybe we should use has_join_restrictions(), but in a  different way.
+ * We could still treat rels with restrictions as dimensions, and move
+ * that to the separate list (that doesn't change the join order), but
+ * stop once we hit the first non-dimension with a restriction? Because
+ * if any relation after that was a dimention, we wouldn't be able to
+ * move it to the separate list. It'd change the join order in a way
+ * that might violate the restriction. I believe that's the idea behind
+ * first_rel in join_search_one_level(), but maybe not.
+ *
+ * Perhaps have_join_order_restriction and have_relevant_joinclause are
+ * useful for this, rather than has_join_restrictions? We might look at
+ * actual pairs of relations, and/or check there's no join order
+ * restriction with respect to the relations we skipped/moved to the
+ * list of dimension?
+ *
+ * AFAICS it's just the skipping that can break the order restrictions?
+ * Adding something to the list of dimensions keeps the order (at least
+ * with respect to the rels after it).
+ */
+static bool
+starjoin_is_dimension(PlannerInfo *root, RangeTblRef *rtr)
+{
+	Index			rti = rtr->rtindex;
+	RangeTblEntry  *rte = root->simple_rte_array[rti];
+	RelOptInfo	   *rel = root->simple_rel_array[rti];
+
+	/* only plain relations can be dimensions (we need FK/PK join) */
+	if ((rte->rtekind != RTE_RELATION) ||
+		(rel->reloptkind != RELOPT_BASEREL))
+		return false;
+
+	/*
+	 * Does it have any conditions/restrictions that may affect the number
+	 * of rows matched? If yes, don't treat as dimension.
+	 *
+	 * Dimensions in a starjoin may have restrictions, but that means it'll
+	 * change cardinality of the joins (reduce it), so it may be better to
+	 * join it early. We leave it to the regular join order planning. The
+	 * expectation is that most dimensions won't have extra restrictions.
+	 *
+	 * XXX Should we check some other fields, like lateral references, and
+	 * so on? Or is that unnecessary? What about eclasses?
+	 */
+	if (rel->baserestrictinfo != NIL)
+		return false;
+
+	/*
+	 * See if the join clause matches a foreign key. It should match a
+	 * single relation on the other side, and the dimension should be on
+	 * the PK side.
+	 *
+	 * XXX loosely inspired by get_foreign_key_join_selectivity()
+	 */
+	if (!starjoin_match_to_foreign_key(root, rel))
+		return false;
+
+	/*
+	 * XXX Maybe some additional checks here ...
+	 */
+
+	return true;
+}
+
+/*
+ * starjoin_adjust_joins
+ *		Adjust the jointree for starjoins, to simplify the join order search.
+ *
+ * Try to simplify the join search problem for starjoin-like joins, with
+ * joins over FK relationships. The dimensions can be joined in almost
+ * any order, which is about the worst case for the standard join order
+ * search, and can be close to factorial complexity. But there's not much
+ * difference between such join orders, so we just leave the dimensions at
+ * the end of each group (as determined by the join_collapse_limit earlier).
+ *
+ * The join search for starjoin queries is surprisingly expensive, because
+ * there are very few join order restrictions. Consider a starjoin query
+ *
+ *  SELECT * FROM f
+ *     JOIN d1 ON (f.id1 = d1.id)
+ *     JOIN d2 ON (f.id2 = d2.id)
+ *     ...
+ *     JOIN d9 ON (f.id9 = d9.id)
+ *
+ * There are no clauses between the dimension tables (d#), which means those
+ * tables can be joined in almost arbitrary order. This means the standard
+ * join_order_search() would explore a N! possible join orders. It is not
+ * that bad in practice, because we split the problem by from_collapse_limit
+ * into a sequence of smaller problems, but even for the default limit of
+ * 8 relations it's quite expensive. This can be easily demonstrated by
+ * setting from_collapse_limit=1 for example starjoin queries.
+ *
+ * The idea here is to apply a much simpler join order search for this type
+ * of queries, without too much risk of picking a much worse plans. It is
+ * however a trade off between how expensive we allow this to be, and how
+ * good the decisions will be. This can help only starjoins with multiple
+ * dimension tables, and we don't want to harm planning of other queries,
+ * so the basic "query shape" detection needs to be very cheap. And then
+ * it needs to be cheaper than the regular join order search.
+ *
+ * If a perfect check is impossible or too expensive, it's better to end
+ * up with a cheap false negative (i.e. and not use the optimization),
+ * rather than risk regressions in other cases.
+ *
+ * The simplified join order search relies on the fact that if the joins
+ * to dimensions do not alter the cardinality of the join relation, then
+ * the relative order of those joins does not matter. All the possible
+ * orders are guaranteed to perform the same. So we can simply pick one
+ * of those orders, and "hardcode" it in the join tree later passed to the
+ * join_order_search().
+ *
+ * The query may involve joins to additional (non-dimension) tables, and
+ * those may alter cardinality. Some joins may increase it, other joins
+ * may decrease it. In principle, it'd be best to first perform all the
+ * joins that reduce join size, then join all the dimensions, and finally
+ * perform joins that may increase the join size. But this is not done
+ * now, currently we simply apply all the dimensions at the end, hoping
+ * that the earlier joins did not inflate the join too much.
+ *
+ * The definition of a dimension is a bit vague. For our definition see
+ * the comment at starjoin_is_dimension().
+ *
+ * The optimization works by manipulating the joinlist (originally built
+ * by deconstruct_jointree), which decomposed the original jointree into
+ * smaller "problems" depending on join type and join_collapse_limit. We
+ * inspect those smaller lists, and selectively split them into smaller
+ * problems to force a join order. This may effectively undo some of the
+ * merging done by deconstruct_jointree(), which tries to build problems
+ * with up to join_collapse_limit relations.
+ *
+ * For example, imagine a join problem with 8 rels - one fact table and
+ * then 7 dimensions, which we can represent a joinlist with 8 elements.
+ *
+ * (D7, D6, D5, D4, D3, D2, D1, F)
+ *
+ * Assuming all those joins meet the requirements (have a matching FK,
+ * don't affect the join cardinality, ...), then we can split this into
+ *
+ * (D7, (D6, (D5, (D4, (D3, (D2, (D1, F)))))))
+ *
+ * which is a nested joinlist, with only two elements on each level. That
+ * means there's no need for expensive join order search, there's only
+ * one way to join the relations (two, if we consider the relations may
+ * switch sides).
+ *
+ * The joinlist may already be nested, with multiple smaller subproblems.
+ * We look at each individual join problem independently, i.e. we don't
+ * try to merge problems to build join_collapse_limit problems again.
+ * This is partially to keep it cheap/simple, but also so not change
+ * behavior for cases when people use join_collapse_limit to force some
+ * particular join shape.
+ *
+ * XXX A possible improvement is to allow handling snowflake joins, i.e.
+ * recursive dimensions. That would require a somewhat more complicated
+ * processing, because a dimension would be allowed other rels, as long
+ * as those are dimensions too. And we'd need to be more careful about
+ * the order in which join them to the top of the join.
+ *
+ * XXX One possible risk is that moving the dimension joins at the very
+ * end may move that after joins that increase the cardinality. Which
+ * may cause a regression. Such joins however don't seem very common, at
+ * least in regular starjoin queries. So maybe we could simply check if
+ * there are any such joins and disable this optimization. Is there a
+ * cheap way to identify that a join increases cardinality?
+ *
+ * XXX Ideally, we'd perform the dimension joins at the place with the
+ * lowest cardinality. Imagine a joinlist
+ *
+ * (D1, D2, A, B, F)
+ *
+ * Where A increases join cardinality, while B does not (possibly even
+ * reduces it). Ideally, we'd do the join like this
+ *
+ * (A, (D2, (D1, (B, F))))
+ *
+ * so D1/D2 get joined at the point of "lowest cardinality". We probably
+ * don't want to do all this cardinality estimation work here, it'd copy
+ * what we already do in the join_order_search(). Perhaps we could invent
+ * a "join item" representing a join to all those dimensions, and pass it
+ * to join_order_search()? And let it pick the right place for it? It'd
+ * always join them in the same order, it'd not reorder them. It would
+ * still do the regular cardinality estimations etc. It would be trivial
+ * to disable the optimization if needed - don't collapse the dimensions
+ * into the new type of join item.
+ */
+List *
+starjoin_adjust_joins(PlannerInfo *root, List *joinlist)
+{
+	ListCell *lc;
+	List *newlist = NIL;
+	List *dimensions = NIL;
+	int		nlist = list_length(joinlist);
+
+	/* Do nothing if starjoin optimization not enabled. */
+	if (!enable_starjoin_join_search)
+		return joinlist;
+
+	/*
+	 * Do nothing if starjoin optimization not applicable.
+	 *
+	 * XXX It might seems we can skip this for lists with <= 2 items, but
+	 * that does not work, the elements may be nested list, and we need to
+	 * descend into those. So do what remove_useless_self_joins() does, and
+	 * only bail out for the simplest single-relation case (i.e. no joins).
+	 */
+	if (joinlist == NIL ||
+		(nlist == 1 && !IsA(linitial(joinlist), List)))
+		return joinlist;
+
+	/*
+	 * Process the current join problem - split the elements into dimensions
+	 * and non-dimensions. If there are dimensions, add them back at the end,
+	 * as small single-rel joins.
+	 *
+	 * The list may contain various types of elements. It may contain a list,
+	 * which means it's an independent join search problem and we need to
+	 * process it recursively. Or it may be RangeTblRef, in which case we
+	 * need to check if it's a dimension. Other types of elements are just
+	 * added back to the list as-is.
+	 *
+	 * XXX I think we need to be careful to keep the order of the list (for
+	 * the non-dimension entries). The join_search_one_level() relies on
+	 * that when handling join order restrictions.
+	 *
+	 * XXX It might be better to only create a new list if needed, i.e. once
+	 * we find the first dimension. So that non-starjoin queries don't pay
+	 * for something they don't need. A mutable iterator might be a way, but
+	 * I'm not sure how expensive this really is.
+	 */
+	foreach (lc, joinlist)
+	{
+		Node *item = (Node *) lfirst(lc);
+
+		/* a separate join search problem, handle it recursively */
+		if (IsA(item, List))
+		{
+			newlist = lappend(newlist,
+							  starjoin_adjust_joins(root, (List *) item));
+			continue;
+		}
+
+		/*
+		 * If it's not a List, it has to be a RangeTblRef - jinlists can't
+		 * contain any other elements (see make_rel_from_joinlist).
+		 */
+		Assert(IsA(item, RangeTblRef));
+
+		/*
+		 * An entry representing a baserel. If it's a dimension, save it in
+		 * a separate list, and we'll add it at the "top" of the join at the
+		 * end. Otherwise add it to the list just like other elements.
+		 *
+		 * We do this only when the joinlist has at least 3 items. For fewer
+		 * rels the optimization does not matter, there's only a single join
+		 * order anyway. That only skips the optimization on this level - we
+		 * still do the recursion, and that might hit a larger join problem.
+		 *
+		 * XXX We might have a new GUC to customize the cutoff limit, but for
+		 * now it seems good enough to do it whenever applicable. If we find
+		 * it's not worth it for less than N rels, we can add it later.
+		 */
+		if ((nlist >= 3) &&
+			starjoin_is_dimension(root, (RangeTblRef *) item))
+		{
+			dimensions = lappend(dimensions, item);
+			continue;
+		}
+
+		/* not a dimension, add it to the list directly */
+		newlist = lappend(newlist, item);
+	}
+
+	/*
+	 * If we found some dimensions, add them to the join tree one by one.
+	 * The exact order does not matter, so we add them in the order we
+	 * found them in the original list.
+	 *
+	 * We need to add them by creating smaller 2-element lists, with the rest
+	 * of the list being a sub-problem and then adding the dimension
+	 * table. This is how we force the desired join order.
+	 */
+	foreach (lc, dimensions)
+	{
+		newlist = list_make2(newlist, lfirst(lc));
+	}
+
+	return newlist;
+}
diff --git a/src/backend/optimizer/plan/planmain.c b/src/backend/optimizer/plan/planmain.c
index eefc486a566..14abbbbd361 100644
--- a/src/backend/optimizer/plan/planmain.c
+++ b/src/backend/optimizer/plan/planmain.c
@@ -291,6 +291,13 @@ query_planner(PlannerInfo *root,
 	 */
 	distribute_row_identity_vars(root);
 
+	/*
+	 * Try to simplify the join search problem for starjoin-like joins.
+	 * This step relies on info about FK relationships, so we can't do it
+	 * till after match_foreign_keys_to_quals().
+	 */
+	joinlist = starjoin_adjust_joins(root, joinlist);
+
 	/*
 	 * Ready to do the primary planning.
 	 */
diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat
index 1128167c025..510241e75be 100644
--- a/src/backend/utils/misc/guc_parameters.dat
+++ b/src/backend/utils/misc/guc_parameters.dat
@@ -968,6 +968,13 @@
   boot_val => 'true',
 },
 
+{ name => 'enable_starjoin_join_search', type => 'bool', context => 'PGC_USERSET', group => 'QUERY_TUNING_METHOD',
+  short_desc => 'Enables simplified join order planning for starjoins.',
+  flags => 'GUC_EXPLAIN',
+  variable => 'enable_starjoin_join_search',
+  boot_val => 'false',
+},
+
 { name => 'enable_tidscan', type => 'bool', context => 'PGC_USERSET', group => 'QUERY_TUNING_METHOD',
   short_desc => 'Enables the planner\'s use of TID scan plans.',
   flags => 'GUC_EXPLAIN',
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index f62b61967ef..4e5542f690c 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -424,6 +424,7 @@
 #enable_presorted_aggregate = on
 #enable_seqscan = on
 #enable_sort = on
+enable_starjoin_join_search = off
 #enable_tidscan = on
 #enable_group_by_reordering = on
 #enable_distinct_reordering = on
diff --git a/src/include/optimizer/planmain.h b/src/include/optimizer/planmain.h
index 00addf15992..c30ac3a5754 100644
--- a/src/include/optimizer/planmain.h
+++ b/src/include/optimizer/planmain.h
@@ -21,6 +21,7 @@
 #define DEFAULT_CURSOR_TUPLE_FRACTION 0.1
 extern PGDLLIMPORT double cursor_tuple_fraction;
 extern PGDLLIMPORT bool enable_self_join_elimination;
+extern PGDLLIMPORT bool enable_starjoin_join_search;
 
 /* query_planner callback to compute query_pathkeys */
 typedef void (*query_pathkeys_callback) (PlannerInfo *root, void *extra);
@@ -120,6 +121,7 @@ extern bool innerrel_is_unique_ext(PlannerInfo *root, Relids joinrelids,
 								   JoinType jointype, List *restrictlist,
 								   bool force_cache, List **extra_clauses);
 extern List *remove_useless_self_joins(PlannerInfo *root, List *joinlist);
+extern List *starjoin_adjust_joins(PlannerInfo *root, List *joinlist);
 
 /*
  * prototypes for plan/setrefs.c
diff --git a/src/test/regress/expected/sysviews.out b/src/test/regress/expected/sysviews.out
index 3b37fafa65b..0c3b1eeaba7 100644
--- a/src/test/regress/expected/sysviews.out
+++ b/src/test/regress/expected/sysviews.out
@@ -172,8 +172,9 @@ select name, setting from pg_settings where name like 'enable%';
  enable_self_join_elimination   | on
  enable_seqscan                 | on
  enable_sort                    | on
+ enable_starjoin_join_search    | off
  enable_tidscan                 | on
-(25 rows)
+(26 rows)
 
 -- There are always wait event descriptions for various types.  InjectionPoint
 -- may be present or absent, depending on history since last postmaster start.
-- 
2.51.1

Re: should we have a fast-path planning for OLTP starjoins?

Reply via email to