Re: should we have a fast-path planning for OLTP starjoins?

Tomas Vondra Mon, 28 Jul 2025 12:44:42 -0700

On 2/4/25 22:55, Tom Lane wrote:
> Tomas Vondra <[email protected]> writes:
>>> The interesting thing about this is we pretty much have all the
>>> infrastructure for detecting such FK-related join conditions
>>> already.  Possibly the join order forcing could be done with
>>> existing infrastructure too (by manipulating the joinlist).
>
>> Maybe, interesting. I've ruled out relying on the FKeys early in the
>> coding, but I'm sure there's infrastructure the patch could use.
>
> It would be very sad to do that work twice in a patch that purports
> to reduce planning time.  If it's done too late to suit you now,
> could we move it to happen earlier?
>
>> What kind of "manipulation" of the joinlist you have in mind?
>
> Right now, if we have four tables to join, we have a joinlist
> (A B C D).  (Really they're integer relids, but let's use names here.)
> If we decide to force C to be joined last, it should be sufficient to
> convert this to ((A B D) C).  If C and D both look like candidates for
> this treatment, we can make it be (((A B) C) D) or (((A B) D) C).
> This is pretty much the same thing that happens if you set
> join_collapse_limit to 1 and use JOIN syntax to force a join order.
> In fact, IIRC we start out with nested joinlists and there is some
> code that normally flattens them until it decides it'd be creating
> too large a sub-problem.  I'm suggesting selectively reversing the
> flattening.
>
>                       regards, tom lane

Here's a patch trying to do it more like this - by manipulating the
lists describing the join problems, before it's passed the the actual
join search algorithm (which is where the PoC patch did this).

I wonder if this is roughly the place where you imagined this would be
done, or if you envision some other issue with this approach. The patch
is definitely incomplete, there's plenty of XXX comments about places
missing some code, etc.

I initially tried to manipulate the joinlist much earlier - pretty much
right at the end of deconstruct_jointree. But that turned out to be way
too early. To identify dimensions etc. we need to check stuff about
foreign keys, join clauses, ... and that's not available that early.

So I think this needs to happen much later in query_planner(), and the
patch does it right before the make_one_rel() call. Maybe that's too
late, but it needs to happen after match_foreign_keys_to_quals(), as it
relies on some of the FK info built by that call. Maybe we could call
match_foreign_keys_to_quals() earlier, but I don't quite see any
benefits of doing that ...

On 2/8/25 02:49, Tomas Vondra wrote:
> On 2/8/25 01:23, Tom Lane wrote:
>> Tomas Vondra <[email protected]> writes:
>>> Instead, I was thinking about the "other" joins (if there are any), that
>>> may add or remove rows. AFAIK we want to join the dimensions at the
>>> place with the lowest cardinality - the discussion mostly assumed the
>>> joins would only reduce the cardinality, in which case we'd just leave
>>> the dimensions until the very end.
>>
>>> But ISTM that may not be necessarily true. Let's say there's a join that
>>> "multiplies" each row. It'll probably be done at the end, and the
>>> dimension joins should probably happen right before it ... not sure.
>>
>> I thought the idea here was to get rid of as much join order searching
>> as we could.  Insisting that we get the best possible plan anyway
>> seems counterproductive, not to mention very messy to implement.
>> So I'd just push all these joins to the end and be done with it.
>>
> 
> Right, cutting down on the join order searching is the point. But most
> of the savings comes (I think) from not considering different ordering
> of the dimensions, because those are all essentially the same.
> 
> Consider a join with 16 relations, 10 of which are dimensions. There are
> 10! possible orders of the dimensions, but all of them behave pretty
> much exactly the same. In a way, this behaves almost like a join with 7
> relations, one of which represents all the 10 dimensions.
> 
> I think this "join group" abstraction (a relation representing a bunch
> of relations in a particular order) would make this reasonably clean to
> implement. I haven't tried yet, though.
> 
> Yes, this means we'd explore more orderings, compared to just pushing
> all the dimensions to the end. In the example above, that'd be 7!/6!, so
> up to ~7x orderings.
> 
> I don't know if this is worth the extra complexity, of course.
> 

I'm still concerned about regressions when happen to postpone some
expensive dimension joins after a join that increases the cardinality.
And the "join group" would address that. But It probably is not a very
common query pattern, and it'd require changes to join_search_one_level.

I'm not sure what could / should count as 'dimension'. The patch doesn't
implement this yet, but I think it can mostly copy how we match the FK
to the join in get_foreign_key_join_selectivity. There's probably more
to think about, though. For example, don't we need to check NOT NULL /
nullfrac on the referencing side? Also, it probably interacts with
LEFT/OUTER joins. I plan to start strict and then relax and handle some
additional cases.

I'm however struggling with the concept of join order restrictions a
bit. I suspect we need to worry about that when walking the relation
list and figuring out what can be a dimension, but I've never worked
with this, so my mental model of how this works is not great.

Another question is if this planning shortcut (which for the dimensions
mostly picks a random join order) could have some unexpected impact on
the rest of the planning. For example, could we "miss" some join
producing tuples in an interesting order? Or could we fail to consider a
partition-wise join?

Could this "shortcut" restrict the rest of the plan in some undesirable
way? For example, if some of the tables are partitioned, could this mean
we no longer can do partition-wise joins with the (mostly arbitrary)
join order we picked?

There's also a "patch" directory, with some SQL scripts creating two
simple examples of schemas/queries that would benefit from this. The
"create-1/select-1" examples are the simple starjoins, this thread
focuses on. It might be modified to do "snowflake" join, which is
fundamentally a variant of this query type.

The second example (create-2/select-2) is quite different, in that it's
nor a starjoin schema. It still joins one "main" table with multiple
"dimensions", but the FKs go in the other direction (to a single column
in the main table). But it has a very similar bottleneck - the order of
the joins is expensive, but it often does not matter very much, because
the query matches just one row anyway. And even if it returns more rows,
isn't the join order determined just by the selectivity of each join?
Maybe the starjoin optimization could be made to work for this type too?

regards

-- 
Tomas Vondra

From a821bf2382cd08829b36cd67cfb424340d9f6408 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <[email protected]>
Date: Thu, 26 Jun 2025 17:49:29 +0200
Subject: [PATCH v2] Simplify planning of starjoin queries

---
 patch/create-1.sql                        |  18 +
 patch/create-2.sql                        |  11 +
 patch/select-1.sql                        |   9 +
 patch/select-2.sql                        |   9 +
 src/backend/optimizer/plan/analyzejoins.c | 425 ++++++++++++++++++++++
 src/backend/optimizer/plan/planmain.c     |  10 +
 src/backend/utils/misc/guc_tables.c       |  10 +
 src/include/optimizer/planmain.h          |   2 +
 8 files changed, 494 insertions(+)
 create mode 100644 patch/create-1.sql
 create mode 100644 patch/create-2.sql
 create mode 100644 patch/select-1.sql
 create mode 100644 patch/select-2.sql

diff --git a/patch/create-1.sql b/patch/create-1.sql
new file mode 100644
index 00000000000..8df04fd0677
--- /dev/null
+++ b/patch/create-1.sql
@@ -0,0 +1,18 @@
+create table dim1 (id int primary key, val1 text);
+create table dim2 (id int primary key, val2 text);
+create table dim3 (id int primary key, val3 text);
+create table dim4 (id int primary key, val4 text);
+create table dim5 (id int primary key, val5 text);
+create table dim6 (id int primary key, val6 text);
+create table dim7 (id int primary key, val7 text);
+
+create table t (id serial primary key,
+                id1 int references dim1(id),
+                id2 int references dim2(id),
+                id3 int references dim3(id),
+                id4 int references dim4(id),
+                id5 int references dim5(id),
+                id6 int references dim6(id),
+                id7 int references dim7(id));
+
+vacuum analyze;
diff --git a/patch/create-2.sql b/patch/create-2.sql
new file mode 100644
index 00000000000..cdf612dde8f
--- /dev/null
+++ b/patch/create-2.sql
@@ -0,0 +1,11 @@
+create table t (id serial primary key, a text);
+
+create table dim1 (id1 int primary key references t(id), val1 text);
+create table dim2 (id2 int primary key references t(id), val2 text);
+create table dim3 (id3 int primary key references t(id), val3 text);
+create table dim4 (id4 int primary key references t(id), val4 text);
+create table dim5 (id5 int primary key references t(id), val5 text);
+create table dim6 (id6 int primary key references t(id), val6 text);
+create table dim7 (id7 int primary key references t(id), val7 text);
+
+vacuum analyze;
diff --git a/patch/select-1.sql b/patch/select-1.sql
new file mode 100644
index 00000000000..1535ddcdc8f
--- /dev/null
+++ b/patch/select-1.sql
@@ -0,0 +1,9 @@
+--set join_collapse_limit = 1;
+select * from t
+    join dim1 on (dim1.id = id1)
+    join dim2 on (dim2.id = id2)
+    join dim3 on (dim3.id = id3)
+    join dim4 on (dim4.id = id4)
+    join dim5 on (dim5.id = id5)
+    join dim6 on (dim6.id = id6)
+    join dim7 on (dim7.id = id7);
diff --git a/patch/select-2.sql b/patch/select-2.sql
new file mode 100644
index 00000000000..4e1d2a7b0e7
--- /dev/null
+++ b/patch/select-2.sql
@@ -0,0 +1,9 @@
+-- set join_collapse_limit = 1;
+select * from t
+    left join dim1 on (id = id1)
+    left join dim2 on (id = id2)
+    left join dim3 on (id = id3)
+    left join dim4 on (id = id4)
+    left join dim5 on (id = id5)
+    left join dim6 on (id = id6)
+    left join dim7 on (id = id7);
diff --git a/src/backend/optimizer/plan/analyzejoins.c b/src/backend/optimizer/plan/analyzejoins.c
index 4d55c2ea591..c57c3db94ef 100644
--- a/src/backend/optimizer/plan/analyzejoins.c
+++ b/src/backend/optimizer/plan/analyzejoins.c
@@ -51,6 +51,7 @@ typedef struct
 } SelfJoinCandidate;
 
 bool		enable_self_join_elimination;
+bool		enable_starjoin_join_search;
 
 /* local functions */
 static bool join_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo);
@@ -2512,3 +2513,427 @@ remove_useless_self_joins(PlannerInfo *root, List *joinlist)
 
 	return joinlist;
 }
+
+/*
+ * starjoin_match_to_foreign_key
+ *		Try to match a join to a FK constraint.
+ *
+ * For a relation to be a dimension (for the starjoin heuristics), it needs
+ * to be joined through a FK constraint. The dimension is expected to be
+ * on the PK side of the join. The relation must not have any additional
+ * join clauses, beyond those matching the foreign key.
+ *
+ * We already have a list of relevant foreign keys, and we use that info
+ * for selectivity estimation in get_foreign_key_join_selectivity(). And
+ * we're actually doing something quite similar here.
+ *
+ * XXX Do we need to worry about the join type, e.g. inner/outer joins,
+ * semi/anti? get_foreign_key_join_selectivity() does care about it, and
+ * ignores some of those cases. Maybe we should too?
+ *
+ * XXX Check there are no other join clauses, beyond those matching the
+ * foreign key. But do we already have the joininfo at this point? Some
+ * of this stuff gets build only during the join order search later.
+ * The match_foreign_keys_to_quals() probably needs to be aware of all
+ * this, so how does it do that?
+ */
+static bool
+starjoin_match_to_foreign_key(PlannerInfo *root, RelOptInfo *rel)
+{
+	ListCell   *lc;
+
+	/* Consider each FK constraint that is known to match the query */
+	foreach(lc, root->fkey_list)
+	{
+		ForeignKeyOptInfo *fkinfo = (ForeignKeyOptInfo *) lfirst(lc);
+		int			nmatches = 0;
+
+		/* rel is not the referenced table of the FK */
+		if (fkinfo->ref_relid != rel->relid)
+			continue;
+
+		/*
+		 * Do we have a match for each key of the FK?
+		 *
+		 * XXX get_foreign_key_join_selectivity checks EquivalenceClasses,
+		 * we should probably (definitely) do that here too.
+		 *
+		 * XXX We should check that all the clauses have the same relation
+		 * on the other side (for multi-column keys). And that there are
+		 * no other join clauses other than those matching the FK.
+		 *
+		 * XXX Do we need to check that the FK side of the join (i.e. the fact
+		 * table) has the columns referenced as NOT NULL? Otherwise we could
+		 * have a FK join that reduces the cardinality, which is one of
+		 * the arguments why it's fine to move the join (that it doesn't
+		 * change the cardinality). But if the join is LEFT JOIN, this
+		 * should be fine too - but do we get here with LEFT JOINs?
+		 *
+		 * XXX Do we need to check if the other side of the FK is in the
+		 * current join list? Maybe it's in some later one?
+		 */
+		for (int i = 0; i < fkinfo->nkeys; i++)
+		{
+			bool		has_matching_clause = false;
+
+			/*
+			 * Is there a clause matching this FK key?
+			 *
+			 * XXX We need to make sure it's a valid match, e.g. that the
+			 * same referencing table matches all keys in a composite FK,
+			 * and so on.
+			 *
+			 * XXX Do we need to check some relationship to other rels in
+			 * the same jointree item? E.g. the referencing table should
+			 * not be a dimensions we already removed.
+			 */
+			if ((fkinfo->rinfos[i] != NULL) || (fkinfo->eclass[i] != NULL))
+			{
+				has_matching_clause = true;
+				nmatches++;
+				continue;
+			}
+
+			/* found a FK key without a matching join clause, ignore the FK */
+			if (has_matching_clause)
+				break;
+		}
+
+		/* matched all FK keys */
+		if (nmatches == fkinfo->nkeys)
+		{
+			return true;
+		}
+	}
+
+	return false;
+}
+
+
+/*
+ * starjoin_is_dimension
+ *		Determine if a range table entry is a dimension in a starjoin.
+ *
+ * To be considered a dimension for the simplified join order search, the
+ * join must not affect the cardinality of the join. We ensure that by
+ * requiring a couple things:
+ *
+ * 1) the join clause has to match a FK (that is, the fact does have at
+ *    most one matching row in the dimension)
+ *
+ * 2) the FK side (the fact table) should be marked as NOT NULL, so that
+ *    we know there's exactly one dimension row for each fact table row
+ *
+ * 3) there must be no additional restrictions on the relation (that
+ *    might eliminate some of the rows, reducing the cardinality)
+ *
+ * XXX The Implementation is incomplete. It probably needs more thought,
+ * considering some join types would allow relaxing some of the checks
+ * (e.g. outer joins may not require checking (2) or possibly even (3),
+ * depending on where the condition is, what varnullingrels it has).
+ *
+ * XXX I wonder if we could handle (3) by ordering the dimensions by the
+ * selectivity of the restriction. There are no join clauses between the
+ * dimensions (ignoring the snowflake joins, but even there the clauses
+ * don't go between branches), so the selectivity could be treated as
+ * a measure of how much it shrinks the join result. So we could just
+ * sort the dimensions by this, starting with the lowest selectivity
+ * (close to 0.0), and ending with dimensions without restrictions (in
+ * which case the selectivity is 1.0).
+ *
+ * XXX If the join in INNER, and the fact side has NULL values in the
+ * join key, we might consider nullfrac as restriction.
+ *
+ * XXX I'm not sure how careful this needs to be about join order
+ * restrictions. Maybe it should call have_relevant_joinclause and
+ * have_join_order_restriction, to ensure the join order is OK?
+ *
+ * The optimizer/README is not very clear about this, but maybe it's
+ * a too specific question. It seems to say the relations in those
+ * lists can be joined in any order (lines 94 and 106). Maybe that's
+ * not what it means, or I'm misunderstanding it.
+ *
+ * It however seems has_join_restrictions() in join_search_one_level()
+ * forces the code to look only at "earlier" rels in the list
+ *
+ *     first_rel = foreach_current_index(r) + 1
+ *
+ * So maybe we just need to stop once we find a rel with a restriction,
+ * as determined byhas_join_restrictions()?
+ *
+ * But there's also join_is_legal() to check legality of joins, with
+ * LEFT/RIGHT joins, and IN/EXISTS clauses. See README line 188. And it
+ * also looks-up the SpecialJoinInfo for the join. So maybe we should
+ * lookup RelOptInfo for both sides of the join, and call join_is_legal
+ * on that? Might be too expensive, though. Maybe do that only when
+ * has_join_restrictions already says yes?
+ *
+ * Maybe we should use has_join_restrictions(), but in a  different way.
+ * We could still treat rels with restrictions as dimensions, and move
+ * that to the separate list (that doesn't change the join order), but
+ * stop once we hit the first non-dimension with a restriction? Because
+ * if any relation after that was a dimention, we wouldn't be able to
+ * move it to the separate list. It'd change the join order in a way
+ * that might violate the restriction. I believe that's the idea behind
+ * first_rel in join_search_one_level(), but maybe not.
+ *
+ * Perhaps have_join_order_restriction and have_relevant_joinclause are
+ * useful for this, rather than has_join_restrictions? We might look at
+ * actual pairs of relations, and/or check there's no join order
+ * restriction with respect to the relations we skipped/moved to the
+ * list of dimension?
+ *
+ * AFAICS it's just the skipping that can break the order restrictions?
+ * Adding something to the list of dimensions keeps the order (at least
+ * with respect to the rels after it).
+ */
+static bool
+starjoin_is_dimension(PlannerInfo *root, RangeTblRef *rtr)
+{
+	Index			rti = rtr->rtindex;
+	RangeTblEntry  *rte = root->simple_rte_array[rti];
+	RelOptInfo	   *rel = root->simple_rel_array[rti];
+
+	/* only plain relations can be dimensions (we need FK/PK join) */
+	if ((rte->rtekind != RTE_RELATION) ||
+		(rel->reloptkind != RELOPT_BASEREL))
+		return false;
+
+	/*
+	 * Does it have any conditions/restrictions that may affect the number
+	 * of rows matched? If yes, don't treat as dimension.
+	 *
+	 * Dimensions in a starjoin may have restrictions, but that means it'll
+	 * change cardinality of the joins (reduce it), so it may be better to
+	 * join it early. We leave it to the regular join order planning. The
+	 * expectation is that most dimensions won't have extra restrictions.
+	 *
+	 * XXX Should we check some other fields, like lateral references, and
+	 * so on? Or is that unnecessary? What about eclasses?
+	 */
+	if (rel->baserestrictinfo != NIL)
+		return false;
+
+	/*
+	 * See if the join clause matches a foreign key. It should match a
+	 * single relation on the other side, and the dimension should be on
+	 * the PK side.
+	 *
+	 * XXX loosely inspired by get_foreign_key_join_selectivity()
+	 */
+	if (!starjoin_match_to_foreign_key(root, rel))
+		return false;
+
+	/*
+	 * XXX Maybe some additional checks here ...
+	 */
+
+	return true;
+}
+
+/*
+ * starjoin_adjust_joins
+ *		Adjust the jointree for starjoins, to simplify the join order search.
+ *
+ * The join search for starjoin queries is surprisingly expensive, because
+ * there are very few join order restrictions. Consider a starjoin query
+ *
+ *  SELECT * FROM f
+ *     JOIN d1 ON (f.id1 = d1.id)
+ *     JOIN d2 ON (f.id2 = d2.id)
+ *     ...
+ *     JOIN d9 ON (f.id9 = d9.id)
+ *
+ * There are no clauses between the dimension tables (d#), which means those
+ * tables can be joined in almost arbitrary order. This means the standard
+ * join_order_search() would explore a N! possible join orders. It is not
+ * that bad in practice, because we split the problem by from_collapse_limit
+ * into a sequence of smaller problems, but even for the default limit of
+ * 8 relations it's quite expensive. This can be easily demonstrated by
+ * setting from_collapse_limit=1 for example starjoin queries.
+ *
+ * The idea here is to apply a much simpler join order search for this type
+ * of queries, without too much risk of picking a much worse plans. It is
+ * however a trade off between how expensive we allow this to be, and how
+ * good the decisions will be. This can help only starjoins with multiple
+ * dimension tables, and we don't want to harm planning of other queries,
+ * so the basic "query shape" detection needs to be very cheap. And then
+ * it needs to be cheaper than the regular join order search.
+ *
+ * If a perfect check is impossible or too expensive, it's better to end
+ * up with a cheap false negative (i.e. and not use the optimization),
+ * rather than risk regressions in other cases.
+ *
+ * The simplified join order search relies on the fact that if the joins
+ * to dimensions do not alter the cardinality of the join relation, then
+ * the relative order of those joins does not matter. All the possible
+ * orders are guaranteed to perform the same. So we can simply pick one
+ * of those orders, and "hardcode" it in the join tree later passed to the
+ * join_order_search().
+ *
+ * The query may involve joins to additional (non-dimension) tables, and
+ * those may alter cardinality. Some joins may increase it, other joins
+ * may decrease it. In principle, it'd be best to first perform all the
+ * joins that reduce join size, then join all the dimensions, and finally
+ * perform joins that may increase the join size. But this is not done
+ * now, currently we simply apply all the dimensions at the end, hoping
+ * that the earlier joins did not inflate the join too much.
+ *
+ * The definition of a dimension is a bit vague. For our definition see
+ * the comment at starjoin_is_dimension().
+ *
+ * The optimization works by manipulating the joinlist (originally built
+ * by deconstruct_jointree), which decomposed the original jointree into
+ * smaller "problems" depending on join type and join_collapse_limit. We
+ * inspect those smaller lists, and selectively split them into smaller
+ * problems to force a join order. This may effectively undo some of the
+ * merging done by deconstruct_jointree(), which tries to build problems
+ * with up to join_collapse_limit relations.
+ *
+ * For example, imagine a join problem with 8 rels - one fact table and
+ * then 7 dimensions, which we can represent a joinlist with 8 elements.
+ *
+ * (D7, D6, D5, D4, D3, D2, D1, F)
+ *
+ * Assuming all those joins meet the requirements (have a matching FK,
+ * don't affect the join cardinality, ...), then we can split this into
+ *
+ * (D7, (D6, (D5, (D4, (D3, (D2, (D1, F)))))))
+ *
+ * which is a nested joinlist, with only two elements on each level. That
+ * means there's no need for expensive join order search, there's only
+ * one way to join the relations (two, if we consider the relations may
+ * switch sides).
+ *
+ * The joinlist may already be nested, with multiple smaller subproblems.
+ * We look at each individual join problem independently, i.e. we don't
+ * try to merge problems to build join_collapse_limit problems again.
+ * This is partially to keep it cheap/simple, but also so not change
+ * behavior for cases when people use join_collapse_limit to force some
+ * particular join shape.
+ *
+ * XXX A possible improvement is to allow handling snowflake joins, i.e.
+ * recursive dimensions. That would require a somewhat more complicated
+ * processing, because a dimension would be allowed other rels, as long
+ * as those are dimensions too. And we'd need to be more careful about
+ * the order in which join them to the top of the join.
+ *
+ * XXX One possible risk is that moving the dimension joins at the very
+ * end may move that after joins that increase the cardinality. Which
+ * may cause a regression. Such joins however don't seem very common, at
+ * least in regular starjoin queries. So maybe we could simply check if
+ * there are any such joins and disable this optimization. Is there a
+ * cheap way to identify that a join increases cardinality?
+ *
+ * XXX Ideally, we'd perform the dimension joins at the place with the
+ * lowest cardinality. Imagine a joinlist
+ *
+ * (D1, D2, A, B, F)
+ *
+ * Where A increases join cardinality, while B does not (possibly even
+ * reduces it). Ideally, we'd do the join like this
+ *
+ * (A, (D2, (D1, (B, F))))
+ *
+ * so D1/D2 get joined at the point of "lowest cardinality". We probably
+ * don't want to do all this cardinality estimation work here, it'd copy
+ * what we already do in the join_order_search(). Perhaps we could invent
+ * a "join item" representing a join to all those dimensions, and pass it
+ * to join_order_search()? And let it pick the right place for it? It'd
+ * always join them in the same order, it'd not reorder them. It would
+ * still do the regular cardinality estimations etc. It would be trivial
+ * to disable the optimization if needed - don't collapse the dimensions
+ * into the new type of join item.
+ */
+List *
+starjoin_adjust_joins(PlannerInfo *root, List *joinlist)
+{
+	ListCell *lc;
+	List *newlist = NIL;
+	List *dimensions = NIL;
+
+	/*
+	 * Do nothing if starjoin optimization not enabled / not applicable.
+	 *
+	 * XXX It might seems we can skip this for lists with <= 2 items, but
+	 * that does not work, the elements may be nested list, and we need to
+	 * descend into those. So do what remove_useless_self_joins() does, and
+	 * only bail out for the simplest single-relation case (i.e. no joins).
+	 */
+	if (!enable_starjoin_join_search || joinlist == NIL ||
+		(list_length(joinlist) == 1 && !IsA(linitial(joinlist), List)))
+		return joinlist;
+
+	/*
+	 * Process the current join problem - split the elements into dimensions
+	 * and non-dimensions. If there are dimensions, add them back at the end,
+	 * as small single-rel joins.
+	 *
+	 * The list may contain various types of elements. It may contain a list,
+	 * which means it's an independent join search problem and we need to
+	 * process it recursively. Or it may be RangeTblRef, in which case we
+	 * need to check if it's a dimension. Other types of elements are just
+	 * added back to the list as-is.
+	 *
+	 * XXX I think we need to be careful to keep the order of the list (for
+	 * the non-dimension entries). The join_search_one_level() relies on
+	 * that when handling join order restrictions.
+	 *
+	 * XXX It might be better to only create a new list if needed, i.e. once
+	 * we find the first dimension. So that non-starjoin queries don't pay
+	 * for something they don't need. A mutable iterator might be a way, but
+	 * I'm not sure how expensive this really is.
+	 */
+	foreach (lc, joinlist)
+	{
+		Node *item = (Node *) lfirst(lc);
+
+		/* a separate join search problem, handle it recursively */
+		if (IsA(item, List))
+		{
+			newlist = lappend(newlist,
+							  starjoin_adjust_joins(root, (List *) item));
+			continue;
+		}
+
+		/*
+		 * Non-RangeTblRef elements can't be considered a dimension (only
+		 * baserels can have FK, etc.), so just add those to the list.
+		 */
+		if (!IsA(item, RangeTblRef))
+		{
+			newlist = lappend(newlist, item);
+			continue;
+		}
+
+		/*
+		 * An entry representing a baserel. If it's a dimension, save it in
+		 * a separate list, and we'll add it at the "top" of the join at the
+		 * end. Otherwise add it to the list just like other elements.
+		 */
+		if (starjoin_is_dimension(root, (RangeTblRef *) item))
+		{
+			dimensions = lappend(dimensions, item);
+			continue;
+		}
+
+		/* not a dimension, add it to the list directly */
+		newlist = lappend(newlist, item);
+	}
+
+	/*
+	 * If we found some dimensions, add them to the join tree one by one.
+	 * The exact order does not matter, so we add them in the order we
+	 * found them in the original list.
+	 *
+	 * We need to add them by creating smaller 2-element lists, with the
+	 * rest of the list on one side and the dimension on the other. This
+	 * is how we force the explicit join order.
+	 */
+	foreach (lc, dimensions)
+	{
+		newlist = list_make2(newlist, list_make1(lfirst(lc)));
+	}
+
+	return newlist;
+}
diff --git a/src/backend/optimizer/plan/planmain.c b/src/backend/optimizer/plan/planmain.c
index 5467e094ca7..c75a5203aae 100644
--- a/src/backend/optimizer/plan/planmain.c
+++ b/src/backend/optimizer/plan/planmain.c
@@ -282,6 +282,16 @@ query_planner(PlannerInfo *root,
 	 */
 	distribute_row_identity_vars(root);
 
+	/*
+	 * Try to simplify the join search problem for starjoin-like joins, with
+	 * joins over FK relationships. The dimensions can be joined in almost
+	 * any order, so the join search can be close to factorial complexity.
+	 * But there's not much difference between such join orders, so we just
+	 * leave the dimensions at the end of each group (as determined by the
+	 * join_collapse_limit earlier).
+	 */
+	joinlist = starjoin_adjust_joins(root, joinlist);
+
 	/*
 	 * Ready to do the primary planning.
 	 */
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index d14b1678e7f..4c7642b3102 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -1036,6 +1036,16 @@ struct config_bool ConfigureNamesBool[] =
 		true,
 		NULL, NULL, NULL
 	},
+	{
+		{"enable_starjoin_join_search", PGC_USERSET, QUERY_TUNING_METHOD,
+			gettext_noop("Enables simplified join order planning for starjoins."),
+			NULL,
+			GUC_EXPLAIN
+		},
+		&enable_starjoin_join_search,
+		false,
+		NULL, NULL, NULL
+	},
 	{
 		{"geqo", PGC_USERSET, QUERY_TUNING_GEQO,
 			gettext_noop("Enables genetic query optimization."),
diff --git a/src/include/optimizer/planmain.h b/src/include/optimizer/planmain.h
index 9d3debcab28..fee6c695d03 100644
--- a/src/include/optimizer/planmain.h
+++ b/src/include/optimizer/planmain.h
@@ -21,6 +21,7 @@
 #define DEFAULT_CURSOR_TUPLE_FRACTION 0.1
 extern PGDLLIMPORT double cursor_tuple_fraction;
 extern PGDLLIMPORT bool enable_self_join_elimination;
+extern PGDLLIMPORT bool enable_starjoin_join_search;
 
 /* query_planner callback to compute query_pathkeys */
 typedef void (*query_pathkeys_callback) (PlannerInfo *root, void *extra);
@@ -119,6 +120,7 @@ extern bool innerrel_is_unique_ext(PlannerInfo *root, Relids joinrelids,
 								   JoinType jointype, List *restrictlist,
 								   bool force_cache, List **extra_clauses);
 extern List *remove_useless_self_joins(PlannerInfo *root, List *joinlist);
+extern List *starjoin_adjust_joins(PlannerInfo *root, List *joinlist);
 
 /*
  * prototypes for plan/setrefs.c
-- 
2.50.1

Re: should we have a fast-path planning for OLTP starjoins?

Reply via email to