On Fri, May 29, 2026 at 12:11:32AM +0200, Tomas Vondra wrote: > But re-reading the old thread, this doesn't seem to be why it got stuck. > We already can identify dimensions joined on foreign keys, and that > seems like a good start. > > IIRC the thing that worried me was that just sticking the joins at the > end is pretty heavy-handed. It can easily end up making the plan worse, > if one of the other joins increases the cardinality. Would that be > common? Probably not, but it seems unnecessarily risky.
Right. > Ideally, we'd do join that reduce cardinality first (with the regular DP > join search), then join all the dimensions, and finally do all joins > that expand cardinality (again, using the regular DP). But the earlier > patches worked by adjusting the join tree in query_planner(), i.e. way > before we get to calculate join cardinalities. Yes, I remember discussing that. > It works like this: > > 1) query_planner() > > Determine if the query includes a starjoin (or multiple), and remember > the relids included in the starjoin cluster. Pick a "canonical" join > order for each starjoin cluster we found (e.g. with dimensions in relid > order). > > 2) standard_join_search()/join_search_one_level() > > When constructing the join rels (e.g. in make_join_rel or right before > it's called), check that the new rel would violate the canonical order. > If it would, refuse to create it, just like we do for various other join > restrictions. This is how you avoid the factorial explosion of plan options, right? > The new join restriction is that if the join result includes a subset of > the starjoin cluster, then it has to include the fact + prefix of the > list of dimensions (which is the canonical join order). Sorry, I got lost here. What is "prefix?" I looked at the patch and also could not understand it. > Note: It should be possible to make the restriction even more strict, if > needed (e.g. to effectively join all dimensions at once, with no other > joins in between). The patch is quite small. > Attached are a couple charts from a test with 1-15 dimensions (scripts > attached too). I was wondering how geqo affects this, so I tried with > geqo=on/off, and with join_collapse_limit=1/8/16. > > With join_collapse_limit=1 there's no difference between any of the runs > (master, patches with on/off). Here's an example of results: > > dims master(1) master sj/off sj/on master sj/off sj/on > ------------------------------------------------------------------- > 1 49485 48797 48966 49118 99% 99% 99% > 3 26886 22003 21319 24322 82% 79% 90% > 5 17759 7923 7634 15434 45% 43% 87% > 7 13110 2122 2071 11290 16% 16% 86% > 9 10390 462 445 8709 4% 4% 84% > 11 7781 87 86 6488 1% 1% 83% > 13 5948 14 14 5749 0% 0% 97% > 15 5237 1 1 4227 0% 0% 81% Impressive. -- Bruce Momjian <[email protected]> https://momjian.us EDB https://enterprisedb.com Do not let urgent matters crowd out time for investment in the future.
