On 06/06/2016 06:34 PM, Tom Lane wrote:
... BTW, another thought occurred to me yesterday: it seems like the
existing code hasn't thought through its behavior for multiple foreign
keys very carefully.  That is, suppose we have both "A.J references B.K"
and "A.X references B.Y", as separate FKs not a single multicolumn FK.
The current code goes to some lengths to decide that one of these is
better than the other and then ignore the other.  Why?  Seems to me
that in such a case you want to behave more nearly as you would for a
multicolumn FK, that is discard all the join quals matched to either FK
in favor of a single selectivity estimate based on the number of rows in
the referenced table.  Discarding only the A.J = B.K clause and then
multiplying by the independent selectivity of A.X = B.Y is surely just as
wrong as what we've historically done for multicolumn FKs.  (Correcting
for nulls would take a bit of thought, but I wouldn't be surprised if it
ends up being the same as for the multicolumn-FK case, at least to within
the precision we can hope to get with the available stats for nulls.)

Yes, that can be improved. The plan was to improve the common case first, and then look at the more complicated cases. It might seem like a hand-waving but I'd bet 99% tables are joined on a single FK, so this seems like a reasonable approach.

When it comes to improving multiple (multi-column) foreign keys, I think it may get way more complicated that it might seem. What if the foreign keys overlap, for example? Or what if the keys go in opposite directions (cycle). And so on ...


regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to