Hi,

On 09/25/2015 03:39 AM, David Rowley wrote:
On 24 September 2015 at 23:57, Tomas Vondra
<tomas.von...@2ndquadrant.com> wrote:


    2) find_best_match_foreign_key
    ------------------------------

    I think the comment before the function needs rephrasing (seems a
    bit broken to me). I do like the approach in general, although it
    changes the semantics a bit. The original code only considered
    "fully matching" fkeys, while the new code simply takes the longest
    match.



Oops, I did not code this at all the way I had originally pictured it. I
guess the evidence of that is in the function comment, which I wrote
before coding the function.
I of course intended to only allow full matches.
A full patch which fixes this is attached. This also should fix the
clause duplication trick that I talked about.

OK, thanks. Let's leave partial FK matches for the future.
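
Just to make it clear what I mean by a partial match, a minimal
sketch where the join clause only covers a prefix of the FK columns:

  create table p (id1 int, id2 int, primary key (id1, id2));
  create table f (id1 int, id2 int,
                  foreign key (id1, id2) references p (id1, id2));

  -- partial match - only one of the two FK columns is joined
  select * from f join p on (f.id1 = p.id1);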


The comment before the function mentions it's possible to confuse
the code with duplicate quals. Can you give an example?

Something like: SELECT * FROM a LEFT JOIN b ON a.id=b.id AND
a.id=b.id AND a.id=b.id AND a.id2 = b.id2 AND a.id3 = b.id3;

Note a.id = b.id repeated 3 times.

Where a foreign key exists on (a.id) ref (b.id), and also
(a.id2, a.id3) ref (b.id2, b.id3). In this case we match 3 quals for
the FK with 1 key, and 2 quals with the FK with 2 keys. But this is
now fixed in the attached.

I used an outer join here as those quals won't be transformed into
eclass members and back to quals again. An INNER JOIN wouldn't allow
the duplicates to materialise again.

Ah, OK. Didn't think about outer joins.
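
If I understand it correctly, the INNER JOIN variant of your example

  SELECT * FROM a JOIN b ON a.id = b.id AND a.id = b.id
                        AND a.id = b.id AND a.id2 = b.id2
                        AND a.id3 = b.id3;

would come back out of the equivalence-class machinery with just a
single a.id = b.id clause, so the duplication can't happen there.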


...

I looked at this again, and I'm not all that sure it's possible to
assume that 1.0 / <tuples> is valid when there's more than one
relation at either side of the join.
My reasoning for this is that the whole basis for the patch is that
if we find a foreign key match, then we can be sure enough, as far as
row estimations go, that exactly 1 tuple will exist matching that
condition. This assumption of the 1 tuple no longer holds when 2
relations have already been joined, as this can either cause tuple
duplication, or elimination.

I don't see why that would be the case. Of course, you need to take the right <tuples>, i.e. the "target" of the foreign key (the table with the UNIQUE constraint), so that the selectivity matches the fact that exactly 1 tuple (on the PK side) matches.

Let's say we have 3 tables

A (1000 rows)
B (333 rows)
C (222 rows)

and let's assume A references the other two tables

   A (b_id) REFERENCES B(id)
   A (c_id) REFERENCES C(id)

Now, let's estimate the join

  SELECT * FROM A JOIN B ON (a.b_id = b.id)
                  JOIN C ON (a.c_id = c.id)

Which will do this

  card(join) = card(A) * card(B) * card(C) * sel(b_id=id) * sel(c_id=id)

where card(T) is the cardinality of table T, and sel(C) is the selectivity of condition C. We do know that

  card(A) = 1000
  card(B) = 333
  card(C) = 222

and we do know that the selectivity of each condition is 1/card(T), where T is the table with the UNIQUE constraint referenced by the FK. So

  sel(b_id=id) = 1/card(B) = 1/333
  sel(c_id=id) = 1/card(C) = 1/222

The fact is that these selectivities perfectly eliminate the impact of the cardinality of the referenced tables. So

  card(join) = 1000 * (333 * (1/333)) * (222 * (1/222))

so the expected cardinality is 1000.
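
For reference, a minimal script reproducing this schema (the exact
data distribution is my choice, it merely has to satisfy the two
foreign keys):

  create table b (id int primary key);
  create table c (id int primary key);
  create table a (b_id int references b (id),
                  c_id int references c (id));

  insert into b select i from generate_series(1,333) s(i);
  insert into c select i from generate_series(1,222) s(i);
  insert into a select 1 + (i % 333), 1 + (i % 222)
                  from generate_series(1,1000) s(i);

  analyze;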

Of course, this estimation effectively happens in steps, i.e. we first join 2 tables - say A and B, then (A,B) and C. So in the first step we do this:

  card(A,B) = card(A) * card(B) * sel(b_id=id) = 1000

  card((A,B),C) = card(A,B) * card(C) * sel(c_id=id) = 1000

The join order does not matter - we could easily join B,C first, and then join A.

  card(B,C) = card(B) * card(C) * sel(NULL) = 73926

(sel(NULL) being 1.0, as there's no join clause between B and C yet)

and then

  card((B,C),A) = card(B,C) * card(A) * sel(b_id=id) * sel(c_id=id)
                = 1000

Notice that in this case, the singleton table (A) actually references primary keys within the join relation - obviously the whole join relation does not have unique keys or anything, but that does not matter.

The crucial part here is that selectivity of the condition needs to use the number of tuples of the base relation, because that's the right selectivity for the join clause.
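
To spell the rule out:

  sel(FK join clause) = 1.0 / card(referenced base table)

regardless of which joinrels the clause ends up joining - e.g. in the (B,C) case above sel(b_id=id) is still 1/333, even though card(B,C) = 73926.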


It's a little hard to force the join order so that it happens this
way, but say the join order in the following example was, from left
to right:

a CROSS JOIN b JOIN c ON a.id = c.id

Of course, the planner would perform the cross join last in reality,
but it's possible to trick it into not doing so.

Let's say a foreign key exists on c (id) references a (id). If we
assume that a.id = c.id produces 1 tuple in the (a,b) rel, then
we'd be very wrong, as it's not 1, it's the number of rows in b!
Which could be millions.

I think this is the part where you're wrong. What needs to happen in the estimation is this:

  card(A,B) = card(A) * card(B)  /* cross join */

  card((A,B),C) = card(A,B) * card(C) * sel(a.id=c.id)
                = card(A,B) * card(C) * (1 / card(A))
                = card(B) * card(C)

Which is what the v2 of the patch seems to be getting right. Let me demonstrate - I'll try to replicate the example you're presenting:

  create table a as select i as a_id1, i as a_id2
                      from generate_series(1,1000) s(i);

  create table b as select i as b_id1, i as b_id2
                      from generate_series(0,332) s(i);
  alter table b add unique (b_id1, b_id2);

  create table c as select i/3 as c_id1, i/3 as c_id2
                      from generate_series(0,998) s(i);

  alter table c add foreign key (c_id1, c_id2)
                   references b (b_id1, b_id2);

  analyze;

Let's force the join order:

  set join_collapse_limit = 1;

and run a bunch of queries joining the tables in different orders. I'll only present the EXPLAIN output here, but the estimates are perfectly accurate:


test=# explain select * from (a cross join b)
                        join c on (b_id1 = c_id1 AND b_id2 = c_id2);

                                  QUERY PLAN
-----------------------------------------------------------------------------------------------
 Merge Join  (cost=64.91..5981.72 rows=999000 width=24)
   Merge Cond: ((b.b_id1 = c.c_id1) AND (b.b_id2 = c.c_id2))
   ->  Nested Loop  (cost=0.15..4199.45 rows=333000 width=16)
         ->  Index Only Scan using b_b_id1_b_id2_key on b
                                   (cost=0.15..19.45 rows=333 width=8)
         ->  Materialize  (cost=0.00..20.00 rows=1000 width=8)
               ->  Seq Scan on a  (cost=0.00..15.00 rows=1000 width=8)
   ->  Sort  (cost=64.76..67.26 rows=999 width=8)
         Sort Key: c.c_id1, c.c_id2
         ->  Seq Scan on c  (cost=0.00..14.99 rows=999 width=8)
(9 rows)

test=# explain select * from (a cross join c)
                        join b on (b_id1 = c_id1 and b_id2 = c_id2);

                              QUERY PLAN
----------------------------------------------------------------------
 Hash Join  (cost=10.32..20052.81 rows=999000 width=24)
   Hash Cond: ((c.c_id1 = b.b_id1) AND (c.c_id2 = b.b_id2))
   ->  Nested Loop  (cost=0.00..12519.99 rows=999000 width=16)
         ->  Seq Scan on a  (cost=0.00..15.00 rows=1000 width=8)
         ->  Materialize  (cost=0.00..19.98 rows=999 width=8)
               ->  Seq Scan on c  (cost=0.00..14.99 rows=999 width=8)
   ->  Hash  (cost=5.33..5.33 rows=333 width=8)
         ->  Seq Scan on b  (cost=0.00..5.33 rows=333 width=8)
(8 rows)


test=# explain select * from (b join c on (b_id1=c_id1 and
                                           b_id2=c_id2)) cross join a;

                                QUERY PLAN
---------------------------------------------------------------------------
 Nested Loop  (cost=10.32..12537.84 rows=999000 width=24)
   ->  Seq Scan on a  (cost=0.00..15.00 rows=1000 width=8)
   ->  Materialize  (cost=10.32..37.83 rows=999 width=16)
         ->  Hash Join  (cost=10.32..32.84 rows=999 width=16)
               Hash Cond: ((c.c_id1 = b.b_id1) AND (c.c_id2 = b.b_id2))
               ->  Seq Scan on c  (cost=0.00..14.99 rows=999 width=8)
               ->  Hash  (cost=5.33..5.33 rows=333 width=8)
                     ->  Seq Scan on b  (cost=0.00..5.33 rows=333 width=8)
(8 rows)

I've attempted to create a test case to demo this against your v2
patch, and it does fall for it - only it happens to actually give
better estimates than without the patch, though I'm sure there must
be some case to be found where it does not.

create table a (id1 int, id2 int, c_id1 int not null, c_id2 int not
null, primary key(id1, id2));
create table b (id int primary key, a_id1 int not null, a_id2 int not null);
create table c (id1 int, id2 int);

alter table b add constraint b_a_id1_a_id2_fkey foreign key (a_id1,
a_id2) references a;

insert into a select x,x,x,x from generate_series(1,100) s(x);
insert into b select id1,id2,id1 from a;
insert into c select c_id1,c_id2 from a, generate_series(1,100);

analyze a;
analyze b;
analyze c;

explain analyze select * from a inner join b on a.id1=b.a_id1 and a.id2
= b.a_id2 join c on c.id1%2 = a.c_id1%2 and c.id2%2 = a.c_id2%2;

No, I don't think it 'falls for it'. Table C is not connected to the other two tables by a foreign key, and even if it was, the modulo expressions would make it impossible to detect the join as a FK join. Which makes the example rather irrelevant for this patch, I think.

If you try to fix those issues, you should get basically the example I've presented.
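
For example something like this (a sketch on top of your script - the
unique constraint is needed so the foreign key can be created at all):

  alter table a add unique (c_id1, c_id2);
  alter table c add foreign key (id1, id2)
                   references a (c_id1, c_id2);
  analyze c;

  explain analyze select * from a
     join b on a.id1 = b.a_id1 and a.id2 = b.a_id2
     join c on c.id1 = a.c_id1 and c.id2 = a.c_id2;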

It's possible to confuse the patch - e.g. by creating foreign keys in both directions (see the sketch below), or a 3-way join with each table referencing the other two. But it should still give better estimates than the current estimation that does not consider foreign keys at all.
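
For instance (a made-up sketch of the two-way case):

  create table t1 (a int primary key, b int);
  create table t2 (b int primary key, a int);

  alter table t1 add foreign key (b) references t2 (b);
  alter table t2 add foreign key (a) references t1 (a);

  -- both join clauses are backed by a FK, in opposite directions
  select * from t1 join t2 on (t1.a = t2.a and t1.b = t2.b);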

I'm not saying that my version of the patch does any better... I've
actually not tested it, but I think we could only rely on the FK for
this if we happened to back that up with something like UNIQUE join
detection. My unique join patch aims to add some of the
infrastructure which might be required to make that possible.

I'm not sure where you see the problem in the current patch, so I can't really say whether this is good or bad.

Perhaps the patch cannot work well without this: since we are
trying to fix a row underestimation, we might just succeed in
having the planner switch the join order to some order that is not
backed up by foreign keys, because the row estimates look more
favourable that way. Although perhaps that could be fixed by changing
the 1.0 / <tuples> to <something else> / <tuples>

I don't think the join order should matter (and I tried to demonstrate that with the example above).

It might matter if we hit the issue with equivalence classes, because then the planner comes up with new join clauses that may not be backed by foreign keys. But even in that case we should not do worse than now.

regards

--
Tomas Vondra                  www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

