No, I don't want to use LEFT JOINS.
I want to use regular joins.

But (as mentioned in my other follow-up), all the fields are not null (was not 
in the original email, sorry), and are foreign keys, so it is guaranteed to 
always match.

The key part (in my mind) is that I am not filtering on any of the useless 
tables, and I am not returning any columns from those tables either.
Both is known at planning time.

Or is my logic still broken?

Thanks,
 Igor

-----Original Message-----
From: Kevin Grittner [mailto:kgri...@gmail.com] 
Sent: Friday, July 01, 2016 1:29 PM
To: Sfiligoi, Igor <igor.sfili...@ga.com>
Cc: pgsql-general@postgresql.org
Subject: --EXTERNAL--Re: [GENERAL] PSQL does not remove obvious useless joins

On Fri, Jul 1, 2016 at 12:17 PM, Sfiligoi, Igor <igor.sfili...@ga.com> wrote:

> We have a view that is very generic, and we noticed that PostgreSQL is 
> not very good at removing useless joins, which makes our queries very slow.
>
> We could change our code to avoid the view and write ad-hoc queries to 
> the underlying tables, but would prefer not to, if there is a way around it.

If it did not do the joins it could not provide the information you are 
actually asking to see.  Of course, there is a very good chance that what you 
are asking to see is not what you *want* to see.

test=# create table a (id int primary key, name varchar(128)); CREATE TABLE 
test=# create table b (id int primary key, name varchar(128)); CREATE TABLE 
test=# create table c (id int primary key,
test(#                 a_id int references a(id),
test(#                 b1_id int references b(id),
test(#                 b2_id int references b(id),
test(#                 b3_id int references b(id));
CREATE TABLE
test=#
test=# create view v_broken as
test-#   select
test-#        c.id, c.a_id, c.b1_id, c.b2_id, c.b3_id, a.name a_name,
test-#        b1.name b1_name, b2.name b2_name, b3.name b3_name
test-#     from c, a, b b1, b b2, b b3
test-#     where c.a_id=a.id and c.b1_id=b1.id and c.b2_id=b2.id and
c.b3_id=b3.id;
CREATE VIEW
test=#
test=# create view v as
test-#   select
test-#       c.id, c.a_id, c.b1_id, c.b2_id, c.b3_id, a.name a_name,
test-#       b1.name b1_name, b2.name b2_name, b3.name b3_name
test-#   from c
test-#   left join a    on a.id = c.a_id
test-#   left join b b1 on b1.id = c.b1_id
test-#   left join b b2 on b2.id = c.b2_id
test-#   left join b b3 on b3.id = c.b3_id;
CREATE VIEW
test=#
test=# insert into a values (1, 'a1');
INSERT 0 1
test=# insert into b values (1, 'b1'), (2, 'b2'), (3, 'b3'); INSERT 0 3 test=# 
insert into c values (1, 1, 1, 2, 3), (2, 1, 1, 2, null); INSERT 0 2 test=# 
test=# select id, b1_name from v_broken;  id | b1_name
----+---------
  1 | b1
(1 row)

test=# explain analyze select id, b1_name from v_broken;
                                                         QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=63.40..183.90 rows=1700 width=278) (actual
time=0.049..0.052 rows=1 loops=1)
   Hash Cond: (c.b3_id = b3.id)
   ->  Hash Join  (cost=47.55..144.68 rows=1700 width=282) (actual
time=0.030..0.033 rows=2 loops=1)
         Hash Cond: (c.b2_id = b2.id)
         ->  Hash Join  (cost=31.70..105.45 rows=1700 width=286) (actual 
time=0.018..0.020 rows=2 loops=1)
               Hash Cond: (c.b1_id = b1.id)
               ->  Hash Join  (cost=15.85..66.22 rows=1700 width=16) (actual 
time=0.010..0.012 rows=2 loops=1)
                     Hash Cond: (c.a_id = a.id)
                     ->  Seq Scan on c  (cost=0.00..27.00 rows=1700
width=20) (actual time=0.001..0.001 rows=2 loops=1)
                     ->  Hash  (cost=12.60..12.60 rows=260 width=4) (actual 
time=0.003..0.003 rows=1 loops=1)
                           Buckets: 1024  Batches: 1  Memory Usage: 9kB
                           ->  Seq Scan on a  (cost=0.00..12.60
rows=260 width=4) (actual time=0.002..0.003 rows=1 loops=1)
               ->  Hash  (cost=12.60..12.60 rows=260 width=278) (actual 
time=0.005..0.005 rows=3 loops=1)
                     Buckets: 1024  Batches: 1  Memory Usage: 9kB
                     ->  Seq Scan on b b1  (cost=0.00..12.60 rows=260
width=278) (actual time=0.002..0.003 rows=3 loops=1)
         ->  Hash  (cost=12.60..12.60 rows=260 width=4) (actual
time=0.006..0.006 rows=3 loops=1)
               Buckets: 1024  Batches: 1  Memory Usage: 9kB
               ->  Seq Scan on b b2  (cost=0.00..12.60 rows=260
width=4) (actual time=0.004..0.004 rows=3 loops=1)
   ->  Hash  (cost=12.60..12.60 rows=260 width=4) (actual
time=0.013..0.013 rows=3 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 9kB
         ->  Seq Scan on b b3  (cost=0.00..12.60 rows=260 width=4) (actual 
time=0.008..0.009 rows=3 loops=1)  Planning time: 0.729 ms  Execution time: 
0.153 ms
(23 rows)

test=# select id, b1_name from v;
 id | b1_name
----+---------
  1 | b1
  2 | b1
(2 rows)

test=# explain analyze select id, b1_name from v;
                                                  QUERY PLAN
---------------------------------------------------------------------------------------------------------------
 Hash Left Join  (cost=15.85..66.22 rows=1700 width=278) (actual
time=0.017..0.018 rows=2 loops=1)
   Hash Cond: (c.b1_id = b1.id)
   ->  Seq Scan on c  (cost=0.00..27.00 rows=1700 width=20) (actual
time=0.005..0.005 rows=2 loops=1)
   ->  Hash  (cost=12.60..12.60 rows=260 width=278) (actual
time=0.006..0.006 rows=3 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 9kB
         ->  Seq Scan on b b1  (cost=0.00..12.60 rows=260 width=278) (actual 
time=0.002..0.003 rows=3 loops=1)  Planning time: 0.177 ms  Execution time: 
0.044 ms
(8 rows)

Note the difference in results using inner joins versus left outer joins.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Reply via email to