In the script below, the presence of an IN clause forces the internal components of the UNION ALL clause to fully compute even though they are fully optimizable. = ANY doesn't have this issue, so I wonder if there is any opportunity to convert the 'slow' variant (see below) to the 'fast' variant. thank you!
merlin drop table a cascade; drop table b cascade; drop table c cascade; create table a (a_id int primary key); create table b (b_id int primary key, a_id int references a); create table c (c_id int primary key, b_id int references b); insert into a select s from generate_series(1, 50000) s; insert into b select s, (s % 50000 ) + 1 from generate_series(1, 100000) s; insert into c select s, (s % 100000 ) + 1 from generate_series(1, 1000000) s; create index on b (a_id, b_id); create index on c (b_id, c_id); analyze a; analyze b; analyze c; create temp table d (a_id int); insert into d values (99); insert into d values (999); insert into d values (9999); analyze d; create or replace view v as select * from a join b using(a_id) join c using(b_id) union all select * from a join b using(a_id) join c using(b_id); explain analyze select * from v where a_id in (select a_id from d); -- this is slow explain analyze select * from v where a_id = any(array(select a_id from d)); -- this is fast