On Sat, Aug 9, 2014 at 5:15 AM, Josh Berkus <j...@agliodbs.com> wrote:
> Folks, > > So one thing we tell users who have chronically long IN() lists is that > they should create a temporary table and join against that instead. > Other than not having the code, is there a reason why PostgreSQL > shouldn't do something like this behind the scenes, automatically? > > Hi Josh, I know that problem for many years. There are some workaround which doesn't require using the temporary tables (and I used that approach quite a lot when performance matter): Instead of using: SELECT * FROM sometable WHERE somefield IN (val1, val2, ...) AND other_filters; Query could be written as: SELECT * FROM sometable JOIN (VALUES ((val1), (val2) ...)) AS v(somefield) ON v.somefield=sometable.somefield WHERE other_filters; When there no index on somefield query plans would look like as: Original query: Filter: (somefield = ANY ('{...}'::integer[])) vs optimized query: Hash Join (cost=0.25..117.89 rows=22 width=59) (actual time=5.332..5.332 rows=0 loops=1) Hash Cond: (sometable.somefield = "*VALUES*".somefield) ... -> Hash (cost=0.12..0.12 rows=10 width=4) (actual time=0.010..0.010 rows=10 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 1kB -> Values Scan on "*VALUES*" (cost=0.00..0.12 rows=10 width=4) (actual time=0.001..0.003 rows=10 loops=1) In synthetic data I observed the following performance results (fully in-memory data with integer values): List length IN Performance JOIN VALUES Performance 10 5.39ms 5.38ms 100 9.74ms 5.49ms 1000 53.02ms 9.89ms 10000 231.10ms 13.14ms So starting from 10 elements VALUES/HASH JOIN approach is clear winner. In case of the text literals IN list performance difference even more obvious (~2 order of magnitude for 10000 list). However, if IN list used for the primary key lookup - there are no visible performance difference between these two approaches. So yes there are some space for optimization of "Filter: (somefield = ANY ('{...}'::integer[]))" via hashing. -- Maxim Boguk Senior Postgresql DBA http://www.postgresql-consulting.ru/ <http://www.postgresql-consulting.com/> Phone RU: +7 910 405 4718 Phone AU: +61 45 218 5678 LinkedIn: http://www.linkedin.com/pub/maksym-boguk/80/b99/b1b Skype: maxim.boguk Jabber: maxim.bo...@gmail.com МойКруг: http://mboguk.moikrug.ru/ "People problems are solved with people. If people cannot solve the problem, try technology. People will then wish they'd listened at the first stage."