On Sat, Aug 9, 2014 at 5:15 AM, Josh Berkus j...@agliodbs.com wrote:
Folks,
So one thing we tell users who have chronically long IN() lists is that
they should create a temporary table and join against that instead.
Other than not having the code, is there a reason why PostgreSQL
shouldn't do something like this behind the scenes, automatically?
Hi Josh,
I know that problem for many years.
There are some workaround which doesn't require using the temporary tables
(and I used that approach quite a lot when performance matter):
Instead of using:
SELECT * FROM sometable
WHERE
somefield IN (val1, val2, ...)
AND other_filters;
Query could be written as:
SELECT * FROM sometable
JOIN (VALUES ((val1), (val2) ...)) AS v(somefield) ON
v.somefield=sometable.somefield
WHERE
other_filters;
When there no index on somefield query plans would look like as:
Original query:
Filter: (somefield = ANY ('{...}'::integer[]))
vs optimized query:
Hash Join (cost=0.25..117.89 rows=22 width=59) (actual time=5.332..5.332
rows=0 loops=1)
Hash Cond: (sometable.somefield = *VALUES*.somefield)
...
- Hash (cost=0.12..0.12 rows=10 width=4) (actual time=0.010..0.010
rows=10 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 1kB
- Values Scan on *VALUES* (cost=0.00..0.12 rows=10 width=4)
(actual time=0.001..0.003 rows=10 loops=1)
In synthetic data I observed the following performance results (fully
in-memory data with integer values):
List lengthIN Performance JOIN VALUES Performance
10 5.39ms 5.38ms
100 9.74ms 5.49ms
100053.02ms 9.89ms
1 231.10ms13.14ms
So starting from 10 elements VALUES/HASH JOIN approach is clear winner.
In case of the text literals IN list performance difference even more
obvious (~2 order of magnitude for 1 list).
However, if IN list used for the primary key lookup - there are no visible
performance difference between these two approaches.
So yes there are some space for optimization of Filter: (somefield = ANY
('{...}'::integer[])) via hashing.
--
Maxim Boguk
Senior Postgresql DBA
http://www.postgresql-consulting.ru/ http://www.postgresql-consulting.com/
Phone RU: +7 910 405 4718
Phone AU: +61 45 218 5678
LinkedIn: http://www.linkedin.com/pub/maksym-boguk/80/b99/b1b
Skype: maxim.boguk
Jabber: maxim.bo...@gmail.com
МойКруг: http://mboguk.moikrug.ru/
People problems are solved with people.
If people cannot solve the problem, try technology.
People will then wish they'd listened at the first stage.