Re: [PERFORM] two table join with order by on both tables attributes

2014-08-08 Thread Marti Raudsepp
On Fri, Aug 8, 2014 at 4:05 AM, Evgeniy Shishkin itparan...@gmail.com wrote:
 select * from users join  notifications on users.id=notifications.user_id 
 ORDER BY users.priority desc ,notifications.priority desc limit 10;

 In my understanding, i need to have two indexes
 on users(priority desc, id)
 and notifications(user_id, priority desc)

 And actually with this kind of query we really want the most wanted 
 notifications, by the user.
 So we really can rewrite to order by users.priority desc, id asc, 
 notifications.priority desc according to business logic.

You can rewrite it with LATERAL to trick the planner into sorting each
user's notifications separately. This should give you the nestloop
plan you expect:

SELECT *
FROM users,
LATERAL (
  SELECT * FROM notifications WHERE notifications.user_id=users.id
  ORDER BY notifications.priority DESC
) AS notifications
ORDER BY users.priority DESC, users.id

It would be great if Postgres could do this transformation automatically.

There's a partial sort patch in the current CommitFest, which would
solve the problem partially (it could use the index on users, but the
notifications sort would have to be done in memory still).
https://commitfest.postgresql.org/action/patch_view?id=1368

Regards,
Marti


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] two table join with order by on both tables attributes

2014-08-08 Thread Evgeniy Shishkin

 On 08 Aug 2014, at 16:29, Marti Raudsepp ma...@juffo.org wrote:
 
 On Fri, Aug 8, 2014 at 4:05 AM, Evgeniy Shishkin itparan...@gmail.com wrote:
 select * from users join  notifications on 
 users.id=notifications.user_id ORDER BY users.priority desc 
 ,notifications.priority desc limit 10;
 
 In my understanding, i need to have two indexes
 on users(priority desc, id)
 and notifications(user_id, priority desc)
 
 And actually with this kind of query we really want the most wanted 
 notifications, by the user.
 So we really can rewrite to order by users.priority desc, id asc, 
 notifications.priority desc according to business logic.
 
 You can rewrite it with LATERAL to trick the planner into sorting each
 user's notifications separately. This should give you the nestloop
 plan you expect:
 
 SELECT *
 FROM users,
 LATERAL (
  SELECT * FROM notifications WHERE notifications.user_id=users.id
  ORDER BY notifications.priority DESC
 ) AS notifications
 ORDER BY users.priority DESC, users.id
 

Thank you very much.


 It would be great if Postgres could do this transformation automatically.
 
 There's a partial sort patch in the current CommitFest, which would
 solve the problem partially (it could use the index on users, but the
 notifications sort would have to be done in memory still).
 https://commitfest.postgresql.org/action/patch_view?id=1368
 
 Regards,
 Marti



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] Optimization idea for long IN() lists

2014-08-08 Thread Josh Berkus
Folks,

So one thing we tell users who have chronically long IN() lists is that
they should create a temporary table and join against that instead.
Other than not having the code, is there a reason why PostgreSQL
shouldn't do something like this behind the scenes, automatically?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Optimization idea for long IN() lists

2014-08-08 Thread Maxim Boguk
On Sat, Aug 9, 2014 at 5:15 AM, Josh Berkus j...@agliodbs.com wrote:

 Folks,

 So one thing we tell users who have chronically long IN() lists is that
 they should create a temporary table and join against that instead.
 Other than not having the code, is there a reason why PostgreSQL
 shouldn't do something like this behind the scenes, automatically?


Hi Josh,

I know that problem for many years.
There are some workaround which doesn't require using the temporary tables
(and I used that approach quite a lot when performance matter):

Instead of using:
SELECT * FROM sometable
WHERE
somefield IN (val1, val2, ...)
AND other_filters;

Query could be written as:
SELECT * FROM sometable
JOIN (VALUES ((val1), (val2) ...)) AS v(somefield) ON
v.somefield=sometable.somefield
WHERE
other_filters;

When there no index on somefield query plans would look like as:

Original query:

   Filter: (somefield = ANY ('{...}'::integer[]))

vs optimized query:

 Hash Join  (cost=0.25..117.89 rows=22 width=59) (actual time=5.332..5.332
rows=0 loops=1)
   Hash Cond: (sometable.somefield = *VALUES*.somefield)
...
   -  Hash  (cost=0.12..0.12 rows=10 width=4) (actual time=0.010..0.010
rows=10 loops=1)
 Buckets: 1024  Batches: 1  Memory Usage: 1kB
 -  Values Scan on *VALUES*  (cost=0.00..0.12 rows=10 width=4)
(actual time=0.001..0.003 rows=10 loops=1)


In synthetic data I observed the following performance results (fully
in-memory data with integer values):

List lengthIN  Performance   JOIN VALUES Performance
   10 5.39ms 5.38ms
  100 9.74ms 5.49ms
 100053.02ms 9.89ms
1   231.10ms13.14ms

So starting from 10 elements VALUES/HASH JOIN approach is clear winner.
In case of the text literals IN list performance difference even more
obvious (~2 order of magnitude for 1 list).

However, if IN list used for the primary key lookup - there are no visible
performance difference between these two approaches.

So yes there are some space for optimization of Filter: (somefield = ANY
('{...}'::integer[])) via hashing.

-- 
Maxim Boguk
Senior Postgresql DBA
http://www.postgresql-consulting.ru/ http://www.postgresql-consulting.com/

Phone RU: +7 910 405 4718
Phone AU: +61 45 218 5678

LinkedIn: http://www.linkedin.com/pub/maksym-boguk/80/b99/b1b
Skype: maxim.boguk
Jabber: maxim.bo...@gmail.com
МойКруг: http://mboguk.moikrug.ru/

People problems are solved with people.
If people cannot solve the problem, try technology.
People will then wish they'd listened at the first stage.