Re: [SQL] Query Performance

2009-12-08 Thread Postgre Novice
Tomas: Thanks for your replay.

I got it working within 3 minutes because of hash join  by rearranging the 
query. 

explain analyze select *, subs from (select user_name,A.user_id,
dnd_window_start, dnd_window_stop from users A ,subs_feed B where
A.user_id=B.user_id and b.feed_id=1413 and f_sms='t') as foo left outer
join user_subscriptions u on (foo.user_id=u.user_id);

Hash Left Join  (cost=1472105.16..2775728.42 rows=957624 width=75)
(actual time=86322.190..192125.402 rows=1340957 loops=1)
   Hash Cond: (a.user_id = u.user_id)
   ->  Hash Join  (cost=1347862.70..2638877.52 rows=957624 width=26)
(actual time=79407.144..181327.848 rows=1340957 loops=1)
 Hash Cond: (b.user_id = a.user_id)
 ->  Bitmap Heap Scan on subs_feed b 
(cost=34518.45..1127550.37 rows=957624 width=4) (actual
time=8791.019..72889.040 rows=1340957 loops=1)
   Recheck Cond: (feed_id = 1413)
   Filter: f_sms
   ->  Bitmap Index Scan on feed_user_id 
(cost=0.00..34279.04 rows=1442930 width=0) (actual
time=8492.067..8492.067 rows=1352180 loops=1)
 Index Cond: (feed_id = 1413)
 ->  Hash  (cost=916273.00..916273.00 rows=19548100
width=26) (actual time=70605.366..70605.366 rows=19548560 loops=1)
   ->  Seq Scan on users a  (cost=0.00..916273.00
rows=19548100 width=26) (actual time=10.405..55676.884 rows=19548560
loops=1)
   ->  Hash  (cost=77415.54..77415.54 rows=3746154 width=49) (actual
time=6909.632..6909.632 rows=3758304 loops=1)
 ->  Seq Scan on user_subscriptions u  (cost=0.00..77415.54
rows=3746154 width=49) (actual time=3.624..4265.114 rows=3758304
loops=1)
 Total runtime: 192585.437 ms







From: "t...@fuzzy.cz" 
To: Postgre Novice 
Cc: pgsql-sql@postgresql.org
Sent: Mon, December 7, 2009 7:23:19 PM
Subject: Re: [SQL] Query Performance

Yes, the problem is the nested loop scan - it's scanning users 609070
times, which is awful.

Could you provide explain plan that executed fast? Was it executed with
the same parameter values or did the parameters change (maybe it's slow
for some parameters values only)?

Have you tried to rewrite the subselect to a join? I.e. something like this

select user_name,A.user_id, dnd_window_start, dnd_window_stop, B.subs as
subs, B.city_id as city_id, B.source_type as source_type from
users A left join user_subscriptions B on (A.user_id=B.user_id)
join subs_feed C ON (A.user_id = C.user_id)
where feed_id=1411 and f_sms='t'

But I guess it won't solve the issue (it seems PostgreSQL did this rewrite
on it's own).

Tomas

> Hello List,
>
> I have a query which use to run very fast now has turn into show stopper .
>
> PostgreSQL:8.2
>
> explain analyze select user_name,A.user_id, dnd_window_start,
> dnd_window_stop, B.subs as subs, B.city_id as city_id, B.source_type as
> source_type from
> users A left join user_subscriptions B on (A.user_id=B.user_id)
> where A.user_id in (select user_id from subs_feed where feed_id=1411 and
> f_sms='t')
> ;
>
>Nested Loop Left Join  (cost=986325.88..1094601.47 rows=11148
> width=55) (actual time=132635.994..1590487.280 rows=609070
> loops=1)
>->  Nested Loop  (cost=986325.88..1062280.53 rows=11148 width=26)
> (actual time=132630.057..1398299.117 rows=609070 loops=1)
>  ->  HashAggregate  (cost=986325.88..986437.36 rows=11148 width=4)
> (actual time=132591.648..133386.651 rows=609070 loops=1)
>->  Bitmap Heap Scan on subs_feed
> (cost=16316.71..985194.44 rows=452576 width=4) (actual
> time=20199.571..131566.494 rows=609070 loops=1)
>  Recheck Cond: (feed_id = 1411)
>  Filter: f_sms
>  ->  Bitmap Index Scan on feed_user_id
> (cost=0.00..16203.57 rows=681933 width=0) (actual
> time=19919.512..19919.512 rows=616900 loops=1)
>Index Cond: (feed_id = 1411)
>  ->  Index Scan using users_pkey on users a  (cost=0.00..6.79
> rows=1 width=26) (actual time=2.073..2.074 rows=1 loops=609070)
>Index Cond: (a.user_id = subs_feed.user_id)
>->  Index Scan using user_subscriptions_user_id_pk on
> user_subscriptions b  (cost=0.00..2.89 rows=1 width=33) (actual
> time=0.312..0.313 rows=1 loops=609070)
>  Index Cond: (a.user_id = b.user_id)
>  Total runtime: 1590755.918 ms
> (13 rows)
>
>
>
> This query runs almost half an hour. It is evident that nested loop is
> taking most of the time (approx 27 minutes).
>
> Any tips would be very useful.
>
> Also these table have below count:
>
> select relname,reltuples from pg_class where relname in
> ('users','user_subscriptions','subs_feed');
>   relname   |  reltuples
> +-
>  user_subscriptions |  3758304
>   users  | 1.95481e+07
>  subs_feed  | 2.96492e+07
>
>
> select n_tup_ins,n_tup_upd,n_tup_del,last_vacuum,last_analyze from
> pg_stat_user_tables where relname='user_subscriptions';
>  n_tu

[SQL] constants in 2-column foreign keys or how to design a storage for text-groups ?

2009-12-08 Thread Andreas

Hi,
my frontend has a lot of combo- and listboxes where one can chose a 
textsnippet that represents a key-number which is stored in several 
tables as foreign-key attributes.
Those textsnippets are usually semantically grouped in 2-10 strings that 
belong together somehow.


stupid example:
---
color:   red, green, blue
size: tiny, little, big, giant
structure:  hard, soft, floppy

now I'd like to build tables like
thing( color_fk foreign key to color, size_fk foreign key to size, 
structure_fk foreign key to structure, sometext, atimestamp ...)

so far no problems.

With time those little text-list-tables clutter up the database so I'm 
thinking about one big text-storage that has the groups represented by a 
number like:

snippets (snippet_id, snippet_group_nr, snippet)
(100, 1, red), (101, 1, green), (102, 1, blue), (200, 2, tiny), (201, 2, 
little), ...


Simple foreign-keys still work nicely but they cant prohibit that I 
store id-values from wrong groups.
Here color_fk would only be correct if the id is out of group 1. The 
foreign key doesnt catch it if I put a group-3-id into color_fk.


Id be cool to be able to have constants in 2-column foreign keys like
color_fk integer not null default 0
FOREIGN KEY (color_fk,  1  ) REFERENCES snippets (snippet_id, 
snippet_group_nr)


This throws an error. So this approach might be not advisable.
I could add an additional column for every foreign-key that stores 
constant group-ids then I can have 2-column-fk but this looks bloated 
since those extra columns would hold eternally the same number in every row.


How would I solve the rather common text storage issue?



--
Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql