Hello,

I have a little 2-table JOIN, GROUP BY, ORDER BY query that does a sequential 
scan on a multi-million row table.  I _thought_ I had all the appropriate 
indices, but apparently I do not.  I was wondering if anyone can spot a way I 
can speed up this query.
The query currently takes... *gulp*: 381119.201 ms :(
  
There are only 2 tables in the game: user_url and user_url_tag.  The latter has 
FKs pointing to the former.  The sequential scan happens on the latter - 
user_url_tag:

EXPLAIN ANALYZE select DISTINCT userurltag0_.tag as x0_0_, COUNT(*) as x1_0_ 
from user_url_tag userurltag0_, user_url userurl1_ WHERE (((userurl1_.user_id=1 
)) AND ((userurltag0_.user_url_id=userurl1_.id ))) GROUP BY userurltag0_.tag 
ORDER BY  count(*) DESC;  
  
                                                                                
      QUERY PLAN   
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  
 Unique  (cost=140972.22..140996.28 rows=3207 width=10) (actual 
time=381082.868..381110.094 rows=2546 loops=1)  
   ->  Sort  (cost=140972.22..140980.24 rows=3207 width=10) (actual 
time=381082.858..381091.733 rows=2546 loops=1)  
         Sort Key: count(*), userurltag0_.tag  
         ->  HashAggregate  (cost=140777.45..140785.46 rows=3207 width=10) 
(actual time=381032.844..381064.068 rows=2546 loops=1)  
               ->  Hash Join  (cost=2797.65..140758.50 rows=3790 width=10) 
(actual time=248.530..380635.132 rows=8544 loops=1)  
                     Hash Cond: ("outer".user_url_id = "inner".id)  
                     ->  Seq Scan on user_url_tag userurltag0_  
(cost=0.00..106650.30 rows=6254530 width=14) (actual time=0.017..212256.630 
rows=6259553 loops=1)  
                     ->  Hash  (cost=2795.24..2795.24 rows=962 width=4) (actual 
time=199.840..199.840 rows=0 loops=1)  
                           ->  Index Scan using ix_user_url_user_id_url_id on 
user_url userurl1_  (cost=0.00..2795.24 rows=962 width=4) (actual 
time=0.048..193.707 rows=1666 loops=1)  
                                 Index Cond: (user_id = 1)  
 Total runtime: 381119.201 ms  
(11 rows)  
  
  
  This is what the two tables look like (extra colums removed):

                                    Table "public.user_url_tag"  
   Column    |         Type          |                          Modifiers  
-------------+-----------------------+--------------------------------------------------------------
  
 id          | integer               | not null default 
nextval('public.user_url_tag_id_seq'::text)  
 user_url_id | integer               |  
 tag         | character varying(64) |  
Indexes:  
    "pk_user_url_tag_id" PRIMARY KEY, btree (id)  
    "ix_user_url_tag_tag" btree (tag)  
    "ix_user_url_tag_user_url_id" btree (user_url_id)  
Foreign-key constraints:  
    "fk_user_url_tag_user_url_id" FOREIGN KEY (user_url_id) REFERENCES 
user_url(id)  
  
                                          Table "public.user_url"  
      Column      |            Type             |                        
Modifiers  
------------------+-----------------------------+----------------------------------------------------------
  
 id               | integer                     | not null default 
nextval('public.user_url_id_seq'::text)  
 user_id          | integer                     |  
 url_id           | integer                     |  
Indexes:  
    "pk_user_url_id" PRIMARY KEY, btree (id)  
    "ix_user_url_url_id_user_id" UNIQUE, btree (url_id, user_id)  
    "ix_user_url_user_id_url_id" UNIQUE, btree (user_id, url_id)  
  
  
  Does anyone see a way to speed up this s-l-o-w query?
I cache DB results, but I'd love to get rid of that sequential scan.

Thanks,
Otis




---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

               http://www.postgresql.org/docs/faq

Reply via email to