Aha! set enable_hashjoin=off did the trick.

The PG version is: 8.0.3

NB: I removed that redundant "DISTINCT" after the SELECT.

EXPLAIN ANALYZE select userurltag0_.tag as x0_0_, COUNT(*) as x1_0_
  from user_url_tag userurltag0_, user_url userurl1_
  where (((userurl1_.user_id=1 ))AND((userurltag0_.user_url_id=userurl1_.id )))
  group by userurltag0_.tag
  order by count(*)DESC;

                                   QUERY PLAN
---------------------------------------------------------------------------------
 Sort  (cost=155766.79..155774.81 rows=3207 width=10) (actual time=2387.756..2396.578 rows=2546 loops=1)
   Sort Key: count(*)
   ->  HashAggregate  (cost=155572.02..155580.03 rows=3207 width=10) (actual time=2365.643..2376.626 rows=2546 loops=1)
         ->  Nested Loop  (cost=0.00..155552.68 rows=3867 width=10) (actual time=0.135..2222.028 rows=8544 loops=1)
               ->  Index Scan using ix_user_url_user_id_url_id on user_url userurl1_  (cost=0.00..2798.12 rows=963 width=4) (actual time=0.067..9.744 rows=1666 loops=1)
                     Index Cond: (user_id = 1)
               ->  Index Scan using ix_user_url_tag_user_url_id on user_url_tag userurltag0_  (cost=0.00..157.34 rows=103 width=14) (actual time=1.223..1.281 rows=5 loops=1666)
                     Index Cond: (userurltag0_.user_url_id = "outer".id)
 Total runtime: 2405.691 ms
(9 rows)

Are you still interested in the output for the other join types (the "second-choice join type" you mentioned)? If you are, please tell me which join types those are; this is a bit beyond me. :(

Is there a way to force PG to use the index automatically? This query is executed from something called Hibernate, and I'm not sure whether that will let me set enable_hashjoin=off through its API...

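One possible workaround, sketched here under the assumption that Hibernate can be made to run plain SQL statements on the same connection and inside the same transaction as the query (not verified in this thread): SET LOCAL limits the setting to the current transaction, so it does not affect other queries on the connection. This only sidesteps the planner's choice; it does not make PG pick the index by itself.

```sql
-- Sketch only: scope the planner setting to a single transaction.
BEGIN;

-- SET LOCAL takes effect only inside a transaction block and is
-- reset automatically at COMMIT or ROLLBACK.
SET LOCAL enable_hashjoin = off;

-- The query from this message, unchanged.
select userurltag0_.tag as x0_0_, COUNT(*) as x1_0_
  from user_url_tag userurltag0_, user_url userurl1_
  where (((userurl1_.user_id=1 ))AND((userurltag0_.user_url_id=userurl1_.id )))
  group by userurltag0_.tag
  order by count(*)DESC;

COMMIT;
```

If issuing extra statements through Hibernate turns out not to be possible, the same setting could instead be applied session-wide with a plain SET right after connecting, at the cost of affecting every query on that connection.
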
Thanks,
Otis

----- Original Message ----
From: Tom Lane <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Cc: pgsql-sql@postgresql.org
Sent: Wednesday, May 10, 2006 8:27:01 PM
Subject: Re: [SQL] Help with a seq scan on multi-million row table

<[EMAIL PROTECTED]> writes:
>    ->  Hash Join  (cost=2797.65..140758.50 rows=3790 width=10) (actual time=248.530..380635.132 rows=8544 loops=1)
>          Hash Cond: ("outer".user_url_id = "inner".id)
>          ->  Seq Scan on user_url_tag userurltag0_  (cost=0.00..106650.30 rows=6254530 width=14) (actual time=0.017..212256.630 rows=6259553 loops=1)
>          ->  Hash  (cost=2795.24..2795.24 rows=962 width=4) (actual time=199.840..199.840 rows=0 loops=1)
>                ->  Index Scan using ix_user_url_user_id_url_id on user_url userurl1_  (cost=0.00..2795.24 rows=962 width=4) (actual time=0.048..193.707 rows=1666 loops=1)
>                      Index Cond: (user_id = 1)

Hm, I'm not sure why it's choosing that join plan.  A nestloop indexscan
wouldn't be terribly cheap, but just counting on my fingers it seems
like it ought to come in at less than 100000 cost units.  What do you
get if you set enable_hashjoin off?  (Then try disabling its
second-choice join type too --- I'm interested to see EXPLAIN ANALYZE
output for all three join types.)

What PG version is this exactly?

            regards, tom lane