Bryce, - I think that the difference between the two queries has to do with the way postgresql execute them.
In the first the SGDB does: 1º Creates a temporary table with "m" X "n" rows where the "m" and "n" are the number of the rows in the tables been joined. 2º Take only the rows that has the same "work_key" 3º It restricts using the where clause. OBS: Maybe It use the where clause first on the tables just to minimize the "m" and "n". I not sure about that. Still it creates and "m" X "n" temporary table with lots of bad rows. In the second query the SGDB: 1º Select in "article_words" only the rows that correspond with the restriction to that "context_key". It results in a much smaller number of rows. "k" <<< "n". 2º It uses "k-results" and look for the for the rows where "word_key" is in the group created by the INNER Query. That's why you have the difference between the query's "Total runtime". Regards -- Helio Campos Mello de Andrade On Sun, Nov 2, 2008 at 3:51 PM, Bryce Nesbitt <[EMAIL PROTECTED]> wrote: > I'm a bit confused why the query planner is not restricting my join, and > not using the index. Two explain analyze statements follow. > Why is the second so much better? > > lyell5=> select version(); > PostgreSQL 8.3.4 on x86_64-pc-linux-gnu, compiled by GCC cc (GCC) 4.1.2 > 20061115 (prerelease) (Debian 4.1.1-21) > > lyell5=> explain analyze select * from article_words join words using > (word_key) where context_key=535462; > > +------------------------------------------------------------------------------------------------------------------------------------------------+ > | > QUERY > PLAN | > > +------------------------------------------------------------------------------------------------------------------------------------------------+ > | Hash Join (cost=192092.90..276920.93 rows=45327 width=17) (actual > time=6020.932..60084.817 rows=777 > loops=1) | > | Hash Cond: (article_words.word_key = > words.word_key) > | > | -> Index Scan using article_word_idx on article_words > (cost=0.00..55960.50 rows=45327 width=8) (actual time=0.031..0.547 > rows=777 loops=1) | > | Index Cond: (context_key = > 535462) > | > | -> Hash (cost=93819.62..93819.62 rows=5653462 width=13) (actual > time=6020.605..6020.605 rows=5651551 loops=1) > | > | -> Seq Scan on words (cost=0.00..93819.62 rows=5653462 > width=13) (actual time=0.006..2010.962 rows=5651551 > loops=1) | > | Total runtime: 60085.616 > ms > | > > +------------------------------------------------------------------------------------------------------------------------------------------------+ > > > lyell5=> explain analyze select * from words where word_key in (select > word_key from article_words where context_key=535462); > > +------------------------------------------------------------------------------------------------------------------------------------------------------+ > | > QUERY > PLAN | > > +------------------------------------------------------------------------------------------------------------------------------------------------------+ > | Nested Loop (cost=56073.81..56091.41 rows=2 width=13) (actual > time=0.808..4.723 rows=777 > loops=1) | > | -> HashAggregate (cost=56073.81..56073.83 rows=2 width=4) (actual > time=0.795..1.072 rows=777 > loops=1) | > | -> Index Scan using article_word_idx on article_words > (cost=0.00..55960.50 rows=45327 width=4) (actual time=0.030..0.344 > rows=777 loops=1) | > | Index Cond: (context_key = > 535462) > | > | -> Index Scan using words_pkey on words (cost=0.00..8.78 rows=1 > width=13) (actual time=0.003..0.004 rows=1 > loops=777) | > | Index Cond: (words.word_key = > article_words.word_key) > | > | Total runtime: 4.936 > ms > | > > +------------------------------------------------------------------------------------------------------------------------------------------------------+ > > > -- > Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-sql >