Re: [SQL] Seq scan on join, not on subselect? analyze this

Helio Campos Mello de Andrade Mon, 10 Nov 2008 03:29:09 -0800

Bryce,

 - I think that the difference between the two queries has to do with the
way postgresql execute them.


   In the first the SGDB does:
         1º Creates a temporary table with "m" X "n" rows where the "m" and
"n" are the number of the rows in the tables been joined.
         2º Take only the rows that has the same "work_key"
         3º It restricts using the where clause.

        OBS: Maybe It use the where clause first on the tables just to
minimize the "m" and "n". I not sure about that. Still it creates and "m" X
"n" temporary table with lots of bad rows.

   In the second query the SGDB:
      1º Select in "article_words" only the rows that correspond with the
restriction to that "context_key". It results in a much smaller number of
rows. "k" <<< "n".
      2º It uses "k-results" and look for the for the rows where "word_key"
is in the group created by the INNER Query.

That's why you have the difference between the query's "Total runtime".

Regards

--
Helio Campos Mello de Andrade



On Sun, Nov 2, 2008 at 3:51 PM, Bryce Nesbitt <[EMAIL PROTECTED]> wrote:

> I'm a bit confused why the query planner is not restricting my join, and
> not using the index.  Two explain analyze statements follow.
> Why is the second so much better?
>
> lyell5=> select version();
> PostgreSQL 8.3.4 on x86_64-pc-linux-gnu, compiled by GCC cc (GCC) 4.1.2
> 20061115 (prerelease) (Debian 4.1.1-21)
>
> lyell5=> explain analyze select * from article_words join words using
> (word_key) where context_key=535462;
>
> +------------------------------------------------------------------------------------------------------------------------------------------------+
> |
> QUERY
> PLAN                                                                   |
>
> +------------------------------------------------------------------------------------------------------------------------------------------------+
> | Hash Join  (cost=192092.90..276920.93 rows=45327 width=17) (actual
> time=6020.932..60084.817 rows=777
> loops=1)                                  |
> |   Hash Cond: (article_words.word_key =
> words.word_key)
> |
> |   ->  Index Scan using article_word_idx on article_words
> (cost=0.00..55960.50 rows=45327 width=8) (actual time=0.031..0.547
> rows=777 loops=1) |
> |         Index Cond: (context_key =
> 535462)
> |
> |   ->  Hash  (cost=93819.62..93819.62 rows=5653462 width=13) (actual
> time=6020.605..6020.605 rows=5651551 loops=1)
>  |
> |         ->  Seq Scan on words  (cost=0.00..93819.62 rows=5653462
> width=13) (actual time=0.006..2010.962 rows=5651551
> loops=1)                  |
> | Total runtime: 60085.616
> ms
> |
>
> +------------------------------------------------------------------------------------------------------------------------------------------------+
>
>
> lyell5=> explain analyze select * from words where word_key in (select
> word_key from article_words where context_key=535462);
>
> +------------------------------------------------------------------------------------------------------------------------------------------------------+
> |
> QUERY
> PLAN                                                                      |
>
> +------------------------------------------------------------------------------------------------------------------------------------------------------+
> | Nested Loop  (cost=56073.81..56091.41 rows=2 width=13) (actual
> time=0.808..4.723 rows=777
> loops=1)                                                   |
> |   ->  HashAggregate  (cost=56073.81..56073.83 rows=2 width=4) (actual
> time=0.795..1.072 rows=777
> loops=1)                                            |
> |         ->  Index Scan using article_word_idx on article_words
> (cost=0.00..55960.50 rows=45327 width=4) (actual time=0.030..0.344
> rows=777 loops=1) |
> |               Index Cond: (context_key =
> 535462)
> |
> |   ->  Index Scan using words_pkey on words  (cost=0.00..8.78 rows=1
> width=13) (actual time=0.003..0.004 rows=1
> loops=777)                            |
> |         Index Cond: (words.word_key =
> article_words.word_key)
> |
> | Total runtime: 4.936
> ms
> |
>
> +------------------------------------------------------------------------------------------------------------------------------------------------------+
>
>
> --
> Sent via pgsql-sql mailing list ([email protected])
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-sql
>

Re: [SQL] Seq scan on join, not on subselect? analyze this

Reply via email to