Re: [PERFORM] bizarre query performance question

Richard Huxton Thu, 02 Oct 2008 00:27:57 -0700

H. William Connors II wrote:
> fa_assignment has 44184945 records
> fa_assignment_detail has 82196027 records
> 
> explain select * from fa_assignment fa JOIN fa_assignment_detail fad ON
> (fad.assignment_id = fa.assignment_id) where fa.scenario_id = 0;
> 
>                                              QUERY
> PLAN                     
> -------------------------------------------------------------------------------------------------------
> 
> Hash Join  (cost=581289.72..4940729.76 rows=9283104 width=91)


Are you really expecting 9 million rows in the result? If so, this is
probably a reasonable plan.

>   Hash Cond: (fad.assignment_id = fa.assignment_id)
>   ->  Seq Scan on fa_assignment_detail fad  (cost=0.00..1748663.60
> rows=82151360 width=61)
>   ->  Hash  (cost=484697.74..484697.74 rows=4995439 width=30)
>         ->  Bitmap Heap Scan on fa_assignment fa 
> (cost=93483.75..484697.74 rows=4995439 width=30)
>               Recheck Cond: (scenario_id = 0)
>               ->  Bitmap Index Scan on fa_assignment_idx2 
> (cost=0.00..92234.89 rows=4995439 width=0)
>                     Index Cond: (scenario_id = 0)

It's restricting on scenario_id, building a bitmap to identify which
disk-blocks will contain one or more matching rows and then scanning
those. If those 5 million scenario_id=0 rows are spread over 10% of the
blocks then that's a good idea.

If it was expecting only a handful of rows with scenario_id=0 then I'd
expect it to switch to a "standard" index scan.

If your work_mem is small try something like:
  set work_mem = '50MB';
before running the query - maybe even larger.

-- 
  Richard Huxton
  Archonet Ltd

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] bizarre query performance question

Reply via email to