I've been having problems where a HashAggregate is chosen because of a bad
estimate of the number of distinct values involved.  In the following
example the actual number of distinct domain IDs is about 2/3 of the number
of rows, but the planner estimates it at only about 1/15 of that.  This
occasionally causes the generated query to use a HashAggregate, which runs
the backend out of memory; it uses 700 MB or more before failing.
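
Presumably the fix is to get the planner's estimate closer to reality by
raising the statistics target on that column and re-analyzing; something
along these lines (the target of 1000 is just a guess on my part):

  -- the planner's current distinct-value estimate for domain_id
  -- (a negative n_distinct is a fraction of the row count)
  select n_distinct from pg_stats
   where tablename = 'clicks' and attname = 'domain_id';

  -- sample the column more heavily, then rebuild the statistics
  alter table clicks alter column domain_id set statistics 1000;
  analyze clicks;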

The following was run -immediately- after a vacuum.

explain analyze
select sum(count) as sumc, class, domain_id
  into temp new_clicks
  from clicks, countries
 where date > (current_date - 20)
   and clicks.country_id = countries.country_id
 group by domain_id, class;

 GroupAggregate  (cost=1136261.89..1183383.51 rows=191406 width=12) (actual time=138375.935..163794.452 rows=3258152 loops=1)
   ->  Sort  (cost=1136261.89..1147922.66 rows=4664311 width=12) (actual time=138374.865..147308.343 rows=4514313 loops=1)
         Sort Key: clicks.domain_id, countries."class"
         ->  Hash Join  (cost=4.72..421864.06 rows=4664311 width=12) (actual time=6837.405..66938.259 rows=4514313 loops=1)
               Hash Cond: ("outer".country_id = "inner".country_id)
               ->  Seq Scan on clicks  (cost=0.00..351894.67 rows=4664311 width=12) (actual time=6836.388..46865.490 rows=4514313 loops=1)
                     Filter: (date > (('now'::text)::date - 20))
               ->  Hash  (cost=4.18..4.18 rows=218 width=8) (actual time=0.946..0.946 rows=0 loops=1)
                     ->  Seq Scan on countries  (cost=0.00..4.18 rows=218 width=8) (actual time=0.011..0.516 rows=218 loops=1)
 Total runtime: 175404.738 ms
(10 rows)
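
If the estimate can't be improved, a session-level workaround would
presumably be to keep the planner away from HashAggregate for just this
statement, e.g.:

  set enable_hashagg = off;   -- force the sort-based GroupAggregate plan
  -- ... run the query above ...
  reset enable_hashagg;       -- restore the default afterwards
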
-- 
Mike Harding <[EMAIL PROTECTED]>

