Don Seiler wrote:
> We have a report query that has gone from maybe a few seconds to run to a few 
> minutes to run since mid-July.
> Looking at the output of EXPLAIN ANALYZE, the row count estimates are way 
> off, even though this table was
> just analyzed a day or so ago. What's more bizarre to me is that the row 
> count esimate is *always* 75 for
> every node of the plan, where the actual rows is in the hundreds or 
> thousands. This table is one of the
> busiest tables in our production database (many inserts and updates). It is 
> autovacuumed and autoanalyzed
> a few times per week, although I'm looking to change it to a nightly manual 
> schedule to avoid daytime autovacuums.
> 
>  Hash Join  (cost=1869142.34..1869146.15 rows=75 width=88) (actual 
> time=179877.869..179878.011 rows=759 loops=1)
>    Hash Cond: (stores.pkey = lt.store_pkey)
>    Buffers: shared hit=1654593 read=331897 dirtied=249
>    ->  Seq Scan on stores  (cost=0.00..2.77 rows=77 width=22) (actual 
> time=0.007..0.023 rows=78 loops=1)
>          Buffers: shared hit=2
>    ->  Hash  (cost=1869141.40..1869141.40 rows=75 width=50) (actual 
> time=179877.847..179877.847 rows=759 loops=1)
>          Buckets: 1024  Batches: 1  Memory Usage: 73kB
>          Buffers: shared hit=1654591 read=331897 dirtied=249
>          ->  Subquery Scan on lt  (cost=1869138.59..1869141.40 rows=75 
> width=50) (actual time=179875.976..179877.697 rows=759 loops=1)
>                Buffers: shared hit=1654591 read=331897 dirtied=249
>                ->  GroupAggregate  (cost=1869138.59..1869140.65 rows=75 
> width=50) (actual time=179875.976..179877.606 rows=759 loops=1)
>                      Group Key: lts.store_pkey, lts.owner, 
> (date_trunc('minute'::text, lts.date_gifted))
>                      Filter: (count(*) IS NOT NULL)
>                      Buffers: shared hit=1654591 read=331897 dirtied=249
>                      ->  Sort  (cost=1869138.59..1869138.78 rows=75 width=42) 
> (actual time=179875.961..179876.470 rows=6731 loops=1)
>                            Sort Key: lts.store_pkey, lts.entry_source_owner, 
> (date_trunc('minute'::text, lts.date_gifted))
>                            Sort Method: quicksort  Memory: 757kB
>                            Buffers: shared hit=1654591 read=331897 dirtied=249
>                            ->  Index Scan using gifts_date_added on gifts lts 
>  (cost=0.56..1869136.25 rows=75 width=42) (actual time=190.657..179870.165 
> rows=6731 loops=1)
>                                  Index Cond: ((date_added > '2018-07-14 
> 11:13:05'::timestamp without time zone) AND (date_added < '2018-08-13 
> 14:14:21'::timestamp without time zone))
>                                  Filter: ((date_gifted >= '2018-08-13 
> 11:13:05'::timestamp without time zone) AND (date_gifted < '2018-08-13 
> 14:14:21'::timestamp without time zone))
>                                  Rows Removed by Filter: 938197
>                                  Buffers: shared hit=1654591 read=331897 
> dirtied=249
>  Planning time: 0.426 ms
>  Execution time: 179893.894 ms
> 
> I don't have a version of this query from prior to this summer, but getting 
> explain plan for older data from
> older sandboxes show a similar plan.
> 
> Sidenote: I am suggesting that an index be added on the date_gifted field as 
> that is far more selective and avoids
> throwing rows away. However I'm very interested in why every node dealing 
> with the gifts table thinks rows=75
> when the actual is much, much higher. And 75 seems like too round of a number 
> to be random?

Yes, I would say that adding an index on "date_gifted" would help.  You may end
up with two bitmap index scans that get combined.
Make sure "work_mem" is big enough to avoid lossy bitmaps (indicated in the 
plan).

About the misestimate:

You could try running ANALYZE with an increased "default_statistics_target" and 
see
if that changes the estimate.
If yes, then maybe you should increase statistics for that table or (seing that 
you are
querying current values) you should collect statistics more often.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com

Reply via email to