Hi

2017-02-13 18:40 GMT+01:00 David Hinkle <hin...@cipafilter.com>:

> I'm having trouble with purges related to a large table.   The delete
> query consumes ram until postgres crashes due to OOM.   I have a very
> large table called log_raw.  There are half a dozen related tables,
> such as 'urls' and 'titles'.   log_raw.urlid = urls.urlid and urls
> contains the text of the various urls, for example.
>
> Each time I try to purge these side tables the unit OOM's.
>
> psql:postgres@cipafilter = explain DELETE FROM titles WHERE NOT EXISTS
> ( SELECT 1 FROM log_raw WHERE log_raw.titleid = titles.titleid );
>                                        QUERY PLAN
> ────────────────────────────────────────────────────────────
> ─────────────────────────────
>  Delete on titles  (cost=22166473.44..24850954.67 rows=870382 width=12)
>    ->  Hash Anti Join  (cost=22166473.44..24850954.67 rows=870382 width=12)
>          Hash Cond: (titles.titleid = log_raw.titleid)
>          ->  Seq Scan on titles  (cost=0.00..17871.64 rows=870664 width=10)
>          ->  Hash  (cost=12744792.64..12744792.64 rows=542011264 width=10)
>                ->  Seq Scan on log_raw  (cost=0.00..12744792.64
> rows=542011264 width=10)
> (6 rows)
>
>
what is your work_mem setting?

Regards

Pavel


> psql:postgres@cipafilter = select count(*) from (select titleid from
> log_raw group by titleid) as a;
>  count
> ────────
>  872210
> (1 row)
>
> cipafilter=# select n_distinct from pg_stats where tablename =
> 'log_raw' and attname = 'titleid';
>  n_distinct
> ------------
>         282
> (1 row)
>
> The planning data is wildly low for each of these fields, and I wonder
> if because of that error the planner thinks it can keep all these id's
> in ram while it works. Analyze doesn't fix it.   Increasing the
> statistics target improves the data in n_distinct but not
> considerably, as increasing it 3 or 4 fold leads to it still being
> wildly off.  ALTER TABLE set n_distinct doesn't seem to be used by the
> planner as it doesn't change any of the plans I've generated or seem
> to be taken into account in the row estimates. I'm out of ideas.
> Anybody have any ideas?
>
>
> --
> David Hinkle
>
> Senior Software Developer
>
> Phone:  800.243.3729x3000
>
> Email:  hin...@cipafilter.com
>
> Hours:  Mon-Fri   8:00AM-5:00PM (CT)
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>

Reply via email to