Hi 2017-02-13 18:40 GMT+01:00 David Hinkle <hin...@cipafilter.com>:
> I'm having trouble with purges related to a large table. The delete > query consumes ram until postgres crashes due to OOM. I have a very > large table called log_raw. There are half a dozen related tables, > such as 'urls' and 'titles'. log_raw.urlid = urls.urlid and urls > contains the text of the various urls, for example. > > Each time I try to purge these side tables the unit OOM's. > > psql:postgres@cipafilter = explain DELETE FROM titles WHERE NOT EXISTS > ( SELECT 1 FROM log_raw WHERE log_raw.titleid = titles.titleid ); > QUERY PLAN > ──────────────────────────────────────────────────────────── > ───────────────────────────── > Delete on titles (cost=22166473.44..24850954.67 rows=870382 width=12) > -> Hash Anti Join (cost=22166473.44..24850954.67 rows=870382 width=12) > Hash Cond: (titles.titleid = log_raw.titleid) > -> Seq Scan on titles (cost=0.00..17871.64 rows=870664 width=10) > -> Hash (cost=12744792.64..12744792.64 rows=542011264 width=10) > -> Seq Scan on log_raw (cost=0.00..12744792.64 > rows=542011264 width=10) > (6 rows) > > what is your work_mem setting? Regards Pavel > psql:postgres@cipafilter = select count(*) from (select titleid from > log_raw group by titleid) as a; > count > ──────── > 872210 > (1 row) > > cipafilter=# select n_distinct from pg_stats where tablename = > 'log_raw' and attname = 'titleid'; > n_distinct > ------------ > 282 > (1 row) > > The planning data is wildly low for each of these fields, and I wonder > if because of that error the planner thinks it can keep all these id's > in ram while it works. Analyze doesn't fix it. Increasing the > statistics target improves the data in n_distinct but not > considerably, as increasing it 3 or 4 fold leads to it still being > wildly off. ALTER TABLE set n_distinct doesn't seem to be used by the > planner as it doesn't change any of the plans I've generated or seem > to be taken into account in the row estimates. I'm out of ideas. > Anybody have any ideas? > > > -- > David Hinkle > > Senior Software Developer > > Phone: 800.243.3729x3000 > > Email: hin...@cipafilter.com > > Hours: Mon-Fri 8:00AM-5:00PM (CT) > > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general >