We have a performance problem accessing one of our tables, I think because
the statistics are out of date.  The table is fairly large, on the order of
100M rows or so.
The general structure of the table is as follows:

Column | Type | Modifiers
-------------------+--------------------------+------------------------------------------------
id | bigint | not null default nextval('foo_id_seq'::regclass)
record_id | text |
account_id | bigint | not null

With indexes:
    "foo_pkey" PRIMARY KEY, btree (id)
    "uq_account_id_record_id" UNIQUE CONSTRAINT, btree (account_id,
record_id)


This table is almost always queried using a combination of (account_id,
record_id) and is generally pretty fast.  However, under certain loads, the
query becomes slower and slower as time goes on.  The workload that causes
this to happen is when data for a new account_id is being inserted into the
table.  This will happen in rapid succession and may insert millions of
rows over the course of several hours.

The pattern that I notice when this happens is that the CPU on DB will be
pegged much higher than usual, and the query to lookup records for the
(account_id, record_id) combo will steadily rise from <1ms to more then 2
or 3 seconds over time.

The fix I have employed to restore the speed of the query after I notice it
is happening is to manually issue a VACUUM ANALYZE on the table.  After the
analyze is done, the query returns to its normal speed.

I am looking for suggestions for how to tune, or perhaps automatically
detect this pattern, so that I don't have to manually intervene whenever
this happens.

Here are my autovacuum settings:

                name                 |  setting  | unit
-------------------------------------+-----------+------
 autovacuum                          | on        |
 autovacuum_analyze_scale_factor     | 0.05      |
 autovacuum_analyze_threshold        | 50        |
 autovacuum_freeze_max_age           | 200000000 |
 autovacuum_max_workers              | 3         |
 autovacuum_multixact_freeze_max_age | 400000000 |
 autovacuum_naptime                  | 30        | s
 autovacuum_vacuum_cost_delay        | 20        | ms
 autovacuum_vacuum_cost_limit        | -1        |
 autovacuum_vacuum_scale_factor      | 0.1       |
 autovacuum_vacuum_threshold         | 50        |
 autovacuum_work_mem                 | -1        | kB

We're using 9.4.4 (RDS)

Reply via email to