>So you are VACUUMing the lesser 'younger' tables?
I VACUUM those with the highest age : 
SELECT age,array_agg(table_name) FROM (SELECT c.oid::regclass as table_name,    
   greatest(age(c.relfrozenxid),age(t.relfrozenxid)) as age FROM pg_class c 
LEFT JOIN pg_class t ON c.reltoastrelid = t.oid WHERE c.relkind IN ('r', 'm') 
order by 2 desc limit 1000) tt group by age order by age desc

  
>> • I exit singleuser mode and relaunch the server so I keep on vacuuming the 
>> oldest table first by copying
>> • But I don’t even have time to launch my script : as soon as launch the 
>> server, the auto-vacuum daemon kicks in and burns through the transactions : 
>> I guess it’s not smart enough to start with 
>>the oldest tables ?
>Not understanding; 'the auto-vacuum daemon kicks in and burns through 
>the transactions'.
>Are you saying it is reclaiming xids for you or using them?
>If reclaiming that is what is supposed to do and is good thing.
>Or am I misunderstanding?
Here is  what the logs show when I do what I described above

1) I got 7 transactions back in single user mode
Aug  7 23:40:57 p2 postgres[30376]: [5-1] 2016-08-07 23:40:57 CEST WARNING:  
database "public" must be vacuumed within 999893 transactions
Aug  7 23:40:57 p2 postgres[30376]: [5-2] 2016-08-07 23:40:57 CEST HINT:  To 
avoid a database shutdown, execute a database-wide VACUUM in that database.
 
2) I exit single user mode and restart the database
Aug  7 23:41:40 p2 postgres[15457]: [1-1] 2016-08-07 23:41:40 CEST LOG:  
database system was shut down at 2016-08-07 23:41:32 CEST
Aug  7 23:41:40 p2 postgres[15458]: [1-1] 2016-08-07 23:41:40 CEST LOG:  
incomplete startup packet
Aug  7 23:41:40 p2 postgres[15459]: [1-1] 2016-08-07 23:41:40 CEST FATAL:  the 
database system is starting up
Aug  7 23:41:40 p2 postgres[15457]: [2-1] 2016-08-07 23:41:40 CEST WARNING:  
database with OID 16385 must be vacuumed within 999892 transactions
Aug  7 23:41:40 p2 postgres[15457]: [2-2] 2016-08-07 23:41:40 CEST HINT:  To 
avoid a database shutdown, execute a database-wide VACUUM in that database.
Aug  7 23:41:40 p2 postgres[15457]: [2-3] #011You might also need to commit or 
roll back old prepared transactions.
Aug  7 23:41:40 p2 postgres[15462]: [1-1] 2016-08-07 23:41:40 CEST LOG:  
autovacuum launcher started
Aug  7 23:41:40 p2 postgres[15447]: [1-1] 2016-08-07 23:41:40 CEST LOG:  
database system is ready to accept connections

3) but I don’t even have time to run a query to see the state of the databases
Aug  7 23:41:52 p2 postgres[15487]: [2-1] 2016-08-07 23:41:52 CEST ERROR:  
database is not accepting commands to avoid wraparound data loss in database 
"public"
Aug  7 23:41:52 p2 postgres[15487]: [2-2] 2016-08-07 23:41:52 CEST HINT:  Stop 
the postmaster and use a standalone backend to vacuum that database.
Aug  7 23:41:52 p2 postgres[15487]: [2-3] #011You might also need to commit or 
roll back old prepared transactions.
Aug  7 23:41:52 p2 postgres[15487]: [2-4] 2016-08-07 23:41:52 CEST STATEMENT:  
SELECT datname, age(datfrozenxid) FROM pg_database
Aug  7 23:41:52 p2 postgres[15497]: [2-1] 2016-08-07 23:41:52 CEST ERROR:  
database is not accepting commands to avoid wraparound data loss in database 
"public"
Aug  7 23:41:52 p2 postgres[15497]: [2-2] 2016-08-07 23:41:52 CEST HINT:  Stop 
the postmaster and use a standalone backend to vacuum that database.
Aug  7 23:41:52 p2 postgres[15497]: [2-3] #011You might also need to commit or 
roll back old prepared transactions.
Aug  7 23:41:52 p2 postgres[15497]: [2-4] 2016-08-07 23:41:52 CEST STATEMENT:  
COPY  ( SELECT 'VACUUM VERBOSE ' ||c.oid::regclass|| ';--' as _command, 
greatest ( age ( c.relfrozenxid ) ,age ( t.relfrozenxid ) ) as age FROM 
pg_class c LEFT JOIN pg_class t ON c.reltoastrelid = t.oid WHERE c.relkind IN ( 
'r' , 'm' ) order by 2 desc limit 1000 ) TO STDOUT (FORMAT text)
Aug  7 23:41:52 p2 postgres[15518]: [2-1] 2016-08-07 23:41:52 CEST ERROR:  
database is not accepting commands to avoid wraparound data loss in database 
"public"
Aug  7 23:41:52 p2 postgres[15518]: [2-2] 2016-08-07 23:41:52 CEST HINT:  Stop 
the postmaster and use a standalone backend to vacuum that database.
Aug  7 23:41:52 p2 postgres[15518]: [2-3] #011You might also need to commit or 
roll back old prepared transactions.
Aug  7 23:41:52 p2 postgres[15518]: [2-4] 2016-08-07 23:41:52 CEST STATEMENT:  
SELECT datname, age(datfrozenxid) FROM pg_database
Aug  7 23:41:53 p2 rsyslogd-2177: imuxsock lost 3512 messages from pid 15476 
due to rate-limiting
Aug  7 23:41:53 p2 postgres[15476]: [2-1] 2016-08-07 23:41:53 CEST ERROR:  
database is not accepting commands to avoid wraparound data loss in database 
"public"
Aug  7 23:41:53 p2 postgres[15476]: [2-2] 2016-08-07 23:41:53 CEST HINT:  Stop 
the postmaster and use a standalone backend to vacuum that database.
Aug  7 23:41:53 p2 postgres[15476]: [2-3] #011You might also need to commit or 
roll back old prepared transactions.
Aug  7 23:41:53 p2 postgres[15476]: [2-4] 2016-08-07 23:41:53 CEST CONTEXT:  
automatic analyze of table "public.public.aaaaaa"


4) lots of autovacuum failures
Aug  7 23:41:55 p2 postgres[15476]: [3-1] 2016-08-07 23:41:55 CEST ERROR:  
database is not accepting commands to avoid wraparound data loss in database 
"public"
Aug  7 23:41:55 p2 postgres[15476]: [3-2] 2016-08-07 23:41:55 CEST HINT:  Stop 
the postmaster and use a standalone backend to vacuum that database.
Aug  7 23:41:55 p2 postgres[15476]: [3-3] #011You might also need to commit or 
roll back old prepared transactions.
Aug  7 23:41:55 p2 postgres[15476]: [3-4] 2016-08-07 23:41:55 CEST CONTEXT:  
automatic analyze of table "public.public.aaaaab"
Aug  7 23:41:56 p2 postgres[15476]: [4-1] 2016-08-07 23:41:56 CEST ERROR:  
database is not accepting commands to avoid wraparound data loss in database 
"public"
Aug  7 23:41:56 p2 postgres[15476]: [4-2] 2016-08-07 23:41:56 CEST HINT:  Stop 
the postmaster and use a standalone backend to vacuum that database.
Aug  7 23:41:56 p2 postgres[15476]: [4-3] #011You might also need to commit or 
roll back old prepared transactions.
Aug  7 23:41:56 p2 postgres[15476]: [4-4] 2016-08-07 23:41:56 CEST CONTEXT:  
automatic analyze of table "public.public.aaaaac"
Aug  7 23:41:56 p2 postgres[15476]: [5-1] 2016-08-07 23:41:56 CEST ERROR:  
database is not accepting commands to avoid wraparound data loss in database 
"public"
Aug  7 23:41:56 p2 postgres[15476]: [5-2] 2016-08-07 23:41:56 CEST HINT:  Stop 
the postmaster and use a standalone backend to vacuum that database.
Aug  7 23:41:56 p2 postgres[15476]: [5-3] #011You might also need to commit or 
roll back old prepared transactions.
(…)

>> So it looks like I’m going to have to bite the bullet and really vacuum the 
>> whole database instead of just the oldest tables first which will impact our 
>> production pipelines.  
> Or VACUUM the most heavily used tables in the database.
Why would that help ? Aren’t the tables with the highest age the problem ?

>    The next question to be asked is; what is creating the transactions and 
>    is the transaction rate 'normal' or is there a possibility you have a 
>    rogue process or rogue processes in action?
That’s always a possibility but I can’t think of what that would be.
I was guessing that the auto-vacuum wasn’t vacuuming the tables with the 
highest age first. Isn’t that a possibility ?

Thanks


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Reply via email to