2018-05-24 8:30 GMT+02:00 Andrey Borodin <x4...@yandex-team.ru>:

> Hi!
>
> > On 24 May 2018, at 0:55, Paolo Crosato <paolo.cros...@gmail.com> wrote:
> >
> > 1) VACUUM FULL was issued after the first time the error occurred, and a
> > couple of times later. CLUSTER was never run.
> > 2) Several failover tests were performed before the cluster was moved to
> > production. However, before the move, the whole cluster was wiped,
> > including all the application and monitoring users. After the db was moved
> > to production, a couple of users were added without any problem.
> > 3) No, even if the replication level is set to logical in
> > postgresql.conf, we only use streaming replication.
>
> I've encountered a seemingly similar ERROR:
> [ 2018-05-22 15:04:03.270 MSK ,,,281756,XX001 ]:ERROR: found xmin 747375134 from before relfrozenxid 2467346321
> [ 2018-05-22 15:04:03.270 MSK ,,,281756,XX001 ]:CONTEXT: automatic vacuum of table "postgres.pg_catalog.pg_database"
>
> The pg_database table had probably not been changed in any way for a long
> period of database operation.
> Unfortunately, I only found this out when there were a million xids left,
> and had to vacuum freeze the db in single-user mode asap. But I will
> probably be able to restore the database from backups and inspect it, if
> necessary, though the first occurrence of this error was beyond the
> recovery window.
>
> Best regards, Andrey Borodin.
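
To make the quoted error concrete: as far as I understand it, PostgreSQL compares normal transaction IDs modulo 2^32, so an xmin that is numerically smaller than relfrozenxid is not automatically "older". Below is a minimal Python sketch of that circular comparison, modelled on my reading of the server's TransactionIdPrecedes logic. The function name `xid_precedes` and the constant name are my own, not PostgreSQL identifiers, and the sketch only illustrates the arithmetic; it is not the server's code.

```python
# Sketch of circular (modulo 2^32) transaction ID comparison.
# Assumption: XIDs below 3 are special (invalid, bootstrap, frozen) and are
# compared as plain unsigned integers; all others are compared circularly.
FIRST_NORMAL_XID = 3  # hypothetical name for the first "normal" XID


def xid_precedes(id1: int, id2: int) -> bool:
    """Return True if id1 is logically older than id2."""
    if id1 < FIRST_NORMAL_XID or id2 < FIRST_NORMAL_XID:
        # Special XIDs: plain unsigned comparison.
        return id1 < id2
    # Emulate a signed 32-bit difference: wrap to 32 bits, then reinterpret.
    diff = (id1 - id2) & 0xFFFFFFFF
    if diff >= 0x80000000:
        diff -= 0x100000000
    return diff < 0


# Values taken from the error message quoted above.
xmin = 747375134
relfrozenxid = 2467346321
is_before = xid_precedes(xmin, relfrozenxid)
```

With the two values from the log, the wrapped difference is negative, so the xmin really does count as being before relfrozenxid, which is the condition the vacuum check complains about.
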
I could build a mirror instance with barman and see if the issue is present there as well, then try to vacuum freeze it in single-user mode and see if it disappears; but I would like to know why it happened in the first place.

I wonder if the autovacuum settings played a role: we kept the defaults, even though the instance has a very heavy update workload.

Best Regards,
Paolo Crosato