Hi,

On 2024-05-17 15:12:31 +0200, Pavel Stehule wrote:
> after migration on PostgreSQL 16 I seen 3x times (about every week) broken
> tables on replica nodes. The query fails with error

Migrating from what version?


You're saying that the data is correctly accessible on primaries, but broken
on standbys? Is there any difference in how the page looks like on the primary
vs standby?


> ERROR:  could not access status of transaction 1442871302
> DETAIL:  Could not open file "pg_xact/0560": No such file or directory
>
> verify_heapam reports
>
> ^[[Aprd=# select * from verify_heapam('account_login_history') where blkno
> = 179036;
>  blkno  | offnum | attnum |                                msg
>
> --------+--------+--------+-------------------------------------------------------------------
>  179036 |     30 |        | xmin 1393743382 precedes oldest valid
> transaction ID 3:1687012112

So that's not just a narrow race...


> master
>
> (2024-05-17 14:36:57) prd=# SELECT * FROM
> page_header(get_raw_page('account_login_history', 179036));
>       lsn      │ checksum │ flags │ lower │ upper │ special │ pagesize │
> version │ prune_xid
> ───────────────┼──────────┼───────┼───────┼───────┼─────────┼──────────┼─────────┼───────────
>  A576/810F4CE0 │        0 │     4 │   296 │   296 │    8192 │     8192 │
>     4 │         0
> (1 row)
>
>
> replica
> prd_aukro=# SELECT * FROM page_header(get_raw_page('account_login_history',
> 179036));
>       lsn      | checksum | flags | lower | upper | special | pagesize |
> version | prune_xid
> ---------------+----------+-------+-------+-------+---------+----------+---------+-----------
>  A56C/63979DA0 |        0 |     0 |   296 |   296 |    8192 |     8192 |
>     4 |         0
> (1 row)

Is the replica behind the primary? Or did we somehow end up with diverging
data? The page LSNs differ by about 40GB...

Is there evidence of failed truncations of the relation in the log? From
autovacuum?

Does the data in the readable versions of the tuples on that page actually
look valid? Is it possibly duplicated data?


I'm basically wondering whether it's possible that we errored out during
truncation (e.g. due to a file permission issue or such). Due to some
brokenness in RelationTruncate() that can lead to data divergence between
primary and standby and to old tuples re-appearing on either.


Another question: Do you use pg_repack or such?

Greetings,

Andres Freund


Reply via email to