Hi,

We have had an issue with walsender timeouts when logical decoding is used and a transaction takes a long time to decode (because it contains many changes).
Looking at the walsender code today, I realized the cause: if the network and the downstream are fast enough, we always take the fast path in WalSndWriteData, which does no reply or keepalive processing. That processing is then reached only by other code, once the transaction has finished. So, paradoxically, we die of a timeout because everything was fast enough that we never fell back to the slow code path.

I propose that we use the fast path only if the last processed reply is not older than half of the walsender timeout; if it is older, we force the slow code path so that the replies get processed again. This is similar to the logic we use to decide whether to send a keepalive message.

I also added a CHECK_FOR_INTERRUPTS call to the fast code path, because otherwise walsender might ignore interrupts for too long on large transactions.

Thoughts?

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
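To make the proposed check concrete, here is a minimal standalone sketch of the decision described above. It is not the attached patch: the type `sketch_time_ms`, the function `sketch_can_use_fast_path`, and the `send_pending` parameter are hypothetical names invented for illustration, and time is modelled as plain milliseconds rather than PostgreSQL timestamps. Treating a timeout of 0 as "disabled" follows the documented meaning of wal_sender_timeout, but how the real code handles that case in this path is an assumption here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a timestamp, expressed in milliseconds. */
typedef int64_t sketch_time_ms;

/*
 * Decide whether the fast path may be taken when writing decoded data.
 *
 * send_pending: assumption standing in for "the network or downstream
 *               could not keep up", i.e. output is still waiting to be sent.
 * timeout_ms:   the walsender timeout; 0 or less is treated as disabled
 *               (assumption for this sketch).
 *
 * The fast path is allowed only while the last processed reply is not
 * older than half of the timeout. Otherwise the caller should fall back
 * to the slow path, which processes replies and keepalives.
 */
static bool
sketch_can_use_fast_path(sketch_time_ms now,
                         sketch_time_ms last_reply,
                         int timeout_ms,
                         bool send_pending)
{
    /* Unsent output always requires the slow path. */
    if (send_pending)
        return false;

    /* With the timeout disabled there is nothing to guard against. */
    if (timeout_ms <= 0)
        return true;

    /* Reply age must not exceed half of the timeout. */
    return (now - last_reply) <= (sketch_time_ms) (timeout_ms / 2);
}
```

With a 60 second timeout, for example, a reply processed 10 seconds ago still allows the fast path, while one processed 31 seconds ago forces the slow path even though nothing is pending.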
Attachment: 0001-Fix-walsender-timeouts-when-decoding-large-transacti.patch (binary/octet-stream)