On 30/05/17 11:02, Kyotaro HORIGUCHI wrote: > At Thu, 25 May 2017 17:52:50 +0200, Petr Jelinek > <petr.jeli...@2ndquadrant.com> wrote in > <e082a56a-fd95-a250-3bae-0fff93832...@2ndquadrant.com> >> Hi, >> >> We have had issue with walsender timeout when used with logical decoding >> and the transaction is taking long time to be decoded (because it >> contains many changes) >> >> I was looking today at the walsender code and realized that it's because >> if the network and downstream are fast enough, we'll always take fast >> path in WalSndWriteData which does not do reply or keepalive processing >> and is only reached once the transaction has finished by other code. So >> paradoxically we die of timeout because everything was fast enough to >> never fall back to slow code path. >> >> I propose we only use fast path if the last processed reply is not older >> than half of walsender timeout, if it is then we'll force the slow code >> path to process the replies again. This is similar logic that we use to >> determine if to send keepalive message. I also added CHECK_INTERRUPRS >> call to fast code path because otherwise walsender might ignore them for >> too long on large transactions. >> >> Thoughts? > > + TimestampTz now = GetCurrentTimestamp(); > > I think it is not recommended to read the current time too > frequently, especially within a loop that hates slowness. (I > suppose that a loop that can fill up a send queue falls into that
Yeah that was my main worry for the patch as well, although given that the loop does tuple processing it might not be very noticeable. > category.) If you don't mind a certain amount of additional > complexity for eliminating the possible slowdown by the check, > timeout would be usable. Attached patch does almost the same > thing with your patch but without busy time check. > > What do you think about this? > I think we could do it that way. > # I saw that SIGQUIT doens't work for active publisher, which I > # think mention in another thread. Ah missed that email I guess, but that's the missing CHECK_INTERRUPTS(); in the fast-path which btw your updated patch is missing as well. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers