Re: [HACKERS] Logical decoding slots can go backwards when used from SQL, docs are wrong

Petr Jelinek Fri, 02 Sep 2016 01:51:13 -0700

On 02/09/16 09:58, Craig Ringer wrote:

On 1 September 2016 at 21:19, Simon Riggs <[email protected]> wrote:

I agree the doc patch should go in, though I suggest reword it
slightly, like attached patch.


Thanks. The rewording looks good to me and I think that
Doc-correction-each-change-once.v2.patch is ready for commit.

Meanwhile, thinking about the patch to dirty slots on
confirmed_flush_lsn advance some more, I don't think it's ideal to do
it in LogicalConfirmReceivedLocation(). I'd rather not add more
complexity there, it's already complex enough. Doing it there will
also lead to unnecessary slot write-out being done to slots at normal
checkpoints even if the slot hasn't otherwise changed, possibly in
response to lsn advances sent in response to keepalives due to
activity on other unrelated databases. Slots are always fsync()ed when
written out so we don't want to do it more than we have to.

We really only need to write out slots where only the
confirmed_flush_lsn has changed at a shutdown checkpoint since it's
not really a big worry if it goes backwards on crash, and otherwise it
can't. Even then it only _really_ matters when the SQL interface is
used. Losing the confirmed_flush_lsn is very annoying when using
pg_recvlogical too, and was the original motivation for this patch.
But I'm thinking of instead teaching pg_recvlogical to write out a
status file with its last confirmed point on exit and to be able to
take that as an argument when (re)connecting. Poor-man's replication
origins, effectively.

So here's a simpler patch that just dirties the slot when it's
replayed something from it on the SQL interface, so it's flushed out
next checkpoint or on shutdown. That's the main user visible defect
that should be fixed and it's trivial to do here. It means we'll still
forget the confirmed_flush_lsn on clean shutdown if it was advanced
using the walsender protocol, but *shrug*. That's just a little
inconvenient. I can patch pg_recvlogical separately.

Okay that sounds reasonable, the SQL interface is already somewhat
different than walsender as it does not really "stream" so makes sense
to improve the behavior there. As a side note, I would really like to
have cursor-like SQL interface which behaves more like walsender one but
that's very different patch.
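
For readers following along, the dirty-flag behaviour discussed above can be
illustrated with a toy model. This is only a sketch, not PostgreSQL code: the
ToySlot class and its methods are hypothetical names, and it assumes only what
the thread states, namely that a checkpoint writes out a slot only when it is
marked dirty and that a restart reloads the last written value.

```python
class ToySlot:
    """Hypothetical toy model of a logical slot; not PostgreSQL code."""

    def __init__(self, lsn=0):
        self.confirmed_flush_lsn = lsn  # in-memory position
        self.on_disk_lsn = lsn          # last position written out
        self.dirty = False

    def confirm(self, lsn, mark_dirty):
        # Advance the in-memory position; optionally flag the slot
        # so that the next checkpoint writes it out.
        self.confirmed_flush_lsn = lsn
        if mark_dirty:
            self.dirty = True

    def checkpoint(self):
        # Assumption from the thread: only dirty slots are written.
        if self.dirty:
            self.on_disk_lsn = self.confirmed_flush_lsn
            self.dirty = False

    def restart(self):
        # After a restart only the written-out state survives.
        self.confirmed_flush_lsn = self.on_disk_lsn
        self.dirty = False


def lsn_after_clean_restart(mark_dirty):
    """Consume changes up to LSN 200, checkpoint, restart, report position."""
    slot = ToySlot(lsn=100)
    slot.confirm(200, mark_dirty=mark_dirty)
    slot.checkpoint()
    slot.restart()
    return slot.confirmed_flush_lsn


without_patch = lsn_after_clean_restart(mark_dirty=False)
with_patch = lsn_after_clean_restart(mark_dirty=True)
```

In this model the slot that was not marked dirty comes back at its old
position after a clean restart, so the same changes would be handed out
again, while the slot that was marked dirty keeps its advanced position.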


The alternative is probably to add a second, "softer" dirty tracking
method that only causes a write at a clean shutdown or forced
checkpoint - and maybe doesn't bother fsync()ing. That's a bit more
invasive but would work for walsender use as well as the SQL
interface. I don't think it's worth the bother, since in the end
callers have to be prepared for repeated data on crash anyway.

Correct me if I am wrong but I think the only situation where it would
matter is on server that restarts often or crashes often (as the logical
decoding then has to do the work many times) but I don't think it's
worth optimizing for that kind of scenario.


--
  Petr Jelinek                  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


--
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
