Hi,
While investigating logical decoding of pg_logical_emit_message(),
I noticed that the LSN reported for logical messages differs from
the LSN reported for other operations such as INSERT, UPDATE, and
DELETE.
For example, with the following transaction:
BEGIN;
INSERT INTO data(data) VALUES('1');
UPDATE data SET data = 'a' WHERE id = 1;
DELETE FROM data WHERE id = 1;
SELECT * FROM pg_logical_emit_message(true, 'test1', 'aaa');
INSERT INTO data(data) VALUES('2');
TRUNCATE data;
COMMIT;
=# SELECT * FROM pg_logical_slot_get_changes('regression_slot', NULL,
NULL);
lsn | xid | data
------------+-----+------------------------------------------------------------
0/017E9640 | 696 | BEGIN 696
0/017E9640 | 696 | table public.data: INSERT: id[integer]:3
data[text]:'1'
0/017E96C0 | 696 | table public.data: UPDATE: id[integer]:1
data[text]:'a'
0/017E9740 | 696 | table public.data: DELETE: id[integer]:1
0/017E97F8 | 696 | message: transactional: 1 prefix: test1, sz: 3
content:aaa
0/017E9830 | 696 | table public.data: INSERT: id[integer]:4
data[text]:'2'
0/017EA420 | 696 | table public.data: TRUNCATE: (no-flags)
0/017EA5C0 | 696 | COMMIT 696
$ pg_waldump -f data/pg_wal/000000010000000000000001
rmgr: Heap len (rec/tot): 61/ 61, tx: 696, lsn:
0/017E9640, prev 0/017E9608, desc: INSERT off: 5, flags: 0x08, blkref
#0: rel 1663/5/16385 blk 0
..(snip)..
rmgr: Heap len (rec/tot): 72/ 72, tx: 696, lsn:
0/017E96C0, prev 0/017E9680, desc: HOT_UPDATE old_xmax: 696, old_off: 2,
old_infobits: [], flags: 0x10, new_xmax: 0, new_off: 6, blkref #0: rel
1663/5/16385 blk 0
..(snip)..
rmgr: Heap len (rec/tot): 64/ 64, tx: 696, lsn:
0/017E9740, prev 0/017E9708, desc: DELETE xmax: 696, off: 6, infobits:
[KEYS_UPDATED], flags: 0x04, blkref #0: rel 1663/5/16385 blk 0
..(snip)..
rmgr: LogicalMessage len (rec/tot): 59/ 59, tx: 696, lsn:
0/017E97B8, prev 0/017E9780, desc: MESSAGE transactional, prefix
"test1"; payload (3 bytes): 61 61 61
rmgr: Standby len (rec/tot): 54/ 54, tx: 0, lsn:
0/017E97F8, prev 0/017E97B8, desc: RUNNING_XACTS nextXid 697
latestCompletedXid 695 oldestRunningXid 696; 1 xacts: 696
..(snip)..
rmgr: Heap len (rec/tot): 61/ 61, tx: 696, lsn:
0/017E9830, prev 0/017E97F8, desc: INSERT off: 7, flags: 0x08, blkref
#0: rel 1663/5/16385 blk 0
Comparing the output of pg_logical_slot_get_changes() with the
pg_waldump, the LSNs reported for INSERT, UPDATE, and DELETE
match the start LSN of the corresponding WAL records. However,
the LSN reported for the logical message does not match the
LogicalMessage WAL record itself (0/017E97B8); instead, it
matches the LSN of the following RUNNING_XACTS record
(0/017E97F8).
I found that changes such as INSERTs and UPDATEs are queued via
ReorderBufferQueueChange() using XLogRecordBuffer.origptr,
whereas logical messages are queued using
XLogRecordBuffer.endptr. This appears to explain the observed
behavior.
My question is: is there a particular reason why logical messages
use endptr instead of origptr?
Looking through the history, this behavior seems to go back to
commit 3fe3511d05127c, which added logical decoding support for
pg_logical_emit_message().
BTW The reason I started investigating this is that we
encountered a data-loss issue in the Debezium PostgreSQL
connector, which uses logical decoding for Change Data Capture.
Under certain circumstances, messages emitted by
pg_logical_emit_message() could be skipped during recovery
because the message LSN behaves differently from other decoded
operations, as described. The attached patch, which uses origptr
instead of endptr, eliminates the issue in my testing.
Of course, I think consumers of logical decoding, such as
Debezium, could work around this by treating message LSNs
differently. However, compared to other decoded operations, this
special handling feels somewhat unexpected, so I wanted to ask
whether the current behavior is intentional.
Thanks,
--
Atsushi Torikoshi
Seconded from NTT DATA CORPORATION to SRA OSS K.K.
From 8a51b34aa0f9c358b28855a5349fb7084f5a6e66 Mon Sep 17 00:00:00 2001
From: Atsushi Torikoshi <[email protected]>
Date: Mon, 8 Jun 2026 21:44:33 +0900
Subject: [PATCH v1] Align logical message LSN with other operations
The regression test is adjusted accordingly. The test only emits a
single non-transactional logical message, which is decoded with
xid = 0, so filtering on xid = 0 seems to be sufficient to identify
the message without matching on its LSN.
---
src/backend/replication/logical/decode.c | 2 +-
src/test/subscription/t/020_messages.pl | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index c944be4ac83..09efc15e454 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -668,7 +668,7 @@ logicalmsg_decode(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
if (!message->transactional)
snapshot = SnapBuildGetOrBuildSnapshot(builder);
- ReorderBufferQueueMessage(ctx->reorder, xid, snapshot, buf->endptr,
+ ReorderBufferQueueMessage(ctx->reorder, xid, snapshot, buf->origptr,
message->transactional,
message->message, /* first part of message is
* prefix */
diff --git a/src/test/subscription/t/020_messages.pl b/src/test/subscription/t/020_messages.pl
index ac518849156..26dede093df 100644
--- a/src/test/subscription/t/020_messages.pl
+++ b/src/test/subscription/t/020_messages.pl
@@ -107,7 +107,7 @@ $result = $node_publisher->safe_psql(
'proto_version', '1',
'publication_names', 'tap_pub',
'messages', 'true')
- WHERE lsn = '$message_lsn' AND xid = 0
+ WHERE xid = 0
));
is($result, qq(77|0), 'non-transactional message on slot is M');
--
2.48.1