Hi, I set up logical replication a few days ago, but it's throwing some
weird log lines that have me worried. Does anybody have experience with
lines like the following on a subscriber:

LOG: logical replication table synchronization worker for subscription "compass_subscription", table "search_opinionscitedbyrecapdocument" has started
ERROR: could not start WAL streaming: ERROR: replication slot "pg_20031_sync_17418_7324846428853951375" does not exist
LOG: background worker "logical replication worker" (PID 1014) exited with exit code 1

Slots with this kind of name (pg_xyz_sync_*) are created during the initial
sync, but it seems like the subscription is working based on a quick look
in a few tables.

I thought this might be related to running out of slots on the publisher,
so I increased both max_replication_slots and max_wal_senders to 50 and
rebooted so those would take effect. No luck.

I thought rebooting the subscriber might help. No luck.

When I look in the publisher to see the slots we have...

SELECT * FROM pg_replication_slots;

...I do not see the one that's missing according to the log lines.
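
The per-table sync state can also be inspected from the subscriber side. The
following is only a sketch, not something confirmed against this setup: it
assumes the standard pg_subscription and pg_subscription_rel catalogs, and it
assumes that leftover sync slots on the publisher would match the
pg_<number>_sync_<number>_<number> naming pattern seen in the log.

```sql
-- On the subscriber: list tables of the subscription that are not yet
-- in the "ready" state. srsubstate codes: i = initialize, d = data is
-- being copied, f = finished table copy, s = synchronized, r = ready.
SELECT s.subname,
       sr.srrelid::regclass AS table_name,
       sr.srsubstate,
       sr.srsublsn
FROM pg_subscription_rel AS sr
JOIN pg_subscription AS s ON s.oid = sr.srsubid
WHERE s.subname = 'compass_subscription'
  AND sr.srsubstate <> 'r'
ORDER BY sr.srrelid;

-- On the publisher: show only slots whose names look like table sync slots.
-- The LIKE pattern is an assumption based on the slot name in the log.
SELECT slot_name, slot_type, active, restart_lsn
FROM pg_replication_slots
WHERE slot_name LIKE 'pg\_%\_sync\_%'
ORDER BY slot_name;
```

Any row returned by the first query is a table whose initial sync has not
reached the ready state, which would be consistent with a sync worker that
keeps restarting for that table.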

So it seems like the initial sync might have worked properly (tables have
content), but that I have an errant process on the subscriber that might be
stuck in a retry loop.

I haven't been able to fix this, and I think my last resort might be creating
a new subscription with copy_data=false, but I'd rather avoid that if I can.

Is there a way to fix or understand this so that I don't get the log lines
forever and so that I can be confident the replication is in good shape?

Thank you!


Mike
