On Thu, Aug 26, 2021 at 5:45 PM Ronan Dunklau <ronan.dunk...@aiven.io> wrote: > order to fail early if the replication slot doesn't exist, so please find > attached v2 for that.
Thanks for the patches. Here are some comments: 1) While the intent of these patches looks good, I have following concern with new replication command READ_REPLICATION_SLOT: what if the pg_receivewal exits (because user issued a SIGINT or for some reason) after flushing the received WAL to disk, before it sends sendFeedback to postgres server's walsender so that it doesn't get a chance to update the restart_lsn in the replication slot via PhysicalConfirmReceivedLocation. If the pg_receivewal is started again, isn't it going to get the previous restart_lsn and receive the last chunk of flushed WAL again? 2) What is the significance of READ_REPLICATION_SLOT for logical replication slots? I read above that somebody suggested to restrict the walsender to handle READ_REPLICATION_SLOT for physical replication slots so that the callers will see a command failure. But I tend to think that it is clean to have this command common for both physical and logical replication slots and the callers can have an Assert(type == 'physical'). 3) Isn't it useful to send active, active_pid info of the replication slot via READ_REPLICATION_SLOT? pg_receivewal can use Assert(active == true && active_pid == getpid()) as an assertion to ensure that it is the sole owner of the replication slot? Also, is it good send wal_status info 4) I think below messages should start with lower case letter and also there are some typos: + pg_log_warning("Could not fetch the replication_slot \"%s\" information " + pg_log_warning("Server does not suport fetching the slot's position, " something like: + pg_log_warning("could not fetch replication slot \"%s\" information, " + "resuming from current server position instead", replication_slot); + pg_log_warning("server does not support fetching replication slot information, " + "resuming from current server position instead"); 5) How about emitting the above messages in case of "verbose"? 6) How about an assertion like below? + if (stream.startpos == InvalidXLogRecPtr) + { + stream.startpos = serverpos; + stream.timeline = servertli; + } + +Assert(stream.startpos != InvalidXLogRecPtr)>> 7) How about we let pg_receivewal use READ_REPLICATION_SLOT as an option? 8) Just an idea, how about we store pg_receivewal's lastFlushPosition in a file before pg_receivewal exits and compare it with the restart_lsn that it received from the replication slot, if lastFlushPosition == received_restart_lsn well and good, if not, then something would have happened and we always start at the lastFlushPosition ? Regards, Bharath Rupireddy.