On Fri, Sep 3, 2021 at 3:28 PM Ronan Dunklau <ronan.dunk...@aiven.io> wrote: > There is still the concern raised by Bharath about being able to select the > way to fetch the replication slot information for the user, and what should or > should not be included in the documentation. I am in favor of documenting the > process of selecting the wal start, and maybe include a --start-lsn option for > the user to override it, but that's maybe for another patch.
Let's hear from others. Thanks for the patches. I have some quick comments on the v5 patch-set: 0001: 1) Do you also want to MemSet values too in ReadReplicationSlot? 2) When if clause has single statement we don't generally use "{" and "}" + if (slot == NULL || !slot->in_use) + { + LWLockRelease(ReplicationSlotControlLock); + } you can just have: + if (slot == NULL || !slot->in_use) + LWLockRelease(ReplicationSlotControlLock); 3) This is still not copying the slot contents, as I said earlier you can just take the required info into some local variables instead of taking the slot pointer to slot_contents pointer. + /* Copy slot contents while holding spinlock */ + SpinLockAcquire(&slot->mutex); + slot_contents = *slot; + SpinLockRelease(&slot->mutex); + LWLockRelease(ReplicationSlotControlLock); 4) As I said earlier, why can't we have separate tli variables for more readability instead of just one slots_position_timeline and timeline_history variable? And you are not even resetting those 2 variables after if (!XLogRecPtrIsInvalid(slot_contents.data.restart_lsn)), you might end up sending the restart_lsn timelineid for confirmed_flush. So, better use two separate variables. In fact you can use block local variables: + if (!XLogRecPtrIsInvalid(slot_contents.data.restart_lsn)) + { + List *tl_history= NIL; + TimeLineID tli; + tl_history= readTimeLineHistory(ThisTimeLineID); + tli = tliOfPointInHistory(slot_contents.data.restart_lsn, + tl_history); + values[i] = Int32GetDatum(tli); + nulls[i] = false; + } similarly for confirmed_flush. 5) I still don't see the need for below test case: + "READ_REPLICATION_SLOT does not produce an error with non existent slot"); when we have + "READ_REPLICATION_SLOT does not produce an error with dropped slot"); Because for a user, dropped or non existent slot is same, it's just that for dropped slot we internally don't delete its entry from the shared memory. 0002: 1) Imagine GetSlotInformation always returns READ_REPLICATION_SLOT_ERROR, don't you think StreamLog enters an infinite loop there? Instead, why don't you just exit(1); instead of return; and retry? Similarly for READ_REPLICATION_SLOT_NONEXISTENT? I think, you can just do exit(1), no need to retry. + case READ_REPLICATION_SLOT_ERROR: + + /* + * Error has been logged by GetSlotInformation, return and + * maybe retry + */ + return; 2) Why is it returning READ_REPLICATION_SLOT_OK when slot_info isn't passed? And why are you having this check after you connect to the server and fetch the data? + /* If no slotinformation has been passed, we can return immediately */ + if (slot_info == NULL) + { + PQclear(res); + return READ_REPLICATION_SLOT_OK; + } Instead you can just have a single assert: + Assert(slot_name && slot_info ); 3) How about pg_log_error("could not read replication slot: instead of pg_log_error("could not fetch replication slot: 4) Why are you having the READ_REPLICATION_SLOT_OK case in default? + default: + if (slot_info.is_logical) + { + /* + * If the slot is not physical we can't expect to + * recover from that + */ + pg_log_error("cannot use the slot provided, physical slot expected."); + exit(1); + } + stream.startpos = slot_info.restart_lsn; + stream.timeline = slot_info.restart_tli; + } You can just have another case statement for READ_REPLICATION_SLOT_OK and in the default you can throw an error "unknown read replication slot status" or some other better working and exit(1); 5) Do you want initialize slot_info to 0? + if (replication_slot) + { + SlotInformation slot_info; + MemSet(slot_info, 0, sizeof(SlotInformation)); 6) This isn't readable: + slot_info->is_logical = strcmp(PQgetvalue(res, 0, 0), "logical") == 0; How about: if (strcmp(PQgetvalue(res, 0, 0), "logical") == 0) slot_info->is_logical = true; You don't need to set it false, because you would have MemSet(slot_info) in the caller. 7) How about uint32 hi; uint32 lo; instead of + uint32 hi, + lo; 8) Move SlotInformation * slot_info) to the next line as it crosses the 80 char limit. +GetSlotInformation(PGconn *conn, const char *slot_name, SlotInformation * slot_info) 9) Instead of a boolean is_logical, I would rather suggest to use an enum or #define macros the slot types correctly, because someday we might introduce new type slots and somebody wants is_physical kind of variable name? +typedef struct SlotInformation { + bool is_logical; Regards, Bharath Rupireddy.