On Mon, Aug 30, 2021 at 3:26 PM Ronan Dunklau <ronan.dunk...@aiven.io> wrote: > Thank you for this review ! Please see the other side of the thread where I > answered Michael Paquier with a new patchset, which includes some of your > proposed modifications.
Thanks for the updated patches. Here are some comments on v3-0001 patch. I will continue to review 0002 and 0003. 1) Missing full stops "." at the end. + <literal>logical</literal> + when following the current timeline history + position, when following the current timeline history 2) Can we have the "type" column as "slot_type" just to keep in sync with the pg_replication_slots view? 3) Can we mention the link to pg_replication_slots view in the columns - "type", "restart_lsn", "confirmed_flush_lsn"? Something like: the "slot_type"/"restart_lsn"/"confirmed_flush_lsn" is same as <link linkend="view-pg-replication-slots"><structname>pg_replication_slots</structname></link> view. 4) Can we use "read_replication_slot" instead of "identify_replication_slot", just to be in sync with the actual command? 5) Are you actually copying the slot contents into the slot_contents variable here? Isn't just taking the pointer to the shared memory? + /* Copy slot contents while holding spinlock */ + SpinLockAcquire(&slot->mutex); + slot_contents = *slot; + SpinLockRelease(&slot->mutex); + LWLockRelease(ReplicationSlotControlLock); You could just do: + Oid dbid; + XLogRecPtr restart_lsn; + XLogRecPtr confirmed_flush; + /* Copy the required slot contents */ + SpinLockAcquire(&slot->mutex); + dbid = slot.data.database; + restart_lsn = slot.data.restart_lsn; + confirmed_flush = slot.data.confirmed_flush; + SpinLockRelease(&slot->mutex); + LWLockRelease(ReplicationSlotControlLock); 6) It looks like you are not sending anything as a response to the READ_REPLICATION_SLOT command, if the slot specified doesn't exist. You are just calling end_tup_output which just calls rShutdown (points to donothingCleanup of printsimpleDR) if (has_value) do_tup_output(tstate, values, nulls); end_tup_output(tstate); Can you test the use case where the pg_receivewal asks READ_REPLICATION_SLOT with a non-existent replication slot and see with your v3 patch how it behaves? Why don't you remove has_value flag and do this in ReadReplicationSlot: Datum values[5]; bool nulls[5]; MemSet(values, 0, sizeof(values)); MemSet(nulls, 0, sizeof(nulls)); + dest = CreateDestReceiver(DestRemoteSimple); + tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual); + do_tup_output(tstate, values, nulls); + end_tup_output(tstate); 7) Why don't we use 2 separate variables "restart_tli", "confirmed_flush_tli" instead of "slots_position_timeline", just to be more informative? 8) I still have the question like, how can a client (pg_receivewal for instance) know that it is the current owner/user of the slot it is requesting the info? As I said upthread, why don't we send "active" and "active_pid" fields of the pg_replication_slots view? Also, it would be good to send the "wal_status" field so that the client can check if the "wal_status" is not "lost"? 9) There are 2 new lines at the end of ReadReplicationSlot. We give only one new line after each function definition. end_tup_output(tstate); } <<1stnewline>> <<2ndnewline>> /* * Handle TIMELINE_HISTORY command. */ 10) Why do we need to have two test cases for "non-existent" slots? Isn't the test case after "DROP REPLICATION" enough? +($ret, $stdout, $stderr) = $node_primary->psql( + 'postgres', 'READ_REPLICATION_SLOT non_existent_slot;', + extra_params => [ '-d', $connstr_rep ]); +ok( $ret == 0, + "READ_REPLICATION_SLOT does not produce an error with non existent slot"); +ok( $stdout eq '', + "READ_REPLICATION_SLOT returns no tuple if a slot is non existent"); You can just rename the test case name from: + "READ_REPLICATION_SLOT returns no tuple if a slot has been dropped"); to + "READ_REPLICATION_SLOT returns no tuple if a slot is non existent"); Regards, Bharath Rupireddy.