> On Feb 25, 2026, at 18:45, Jakub Wartak <[email protected]> wrote:
> 
> On Wed, Feb 25, 2026 at 10:15 AM Chao Li <[email protected]> wrote:
> 
> Hi Chao, thanks for review.
> 
>> A few comments for v3:
>> 
>> 1 - syncrep.c
> [..]
>> +                                               proc_die_sender_pid == 0 ? 0 
>> :
>> +                                                       errdetail("Signal 
>> sent by PID %lld, UID %lld.",
>> +                                                               (long 
>> long)proc_die_sender_pid, (long long)proc_die_sender_uid)
>> +                                               ));
>> ```
>> 
>> Here errdetail is used twice. I guess the second conditional one should be 
>> errhint.
>> 
>> 2 - syncrep.c
> [..]
>> Same as comment 1.
> 
> You are right, apparently I copy/pasted code from src/backend/tcop/postgres.c
> way too fast... fixed.
> 
>> 3
>> ```
>> +volatile uint32 proc_die_sender_pid = 0;
>> +volatile uint32 proc_die_sender_uid = 0;
>> ```
>> 
>> These two globals are only written in the signal handler, I think they 
>> should be sig_atomic_t to ensure atomic writes.
> 
> Well the problem that sig_atomic_t is int and we need at least uint32 and
> I couldn't find better way. I think that 4 bytes writes will be mostly
> always atomic (for 64-bits it would depend on
> PG_HAVE_8BYTE_SINGLE_COPY_ATOMICITY)
> 
> Yet I moved those little below, so it's more aligned to the other uses.
> 
>> 4
>> ```
>> -                                        errmsg("terminating walreceiver 
>> process due to administrator command")));
>> +                                        errmsg("terminating walreceiver 
>> process due to administrator command"),
>> +                                        proc_die_sender_pid == 0 ? 0 :
>> +                                               errdetail("Signal sent by 
>> PID %lld, UID %lld.",
>> +                                                       (long 
>> long)proc_die_sender_pid, (long long)proc_die_sender_uid)
>> +                                       ));
>> ```
>> 
>> Why do we need to format pid and uid in “long long” format? I searched over 
>> the source tree, the current postmaster.c just formats pid as int (%d):
>> ```
>>        /* in parent, successful fork */
>>        ereport(DEBUG2,
>>                        (errmsg_internal("forked new %s, pid=%d socket=%d",
>>                                                         
>> GetBackendTypeDesc(bn->bkend_type),
>>                                                         (int) pid, (int) 
>> client_sock->sock)));
>> ```
> 
> Yes, I think I was kind of lost when thinking about it (v1 had sig_atomic_t,
> later had pid_t, I read somewhere about 64-bit pids, and so on) vs
> platform-agnostic hell of putting that into printf). Possible I was
> overthinking it
> and I have reverted it to just using %d with that uint32. BTW I've also found:
>   elog(DEBUG3, "kill(%ld,%d) failed: %m", (long) pid, signal);
> 
> v4 attached. Thanks again for the review.
> 
> -J.
> <v4-0001-Add-errdetail-with-PID-and-UID-about-source-of-te.patch>

I just reviewed v4 again and got a few more comments:

1. This patch only set the global proc_die_sender_pid/uid to 0 at startup, then 
assign values to them upon receiving SIGTERM, and never reset them, which 
assumes a process must die upon SIGTERM. Is the assumption true? I guess not. 
If a process receives SIGTERM and not die immediately, then die for other 
reason, then it may report a misleading PID and UID. So, I think we may need to 
reset proc_die_sender_pid/uid somewhere. For example, in ProcessInterrupts(), 
copy them to local variables and reset them to 0, then use the local variables 
for ereport().

2. 
```
@@ -319,7 +323,11 @@ SyncRepWaitForLSN(XLogRecPtr lsn, bool commit)
                        ereport(WARNING,
                                        (errmsg("canceling wait for synchronous 
replication due to user request"),
-                                        errdetail("The transaction has already 
committed locally, but might not have been replicated to the standby.")));
+                                        errdetail("The transaction has already 
committed locally, but might not have been replicated to the standby."),
+                                               proc_die_sender_pid == 0 ? 0 :
+                                                       errhint("Signal sent by 
PID %d, UID %d.",
+                                                               
proc_die_sender_pid, proc_die_sender_uid)
+                                               ));
```

syncrpe.c uses errhint to print PID and UID, and postgres.c uses errdetail. We 
should keep consistency, maybe all use errhint.

3.
```
@@ -319,7 +323,11 @@ SyncRepWaitForLSN(XLogRecPtr lsn, bool commit)
                        QueryCancelPending = false;
                        ereport(WARNING,
                                        (errmsg("canceling wait for synchronous 
replication due to user request"),
-                                        errdetail("The transaction has already 
committed locally, but might not have been replicated to the standby.")));
+                                        errdetail("The transaction has already 
committed locally, but might not have been replicated to the standby."),
+                                               proc_die_sender_pid == 0 ? 0 :
+                                                       errhint("Signal sent by 
PID %d, UID %d.",
+                                                               
proc_die_sender_pid, proc_die_sender_uid)
+                                               ));
                        SyncRepCancelWait();
                        break;
                }
```

I don’t think the query cancel case relates to SIGTERM, so we don’t need to log 
PID and UID here.

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/






Reply via email to