Re: Tier 1 SMP Code Review: glibc Signal, RPC, and Threading Subsystems

Samuel Thibault Sat, 07 Mar 2026 16:06:43 -0800

[email protected], on Tue, 03 Mar 2026 04:08:15 +0000, wrote:
> #### S-3: `_hurdsig_traced` read/written without synchronization (LOW)
> 
> **Files**: `hurdsig.c:793`, `hurdmsg.c:249`
> 
> `_hurdsig_traced` is a `sigset_t` read by the signal thread and written by the
> msg handler without any lock or atomic. On x86 `sigset_t` is `unsigned long`
> (word-sized, atomic), so no tearing. Stale values only cause a missed trace.


Yes, and even with tearing, that's really no big deal.

> #### S-4: `_hurd_stopped`/`_hurd_orphaned`/`_hurd_pgrp` accessed without locks (LOW)
> 
> **Files**: `hurdsig.c:44,650,687,796,847,882,887,900`
> 
> These globals are read in signal delivery without locks, written by proc
> server notification handlers. All are word-sized (atomic on x86).
> `_hurd_stopped` is
> only accessed by the single signal thread. Stale values cause minor behavioral
> differences, not crashes.

_hurd_stopped being accessed only by the signal thread means there is no
need for any locking...

_hurd_orphaned/ppid/pgrp are only changed in _S_msg_proc_newids, which is
asynchronous anyway, or in the child when forking, where there is no
concurrency any more.

> #### T-1: No explicit memory barriers before `thread_resume` (LOW on x86)
> 
> **File**: `trampoline.c:241-368`
> 
> The signal thread writes the stack frame, then calls `thread_set_state` +
> `thread_resume`. On x86 TSO, all stores are visible before the syscall. On
> weaker architectures, the target could wake up and see an incomplete stack 
> frame.

? The target cannot wake up before we make the syscall, and the syscall
would flush as needed.

> #### P-1: rwlock writer starvation (MEDIUM -- design issue)
> 
> **File**: `sysdeps/htl/pt-rwlock-timedrdlock.c:50-57`
> 
> New readers are admitted immediately when `__readers > 0`, even if writers are
> waiting. No mechanism blocks new readers when writers are queued. Under SMP
> workloads with frequent reads, writers can starve indefinitely.

> #### POSIX-7: rwlock does not check for writer starvation (LOW -- impl-defined)
> 
> **POSIX** (`pthread_rwlock_rdlock`, DESCRIPTION):
> "The calling thread acquires the read lock if a writer does not hold the
> lock and there are no writers blocked on the lock." (Though this is softened
> to "implementation-defined" when TPS is unsupported.)
> 
> **Implementation** (`sysdeps/htl/pt-rwlock-timedrdlock.c:51-58`): New
> readers are always admitted when `__readers > 0`, regardless of waiting
> writers. This allows writer starvation.
> 
> ### Synchronization Primitives
> 
> #### POSIX-6: `pthread_cond_broadcast` is not a single atomic operation (MEDIUM)
> 
> **POSIX** (`pthread_cond_broadcast`, DESCRIPTION;
> Issue 8, Defect 609): "The pthread_cond_broadcast() function shall, as a
> single atomic operation, determine which threads, if any, are blocked on
> the specified condition variable cond and unblock all of these threads."
> 
> **Implementation** (`sysdeps/htl/pt-cond-brdcast.c:29-38`): Drops and
> re-acquires the condvar spinlock between each individual dequeue+wakeup.
> New waiters arriving mid-broadcast could be woken by the same broadcast.
> The set of unblocked threads is not determined atomically.

As written on the contributing page, we'd want to rewrite the rwlock and
barrier implementations to use gsync, possibly even just taking the nptl
implementation.
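
To illustrate the POSIX wording quoted above ("no writers blocked on the
lock"), here is a minimal writer-preferring sketch built on a plain mutex
and condition variable. It is neither the htl code nor a gsync- or
nptl-based rewrite; `struct wp_rwlock` and the `wp_*` functions are
hypothetical names made up for this example.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical type for illustration only; not the htl or nptl layout.  */
struct wp_rwlock
{
  pthread_mutex_t mtx;
  pthread_cond_t cond;
  int readers;          /* Threads currently holding the read lock.  */
  int writers_waiting;  /* Writers blocked in wp_wrlock.  */
  bool writer_active;   /* True while a writer holds the lock.  */
};

#define WP_RWLOCK_INIT \
  { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0, 0, false }

/* Non-blocking read lock.  Refuses new readers as soon as a writer is
   queued, which is what prevents writer starvation.  */
bool
wp_tryrdlock (struct wp_rwlock *l)
{
  bool ok;
  pthread_mutex_lock (&l->mtx);
  ok = !l->writer_active && l->writers_waiting == 0;
  if (ok)
    l->readers++;
  pthread_mutex_unlock (&l->mtx);
  return ok;
}

/* Blocking read lock: waits while a writer holds or is waiting for the
   lock.  */
void
wp_rdlock (struct wp_rwlock *l)
{
  pthread_mutex_lock (&l->mtx);
  while (l->writer_active || l->writers_waiting > 0)
    pthread_cond_wait (&l->cond, &l->mtx);
  l->readers++;
  pthread_mutex_unlock (&l->mtx);
}

/* Write lock: announces itself first so that new readers stay out, then
   waits for current readers and any active writer to leave.  */
void
wp_wrlock (struct wp_rwlock *l)
{
  pthread_mutex_lock (&l->mtx);
  l->writers_waiting++;
  while (l->writer_active || l->readers > 0)
    pthread_cond_wait (&l->cond, &l->mtx);
  l->writers_waiting--;
  l->writer_active = true;
  pthread_mutex_unlock (&l->mtx);
}

/* Releases either kind of lock and wakes every waiter to re-check.  */
void
wp_unlock (struct wp_rwlock *l)
{
  pthread_mutex_lock (&l->mtx);
  if (l->writer_active)
    l->writer_active = false;
  else
    l->readers--;
  pthread_cond_broadcast (&l->cond);
  pthread_mutex_unlock (&l->mtx);
}
```

The trade-off is the mirror image of the reported issue: a steady stream of
writers can now starve readers, which is one reason POSIX leaves the policy
implementation-defined when TPS is unsupported.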

> #### POSIX-2: `pthread_sigmask` does not deliver pending signals before return (HIGH)
> 
> **POSIX** (`pthread_sigmask`, DESCRIPTION):
> "after pthread_sigmask() changes the currently blocked set of signals it shall
> determine whether there are any pending unblocked signals; if there are any,
> then at least one of those signals shall be delivered before the call to
> pthread_sigmask() returns."
> 
> **Implementation** (`sysdeps/mach/hurd/sigthreadmask.c:77-83`): Sends
> `__msg_sig_post` to the signal thread and returns immediately without waiting
> for delivery. The signal is delivered asynchronously.

? __msg_sig_post will wait for the RPC answer. It's the same with
raise(), which does get interrupted as expected:

#0  handle (sig=30) at test.c:3
#1  <signal handler called>
#2  0x0000000101077cac in __GI___mach_msg_trap () at ./build-tree/hurd-amd64-libc/mach/mach_msg_trap.S:2
#3  0x000000010107838d in __GI___mach_msg (msg=0x0, msg@entry=0x10103faa0, option=28, option@entry=3, send_size=send_size@entry=80,
    rcv_size=rcv_size@entry=48, rcv_name=28, timeout=timeout@entry=0, notify=0) at ./mach/msg.c:111
#4  0x000000010132e437 in __msg_sig_post (process=<optimized out>, signal=<optimized out>, sigcode=sigcode@entry=0,
    refport=<optimized out>) at ./build-tree/hurd-amd64-libc/hurd/RPC_msg_sig_post.c:157
#5  0x00000001010defde in kill_port (msgport=<optimized out>, refport=<optimized out>) at ../sysdeps/mach/hurd/kill.c:77
#6  kill_pid (pid=pid@entry=881) at ../sysdeps/mach/hurd/kill.c:114
#7  0x00000001010df2f7 in __GI___kill (pid=881, sig=<optimized out>) at ../sysdeps/mach/hurd/kill.c:148
#8  0x00000001000006e1 in main () at test.c:6
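
For anyone wanting to check the `pthread_sigmask` case directly, a minimal
sketch of such a test could look like the following. It is not the test.c
used for the backtrace above; the helper name
`pending_delivered_before_return` and the choice of SIGUSR1 are assumptions
made for this example. It blocks the signal, raises it so that it becomes
pending, then unblocks it and checks whether the handler ran before
`pthread_sigmask` returned.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <signal.h>
#include <stdbool.h>
#include <string.h>

/* Set by the handler; sig_atomic_t so the access is async-signal-safe.  */
static volatile sig_atomic_t handled;

static void
handle (int sig)
{
  (void) sig;
  handled = 1;
}

/* Hypothetical helper: returns true if a pending SIGUSR1 was delivered
   before pthread_sigmask (SIG_UNBLOCK) returned.  */
bool
pending_delivered_before_return (void)
{
  struct sigaction sa, old_sa;
  sigset_t block, old_mask;
  bool before, after;

  memset (&sa, 0, sizeof sa);
  sa.sa_handler = handle;
  sigemptyset (&sa.sa_mask);
  sigaction (SIGUSR1, &sa, &old_sa);

  /* Block SIGUSR1 so that raising it only marks it pending.  */
  sigemptyset (&block);
  sigaddset (&block, SIGUSR1);
  pthread_sigmask (SIG_BLOCK, &block, &old_mask);

  handled = 0;
  raise (SIGUSR1);
  before = handled != 0;   /* Expected false: still blocked.  */

  /* POSIX requires at least one pending unblocked signal to be delivered
     before this call returns.  */
  pthread_sigmask (SIG_UNBLOCK, &block, NULL);
  after = handled != 0;

  /* Restore the previous mask and disposition.  */
  pthread_sigmask (SIG_SETMASK, &old_mask, NULL);
  sigaction (SIGUSR1, &old_sa, NULL);

  return !before && after;
}
```

Calling `pending_delivered_before_return ()` from `main` and printing the
result is enough to compare behaviour across systems.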

> #### POSIX-8: Robust mutex ENOTRECOVERABLE check appears buggy (HIGH)
> 
> **POSIX** (`pthread_mutex_lock`, ERRORS):
> "[ENOTRECOVERABLE] The state protected by the mutex is not recoverable."
> Also (`pthread_mutexattr_setrobust`, DESCRIPTION):
> describes the protocol for marking a mutex as not recoverable.
> 
> **Implementation** (`sysdeps/mach/hurd/htl/pt-mutex.h`, ROBUST_LOCK macro):
> ```c
> if (mtxp->__owner_id == ENOTRECOVERABLE)
> ```

Ah, indeed, a typo.
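
Independently of what the corrected comparison in ROBUST_LOCK should be, the
user-visible protocol that POSIX describes can be exercised with a small
program. The sketch below is an assumption-laden example, not code from
glibc: `robust_not_recoverable_demo` and `die_holding_lock` are hypothetical
names. A thread exits while holding a robust mutex, the next locker gets
EOWNERDEAD, unlocks without calling `pthread_mutex_consistent`, and a later
lock attempt is then expected to fail with ENOTRECOVERABLE.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>

/* Thread body: take the mutex and exit without unlocking it.  */
static void *
die_holding_lock (void *arg)
{
  pthread_mutex_lock (arg);
  return NULL;
}

/* Hypothetical helper: stores the results of the two lock attempts in
   *FIRST and *SECOND.  */
void
robust_not_recoverable_demo (int *first, int *second)
{
  pthread_mutexattr_t attr;
  pthread_mutex_t mtx;
  pthread_t thr;

  pthread_mutexattr_init (&attr);
  pthread_mutexattr_setrobust (&attr, PTHREAD_MUTEX_ROBUST);
  pthread_mutex_init (&mtx, &attr);
  pthread_mutexattr_destroy (&attr);

  /* The owner dies while holding the lock.  */
  pthread_create (&thr, NULL, die_holding_lock, &mtx);
  pthread_join (thr, NULL);

  /* The next locker acquires the mutex but is told the owner died.  */
  *first = pthread_mutex_lock (&mtx);

  /* Unlocking without pthread_mutex_consistent marks the mutex as
     permanently unusable.  */
  pthread_mutex_unlock (&mtx);

  /* Any later lock attempt should now report ENOTRECOVERABLE.  */
  *second = pthread_mutex_lock (&mtx);

  pthread_mutex_destroy (&mtx);
}
```

On a conforming implementation `*first` is EOWNERDEAD and `*second` is
ENOTRECOVERABLE, so a test like this would exercise the check in question.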

Samuel
