CV-Bowen opened a new pull request, #17783:
URL: https://github.com/apache/nuttx/pull/17783
## Summary
This PR addresses critical race condition and deadlock issues in the rpmsg
subsystem and adds a new read-write semaphore API for lock downgrading.
### Race Condition Details:
1. Thread A (the rpmsg RX thread) calls `rpmsg_ns_bind()`, takes
`g_rpmsg_lock`, and searches `g_rpmsg_cb`. The rpmsg service has not called
`rpmsg_register_callback()` yet, so its `cb` is not in `g_rpmsg_cb`;
2. Thread A releases `g_rpmsg_lock` after searching `g_rpmsg_cb`;
3. Thread B preempts Thread A and calls `rpmsg_register_callback()`, which adds
`cb` to `g_rpmsg_cb` and then searches the `rpmsg->bind` list for the bind node,
but the bind node has not been added yet;
4. Thread A resumes, allocates a bind node, and adds it to the `rpmsg->bind`
list;
As a result, the rpmsg service's `ns_bind()` is missed and is never called.
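
The interleaving above can be replayed deterministically with a small model. This is only a sketch: each list is reduced to a single flag, and every `demo_*` name is hypothetical rather than taken from the NuttX sources. It is meant to show why the ordering matters, not how the rpmsg code is actually written.

```c
#include <stdbool.h>

/* Simplified, hypothetical model: one flag per list instead of real lists. */
struct demo_state
{
  bool cb_registered;   /* stands in for "cb is in g_rpmsg_cb" */
  bool bind_node_added; /* stands in for "bind node is in rpmsg->bind" */
  int  ns_bind_calls;   /* how many times the service was notified */
};

/* Thread A, steps 1-2: search the callback list, then drop the lock. */
bool demo_rx_search_cb(struct demo_state *s)
{
  if (s->cb_registered)
    {
      s->ns_bind_calls++;
      return true;
    }

  return false;
}

/* Thread A, step 4: remember the bind request that found no callback. */
void demo_rx_add_bind_node(struct demo_state *s)
{
  s->bind_node_added = true;
}

/* Thread B, step 3: register the callback, then look for a pending bind. */
void demo_register_callback(struct demo_state *s)
{
  s->cb_registered = true;
  if (s->bind_node_added)
    {
      s->ns_bind_calls++;
    }
}

/* Steps 1-4 in the racy order: B runs between A's search and A's insert. */
int demo_racy_interleaving(void)
{
  struct demo_state s = { false, false, 0 };
  bool matched = demo_rx_search_cb(&s);

  demo_register_callback(&s);
  if (!matched)
    {
      demo_rx_add_bind_node(&s);
    }

  return s.ns_bind_calls;
}

/* Same steps, but A's search and insert are not split by B. */
int demo_serialized_interleaving(void)
{
  struct demo_state s = { false, false, 0 };

  if (!demo_rx_search_cb(&s))
    {
      demo_rx_add_bind_node(&s);
    }

  demo_register_callback(&s);
  return s.ns_bind_calls;
}
```

In this model `demo_racy_interleaving()` leaves the notification count at zero, while `demo_serialized_interleaving()` notifies the service exactly once.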
### Changes included:
1. **Fix rpmsg deadlock and ns_bind race condition** (commit 2e2524c)
- Resolves a race condition in which a callback is registered after
`rpmsg_ns_bind()` has traversed `g_rpmsg_cb` but before the bind node is added
to the `rpmsg->bind` list; neither path sees the other's entry, so
`rpmsg_ns_bind()` misses the callback and the rpmsg service is never notified
- Implements proper lock downgrading to prevent deadlock scenarios in the
rpmsg callback registration path
2. **Add downgrade_write API for read-write semaphores** (commit 9ff3c1e)
- Introduces `downgrade_write()` function that atomically converts a
write lock to a read lock
- This API is essential for scenarios where code needs to downgrade from
exclusive access to shared access without releasing the lock entirely (which
could allow race conditions)
- Used in `rpmsg_register_callback()` and `rpmsg_unregister_callback()`
to safely transition from write lock to read lock while iterating over rpmsg
devices
The fix uses `downgrade_write()` to atomically convert the write lock to a
read lock after modifying the callback list, allowing safe concurrent reads
while preventing the race condition.
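
To make the downgrade semantics concrete, here is a minimal user-space sketch of a downgradable read-write lock built on a pthread mutex and condition variable. It is an illustration under stated assumptions, not the NuttX implementation: the `demo_rwsem` type and all `demo_*` functions are hypothetical names invented for this example. The key point is that the writer flag is cleared and the reader count is raised under the same internal mutex, so no other writer can slip in between.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a read-write semaphore; not the NuttX type. */
struct demo_rwsem
{
  pthread_mutex_t lock;    /* protects the two fields below */
  pthread_cond_t  cond;    /* signalled whenever the lock state changes */
  int             readers; /* number of active readers */
  bool            writer;  /* true while a writer holds the lock */
};

void demo_init(struct demo_rwsem *s)
{
  pthread_mutex_init(&s->lock, NULL);
  pthread_cond_init(&s->cond, NULL);
  s->readers = 0;
  s->writer  = false;
}

/* Block until exclusive access is available. */
void demo_down_write(struct demo_rwsem *s)
{
  pthread_mutex_lock(&s->lock);
  while (s->writer || s->readers > 0)
    {
      pthread_cond_wait(&s->cond, &s->lock);
    }

  s->writer = true;
  pthread_mutex_unlock(&s->lock);
}

/* Non-blocking attempts, handy for observing the lock state. */
bool demo_down_write_trylock(struct demo_rwsem *s)
{
  bool ok;

  pthread_mutex_lock(&s->lock);
  ok = !s->writer && s->readers == 0;
  if (ok)
    {
      s->writer = true;
    }

  pthread_mutex_unlock(&s->lock);
  return ok;
}

bool demo_down_read_trylock(struct demo_rwsem *s)
{
  bool ok;

  pthread_mutex_lock(&s->lock);
  ok = !s->writer;
  if (ok)
    {
      s->readers++;
    }

  pthread_mutex_unlock(&s->lock);
  return ok;
}

/* Convert the held write lock into a read lock in one atomic step. */
void demo_downgrade_write(struct demo_rwsem *s)
{
  pthread_mutex_lock(&s->lock);
  assert(s->writer);
  s->writer = false;
  s->readers++;
  pthread_cond_broadcast(&s->cond); /* let blocked waiters re-check state */
  pthread_mutex_unlock(&s->lock);
}

void demo_up_read(struct demo_rwsem *s)
{
  pthread_mutex_lock(&s->lock);
  assert(s->readers > 0);
  s->readers--;
  pthread_cond_broadcast(&s->cond);
  pthread_mutex_unlock(&s->lock);
}
```

A caller following the pattern described above would take the write lock, modify the shared list, call the downgrade function, iterate under the read lock, and finish with the read-side release. Releasing the write lock and then taking a read lock separately would reopen the window that the downgrade is meant to close.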
## Impact
Affects the rpmsg service register/unregister and bind/unbind paths.
## Testing
qemu-armv8a:rpserver and qemu-armv8a:rpserver test
```
❯ qemu-system-aarch64 -cpu cortex-a53 -nographic \
-machine virt,virtualization=on,gic-version=3 \
-chardev stdio,id=con,mux=on -serial chardev:con \
-object memory-backend-file,discard-data=on,id=shmmem-shmem0,mem-path=/dev/shm/my_shmem0,size=4194304,share=yes \
-device ivshmem-plain,id=shmem0,memdev=shmmem-shmem0,addr=0xb \
-device virtio-serial-device,bus=virtio-mmio-bus.0 \
-chardev socket,path=/tmp/rpmsg_port_uart_socket,server=on,wait=off,id=foo \
-device virtconsole,chardev=foo \
-mon chardev=con,mode=readline -kernel ./nuttx/cmake_out/v8a_server/nuttx \
-gdb tcp::7775
[ 0.000000] [ 0] [ INFO] [server] pci_register_rptun_ivshmem_driver: Register ivshmem driver, id=0, cpuname=proxy, master=1
[ 0.000000] [ 3] [ INFO] [server] pci_scan_bus: pci_scan_bus for bus 0
[ 0.000000] [ 3] [ INFO] [server] pci_scan_bus: class = 00000600, hdr_type = 00000000
[ 0.000000] [ 3] [ INFO] [server] pci_scan_bus: 00:00 [1b36:0008]
[ 0.000000] [ 3] [ INFO] [server] pci_setup_device: pbar0 set bad mask
[ 0.000000] [ 3] [ INFO] [server] pci_setup_device: pbar1 set bad mask
[ 0.000000] [ 3] [ INFO] [server] pci_setup_device: pbar2 set bad mask
[ 0.000000] [ 3] [ INFO] [server] pci_setup_device: pbar3 set bad mask
[ 0.000000] [ 3] [ INFO] [server] pci_setup_device: pbar4 set bad mask
[ 0.000000] [ 3] [ INFO] [server] pci_setup_device: pbar5 set bad mask
[ 0.000000] [ 3] [ INFO] [server] pci_scan_bus: class = 00000200, hdr_type = 00000000
[ 0.000000] [ 3] [ INFO] [server] pci_scan_bus: 00:08 [1af4:1000]
[ 0.000000] [ 3] [ INFO] [server] pci_setup_device: pbar0: mask64=fffffffe 32bytes
[ 0.000000] [ 3] [ INFO] [server] pci_setup_device: pbar1: mask64=fffffff0 4096bytes
[ 0.000000] [ 3] [ INFO] [server] pci_setup_device: pbar2 set bad mask
[ 0.000000] [ 3] [ INFO] [server] pci_setup_device: pbar3 set bad mask
[ 0.000000] [ 3] [ INFO] [server] pci_setup_device: pbar4: mask64=fffffffffffffff0 16384bytes
[ 0.000000] [ 3] [ INFO] [server] pci_scan_bus: class = 00000500, hdr_type = 00000000
[ 0.000000] [ 3] [ INFO] [server] pci_scan_bus: 00:58 [1af4:1110]
[ 0.000000] [ 3] [ INFO] [server] pci_setup_device: pbar0: mask64=fffffff0 256bytes
[ 0.000000] [ 3] [ INFO] [server] pci_setup_device: pbar1 set bad mask
[ 0.000000] [ 3] [ INFO] [server] pci_setup_device: pbar2: mask64=fffffffffffffff0 4194304bytes
[ 0.000000] [ 3] [ INFO] [server] pci_setup_device: pbar4 set bad mask
[ 0.000000] [ 3] [ INFO] [server] pci_setup_device: pbar5 set bad mask
[ 0.000000] [ 3] [ INFO] [server] ivshmem_probe: shmem addr=0x10400000 size=4194304 reg=0x10008000
[ 0.000000] [ 3] [ INFO] [server] rptun_ivshmem_probe: shmem addr=0x10400000 size=4194304
NuttShell (NSH) NuttX-12.10.0
server> [ 0.000000] [ 0] [ INFO] [proxy] pci_register_rptun_ivshmem_driver: Register ivshmem driver, id=0, cpuname=server, master=0
[ 0.000000] [ 3] [ INFO] [proxy] pci_scan_bus: pci_scan_bus for bus 0
[ 0.000000] [ 3] [ INFO] [proxy] pci_scan_bus: class = 00000600, hdr_type = 00000000
[ 0.000000] [ 3] [ INFO] [proxy] pci_scan_bus: 00:00 [1b36:0008]
[ 0.000000] [ 3] [ INFO] [proxy] pci_setup_device: pbar0 set bad mask
[ 0.000000] [ 3] [ INFO] [proxy] pci_setup_device: pbar1 set bad mask
[ 0.000000] [ 3] [ INFO] [proxy] pci_setup_device: pbar2 set bad mask
[ 0.000000] [ 3] [ INFO] [proxy] pci_setup_device: pbar3 set bad mask
[ 0.000000] [ 3] [ INFO] [proxy] pci_setup_device: pbar4 set bad mask
[ 0.000000] [ 3] [ INFO] [proxy] pci_setup_device: pbar5 set bad mask
[ 0.000000] [ 3] [ INFO] [proxy] pci_scan_bus: class = 00000200, hdr_type = 00000000
[ 0.000000] [ 3] [ INFO] [proxy] pci_scan_bus: 00:08 [1af4:1000]
[ 0.000000] [ 3] [ INFO] [proxy] pci_setup_device: pbar0: mask64=fffffffe 32bytes
[ 0.000000] [ 3] [ INFO] [proxy] pci_setup_device: pbar1: mask64=fffffff0 4096bytes
[ 0.000000] [ 3] [ INFO] [proxy] pci_setup_device: pbar2 set bad mask
[ 0.000000] [ 3] [ INFO] [proxy] pci_setup_device: pbar3 set bad mask
[ 0.000000] [ 3] [ INFO] [proxy] pci_setup_device: pbar4: mask64=fffffffffffffff0 16384bytes
[ 0.000000] [ 3] [ INFO] [proxy] pci_scan_bus: class = 00000500, hdr_type = 00000000
[ 0.000000] [ 3] [ INFO] [proxy] pci_scan_bus: 00:58 [1af4:1110]
[ 0.000000] [ 3] [ INFO] [proxy] pci_setup_device: pbar0: mask64=fffffff0 256bytes
[ 0.000000] [ 3] [ INFO] [proxy] pci_setup_device: pbar1 set bad mask
[ 0.000000] [ 3] [ INFO] [proxy] pci_setup_device: pbar2: mask64=fffffffffffffff0 4194304bytes
[ 0.000000] [ 3] [ INFO] [proxy] pci_setup_device: pbar4 set bad mask
[ 0.000000] [ 3] [ INFO] [proxy] pci_setup_device: pbar5 set bad mask
[ 0.000000] [ 3] [ INFO] [proxy] ivshmem_probe: shmem addr=0x10400000 size=4194304 reg=0x10008000
[ 0.000000] [ 3] [ INFO] [proxy] rptun_ivshmem_probe: shmem addr=0x10400000 size=4194304
[ 0.000000] [ 3] [ INFO] [proxy] rptun_ivshmem_probe: Start the wdog
server>
server> rptun ping all 1 1 1 1
[ 0.000000] [ 7] [ EMERG] [server] ping times: 1
[ 0.000000] [ 7] [ EMERG] [server] buffer_len: 2032, send_len: 17
[ 0.000000] [ 7] [ EMERG] [server] avg: 0 s, 24833808 ns
[ 0.000000] [ 7] [ EMERG] [server] min: 0 s, 24833808 ns
[ 0.000000] [ 7] [ EMERG] [server] max: 0 s, 24833808 ns
[ 0.000000] [ 7] [ EMERG] [server] rate: 0.005476 Mbits/sec
server> rpmsg ping all 1 1 1 1
[ 0.000000] [ 7] [ EMERG] [server] ping times: 1
[ 0.000000] [ 7] [ EMERG] [server] buffer_len: 2024, send_len: 17
[ 0.000000] [ 7] [ EMERG] [server] avg: 0 s, 10686464 ns
[ 0.000000] [ 7] [ EMERG] [server] min: 0 s, 10686464 ns
[ 0.000000] [ 7] [ EMERG] [server] max: 0 s, 10686464 ns
[ 0.000000] [ 7] [ EMERG] [server] rate: 0.012726 Mbits/sec
server> rptun dump all
[ 0.000000] [ 7] [ EMERG] [server] Remote: proxy headrx 8
[ 0.000000] [ 7] [ EMERG] [server] Dump rpmsg info between cpu (master: yes)server <==> proxy:
[ 0.000000] [ 7] [ EMERG] [server] rpmsg vq RX:
[ 0.000000] [ 7] [ EMERG] [server] rpmsg vq TX:
[ 0.000000] [ 7] [ EMERG] [server] rpmsg ept list:
[ 0.000000] [ 7] [ EMERG] [server] ept NS
[ 0.000000] [ 7] [ EMERG] [server] ept rpmsg-sensor
[ 0.000000] [ 7] [ EMERG] [server] ept rpmsg-ping
[ 0.000000] [ 7] [ EMERG] [server] ept rpmsg-syslog
[ 0.000000] [ 7] [ EMERG] [server] rpmsg buffer list:
[ 0.000000] [ 7] [ EMERG] [server] RX buffer, total 8, pending 0
[ 0.000000] [ 7] [ EMERG] [server] TX buffer, total 8, pending 0
server> rpmsg dump all
[ 0.000000] [ 7] [ EMERG] [server] Remote: proxy2 state: 1
[ 0.000000] [ 7] [ EMERG] [server] ept NS
[ 0.000000] [ 7] [ EMERG] [server] ept rpmsg-sensor
[ 0.000000] [ 7] [ EMERG] [server] ept rpmsg-ping
[ 0.000000] [ 7] [ EMERG] [server] rpmsg_port queue RX: {used: 0, avail: 8}
[ 0.000000] [ 7] [ EMERG] [server] rpmsg buffer list:
[ 0.000000] [ 7] [ EMERG] [server] rpmsg_port queue TX: {used: 0, avail: 8}
[ 0.000000] [ 7] [ EMERG] [server] rpmsg buffer list:
server> uname -a
NuttX server 12.10.0 224eeb17e3b Jan 7 2026 10:48:30 arm64 qemu-armv8a
server>
```