CV-Bowen opened a new pull request, #17783:
URL: https://github.com/apache/nuttx/pull/17783

   ## Summary
   
   This PR addresses critical race condition and deadlock issues in the rpmsg 
subsystem and adds a new read-write semaphore API for lock downgrading.
   
   ### Race Condition Details:
   1. Thread A (the rpmsg RX thread) calls `rpmsg_ns_bind()`, takes `g_rpmsg_lock`, and searches `g_rpmsg_cb`; the rpmsg service has not called `rpmsg_register_callback()` yet, so its `cb` is not in `g_rpmsg_cb`;
   2. Thread A releases `g_rpmsg_lock` after searching `g_rpmsg_cb`;
   3. Thread B preempts Thread A, calls `rpmsg_register_callback()`, adds the `cb` to `g_rpmsg_cb`, and searches the `rpmsg->bind` list for a matching bind node, but the bind node has not been added yet;
   4. Thread A continues, allocates a bind node, and adds it to the `rpmsg->bind` list.
   
   As a result, the rpmsg service's `ns_bind()` is missed and will never be called.
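   
   To make the interleaving easier to follow, here is a minimal, self-contained model of the two code paths. It is not the NuttX source; the mutex, the two booleans, and the thread bodies are stand-ins for `g_rpmsg_lock`, the `g_rpmsg_cb` and `rpmsg->bind` lists, and the real functions:
   
   ```c
   /* Minimal model of the race: one mutex plays the role of g_rpmsg_lock and
    * two booleans stand in for the g_rpmsg_cb and rpmsg->bind lists.  This is
    * NOT the NuttX source; the names and thread bodies are illustrative only.
    */
   
   #include <pthread.h>
   #include <stdbool.h>
   #include <stdio.h>
   
   static pthread_mutex_t g_lock = PTHREAD_MUTEX_INITIALIZER;
   static bool g_cb_registered;   /* "cb is in g_rpmsg_cb"         */
   static bool g_bind_pending;    /* "bind node is in rpmsg->bind" */
   
   static void *rx_thread(void *arg)          /* simplified rpmsg_ns_bind() */
   {
     (void)arg;
   
     pthread_mutex_lock(&g_lock);
     bool found = g_cb_registered;            /* step 1: cb not registered yet */
     pthread_mutex_unlock(&g_lock);           /* step 2: lock released         */
   
     if (!found)
       {
         /* step 4: the bind node is queued only after Thread B has already
          * scanned the (still empty) bind list in step 3, so the service is
          * never told about this ns_bind.
          */
   
         pthread_mutex_lock(&g_lock);
         g_bind_pending = true;
         pthread_mutex_unlock(&g_lock);
       }
   
     return NULL;
   }
   
   static void *register_thread(void *arg)    /* simplified rpmsg_register_callback() */
   {
     (void)arg;
   
     pthread_mutex_lock(&g_lock);
     g_cb_registered = true;                  /* step 3: add cb ...          */
     bool pending = g_bind_pending;           /* ... and scan the bind list  */
     pthread_mutex_unlock(&g_lock);
   
     if (pending)
       {
         printf("ns_bind delivered\n");       /* never printed under the 1-4 schedule */
       }
   
     return NULL;
   }
   
   int main(void)
   {
     pthread_t a;
     pthread_t b;
   
     pthread_create(&a, NULL, rx_thread, NULL);
     pthread_create(&b, NULL, register_thread, NULL);
     pthread_join(a, NULL);
     pthread_join(b, NULL);
     return 0;
   }
   ```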
   
   ### Changes included:
   1. **Fix rpmsg deadlock and ns_bind race condition** (commit 2e2524c)
      - Resolves the race where a `ns_bind` message arrives before the callback node has been added to the `g_rpmsg_cb` list, so `rpmsg_ns_bind()` misses the callback and fails to notify the rpmsg service
      - Implements proper lock downgrading to prevent deadlock scenarios in the 
rpmsg callback registration path
   
   2. **Add downgrade_write API for read-write semaphores** (commit 9ff3c1e)
      - Introduces `downgrade_write()` function that atomically converts a 
write lock to a read lock
      - This API is essential for scenarios where code needs to downgrade from 
exclusive access to shared access without releasing the lock entirely (which 
could allow race conditions)
      - Used in `rpmsg_register_callback()` and `rpmsg_unregister_callback()` to safely transition from the write lock to a read lock while iterating over the rpmsg devices (see the sketch after this list)
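   
   To illustrate what the atomic downgrade buys (and why releasing the write lock and re-acquiring a read lock would not be equivalent), here is a minimal toy read-write semaphore built on pthreads. It is not the NuttX rwsem implementation; only the `downgrade_write()` semantics it demonstrates correspond to the new API:
   
   ```c
   /* Toy read-write semaphore with an atomic downgrade, for illustration only
    * (NOT the NuttX implementation).  The key property of downgrade_write():
    * the holder turns from writer into reader inside one critical section, so
    * the semaphore is never observably free in between and no other writer
    * can acquire it during the transition.
    */
   
   #include <pthread.h>
   #include <stdio.h>
   
   struct toy_rwsem
   {
     pthread_mutex_t lock;
     pthread_cond_t  cond;
     int             readers;  /* number of active readers  */
     int             writer;   /* 1 while held for writing  */
   };
   
   #define TOY_RWSEM_INITIALIZER \
     { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0, 0 }
   
   static void toy_down_write(struct toy_rwsem *rw)
   {
     pthread_mutex_lock(&rw->lock);
     while (rw->writer || rw->readers > 0)
       {
         pthread_cond_wait(&rw->cond, &rw->lock);
       }
   
     rw->writer = 1;
     pthread_mutex_unlock(&rw->lock);
   }
   
   static void toy_downgrade_write(struct toy_rwsem *rw)
   {
     pthread_mutex_lock(&rw->lock);
     rw->writer  = 0;                    /* stop being a writer ...             */
     rw->readers = 1;                    /* ... and become a reader in one step */
     pthread_cond_broadcast(&rw->cond);  /* waiting writers re-check and block  */
     pthread_mutex_unlock(&rw->lock);
   }
   
   static void toy_up_read(struct toy_rwsem *rw)
   {
     pthread_mutex_lock(&rw->lock);
     if (--rw->readers == 0)
       {
         pthread_cond_broadcast(&rw->cond);
       }
   
     pthread_mutex_unlock(&rw->lock);
   }
   
   int main(void)
   {
     static struct toy_rwsem rw = TOY_RWSEM_INITIALIZER;
   
     toy_down_write(&rw);       /* exclusive: mutate the shared state  */
     toy_downgrade_write(&rw);  /* atomically switch to shared access  */
     toy_up_read(&rw);          /* finally release the read side       */
     printf("downgrade sketch done\n");
     return 0;
   }
   ```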
   
   The fix uses `downgrade_write()` to atomically convert the write lock to a 
read lock after modifying the callback list, allowing safe concurrent reads 
while preventing the race condition.
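   
   As a sketch of that pattern, reusing the toy semaphore from the previous snippet: the callback struct, the list, and `deliver_pending_binds()` are placeholders rather than the actual rpmsg sources; only the down_write -> downgrade_write -> up_read sequence mirrors the fix.
   
   ```c
   /* Registration-path sketch reusing struct toy_rwsem from the snippet above.
    * struct my_cb_s, g_cb_list and deliver_pending_binds() are placeholders
    * for the real rpmsg structures and device walk; only the locking pattern
    * (down_write -> downgrade_write -> up_read) mirrors the actual fix.
    */
   
   static struct toy_rwsem g_cb_lock = TOY_RWSEM_INITIALIZER;
   
   struct my_cb_s
   {
     struct my_cb_s *next;
   };
   
   static struct my_cb_s *g_cb_list;         /* stands in for g_rpmsg_cb */
   
   static void deliver_pending_binds(struct my_cb_s *cb)
   {
     (void)cb;  /* placeholder: walk the rpmsg devices, deliver queued binds */
   }
   
   void register_callback_sketch(struct my_cb_s *cb)
   {
     toy_down_write(&g_cb_lock);        /* exclusive while mutating the list */
     cb->next  = g_cb_list;
     g_cb_list = cb;
   
     toy_downgrade_write(&g_cb_lock);   /* writer -> reader with no gap      */
   
     /* Readers (e.g. a concurrent ns_bind lookup) may now run in parallel,
      * but no writer can slip in before the pending binds are delivered.
      */
   
     deliver_pending_binds(cb);
     toy_up_read(&g_cb_lock);
   }
   ```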
   
   ## Impact
   Rpmsg service register/unregister and bind/unbind process
   
   ## Testing
   qemu-armv8a:rpserver test (server instance plus a proxy peer over ivshmem)
   ```
   ❯ qemu-system-aarch64 -cpu cortex-a53 -nographic \
   -machine virt,virtualization=on,gic-version=3 \
   -chardev stdio,id=con,mux=on -serial chardev:con \
   -object 
memory-backend-file,discard-data=on,id=shmmem-shmem0,mem-path=/dev/shm/my_shmem0,size=4194304,share=yes
 \
   -device ivshmem-plain,id=shmem0,memdev=shmmem-shmem0,addr=0xb \
   -device virtio-serial-device,bus=virtio-mmio-bus.0 \
   -chardev socket,path=/tmp/rpmsg_port_uart_socket,server=on,wait=off,id=foo \
   -device virtconsole,chardev=foo \
   -mon chardev=con,mode=readline -kernel ./nuttx/cmake_out/v8a_server/nuttx \
   -gdb tcp::7775
   [    0.000000] [ 0] [  INFO] [server] pci_register_rptun_ivshmem_driver: 
Register ivshmem driver, id=0, cpuname=proxy, master=1
   [    0.000000] [ 3] [  INFO] [server] pci_scan_bus: pci_scan_bus for bus 0
   [    0.000000] [ 3] [  INFO] [server] pci_scan_bus: class = 00000600, 
hdr_type = 00000000
   [    0.000000] [ 3] [  INFO] [server] pci_scan_bus: 00:00 [1b36:0008]
   [    0.000000] [ 3] [  INFO] [server] pci_setup_device: pbar0 set bad mask
   [    0.000000] [ 3] [  INFO] [server] pci_setup_device: pbar1 set bad mask
   [    0.000000] [ 3] [  INFO] [server] pci_setup_device: pbar2 set bad mask
   [    0.000000] [ 3] [  INFO] [server] pci_setup_device: pbar3 set bad mask
   [    0.000000] [ 3] [  INFO] [server] pci_setup_device: pbar4 set bad mask
   [    0.000000] [ 3] [  INFO] [server] pci_setup_device: pbar5 set bad mask
   [    0.000000] [ 3] [  INFO] [server] pci_scan_bus: class = 00000200, 
hdr_type = 00000000
   [    0.000000] [ 3] [  INFO] [server] pci_scan_bus: 00:08 [1af4:1000]
   [    0.000000] [ 3] [  INFO] [server] pci_setup_device: pbar0: 
mask64=fffffffe 32bytes
   [    0.000000] [ 3] [  INFO] [server] pci_setup_device: pbar1: 
mask64=fffffff0 4096bytes
   [    0.000000] [ 3] [  INFO] [server] pci_setup_device: pbar2 set bad mask
   [    0.000000] [ 3] [  INFO] [server] pci_setup_device: pbar3 set bad mask
   [    0.000000] [ 3] [  INFO] [server] pci_setup_device: pbar4: 
mask64=fffffffffffffff0 16384bytes
   [    0.000000] [ 3] [  INFO] [server] pci_scan_bus: class = 00000500, 
hdr_type = 00000000
   [    0.000000] [ 3] [  INFO] [server] pci_scan_bus: 00:58 [1af4:1110]
   [    0.000000] [ 3] [  INFO] [server] pci_setup_device: pbar0: 
mask64=fffffff0 256bytes
   [    0.000000] [ 3] [  INFO] [server] pci_setup_device: pbar1 set bad mask
   [    0.000000] [ 3] [  INFO] [server] pci_setup_device: pbar2: 
mask64=fffffffffffffff0 4194304bytes
   [    0.000000] [ 3] [  INFO] [server] pci_setup_device: pbar4 set bad mask
   [    0.000000] [ 3] [  INFO] [server] pci_setup_device: pbar5 set bad mask
   [    0.000000] [ 3] [  INFO] [server] ivshmem_probe: shmem addr=0x10400000 
size=4194304 reg=0x10008000
   [    0.000000] [ 3] [  INFO] [server] rptun_ivshmem_probe: shmem 
addr=0x10400000 size=4194304
   
   NuttShell (NSH) NuttX-12.10.0
   server> [    0.000000] [ 0] [  INFO] [proxy] 
pci_register_rptun_ivshmem_driver: Register ivshmem driver, id=0, 
cpuname=server, master=0
   [    0.000000] [ 3] [  INFO] [proxy] pci_scan_bus: pci_scan_bus for bus 0
   [    0.000000] [ 3] [  INFO] [proxy] pci_scan_bus: class = 00000600, 
hdr_type = 00000000
   [    0.000000] [ 3] [  INFO] [proxy] pci_scan_bus: 00:00 [1b36:0008]
   [    0.000000] [ 3] [  INFO] [proxy] pci_setup_device: pbar0 set bad mask
   [    0.000000] [ 3] [  INFO] [proxy] pci_setup_device: pbar1 set bad mask
   [    0.000000] [ 3] [  INFO] [proxy] pci_setup_device: pbar2 set bad mask
   [    0.000000] [ 3] [  INFO] [proxy] pci_setup_device: pbar3 set bad mask
   [    0.000000] [ 3] [  INFO] [proxy] pci_setup_device: pbar4 set bad mask
   [    0.000000] [ 3] [  INFO] [proxy] pci_setup_device: pbar5 set bad mask
   [    0.000000] [ 3] [  INFO] [proxy] pci_scan_bus: class = 00000200, 
hdr_type = 00000000
   [    0.000000] [ 3] [  INFO] [proxy] pci_scan_bus: 00:08 [1af4:1000]
   [    0.000000] [ 3] [  INFO] [proxy] pci_setup_device: pbar0: 
mask64=fffffffe 32bytes
   [    0.000000] [ 3] [  INFO] [proxy] pci_setup_device: pbar1: 
mask64=fffffff0 4096bytes
   [    0.000000] [ 3] [  INFO] [proxy] pci_setup_device: pbar2 set bad mask
   [    0.000000] [ 3] [  INFO] [proxy] pci_setup_device: pbar3 set bad mask
   [    0.000000] [ 3] [  INFO] [proxy] pci_setup_device: pbar4: 
mask64=fffffffffffffff0 16384bytes
   [    0.000000] [ 3] [  INFO] [proxy] pci_scan_bus: class = 00000500, 
hdr_type = 00000000
   [    0.000000] [ 3] [  INFO] [proxy] pci_scan_bus: 00:58 [1af4:1110]
   [    0.000000] [ 3] [  INFO] [proxy] pci_setup_device: pbar0: 
mask64=fffffff0 256bytes
   [    0.000000] [ 3] [  INFO] [proxy] pci_setup_device: pbar1 set bad mask
   [    0.000000] [ 3] [  INFO] [proxy] pci_setup_device: pbar2: 
mask64=fffffffffffffff0 4194304bytes
   [    0.000000] [ 3] [  INFO] [proxy] pci_setup_device: pbar4 set bad mask
   [    0.000000] [ 3] [  INFO] [proxy] pci_setup_device: pbar5 set bad mask
   [    0.000000] [ 3] [  INFO] [proxy] ivshmem_probe: shmem addr=0x10400000 
size=4194304 reg=0x10008000
   [    0.000000] [ 3] [  INFO] [proxy] rptun_ivshmem_probe: shmem 
addr=0x10400000 size=4194304
   [    0.000000] [ 3] [  INFO] [proxy] rptun_ivshmem_probe: Start the wdog
   
   server> 
   server> rptun ping all 1 1 1 1
   [    0.000000] [ 7] [ EMERG] [server] ping times: 1
   [    0.000000] [ 7] [ EMERG] [server] buffer_len: 2032, send_len: 17
   [    0.000000] [ 7] [ EMERG] [server] avg: 0 s, 24833808 ns
   [    0.000000] [ 7] [ EMERG] [server] min: 0 s, 24833808 ns
   [    0.000000] [ 7] [ EMERG] [server] max: 0 s, 24833808 ns
   [    0.000000] [ 7] [ EMERG] [server] rate: 0.005476 Mbits/sec
   server> rpmsg ping all 1 1 1 1
   [    0.000000] [ 7] [ EMERG] [server] ping times: 1
   [    0.000000] [ 7] [ EMERG] [server] buffer_len: 2024, send_len: 17
   [    0.000000] [ 7] [ EMERG] [server] avg: 0 s, 10686464 ns
   [    0.000000] [ 7] [ EMERG] [server] min: 0 s, 10686464 ns
   [    0.000000] [ 7] [ EMERG] [server] max: 0 s, 10686464 ns
   [    0.000000] [ 7] [ EMERG] [server] rate: 0.012726 Mbits/sec
   server> rptun dump all
   [    0.000000] [ 7] [ EMERG] [server] Remote: proxy headrx 8
   [    0.000000] [ 7] [ EMERG] [server] Dump rpmsg info between cpu (master: 
yes)server <==> proxy:
   [    0.000000] [ 7] [ EMERG] [server] rpmsg vq RX:
   [    0.000000] [ 7] [ EMERG] [server] rpmsg vq TX:
   [    0.000000] [ 7] [ EMERG] [server]   rpmsg ept list:
   [    0.000000] [ 7] [ EMERG] [server]     ept NS
   [    0.000000] [ 7] [ EMERG] [server]     ept rpmsg-sensor
   [    0.000000] [ 7] [ EMERG] [server]     ept rpmsg-ping
   [    0.000000] [ 7] [ EMERG] [server]     ept rpmsg-syslog
   [    0.000000] [ 7] [ EMERG] [server]   rpmsg buffer list:
   [    0.000000] [ 7] [ EMERG] [server]     RX buffer, total 8, pending 0
   [    0.000000] [ 7] [ EMERG] [server]     TX buffer, total 8, pending 0
   server> rpmsg dump all
   [    0.000000] [ 7] [ EMERG] [server] Remote: proxy2 state: 1
   [    0.000000] [ 7] [ EMERG] [server] ept NS
   [    0.000000] [ 7] [ EMERG] [server] ept rpmsg-sensor
   [    0.000000] [ 7] [ EMERG] [server] ept rpmsg-ping
   [    0.000000] [ 7] [ EMERG] [server] rpmsg_port queue RX: {used: 0, avail: 
8}
   [    0.000000] [ 7] [ EMERG] [server] rpmsg buffer list:
   [    0.000000] [ 7] [ EMERG] [server] rpmsg_port queue TX: {used: 0, avail: 
8}
   [    0.000000] [ 7] [ EMERG] [server] rpmsg buffer list:
   server> uname -a
   NuttX server 12.10.0 224eeb17e3b Jan  7 2026 10:48:30 arm64 qemu-armv8a
   server> 
   ```
   

