tmedicci commented on PR #13278:
URL: https://github.com/apache/nuttx/pull/13278#issuecomment-2326414854

   Thanks @yf13,
   
   Just to add to the discussion in 
https://github.com/apache/nuttx/pull/12864#issuecomment-2325779041, 
long story short:
   
   Our internal CI `iperf` test has been failing since #12864 on all of 
Espressif's RISC-V devices (ESP32-C3 and ESP32-C6): the output either drops to 
0 or the device halts. When it halts, it keeps looping over the list within the 
[`list_for_every_entry`](https://github.com/apache/nuttx/blob/59fd10000eb0a3e87d345e5bc4fff5934af447fd/sched/mqueue/mq_sndinternal.c#L346)
 in 
[`nxmq_do_send`](https://github.com/apache/nuttx/blob/59fd10000eb0a3e87d345e5bc4fff5934af447fd/sched/mqueue/mq_sndinternal.c#L324C5-L324C17).
 Our Wi-Fi driver uses `mqueue` to pass data from the Wi-Fi ISR to the 
Wi-Fi task. After some debugging, I found that the list was being corrupted 
when a message was received in 
[`file_mq_receive`](https://github.com/apache/incubator-nuttx/blob/391bf7b37c11b3d52e6f17cd8e1ff1c95c7e0e77/sched/mqueue/mq_receive.c#L103).
 This function should run inside a critical section (although 
[`nxmq_wait_receive`](https://github.com/apache/nuttx/blob/391bf7b37c11b3d52e6f17cd8e1ff1c95c7e0e77/sched/mqueue/mq_rcvinternal.c#L134)
 and 
[`nxmq_do_receive`](https://github.com/apache/nuttx/blob/391bf7b37c11b3d52e6f17cd8e1ff1c95c7e0e77/sched/mqueue/mq_rcvinternal.c#L269)
 can re-enable interrupts if a context switch is needed, the critical 
section is restored whenever the `mqueue` list is being manipulated).
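
   To make the expectation above concrete, here is a minimal host-side sketch of the locking pattern (not the actual NuttX code). A single flag stands in for the CPU interrupt-enable bit, and every `sim_*` name is a hypothetical stand-in; in NuttX the corresponding primitives are `enter_critical_section()` and `leave_critical_section()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: one flag stands in for the CPU interrupt-enable bit. */

static bool sim_irq_enabled = true;

typedef bool sim_irqstate_t;

/* Disable "interrupts" and return the previous state so calls can nest. */

static sim_irqstate_t sim_enter_critical(void)
{
  sim_irqstate_t prev = sim_irq_enabled;
  sim_irq_enabled = false;
  return prev;
}

/* Restore whatever state was in effect before the matching enter call. */

static void sim_leave_critical(sim_irqstate_t prev)
{
  sim_irq_enabled = prev;
}

struct sim_msg
{
  struct sim_msg *next;
  int payload;
};

struct sim_queue
{
  struct sim_msg *head;
};

/* Sender side (the ISR in the scenario above): link a message at the head. */

static void sim_queue_push(struct sim_queue *q, struct sim_msg *m)
{
  sim_irqstate_t flags = sim_enter_critical();

  m->next = q->head;
  q->head = m;

  sim_leave_critical(flags);
}

/* Receiver side: unlink the head message. The list pointers are only
 * touched while "interrupts" are disabled, so a sender cannot observe
 * a half-updated list.
 */

static struct sim_msg *sim_queue_pop(struct sim_queue *q)
{
  sim_irqstate_t flags = sim_enter_critical();
  struct sim_msg *m;

  assert(!sim_irq_enabled);

  m = q->head;
  if (m != NULL)
    {
      q->head = m->next;
      m->next = NULL;
    }

  sim_leave_critical(flags);
  return m;
}
```

   Because the previous state is saved and restored rather than unconditionally re-enabled, calling `sim_queue_pop()` from code that already holds the section leaves "interrupts" disabled on return, which is the nesting behavior the receive path relies on.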
   
   Finally, I added some global variables that flag the places where the 
`mqueue` list is read or written (and where we expect to be inside a critical 
section). I expect these variables to be `false` whenever an interrupt is about 
to be dispatched (this is what the [NuttX 
Patch](https://github.com/user-attachments/files/16843821/isrmq-patch.gz) in  
https://github.com/apache/nuttx/pull/12864#issuecomment-2325779041 does). 
With a breakpoint on this check, it is hit on ESP32-C3/ESP32-C6 as soon as 
`iperf` starts, which shows that the critical section is not being respected.
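
   The idea behind the instrumentation can be sketched as follows. This is only an illustration of the technique, not the contents of the linked patch: the flag, the dispatch hook and the helper names (`dbg_*`) are all hypothetical, and the "interrupt" is simulated by a plain function call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical debug state: true while the mqueue list is being touched. */

static volatile bool dbg_mq_list_busy = false;
static unsigned int dbg_violations = 0;

/* Hypothetical hook on the interrupt dispatch path. If the flag is set
 * here, an interrupt got through while the list was being modified.
 */

static void dbg_on_irq_dispatch(void)
{
  if (dbg_mq_list_busy)
    {
      dbg_violations++;   /* A breakpoint would go on this line */
    }
}

typedef void (*dbg_list_op_t)(void *arg);

/* Run a list operation with the busy flag raised around it. */

static void dbg_guarded_list_op(dbg_list_op_t op, void *arg)
{
  assert(op != NULL);

  dbg_mq_list_busy = true;
  op(arg);
  dbg_mq_list_busy = false;
}

/* A well-behaved operation: no interrupt arrives during the update. */

static void dbg_op_quiet(void *arg)
{
  int *counter = (int *)arg;
  (*counter)++;
}

/* The failing case: an interrupt is dispatched in the middle of the
 * update, which a working critical section would have prevented.
 */

static void dbg_op_interrupted(void *arg)
{
  int *counter = (int *)arg;
  (*counter)++;
  dbg_on_irq_dispatch();
}
```

   With a working critical section, `dbg_violations` would stay at 0 no matter how many interrupts fire; any non-zero count means an interrupt was dispatched while the list was being updated.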
   
   @yf13 created a test application that simulates our Wi-Fi driver for 
`rv-virt`, and the same behavior can be seen when the application runs on QEMU: 
the breakpoint is reached. Finally, reverting #12864 fixes the `iperf` test and, 
as expected, the breakpoint is no longer reached (which confirms that no 
interrupt occurs while the list is being manipulated).
   


