Re: [I] [BUG] Buggy network conditions cause permanent TCP connection exhaustion [nuttx]

via GitHub Sun, 24 Nov 2024 19:15:31 -0800


seele404 commented on issue #13823:
URL: https://github.com/apache/nuttx/issues/13823#issuecomment-2496590468


   I finally found an issue similar to what I’ve encountered. As a newcomer 
with limited experience, I’ll try to describe what I’ve observed in as much 
detail as possible:
   
   While NuttX is frequently sending large amounts of data to PC1 over a TCP 
connection, replacing PC1 with PC2 (or changing PC1’s IP address directly) 
causes all TCP services on NuttX to malfunction for some time afterwards.
   
   Here are some experimental results and preliminary conclusions I’ve gathered:
   
   1. **TCP socket not closing after physical disconnection:**  
       I discovered that physically disconnecting the network (e.g., unplugging 
the cable) does not cause the TCP layer to detect the disconnection. The 
socket remains open. Surprisingly, if the cable is reconnected quickly, the 
same connection resumes data transmission. While this behavior has some use, 
it seems incorrect to me.
       
   2. **Impact of application-layer high-frequency, high-volume data 
transmission:**  
       When the network cable is disconnected, the socket can no longer deliver 
data, but the application keeps filling its send buffer. Consequently, when 
PC2 requests any TCP-based service, NuttX fails to allocate send buffers for 
the new sockets (it is stuck in `iob_allocwait`), so no send-related operation 
can be handled. I suspect this is the primary reason for the TCP "freezing" 
observed across all services.
       
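One common way to make a silently dead peer detectable is TCP keepalive. The following is a minimal sketch, not a confirmed fix for this issue: the helper name `enable_keepalive` is hypothetical, the option values are passed as `int` seconds as on Linux, and it assumes the NuttX build has TCP keepalive support enabled and accepts the same option types.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical helper: enable keepalive probing on a connected TCP socket.
 * idle_s     - seconds of silence before the first probe
 * interval_s - seconds between probes
 * count      - unanswered probes before the connection is dropped
 * Returns 0 on success, -1 on error (errno is set by setsockopt).
 * Assumption: option values are ints in seconds, as on Linux.
 */
int enable_keepalive(int fd, int idle_s, int interval_s, int count)
{
  int on = 1;

  if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0)
    {
      return -1;
    }

  if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle_s, sizeof(idle_s)) < 0)
    {
      return -1;
    }

  if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &interval_s,
                 sizeof(interval_s)) < 0)
    {
      return -1;
    }

  if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &count, sizeof(count)) < 0)
    {
      return -1;
    }

  return 0;
}
```

Calling `enable_keepalive(fd, 10, 2, 3)` on an accepted socket would ask the stack to start probing after 10 seconds of silence and to give up after 3 unanswered probes. Note that keepalive probes are only sent on an idle connection, so this helps with detection but does not by itself bound how much data a busy sender can queue.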
   
   **My application scenario:**  
   The main TCP services include FTP, Telnet, and HTTP (serving webpages using 
WebSocket). Through WebSocket, I frequently transmit large amounts of module 
data for webpage display. However, my colleagues often disconnect my network 
cable to connect their PCs. After reconnecting, they can’t load the webpage, 
and NuttX’s HTTP service gets stuck when attempting to send files. Furthermore, 
Telnet and FTP also fail to function, which is quite frustrating.
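
On the application side, a high-volume sender such as the WebSocket stream can avoid queueing data indefinitely by giving up when the socket stays unwritable for too long. The sketch below uses `poll()` and `MSG_DONTWAIT`; the function name `send_bounded` is hypothetical, and whether a non-blocking send actually avoids the `iob_allocwait` stall inside NuttX is an assumption that would need to be verified.

```c
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: send up to len bytes, but stop as soon as the socket
 * has not become writable for timeout_ms milliseconds.
 * Returns the number of bytes handed to the stack (possibly fewer than len
 * if the peer stalled), or -1 on a socket error.
 */
ssize_t send_bounded(int fd, const void *buf, size_t len, int timeout_ms)
{
  const char *p = buf;
  size_t sent = 0;

  while (sent < len)
    {
      struct pollfd pfd = { .fd = fd, .events = POLLOUT };
      int rc = poll(&pfd, 1, timeout_ms);

      if (rc == 0)
        {
          break;                      /* No buffer space in time: stalled */
        }

      if (rc < 0)
        {
          if (errno == EINTR)
            {
              continue;               /* Interrupted by a signal: retry */
            }

          return -1;
        }

      if (pfd.revents & (POLLERR | POLLHUP | POLLNVAL))
        {
          return -1;                  /* Connection is no longer usable */
        }

      ssize_t n = send(fd, p + sent, len - sent, MSG_DONTWAIT);
      if (n < 0)
        {
          if (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR)
            {
              continue;               /* Lost the race for space: poll again */
            }

          return -1;
        }

      sent += (size_t)n;
    }

  return (ssize_t)sent;
}
```

A caller that gets back fewer bytes than requested can treat the peer as stalled and close the socket, which releases that connection's queued buffers instead of letting them starve other services.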
   
   I haven’t fully worked through the logic behind this issue, but I had to 
take immediate steps to mitigate the problem. Specifically, I significantly 
reduced `TCP_RTO` and `TCP_MAXRTX`. From my understanding, once the send buffer 
is full, the only mechanism that eventually releases it is TCP’s 
retransmission timeout, so shortening that timeout frees the buffers sooner. 
This approach has yielded reasonable results so far.
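
To see why reducing both values shortens the stall, the time until a dead connection is abandoned can be estimated as the sum of the retransmission intervals. This is a rough sketch under stated assumptions: it models the uIP-style backoff in which the interval doubles on each retry up to a capped shift, the function name and the `max_shift` parameter are hypothetical, and the numbers in the usage note are illustrative rather than the values used in this setup. The result is in whatever timer units `TCP_RTO` is expressed in.

```c
/* Hypothetical estimate of the time (in RTO timer ticks) before a connection
 * with an unreachable peer is given up.
 * Assumed model: the n-th retransmission interval is rto << min(n, max_shift),
 * and the connection is aborted after maxrtx retransmissions.
 */
unsigned long retx_give_up_ticks(unsigned int rto, unsigned int maxrtx,
                                 unsigned int max_shift)
{
  unsigned long total = 0;
  unsigned int n;

  for (n = 0; n < maxrtx; n++)
    {
      unsigned int shift = n > max_shift ? max_shift : n;

      total += (unsigned long)rto << shift;
    }

  return total;
}
```

Under this model, `retx_give_up_ticks(3, 8, 4)` gives 3 + 6 + 12 + 24 + 48 + 48 + 48 + 48 = 237 ticks, while `retx_give_up_ticks(1, 4, 4)` gives 1 + 2 + 4 + 8 = 15 ticks. Lowering the retry count matters at least as much as lowering the base timeout, because the later intervals dominate the sum.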
   
   Next, I plan to analyze why the network interface status isn’t properly 
reported to the TCP layer. As someone primarily writing application-layer code, 
this is quite challenging for me. Any suggestions, references, or guidance 
would be greatly appreciated.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
