seele404 commented on issue #13823:
URL: https://github.com/apache/nuttx/issues/13823#issuecomment-2496590468
I finally found an issue similar to what I’ve encountered. As a newcomer
with limited experience, I’ll try to describe what I’ve observed in as much
detail as possible:
When NuttX is frequently transmitting large amounts of data over a TCP
connection to PC1, and PC1 is then replaced with PC2 (or PC1’s IP address is
changed), all TCP services on NuttX misbehave for some time afterwards.
Here are some experimental results and preliminary conclusions I’ve gathered:
1. **TCP socket not closing after physical disconnection:**
I discovered that physically disconnecting the network (e.g., unplugging
the cable) does not cause the TCP layer to detect the disconnection; the
socket remains open. Surprisingly, if the cable is reconnected quickly, the
same connection resumes data transmission. While this behavior has some use,
it seems incorrect.
2. **Impact of application-layer high-frequency, high-volume data
transmission:**
When the network cable is disconnected, the socket can no longer send data,
but the application keeps filling the send buffer. Consequently, when PC2
requests any TCP-based service, NuttX fails to allocate send buffers for the
new sockets (it gets stuck in `iob_allocwait`), so no send-related operation
can proceed. I suspect this is the primary reason for the TCP
"freezing" issue observed across all services.
**My application scenario:**
The main TCP services include FTP, Telnet, and HTTP (serving webpages using
WebSocket). Through WebSocket, I frequently transmit large amounts of module
data for webpage display. However, my colleagues often unplug my network
cable to connect their own PCs. Once connected, they can’t load the webpage,
and NuttX’s HTTP service gets stuck when attempting to send files. Telnet and
FTP also stop working, which is quite frustrating.
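One application-side idea for the stuck sender described above is to bound how long a blocking `send()` may wait, so the WebSocket writer can notice a stalled peer and close its socket. The following is only a minimal sketch using standard BSD socket options: `limit_blocking_send` is a hypothetical helper name, it assumes the NuttX build enables socket options and TCP keepalive, and I have not verified that a send timeout actually unblocks the `iob_allocwait` path.

```c
#include <sys/socket.h>
#include <sys/time.h>

/* Hypothetical helper (not part of NuttX or of the application above):
 * bound how long a blocking send() may wait and enable TCP keepalive
 * probes on the socket. Returns 0 on success, -1 if a setsockopt()
 * call fails.
 */
int limit_blocking_send(int fd, int timeout_sec)
{
  struct timeval tv;
  int on = 1;

  tv.tv_sec  = timeout_sec;
  tv.tv_usec = 0;

  /* A send() that blocks longer than this returns an error instead of
   * waiting indefinitely.
   */
  if (setsockopt(fd, SOL_SOCKET, SO_SNDTIMEO, &tv, sizeof(tv)) < 0)
    {
      return -1;
    }

  /* Ask the stack to probe an idle connection. Probe timing is left at
   * the platform defaults because the tuning options differ by platform.
   */
  if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0)
    {
      return -1;
    }

  return 0;
}
```

The sender would call this once after `accept()` or `connect()`, and treat a timed-out `send()` as a reason to close the socket rather than keep queuing data.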
I haven’t fully worked through the logic behind this issue, but I had to take
immediate steps to mitigate it: I significantly reduced `TCP_RTO` and
`TCP_MAXRTX`. As I understand it, once the send buffer is full, the only thing
that eventually tears down the dead connection is TCP’s retransmission timeout
mechanism, so shortening it should release the buffers sooner. This approach
has yielded reasonable results so far.
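To get a feel for why lowering these two values shortens the "frozen" period, here is a rough model of how long a dead connection lingers before TCP gives up. It assumes a simple exponential backoff in which each retransmission wait doubles up to a cap; the function name, the cap parameter, and the sample numbers are illustrative assumptions, not values taken from the NuttX source.

```c
/* Rough model, under the assumption of capped exponential backoff:
 * the wait before retransmission i (0-based) is rto << min(i, max_shift).
 * Returns the total wait, in the same timer units as rto, summed over
 * maxrtx retransmission attempts.
 */
unsigned long giveup_ticks(unsigned int rto, unsigned int maxrtx,
                           unsigned int max_shift)
{
  unsigned long total = 0;
  unsigned int i;

  for (i = 0; i < maxrtx; i++)
    {
      unsigned int shift = (i < max_shift) ? i : max_shift;
      total += (unsigned long)rto << shift;
    }

  return total;
}
```

Under this model, `giveup_ticks(3, 8, 4)` is 3 + 6 + 12 + 24 + 48 × 4 = 237 timer units, while `giveup_ticks(1, 3, 4)` is 1 + 2 + 4 = 7, which illustrates how reducing both parameters can cut the stall time by more than an order of magnitude.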
Next, I plan to analyze why the network interface status isn’t properly
reported to the TCP layer. As someone primarily writing application-layer code,
this is quite challenging for me. Any suggestions, references, or guidance
would be greatly appreciated.