Sorry, nothing comes to mind, except that maybe you don't close your TCP pcbs correctly. Normally, pcbs in TIME_WAIT should just be reused. If you find that you need some kind of delay, maybe your pcbs are stuck in a state other than TIME_WAIT?
Regards,
Simon

On 04.10.2022 at 19:16, Geoff Simmons wrote:
Hello,

I'm a new subscriber, and am working on my first "serious" LWIP project (meaning more than just a sample). This is an HTTP server for the Raspberry Pi PicoW, using the raw TCP API, accessed via the Pico C SDK (in which LWIP is a git submodule):

https://gitlab.com/slimhazard/picow_http

It's going well, except for one problem that has me stumped after trying to fix it for days. If a client attempts to connect shortly after a number of connections were closed, then intermittently (not always, but fairly often), the accept process stalls for a long time -- seemingly as long as 10 seconds, maybe more. So I'm hoping that someone on the list can help spot the error.

When this happens, I see sequences like this in debug output:

TCP connection request 40270 -> 80.
tcp_enqueue_flags: queueing 27040:27041 (0x12)
tcp_output_segment: 27040:27040
tcp_slowtmr: processing active pcb
tcp_slowtmr: polling application
tcp_output: nothing to send (00000000)
tcp_slowtmr: processing active pcb
tcp_slowtmr: polling application
tcp_output: nothing to send (00000000)
tcp_output_segment: 27040:27040
tcp_slowtmr: processing active pcb
tcp_slowtmr: polling application
tcp_output: nothing to send (00000000)
tcp_slowtmr: processing active pcb
tcp_slowtmr: polling application
tcp_output: nothing to send (00000000)
tcp_slowtmr: processing active pcb
tcp_slowtmr: polling application
tcp_output: nothing to send (00000000)
tcp_slowtmr: processing active pcb
tcp_output_segment: 27040:27040
tcp_slowtmr: polling application
tcp_output: nothing to send (00000000)
tcp_output_segment: 27040:27040
TCP connection established 40270 -> 80.
The pattern is always:

- "TCP connection request"
- tcp_enqueue_flags with a range of 1 ("queueing n:(n+1)"), always with the hex value 0x12
- this sequence, repeated many times:
  - "tcp_slowtmr: processing active pcb"
  - "tcp_slowtmr: polling application"
  - "tcp_output: nothing to send (00000000)"

tcp_output_segment with a range of 0 ("n:n") is interspersed in the repeating sequence. After the stall comes "TCP connection established", and then everything proceeds normally. With long timeouts on the client side, all of the requests succeed, despite the long stall. All of this happens before the tcp_accept callback is invoked.

When the stalls happen, I see the client side sending SYN retransmissions in wireshark. I haven't noticed anything else unusual in wireshark (of course it's easy to overlook something).

I usually see this when repeating a test script that sends a few dozen requests. There's no stall on the first connection after server startup. There's also no stall if I wait long enough between sending batches of requests. But if I run the test script and then start it again shortly afterward, it can stall for quite a while on the second run.

During a stall, I see MEM TCP_PCB stats showing "used" == "max", i.e. all tcp_pcbs in the pool are in use. I assume that after all connections are closed following a series of requests, they *should* be in TIME_WAIT, and that for the next connection, the oldest PCB in TIME_WAIT gets re-used. I have seen tcp debug output saying exactly that. But I suspect that my application code is not doing everything right about closing connections. Bearing in mind that there's a lot I don't know about LWIP, this hypothesis may be nonsense -- but the feeling is that I have discarded a connection, thinking that it is fully closed and should be in TIME_WAIT; but it isn't.
Then on the next client connection, the PCB thinks it still needs to send something like an ACK or FIN, and stalls while doing so, accounting for the long sequence of "processing active pcb" and "nothing to send". Eventually (because a timeout elapses?) the PCB gives up and accept can proceed.

Does (0x12) in the tcp_enqueue_flags debug output refer to a PCB's tcp flags? If so, then the value is TF_RXCLOSED | TF_ACK_DELAY. Is that significant? It doesn't "sound right" for a PCB to be used for an incoming connection.

Some things I've tried to fix the problem, none of which have succeeded:

- Wait until all bytes of a sent response have been ACKed (using the tcp_sent callback). This may mean that HTTP request pipelining is not possible. And it hasn't helped.
- Increase MEMP_NUM_TCP_PCB. It doesn't seem to matter: if there have been enough requests that all of the PCBs are used (and should be in TIME_WAIT), then the same thing happens. When I "wait long enough" between batches of requests, 6 PCBs are enough (I've tried up to 24).
- Increase MEM_SIZE. MEM HEAP stats show very consistently that it never needs more than 4800 bytes.
- Increase MEMP_NUM_TCP_SEG from 32 to 64. A bit of a desperation move, because I don't understand what it does; at any rate, that didn't help.

Sorry for the long introductory post; I'm trying to cover what I think I've understood about the problem. I assume that I've misunderstood something about the TCP API, and someone can set me straight.

Thanks,
Geoff

_______________________________________________
lwip-users mailing list
lwip-users@nongnu.org
https://lists.nongnu.org/mailman/listinfo/lwip-users