Hello group,

First, thanks to everyone for the continued development and support of lwIP - 
it has been great to see it so active the past few years.  The purpose of this 
message is to notify lwIP 1.4.1 stable users of a problem, and to see if anyone 
(i.e. developers) knows the bug report that would have resolved this.  The 
problem is that, in response to network traffic I have been unable to pinpoint, 
outbound TCP communication stalls.  I cannot tell from the debug logs what lwIP 
is trying to do with each tcp_output, but nothing goes out on the wire.  Other 
packets still come and go from the device: I can still open telnet, ping, and 
use a UDP protocol 
on the device.  We use NO_SYS=1, a cooperative multi-tasking system, UDP and a 
single TCP connection.  1.4.1 was great for over 5 years.

We install systems on a local subnet (only a PC NIC and lwIP/Lantronix devices 
- anywhere from 2 to 10).  A critical customer has been complaining for months 
about our devices disconnecting.  We report a disconnect error when we stop 
getting repeating status messages back from all devices.  I'd heard of this 
occurring intermittently over the years and we always wrote it off as 
electrical problems since we're usually in a noisy environment.  Then, by 
chance, I connected my local subnet switch to our corporate network and saw 
disconnects on all the lwIP devices I had connected.  I don't know why.  
This customer must have the same traffic on the subnet that I see on the 
corporate network.

The first thing I did was upgrade to 2.0.2.  Apart from a few minor changes, 
everything builds and runs.  The TCP send stalls are gone.  I went back to lwIP 
1.4.1 and they came back.  Good, I had a test and a solution.  We decided here 
the best approach is to try to patch 1.4.1 with the fix for this for the 
critical customer and then use a controlled rollout and test plan for lwIP 
2.0.2, which means updating 9 of our lwIP devices.  I spent about half a day 
checking the CHANGELOG and trying a few patches from the bug reports mentioning 
TCP, but no change I made resolved the problem.  The one mention of TCP 
stalling concerned the new window scaling feature in lwIP 2.x.  I would have 
thought a bug-fix regarding stalled TCP sends would be easy to find in the list 
- this is a big deal in a TCP/IP stack.

My question to developers is, does anyone recall a change that resolved TCP 
send stalling?  And a note to lwIP 1.4.1 stable users - you should update to 
2.x.

Thank you - best regards,
Bill Auerbach