On Thu, Jun 9, 2016 at 1:14 PM, Michael Brown <mc...@ipxe.org> wrote:
> On 09/06/16 08:58, Ladi Prosek wrote:
>>
>> This keepalive implementation has proven very effective in dealing with
>> freezes. Real-world customer deployments have shown an improvement from
>> 5% to virtually zero failed boots.
>
> Awesome!
>
> Do you know what prevents the usual TCP retransmission mechanism from
> recovering? ARP discovery should still work even for retransmitted packets.

Just like you wrote in the ipxe-devel thread linked from the commit
description, from the client point of view the connection is "stable".
Everything the client has sent has been acked so the retransmission
timer is not running. The server is retransmitting for sure but its
packets just can't reach the client - they're routed somewhere else or
are blackholed altogether.

I can get to this state easily by configuring my virtual NIC with the
hardcoded default MAC. There are more such hosts on the network
claiming the same MAC so sooner or later I find myself cut off.

> Assuming that we do need to send keepalives to recover, then I would choose
> to make this always-on, rather than requiring a config option.

That sounds good. Under certain circumstances this may generate
otherwise unnecessary traffic so I just want to be careful. For
example if it's an HTTP connection and it is kept alive (as in HTTP
keepalive), it will look idle and will be pinging the server with
keepalives periodically even though it's not waiting for anything.

Big deal? Probably not. Worth adding a way for upper layers to signal
this down to the TCP implementation? Probably not either.

Ladi
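
To illustrate the "stable" state described above, here is a minimal C sketch, assuming a heavily simplified connection model, of why the client's retransmission timer never fires and where an idle keepalive would step in. The struct, the function names and the 15-tick threshold are hypothetical and are not taken from the iPXE source.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical idle threshold in timer ticks (not an iPXE constant) */
#define KEEPALIVE_IDLE_TICKS 15

/* Hypothetical, simplified connection state (not iPXE's TCP structure) */
struct conn_state {
	uint32_t snd_sent;	/* bytes sent so far */
	uint32_t snd_acked;	/* bytes acknowledged by the peer */
	uint32_t last_rx_tick;	/* tick at which we last heard from the peer */
};

/* True if unacked data is outstanding, i.e. the retransmission
 * timer would be running.
 */
static bool retransmit_pending ( const struct conn_state *c ) {
	return ( c->snd_sent != c->snd_acked );
}

/* Decide whether a keepalive should be sent at tick "now" */
static bool keepalive_due ( const struct conn_state *c, uint32_t now ) {
	/* Retransmission already probes the path; no keepalive needed */
	if ( retransmit_pending ( c ) )
		return false;
	/* Unsigned subtraction stays correct across tick wraparound */
	return ( ( now - c->last_rx_tick ) >= KEEPALIVE_IDLE_TICKS );
}
```

With everything acked and nothing heard from the server for 15 ticks, keepalive_due() returns true. The same holds for an idle persistent HTTP connection, which is the source of the extra traffic mentioned above.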