First, how are you detecting slow connections? Just by not receiving data? If so, that is the wrong approach and always was.

There are out-of-band ways to detect this, such as TCP keep-alive. I don't remember whether that was taken off the table as not controllable on all platforms, or as too unreliable. If not, setting the TCP keep-alive timeout _during the time you are receiving a response_ to something reasonable (your 5 seconds) will do the job for you very nicely. I also think 5 seconds is too short an interval; at least 10 would do better IMO.
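To illustrate the keep-alive idea, here is a minimal sketch in Python rather than in Necko's own code. The helper name `enable_keepalive` and its default values are my assumptions, not anything that exists in Firefox. The `TCP_KEEPIDLE`, `TCP_KEEPINTVL` and `TCP_KEEPCNT` options are the Linux names; other platforms expose different knobs or none at all, so the sketch only applies the ones it finds.

```python
import socket


def enable_keepalive(sock, idle_s=10, interval_s=2, probes=3):
    """Enable TCP keep-alive on sock and tune it where the platform allows.

    Hypothetical helper for illustration only. Returns a dict of the tuning
    options that were actually applied on this platform.
    """
    # SO_KEEPALIVE itself is widely available.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    applied = {}
    wanted = (
        ("TCP_KEEPIDLE", idle_s),       # idle seconds before the first probe
        ("TCP_KEEPINTVL", interval_s),  # seconds between probes
        ("TCP_KEEPCNT", probes),        # unanswered probes before giving up
    )
    for name, value in wanted:
        option = getattr(socket, name, None)
        if option is None:
            continue  # this platform does not expose the option
        try:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
            applied[name] = value
        except OSError:
            pass  # exposed by the module but rejected by the OS
    return applied
```

With these example values, a dead peer would be noticed after roughly idle_s + interval_s * probes seconds without the application having to guess how slow the server is allowed to be.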

Another way is to rate the _response_ and probably also each _connection_. You can measure how fast they are receiving data and adjust your timeout accordingly. It's still not a perfect solution, and it is neither simple nor stateless.
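A rough sketch of what rating a connection could look like, again in Python and purely as an illustration. The class name `AdaptiveStallTimeout`, the smoothing scheme (an exponentially weighted average of the gaps between reads) and all the constants are assumptions of mine, not an existing implementation.

```python
class AdaptiveStallTimeout:
    """Per-connection stall timeout derived from observed gaps between reads.

    Hypothetical example: the timeout is factor times the smoothed gap,
    clamped to the range [floor_s, ceiling_s].
    """

    def __init__(self, floor_s=5.0, ceiling_s=60.0, factor=4.0, alpha=0.25):
        self.floor_s = floor_s
        self.ceiling_s = ceiling_s
        self.factor = factor
        self.alpha = alpha      # weight given to the newest gap
        self.avg_gap = None     # smoothed seconds between reads
        self.last_seen = None   # timestamp of the most recent read

    def on_data(self, now):
        """Record that data arrived at time now (in seconds)."""
        if self.last_seen is not None:
            gap = now - self.last_seen
            if self.avg_gap is None:
                self.avg_gap = gap
            else:
                self.avg_gap = (1 - self.alpha) * self.avg_gap + self.alpha * gap
        self.last_seen = now

    def timeout(self):
        """Current timeout in seconds for this connection."""
        if self.avg_gap is None:
            return self.floor_s
        return min(self.ceiling_s, max(self.floor_s, self.factor * self.avg_gap))

    def is_stalled(self, now):
        """True if nothing has arrived for longer than the current timeout."""
        return self.last_seen is not None and (now - self.last_seen) > self.timeout()
```

A server that habitually pauses for 3 seconds between chunks would get a 12 second allowance here, while a fast one stays at the 5 second floor. This is exactly the per-connection state mentioned above, and it still has to guess on the very first response.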

In general, I'm very much against application-level timeouts. I was never able to find the right value, and whichever one you choose, you always end up catching some good connections along with the bad ones.


-hb-


On 2/25/2016 10:23, Daniel Stenberg wrote:
Hi friends,

I work on a little issue (bug 1245059) that I feel I could use some feedback or thoughts on, on how to best go about and handle it. This problem is happening on Windows (right now) but in theory it could happen similarly on other platforms too.

The ground rules:

1. We trigger an internal network change event on IP and network
   interface changes.

2. We make a checksum of all network "adapters" and their IP addresses to
   avoid duplicate events. We also coalesce events to not send them more often
   than once per second.

3. When a change is detected, we want to detect stalled HTTP connections to
   avoid "hangs" and to provide a snappier experience.

4. A "stalled" HTTP/1 connection is detected by not having traffic for N
   seconds. There is no difference between a stalled connection and a
   connection on which the server is just very slow to respond and thus
   leaving an N second pause. (N is 5 seconds by default).

The problem:

1. The user uses a slow server that often takes more than N seconds to
   respond.

2. The same user has a (Microsoft Teredo tunneling) network adapter that
   appears and disappears every few minutes (with 60 - 200 seconds interval it
   seems) thus triggering network change events fairly often.

3. User gets sad face because Firefox keeps cutting off slow (but working)
   HTTP requests. (There are a few other downsides to these frequent network
   change events, but they're not as visible.)

Additionally:

- Yes, this seems like a broken/strange user setup, but still it happens and
  it is not causing a (noticeable) problem for the user if Firefox is prevented
  from killing silent HTTP connections.

- We can detect Teredo tunnels by their IP address range, but how does that
  help?

The bug:

  https://bugzilla.mozilla.org/show_bug.cgi?id=1245059


Any bright ideas?


_______________________________________________
dev-tech-network mailing list
[email protected]
https://lists.mozilla.org/listinfo/dev-tech-network
