[Sts-sponsors] [Bug 1947099] Re: ipconfig does not honour user-requested timeouts in some cases

Robie Basak Wed, 11 May 2022 06:56:00 -0700

It's not clear to me if upstream have accepted the patch. If not,
https://lists.zytor.com/archives/klibc/2021-December/004635.html sounds
like it's a deliberate upstream design decision not to.


In Ubuntu, we might decide to maintain the patch as a delta but then
drop that delta in subsequent releases when we no longer need that
functionality. This is already the case for >18.04, and for 18.04 we're
adding the delta in an SRU.

Therefore, I think Won't Fix is appropriate for Ubuntu for the
development release. If it does get fixed upstream then our status can be
updated, but either way it's not really relevant to Ubuntu any more.

In comment 22 I asked:

> ...would it be practical and/or useful to verify that, with a timeout
> of 2s, a DHCP reply sent after 1.5s works, but a DHCP reply sent after
> 2.5s does not?

Have you covered this in your test plan please? I don't see this
discussed anywhere.

Please could you also update the Test Plan section in the SRU
description to cover the testing you are committing to do?

Apart from that, +1, so I'll accept now to save us time. We do need to
continue discussion to resolve the Test Plan though. That needs to be
resolved to the SRU team's satisfaction before we will release the build
to the updates pocket.

** Description changed:

  [Impact]
  In some cases, ipconfig can take a longer time than the user-specified
  timeouts, causing unexpected delays.
  
  [Test Plan]
- Any situation where ipconfig encounters an error sending the DHCP packet, it
- will automatically set a delay of 10 seconds, which could be longer than the
- user-specified timeout. It can be reproduced by creating a dummy interface and
- attempting to run ipconfig on it with a timeout value of less than 10:
+ 
+ [racb: pending agreement with the SRU team; please see comment 37]
+ 
+ In any situation where ipconfig encounters an error sending the DHCP
+ packet, it automatically sets a delay of 10 seconds, which could be
+ longer than the user-specified timeout. This can be reproduced by
+ creating a dummy interface and running ipconfig on it with a timeout
+ value of less than 10:
  
  # ip link add eth1 type dummy
  # date; /usr/lib/klibc/bin/ipconfig -t 2 eth1; date
  Thu Nov 18 04:46:13 EST 2021
  IP-Config: eth1 hardware address ae:e0:f5:9d:7e:00 mtu 1500 DHCP RARP
  IP-Config: no response after 2 secs - giving up
  Thu Nov 18 04:46:23 EST 2021
  
  ^ Notice above, ipconfig thinks that it waited 2 seconds, but the
  timestamps show an actual delay of 10 seconds.
  
  [Where problems could occur]
  Please see reproduction steps above. We are seeing this in production too
  (see comment #2).
  
  [Other Info]
  A patch to fix the issue is being proposed here. It is a safe fix - it only
  checks before going into sleep that the timeout never exceeds the
  user-requested value.
  
  [Original Description]
  
  In some cases, ipconfig can take longer than the user-specified
  timeouts, causing unexpected delays.
  
  In main.c, in function loop(), the process can go into
  process_timeout_event() (or process_receive_event()) and, if it
  encounters an error, will schedule a "try again later" attempt
  at time now + 10 seconds by setting
  
  s->expire = now + 10;
  
  This can happen at any time during the main event loop, which can end up
  extending the user-specified timeout if "now + 10" is greater than
  "start_time + user-specified-timeout".
  
  I believe a patch like the following is needed to avoid this problem:
  
  --- a/usr/kinit/ipconfig/main.c
  +++ b/usr/kinit/ipconfig/main.c
  @@ -437,6 +437,13 @@ static int loop(void)
  
                          if (timeout > s->expire - now.tv_sec)
                                  timeout = s->expire - now.tv_sec;
  +
  +                       /* Compensate for already-lost time */
  +                       gettimeofday(&now, NULL);
  +                       if (now.tv_sec + timeout > start + loop_timeout) {
  +                               timeout = loop_timeout - (now.tv_sec - start);
  +                               printf("Lowered timeout to match user request = (%d s) \n", timeout);
  +                       }
                  }
  
  I believe the current behaviour is buggy. This is confirmed when the
  following line is executed:
  
                          if (loop_timeout >= 0 &&
                              now.tv_sec - start >= loop_timeout) {
                                  printf("IP-Config: no response after %d "
                                         "secs - giving up\n", loop_timeout);
                                  rc = -1;
                                  goto bail;
                          }
  
  'loop_timeout' is the user-specified time-out. With a value of 2, in
  case of error, this line prints:
  
  IP-Config: no response after 2 secs - giving up
  
  So it thinks that it waited 2 seconds - however, in reality it had
  actually waited for 10 seconds.
  
  The suggested code-change ensures that the timeout that is actually used
  never exceeds the user-specified timeout.
  
- 
  [ Regression potential ]
  
- This change ensures that user-specified timeouts are never exceeded, which is
- a problem that appears to happen only in case of interface errors.
+ This change ensures that user-specified timeouts are never exceeded, which is
+ a problem that appears to happen only in case of interface errors.
  It may be that someone is relying on current behaviour where they receive
  DHCP offers after their specified timeout (but within the 10-second error
  timeout). However: 1) That is buggy behaviour and should be exposed. Such a
  user would need to update their specified timeout to make it long enough to
  receive the DHCP offer (setting the timeout to 10 would keep the existing
  behaviour). 2) I think it is unlikely that such a scenario exists at all. The
  10-second timeout problem happens when there are problems with the interface
  that prevent it from even sending out the DHCP request. I think it is very
  unlikely (or even impossible) that DHCP offers would be received on a dead
  interface.
  
  Based on the above points, I consider the regression potential to be
  very low for this change. I do not expect anyone who is currently using
  ipconfig successfully to notice this change.
  
  I believe the only difference introduced by this is the reduction of
  delays caused by dead or problematic network interfaces. Those error
  delays are shortened such that they never exceed user-specified
  timeouts.

** Changed in: klibc (Ubuntu)
       Status: New => Won't Fix

** Changed in: klibc (Ubuntu Bionic)
       Status: Confirmed => Fix Committed

** Tags added: verification-needed verification-needed-bionic

-- 
You received this bug notification because you are a member of STS
Sponsors, which is subscribed to the bug report.
https://bugs.launchpad.net/bugs/1947099

Title:
  ipconfig does not honour user-requested timeouts in some cases

Status in klibc package in Ubuntu:
  Won't Fix
Status in klibc source package in Bionic:
  Fix Committed

Bug description:
  [Impact]
  In some cases, ipconfig can take a longer time than the user-specified
  timeouts, causing unexpected delays.

  [Test Plan]

  [racb: pending agreement with the SRU team; please see comment 37]

  In any situation where ipconfig encounters an error sending the DHCP
  packet, it automatically sets a delay of 10 seconds, which could
  be longer than the user-specified timeout. This can be reproduced by
  creating a dummy interface and running ipconfig on it with a
  timeout value of less than 10:

  # ip link add eth1 type dummy
  # date; /usr/lib/klibc/bin/ipconfig -t 2 eth1; date
  Thu Nov 18 04:46:13 EST 2021
  IP-Config: eth1 hardware address ae:e0:f5:9d:7e:00 mtu 1500 DHCP RARP
  IP-Config: no response after 2 secs - giving up
  Thu Nov 18 04:46:23 EST 2021

  ^ Notice above, ipconfig thinks that it waited 2 seconds, but the
  timestamps show an actual delay of 10 seconds.

  [Where problems could occur]
  Please see reproduction steps above. We are seeing this in production too
  (see comment #2).

  [Other Info]
  A patch to fix the issue is being proposed here. It is a safe fix - it only
  checks before going into sleep that the timeout never exceeds the
  user-requested value.

  [Original Description]

  In some cases, ipconfig can take longer than the user-specified
  timeouts, causing unexpected delays.

  In main.c, in function loop(), the process can go into
  process_timeout_event() (or process_receive_event()) and, if it
  encounters an error, will schedule a "try again later" attempt
  at time now + 10 seconds by setting

  s->expire = now + 10;

  This can happen at any time during the main event loop, which can end
  up extending the user-specified timeout if "now + 10" is greater than
  "start_time + user-specified-timeout".
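To make the arithmetic concrete, here is a minimal sketch in C. It is not klibc code: the helper name `retry_overshoot` and its parameters are hypothetical, chosen only to mirror the quantities described above (the start time, the user-specified timeout, and the 10-second retry delay).

```c
#include <assert.h>
#include <time.h>

/* Hypothetical helper, not part of klibc: returns how many seconds a
 * retry scheduled at now + retry_delay lands past the user's deadline
 * (start + loop_timeout), or 0 if the retry still falls within it. */
long retry_overshoot(time_t start, long loop_timeout,
                     time_t now, long retry_delay)
{
    /* This sketch only models a non-negative timeout and delay. */
    assert(loop_timeout >= 0 && retry_delay >= 0);

    long deadline = (long)start + loop_timeout;
    long wakeup = (long)now + retry_delay;

    return wakeup > deadline ? wakeup - deadline : 0;
}
```

For the reproduction above (`-t 2`, error at start), `retry_overshoot(0, 2, 0, 10)` gives 8: the retry is scheduled for second 10 although the deadline was second 2, which is consistent with the 10-second gap between the two timestamps.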

  I believe a patch like the following is needed to avoid this problem:

  --- a/usr/kinit/ipconfig/main.c
  +++ b/usr/kinit/ipconfig/main.c
  @@ -437,6 +437,13 @@ static int loop(void)

                          if (timeout > s->expire - now.tv_sec)
                                  timeout = s->expire - now.tv_sec;
  +
  +                       /* Compensate for already-lost time */
  +                       gettimeofday(&now, NULL);
  +                       if (now.tv_sec + timeout > start + loop_timeout) {
  +                               timeout = loop_timeout - (now.tv_sec - start);
  +                               printf("Lowered timeout to match user request = (%d s) \n", timeout);
  +                       }
                  }

  I believe the current behaviour is buggy. This is confirmed when the
  following line is executed:

                          if (loop_timeout >= 0 &&
                              now.tv_sec - start >= loop_timeout) {
                                  printf("IP-Config: no response after %d "
                                         "secs - giving up\n", loop_timeout);
                                  rc = -1;
                                  goto bail;
                          }

  'loop_timeout' is the user-specified time-out. With a value of 2, in
  case of error, this line prints:

  IP-Config: no response after 2 secs - giving up

  So it thinks that it waited 2 seconds - however, in reality it had
  actually waited for 10 seconds.

  The suggested code-change ensures that the timeout that is actually
  used never exceeds the user-specified timeout.
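The intent of that change can be sketched as a standalone function. This is an illustration under stated assumptions, not the patch itself: the name `clamp_sleep` and its signature are hypothetical, and two details go beyond the quoted diff. The sketch treats a negative `loop_timeout` as "no user limit" (suggested by the `loop_timeout >= 0` check quoted above, but still an assumption), and it floors the result at zero once the deadline has already passed.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical helper mirroring the intent of the patch above; the name
 * and signature are illustrative, not klibc's. Returns the number of
 * seconds the loop may sleep without passing start + loop_timeout. */
int clamp_sleep(int timeout, time_t now, time_t start, int loop_timeout)
{
    /* The loop cannot be running before it started. */
    assert(now >= start);

    /* Assumption: a negative loop_timeout means "no user limit". */
    if (loop_timeout < 0)
        return timeout;

    long remaining = (long)loop_timeout - (long)(now - start);
    if (remaining < 0)
        remaining = 0;  /* deadline already passed: do not sleep */

    return timeout > remaining ? (int)remaining : timeout;
}
```

With `-t 2` and the 10-second retry set at start, `clamp_sleep(10, 0, 0, 2)` yields 2, so the loop would wake at the deadline rather than 8 seconds after it.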

  [ Regression potential ]

  This change ensures that user-specified timeouts are never exceeded, which is
  a problem that appears to happen only in case of interface errors.
  It may be that someone is relying on current behaviour where they receive
  DHCP offers after their specified timeout (but within the 10-second error
  timeout). However: 1) That is buggy behaviour and should be exposed. Such a
  user would need to update their specified timeout to make it long enough to
  receive the DHCP offer (setting the timeout to 10 would keep the existing
  behaviour). 2) I think it is unlikely that such a scenario exists at all. The
  10-second timeout problem happens when there are problems with the interface
  that prevent it from even sending out the DHCP request. I think it is very
  unlikely (or even impossible) that DHCP offers would be received on a dead
  interface.

  Based on the above points, I consider the regression potential to be
  very low for this change. I do not expect anyone who is currently
  using ipconfig successfully to notice this change.

  I believe the only difference introduced by this is the reduction of
  delays caused by dead or problematic network interfaces. Those error
  delays are shortened such that they never exceed user-specified
  timeouts.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/klibc/+bug/1947099/+subscriptions


-- 
Mailing list: https://launchpad.net/~sts-sponsors
Post to     : sts-sponsors@lists.launchpad.net
Unsubscribe : https://launchpad.net/~sts-sponsors
More help   : https://help.launchpad.net/ListHelp
