** Description changed:

  [impact]
  
  boot-smoke test reboots 5 times and verifies systemd is fully started up
- after each boot, but only gives 35 seconds for each boot.  On loaded
- systems this is too short.
+ after each boot, including checking if there are any running jobs (with
+ list-jobs).  However, this test makes the assumption that no further
+ jobs will be started after systemd reaches 'running' (or 'degraded')
+ state, which is a false assumption.
  
  [test case]
  
  see various boot-smoke failures in autopkgtest.ubuntu.com
  
  [regression potential]
  
- longer autopkgtest times.
+ possible false-positive or false-negative autopkgtest results.
  
  [other info]
  
- i can't reproduce this failure locally, but it seems to happen
- intermittently on the adt setup.  Therefore, I don't know for sure that
- the short timeout is actually the cause of the problem, but it certainly
- seems likely - 35 seconds really isn't very long for a full reboot and
- for systemd to finish starting all services, especially on the highly
- loaded autopkgtest.ubuntu.com systems.
+ The problem appears to be that systemd reaches 'running' (or 'degraded')
+ state, and then other systemd services are started.  This confuses the
+ boot-smoke test, because it sees that 'is-system-running' is done, but
+ then it sees running jobs, which fails the test.
  
- There should be no harm, other than delaying an actual failure, from
- extending the timeout.  The test case checks each second if all services
- have finished starting, so on success case it won't wait any longer than
- it currently does.
+ What is starting jobs after systemd reaches running state appears to be
+ X inside the test system.  There are various services started by gnome-
+ session and dbus-daemon.  Additionally, from the artifacts of one
+ example:
+ 
+ 
+ https://objectstorage.prodstack4-5.canonical.com/v1/AUTH_77e2ada1e7a84929a74ba3b87153c0ac/autopkgtest-bionic/bionic/i386/s/systemd/20190416_171327_478f6@/artifacts.tar.gz
+ 
+ the artifacts/journal.txt shows that after the boot-smoke test causes
+ the reboot and then re-ssh into the system after the reboot, it only
+ gives the test system 9 seconds before deciding it has failed, and only
+ 4 seconds after ssh'ing into the rebooted test system.
+ 
+ While increasing the timeout isn't guaranteed to stop the boot-smoke
+ failures due to still-running jobs, the logs show it certainly should
+ help.
+ 
+ If we continue to get failures for still-running jobs, it probably
+ should just be made a non-failing check.

-- 
You received this bug notification because you are a member of Ubuntu
Touch seeded packages, which is subscribed to systemd in Ubuntu.
https://bugs.launchpad.net/bugs/1825997

Title:
  boot-smoke fails due to running jobs

Status in systemd package in Ubuntu:
  In Progress
Status in systemd source package in Bionic:
  In Progress
Status in systemd source package in Cosmic:
  In Progress
Status in systemd source package in Disco:
  In Progress
Status in systemd source package in Eoan:
  In Progress

Bug description:
  [impact]

  The boot-smoke test reboots 5 times and verifies that systemd has
  fully started after each boot, including checking for running jobs
  (with list-jobs).  However, the test assumes that no further jobs
  will be started after systemd reaches the 'running' (or 'degraded')
  state, and that assumption is false.

  [test case]

  see various boot-smoke failures in autopkgtest.ubuntu.com

  [regression potential]

  possible false-positive or false-negative autopkgtest results.

  [other info]

  The problem appears to be that systemd reaches the 'running' (or
  'degraded') state, and then further systemd services are started.
  This confuses the boot-smoke test: it sees that 'is-system-running'
  is done, but then it finds running jobs, which fails the test.

  The jobs started after systemd reaches the running state appear to
  come from X inside the test system: various services are started by
  gnome-session and dbus-daemon.  Additionally, from the artifacts of
  one example:

  
  https://objectstorage.prodstack4-5.canonical.com/v1/AUTH_77e2ada1e7a84929a74ba3b87153c0ac/autopkgtest-bionic/bionic/i386/s/systemd/20190416_171327_478f6@/artifacts.tar.gz

  artifacts/journal.txt shows that, after the boot-smoke test triggers
  the reboot and ssh access to the system is re-established, the test
  system gets only 9 seconds before the test decides it has failed,
  and only 4 seconds after ssh'ing into the rebooted test system.

  The timeout waiting for is-system-running is probably fine; what is
  needed is a second timeout while checking list-jobs, once we know
  that the system is running.  That second timeout should give any
  jobs started after the system reached the running state time to
  complete.
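
The two-stage wait proposed above can be sketched roughly as follows. This is only an illustration in Python, not the boot-smoke test itself (which is not shown in this report): the function names, the 30-second jobs timeout, and the injectable run/sleep parameters are all assumptions. The 35-second boot timeout is the figure quoted in the earlier description, and the sketch assumes that 'systemctl list-jobs --no-legend' prints nothing when no jobs are queued.

```python
import subprocess
import time


def run_systemctl(args):
    """Run systemctl and return its stripped stdout.

    The exit status is ignored on purpose: is-system-running exits
    non-zero for any state other than 'running' (e.g. 'degraded').
    """
    proc = subprocess.run(
        ["systemctl", *args],
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        text=True,
        check=False,
    )
    return proc.stdout.strip()


def wait_until(predicate, timeout, interval=1.0, sleep=time.sleep):
    """Poll predicate about once per interval, for up to timeout seconds."""
    attempts = max(1, int(timeout / interval))
    for _ in range(attempts):
        if predicate():
            return True
        sleep(interval)
    return False


def boot_settled(run=run_systemctl, boot_timeout=35, jobs_timeout=30,
                 sleep=time.sleep):
    """Return True once the system is up AND its job queue has drained.

    Stage 1 waits for 'running' or 'degraded'.  Stage 2 gets its own
    (hypothetical) timeout, so jobs started after stage 1 succeeded
    have time to finish instead of failing the check immediately.
    """
    def booted():
        return run(["is-system-running"]) in ("running", "degraded")

    def no_jobs():
        # Assumption: empty output means no jobs are queued.
        return run(["list-jobs", "--no-legend"]) == ""

    if not wait_until(booted, boot_timeout, sleep=sleep):
        return False
    return wait_until(no_jobs, jobs_timeout, sleep=sleep)
```

Because both stages poll once per second, a healthy boot returns as soon as the job queue is empty; the extra timeout only costs time when jobs really are still running.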

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1825997/+subscriptions

-- 
Mailing list: https://launchpad.net/~touch-packages
Post to     : touch-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~touch-packages
More help   : https://help.launchpad.net/ListHelp
