[Bug 1632252] Re: linux autopkgtest kills sshd in testbed
Indeed we did get a "proper" timeout now \o/

https://objectstorage.prodstack4-5.canonical.com/v1/AUTH_77e2ada1e7a84929a74ba3b87153c0ac/autopkgtest-yakkety/yakkety/amd64/l/linux/20161018_180219_53396@/log.gz

So closing this one, and using bug 1634519 for the new timeout.

** Changed in: linux (Ubuntu Yakkety)
       Status: Confirmed => Fix Released

** Changed in: linux (Ubuntu)
       Status: Confirmed => Fix Released

--
You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1632252

Title:
  linux autopkgtest kills sshd in testbed

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1632252/+subscriptions

--
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1632252] Re: linux autopkgtest kills sshd in testbed
It's better now: in my local QEMU test the last output is now

  14:46:41 DEBUG| [stdout] Test icebp [Ok]
  14:46:41 DEBUG| [stdout] Test int 3 trap [Ok]
  14:46:41 DEBUG| [stdout] selftests: breakpoint_test [PASS]

and since then (1 hour) it's hung. But now I can still log into ttyS0.

dmesg is almost empty (something in the test clears the ring buffer):

  [ 5501.499217] ata2.01: NODEV after polling detection
  [ 5501.500138] ata2.00: configured for MWDMA2

and journalctl confirms that suspend/resume worked fine:

  Oct 18 14:47:11 autopkgtest kernel: PM: suspend of devices complete after 65.273 msecs
  Oct 18 14:47:11 autopkgtest kernel: PM: late suspend of devices complete after 0.172 msecs
  Oct 18 14:47:11 autopkgtest kernel: PM: noirq suspend of devices complete after 1.963 msecs
  Oct 18 14:47:11 autopkgtest kernel: ACPI: Preparing to enter system sleep state S3
  Oct 18 14:47:11 autopkgtest kernel: PM: Saving platform NVS memory
  Oct 18 14:47:11 autopkgtest kernel: Disabling non-boot CPUs ...
  Oct 18 14:47:11 autopkgtest kernel: kvm-clock: cpu 0, msr 1:3fff4001, primary cpu clock, resume
  Oct 18 14:47:11 autopkgtest kernel: ACPI: Low-level resume complete
  Oct 18 14:47:11 autopkgtest kernel: PM: Restoring platform NVS memory
  Oct 18 14:47:11 autopkgtest kernel: ACPI: Waking up from system sleep state S3
  Oct 18 14:47:11 autopkgtest kernel: PM: noirq resume of devices complete after 6.973 msecs
  Oct 18 14:47:11 autopkgtest kernel: PM: early resume of devices complete after 0.105 msecs
  Oct 18 14:47:11 autopkgtest kernel: pci :00:01.0: PIIX3: Enabling Passive Release
  Oct 18 14:47:11 autopkgtest kernel: rtc_cmos 00:00: System wakeup disabled by ACPI
  Oct 18 14:47:11 autopkgtest kernel: PM: resume of devices complete after 8.898 msecs
  Oct 18 14:47:11 autopkgtest kernel: PM: Finishing wakeup.
  Oct 18 14:47:11 autopkgtest systemd[1]: Time has been changed
  Oct 18 14:47:11 autopkgtest systemd[1]: apt-daily.timer: Adding 10h 56min 28.694439s random time.
  Oct 18 14:47:11 autopkgtest systemd[3986]: Time has been changed
  Oct 18 14:47:11 autopkgtest kernel: Restarting tasks ... done.
  Oct 18 14:47:11 autopkgtest sudo[30731]: pam_unix(sudo:session): session closed for user root
  Oct 18 14:47:11 autopkgtest kernel: ata2.01: NODEV after polling detection
  Oct 18 14:47:11 autopkgtest kernel: ata2.00: configured for MWDMA2

But also, no messages beyond that (last message one hour ago).

Colin wants a new bug for this, so I filed bug 1634519.
[Bug 1632252] Re: linux autopkgtest kills sshd in testbed
So I've added a fix to the autotest client tests that changes the wakealarm sleep time to 30 seconds. The test VM now gets woken up and the test no longer hangs forever at the point reported in comment #3.

Fix committed:
http://kernel.ubuntu.com/git/ubuntu/autotest-client-tests.git/commit/?id=cad22c2ed884014b2b2ad88deb6f5535ad580689

Please re-run and let me know if there are further problems and/or if this fix solves the issue for you.
[Bug 1632252] Re: linux autopkgtest kills sshd in testbed
So the linux kernel regression test ./linux/tools/testing/selftests/breakpoints/step_after_suspend_test is being run and does not return. However, when run in a clean instance it does pass.

It basically sets the RTC to wake in 5 seconds and then suspends; the RTC wakealarm is what wakes the system up again. Perhaps the 5 second RTC wakealarm is too short when the system is a bit loaded and takes more than 5 seconds to get to sleep. In that case the wakealarm has already fired by the time the system reaches deep sleep, so nothing wakes it up.
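The hypothesis above can be checked against the two kernel log timestamps quoted elsewhere in this thread ("PM: Syncing filesystems ... done." at 3387.159435 and "PM: Preparing system for sleep (mem)" at 3393.174633). The following is only a rough Python model, not the selftest's code: the function name is made up for illustration, and it assumes the alarm is armed at roughly the moment the suspend request starts, so the gap between the two log lines is a lower bound on how long the system needs before it is actually asleep.

```python
# Illustrative model of the suspected race; not taken from the kernel selftest.
# Timestamps are the two klog lines quoted in this thread (seconds since boot).
SYNC_START_S = 3387.159435  # "PM: Syncing filesystems ... done."
PREPARE_S = 3393.174633     # "PM: Preparing system for sleep (mem)"


def alarm_fires_too_early(alarm_delay_s: float, suspend_latency_s: float) -> bool:
    """Return True if the wake alarm expires before the system is asleep.

    Assumption: the alarm is armed when the suspend request starts, so an
    alarm delay no longer than the suspend latency means a missed wakeup.
    """
    return alarm_delay_s <= suspend_latency_s


# Lower bound on the time this testbed needed to start going to sleep.
latency = PREPARE_S - SYNC_START_S

for delay in (5.0, 30.0):
    if alarm_fires_too_early(delay, latency):
        verdict = "alarm already fired, nothing wakes the VM"
    else:
        verdict = "alarm fires after suspend, VM wakes up"
    print(f"alarm={delay:.0f}s latency>={latency:.3f}s -> {verdict}")
```

Under these assumptions the observed gap of about six seconds is already longer than a 5 second alarm, while a 30 second alarm leaves a comfortable margin.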
[Bug 1632252] Re: linux autopkgtest kills sshd in testbed
OK, so S3 sleep works OK in the VM, going to debug step_after_susp /sys/power/state
[Bug 1632252] Re: linux autopkgtest kills sshd in testbed
I'm not able to reproduce the issues you are seeing, however I do see:

  16:01:15 DEBUG| [stdout] Test read watchpoint 3 with len: 8 local: 1 global: 1 [Ok]
  16:01:15 DEBUG| [stdout] Test icebp [Ok]
  16:01:15 DEBUG| [stdout] Test int 3 trap [Ok]
  16:01:15 DEBUG| [stdout] selftests: breakpoint_test [PASS]

and one of the last processes to be exec'd is:

  step_after_susp /sys/power/state

and klog is showing:

  [ 3387.159435] PM: Syncing filesystems ... done.
  [ 3393.174633] PM: Preparing system for sleep (mem)

and then it does not wake up from suspend.
[Bug 1632252] Re: linux autopkgtest kills sshd in testbed
..and the last things to be exec'd are:

  16:01:15 exit 1165 00.037 ./breakpoint_test
  16:01:15 exit 1164 00.043 /bin/sh -c for TEST in breakpoint_test step_after_suspend_test; do (./$TEST && echo "selftests: $TEST [PASS]") || echo "selftests: $TEST [FAIL]"; done;
  16:01:15 exit 1166 00.036 ./breakpoint_test
  Time Event PID Info Duration Process
  16:01:15 fork 1163 parent /bin/sh -c for TEST in breakpoint_test step_after_suspend_test; do (./$TEST && echo "selftests: $TEST [PASS]") || echo "selftests: $TEST [FAIL]"; done;
  16:01:15 fork 1167 child /bin/sh -c for TEST in breakpoint_test step_after_suspend_test; do (./$TEST && echo "selftests: $TEST [PASS]") || echo "selftests: $TEST [FAIL]"; done;
  16:01:15 fork 1167 parent /bin/sh -c for TEST in breakpoint_test step_after_suspend_test; do (./$TEST && echo "selftests: $TEST [PASS]") || echo "selftests: $TEST [FAIL]"; done;
  16:01:15 fork 1168 child /bin/sh -c for TEST in breakpoint_test step_after_suspend_test; do (./$TEST && echo "selftests: $TEST [PASS]") || echo "selftests: $TEST [FAIL]"; done;
  16:01:15 exec 1168 ./step_after_suspend_test
  16:01:21 fork 333 parent /lib/systemd/systemd-udevd
  16:01:21 fork 1169 child /lib/systemd/systemd-udevd
  16:01:21 fork 1169 parent /lib/systemd/systemd-udevd
  16:01:21 fork 1170 child /lib/systemd/systemd-udevd
  16:01:21 fork 333 parent /lib/systemd/systemd-udevd
  16:01:21 fork 1171 child /lib/systemd/systemd-udevd

So, I'm basically ssh'ing to the running qemu instance and running three tools in separate terminals (dmesg -w, forkstat and fnotifystat) to see what happens before it gets stuck.
[Bug 1632252] Re: linux autopkgtest kills sshd in testbed
With the QEMU runner this gets further, but it fails for me with

  14:33:01 DEBUG| Running 'git clone https://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/yakkety linux'
  14:33:01 ERROR| [stderr] Cloning into 'linux'...
  autopkgtest [16:02:54]: ERROR: timed out on command "..." (kind: test)
  autopkgtest [16:02:54]: test ubuntu-regression-suite: ---]
  Exception in thread copyin: OSError: [Errno 28] No space left on device

Which raises two questions:

(1) Why is this timing out? Shouldn't failures like this cause an immediate exit? ISTM that the test suite is very prone to just hanging when anything goes wrong; can this be made more robust somehow?

(2) Is it really necessary to clone the entire kernel for the test? This will both take ages (it is running through a proxy in the infra!) and also take lots of disk space. If this just needs a few files, can you get them individually instead? If this needs to compile some helper/test binaries, can this happen during package build and you ship them in linux-source, linux-tools-XX, or maybe the -dbgsym package?
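On question (1), one generic way to get an immediate exit is to give every step its own time budget and turn both a non-zero exit status and an expired budget into an error straight away. The Python sketch below only illustrates that pattern; it is not how autopkgtest or the kernel test suite is implemented, and `StepFailed` and `run_step` are hypothetical names.

```python
import subprocess
import sys


class StepFailed(RuntimeError):
    """Raised as soon as a step exits non-zero or exceeds its time budget."""


def run_step(argv, timeout_s):
    """Run one step and return its stdout, or raise StepFailed immediately."""
    try:
        done = subprocess.run(
            argv,
            capture_output=True,  # keep stdout/stderr for the error message
            text=True,
            timeout=timeout_s,    # per-step budget instead of one global timeout
            check=True,           # non-zero exit status raises CalledProcessError
        )
    except subprocess.CalledProcessError as err:
        raise StepFailed(
            f"{argv[0]} exited with status {err.returncode}: {err.stderr.strip()}"
        ) from err
    except subprocess.TimeoutExpired as err:
        raise StepFailed(f"{argv[0]} timed out after {timeout_s}s") from err
    return done.stdout


if __name__ == "__main__":
    # Stand-in for a real step such as the 'git clone' quoted above.
    print(run_step([sys.executable, "-c", "print('step ok')"], timeout_s=30), end="")
```

With a wrapper of this kind, a failure such as the "No space left on device" error during the clone would surface as soon as the command exits, rather than only when the overall test timeout expires.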
[Bug 1632252] Re: linux autopkgtest kills sshd in testbed
** Changed in: linux (Ubuntu Yakkety)
     Assignee: (unassigned) => Colin Ian King (colin-king)
[Bug 1632252] Re: linux autopkgtest kills sshd in testbed
After the hang, even SysRq doesn't work (I tried "sync" with Ctrl+A b s -- Ctrl+A b is the QEMU console key combo for sending SysRq, see Ctrl-A ?)
[Bug 1632252] Re: linux autopkgtest kills sshd in testbed
I am able to reproduce this locally by using the "ssh" runner on a manually started QEMU instance, instead of the "qemu" runner directly; so this is much easier to investigate.

First this needs a small new feature in autopkgtest's ssh runner: https://anonscm.debian.org/cgit/autopkgtest/autopkgtest.git/commit/?id=9aa6fdbef . Easiest to just run autopkgtest from git (see reproducer below).

 * Take a standard autopkgtest yakkety image (autopkgtest-buildvm-ubuntu-cloud) and run it in QEMU:

   qemu-system-x86_64 -enable-kvm -m 4096 -smp 2 -nographic -drive file=path/to/autopkgtest-yakkety-amd64.img,if=virtio -net nic,model=virtio -net user,hostfwd=tcp::22000-:22

 * Log in (ubuntu/ubuntu) and scp/install your host's ssh key into ~/.ssh/authorized_keys

 * Run the test:

   git clone https://anonscm.debian.org/git/autopkgtest/autopkgtest.git
   autopkgtest/runner/autopkgtest --testname ubuntu-regression-suite linux -- ssh -H localhost -l ubuntu -p 22000 --reboot --capability=isolation-machine --capability=revert --capability=revert-full-system

This takes an hour or so, then the test fails with "testbed failure: testbed auxverb failed with exit code 255" and the console (in qemu) is completely dead.

So notably this fails later than on the production infra (where it OOM-kills sshd during AppArmor tests); here it fails right after

  12:39:40 DEBUG| [stdout] selftests: breakpoint_test [PASS]

but this could be variance due to the infra instances having more CPUs and memory, or it's just an artifact of truncating the log earlier.
[Bug 1632252] Re: linux autopkgtest kills sshd in testbed
It is still happening. I was running journalctl -f on the testbed while it ran, and was able to copy the last 9000 lines of scrollback from tmux.

** Attachment added: "journal"
   https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1632252/+attachment/4762321/+files/journal.txt
[Bug 1632252] Re: linux autopkgtest kills sshd in testbed
Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v4.8 kernel[0].

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.8

** Changed in: linux (Ubuntu)
   Importance: Undecided => Medium

** Tags added: kernel-da-key

** Also affects: linux (Ubuntu Yakkety)
   Importance: Medium
       Status: Confirmed