Re: +1 maintenance: 6-10 March 2023
Hi,

On Mon, Mar 13, 2023 at 2:00 AM Steve Langasek wrote:
>
> Hi Andreas,
>
> I notice that most of these packages mentioned in your report are in main.
> Were there particular reasons for you to focus on these during +1
> maintenance? (You mention "not so" random retries)

I started with the find-proposed-cluster script, which flagged java
packages, and then I went to the excuses-by-team page[1]. I guess that
directed me more to main packages indeed. And I totally forgot about NBS
packages this time around.

> While we of course need to take care of the packages in main, a reminder
> that the expectation is that the engineering teams responsible for those
> packages will in general take care of them outside of +1 maintenance, with
> +1 maintenance focused on driving down the -proposed queue overall, with, if
> anything, a de-emphasis on main.

Noted.

--
ubuntu-devel mailing list
ubuntu-devel@lists.ubuntu.com
Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/ubuntu-devel
Re: +1 maintenance: 6-10 March 2023
Hi Andreas,

I notice that most of these packages mentioned in your report are in main.
Were there particular reasons for you to focus on these during +1
maintenance? (You mention "not so" random retries)

While we of course need to take care of the packages in main, a reminder
that the expectation is that the engineering teams responsible for those
packages will in general take care of them outside of +1 maintenance, with
+1 maintenance focused on driving down the -proposed queue overall, with, if
anything, a de-emphasis on main.

On Sun, Mar 12, 2023 at 03:15:19PM -0300, Andreas Hasenack wrote:
> Hi,
>
> this is my report. I skipped March 9th to do some SRU work.
>
> # php-net-ldap2 blocking openldap
> bug #2008825, fixed and uploaded
>
> # strongswan DEP8 test
> I'm the one who introduced the new DEP8 tests to strongswan, and
> noticed some flakiness, but I believe this was caused by the overall
> slowness in fetching packages from the ftpmaster server. I increased
> the time I was giving the lxd container to get ready (I need to
> install packages in it for the test), and it looks better now.
>
> # (Not so) random retries
> netplan.io
> openssh: test logs were showing lots of ^@^@^@ characters, indicating
> some sort of corruption. A retry worked.
> firewalld: timeout
> zfs-linux: was failing on "badpkg" errors. Retry worked.
>
> # crash
> Sometimes failed on fetching the gpg key needed for the ddeb
> repository, other times it was getting a 503 error from apt. Retries
> eventually passed these errors, but the test still failed. Took me a
> long time to get a test setup ready:
> - the kernel ddeb it downloads is huge, and was taking a very long
> time. Mentioned to ~is, as I was getting around 200kb/s.
> This was before I investigated the DEP8 slowness, maybe it was related
> - lunar lxd vms weren't booting, had to disable secure boot
> - lunar lxd vms are using the kvm kernel by default, which doesn't
> have KCORE enabled (had to investigate that until I finally saw it was
> the kernel flavor that didn't enable it). Switched to generic.
> - ddebs.ubuntu.com is flaky and slow, so each test run was taking a
> long time (there is an internal RT ticket about that, from 7 months
> ago)
> - finally got a test setup ready and showing a problem in lunar:
>
> ubuntu@l-crash:~$ sudo crash -st /usr/lib/debug/boot/vmlinux-6.1.0-16-generic
> WARNING: /usr/lib/debug/boot/vmlinux-6.1.0-16-generic
> and /proc/version do not match!
>
> WARNING: /proc/version indicates kernel version: 6.1.0-16-generic
>
> crash: please use the vmlinux file for that kernel version, or try using
> the System.map for that kernel version as an additional argument.
>
> It is the same version:
> ubuntu@l-crash:~$ uname -a
> Linux l-crash 6.1.0-16-generic #16-Ubuntu SMP PREEMPT_DYNAMIC Fri Feb
> 24 14:37:30 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
>
> Since those versions look the same to me, I don't know what is going
> on, and filed https://bugs.launchpad.net/ubuntu/+source/crash/+bug/2009595
> with the update-excuse tag.
>
>
> # mysql-8.0
> Failing a lot on amd64, also taking a long time to run.
>
> Tests on arm64, where mysql-8.0 is already on big_packages, are taking
> around 1h30 to run, even less than 1h most of the time.
> On amd64 they take 2h or more, and usually fail with a timeout of
> "kind test" (so not related to the ftpmaster slowness), where the
> timeout is the default 10k seconds (~2h47).
> I checked a few logs and the mysql amd64 test suite was killed at
> different stages due to the timeout: 46%, 88%, 74%, 90%
> Added mysql-8.0/amd64 to big_packages:
> https://code.launchpad.net/~ahasenack/autopkgtest-cloud/+git/autopkgtest-package-configs/+merge/438389
> After the change, retriggered amd64 and it passed, quickly.
> Also retriggered s390x, there was one failure, not related to a
> timeout, but infra. There it's still failing, and taking a long time
> (3h). Looking at the log, the test was killed mid-apt-download:
>
> Get:13 http://ftpmaster.internal/ubuntu lunar/universe s390x
> mysql-testsuite-8.0 s390x 8.0.32-0ubuntu4 [357 MB]
> autopkgtest-virt-ssh [11:46:34]: --- nova console-log
> 014de5df-c6ce-484a-a082-7f6ef2ccb8d6
> (adt-lunar-s390x-mysql-8.0-20230312-020749-lrg-root4) --
>
> So another victim of poor network performance to ftpmaster.internal.
>
> # glib2
> glib2 blocking netplan.io, let's see
> - auto-multiple-choice dep8 test, gets killed due to timeout kind:
> install. It was killed mid apt-get install, while downloading packages
> https://autopkgtest.ubuntu.com/results/autopkgtest-lunar/lunar/arm64/a/auto-multiple-choice/20230306_200519_8770d@/log.gz
>
> # chrony
> Blocking tzdata
> Two chrony tests are consistently failing on arm64, armhf, i386 and
> ppc64el: 113-leapsecond and 124-tai
> I picked the easiest of those arches to try to reproduce it locally
> (arm64 it is), but it always works. Works in my pi4, on another bare
> metal arm64, on a random arm64 VM someone
+1 maintenance: 6-10 March 2023
Hi,

this is my report. I skipped March 9th to do some SRU work.

# php-net-ldap2 blocking openldap
bug #2008825, fixed and uploaded

# strongswan DEP8 test
I'm the one who introduced the new DEP8 tests to strongswan, and
noticed some flakiness, but I believe this was caused by the overall
slowness in fetching packages from the ftpmaster server. I increased
the time I was giving the lxd container to get ready (I need to
install packages in it for the test), and it looks better now.

# (Not so) random retries
netplan.io
openssh: test logs were showing lots of ^@^@^@ characters, indicating
some sort of corruption. A retry worked.
firewalld: timeout
zfs-linux: was failing on "badpkg" errors. Retry worked.

# crash
Sometimes failed on fetching the gpg key needed for the ddeb
repository, other times it was getting a 503 error from apt. Retries
eventually passed these errors, but the test still failed. Took me a
long time to get a test setup ready:
- the kernel ddeb it downloads is huge, and was taking a very long
time. Mentioned to ~is, as I was getting around 200kb/s. This was
before I investigated the DEP8 slowness, maybe it was related
- lunar lxd vms weren't booting, had to disable secure boot
- lunar lxd vms are using the kvm kernel by default, which doesn't
have KCORE enabled (had to investigate that until I finally saw it was
the kernel flavor that didn't enable it). Switched to generic.
- ddebs.ubuntu.com is flaky and slow, so each test run was taking a
long time (there is an internal RT ticket about that, from 7 months
ago)
- finally got a test setup ready and showing a problem in lunar:

ubuntu@l-crash:~$ sudo crash -st /usr/lib/debug/boot/vmlinux-6.1.0-16-generic
WARNING: /usr/lib/debug/boot/vmlinux-6.1.0-16-generic and /proc/version do not match!

WARNING: /proc/version indicates kernel version: 6.1.0-16-generic

crash: please use the vmlinux file for that kernel version, or try using
the System.map for that kernel version as an additional argument.

It is the same version:
ubuntu@l-crash:~$ uname -a
Linux l-crash 6.1.0-16-generic #16-Ubuntu SMP PREEMPT_DYNAMIC Fri Feb 24 14:37:30 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

Since those versions look the same to me, I don't know what is going
on, and filed https://bugs.launchpad.net/ubuntu/+source/crash/+bug/2009595
with the update-excuse tag.

# mysql-8.0
Failing a lot on amd64, also taking a long time to run.

Tests on arm64, where mysql-8.0 is already on big_packages, are taking
around 1h30 to run, even less than 1h most of the time.
On amd64 they take 2h or more, and usually fail with a timeout of
"kind test" (so not related to the ftpmaster slowness), where the
timeout is the default 10k seconds (~2h47).
I checked a few logs and the mysql amd64 test suite was killed at
different stages due to the timeout: 46%, 88%, 74%, 90%
Added mysql-8.0/amd64 to big_packages:
https://code.launchpad.net/~ahasenack/autopkgtest-cloud/+git/autopkgtest-package-configs/+merge/438389
After the change, retriggered amd64 and it passed, quickly.
Also retriggered s390x, there was one failure, not related to a
timeout, but infra. There it's still failing, and taking a long time
(3h). Looking at the log, the test was killed mid-apt-download:

Get:13 http://ftpmaster.internal/ubuntu lunar/universe s390x mysql-testsuite-8.0 s390x 8.0.32-0ubuntu4 [357 MB]
autopkgtest-virt-ssh [11:46:34]: --- nova console-log 014de5df-c6ce-484a-a082-7f6ef2ccb8d6 (adt-lunar-s390x-mysql-8.0-20230312-020749-lrg-root4) --

So another victim of poor network performance to ftpmaster.internal.

# glib2
glib2 blocking netplan.io, let's see
- auto-multiple-choice dep8 test, gets killed due to timeout kind:
install.
It was killed mid apt-get install, while downloading packages
https://autopkgtest.ubuntu.com/results/autopkgtest-lunar/lunar/arm64/a/auto-multiple-choice/20230306_200519_8770d@/log.gz

# chrony
Blocking tzdata
Two chrony tests are consistently failing on arm64, armhf, i386 and
ppc64el: 113-leapsecond and 124-tai
I picked the easiest of those arches to try to reproduce it locally
(arm64 it is), but it always works. Works in my pi4, on another bare
metal arm64, on a random arm64 VM someone let me use. I don't know
what's going on. This test suite runs each test multiple times (20),
and tolerates up to 2 failures. But these two tests are failing all
runs in the DEP8 infrastructure, and locally they all pass.

# openjdk-XX
Tests affected by the ftpmaster.internal slowness.

# python3.11
$ python3-dbg-config --cflags --libs
/usr/bin/python3-dbg-config: 117: Syntax error: Unterminated quoted string

I filed https://bugs.launchpad.net/ubuntu/+source/python3.11/+bug/2009967
with a patch. I would prefer if foundations uploaded this, as the
package is currently a sync, and I don't want to inadvertently start
another python-related massive DEP8 run.

[1] https://lists.ubuntu.com/archives/ubuntu-devel/2023-March/042500.html
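
For reference, the pass/fail rule described in the chrony section (each test run 20 times, up to 2 failures tolerated) can be sketched roughly as below. This is only an illustration in Python, not chrony's actual test harness; the function name `run_with_tolerance` and its parameters are hypothetical.

```python
from typing import Callable


def run_with_tolerance(
    test: Callable[[], bool],
    runs: int = 20,          # number of repetitions, as described in the report
    max_failures: int = 2,   # failures tolerated before the test counts as failed
) -> bool:
    """Run `test` `runs` times and pass if at most `max_failures` runs fail."""
    failures = 0
    for _ in range(runs):
        if not test():
            failures += 1
    return failures <= max_failures


if __name__ == "__main__":
    # A test that fails every run (the situation seen in the DEP8 infrastructure).
    print(run_with_tolerance(lambda: False))
    # A test that passes every run (the situation seen locally).
    print(run_with_tolerance(lambda: True))
```

Under this rule a test that fails all 20 runs cannot be explained by ordinary flakiness, which fits the observation that something differs between the DEP8 infrastructure and the local machines.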