[Bug 1940723] Re: GRUB (re)installation failing due to stale grub-{pc,efi}/install_devices
See also: https://bugs.launchpad.net/ubuntu/+source/grub2/+bug/2083176
[Bug 2083176] Re: grub-efi/install_devices becoming stale due to by-id/nvme-eui.* symlinks disappearing
I looked into this a few months ago for slightly different reasons (juju/maas getting confused and failing to identify a disk, due to differing kernels being used for install vs boot). I can confirm that, at the time, I found the nvme by-id symlinks change due to backporting of the NVME_QUIRK_BOGUS_NID quirk. Unfortunately, backports of this quirk for assorted SSD models have regularly been made to the linux-stable kernels upstream.

I ran out of time to follow up on this at the time, but this practice probably needs to be raised with the upstream kernel and possibly needs to stop, and/or some solution for the symlinks needs to happen. I didn't get as far as fully understanding why the bogus NID matters, what it breaks, or what the change fixes.

There are a couple of other open bugs related to this issue, e.g. where it also breaks on upgrade:
https://bugs.launchpad.net/ubuntu/+source/grub2/+bug/2039108
https://bugs.launchpad.net/ubuntu/+source/grub2/+bug/1940723

In my juju/maas case this was also happening with VirtIO SCSI devices, not a real SSD, as those were also quirked. That may provide a way to reproduce the issue without one of the affected SSDs.

Possibly related links I collected:
https://lore.kernel.org/all/20220606064055.ga2...@lst.de/T/#madf46b0ae9d07405bad2e324cb782c477e7518b2
https://bugs.launchpad.net/curtin/+bug/2015100
https://bugzilla.redhat.com/show_bug.cgi?id=2031810
https://bugzilla.kernel.org/show_bug.cgi?id=217981
https://www.truenas.com/community/threads/bluefin-to-cobia-rc1-drive-now-fails-with-duplicate-ids.113205/

** Bug watch added: Red Hat Bugzilla #2031810
   https://bugzilla.redhat.com/show_bug.cgi?id=2031810
** Bug watch added: Linux Kernel Bug Tracker #217981
   https://bugzilla.kernel.org/show_bug.cgi?id=217981
[Bug 2064717] Re: ceph-volume needs "packaging" and "ceph" modules
Note: This issue is more impactful than I initially realised. I was thinking it was mainly an issue on initial deploy, but if you upgrade your deployment to 18.2.4 and then reboot a node, the OSDs won't start, because the ceph-volume tool is needed to activate the OSDs.

** Changed in: cloud-archive/bobcat
   Importance: Undecided => High
** Changed in: cloud-archive/bobcat
   Status: New => Confirmed
[Bug 2064717] Re: ceph-volume needs "packaging" and "ceph" modules
OK, well we have now learnt that only upgrading (rather than also doing a fresh deployment), and only running the ceph-mon tests, is not enough. Indeed, let's work on a more concrete/full test plan. I have some strong thoughts on that, so I will discuss with you and Utkarsh, etc.

Luciano: in the meantime, can you prioritise a ceph 18.2.4 SRU to fix this regression please (not the charm fix)? Like this week, preferably. We have customers actually using this and affected by it.
[Bug 2064717] Re: ceph-volume needs "packaging" and "ceph" modules
I suspect the reason this was not picked up in the SRU test is possibly that code from the Squid charm was used in the test instead of the Reef charm.

The Squid charm merged a "tactical fix" to manually install python3-packaging in this change:
https://review.opendev.org/c/openstack/charm-ceph-osd/+/918992

But not for Reef; it was originally proposed there but abandoned when we thought it wasn't needed for Reef:
https://review.opendev.org/c/openstack/charm-ceph-osd/+/919794

The Squid charm supports running/installing Reef because you're expected to upgrade the charm before Ceph itself to orchestrate an upgrade. So both the Reef and Squid charm branches have a test for Reef (tests/bundles/jammy-bobcat).

IMHO merging this charm change was a bad idea, and it should be reverted once all the packages are fixed. The package should simply have been fixed immediately in the first instance. While I can appreciate this might have been done as a stop-gap to get the charm CI working while the issue was not yet fixed in an SRU, the problem is that we are using the charm tests to verify the SRU of the Ubuntu package, which is potentially (and actually, even in the cloud-archive) used by people without the charms, so this is likely to hide such an issue, as it did here. It also means we don't have a functional test that actually checks the issue is fixed, in both the Reef and Squid SRUs.

I can't quite figure out exactly how this test was done, though. The original message said it was tested with the ceph-osd charm tests, but the zaza.openstack.charm_tests.ceph.tests.CephPrometheusTest test listed in the output only exists in charm-ceph-mon. And those tests all use the reef branch of the charm. I am guessing that, since we had to test with bobcat-proposed, maybe the squid branch was used with openstack-origin overridden to bobcat-proposed or something? Luciano: it would be great if you could clarify/reverse-engineer exactly how you managed to do this so we can learn for next time. I also wonder if we'd be better off using charmed-openstack-tester or something like that, instead of purely the charm-ceph-mon tests, for validating SRUs?

A few possible lessons for future SRU verification:

- We need to ensure we verify SRUs with all git/charmhub branches of the charm that support a release. Generally that would be both the matching and the newer version; it is not sufficient to check only one of those.
- Thinking more about the charm users that are the majority, ideally we also need to run both the charmhub stable AND candidate channels for both of those releases. Currently the test bundles use the '/edge' channel (which maps to candidate), so they only test the candidate charm and won't show whether we're about to release a package that is broken with the stable charms. Especially for the latest release of Ceph, due to the Solutions QA process, the stable channel sometimes lags the edge channel by weeks or even months, so this is not unlikely.
- Using the charm tests to verify the Ubuntu package in general has limits, in that it may miss scenarios that would still affect non-charm users. I am not proposing we stop using it, but we should be aware of that.
[Bug 2064717] Re: ceph-volume needs "packaging" and "ceph" modules
I discovered this issue myself (for Reef, 18.2.4) today when running the zaza integration test for charm-glance-simplestreams-sync against jammy-bobcat.

According to the SRU, the charm-ceph-osd tests were run, and the package version was verified. The question is: why did those tests not catch this? When I run the zaza test for charm-ceph-osd in the stable/reef branch, it also fails with this issue. I see the exact same version installed as reported in the SRU bug:

juju ssh ceph-osd/0 sudo ceph -v
ceph version 18.2.4 (e7ad5345525c7aa95470c26863873b581076945d) reef (stable)

So I am really curious to understand why the test passed previously.

** Also affects: cloud-archive
   Importance: Undecided
   Status: New
** Also affects: cloud-archive/bobcat
   Importance: Undecided
   Status: New
[Bug 2062927] Re: Ambiguity in mdns configuration
It's not possible to correctly run two mDNS stacks at the same time: while multicast UDP packets can be received by multiple programs, only one program will receive unicast port-5353 mDNS replies, even if both daemons allow multiple binding to port 5353. While that feature is not commonly used intentionally, it is used sort of by accident by many enterprise wireless network vendors when they "convert" multicast to unicast as a network optimisation (because multicast packets are truly multicast, but sent at a "base" network rate much slower than the normal rate of the clients, which uses up more airtime than sending them all individually at a higher speed).

Hence, we cannot really enable the independent systemd-resolved mDNS support at the same time as actually using Avahi to do proper service discovery, and you should use the avahi/nss-mdns support instead if you want any actual mDNS service discovery. Ideally resolved would add a backend that uses Avahi when it exists/is installed, so we could drop the extra nss-mdns step, but no one has written that code so far.

But I am not sure why you say you cannot disable the systemd-resolved mDNS support. It's disabled in resolved by default out of the box, and when disabled it doesn't bind to the port, so Avahi works fine, and nss-mdns will work fine alongside systemd-resolved. Many people use this configuration all the time. So I am curious: in what specific scenario and configuration are you seeing it enabled and the port conflict? On an out-of-the-box install, if you run "resolvectl status" you'll see -mDNS on all the interfaces. Can you detail your configuration more precisely?
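To illustrate the unicast point above, here is a minimal Python sketch of the Linux socket semantics involved (an illustration only, not Avahi's or resolved's actual code): two daemons can share UDP port 5353 and both see multicast queries, but a unicast reply to port 5353 is delivered to only one of them.

import socket
import struct

def mdns_socket():
    # Bind a shared socket on UDP 5353 and join the mDNS multicast group.
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    s.bind(("", 5353))
    mreq = struct.pack("4s4s", socket.inet_aton("224.0.0.251"),
                       socket.inet_aton("0.0.0.0"))
    s.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
    return s

a = mdns_socket()  # e.g. avahi-daemon
b = mdns_socket()  # e.g. systemd-resolved with its mDNS responder enabled
# A multicast query to 224.0.0.251:5353 is delivered to both a and b,
# but a unicast reply addressed to this host's port 5353 arrives on only
# one of them -- whichever stack misses it will then misbehave.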
[Bug 1811255] Re: perf archive missing
*** This bug is a duplicate of bug 1823281 ***
    https://bugs.launchpad.net/bugs/1823281

** This bug has been marked a duplicate of bug 1823281
   perf-archive is not shipped in the linux-tools package
[Bug 1977669] Re: Metadata broken for SR-IOV external ports
** Tags added: sts
[Bug 1977669] Re: Metadata broken for SR-IOV external ports
** Description changed:

  OpenStack Ussuri/OVN SR-IOV instances are unable to connect to the metadata service despite DHCP and normal traffic working. The 169.254.169.254 metadata route is directed at the DHCP port IP, and no ARP reply is received by the VM for this IP. Diagnosis finds that the ARP reply returns from the ovnmeta namespace on the chassis hosting the external port but is dropped inside OVS.

  20.03.2-0ubuntu0.20.04.2 backported the following patch:

  Do not forward traffic from localport to localnet ports (LP: #1943266)
  (d/p/lp-1943266-physical-do-not-forward-traffic-from-localport-to-a-.patch)

- This patch broke metadata for SR-IOV external prots and was fixed in 1148580290d0ace803f20aeaa0241dd51c100630 "Don't suppress localport traffic directed to external port":
+ This patch broke metadata for SR-IOV external ports and was fixed in 1148580290d0ace803f20aeaa0241dd51c100630 "Don't suppress localport traffic directed to external port":

  https://github.com/ovn-org/ovn/commit/1148580290d0ace803f20aeaa0241dd51c100630
[Bug 1977669] [NEW] Metadata broken for SR-IOV external ports
Public bug reported:

OpenStack Ussuri/OVN SR-IOV instances are unable to connect to the metadata service despite DHCP and normal traffic working. The 169.254.169.254 metadata route is directed at the DHCP port IP, and no ARP reply is received by the VM for this IP. Diagnosis finds that the ARP reply returns from the ovnmeta namespace on the chassis hosting the external port but is dropped inside OVS.

20.03.2-0ubuntu0.20.04.2 backported the following patch:

Do not forward traffic from localport to localnet ports (LP: #1943266)
(d/p/lp-1943266-physical-do-not-forward-traffic-from-localport-to-a-.patch)

This patch broke metadata for SR-IOV external ports and was fixed in 1148580290d0ace803f20aeaa0241dd51c100630 "Don't suppress localport traffic directed to external port":
https://github.com/ovn-org/ovn/commit/1148580290d0ace803f20aeaa0241dd51c100630

** Affects: ovn (Ubuntu)
   Importance: Undecided
   Status: New
[Bug 1970453] Re: DMAR: ERROR: DMA PTE for vPFN 0x7bf32 already set
With regards to the patch here:
https://lists.linuxfoundation.org/pipermail/iommu/2021-October/060115.html

It mentions that this issue can occur if you are passing through a PCI device to a virtual machine guest. That patch seems to have never made it into the kernel. So I am curious whether you are running any virtual machines on this host, and whether any of them map PCI devices in from the host.
[Bug 1964445] [NEW] Incorrectly identifies processes inside LXD container on jammy/cgroupsv2
Public bug reported:

Processes inside of LXD containers are incorrectly identified as needing a restart on jammy. The cause is that needrestart does not correctly parse cgroups v2. Since needrestart is installed in a default install, this is problematic, as it prompts you to restart, and actually restarts, the host version of a container's processes unnecessarily.

I have sent an upstream pull request to fix this here; it's a simple fix to the regex:
https://github.com/liske/needrestart/pull/238

Upstream also already has a fix for the same issue for Docker:
https://github.com/liske/needrestart/pull/234

We should patch both of these into Jammy before release. I can also send this patch to Debian; however, as Debian does not currently use cgroups v2 by default, it is not directly affected in a default configuration (but would be affected if you enable them). Since we are close to release, this may also need to be expedited.

= Test Case =
- Install Jammy Server with needrestart installed (the server iso installs it by default; cloud/vm/lxd images do not)
- Launch an LXD focal container
- (slightly harder) Inside the focal container, upgrade a commonly used library such as libc6. To do this you may need to first downgrade libc6, restart avahi-daemon, then upgrade it again.
- Run "needrestart" on the host and see that the container's avahi-daemon is recognised as needing a restart (but it will restart the host's process, and the next invocation will prompt to restart again)

** Affects: needrestart (Ubuntu)
   Importance: Undecided
   Status: New
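For reference, the heart of the fix is recognising the cgroup v2 path format. A rough Python rendition of the kind of check involved follows (needrestart itself is Perl and its exact regex differs; the container path formats here are assumptions based on LXD's naming):

import re

def lxc_container_of(pid):
    """Return the LXC/LXD container name owning a PID, or None for host
    processes. Sketch only: assumes LXD's 'lxc.payload.<name>' cgroup v2
    paths and the older '/lxc/<name>' cgroup v1 layout."""
    with open(f"/proc/{pid}/cgroup") as f:
        for line in f:
            # cgroup v2 (jammy): "0::/lxc.payload.mycontainer/system.slice/..."
            m = re.search(r"lxc\.payload\.([^/\s]+)", line)
            if m:
                return m.group(1)
            # cgroup v1: "12:devices:/lxc/mycontainer/..."
            m = re.search(r":/lxc/([^/\s]+)", line)
            if m:
                return m.group(1)
    return None  # host process: restarting it on the host is correct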
[Bug 1958148] Re: mkinitramfs is too slow
Where is the discussion happening? I ran the same benchmarks on my i7-6770HQ 4-core system. This really needs revising. While disk space usage in /boot is a concern, in this example at least, -10 would use only 8MB (10%) more space and cut the time taken from 2m1s to 13s.

zstd.0   84M  0m2.150s
zstd.1   96M  0m1.236s
zstd.2   90M  0m1.350s
zstd.3   84M  0m2.235s
zstd.4   84M  0m3.355s
zstd.5   81M  0m5.679s
zstd.6   81M  0m7.416s
zstd.7   78M  0m8.857s
zstd.8   77M  0m10.134s
zstd.9   77M  0m11.238s
zstd.10  72M  0m13.232s
zstd.11  72M  0m14.897s
zstd.12  72M  0m19.343s
zstd.13  72M  0m26.327s
zstd.14  72M  0m30.948s
zstd.15  72M  0m40.913s
zstd.16  70M  0m59.517s
zstd.17  66M  1m15.854s
zstd.18  64M  1m36.227s
zstd.19  64M  2m1.417s
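For anyone wanting to repeat the comparison on their own hardware, a quick sketch of the sweep (Python; assumes an existing uncompressed image at ./initrd.img, and the path and flags are illustrative rather than what mkinitramfs itself runs):

import os
import subprocess
import time

for level in range(1, 20):
    out = f"initrd.img.zst.{level}"
    start = time.time()
    subprocess.run(["zstd", f"-{level}", "-q", "-f", "initrd.img", "-o", out],
                   check=True)
    elapsed = time.time() - start
    size_mb = os.path.getsize(out) / 2**20
    print(f"zstd.{level}\t{size_mb:.0f}M\t{elapsed:.3f}s")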
[Bug 1906476] Re: PANIC at zfs_znode.c:335:zfs_znode_sa_init() // VERIFY(0 == sa_handle_get_from_db(zfsvfs->z_os, db, zp, SA_HDL_SHARED, &zp->z_sa_hdl)) failed
Re-installing from scratch should resolve the issue. I suspect that in most cases, if you install with the 21.10 installer (even though it has the old kernel), as long as you install updates during the install this issue probably won't hit you. It mostly seems to occur after a reboot, when data is being loaded back from disk again.

As per some of the other comments, you'll have a bit of a hard time copying data off the old broken install: you need to work through which files/folders are corrupt, reboot, and then exclude those from the next rsync.

You could use the 22.04 daily build; it will eventually upgrade into the final release. However, that is not usually recommended, as there may be bugs or other problems in those daily images, and it's not uncommon for the development release to break at some point during the development cycle. Most of the time it doesn't and it usually works, but problems are much more likely than with 21.10. I'd try a re-install with 21.10 as I described. Obviously you'll need to back up all of your data from the existing install first.
[Bug 1077796] Re: /bin/kill no longer works with negative PID
Most shells (including bash and zsh) have a built-in for kill, so it's handled internally. Some shells don't, so they execute /bin/kill instead, which has this issue.

One comment noted this was fixed at some point in 2013 in version 3.3.4, but it apparently broke again at some point and is broken at least in 20.04 Focal's v3.3.16. This was recently fixed again upstream here:
https://gitlab.com/procps-ng/procps/-/merge_requests/77

Upstream v3.3.16 (in 20.04 Focal and 21.04 Hirsute) was released in Dec 2019 without this fix. The fix was submitted upstream 3 years ago but only merged 11 months ago, and was included in the v3.3.17 release made in Feb 2021, so it is not in 20.04 Focal. 3.3.17 with the fix is already in 21.10 Impish.
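For context, a negative PID argument means "signal every process in that process group" per kill(2), which is exactly what the broken /bin/kill argument parsing rejected. A quick Python sketch of the semantics (an illustration, not the procps code):

import os
import signal
import subprocess
import time

# Start a child in its own session, and therefore its own process group.
proc = subprocess.Popen(["sleep", "60"], start_new_session=True)
time.sleep(0.2)
pgid = os.getpgid(proc.pid)
os.kill(-pgid, signal.SIGTERM)  # equivalent to: /bin/kill -TERM -<pgid>
print(proc.wait())              # -15: child terminated by SIGTERM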
[Bug 1952496] Re: ubuntu 20.04 LTS network problem
Thanks for the data. I can see you queried 'steven-ubuntu.local', which looks like the hostname of the local machine. Can you also query the hostname of the AFP server you are trying to connect to (using both getent hosts and avahi-resolve-host-name)?
[Bug 1952496] Re: ubuntu 20.04 LTS network problem
As a side note, it may be time to switch to a newer protocol, as even Apple has dropped support for sharing over AFP in its last few releases and is deprecating its usage. You can use Samba to serve SMB, including the extra Apple-specific bits if you need Time Machine support etc. on your NAS.
[Bug 1952496] Re: ubuntu 20.04 LTS network problem
To assist with this, can you get the following outputs from the broken system:

# Change 'hostname.local' to the hostname expected to work
cat /etc/nsswitch.conf
systemctl status avahi-daemon
journalctl -u avahi-daemon --boot
avahi-resolve-host-name hostname.local
getent hosts hostname.local

** Changed in: avahi (Ubuntu)
   Status: New => Incomplete
[Bug 1339518] Re: sudo config file specifies group "admin" that doesn't exist in system
Subscribing Marc, as he seems to be largely maintaining this, made the original changes, and has been keeping the delta. Hopefully he can provide some insight.

It seems this is a delta to Debian that has been kept intentionally for a long time; it appears frequently in the changelog, even in the most recent Debian merge. I'd have thought that if we kept this in by default we probably should also have kept a default 'admin' group with no members, but it's a bit late for that at this point.

  - debian/sudoers:
    + also grant admin group sudo access

It also seems this change was originally made in 2014:

sudo (1.8.9p5-1ubuntu3) vivid; urgency=medium

  * debian/patches/also_check_sudo_group.diff: also check the sudo group
    in plugins/sudoers/sudoers.c to create the admin flag file. Leave the
    admin group check for backwards compatibility. (LP: #1387347)

 -- Marc Deslauriers  Wed, 29 Oct 2014 15:55:34 -0400

sudo (1.8.9p5-1ubuntu2) utopic; urgency=medium

  * debian/sudo_root.8: mention sudo group instead of deprecated group
    admin (LP: #1130643)

 -- Andrey Bondarenko  Sat, 23 Aug 2014 01:18:05 +0600
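For concreteness, the delta amounts to the stock Ubuntu sudoers carrying both group rules, something like the following (paraphrased from memory of the shipped file; check /etc/sudoers on your own system):

# Members of the admin group may gain root privileges
%admin ALL=(ALL) ALL

# Allow members of group sudo to execute any command
%sudo  ALL=(ALL:ALL) ALL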
[Bug 1339518] Re: sudo config file specifies group "admin" that doesn't exist in system
Just noticed this today; it's still the same on Ubuntu 20.04. The default sudoers file ships with the admin group having sudo privileges, but the group doesn't exist by default.

While it doesn't have out-of-the-box security implications, I think this is a security concern, as someone could add an 'admin' user and not expect them to get sudo access via the default matching group name created for them. For example, downstream products like web hosting or control-panel style tools that create users with a user-provided name: since neither the user nor the group 'admin' exists by default, they could be fooled into creating escalatable privileges.
[Bug 1931660] Re: PANIC at zfs_znode.c:339:zfs_znode_sa_init()
This looks like a duplicate of this: https://bugs.launchpad.net/ubuntu/+source/zfs-linux/+bug/1906476
[Bug 1906476] Re: PANIC at zfs_znode.c:335:zfs_znode_sa_init() // VERIFY(0 == sa_handle_get_from_db(zfsvfs->z_os, db, zp, SA_HDL_SHARED, &zp->z_sa_hdl)) failed
Relatedly, say you wanted to recover a system from a boot disk and copy all the data off to another disk. If you use a sequential file copy like tar/cp in verbose mode and watch it, eventually it will hang on the file triggering the issue (watch dmesg/kern.log). Once that happens, move that file into a directory like /broken which you exclude from tar/cp, reboot to get back into a working state, then start the copy again. That is basically what I did, incrementally, to find all the broken files. Fortunately for me they were mostly inside chrome or electron app dirs.
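The loop described above can be semi-automated; a rough Python sketch follows (all paths are hypothetical, and the reboot between passes is still manual):

import pathlib
import subprocess

# One recovery pass: rsync everything except files already known broken.
# When rsync hangs, note the last filename it printed, append it to
# /root/broken-files.txt, reboot, and run this again.
exclude_file = pathlib.Path("/root/broken-files.txt")
excludes = [f"--exclude={line.strip()}" for line in
            exclude_file.read_text().splitlines() if line.strip()]
subprocess.run(["rsync", "-av", *excludes,
                "/oldpool/home/", "/rescue/home/"])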
[Bug 1906476] Re: PANIC at zfs_znode.c:335:zfs_znode_sa_init() // VERIFY(0 == sa_handle_get_from_db(zfsvfs->z_os, db, zp, SA_HDL_SHARED, &zp->z_sa_hdl)) failed
So to be clear: this patch revert stops the issue being caused anew, but if it has already happened on your filesystem it will continue to occur, because the exception is reporting corruption on disk. I don't currently have a good fix for this other than moving the affected files to a directory you don't use (though it's sometimes tricky to figure out which files are the cause).

For dkms status, you could check ls -la /proc/$(pidof dkms)/fd to see what file it has opened, or strace it, to try to figure out what file it's on when it hangs. Then move that file or directory out of the way and replace it.
[Bug 1906476] Re: PANIC at zfs_znode.c:335:zfs_znode_sa_init() // VERIFY(0 == sa_handle_get_from_db(zfsvfs->z_os, db, zp, SA_HDL_SHARED, &zp->z_sa_hdl)) failed
I have created a 100% reliable reproducer test case and also determined that the Ubuntu-specific patch 4701-enable-ARC-FILL-LOCKED-flag.patch, added to fix Bug #1900889, is likely the cause.

[Test Case]

The important parts are:
- Use encryption
- rsync the zfs git tree
- Use parallel I/O from silversearcher-ag to access it after a reboot. A simple "find ." or "find . -exec cat {} > /dev/null \;" does not reproduce the issue.

Reproduction was done using a libvirt VM installed from the Ubuntu Impish daily livecd, using a normal ext4 root but with a second 4GB /dev/vdb disk for zfs.

= Preparation
apt install silversearcher-ag git zfs-dkms zfsutils-linux
echo -n testkey2 > /root/testkey
git clone https://github.com/openzfs/zfs /root/zfs

= Test Execution
zpool create test /dev/vdb
zfs create test/test -o encryption=on -o keyformat=passphrase -o keylocation=file:///root/testkey
rsync -va --progress -HAX /root/zfs/ /test/test/zfs/
# If you access the data now, it works fine.
reboot
zfs load-key test/test
zfs mount -a
cd /test/test/zfs/
ag DISKS=

= Test Result
ag hangs, and "sudo dmesg" shows an exception.

[Analysis]

I rebuilt the zfs-linux 2.0.6-1ubuntu1 package from ppa:colin-king/zfs-impish without the Ubuntu-specific patch ubuntu/4701-enable-ARC-FILL-LOCKED-flag.patch, which fixed Bug #1900889. With this patch disabled, the issue does not reproduce. Re-enabling the patch, it reproduces reliably every time.

It seems this bug was never sent upstream, and no code changes setting the ARC_FILL_IN_PLACE flag appear to have been added upstream since, as far as I can see. Interestingly, however, the code for this ARC_FILL_IN_PLACE handling was added to fix a similar-sounding issue, "Raw receive fix and encrypted objset security fix":
https://github.com/openzfs/zfs/commit/69830602de2d836013a91bd42cc8d36bbebb3aae
This first shipped in zfs 0.8.0, and the original bug was filed against 0.8.3.

I have also found the same issue as the original Launchpad bug reported upstream, without any fixes and with a lot of discussion (and quite a few duplicates linking back to 11679):
https://github.com/openzfs/zfs/issues/11679
https://github.com/openzfs/zfs/issues/12014

Without fully understanding the ZFS code in relation to this flag, the code at https://github.com/openzfs/zfs/blob/ce2bdcedf549b2d83ae9df23a3fa0188b33327b7/module/zfs/arc.c#L2026 talks about how this flag relates to decrypting blocks in the ARC 'in place'. It makes some sense, then, that I need encryption to reproduce it, that it works best after a reboot (which flushes the ARC), and that in the test case I can still read the data before the reboot after which it fails.

This patch was added in 0.8.4-1ubuntu15, and I first experienced the issue somewhere between 0.8.4-1ubuntu11 and 0.8.4-1ubuntu16. So it all adds up, and I suggest that this patch should be reverted.

** Bug watch added: github.com/openzfs/zfs/issues #11679
   https://github.com/openzfs/zfs/issues/11679
** Bug watch added: github.com/openzfs/zfs/issues #12014
   https://github.com/openzfs/zfs/issues/12014
[Bug 1906476] Re: PANIC at zfs_znode.c:335:zfs_znode_sa_init() // VERIFY(0 == sa_handle_get_from_db(zfsvfs->z_os, db, zp, SA_HDL_SHARED, &zp->z_sa_hdl)) failed
While trying to set up a reproducer that would exercise chrome or wine or something, I stumbled across the following reproducer, which worked twice in a row in a libvirt VM on my machine today. The general gist is to:
(1) Create a zfs filesystem with "-o encryption=aes-256-gcm -o compression=zstd -o atime=off -o keyformat=passphrase"
(2) rsync a copy of the openzfs git tree into it
(3) Reboot
(4) Use silversearcher-ag to search the directory for "DISKS="

Precise steps:

mkdir src
cd src
git clone https://github.com/openzfs/zfs
sudo apt install zfsutils-linux zfs-initramfs
sudo zpool create tank /dev/vdb
sudo zfs create tank/lathiat2 -o encryption=aes-256-gcm -o compression=zstd -o atime=off -o keyformat=passphrase
rsync -va --progress -HAX /etc/skel /tank/lathiat2/; chown -R lathiat:lathiat /tank/lathiat2
rsync -va --progress /home/lathiat/src/ /tank/lathiat2/src/; chown -R lathiat:lathiat /tank/lathiat2/src/
# reboot
sudo zfs load-key tank/lathiat2
sudo zfs mount -a
cd /tank/lathiat2/src/zfs/
ag DISKS=

This hit the exact same crash:

[   61.377929] VERIFY(0 == sa_handle_get_from_db(zfsvfs->z_os, db, zp, SA_HDL_SHARED, &zp->z_sa_hdl)) failed
[   61.377930] PANIC at zfs_znode.c:339:zfs_znode_sa_init()

I will now test this on the beta 2.0.6 package and also, as a matter of curiosity, see whether the standard zfs test suite triggers it.
[Bug 1906476] Re: PANIC at zfs_znode.c:335:zfs_znode_sa_init() // VERIFY(0 == sa_handle_get_from_db(zfsvfs->z_os, db, zp, SA_HDL_SHARED, &zp->z_sa_hdl)) failed
There are 34 more user reports on the upstream bug of people hitting this on Ubuntu 5.13.0:
https://github.com/openzfs/zfs/issues/10971

I think this needs some priority. It doesn't seem to be hitting upstream; for some reason it is really only hitting Ubuntu.
[Bug 1906476] Re: PANIC at zfs_znode.c:335:zfs_znode_sa_init() // VERIFY(0 == sa_handle_get_from_db(zfsvfs->z_os, db, zp, SA_HDL_SHARED, &zp->z_sa_hdl)) failed
@Colin: to be clear, this is the same bug I originally hit and opened this Launchpad bug for; it just doesn't quite match what most people saw in the upstream bugs. But it seemed to get fixed anyway for a while, and has somehow regressed again. Same exception as in the original description and other reports:

2021 May 16 21:19:09 laptop VERIFY(0 == sa_handle_get_from_db(zfsvfs->z_os, db, zp, SA_HDL_SHARED, &zp->z_sa_hdl)) failed

The upstream bug mostly reported slightly different errors, though with similar symptoms (files get stuck and can't be accessed).

I also tried to use 'zdb' to check whether incorrect file modes were saved; unfortunately it seems zdb does not work for encrypted datasets. It only dumps the unencrypted block info and doesn't dump info about file modes etc. from the encrypted part, so I can't check that.

I've reverted back to 5.11.0-25 for now and it's stable again.
[Bug 1783184] Re: neutron-ovs-cleanup can have unintended side effects
There is a systemd option that I think will solve this issue:
https://www.freedesktop.org/software/systemd/man/systemd.unit.html#RefuseManualStart=

RefuseManualStart=, RefuseManualStop=
    Takes a boolean argument. If true, this unit can only be activated or deactivated indirectly. In this case, explicit start-up or termination requested by the user is denied, however if it is started or stopped as a dependency of another unit, start-up or termination will succeed. This is mostly a safety feature to ensure that the user does not accidentally activate units that are not intended to be activated explicitly, and not accidentally deactivate units that are not intended to be deactivated. These options default to false.

As far as I am aware, there is rarely if ever a good reason to run this intentionally. If someone *really* wants to run it, the command is fairly straightforward to run directly:

ExecStart=/usr/bin/neutron-ovs-cleanup --config-file /usr/share/neutron/neutron-dist.conf --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/plugins/openvswitch/ovs_neutron_plugin.ini --log-file /var/log/neutron/ovs-cleanup.log

There are two such services:
neutron-ovs-cleanup.service
neutron-linuxbridge-cleanup.service

See also: https://bugs.launchpad.net/ubuntu/+source/neutron/+bug/1885264 (recent work to stop it being run on package upgrade by accident)

And while we're at it, Red Hat had a bug where the cleanup script could take 1-2 minutes on some busy/large hosts, and added "TimeoutSec=0" to avoid issues related to that.
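If we went that route, a drop-in override would be enough rather than patching the shipped unit. A sketch (the drop-in path and filename are illustrative), to be followed by "systemctl daemon-reload":

# /etc/systemd/system/neutron-ovs-cleanup.service.d/refuse-manual.conf
[Unit]
RefuseManualStart=yes
RefuseManualStop=yes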
[Bug 1892242] Re: Curtin doesn't handle type:mount entries without 'path' element
In terms of understanding when this was fixed for which users/versions: assuming that MAAS copies the curtin version from the server to the deployed client, which I think is the case, you need an updated curtin on the MAAS server.

The bug was fix-released into curtin 20.1-20-g1304d3ea-0ubuntu1 in August 2020. Curtin has not been updated in bionic itself since May 2020 (20.1-2-g42a9667f-0ubuntu1~18.04.1), so no fix there.

MAAS 2.7 PPA (https://launchpad.net/~maas/+archive/ubuntu/2.7) - no fix
MAAS 2.8 PPA (https://launchpad.net/~maas/+archive/ubuntu/2.8) - fixed in 21.2-0ubuntu1~18.04.1, uploaded 1 March 2021 - first and only curtin upload
MAAS 2.9 PPA (https://launchpad.net/~maas/+archive/ubuntu/2.9) - fixed in 21.2-0ubuntu1~20.04.1, uploaded 16 February 2021 - first and only curtin upload

MAAS 2.8 was released 24 June 2020
MAAS 2.9 was released December 2020
MAAS 3.0 was released 6 July 2021 [Note: only supports 20.04]

So it seems there was a gap from August 2020 to December 2020 where the fix possibly wasn't available to MAAS users at all, and until March 2021 it wasn't available to MAAS 2.8 users. However, I don't know which version of MAAS, if any, consumes curtin as a snap, nor whether the above applies to both deb and snap installations of those versions.
[Bug 1906476] Re: PANIC at zfs_znode.c:335:zfs_znode_sa_init() // VERIFY(0 == sa_handle_get_from_db(zfsvfs->z_os, db, zp, SA_HDL_SHARED, &zp->z_sa_hdl)) failed
I traced the call failure and found the failing code is in sa.c:1291, sa_build_index():

if (BSWAP_32(sa_hdr_phys->sa_magic) != SA_MAGIC) {

This code prints debug info to /proc/spl/kstat/zfs/dbgmsg, which for me is:

1629791353 sa.c:1293:sa_build_index(): Buffer Header: cb872954 != SA_MAGIC:2f505a object=0x45175e

So in this case it seems the data is somehow corrupted, since this is supposed to be a magic value that is always correct and never changes. It's not entirely clear how this plays into the original bug, so it may really be a different bug. Hrm.
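As a sanity check on that dbgmsg line: SA_MAGIC is 0x2F505A (as printed), and the observed header value matches it neither raw nor byte-swapped, so this looks like genuine corruption rather than an endianness mix-up. A quick Python check:

SA_MAGIC = 0x2F505A    # "SA_MAGIC:2f505a" from the dbgmsg line
observed = 0xCB872954  # "Buffer Header: cb872954"

def bswap32(x):
    # Reverse the byte order of a 32-bit value, like the kernel's BSWAP_32.
    return int.from_bytes(x.to_bytes(4, "little"), "big")

print(hex(bswap32(observed)))         # 0x542987cb
print(observed == SA_MAGIC)           # False
print(bswap32(observed) == SA_MAGIC)  # False: corrupt either way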
[Bug 1906476] Re: PANIC at zfs_znode.c:335:zfs_znode_sa_init() // VERIFY(0 == sa_handle_get_from_db(zfsvfs->z_os, db, zp, SA_HDL_SHARED, &zp->z_sa_hdl)) failed
This has re-appeared for me today after upgrading to 5.13.0-14 on Impish. Same call stack, and the same chrome-based applications (Mattermost was hit first) are affected. Not currently running DKMS, so:

Today:     5.13.0-14-lowlat  Tue Aug 24 10:59, still running (zfs module is 2.0.3-8ubuntu6)
Yesterday: 5.11.0-25-lowlat  Mon Aug 23 12:52 - 08:05 (19:13) (zfs module is 2.0.2-1ubuntu5)

I am a bit confused, because the patched line "newmode = zp->z_mode;" still seems present in the package.
[Bug 1906476] Re: PANIC at zfs_znode.c:335:zfs_znode_sa_init() // VERIFY(0 == sa_handle_get_from_db(zfsvfs->z_os, db, zp, SA_HDL_SHARED, &zp->z_sa_hdl)) failed
Try the zfs_recover step from Colin's comment above, then look for invalid files and try to move them out of the way.

I'm not aware of encrypted pools being specifically implicated (there is no such mention in the bug, and it doesn't seem like it); having said that, I am using encryption on the dataset where I was hitting this.
[Bug 1827264] Re: ovs-vswitchd thread consuming 100% CPU
It seems there is a good chance at least some of the people commenting on or affected by this bug are actually hitting a duplicate of Bug #1839592 - essentially a libc6 bug that meant threads weren't woken up when they should have been. Fixed by the libc6 upgrade to 2.27-3ubuntu1.3 in bionic.
[Bug 1906476] Re: PANIC at zfs_znode.c:335:zfs_znode_sa_init() // VERIFY(0 == sa_handle_get_from_db(zfsvfs->z_os, db, zp, SA_HDL_SHARED, &zp->z_sa_hdl)) failed
Are you confident that the issue is a new occurrence? Unfortunately, as best I can tell, once the corruption has occurred it will still appear on a fixed system, because the system is reading corruption created in the past, which scrub unfortunately doesn't seem to detect.

I've still had no re-occurrence here after a few weeks on hirsute with 2.0.2-1ubuntu5 (which includes the https://github.com/openzfs/zfs/issues/11474 fix) - but from a fresh install.
[Bug 1920640] Re: EXPKEYSIG C8CAB6595FDFF622 Ubuntu Debug Symbol Archive Automatic Signing Key (2016)
** Changed in: ubuntu-keyring (Ubuntu)
   Importance: Undecided => Critical

** Changed in: ubuntu-keyring (Ubuntu)
   Importance: Critical => High
[Bug 1920640] Re: EXPKEYSIG C8CAB6595FDFF622 Ubuntu Debug Symbol Archive Automatic Signing Key (2016)
Updated the following wiki pages:
https://wiki.ubuntu.com/Debug%20Symbol%20Packages
https://wiki.ubuntu.com/DebuggingProgramCrash

with the note:

Note: The GPG key expired on 2021-03-21 and may need updating by either upgrading the ubuntu-dbgsym-keyring package or re-running the apt-key command. Please see Bug #1920640 for workaround details if that does not work.
[Bug 1920640] Re: EXPKEYSIG C8CAB6595FDFF622 Ubuntu Debug Symbol Archive Automatic Signing Key (2016)
Just to make the current status clear, from what I can gather:

- The GPG key was extended by 1 year, to 2022-03-21.
- On Ubuntu Bionic (18.04) and newer, the GPG key is normally installed by the ubuntu-dbgsym-keyring package. This package is not yet updated; an update to it is required and still pending.
- On Ubuntu Xenial (16.04), users typically imported the key from keyserver.ubuntu.com. As that is not yet updated, you will need to import the key over HTTP using the workaround below, which works as a temporary workaround on all Ubuntu releases. Once keyserver.ubuntu.com is updated, you could also use "sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys F2EDC64DC5AEE1F6B9C621F0C8CAB6595FDFF622".
- The updated GPG key is not currently published to keyserver.ubuntu.com.
- The updated GPG key is available at http://ddebs.ubuntu.com/dbgsym-release-key.asc
- As a workaround you can import that key into apt using "wget -O - http://ddebs.ubuntu.com/dbgsym-release-key.asc | sudo apt-key add -" (note: you need a space between the -O and the -, contrary to the previously pasted comment).
- I believe the key likely needs to be extended for longer and published to all resources, including the ubuntu-dbgsym-keyring package and keyserver.ubuntu.com.
[Bug 1906476] Re: PANIC at zfs_znode.c:335:zfs_znode_sa_init() // VERIFY(0 == sa_handle_get_from_db(zfsvfs->z_os, db, zp, SA_HDL_SHARED, &zp->z_sa_hdl)) failed
I got another couple of days out of it without issue, so I think it's likely fixed. This issue looks very similar to the following upstream bug (same behaviour but a different error), so I wonder if it was ultimately the same bug. It looks like this patch from 2.0.3 was pulled into the package:

https://github.com/openzfs/zfs/issues/11621
https://github.com/openzfs/zfs/issues/11474
https://github.com/openzfs/zfs/pull/11576

Further testing has been hampered because zsys deleted all of my home datasets entirely (including all snapshots) - tracked in https://github.com/ubuntu/zsys/issues/196 - and I am using a non-zfs boot until I finish recovering that. But it still seems likely fixed, as I was hitting it most days before.

** Bug watch added: github.com/openzfs/zfs/issues #11621
   https://github.com/openzfs/zfs/issues/11621
** Bug watch added: github.com/openzfs/zfs/issues #11474
   https://github.com/openzfs/zfs/issues/11474
** Bug watch added: github.com/ubuntu/zsys/issues #196
   https://github.com/ubuntu/zsys/issues/196
[Bug 1869808] Re: reboot neutron-ovs-agent introduces a short interrupt of vlan traffic
I have specifically verified that this bug (VLAN traffic interruption during restart when rabbitmq is down) is fixed by the package in bionic-proposed. I followed my reproduction steps per the test case: all traffic to instances stops on 12.1.1-0ubuntu3 and does not stop on 12.1.1-0ubuntu4.

I am not completing the verification yet, as we need to perform more general testing on the package for regressions, etc.
[Bug 1894843] Re: [dvr_snat] Router update deletes rfp interface from qrouter even when VM port is present on this host
When using DVR-SNAT, a simple neutron-l3-agent gateway restart triggers this issue.

Reproduction note: nodes with an ACTIVE or BACKUP (in the case of L3HA) router for the network are not affected by this issue, so a small 1-6 node environment may make this difficult to reproduce, or may only affect half of the nodes (e.g. 3/6 nodes if you have L3HA).

Workaround: for each compute node, you need to create a new VM on each network. Registering the new VM port causes the missing fpr/rfp interface pair to be created and paired. It does not seem possible to fix it any other way, such as stopping/starting the existing VM, rebooting the host, etc.
[Bug 1869808] Re: reboot neutron-ovs-agent introduces a short interrupt of vlan traffic
Looking to get this approved so that we can verify it, as we ideally need this released by the weekend of March 27th for some maintenance activity. Is something holding back the approval?
[Bug 1906476] Re: PANIC at zfs_znode.c:335:zfs_znode_sa_init() // VERIFY(0 == sa_handle_get_from_db(zfsvfs->z_os, db, zp, SA_HDL_SHARED, &zp->z_sa_hdl)) failed
It's worth noting that, as best I can understand, the patches won't fix an already broken filesystem. You have to remove all of the affected files, and it's difficult to know exactly which files are affected. I try to guess based on which entries show a "???" mark in "ls -la", but sometimes the "ls" itself hangs (see the sketch below for one way to sweep for affected files). I've been running zfs-dkms 2.0.2-1ubuntu2 for 24 hours now and so far so good; I won't call it conclusive, but I am hoping this has solved it. Though I am thoroughly confused as to which patch solved it - nothing *seems* relevant - which is frustrating. I will try to update in a few days as to whether it has definitely stopped; most of the time I hit it within a day, but not strictly 100% of the time. -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1906476 Title: PANIC at zfs_znode.c:335:zfs_znode_sa_init() // VERIFY(0 == sa_handle_get_from_db(zfsvfs->z_os, db, zp, SA_HDL_SHARED, &zp->z_sa_hdl)) failed To manage notifications about this bug go to: https://bugs.launchpad.net/zfs/+bug/1906476/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
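A rough sketch of one way to sweep for affected files, per the guessing described above; /home/user is a placeholder, and since stat on a broken file can hang rather than fail, each batch is wrapped in a timeout:

# stat everything; I/O errors (or timeouts) point at affected files
find /home/user -xdev -print0 2>/dev/null | \
    xargs -0 -n100 timeout 10 stat >/dev/null

Anything that prints an "Input/output error", or a batch that times out, is a candidate for removal once the data has been recovered from elsewhere.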
[Bug 1916708] Re: udpif_revalidator crash in ofpbuf_resize__
E-mailed upstream for assistance: https://mail.openvswitch.org/pipermail/ovs-discuss/2021-February/050963.html ** Tags added: sts -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1916708 Title: udpif_revalidator crash in ofpbuf_resize__ To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/openvswitch/+bug/1916708/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1916708] [NEW] udpif_revalidator crash in ofpbuf_resize__
Public bug reported:

The udpif_revalidator thread crashed in ofpbuf_resize__ on openvswitch 2.9.2-0ubuntu0.18.04.3~cloud0 (on 16.04 from the xenial-queens cloud archive, backported from the 18.04 release of the same version). Kernel version was 4.4.0-159-generic. The issue is suspected to still exist in upstream master as of Feb 2021/v2.15.0, but it has not been completely understood. Opening this bug to track future occurrences.

The general issue appears to be that the udpif_revalidator thread tried to expand a stack-allocated ofpbuf to fit a netlink reply of size 3204, but the buffer is of size 2048. This intentionally raises an assertion, as we can't expand memory allocated on the stack. The crash in ofpbuf_resize__ appears to be due to OVS_NOT_REACHED() being called because b->source == OFPBUF_STACK (the line number indicates it's the default: case, but this appears to be an optimiser quirk; b->source is OFPBUF_STACK). We can't realloc() the buffer memory if it's allocated on the stack.

This buffer is provided in #7 nl_sock_transact_multiple__ during the call to nl_sock_recv__, specified as buf_txn->reply. In this specific case it seems we found transactions[0] available and so we used that rather than tmp_txn. The original source of transactions (it's passed through most of the function calls) appears to be op_auxdata, allocated on the stack at the top of the dpif_netlink_operate__ function (dpif-netlink.c:1875).

The size of this particular message was 3204, so 2048 went into the buffer and 1156 went into the tail iovector set up inside nl_sock_recv__, which it then tried to expand the ofpbuf to hold. Various nl_sock_* functions have comments about the buffer ideally being the right size for optimal performance (I guess to avoid the reallocation), but it seems like a possible oversight in the dpif_netlink_operate__ workflow that the nl_sock_* functions may ultimately want to expand that buffer and then fail because of the stack allocation.
The relevant source tree can be found here:

git clone -b applied/2.9.2-0ubuntu0.18.04.3 https://git.launchpad.net/ubuntu/+source/openvswitch
https://git.launchpad.net/ubuntu/+source/openvswitch/tree/?h=applied/2.9.2-0ubuntu0.18.04.3

Thread 1 (Thread 0x7f3e0700 (LWP 1539131)):
#0 0x7f3ed30c8428 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
#1 0x7f3ed30ca02a in __GI_abort () at abort.c:89
#2 0x004e5035 in ofpbuf_resize__ (b=b@entry=0x7f3e0fffb050, new_headroom=<optimized out>, new_tailroom=new_tailroom@entry=1156) at ../lib/ofpbuf.c:262
#3 0x004e5338 in ofpbuf_prealloc_tailroom (b=b@entry=0x7f3e0fffb050, size=size@entry=1156) at ../lib/ofpbuf.c:291
#4 0x004e54e5 in ofpbuf_put_uninit (size=size@entry=1156, b=b@entry=0x7f3e0fffb050) at ../lib/ofpbuf.c:365
#5 ofpbuf_put (b=b@entry=0x7f3e0fffb050, p=p@entry=0x7f3e0ffcf0a0, size=size@entry=1156) at ../lib/ofpbuf.c:388
#6 0x005392a6 in nl_sock_recv__ (sock=sock@entry=0x7f3e50009150, buf=0x7f3e0fffb050, wait=wait@entry=false) at ../lib/netlink-socket.c:705
#7 0x00539474 in nl_sock_transact_multiple__ (sock=sock@entry=0x7f3e50009150, transactions=transactions@entry=0x7f3e0ffdff20, n=1, done=done@entry=0x7f3e0ffdfe10) at ../lib/netlink-socket.c:824
#8 0x0053980a in nl_sock_transact_multiple (sock=0x7f3e50009150, transactions=transactions@entry=0x7f3e0ffdff20, n=n@entry=1) at ../lib/netlink-socket.c:1009
#9 0x0053aa1b in nl_sock_transact_multiple (n=1, transactions=0x7f3e0ffdff20, sock=<optimized out>) at ../lib/netlink-socket.c:1765
#10 nl_transact_multiple (protocol=protocol@entry=16, transactions=transactions@entry=0x7f3e0ffdff20, n=n@entry=1) at ../lib/netlink-socket.c:1764
#11 0x00528b01 in dpif_netlink_operate__ (dpif=dpif@entry=0x25a6150, ops=ops@entry=0x7f3e0fffaf28, n_ops=n_ops@entry=1) at ../lib/dpif-netlink.c:1964
#12 0x00529956 in dpif_netlink_operate_chunks (n_ops=1, ops=0x7f3e0fffaf28, dpif=<optimized out>) at ../lib/dpif-netlink.c:2243
#13 dpif_netlink_operate (dpif_=0x25a6150, ops=<optimized out>, n_ops=<optimized out>) at ../lib/dpif-netlink.c:2279
#14 0x004756de in dpif_operate (dpif=0x25a6150, ops=<optimized out>, ops@entry=0x7f3e0fffaf28, n_ops=n_ops@entry=1) at ../lib/dpif.c:1359
#15 0x004758e7 in dpif_flow_get (dpif=<optimized out>, key=<optimized out>, key_len=<optimized out>, ufid=<optimized out>, pmd_id=<optimized out>, buf=buf@entry=0x7f3e0fffb050, flow=<optimized out>) at ../lib/dpif.c:1014
#16 0x0043f662 in ukey_create_from_dpif_flow (udpif=0x229cbf0, udpif=0x229cbf0, ukey=<optimized out>, flow=0x7f3e0fffc790) at ../ofproto/ofproto-dpif-upcall.c:1709
#17 ukey_acquire (error=<optimized out>, result=<optimized out>, flow=0x7f3e0fffc790, udpif=0x229cbf0) at ../ofproto/ofproto-dpif-upcall.c:1914
#18 revalidate (revalidator=0x250eaa8) at ../ofproto/ofproto-dpif-upcall.c:2473
#19 0x0043f816 in udpif_revalidator (arg=0x250eaa8) at ../ofproto/ofproto-dpif-upcall.c:913
#20 0x004ea4b4 in ovsthread_wrapper (aux_=<optimized out>) at ../lib/ovs-thread.c:348
#21 0x7f3ed39756ba in start_thread (arg=0x7f3e0700) at pthread_create.c:333
#22 0x7f3ed319a41d in clon
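To study this further, the exact source the crashing package was built from can be inspected; a small sketch using the clone command quoted above, with line ranges taken from the frames in the backtrace:

git clone -b applied/2.9.2-0ubuntu0.18.04.3 \
    https://git.launchpad.net/ubuntu/+source/openvswitch
cd openvswitch
# ofpbuf_resize__ and the OFPBUF_STACK case that ends in OVS_NOT_REACHED()
# (frame #2, ../lib/ofpbuf.c:262)
sed -n '240,300p' lib/ofpbuf.c
# the stack-allocated op_auxdata at the top of dpif_netlink_operate__
# (dpif-netlink.c:1875)
sed -n '1860,1890p' lib/dpif-netlink.c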
[Bug 1869808] Re: reboot neutron-ovs-agent introduces a short interrupt of vlan traffic
Attaching a revised SRU patch for Ubuntu Bionic - no code content changes, but the changelog is fixed to list all 3 bug numbers correctly.

** Patch added: "neutron SRU patch for Ubuntu Bionic (new version)" https://bugs.launchpad.net/neutron/+bug/1869808/+attachment/5464699/+files/lp1869808-bionic.debdiff
** Patch removed: "debdiff for ubuntu cloud archive (queens)" https://bugs.launchpad.net/neutron/+bug/1869808/+attachment/5464416/+files/lp1869808-queens.debdiff
-- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1869808 Title: reboot neutron-ovs-agent introduces a short interrupt of vlan traffic To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-archive/+bug/1869808/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1869808] Re: reboot neutron-ovs-agent introduces a short interrupt of vlan traffic
Ubuntu SRU Justification

[Impact]

- When there is a RabbitMQ or neutron-api outage, the neutron-openvswitch-agent undergoes a "resync" process and temporarily blocks all VM traffic. This always happens for a short time period (maybe <1 second), but in some high-scale environments it lasts for minutes. If RabbitMQ is down again during the re-sync, traffic will also be blocked until the agent can connect, which may be a long time. This also affects situations where neutron-openvswitch-agent is intentionally restarted while RabbitMQ is down. Bug #1869808 addresses this issue, and Bug #1887148 is a fix for that fix, to prevent network loops during DVR startup.
- In the same situation, the neutron-l3-agent can delete the L3 router (Bug #1871850)

[Test Case]

(1) Deploy OpenStack Bionic-Queens with DVR and a *VLAN* tenant network (VXLAN or FLAT will not reproduce the issue). With a standard deployment, simply enabling DHCP on the ext_net subnet will allow VMs to be booted directly on the ext_net provider network: "openstack subnet set --dhcp ext_net" and then deploy the VM directly to ext_net.
(2) Deploy a VM to the VLAN network
(3) Start pinging the VM from an external network
(4) Stop all RabbitMQ servers
(5) Restart neutron-openvswitch-agent
(6) Ping traffic should cease and not recover
(7) Start all RabbitMQ servers
(8) Ping traffic will recover after 30-60 seconds

(A command-level sketch of steps 3-8 follows this comment.)

[Where problems could occur]

These patches are all cherry-picked from the upstream stable branches and have existed upstream, including on the stable/queens branch, for many months; in Ubuntu, all supported subsequent releases (Stein onwards) have also had these patches for many months, with the exception of Queens. There is a chance that not installing these drop flows during startup could let traffic go somewhere unexpected while the network is in a partially set-up state. This was the case for DVR: in setups where more than one DVR external network port existed, a network loop could temporarily be created. That was already addressed with the included patch for Bug #1869808. I checked and could not locate any other merged changes to this drop_port logic that also need to be backported.

[Other Info]

-- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1869808 Title: reboot neutron-ovs-agent introduces a short interrupt of vlan traffic To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-archive/+bug/1869808/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
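A command-level sketch of test-case steps (3)-(8) above, assuming shell access to the RabbitMQ and compute hosts; the VM address (203.0.113.50) is a placeholder, and the unit names are the standard Ubuntu ones:

# (3) from an external host, watch for outages while the steps run
ping -O 203.0.113.50

# (4) on every RabbitMQ server
sudo systemctl stop rabbitmq-server

# (5) on the compute host; with the broken package, ping replies now
# stop and do not recover
sudo systemctl restart neutron-openvswitch-agent

# (7) on every RabbitMQ server again
sudo systemctl start rabbitmq-server

# (8) replies resume within 30-60 seconds; with the fixed package,
# traffic is never blocked at step (5) in the first place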
[Bug 1869808] Re: reboot neutron-ovs-agent introduces a short interrupt of vlan traffic
SRU proposed for Ubuntu Bionic + Cloud Archive (Queens) for the following 3 bugs:

Bug #1869808 reboot neutron-ovs-agent introduces a short interrupt of vlan traffic
Bug #1887148 Network loop between physical networks with DVR (Fix for fix to Bug #1869808)
Bug #1871850 [L3] existing router resources are partial deleted unexpectedly when MQ is gone

SRU is only required for Bionic + Queens Cloud Archive, all other releases already have these patches.

== reboot neutron-ovs-agent introduces a short interrupt of vlan traffic
https://bugs.launchpad.net/neutron/+bug/1869808
pike   1f4f888ad34d54ec968d9c9f9f80c388f3ca0d12  stable/pike   [EOL]
queens 131bbc9a53411033cf27664d8f1fd7afc72c57bf  stable/queens [Needed]
rocky  cc48edf85cf66277423b0eb52ae6353f8028d2a6  stable/rocky  [EOL]
stein  6dfc35680fcc885d9ad449ca2b39225fb1bca898  14.3.0        [Already done]
train  4f501f405d1c44e00784df8450cbe83129da1ea7  15.2.0        [Already done]
ussuri 88e70a520acaca37db645c3ef1124df8c7d778d5  16.1.0        [Already done]
master 90212b12cdf62e92d811997ebba699cab431d696  17.0.0        [Already done]

== [L3] existing router resources are partial deleted unexpectedly when MQ is gone
https://bugs.launchpad.net/neutron/+bug/1871850
queens ec6c98060d78c97edf6382ede977209f007fdb81  stable/queens [Needed]
rocky  5ee377952badd94d08425aab41853916092acd07  stable/rocky  [EOL]
stein  71f22834f2240834ca591e27a920f9444bac9689  14.4.0        [Already done]
train  a96ad52c7e57664c63e3675b64718c5a288946fb  15.3.0        [Already done]
ussuri 5eeb98cdb51dc0dadd43128d1d0ed7d497606ded  16.2.0        [Already done]
master 12b9149e20665d80c11f1ef3d2283e1fa6f3b693  17.0.0        [Already done]

== Network loop between physical networks with DVR (Fix for 1869808)
https://bugs.launchpad.net/neutron/+bug/1887148
pike   00466f41d690ca7c7a918bfd861878ef620bbec9  stable/pike   [EOL]
queens 8a173ec29ac1819c3d28c191814cd1402d272bb9  stable/queens [Needed]
rocky  47ec363f5faefd85dfa33223c0087fafb5b9      stable/rocky  [EOL]
stein  8181c5dbfe799ac6c832ab67b7eab3bcef4098b9  14.3.1        [Already done]
train  17eded13595b18ab60af5256e0f63c57c3702296  15.2.0        [Already done]
ussuri 143fe8ff89ba776618ed6291af9d5e28e4662bdb  16.1.0        [Already done]
master c1a77ef8b74bb9b5abbc5cb03fb3201383122eb8  17.0.0        [Already done]

-- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1869808 Title: reboot neutron-ovs-agent introduces a short interrupt of vlan traffic To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-archive/+bug/1869808/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
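For reference, a minimal sketch of exporting the three [Needed] stable/queens commits above as patch files for the Ubuntu packaging, assuming an upstream neutron checkout; the actual SRU is delivered as the attached debdiffs:

git clone https://opendev.org/openstack/neutron && cd neutron
# export each stable/queens commit listed above as a patch file
for c in 131bbc9a53411033cf27664d8f1fd7afc72c57bf \
         ec6c98060d78c97edf6382ede977209f007fdb81 \
         8a173ec29ac1819c3d28c191814cd1402d272bb9; do
    git format-patch -1 --stdout "$c" > "../lp-${c:0:8}.patch"
done
# the resulting patches then go into debian/patches/ of the bionic neutron package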
[Bug 1906476] Re: PANIC at zfs_znode.c:335:zfs_znode_sa_init() // VERIFY(0 == sa_handle_get_from_db(zfsvfs->z_os, db, zp, SA_HDL_SHARED, &zp->z_sa_hdl)) failed
I can confirm 100% that this bug is still happening with 2.0.1 from hirsute-proposed, even with a brand new install on a different disk (SATA SSD instead of NVMe Intel Optane 900p SSD), using 2.0.1 inside the installer and from first boot. I can reproduce it reliably within about 2 hours just by using the desktop with Google Chrome (after restoring my Google Chrome sync, so a common set of data and extensions). It always seems to trigger first on an access from Google Chrome for some reason - that part is very reliable - but other files can get corrupt or lose access too, including git trees and the like. So I am at a loss to explain the cause, given no one outside of Ubuntu seems to be hitting this. It also, for whatever reason, always seems to cause my Tampermonkey and LastPass extension files to show as corrupt - but not those of other extensions - and this very reliably happens every time. The only notable change from default is that I am using encryption=on with a passphrase for /home/user. I have not tested with encryption off. -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1906476 Title: PANIC at zfs_znode.c:335:zfs_znode_sa_init() // VERIFY(0 == sa_handle_get_from_db(zfsvfs->z_os, db, zp, SA_HDL_SHARED, &zp->z_sa_hdl)) failed To manage notifications about this bug go to: https://bugs.launchpad.net/zfs/+bug/1906476/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1906476] Re: PANIC at zfs_znode.c:335:zfs_znode_sa_init() // VERIFY(0 == sa_handle_get_from_db(zfsvfs->z_os, db, zp, SA_HDL_SHARED, &zp->z_sa_hdl)) failed
Using 2.0.1 from hirsute-proposed, it seems I'm still hitting this. I moved .config/google-chrome aside and replaced it, and after using it for a day, shutting down, and booting back up, the same issue appeared again. Going to see if I can somehow reproduce this on a different disk or in a VM with xfstests or something. -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1906476 Title: PANIC at zfs_znode.c:335:zfs_znode_sa_init() // VERIFY(0 == sa_handle_get_from_db(zfsvfs->z_os, db, zp, SA_HDL_SHARED, &zp->z_sa_hdl)) failed To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/zfs-linux/+bug/1906476/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1906476] Re: PANIC at zfs_znode.c:335:zfs_znode_sa_init() // VERIFY(0 == sa_handle_get_from_db(zfsvfs->z_os, db, zp, SA_HDL_SHARED, &zp->z_sa_hdl)) failed
This issue seems to have appeared somewhere between zfs-linux 0.8.4-1ubuntu11 (last known working version) and 0.8.4-1ubuntu16. When the issue first hit, I had zfs-dkms installed, which was on 0.8.4-1ubuntu16, whereas the kernel build had 0.8.4-1ubuntu11. I removed zfs-dkms to go back to the kernel-built version and it was working OK.

linux-image-5.8.0-36-generic is now released on Hirsute with 0.8.4-1ubuntu16, so now the out-of-the-box kernel is also broken and I am regularly having problems with this.

linux-image-5.8.0-29-generic: working
linux-image-5.8.0-36-generic: broken

lathiat@optane ~/src/zfs[zfs-2.0-release]$ sudo modinfo /lib/modules/5.8.0-29-generic/kernel/zfs/zfs.ko|grep version
version: 0.8.4-1ubuntu11
lathiat@optane ~/src/zfs[zfs-2.0-release]$ sudo modinfo /lib/modules/5.8.0-36-generic/kernel/zfs/zfs.ko|grep version
version: 0.8.4-1ubuntu16

I don't have a good quick/easy reproducer, but just using my desktop for a day or two means I am likely to hit the issue after a while. I tried to install the upstream zfs-dkms package for 2.0 to see if I can bisect the issue on upstream versions, but it breaks my boot for some weird systemd reason I cannot quite figure out as yet.

Looking at the Ubuntu changelog, I'd say the fix for https://bugs.launchpad.net/bugs/1899826 that landed in 0.8.4-1ubuntu13 to backport the 5.9 and 5.10 compatibility patches is a prime suspect, but it could also be any other version. I'm going to try to 'bisect' 0.8.4-1ubuntu11 through 0.8.4-1ubuntu16 to figure out which version actually introduced it (a sketch of the package-downgrade loop follows this comment).

Since the default kernel is now hitting this, there have been 2 more user reports of the same thing in the upstream bug in the past few days since that kernel landed, and I am regularly getting inaccessible files, not just from Chrome but even a Linux git tree among other things.

I am going to raise the priority on this bug to Critical, as you lose access to files, so it has data-loss potential. I have not yet determined if you can somehow get the data back; so far it has only affected files I can replace, such as cache/git files. It seems like snapshots might be OK (which would make sense).

** Changed in: zfs-linux (Ubuntu) Importance: High => Critical
-- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1906476 Title: PANIC at zfs_znode.c:335:zfs_znode_sa_init() // VERIFY(0 == sa_handle_get_from_db(zfsvfs->z_os, db, zp, SA_HDL_SHARED, &zp->z_sa_hdl)) failed To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/zfs-linux/+bug/1906476/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
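A sketch of the planned package 'bisect', using pull-lp-debs from ubuntu-dev-tools to fetch superseded builds from Launchpad; this assumes that tool behaves as I expect here, and the version and deb filename are examples:

sudo apt install ubuntu-dev-tools
# fetch the binary debs for a specific historical source version
pull-lp-debs zfs-linux 0.8.4-1ubuntu13
sudo dpkg -i zfs-dkms_0.8.4-1ubuntu13_all.deb
# reboot, run the usual desktop workload for a day or two, and watch
# kern.log for the spl_panic; then repeat with the next version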
[Bug 1899826] Re: backport upstream fixes for 5.9 Linux support
Accidentally posted the above comment in the wrong bug, sorry; it was meant for https://bugs.launchpad.net/ubuntu/+source/zfs-linux/+bug/1906476 - where I suspect the fix from this bug of having caused a regression. -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1899826 Title: backport upstream fixes for 5.9 Linux support To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/zfs-linux/+bug/1899826/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1899826] Re: backport upstream fixes for 5.9 Linux support
This issue seems to have appeared somewhere between zfs-linux 0.8.4-1ubuntu11 (last known working version) and 0.8.4-1ubuntu16. When the issue first hit, I had zfs-dkms installed, which was on 0.8.4-1ubuntu16, whereas the kernel build had 0.8.4-1ubuntu11. I removed zfs-dkms to go back to the kernel-built version and it was working OK.

linux-image-5.8.0-36-generic is now released on Hirsute with 0.8.4-1ubuntu16, so now the out-of-the-box kernel is also broken, and I am regularly having problems with this.

linux-image-5.8.0-29-generic: working
linux-image-5.8.0-36-generic: broken

lathiat@optane ~/src/zfs[zfs-2.0-release]$ sudo modinfo /lib/modules/5.8.0-29-generic/kernel/zfs/zfs.ko|grep version
version: 0.8.4-1ubuntu11
lathiat@optane ~/src/zfs[zfs-2.0-release]$ sudo modinfo /lib/modules/5.8.0-36-generic/kernel/zfs/zfs.ko|grep version
version: 0.8.4-1ubuntu16

I don't have a good quick/easy reproducer, but just using my desktop for a day or two means I am likely to hit the issue after a while. I tried to install the upstream zfs-dkms package for 2.0 to see if I can bisect the issue on upstream versions, but it breaks my boot for some reason I cannot quite figure out.

Looking at the Ubuntu changelog, I'd say the fix for https://bugs.launchpad.net/bugs/1899826 that landed in 0.8.4-1ubuntu13 to backport the 5.9 and 5.10 compatibility patches is a prime suspect, but it could also be any other version. I'm going to try to 'bisect' 0.8.4-1ubuntu11 through 0.8.4-1ubuntu16 to figure out which version actually introduced it.

Since the default kernel is now hitting this, there have been 2 more user reports of the same thing in the upstream bug in the past few days since that kernel landed, and I am regularly getting inaccessible files, not just from Chrome but even a Linux git tree among other things.

I am going to raise the priority on this bug to Critical, as you lose access to files, so it has data-loss potential. I have not yet determined if you can somehow get the data back; so far it has only affected files I can replace, such as cache/git files. It seems like snapshots might be OK (which would make sense).

-- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1899826 Title: backport upstream fixes for 5.9 Linux support To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/zfs-linux/+bug/1899826/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1906476] Re: PANIC at zfs_znode.c:335:zfs_znode_sa_init() // VERIFY(0 == sa_handle_get_from_db(zfsvfs->z_os, db, zp, SA_HDL_SHARED, &zp->z_sa_hdl)) failed
Another user report here: https://github.com/openzfs/zfs/issues/10971 Curiously, I also found a 2016 (??) report of something similar here: https://bbs.archlinux.org/viewtopic.php?id=217204 -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1906476 Title: PANIC at zfs_znode.c:335:zfs_znode_sa_init() // VERIFY(0 == sa_handle_get_from_db(zfsvfs->z_os, db, zp, SA_HDL_SHARED, &zp->z_sa_hdl)) failed To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/zfs-linux/+bug/1906476/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1906476] Re: PANIC at zfs_znode.c:335:zfs_znode_sa_init() // VERIFY(0 == sa_handle_get_from_db(zfsvfs->z_os, db, zp, SA_HDL_SHARED, &zp->z_sa_hdl)) failed
I hit this problem again today, but now without zfs-dkms. After upgrading my kernel from 5.8.0-29-generic to 5.8.0-36-generic, my Google Chrome Cache directory is broken again; I had to rename it and then reboot to get out of the problem.

** Changed in: zfs-linux (Ubuntu) Importance: Undecided => High
** Bug watch added: github.com/openzfs/zfs/issues #10971 https://github.com/openzfs/zfs/issues/10971
-- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1906476 Title: PANIC at zfs_znode.c:335:zfs_znode_sa_init() // VERIFY(0 == sa_handle_get_from_db(zfsvfs->z_os, db, zp, SA_HDL_SHARED, &zp->z_sa_hdl)) failed To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/zfs-linux/+bug/1906476/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1874939] Re: ceph-osd can't connect after upgrade to focal
This issue appears to be documented here: https://docs.ceph.com/en/latest/releases/nautilus/#instructions

Complete the upgrade by disallowing pre-Nautilus OSDs and enabling all new Nautilus-only functionality:

# ceph osd require-osd-release nautilus

Important: This step is mandatory. Failure to execute this step will make it impossible for OSDs to communicate after msgrv2 is enabled.

-- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1874939 Title: ceph-osd can't connect after upgrade to focal To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1874939/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
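A quick sketch for checking whether this step was missed on an upgraded cluster, using standard ceph CLI commands from a monitor/admin node:

# show the currently required OSD release
ceph osd dump | grep require_osd_release
# if it still shows the pre-upgrade release, complete the documented step
ceph osd require-osd-release nautilus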
[Bug 1907262] Re: raid10: discard leads to corrupted file system
** Attachment added: "blktrace-lp1907262.tar.gz" https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1907262/+attachment/5442212/+files/blktrace-lp1907262.tar.gz -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1907262 Title: raid10: discard leads to corrupted file system To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1907262/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1907262] Re: raid10: discard leads to corrupted file system
I can reproduce this on a Google Cloud n1-standard-16 using 2x Local NVMe disks: partition nvme0n1 and nvme0n2 with only an 8GB partition, then format directly with ext4 (skip LVM). In this setup each 'check' takes <1 min, which speeds up testing considerably. Example details below - the pre-emptible instance cost for this seems to be $0.292/hour / $7/day.

gcloud compute instances create raid10-test --project=juju2-157804 \
  --zone=us-west1-b \
  --machine-type=n1-standard-16 \
  --subnet=default \
  --network-tier=STANDARD \
  --no-restart-on-failure \
  --maintenance-policy=TERMINATE \
  --preemptible \
  --boot-disk-size=32GB \
  --boot-disk-type=pd-ssd \
  --image=ubuntu-1804-bionic-v20201116 --image-project=ubuntu-os-cloud \
  --local-ssd=interface=NVME --local-ssd=interface=NVME

# apt install linux-image-virtual
# apt-get remove linux-image-gcp linux-image-5.4.0-1029-gcp linux-image-unsigned-5.4.0-1029-gcp --purge
# reboot

sgdisk -n 0:0:+8G /dev/nvme0n1
sgdisk -n 0:0:+8G /dev/nvme0n2
mdadm -C -v -l10 -n2 -N "lv-raid" -R /dev/md0 /dev/nvme0n1p2 /dev/nvme1n1p2
mkfs.ext4 /dev/md0
mount /dev/md0 /mnt
dd if=/dev/zero of=/mnt/data.raw bs=4K count=1M; sync; rm /mnt/data.raw
echo check >/sys/block/md0/md/sync_action
watch 'grep . /proc/mdstat /sys/block/md0/md/mismatch_cnt'
# no mismatch
fstrim -v /mnt
echo check >/sys/block/md0/md/sync_action
watch 'grep . /proc/mdstat /sys/block/md0/md/mismatch_cnt'
# mismatch=256

I ran blktrace /dev/md0 /dev/nvme0n1 /dev/nvme0n2 and will upload the results; I didn't have time to try to understand them as yet.

Some thoughts:
- It was asserted that the first disk 'appears' fine, so I wondered whether we can reliably repair by asking mdadm to do a 'repair' or 'resync' (see the sketch after this comment).
- It seems that reads are at least sometimes balanced (maybe by PID) to different disks since this post: https://www.spinics.net/lists/raid/msg62762.html - it is unclear if the same selection impacts writes (not that it would help performance). So it's unclear we can reliably say only a 'passive mirror' is being corrupted; it's possible application reads may or may not be corrupted. More testing/understanding of the code is required.
- This area of RAID10 and RAID1 seems quite under-documented; "man md" doesn't say much about how, or from which disk, a mismatch is repaired (unlike RAID5, where the parity gives us some assurances as to which data is wrong).
- We should try writes from different PIDs, with known different data, and compare the data on both disks with the known data, to see if we can knowingly get the wrong data on both disks or only one. And try that with 4 disks instead of 2.

-- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1907262 Title: raid10: discard leads to corrupted file system To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1907262/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
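As a follow-up to the first thought above, a minimal sketch of the repair mechanics via the standard md sysfs interface; this only demonstrates how to trigger a repair, it does not answer which mirror the kernel treats as authoritative:

# rewrite mismatched regions (md copies one mirror over the other)
echo repair >/sys/block/md0/md/sync_action
# wait for it to finish, then re-check and confirm the count returns to 0
echo check >/sys/block/md0/md/sync_action
cat /sys/block/md0/md/mismatch_cnt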
[Bug 1906476] Re: PANIC at zfs_znode.c:335:zfs_znode_sa_init() // VERIFY(0 == sa_handle_get_from_db(zfsvfs->z_os, db, zp, SA_HDL_SHARED, &zp->z_sa_hdl)) failed
I should mention that Chrome itself always showed "waiting for cache", which backs up the story around the cache files being the ones affected. -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1906476 Title: PANIC at zfs_znode.c:335:zfs_znode_sa_init() // VERIFY(0 == sa_handle_get_from_db(zfsvfs->z_os, db, zp, SA_HDL_SHARED, &zp->z_sa_hdl)) failed To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/zfs-linux/+bug/1906476/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1906476] [NEW] PANIC at zfs_znode.c:335:zfs_znode_sa_init() // VERIFY(0 == sa_handle_get_from_db(zfsvfs->z_os, db, zp, SA_HDL_SHARED, &zp->z_sa_hdl)) failed
Public bug reported:

Since today, while running Ubuntu 21.04 Hirsute, I started getting a ZFS panic in the kernel log which was also hanging disk I/O for all Chrome/Electron apps. I have narrowed down a few important notes:

- It does not happen with module version 0.8.4-1ubuntu11 built and included with 5.8.0-29-generic
- It was happening when using zfs-dkms 0.8.4-1ubuntu16 built with DKMS on the same kernel and also on 5.8.18-acso (a custom kernel).
- For whatever reason multiple Chrome/Electron apps were affected, specifically Discord, Chrome and Mattermost. In all cases they seem hung trying to open files in their 'Cache' directory, e.g. ~/.cache/google-chrome/Default/Cache and ~/.config/Mattermost/Cache (I was unable to strace the processes, so it was a bit hard to confirm 100%, but this is my deduction from /proc/PID/fd and the hanging ls). While the issue was going on I could not list those directories either; "ls" would just hang.
- Once I removed zfs-dkms to revert to the kernel built-in version, it immediately worked without changing anything, removing files, etc.
- It happened over multiple reboots and kernels every time; all my Chrome apps weren't working, but for whatever reason nothing else seemed affected.
- It would log a series of spl_panic dumps into kern.log that look like this:

Dec 2 12:36:42 optane kernel: [ 72.857033] VERIFY(0 == sa_handle_get_from_db(zfsvfs->z_os, db, zp, SA_HDL_SHARED, &zp->z_sa_hdl)) failed
Dec 2 12:36:42 optane kernel: [ 72.857036] PANIC at zfs_znode.c:335:zfs_znode_sa_init()

I could only find one other Google reference to this issue, with 2 other users reporting the same error but on 20.04, here: https://github.com/openzfs/zfs/issues/10971

- I was not experiencing the issue on 0.8.4-1ubuntu14, and am fairly sure it was working on 0.8.4-1ubuntu15 but broken after the upgrade to 0.8.4-1ubuntu16. I will reinstall those zfs-dkms versions to verify that.

There were a few originating call stacks, but the first one I hit was:

Call Trace:
dump_stack+0x74/0x95
spl_dumpstack+0x29/0x2b [spl]
spl_panic+0xd4/0xfc [spl]
? sa_cache_constructor+0x27/0x50 [zfs]
? _cond_resched+0x19/0x40
? mutex_lock+0x12/0x40
? dmu_buf_set_user_ie+0x54/0x80 [zfs]
zfs_znode_sa_init+0xe0/0xf0 [zfs]
zfs_znode_alloc+0x101/0x700 [zfs]
? arc_buf_fill+0x270/0xd30 [zfs]
? __cv_init+0x42/0x60 [spl]
? dnode_cons+0x28f/0x2a0 [zfs]
? _cond_resched+0x19/0x40
? _cond_resched+0x19/0x40
? mutex_lock+0x12/0x40
? aggsum_add+0x153/0x170 [zfs]
? spl_kmem_alloc_impl+0xd8/0x110 [spl]
? arc_space_consume+0x54/0xe0 [zfs]
? dbuf_read+0x4a0/0xb50 [zfs]
? _cond_resched+0x19/0x40
? mutex_lock+0x12/0x40
? dnode_rele_and_unlock+0x5a/0xc0 [zfs]
? _cond_resched+0x19/0x40
? mutex_lock+0x12/0x40
? dmu_object_info_from_dnode+0x84/0xb0 [zfs]
zfs_zget+0x1c3/0x270 [zfs]
? dmu_buf_rele+0x3a/0x40 [zfs]
zfs_dirent_lock+0x349/0x680 [zfs]
zfs_dirlook+0x90/0x2a0 [zfs]
? zfs_zaccess+0x10c/0x480 [zfs]
zfs_lookup+0x202/0x3b0 [zfs]
zpl_lookup+0xca/0x1e0 [zfs]
path_openat+0x6a2/0xfe0
do_filp_open+0x9b/0x110
? __check_object_size+0xdb/0x1b0
? __alloc_fd+0x46/0x170
do_sys_openat2+0x217/0x2d0
? do_sys_openat2+0x217/0x2d0
do_sys_open+0x59/0x80
__x64_sys_openat+0x20/0x30

** Affects: zfs-linux (Ubuntu) Importance: Undecided Status: New
-- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1906476 Title: PANIC at zfs_znode.c:335:zfs_znode_sa_init() // VERIFY(0 == sa_handle_get_from_db(zfsvfs->z_os, db, zp, SA_HDL_SHARED, &zp->z_sa_hdl)) failed To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/zfs-linux/+bug/1906476/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1847361] Re: Upgrade of qemu binaries causes running instances not able to dynamically load modules
Note: This patch has related regressions in Hirsute due to the version number containing a space: https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1906245 https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1905377 It seems the patch has been temporarily dropped; we will need to ensure we don't totally lose the fix. -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1847361 Title: Upgrade of qemu binaries causes running instances not able to dynamically load modules To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-archive/+bug/1847361/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1902351] Re: forgets touchpad settings
I am experiencing this as well; it worked on 20.04 Focal and is broken on 20.10 Groovy and on 21.04 Hirsute as of today with the latest Hirsute packages. I am using GNOME with a Logitech T650 touchpad. If I unplug and replug the receiver, it forgets the settings again. I then have to toggle both natural scrolling (Settings->Touchpad) and "mouse click emulations" (Tweaks) each time to have it work again. Given this apparently occurs across both GNOME and KDE, perhaps it is somehow related to libinput rather than gnome-shell/settings/etc? ** Also affects: libinput (Ubuntu) Importance: Undecided Status: New -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1902351 Title: forgets touchpad settings To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/kcm-touchpad/+bug/1902351/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1903745] Re: pacemaker left stopped after unattended-upgrade of pacemaker (1.1.14-2ubuntu1.8 -> 1.1.14-2ubuntu1.9)
For clarity, my findings so far are that:

- The package upgrade stops pacemaker
- After 30 seconds (customised down from 30min by charm-hacluster), the stop times out and pretends to have finished, but leaves pacemaker running (due to SendSIGKILL=no in the .service, intentionally set upstream to prevent fencing)
- Pacemaker is started again, but fails to start because the old copy is still running, so it exits and the systemd service is left 'stopped'
- The original "unmanaged" pacemaker copy eventually exits some time later (usually once the resources have all transitioned away), leaving no running pacemaker at all

Compounding this issue is that:

- Pacemaker won't stop until it confirms all local services have stopped and transitioned away to other nodes (and possibly also that it won't destroy quorum by going down, but I am not sure about that bit). In some cases this just takes more than 30 seconds; in other cases the cluster may be in such a state that it will never happen, e.g. another node was already down or trying to shut down.
- All unattended-upgrades happen within a randomized 60 minute window (apt-daily-upgrade.timer), and they all just try to stop pacemaker without regard to whether that is possible or likely to succeed - after a while all 3 will be attempting to stop, so none of them can succeed.

Current Thoughts:

- Adjust the charm-hacluster StopTimeout=30 back to some value (possibly the default) after testing that this does not break the charm during deploy/scale-up/scale-down [as noted in previous bugs where it was originally added, but the original case was supposedly fixed by adding the cluster_count option] - see the drop-in sketch after this comment.
- Consider whether we need to override SendSIGKILL in the charm - changing it as a global package default seems like a bad idea
- Research an improvement to the pacemaker dpkg scripts to do something smarter than just running stop; for example, the preinst script could ask for a transition away without actually stopping pacemaker, and/or abort the upgrade if it is obvious that the transition will fail.
- As a related note, the patch to set BindsTo=corosync on pacemaker.service was removed in Groovy due to debate with Debian over this change (but it still exists in Xenial-Focal). This is something that will need to be dealt with for the next LTS. This override should probably be added to charm-hacluster at a minimum.

-- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1903745 Title: pacemaker left stopped after unattended-upgrade of pacemaker (1.1.14-2ubuntu1.8 -> 1.1.14-2ubuntu1.9) To manage notifications about this bug go to: https://bugs.launchpad.net/charm-hacluster/+bug/1903745/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
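A sketch of what testing a restored stop timeout could look like via the same drop-in path the charm manages; the 30min value simply mirrors the pacemaker package default:

# /etc/systemd/system/pacemaker.service.d/overrides.conf
# [Service]
# TimeoutStopSec=30min

sudo systemctl edit pacemaker     # create/adjust the drop-in
sudo systemctl daemon-reload
systemctl show pacemaker -p TimeoutStopUSec   # confirm the effective value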
[Bug 1903745] Re: pacemaker left stopped after unattended-upgrade of pacemaker (1.1.14-2ubuntu1.8 -> 1.1.14-2ubuntu1.9)
With regards to Billy's Comment #18, my analysis for that bionic sosreport is in Comment #8, where I found that specific sosreport didn't experience this issue - but most likely that node was suffering from the issue occurring on the MySQL nodes it was connected to, and the service couldn't connect to MySQL as a result. We'd need the full logs (sosreport --all-logs) from all related keystone nodes and mysql nodes in the environment to be sure, but I am 95% sure that is the case there. I think there is some argument to be made for improving the package restart process for the pacemaker package itself; however, based on the logs here and in a couple of environments I analysed, I am finding that the primary problem is specifically related to the reduced StopTimeout set by charm-hacluster. So I think we should focus on that issue here, and if we decide it makes sense to make improvements to the pacemaker package process itself, that should be opened as a separate bug, as I haven't seen any evidence of that issue in the logs here so far. For anyone else experiencing this bug, please take a *full* copy of /var/log (or sosreport --all-logs) from -all- nodes in that specific pacemaker cluster and upload them, and I am happy to analyse them - if you need a non-public location to share the files, feel free to e-mail them to me. It would be great to receive that even from nodes already recovered, so we can ensure we fully understand all the cases that happened. -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1903745 Title: pacemaker left stopped after unattended-upgrade of pacemaker (1.1.14-2ubuntu1.8 -> 1.1.14-2ubuntu1.9) To manage notifications about this bug go to: https://bugs.launchpad.net/charm-hacluster/+bug/1903745/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1903745] Re: upgrade from 1.1.14-2ubuntu1.8 to 1.1.14-2ubuntu1.9 breaks clusters
** Changed in: charm-hacluster
   Status: New => Confirmed

** Changed in: pacemaker (Ubuntu)
   Status: Confirmed => Invalid

** Summary changed:
- upgrade from 1.1.14-2ubuntu1.8 to 1.1.14-2ubuntu1.9 breaks clusters
+ pacemaker left stopped after unattended-upgrade of pacemaker (1.1.14-2ubuntu1.8 -> 1.1.14-2ubuntu1.9)

** Description changed:
  On several machines running pacemaker with corosync, after the package was upgraded by unattended-upgrades, the VIPs were gone. Restarting pacemaker and corosync didn't help, because some processes (lrmd) remained after the stop. Manually killing them allowed to restart in a good shape.
- This is on Ubuntu xenial.
+ This is on Ubuntu xenial (EDIT: and bionic)

-- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1903745 Title: pacemaker left stopped after unattended-upgrade of pacemaker (1.1.14-2ubuntu1.8 -> 1.1.14-2ubuntu1.9) To manage notifications about this bug go to: https://bugs.launchpad.net/charm-hacluster/+bug/1903745/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1903745] Re: upgrade from 1.1.14-2ubuntu1.8 to 1.1.14-2ubuntu1.9 breaks clusters
For the fix to Bug #1654403, charm-hacluster sets TimeoutStartSec and TimeoutStopSec for both corosync and pacemaker, to the same values.

system-wide default (xenial, bionic): TimeoutStopSec=90s TimeoutStartSec=90s
corosync package default: system-wide default (no changes)
pacemaker package default: TimeoutStopSec=30min TimeoutStartSec=60s
charm-hacluster corosync+pacemaker override: TimeoutStopSec=60s TimeoutStartSec=180s

effective changes:
corosync:  TimeoutStopSec=90s -> 60s,   TimeoutStartSec=90s -> 180s
pacemaker: TimeoutStopSec=30min -> 60s, TimeoutStartSec=60s -> 180s

The original bug description was "On corosync restart, corosync may take longer than a minute to come up. The systemd start script times out too soon. Then pacemaker which is dependent on corosync is immediatly started and fails as corosync is still in the process of starting." So the TimeoutStartSec increase from 60/90 -> 180 was the only thing needed; I believe the TimeoutStopSec change for pacemaker is in error, at least as that bug is described.

Having said that, I can imagine charm failures during deployment or reconfiguration where the charm tries to stop pacemaker for various reasons and it fails to stop fast enough because the resources won't migrate away (possibly because all the nodes are trying to stop at the same time, as charm-hacluster doesn't seem to have a staggered change setup), and it currently restarts corosync to effect changes to the ring. So this may well have fixed other charm-related problems not really accurately described in the previous bug - though that bug does specifically mention cases where the expected cluster_count is not set; in that case it tries to set up corosync/pacemaker before all 3 nodes are up, which might get into this scenario. So before we go ahead and change the stop_timeout back to 30min, we probably need to validate various scenarios for that issue.

-- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1903745 Title: pacemaker left stopped after unattended-upgrade of pacemaker (1.1.14-2ubuntu1.8 -> 1.1.14-2ubuntu1.9) To manage notifications about this bug go to: https://bugs.launchpad.net/charm-hacluster/+bug/1903745/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1903745] Re: upgrade from 1.1.14-2ubuntu1.8 to 1.1.14-2ubuntu1.9 breaks clusters
I misread, and the systemd unit is native; it already sets the following settings:

SendSIGKILL=no
TimeoutStopSec=30min
TimeoutStartSec=60s

The problem is that most of these failures have been experienced on juju hacluster charm installations, which override these values:

$ cat ./systemd/system/pacemaker.service.d/overrides.conf
[Service]
TimeoutStartSec=180
TimeoutStopSec=60

This was apparently done to fix the following bug: https://bugs.launchpad.net/charms/+source/hacluster/+bug/1654403 FWIW these values are configurable in charm config options. It seems that bug needs to be revisited, and/or this bug may need to be retargeted, at least in part, to charm-hacluster. -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1903745 Title: upgrade from 1.1.14-2ubuntu1.8 to 1.1.14-2ubuntu1.9 breaks clusters To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/pacemaker/+bug/1903745/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1903745] Re: upgrade from 1.1.14-2ubuntu1.8 to 1.1.14-2ubuntu1.9 breaks clusters
Analysed the logs for an occurrence of this; the problem appears to be that pacemaker doesn't stop after 1 minute, so systemd gives up and just starts a new instance anyway, noting that all of the existing processes are left behind. I am awaiting the extra rotated logs to confirm, but from what I can see, the new pacemaker basically fails to start because the old one is still running, and then the old one eventually exits, leaving you with no instance of pacemaker (which is the state we found it in - pacemaker was stopped).

06:13:44 systemd[1]: pacemaker.service: State 'stop-sigterm' timed out. Skipping SIGKILL.
06:13:44 pacemakerd[427]: notice: Caught 'Terminated' signal
06:14:44 systemd[1]: pacemaker.service: State 'stop-final-sigterm' timed out. Skipping SIGKILL. Entering failed mode.
06:14:44 systemd[1]: pacemaker.service: Failed with result 'timeout'.
06:14:44 systemd[1]: Stopped Pacemaker High Availability Cluster Manager.
06:14:45 systemd[1]: pacemaker.service: Found left-over process 445 (cib) in control group while starting unit. Ignoring.
06:14:45 systemd[1]: pacemaker.service: Found left-over process 449 (attrd) in control group while starting unit. Ignoring.
06:14:45 systemd[1]: pacemaker.service: Found left-over process 450 (pengine) in control group while starting unit. Ignoring.
06:14:45 systemd[1]: pacemaker.service: Found left-over process 451 (crmd) in control group while starting unit. Ignoring.
06:14:45 systemd[1]: pacemaker.service: Found left-over process 427 (pacemakerd) in control group while starting unit. Ignoring.
06:14:45 systemd[1]: pacemaker.service: Found left-over process 447 (stonithd) in control group while starting unit. Ignoring.
06:14:45 systemd[1]: pacemaker.service: Found left-over process 448 (lrmd) in control group while starting unit. Ignoring.
06:14:45 systemd[1]: pacemaker.service: Failed to reset devices.list: Operation not permitted
06:14:45 systemd[1]: Started Pacemaker High Availability Cluster Manager.

Likely the solution here is some combination of tweaking the systemd config to wait longer, force-kill if necessary, and possibly reap all processes if it does force a restart. It's not a native systemd unit, though some of this stuff can be tweaked by comments. I'll look a little further at that.

-- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1903745 Title: upgrade from 1.1.14-2ubuntu1.8 to 1.1.14-2ubuntu1.9 breaks clusters To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/pacemaker/+bug/1903745/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1903745] Re: upgrade from 1.1.14-2ubuntu1.8 to 1.1.14-2ubuntu1.9 breaks clusters
I reviewed the sosreports and provide some general analysis below.

[sosreport-juju-machine-2-lxc-1-2020-11-10-tayyude]
I don't see any sign in this log of package upgrades or VIP stop/starts; I suspect this host may be unrelated.

[sosreport-juju-caae6f-19-lxd-6-20201110230352.tar.xz]
This is a charm-keystone node. Looking at this sosreport, my general finding is that everything worked correctly on this specific host.

unattended-upgrades.log: we can see the upgrade starts at 2020-11-10 06:17:03 and finishes at 2020-11-10 06:17:48.

syslog.1:
Nov 10 06:17:41 juju-caae6f-19-lxd-6 crmd[41203]: notice: Result of probe operation for res_ks_680cfdf_vip on juju-caae6f-19-lxd-6: 7 (not running)
Nov 10 06:19:44 juju-caae6f-19-lxd-6 crmd[41203]: notice: Result of start operation for res_ks_680cfdf_vip on juju-caae6f-19-lxd-6: 0 (ok)

We also see that the VIP moved around to different hosts a few times, likely as a result of each host successively upgrading, which makes sense. I don't see any sign in this log of the mentioned lrmd issue.

[mysql issue]
What we do see, however, is "Too many connections" issues from MySQL in the keystone logs. This generally happens because when the VIP moves from one host to another, all the old connections are left behind and go stale (because the VIP was removed, the traffic for these connections just disappears and is sent to the new VIP owner, which doesn't have those TCP connections). They sit there until wait_timeout is reached (typically either 180s/3 min or 3600s/1 hour in our deployments), as the node will never get the TCP reset when the remote end sends it.

The problem happens when the VIP fails *back* to a host it already failed away from: many of the connection slots are still used by the stale connections, and you run out of connections if your max_connections limit is not at least double your normal connection count. This problem will eventually self-resolve once the connections time out, but that may take an hour.

Note that this sosreport is from a keystone node that *also* has charm-hacluster/corosync/pacemaker, but the above-discussed mysql issue would have occurred on the percona mysql nodes. To analyse the number of failovers we would need to get sosreports from the mysql node(s).

[summary]
I think we likely have 2 potential issues here from what I can see described so far. Firstly, the networkd issue is likely not related to this specific case, as that happens specifically when systemd is upgraded and thus networkd is restarted, which shouldn't have happened here.

(Issue 1) The first is that we hit max_connections due to the multiple successive MySQL VIP failovers, where max_connections is not at least 2x the steady-state connection count. It also seems possible that in some cases the VIP may shift back to the same host a 3rd time by chance, so you may end up needing 3x. I think we could potentially improve that by modifying the pacemaker resource scripts to kill active connections when the VIP departs, or by ensuring max_connections is 2-3x the steady-state active connection count. That should go into a new bug, likely against charm-percona-cluster, as it ships its own resource agent. We could also potentially add a configurable nagios check for having active connections in excess of 50% of max_connections (see the sketch at the end of this comment).

(Issue 2) It was described that pacemaker got into a bad state during the restart: the lrmd didn't exit, and pacemaker didn't work correctly until it was manually killed and restarted.
I think we need to get more logs/sosreports from the nodes that had that specific issue; it sounds like something that may be a bug specific to a certain scenario, or perhaps to the older xenial version [this USN-4623-1 update happened for all LTS releases, 16.04/18.04/20.04]. -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1903745 Title: upgrade from 1.1.14-2ubuntu1.8 to 1.1.14-2ubuntu1.9 breaks clusters To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/pacemaker/+bug/1903745/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
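On the Issue 1 point above, a quick sketch of the proposed connection-headroom check, using standard MySQL commands on the percona node; the 50% threshold is just the suggested nagios-style heuristic:

# compare the current connection count against the configured ceiling
mysql -e "SHOW GLOBAL STATUS LIKE 'Threads_connected'; SHOW GLOBAL VARIABLES LIKE 'max_connections';"
# steady-state Threads_connected should stay well under half of
# max_connections to survive a VIP failover plus fail-back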
[Bug 1848497] Re: virtio-balloon change breaks migration from qemu prior to 4.0
I have verified the package for this specific virtio-balloon issue discussed in this bug only. Migrating from 3.1+dfsg-2ubuntu3.2~cloud0:

- To the latest released version (3.1+dfsg-2ubuntu3.7~cloud0): fails due to balloon setup

2020-10-26T07:40:30.157066Z qemu-system-x86_64: get_pci_config_device: Bad config data: i=0x10 read: a1 device: 1 cmask: ff wmask: c0 w1cmask:0
2020-10-26T07:40:30.157431Z qemu-system-x86_64: Failed to load PCIDevice:config
2020-10-26T07:40:30.157443Z qemu-system-x86_64: Failed to load virtio-balloon:virtio
2020-10-26T07:40:30.157448Z qemu-system-x86_64: error while loading state for instance 0x0 of device ':00:04.0/virtio-balloon'
2020-10-26T07:40:30.159527Z qemu-system-x86_64: load of migration failed: Invalid argument
2020-10-26 07:40:30.223+: shutting down, reason=failed

- To the proposed version (3.1+dfsg-2ubuntu3.7~cloud1): works as expected

Marking as verification completed.

** Tags removed: verification-stein-needed
** Tags added: verification-stein-done
-- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1848497 Title: virtio-balloon change breaks migration from qemu prior to 4.0 To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-archive/+bug/1848497/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1897483] Re: With hardware offloading enabled, OVS logs are spammed with netdev_offload_tc ERR messages
There is an indication in the RHBZ below that this can actually prevent openvswitch from working properly, as it loses too much CPU time to this processing in large environments (100s or 1000s of ports): https://bugzilla.redhat.com/show_bug.cgi?id=1737982 There seems to be a rejected upstream patch here; it is unclear if one was later accepted - we should check for that (see the sketch below) and potentially prioritise a fix for this: https://lists.linuxfoundation.org/pipermail/ovs-dev/2019-March/357348.html ** Tags added: sts ** Bug watch added: Red Hat Bugzilla #1737982 https://bugzilla.redhat.com/show_bug.cgi?id=1737982 -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1897483 Title: With hardware offloading enabled, OVS logs are spammed with netdev_offload_tc ERR messages To manage notifications about this bug go to: https://bugs.launchpad.net/charm-neutron-openvswitch/+bug/1897483/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
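A small sketch of the suggested upstream check, searching the OVS git history for a later accepted fix; the file paths are guesses at where such a change would land (the module was renamed at some point):

git clone https://github.com/openvswitch/ovs && cd ovs
# look for later commits mentioning the tc offload module
git log --oneline --grep='netdev-offload-tc' | head
git log --oneline -- lib/netdev-tc-offloads.c lib/netdev-offload-tc.c | head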
[Bug 1896734] Re: A privsep daemon spawned by neutron-openvswitch-agent hangs when debug logging is enabled (large number of registered NICs) - an RPC response is too large for msgpack
** Tags added: seg -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1896734 Title: A privsep daemon spawned by neutron-openvswitch-agent hangs when debug logging is enabled (large number of registered NICs) - an RPC response is too large for msgpack To manage notifications about this bug go to: https://bugs.launchpad.net/charm-neutron-openvswitch/+bug/1896734/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1887779] Re: recurrent uncaught exception
I hit this too; after restarting to fix it, I also lost all my stored metrics from the last few days. So I am going to triage this as High. ** Changed in: graphite-carbon (Ubuntu) Importance: Undecided => High -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1887779 Title: recurrent uncaught exception To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/graphite-carbon/+bug/1887779/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1882416] Re: virtio-balloon change breaks rocky -> stein live migrate
I think the issue here is that Stein's qemu comes from Disco, which was EOL before Bug #1848497 was fixed, and so the change wasn't backported. While Stein is EOL next month, the problem is that this makes live migrations fail, and those are often wanted during OpenStack upgrades to actually get through Stein and onto Train. So I think we'll need to backport the fix.
[Bug 1882416] Re: virtio-balloon change breaks rocky -> stein live migrate
** Tags added: seg
[Bug 1893889] Re: unattended-upgrade of nova-common failure due to conffile prompt
Right, the systems are running 1.1ubuntu1.18.04.11. In my original query to you I was trying to figure out whether the patches in .12 or .13 were likely to have caused this specific situation, and you weren't sure, hence the bug report with more details.

** Changed in: unattended-upgrades (Ubuntu) Status: Incomplete => New
[Bug 1894453] Re: Building Ceph packages with RelWithDebInfo
Are we sure it's actually building as Debug? At least 15.2.3 on focal seems to build with RelWithDebInfo: I see -O2. Only do_cmake.sh had logic for this (it would set Debug if a .git directory exists), but the debian rules file doesn't seem to use that script; it invokes cmake directly.
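One way to confirm the build type actually used is to inspect the cmake cache in the build tree (a sketch; the obj-* directory name is whatever debhelper created for the build):

  # what build type cmake recorded
  grep '^CMAKE_BUILD_TYPE' obj-*/CMakeCache.txt

  # RelWithDebInfo should correspond to -O2 -g in the cached flags
  grep '^CMAKE_CXX_FLAGS_RELWITHDEBINFO' obj-*/CMakeCache.txt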
[Bug 1893889] Re: unattended-upgrade of nova-common failure due to conffile prompt
** Attachment added: "dpkg.log.6" https://bugs.launchpad.net/ubuntu/+source/unattended-upgrades/+bug/1893889/+attachment/5406809/+files/dpkg.log.6
[Bug 1893889] Re: unattended-upgrade of nova-common failure due to conffile prompt
Uploaded all historical log files in lp1893889-logs.tar.gz.
Uploaded dpkg_-l.
For convenient access, also uploaded unattended-upgrades.log.4, unattended-upgrades-dpkg.log.4 and dpkg.log.6, which have the lines from the first instance of hitting the error.
[Bug 1893889] Re: unattended-upgrade of nova-common failure due to conffile prompt
** Attachment added: "unattended-upgrades-dpkg.log.4" https://bugs.launchpad.net/ubuntu/+source/unattended-upgrades/+bug/1893889/+attachment/5406808/+files/unattended-upgrades-dpkg.log.4
[Bug 1893889] Re: unattended-upgrade of nova-common failure due to conffile prompt
** Attachment added: "dpkg_-l" https://bugs.launchpad.net/ubuntu/+source/unattended-upgrades/+bug/1893889/+attachment/5406810/+files/dpkg_-l
[Bug 1893889] Re: unattended-upgrade of nova-common failure due to conffile prompt
** Attachment added: "unattended-upgrades.log.4" https://bugs.launchpad.net/ubuntu/+source/unattended-upgrades/+bug/1893889/+attachment/5406807/+files/unattended-upgrades.log.4
[Bug 1893889] Re: unattended-upgrade of nova-common failure due to conffile prompt
** Attachment added: "all unattended-upgrades and dpkg logs" https://bugs.launchpad.net/ubuntu/+source/unattended-upgrades/+bug/1893889/+attachment/5406806/+files/lp1893889-logs.tar.gz
[Bug 1893889] [NEW] unattended-upgrade of nova-common failure due to conffile prompt
Public bug reported:

unattended-upgrades attempted to upgrade nova from 2:17.0.9-0ubuntu1 to 2:17.0.10-0ubuntu2.1 (bionic-security); however, nova-common contains a modified conffile (/etc/nova/nova.conf) which prompts during upgrade and leaves apt/dpkg in a permanent error state requiring manual intervention. It also prevents other automated apt install operations from working while in this state.

I understand that this conffile prompt is a generally known problem and that unattended-upgrades specifically attempts to skip upgrades that have such a conffile prompt; however, that did not work in this case. I am filing this bug to try to identify and resolve the cause, as this affected multiple systems in an Ubuntu OpenStack deployment.

rbalint advised that this is very likely a more complex interaction with the exact upgrades that were being staged at the time, and hence more logs would be needed. Indeed, attempting to reproduce this very simply with a downgrade of the nova packages to 2:17.0.0-0ubuntu1 results in it being skipped, as expected:

root@juju-c21ec6-bionic-nova-7:/home/ubuntu# unattended-upgrade
Package nova-common has conffile prompt and needs to be upgraded manually

From the unattended-upgrades log we can see that 179 packages in total were scheduled to upgrade together during this run.

Attaching the following log files:
/var/log/unattended-upgrades/*
/var/log/dpkg*
dpkg_-l

(As at 2020-04-27 16:22, the same time period as the unattended-upgrades logs; the dpkg.log* files were taken later but also cover the full time period from before 2019-12-28 to after 2020-04-27.)

The first instance of the failure is in unattended-upgrades.log.4.gz line 161:
"2019-12-28 06:15:29,837 Packages that will be upgraded: amd64-microcode... [truncated, 179 packages total]"

That relates to the output in unattended-upgrades-dpkg.log.4.gz line 791:
"Log started: 2019-12-28 06:25:56"

Which relates to the output of dpkg.log.6.gz line 392:
"2019-12-28 06:25:56 upgrade nova-compute-kvm:all 2:17.0.9-0ubuntu1 2:17.0.10-0ubuntu2.1"

It fails many times after that, as any time you attempt to install a package it tries to configure nova.conf again and exits with an error again; but that is the original failure. Note that various package upgrades happened via unattended-upgrades (and possibly other sources) in the intervening 4 months, so I guess reproducing the situation may require reverse engineering the original package list from the dpkg logs. I have not yet attempted to do that, in the hope that intimate knowledge of the unattended-upgrades code and logs will make that process faster.

A full sosreport from the system is available if more information is required; it includes other log files and various other command outputs. It is not uploaded initially for privacy.

** Affects: unattended-upgrades (Ubuntu)
Importance: Undecided
Status: New

** Tags: sts

** Tags added: sts
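As I understand it, the skip decision turns on whether the conffile on disk still matches the md5sum dpkg recorded for it; a manual spot check along those lines (a sketch, using nova-common as the example package):

  # dpkg's recorded conffile checksums for the package
  dpkg-query --showformat='${Conffiles}\n' --show nova-common | grep nova.conf

  # compare with the file on disk; a mismatch means an upgrade can prompt
  md5sum /etc/nova/nova.conf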
[Bug 1891269] Re: perf is not built with python script support
Logs are not required for this issue.

** Changed in: linux (Ubuntu) Status: Incomplete => Confirmed
[Bug 1891269] [NEW] perf is not built with python script support
Public bug reported:

The "perf" tool supports python scripting to process events; this support is currently not enabled.

$ sudo perf script -g python
Python scripting not supported. Install libpython and rebuild perf to enable it.
For example:
# apt-get install python-dev (ubuntu)
# yum install python-devel (Fedora)
etc.

The expected behaviour is that the script creates a template python file for you to modify to process the events.

From what I can see, enabling this requires a few items (a usage sketch follows at the end of this report):
- We need to Build-Depend on python3-dev
- We would ship the perf-script-python binary
- There are various python modules (under tools/perf/scripts/python) needed for these to work
- There are also a number of upstream scripts (e.g. 'net_dropmonitor') we could ship. Normally you can see those by running 'perf script -l', but we get: open(/usr/libexec/perf-core/scripts) failed. Check "PERF_EXEC_PATH" env to set scripts dir. The expected output can be seen by running "PERF_EXEC_PATH=LINUX_SOURCE_PATH/tools/perf ./perf script -l"

While not important to me personally, perf also doesn't have support for perl scripting, which could be fixed in a similar way in case we want to fix that at the same time. It doesn't have as many pre-existing scripts, though, and seems less likely to be as useful compared to the Python version.

$ sudo perf script -g perl
Perl scripting not supported. Install libperl and rebuild perf to enable it.
For example:
# apt-get install libperl-dev (ubuntu)
# yum install 'perl(ExtUtils::Embed)' (Fedora)
etc.

** Affects: linux (Ubuntu)
Importance: Undecided
Status: New

** Tags: seg

** Tags added: seg
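For context, once enabled the expected workflow would look roughly like this (a sketch; the event name and trace duration are arbitrary examples):

  # record some events, then generate a python script template from them
  sudo perf record -e sched:sched_switch -a -- sleep 5
  sudo perf script -g python        # writes perf-script.py
  # after editing the generated event handlers, run the script over the data
  sudo perf script -s perf-script.py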
[Bug 1888047] Re: libnss-mdns slow response
This output is generally quite confusing. Can you try removing the "search www.tendawifi.com" line and see how it differs?
[Bug 1888047] Re: libnss-mdns slow response
Ideally using mdns4_minimal specifically (or, I guess, both, though mdns4 is generally not recommended in most cases).
[Bug 1888047] Re: libnss-mdns slow response
Can you please confirm:

(1) The timing of "getent hosts indigosky.local", "host indigosky.local", "nslookup indigosky.local" and "nslookup indigosky.local 192.168.235.1", all done at the same time (mainly adding the direct lookup through the server; wondering if nslookup is doing something weird in focal).

(2) The timings for the same if you switch mdns4 back to mdns4_minimal (but remove everything else); the stock hosts line is quoted below for reference.
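For reference, the stock Ubuntu hosts line in /etc/nsswitch.conf that the mdns4_minimal test refers to is:

  hosts: files mdns4_minimal [NOTFOUND=return] dns

(shown as a baseline; any other databases on the line can stay as they are).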
[Bug 80900] Re: Avahi daemon prevents resolution of FQDNs ending in ".local" due to false negatives in the detection of ".local" networks
This is fixed in Ubuntu 20.04 with nss-mdns 0.14 and later, which does proper split-horizon handling.

** Changed in: avahi (Ubuntu) Status: Triaged => Fix Released
** Changed in: nss-mdns (Ubuntu) Status: Confirmed => Fix Released
[Bug 1888047] Re: libnss-mdns slow response
Rumen,

When you use 'nslookup' it should go directly to the DNS server (127.0.0.53, which is systemd-resolved), which typically bypasses libnss-mdns but also typically doesn't have this 5 second delay (which avahi can have in some configurations). It seems most likely the 5 second delay is coming from inside systemd-resolved for some reason. The best way to test with NSS is to use "getent hosts DOMAIN".

Could you please confirm the output of the following commands:

lsb_release -a
dpkg -l libnss-mdns
systemctl status avahi-daemon
time getent hosts sirius.local
time nslookup sirius.local  # just to verify the problem still exists at the same time we do the above test
systemd-resolve --status --no-pager

- and attach the file /etc/systemd/resolved.conf
[Bug 1886809] Re: Pulse connect VPN exists because unwanted avahi network starts
I'm not sure it makes sense to just universally skip "tun*" interfaces (at least yet), but we may need to review the scenarios in which /etc/network/if-up.d/avahi-autoipd is executing.

Helio: Can you provide a reproducer scenario? e.g. is this Ubuntu server or Ubuntu desktop; what are the contents of /etc/network/interfaces, /etc/network/interfaces.d/*, /etc/netplan/*; and is network manager in use or not? And lastly, exactly how is Pulse VPN installed and configured, and how is that interface started/connected?

Additionally, you may find this issue goes away with netplan versus the older-style interfaces files. In any case, with as much info as possible for a reproducer I can check your exact scenario.
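As a possible interim workaround (untested here, and the interface name is only an example), avahi can be told to ignore specific interfaces in /etc/avahi/avahi-daemon.conf:

  [server]
  # keep avahi off the VPN's tunnel interface
  deny-interfaces=tun0

followed by "sudo systemctl restart avahi-daemon" for it to take effect.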
[Bug 1871685] Re: [SRU] vagrant spits out ruby deprecation warnings on every call
Hi Lucas,

Thanks for the patch updates. When I first submitted this we could have snuck it through before release without an SRU, but the patch backport now makes sense.
[Bug 1874021] Re: avahi-browse -av is empty, mdns not working in arduino ide
OK, thanks for the updates. I can see a lot of mDNS packets in the lp1874021.pcap capture from various sources: some printers, google cast, sonoff, etc. Curiously, though, when you do the avahi cache dump it isn't seeing any of these.

Wireshark is, strangely, showing malformed packets for many of the responses; the IP and UDP headers indicate a different length to that of the actual data. I'm not sure if this is an issue with wireshark, the wireless driver, or whatever mDNS implementations are replying; it may need further looking at. But it's curious that avahi is showing absolutely no cached services, and that it works on ethernet, given that the lp1874021.pcap seems to show plenty of actual mDNS packets coming and going.

Could you try starting Avahi (on wireless) using --debug?

(1) Override the systemd config with this command:

systemctl edit avahi-daemon.service

Once the editor opens, add the following 3 lines, then save and quit:

[Service]
ExecStart=
ExecStart=/usr/sbin/avahi-daemon -s --debug

Then restart avahi-daemon:

sudo systemctl restart avahi-daemon.service

Lastly, run "avahi-browse -av", wait a minute or two, then upload a copy of the "journalctl -u avahi-daemon" output again?
[Bug 1874192] [NEW] Remove avahi .local notification support (no longer needed)
Public bug reported:

As of nss-mdns 0.14 (which is now shipping in Focal 20.04), Avahi no longer needs to be stopped when a unicast .local domain is present; nss-mdns now has logic to make this work correctly, with Avahi running, for both multicast and unicast. We dropped the script that performed this check in Avahi, so the relevant logic in update-notifier to notify about this should also be removed; although it should no longer function, it is just dead code now. e.g. /usr/lib/systemd/user/unicast-local-avahi.path, etc.

** Affects: update-notifier (Ubuntu)
Importance: Undecided
Status: New
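When preparing the removal, something along these lines should enumerate the dead files (a sketch; the grep patterns are guesses at the naming):

  # update-notifier's shipped files that reference the old avahi check
  dpkg -L update-notifier | grep -i -e avahi -e unicast-local

  # any related units still present on disk
  ls /usr/lib/systemd/user/ | grep unicast-local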
[Bug 1874021] Re: avahi-browse -av is empty, mdns not working in arduino ide
Looking at jctl.txt, things look normal: the server starts up, gets "server startup complete", and then adds the appropriate IP for wlan0. The config file looks normal. Can you please try the following to collect extra debug info:

(1) Start a tcpdump and leave it running: tcpdump --no-promiscuous-mode -w lp1874021.pcap -i wlp1s0 port 5353 and udp
(2) Restart avahi: sudo systemctl restart avahi-daemon
(3) Wait 10 seconds, then try running "avahi-browse -av | tee -a lp1874021-browse.txt"
(4) Wait another 10 seconds
(5) Run: sudo killall -USR1 avahi-daemon  # this dumps the avahi cache into the journal
(6) Quit avahi-browse
(7) Quit tcpdump
(8) Please then upload lp1874021-browse.txt (copied output from avahi-browse), lp1874021.pcap (raw packet capture of mDNS packets) and a copy of the output of "journalctl -u avahi-daemon"

As an extra test, after having done the above, you can try putting the interface in promiscuous mode and see if that fixes the problem. This can make Avahi work on bad network (usually wifi) drivers that do not correctly implement multicast:

(9) sudo tcpdump -w lp1874021-promisc.pcap -i wlp1s0 port 5353 and udp
(10) sudo systemctl restart avahi-daemon
(11) avahi-browse -av
(12) If the service still hasn't shown up, consider also restarting whatever device is advertising the service you want to connect to, and note if it then appears after doing that.
(13) If you have the option, try to then plug in either or both devices via ethernet instead of WiFi.

If the services do start appearing at some point, be sure to note which step you were at when that happened.

Please note that all of these files will contain information about mDNS services on your local network. Typically this information is relatively OK to be public, since it would be broadcast if you were on a public WiFi network, but it can include names, mac addresses, etc. If that is a concern to you in terms of privacy, then feel free to consider either attempting to sanitize the data (though that is difficult for the pcap file) or setting the bug to private, although we much prefer not to set bugs to private if possible.
[Bug 327362] Re: Some ISPs have .local domain which disables avahi-daemon
For anyone looking at this in 2020: this is fixed in nss-mdns 0.14, which is in Ubuntu Focal 20.04. It will now correctly pass through unicast .local lookups.
[Bug 1871685] Re: vagrant spits out ruby deprecation warnings on every call
** Patch added: "full merge debdiff from old ubuntu version to new ubuntu version" https://bugs.launchpad.net/ubuntu/+source/vagrant/+bug/1871685/+attachment/5356998/+files/lp1871685_complete-merge_2.2.6+dfsg-2ubuntu1_2.2.7+dfsg-1ubuntu1.debdiff
[Bug 1871685] Re: vagrant spits out ruby deprecation warnings on every call
** Patch added: "partial merge debdiff showing only the delta to current debian version" https://bugs.launchpad.net/ubuntu/+source/vagrant/+bug/1871685/+attachment/5356999/+files/lp1871685_merge-only_2.2.7+dfsg-1_2.2.7+dfsg-1ubuntu1.debdiff
[Bug 1871685] Re: vagrant spits out ruby deprecation warnings on every call
Please sponsor this upload of a merge of Vagrant 2.2.7+dfsg-1 from Debian. It is a minor upstream version bump (2.2.6 -> 2.2.7) plus new patches from Debian that fix multiple Ruby 2.7 deprecation warnings on every command invocation.

Two debdiffs are attached:
- partial merge debdiff showing only the delta to the current debian version (lp1871685_merge-only_2.2.7+dfsg-1_2.2.7+dfsg-1ubuntu1.debdiff)
- full merge debdiff from the old ubuntu version to the new ubuntu version (lp1871685_complete-merge_2.2.6+dfsg-2ubuntu1_2.2.7+dfsg-1ubuntu1.debdiff)

This is a direct merge of the previous merge, whose only change is to disable the autopkgtest, as it has long been known to be flaky on Ubuntu infrastructure. It would be ideal to get this merge through ahead of the Focal release to continue carrying only that minimal delta to Debian upstream. This package is in universe.

** Changed in: vagrant (Ubuntu) Status: New => Confirmed
** Changed in: vagrant (Ubuntu) Importance: Undecided => Low