[Dx-packages] [Bug 1966905] Re: Valgrind memory errors in gnome-shell 42 from accountsservice
I think this is triggered by valgrind because it leads to slowdowns, but the bug is indeed there.

We can handle it in a later upload I think, but due to gslice and the randomness of these memory errors, I wouldn't be shocked if this is actually presenting right now in the wild with a different stack trace.

-- 
You received this bug notification because you are a member of DX Packages, which is subscribed to accountsservice in Ubuntu.
Matching subscriptions: dx-packages
https://bugs.launchpad.net/bugs/1966905

Title: Valgrind memory errors in gnome-shell 42 from accountsservice

Status in accountsservice: Unknown
Status in accountsservice package in Ubuntu: New
Status in gnome-shell package in Ubuntu: New

Bug description:

Valgrind memory errors in gnome-shell 42 from accountsservice:

==60511== Invalid read of size 8
==60511==    at 0x4D207FA: g_type_check_instance_cast (gtype.c:4120)
==60511==    by 0x1E421CA2: free_fetch_user_request (act-user-manager.c:1708)
==60511==    by 0x1E4298E7: on_find_user_by_name_finished (act-user-manager.c:1187)
==60511==    by 0x4BC0C08: g_task_return_now (gtask.c:1230)
==60511==    by 0x4BC0E0A: UnknownInlinedFun (gtask.c:1300)
==60511==    by 0x4BC0E0A: g_task_return (gtask.c:1256)
==60511==    by 0x4C298BA: reply_cb (gdbusproxy.c:2576)
==60511==    by 0x4BC0C08: g_task_return_now (gtask.c:1230)
==60511==    by 0x4BC0E0A: UnknownInlinedFun (gtask.c:1300)
==60511==    by 0x4BC0E0A: g_task_return (gtask.c:1256)
==60511==    by 0x4C2107E: g_dbus_connection_call_done (gdbusconnection.c:5895)
==60511==    by 0x4BC0C08: g_task_return_now (gtask.c:1230)
==60511==    by 0x4BC0C4C: complete_in_idle_cb (gtask.c:1244)
==60511==    by 0x4D9CC23: UnknownInlinedFun (gmain.c:3417)
==60511==    by 0x4D9CC23: g_main_context_dispatch (gmain.c:4135)
==60511==  Address 0x185b5110 is 0 bytes inside a block of size 64 free'd
==60511==    at 0x484B27F: free (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==60511==    by 0x4D1F7D4: g_type_free_instance (gtype.c:2008)
==60511==    by 0x1E428ECA: UnknownInlinedFun (act-user.c:562)
==60511==    by 0x1E428ECA: UnknownInlinedFun (act-user.c:557)
==60511==    by 0x1E428ECA: _act_user_update_from_object_path (act-user.c:1346)
==60511==    by 0x1E42966F: fetch_user_incrementally (act-user-manager.c:1789)
==60511==    by 0x1E4298E7: on_find_user_by_name_finished (act-user-manager.c:1187)
==60511==    by 0x4BC0C08: g_task_return_now (gtask.c:1230)
==60511==    by 0x4BC0E0A: UnknownInlinedFun (gtask.c:1300)
==60511==    by 0x4BC0E0A: g_task_return (gtask.c:1256)
==60511==    by 0x4C298BA: reply_cb (gdbusproxy.c:2576)
==60511==    by 0x4BC0C08: g_task_return_now (gtask.c:1230)
==60511==    by 0x4BC0E0A: UnknownInlinedFun (gtask.c:1300)
==60511==    by 0x4BC0E0A: g_task_return (gtask.c:1256)
==60511==    by 0x4C2107E: g_dbus_connection_call_done (gdbusconnection.c:5895)
==60511==    by 0x4BC0C08: g_task_return_now (gtask.c:1230)
==60511==  Block was alloc'd at
==60511==    at 0x4848899: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==60511==    by 0x4DA5718: g_malloc (gmem.c:125)
==60511==    by 0x4DBCB64: g_slice_alloc (gslice.c:1072)
==60511==    by 0x4DBD1CD: g_slice_alloc0 (gslice.c:1098)
==60511==    by 0x4D24E61: g_type_create_instance (gtype.c:1911)
==60511==    by 0x4D0BF4C: g_object_new_internal (gobject.c:2011)
==60511==    by 0x4D0D1AC: g_object_new_with_properties (gobject.c:2181)
==60511==    by 0x4D0DCB0: g_object_new (gobject.c:1821)
==60511==    by 0x1E422792: create_new_user (act-user-manager.c:706)
==60511==    by 0x1E429BD8: act_user_manager_get_user (act-user-manager.c:1879)
==60511==    by 0x68ADE2D: ??? (in /usr/lib/x86_64-linux-gnu/libffi.so.8.1.0)
==60511==    by 0x68AA492: ??? (in /usr/lib/x86_64-linux-gnu/libffi.so.8.1.0)
==60511==
==60511== Invalid read of size 8
==60511==    at 0x4D206E9: g_type_check_instance_is_fundamentally_a (gtype.c:4091)
==60511==    by 0x4D06E9A: g_object_set_data (gobject.c:3982)
==60511==    by 0x1E421CB6: free_fetch_user_request (act-user-manager.c:1708)
==60511==    by 0x1E4298E7: on_find_user_by_name_finished (act-user-manager.c:1187)
==60511==    by 0x4BC0C08: g_task_return_now (gtask.c:1230)
==60511==    by 0x4BC0E0A: UnknownInlinedFun (gtask.c:1300)
==60511==    by 0x4BC0E0A: g_task_return (gtask.c:1256)
==60511==    by 0x4C298BA: reply_cb (gdbusproxy.c:2576)
==60511==    by 0x4BC0C08: g_task_return_now (gtask.c:1230)
==60511==    by 0x4BC0E0A: UnknownInlinedFun (gtask.c:1300)
==60511==    by 0x4BC0E0A: g_task_return (gtask.c:1256)
==60511==    by 0x4C2107E: g_dbus_connection_call_done (gdbusconnection.c:5895)
==60511==    by 0x4BC0C08: g_task_return_now (gtask.c:1230)
==60511==    by 0x4BC0C4C: complete_in_idle_cb (gtask.c:1244)
==60511==  Address 0x185b5110 is 0 bytes inside a block of size 64 free'd
==60511==
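A reading aid for the report above: the invalid read and the earlier free both sit under on_find_user_by_name_finished (act-user-manager.c:1187). The following is a minimal Python sketch, not part of accountsservice or Valgrind, that pulls frames out of memcheck output and lists the ones two stacks share. The helper names parse_frames and shared_frames are made up for illustration, and only the topmost frames of each stack are copied in from the report.

```python
import re

# Matches memcheck frames such as
# "==60511==    by 0x1E421CA2: free_fetch_user_request (act-user-manager.c:1708)"
# and also the whitespace-collapsed form "==60511==by 0x...".
FRAME_RE = re.compile(r"==\d+==\s*(?:at|by) 0x[0-9A-Fa-f]+: (\S+) \(([^()]+)\)")


def parse_frames(trace):
    """Return (function, location) pairs in the order they appear."""
    return FRAME_RE.findall(trace)


def shared_frames(stack_a, stack_b):
    """Return the frames of stack_a that also occur in stack_b."""
    seen = set(stack_b)
    return [frame for frame in stack_a if frame in seen]


# Topmost frames of the first "Invalid read" stack, copied from the report.
INVALID_READ = """
==60511==    at 0x4D207FA: g_type_check_instance_cast (gtype.c:4120)
==60511==    by 0x1E421CA2: free_fetch_user_request (act-user-manager.c:1708)
==60511==    by 0x1E4298E7: on_find_user_by_name_finished (act-user-manager.c:1187)
"""

# Topmost frames of the "free'd" stack (inlined frames left out).
FREED_AT = """
==60511==    at 0x484B27F: free (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==60511==    by 0x4D1F7D4: g_type_free_instance (gtype.c:2008)
==60511==    by 0x1E428ECA: _act_user_update_from_object_path (act-user.c:1346)
==60511==    by 0x1E42966F: fetch_user_incrementally (act-user-manager.c:1789)
==60511==    by 0x1E4298E7: on_find_user_by_name_finished (act-user-manager.c:1187)
"""

common = shared_frames(parse_frames(INVALID_READ), parse_frames(FREED_AT))
for function, location in common:
    print(f"{function} ({location})")
```

On these frames the only shared entry is on_find_user_by_name_finished, which is consistent with (though it does not prove) the object being released under fetch_user_incrementally and then touched again when the request is freed within the same callback.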
[Dx-packages] [Bug 1871538] Re: dbus timeout-ed during an upgrade, taking services down including gdm
Quoting upstream systemd developers (https://github.com/systemd/systemd/issues/22737#issuecomment-1077682307):

"We essentially traded one problem (lockup when starting services) for another (the failure described in this commit). I actually think that the lockup is worse. Here there is a simple solution: switch from dbus-daemon to dbus-broker. [...] A proper fix will most likely be to make all dbus calls asynchronous in systemd, but that is a lot of work and it's unclear when/if it'll happen. The regression is unfortunate, but I don't think we can fix it in reasonable time."

So I wonder what's the best path forward in Ubuntu... If we revert, we'll re-introduce the lockup/timeout problem with dbus-daemon. If we keep the current version, fwupd-refresh.service is broken.

The comment at https://github.com/fwupd/fwupd/issues/3037#issuecomment-1100816992 suggests that disabling the DynamicUser= setting makes the service work again. Maybe that's worth a try, in order to get both problems solved? (i.e. shipping an override config for fwupd)

$ cat /etc/systemd/system/fwupd-refresh.service.d/override.conf
[Service]
DynamicUser=no

-- 
You received this bug notification because you are a member of DX Packages, which is subscribed to accountsservice in Ubuntu.
Matching subscriptions: dx-packages
https://bugs.launchpad.net/bugs/1871538

Title: dbus timeout-ed during an upgrade, taking services down including gdm

Status in D-Bus: Unknown
Status in systemd: New
Status in accountsservice package in Ubuntu: Invalid
Status in dbus package in Ubuntu: Invalid
Status in gnome-shell package in Ubuntu: Invalid
Status in systemd package in Ubuntu: Fix Released
Status in accountsservice source package in Focal: Invalid
Status in dbus source package in Focal: Invalid
Status in gnome-shell source package in Focal: Invalid
Status in systemd source package in Focal: Fix Released
Status in accountsservice source package in Groovy: Invalid
Status in dbus source package in Groovy: Invalid
Status in gnome-shell source package in Groovy: Invalid
Status in accountsservice source package in Hirsute: Invalid
Status in dbus source package in Hirsute: Won't Fix
Status in gnome-shell source package in Hirsute: Invalid
Status in accountsservice source package in Impish: Invalid
Status in dbus source package in Impish: Invalid
Status in gnome-shell source package in Impish: Invalid
Status in systemd source package in Impish: Fix Released
Status in accountsservice source package in Jammy: Invalid
Status in dbus source package in Jammy: Invalid
Status in gnome-shell source package in Jammy: Invalid
Status in systemd source package in Jammy: Fix Released

Bug description:

[Impact]

* There's currently a deadlock between PID 1 and dbus-daemon: in some cases dbus-daemon will do NSS lookups (which are blocking) at the same time PID 1 synchronously blocks on some call to dbus-daemon (e.g. `GetConnectionUnixUser` DBus call). Let's break that by setting SYSTEMD_NSS_DYNAMIC_BYPASS=1 env var for dbus-daemon, which will disable synchronously blocking varlink calls from nss-systemd to PID 1.
* This can lead to delayed boot times
* It can also lead to dbus-daemon being killed/re-started, taking down other services with it, like GDM, killing user sessions on the way (e.g. on installing updates)

[Test Plan]

* This bug is really hard to reproduce, as can be seen from the multi-year long discussion at https://github.com/systemd/systemd/issues/15316
* Canonical's CPC team has the ability to reproduce this issue (with a relatively high probability) in their Azure test environment, due to the specific setup they are using
* So our test plan is to ask CPC (@gjolly) to confirm whether the issue is fixed.

[Where problems could occur]

* This fix touches the communication between systemd and dbus-daemon, especially the NSS lookup, so if something goes wrong, (user-)name resolution could break.
* As a workaround, dbus-daemon could be replaced by dbus-broker, which never showed this issue, or the behaviour could be changed back by using the `SYSTEMD_NSS_DYNAMIC_BYPASS` env variable, like this:

# /etc/systemd/system/dbus.service.d/override.conf
[Service]
Environment=SYSTEMD_NSS_DYNAMIC_BYPASS=0

[Other Info]

* Fixed upstream (v251) in https://github.com/systemd/systemd/pull/22552

=== Original Description ===

This morning I found my computer on the login screen. But not the one of the screen lock, no, a new one - so something must have crashed.

Logging in again confirmed that all apps were gone and that gnome-shell had been brought down, seemingly triggered by a background update of accountsservice.

As always things are not perfectly clear :-/

The following goes *back* in time through my logs one by one.

Multiple apps crashed at 06:09, but we will find later that this is a follow on issue of the
[Dx-packages] [Bug 1871538] Re: dbus timeout-ed during an upgrade, taking services down including gdm
> https://git.launchpad.net/~ubuntu-core-dev/ubuntu/+source/systemd/commit/?id=e3aacfa26e3fc6df369e6f28e740389ae0020907

This appears to have caused a regression in fwupd in Ubuntu 20.04, with details at https://github.com/fwupd/fwupd/issues/3037

fwupd-refresh.service uses DynamicUser and now hits this upstream bug: https://github.com/systemd/systemd/issues/22737

** Bug watch added: github.com/fwupd/fwupd/issues #3037
   https://github.com/fwupd/fwupd/issues/3037

** Bug watch added: github.com/systemd/systemd/issues #22737
   https://github.com/systemd/systemd/issues/22737

-- 
You received this bug notification because you are a member of DX Packages, which is subscribed to accountsservice in Ubuntu.
Matching subscriptions: dx-packages
https://bugs.launchpad.net/bugs/1871538

Title: dbus timeout-ed during an upgrade, taking services down including gdm

Status in D-Bus: Unknown
Status in systemd: New
Status in accountsservice package in Ubuntu: Invalid
Status in dbus package in Ubuntu: Invalid
Status in gnome-shell package in Ubuntu: Invalid
Status in systemd package in Ubuntu: Fix Released
Status in accountsservice source package in Focal: Invalid
Status in dbus source package in Focal: Invalid
Status in gnome-shell source package in Focal: Invalid
Status in systemd source package in Focal: Fix Released
Status in accountsservice source package in Groovy: Invalid
Status in dbus source package in Groovy: Invalid
Status in gnome-shell source package in Groovy: Invalid
Status in accountsservice source package in Hirsute: Invalid
Status in dbus source package in Hirsute: Won't Fix
Status in gnome-shell source package in Hirsute: Invalid
Status in accountsservice source package in Impish: Invalid
Status in dbus source package in Impish: Invalid
Status in gnome-shell source package in Impish: Invalid
Status in systemd source package in Impish: Fix Released
Status in accountsservice source package in Jammy: Invalid
Status in dbus source package in Jammy: Invalid
Status in gnome-shell source package in Jammy: Invalid
Status in systemd source package in Jammy: Fix Released

Bug description:

[Impact]

* There's currently a deadlock between PID 1 and dbus-daemon: in some cases dbus-daemon will do NSS lookups (which are blocking) at the same time PID 1 synchronously blocks on some call to dbus-daemon (e.g. `GetConnectionUnixUser` DBus call). Let's break that by setting SYSTEMD_NSS_DYNAMIC_BYPASS=1 env var for dbus-daemon, which will disable synchronously blocking varlink calls from nss-systemd to PID 1.
* This can lead to delayed boot times
* It can also lead to dbus-daemon being killed/re-started, taking down other services with it, like GDM, killing user sessions on the way (e.g. on installing updates)

[Test Plan]

* This bug is really hard to reproduce, as can be seen from the multi-year long discussion at https://github.com/systemd/systemd/issues/15316
* Canonical's CPC team has the ability to reproduce this issue (with a relatively high probability) in their Azure test environment, due to the specific setup they are using
* So our test plan is to ask CPC (@gjolly) to confirm whether the issue is fixed.

[Where problems could occur]

* This fix touches the communication between systemd and dbus-daemon, especially the NSS lookup, so if something goes wrong, (user-)name resolution could break.
* As a workaround, dbus-daemon could be replaced by dbus-broker, which never showed this issue, or the behaviour could be changed back by using the `SYSTEMD_NSS_DYNAMIC_BYPASS` env variable, like this:

# /etc/systemd/system/dbus.service.d/override.conf
[Service]
Environment=SYSTEMD_NSS_DYNAMIC_BYPASS=0

[Other Info]

* Fixed upstream (v251) in https://github.com/systemd/systemd/pull/22552

=== Original Description ===

This morning I found my computer on the login screen. But not the one of the screen lock, no, a new one - so something must have crashed.

Logging in again confirmed that all apps were gone and that gnome-shell had been brought down, seemingly triggered by a background update of accountsservice.

As always things are not perfectly clear :-/

The following goes *back* in time through my logs one by one.

Multiple apps crashed at 06:09, but we will find later that this is a follow on issue of the underlying gnome/X/... recycling.

-rw-r- 1 paelzer whoopsie 52962868 Apr 8 06:09 _usr_bin_konversation.1000.crash
-rw-r- 1 paelzer whoopsie 986433 Apr 8 06:09 _usr_lib_x86_64-linux-gnu_libexec_drkonqi.1000.crash

rtkit was failing fast and giving up (that will be a different bug, it just seems broken on my system):

Apr 08 06:10:13 Keschdeichel systemd[1]: Started RealtimeKit Scheduling Policy Service.
Apr 08 06:10:13 Keschdeichel rtkit-daemon[1729333]: Successfully called chroot.
Apr 08 06:10:13 Keschdeichel