[Dx-packages] [Bug 1966905] Re: Valgrind memory errors in gnome-shell 42 from accountsservice

2022-04-20 Thread Treviño
I think valgrind triggers this because of the slowdown it introduces, but
the bug is genuinely there.

We can handle it in a later upload, I think, but given gslice and the
randomness of these memory errors, I wouldn't be shocked if this is
already showing up in the wild with a different stack trace.

-- 
You received this bug notification because you are a member of DX
Packages, which is subscribed to accountsservice in Ubuntu.
Matching subscriptions: dx-packages
https://bugs.launchpad.net/bugs/1966905

Title:
  Valgrind memory errors in gnome-shell 42 from accountsservice

Status in accountsservice:
  Unknown
Status in accountsservice package in Ubuntu:
  New
Status in gnome-shell package in Ubuntu:
  New

Bug description:
  Valgrind memory errors in gnome-shell 42 from accountsservice:

  ==60511== Invalid read of size 8
  ==60511==at 0x4D207FA: g_type_check_instance_cast (gtype.c:4120)
  ==60511==by 0x1E421CA2: free_fetch_user_request (act-user-manager.c:1708)
  ==60511==by 0x1E4298E7: on_find_user_by_name_finished (act-user-manager.c:1187)
  ==60511==by 0x4BC0C08: g_task_return_now (gtask.c:1230)
  ==60511==by 0x4BC0E0A: UnknownInlinedFun (gtask.c:1300)
  ==60511==by 0x4BC0E0A: g_task_return (gtask.c:1256)
  ==60511==by 0x4C298BA: reply_cb (gdbusproxy.c:2576)
  ==60511==by 0x4BC0C08: g_task_return_now (gtask.c:1230)
  ==60511==by 0x4BC0E0A: UnknownInlinedFun (gtask.c:1300)
  ==60511==by 0x4BC0E0A: g_task_return (gtask.c:1256)
  ==60511==by 0x4C2107E: g_dbus_connection_call_done (gdbusconnection.c:5895)
  ==60511==by 0x4BC0C08: g_task_return_now (gtask.c:1230)
  ==60511==by 0x4BC0C4C: complete_in_idle_cb (gtask.c:1244)
  ==60511==by 0x4D9CC23: UnknownInlinedFun (gmain.c:3417)
  ==60511==by 0x4D9CC23: g_main_context_dispatch (gmain.c:4135)
  ==60511==  Address 0x185b5110 is 0 bytes inside a block of size 64 free'd
  ==60511==at 0x484B27F: free (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
  ==60511==by 0x4D1F7D4: g_type_free_instance (gtype.c:2008)
  ==60511==by 0x1E428ECA: UnknownInlinedFun (act-user.c:562)
  ==60511==by 0x1E428ECA: UnknownInlinedFun (act-user.c:557)
  ==60511==by 0x1E428ECA: _act_user_update_from_object_path (act-user.c:1346)
  ==60511==by 0x1E42966F: fetch_user_incrementally (act-user-manager.c:1789)
  ==60511==by 0x1E4298E7: on_find_user_by_name_finished (act-user-manager.c:1187)
  ==60511==by 0x4BC0C08: g_task_return_now (gtask.c:1230)
  ==60511==by 0x4BC0E0A: UnknownInlinedFun (gtask.c:1300)
  ==60511==by 0x4BC0E0A: g_task_return (gtask.c:1256)
  ==60511==by 0x4C298BA: reply_cb (gdbusproxy.c:2576)
  ==60511==by 0x4BC0C08: g_task_return_now (gtask.c:1230)
  ==60511==by 0x4BC0E0A: UnknownInlinedFun (gtask.c:1300)
  ==60511==by 0x4BC0E0A: g_task_return (gtask.c:1256)
  ==60511==by 0x4C2107E: g_dbus_connection_call_done (gdbusconnection.c:5895)
  ==60511==by 0x4BC0C08: g_task_return_now (gtask.c:1230)
  ==60511==  Block was alloc'd at
  ==60511==at 0x4848899: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
  ==60511==by 0x4DA5718: g_malloc (gmem.c:125)
  ==60511==by 0x4DBCB64: g_slice_alloc (gslice.c:1072)
  ==60511==by 0x4DBD1CD: g_slice_alloc0 (gslice.c:1098)
  ==60511==by 0x4D24E61: g_type_create_instance (gtype.c:1911)
  ==60511==by 0x4D0BF4C: g_object_new_internal (gobject.c:2011)
  ==60511==by 0x4D0D1AC: g_object_new_with_properties (gobject.c:2181)
  ==60511==by 0x4D0DCB0: g_object_new (gobject.c:1821)
  ==60511==by 0x1E422792: create_new_user (act-user-manager.c:706)
  ==60511==by 0x1E429BD8: act_user_manager_get_user (act-user-manager.c:1879)
  ==60511==by 0x68ADE2D: ??? (in /usr/lib/x86_64-linux-gnu/libffi.so.8.1.0)
  ==60511==by 0x68AA492: ??? (in /usr/lib/x86_64-linux-gnu/libffi.so.8.1.0)
  ==60511== 
  ==60511== Invalid read of size 8
  ==60511==at 0x4D206E9: g_type_check_instance_is_fundamentally_a (gtype.c:4091)
  ==60511==by 0x4D06E9A: g_object_set_data (gobject.c:3982)
  ==60511==by 0x1E421CB6: free_fetch_user_request (act-user-manager.c:1708)
  ==60511==by 0x1E4298E7: on_find_user_by_name_finished (act-user-manager.c:1187)
  ==60511==by 0x4BC0C08: g_task_return_now (gtask.c:1230)
  ==60511==by 0x4BC0E0A: UnknownInlinedFun (gtask.c:1300)
  ==60511==by 0x4BC0E0A: g_task_return (gtask.c:1256)
  ==60511==by 0x4C298BA: reply_cb (gdbusproxy.c:2576)
  ==60511==by 0x4BC0C08: g_task_return_now (gtask.c:1230)
  ==60511==by 0x4BC0E0A: UnknownInlinedFun (gtask.c:1300)
  ==60511==by 0x4BC0E0A: g_task_return (gtask.c:1256)
  ==60511==by 0x4C2107E: g_dbus_connection_call_done (gdbusconnection.c:5895)
  ==60511==by 0x4BC0C08: g_task_return_now (gtask.c:1230)
  ==60511==by 0x4BC0C4C: complete_in_idle_cb (gtask.c:1244)
  ==60511==  Address 0x185b5110 is 0 bytes inside a block of size 64 free'd
  ==60511==
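
The trace above has the shape of a use-after-free: the object created in
create_new_user is finalized while fetch_user_incrementally runs, and
free_fetch_user_request then touches it again. The following is only a toy
model of that pattern, not the accountsservice code and not its fix. The
RefCounted and FetchRequest classes, the "fetch-request" key and the
own_reference switch are all hypothetical names chosen for illustration. It
shows why a request that merely borrows a pointer is fragile, while one that
holds its own reference is not.

```python
class FreedObjectError(RuntimeError):
    """Raised when the toy model detects access to a finalized object."""


class RefCounted:
    """Toy stand-in for a reference-counted object (hypothetical, not GObject)."""

    def __init__(self, name):
        self.name = name
        self.refs = 1           # the creator owns the initial reference
        self.freed = False
        self.data = {}

    def ref(self):
        self._check()
        self.refs += 1
        return self

    def unref(self):
        self._check()
        self.refs -= 1
        if self.refs == 0:
            self.freed = True   # models finalization of the instance

    def set_data(self, key, value):
        self._check()           # touching a finalized object is the bug
        self.data[key] = value

    def _check(self):
        if self.freed:
            raise FreedObjectError(f"use after free: {self.name}")


class FetchRequest:
    """Hypothetical request that points at a user object."""

    def __init__(self, user, own_reference):
        self.own_reference = own_reference
        # Either borrow the pointer or take a reference of our own.
        self.user = user.ref() if own_reference else user
        user.set_data("fetch-request", self)

    def free(self):
        # Cleanup touches the user again, as in the trace above.
        self.user.set_data("fetch-request", None)
        if self.own_reference:
            self.user.unref()


def run(own_reference):
    user = RefCounted("user")
    request = FetchRequest(user, own_reference)
    user.unref()                # another code path drops the creator's reference
    try:
        request.free()
        return "ok"
    except FreedObjectError as exc:
        return str(exc)
```

In this model run(False) reports the stale access, while run(True) completes
cleanly because the request keeps the object alive until its own cleanup is
done.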

[Dx-packages] [Bug 1871538] Re: dbus timeout-ed during an upgrade, taking services down including gdm

2022-04-20 Thread Lukas Märdian
Quoting upstream systemd developers
(https://github.com/systemd/systemd/issues/22737#issuecomment-1077682307):
"We essentially traded one problem (lockup when starting services) for
another (the failure described in this commit). I actually think that
the lockup is worse. Here there is a simple solution: switch from
dbus-daemon to dbus-broker. [...]

A proper fix will most likely be to make all dbus calls asynchronous in
systemd, but that is a lot of work and it's unclear when/if it'll
happen. The regression is unfortunate, but I don't think we can fix it
in reasonable time."

So I wonder what's the best path forward in Ubuntu... if we revert,
we'll re-introduce the lockup/timeout problem with dbus-daemon. If we
keep the current version, fwupd-refresh.service is broken.


The comment at
https://github.com/fwupd/fwupd/issues/3037#issuecomment-1100816992
suggests that disabling the DynamicUser= setting makes the service work
again. Maybe that's worth a try, in order to get both problems solved?
(i.e. shipping an override config for fwupd)


$ cat /etc/systemd/system/fwupd-refresh.service.d/override.conf
[Service]
DynamicUser=no
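
As a rough sketch of how such an override could be generated, the helper
below renders a drop-in and writes it under "<unit>.d/override.conf". The
function names render_dropin and write_dropin are made up for this example,
and it writes into a scratch directory rather than /etc/systemd/system, so it
only demonstrates the file layout.

```python
from pathlib import Path
import tempfile


def render_dropin(section, settings):
    """Return the text of a minimal drop-in with a single section."""
    lines = [f"[{section}]"]
    lines += [f"{key}={value}" for key, value in settings.items()]
    return "\n".join(lines) + "\n"


def write_dropin(root, unit, settings, section="Service", name="override.conf"):
    """Write the drop-in to <root>/<unit>.d/<name> and return its path."""
    dropin_dir = Path(root) / f"{unit}.d"
    dropin_dir.mkdir(parents=True, exist_ok=True)
    path = dropin_dir / name
    path.write_text(render_dropin(section, settings))
    return path


# Demonstration against a scratch directory, not the real systemd tree.
with tempfile.TemporaryDirectory() as scratch:
    written = write_dropin(scratch, "fwupd-refresh.service", {"DynamicUser": "no"})
    relative = written.relative_to(scratch).as_posix()
    content = written.read_text()
```

On a real system the root would be /etc/systemd/system, writing there needs
root privileges, and systemd has to re-read its configuration (systemctl
daemon-reload) before the override takes effect.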

-- 
You received this bug notification because you are a member of DX
Packages, which is subscribed to accountsservice in Ubuntu.
Matching subscriptions: dx-packages
https://bugs.launchpad.net/bugs/1871538

Title:
  dbus timeout-ed during an upgrade, taking services down including gdm

Status in D-Bus:
  Unknown
Status in systemd:
  New
Status in accountsservice package in Ubuntu:
  Invalid
Status in dbus package in Ubuntu:
  Invalid
Status in gnome-shell package in Ubuntu:
  Invalid
Status in systemd package in Ubuntu:
  Fix Released
Status in accountsservice source package in Focal:
  Invalid
Status in dbus source package in Focal:
  Invalid
Status in gnome-shell source package in Focal:
  Invalid
Status in systemd source package in Focal:
  Fix Released
Status in accountsservice source package in Groovy:
  Invalid
Status in dbus source package in Groovy:
  Invalid
Status in gnome-shell source package in Groovy:
  Invalid
Status in accountsservice source package in Hirsute:
  Invalid
Status in dbus source package in Hirsute:
  Won't Fix
Status in gnome-shell source package in Hirsute:
  Invalid
Status in accountsservice source package in Impish:
  Invalid
Status in dbus source package in Impish:
  Invalid
Status in gnome-shell source package in Impish:
  Invalid
Status in systemd source package in Impish:
  Fix Released
Status in accountsservice source package in Jammy:
  Invalid
Status in dbus source package in Jammy:
  Invalid
Status in gnome-shell source package in Jammy:
  Invalid
Status in systemd source package in Jammy:
  Fix Released

Bug description:
  [Impact]

   * There's currently a deadlock between PID 1 and dbus-daemon: in some
  cases dbus-daemon will do NSS lookups (which are blocking) at the same
  time PID 1 synchronously blocks on some call to dbus-daemon (e.g. the
  `GetConnectionUnixUser` DBus call). Let's break that by setting the
  SYSTEMD_NSS_DYNAMIC_BYPASS=1 env var for dbus-daemon, which will
  disable synchronously blocking varlink calls from nss-systemd to PID 1.

   * This can lead to delayed boot times

   * It can also lead to dbus-daemon being killed/re-started, taking
  down other services with it, like GDM, killing user sessions on the
  way (e.g. on installing updates)
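
  The deadlock described in the first bullet can be pictured as a cycle in a
  wait-for graph: PID 1 blocks on dbus-daemon, while dbus-daemon (through
  nss-systemd) blocks on PID 1. The sketch below is only a model of that
  reasoning, not systemd code. The node names and the wait_for_graph and
  has_cycle helpers are hypothetical, and the assumption that the bypass
  removes exactly the dbus-daemon to PID 1 edge is taken from the description
  above.

```python
def has_cycle(wait_for):
    """Return True if the wait-for graph (dict: node -> set of nodes) has a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {}

    def visit(node):
        colour[node] = GREY                  # node is on the current DFS path
        for nxt in wait_for.get(node, ()):
            state = colour.get(nxt, WHITE)
            if state == GREY:                # back edge: a cycle exists
                return True
            if state == WHITE and visit(nxt):
                return True
        colour[node] = BLACK                 # fully explored, no cycle through it
        return False

    return any(colour.get(n, WHITE) == WHITE and visit(n) for n in list(wait_for))


def wait_for_graph(nss_dynamic_bypass):
    """Model who synchronously waits on whom (assumed from the bug description)."""
    # PID 1 blocks on a D-Bus call such as GetConnectionUnixUser.
    graph = {"pid1": {"dbus-daemon"}, "dbus-daemon": set()}
    if not nss_dynamic_bypass:
        # Without the bypass, an NSS lookup in dbus-daemon blocks on PID 1.
        graph["dbus-daemon"].add("pid1")
    return graph
```

  With the bypass off the model reports a cycle; with it on, PID 1 still waits
  on dbus-daemon but nothing waits back, so the cycle is gone.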

  [Test Plan]

   * This bug is really hard to reproduce, as can be seen from the
  multi-year long discussion at
  https://github.com/systemd/systemd/issues/15316

   * Canonical's CPC team has the ability to reproduce this issue (with
  a relatively high probability) in their Azure test environment, due to
  the specific setup they are using

   * So our test plan is to ask CPC (@gjolly) for confirmation if the
  issue is fixed.

  [Where problems could occur]

   * This fix touches the communication between systemd and dbus daemon,
  especially the NSS lookup, so if something is broken the (user-)name
  resolution could be broken.

   * As a workaround, dbus-daemon could be replaced by dbus-broker, which
  never showed this issue, or the behaviour could be changed back by
  using the `SYSTEMD_NSS_DYNAMIC_BYPASS` env variable, like this:

  # /etc/systemd/system/dbus.service.d/override.conf
  [Service]
  Environment=SYSTEMD_NSS_DYNAMIC_BYPASS=0

  [Other Info]
   
   * Fixed upstream (v251) in https://github.com/systemd/systemd/pull/22552

  
  === Original Description ===


  
  This morning I found my computer on the login screen, but not the one
  of the screen lock; it was a new one, so something must have crashed.

  Logging in again confirmed that all apps were gone and that the gnome
  shell had been brought down, apparently triggered by a background
  update of accountsservice.

  As always things are not perfectly clear :-/
  The following goes *back* in time through my logs one by one.

  Multiple apps crashed at 06:09, but we will find later that this is a follow 
on issue of the 

[Dx-packages] [Bug 1871538] Re: dbus timeout-ed during an upgrade, taking services down including gdm

2022-04-20 Thread Mario Limonciello
> https://git.launchpad.net/~ubuntu-core-dev/ubuntu/+source/systemd/commit/?id=e3aacfa26e3fc6df369e6f28e740389ae0020907

This appears to have caused a regression in fwupd in Ubuntu 20.04 with
details at https://github.com/fwupd/fwupd/issues/3037

fwupd-refresh.service uses DynamicUser and now hits this upstream bug:
https://github.com/systemd/systemd/issues/22737


** Bug watch added: github.com/fwupd/fwupd/issues #3037
   https://github.com/fwupd/fwupd/issues/3037

** Bug watch added: github.com/systemd/systemd/issues #22737
   https://github.com/systemd/systemd/issues/22737

-- 
You received this bug notification because you are a member of DX
Packages, which is subscribed to accountsservice in Ubuntu.
Matching subscriptions: dx-packages
https://bugs.launchpad.net/bugs/1871538

  === Original Description ===


  
  This morning I found my computer on the login screen, but not the one
  of the screen lock; it was a new one, so something must have crashed.

  Logging in again confirmed that all apps were gone and that the gnome
  shell had been brought down, apparently triggered by a background
  update of accountsservice.

  As always things are not perfectly clear :-/
  The following goes *back* in time through my logs one by one.

  Multiple apps crashed at 06:09, but we will find later that this is a
  follow-on issue of the underlying gnome/X/... recycling.
  -rw-r-  1 paelzer  whoopsie 52962868 Apr  8 06:09 _usr_bin_konversation.1000.crash
  -rw-r-  1 paelzer  whoopsie   986433 Apr  8 06:09 _usr_lib_x86_64-linux-gnu_libexec_drkonqi.1000.crash

  rtkit was failing fast and giving up (that will be a different bug, it
  just seems broken on my system):
  Apr 08 06:10:13 Keschdeichel systemd[1]: Started RealtimeKit Scheduling Policy Service.
  Apr 08 06:10:13 Keschdeichel rtkit-daemon[1729333]: Successfully called chroot.
  Apr 08 06:10:13 Keschdeichel