Your message dated Fri, 21 Feb 2020 13:50:16 +0000
with message-id <[email protected]>
and subject line Bug#946847: fixed in sssd 2.2.3-1.1
has caused the Debian Bug report #946847,
regarding sssd_be: Busy loops on flaky LDAP, SIGTERM from watchdog not processed
to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact [email protected]
immediately.)


-- 
946847: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=946847
Debian Bug Tracking System
Contact [email protected] with problems
--- Begin Message ---
Package: sssd
Version: 2.2.2-1+b1
Severity: important
Tags: upstream

In a setup where sssd uses a remote slapd for NSS over a somewhat flaky
network, sssd_be sometimes gets into a busy loop, using 100% CPU time on
one core.

Debugging showed that sssd has a watchdog to clean up in such cases, but
sssd_be installs a signal handler that prevents the SIGTERM sent to the
process group from being processed, so the process does not exit.

src/util/util_watchdog.c:

     64 /* the watchdog is purposefully *not* handled by the tevent
     65  * signal handler as it is meant to check if the daemon is
     66  * still processing the event queue itself. A stuck process
     67  * may not handle the event queue at all and thus not handle
     68  * signals either */
     69 static void watchdog_handler(int sig)
     70 {
     71 
     72     watchdog_detect_timeshift();
     73 
     74     /* if a pre-defined number of ticks passed by kills itself */
     75     if (__sync_add_and_fetch(&watchdog_ctx.ticks, 1) > WATCHDOG_MAX_TICKS) {
     76         if (getpid() == getpgrp()) {
     77             kill(-getpgrp(), SIGTERM);
     78         } else {
     79             _exit(1);
     80         }
     81     }
     82 }

(NB: It seems that what the comment describes did not quite work out ;)
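
The tick counting in the handler above can be modelled in isolation. The
following is only a sketch: the demo_* names and the limit of 3 are
hypothetical, and the reset function is an assumption about how the event
loop clears the counter; it is not shown in the excerpt above.

```c
#include <stdbool.h>

/* Hypothetical limit for this sketch; the real WATCHDOG_MAX_TICKS differs. */
#define DEMO_MAX_TICKS 3

static volatile int demo_ticks = 0;

/* Models the timer-driven handler: count one tick and report whether the
 * limit has been exceeded, i.e. the process looks stuck. */
bool demo_watchdog_tick(void)
{
    return __sync_add_and_fetch(&demo_ticks, 1) > DEMO_MAX_TICKS;
}

/* Assumed counterpart that a healthy event loop would call regularly to
 * prove it is still processing events. */
void demo_watchdog_reset(void)
{
    __sync_and_and_fetch(&demo_ticks, 0);
}
```

As long as demo_watchdog_reset() keeps being called, demo_watchdog_tick()
never reports a stuck process; once the loop stops calling it, the fourth
tick in a row does.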

The signal handler is installed in src/providers/data_provider_be.c:

    static void be_process_finalize(struct tevent_context *ev,
                                    struct tevent_signal *se,
                                    int signum,
                                    int count,
                                    void *siginfo,
                                    void *private_data)
    {
        struct be_ctx *be_ctx;

        be_ctx = talloc_get_type(private_data, struct be_ctx);
        talloc_free(be_ctx);
        orderly_shutdown(0);
    }

    static errno_t be_process_install_sigterm_handler(struct be_ctx *be_ctx)
    {
        struct tevent_signal *sige;

        BlockSignals(false, SIGTERM);

        sige = tevent_add_signal(be_ctx->ev, be_ctx, SIGTERM, SA_SIGINFO,
                                 be_process_finalize, be_ctx);
        if (sige == NULL) {
            DEBUG(SSSDBG_CRIT_FAILURE, "tevent_add_signal failed.\n");
            return ENOMEM;
        }

        return EOK;
    }

Setting a breakpoint on be_process_finalize showed that this function is
never reached, probably because libtevent never gets around to calling it.

Two proposals to circumvent this are:

 a) Reset the handler before calling kill on the process group in line 77
    (e.g. signal(SIGTERM, SIG_DFL);)
 b) Move the exit call in line 79 out of the branch so it gets called
    unconditionally in case kill() fails to kill the process itself

We tested solution a) in gdb and it caused sssd_be to exit cleanly and
restart, as it should.
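
For illustration, proposals a) and b) could be combined roughly as follows.
This is a sketch under assumptions, not the patch that was uploaded: the
demo_* names are hypothetical, and the helper forks a child that stands in
for a stuck sssd_be with a SIGTERM handler that never leads to an exit.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t demo_term_seen = 0;

/* Stands in for a deferred handler that is never acted upon. */
static void demo_ignore_term(int sig)
{
    (void)sig;
    demo_term_seen = 1;
}

/* What the watchdog would do once the tick limit is exceeded.
 * Uses only async-signal-safe calls. */
void demo_watchdog_terminate(void)
{
    /* a) restore the default action so SIGTERM really terminates us */
    signal(SIGTERM, SIG_DFL);
    if (getpid() == getpgrp()) {
        kill(-getpgrp(), SIGTERM);
    }
    /* b) reached only if the signal did not terminate this process */
    _exit(1);
}

/* Fork a child that simulates the stuck process and store its wait
 * status in *status. Returns 0 on success, -1 on failure. */
int demo_run_stuck_child(int *status)
{
    pid_t pid = fork();

    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        struct sigaction sa;

        setpgid(0, 0);              /* own process group; parent unaffected */
        memset(&sa, 0, sizeof(sa));
        sa.sa_handler = demo_ignore_term;
        sigemptyset(&sa.sa_mask);
        sigaction(SIGTERM, &sa, NULL);
        demo_watchdog_terminate();  /* does not return */
    }
    return waitpid(pid, status, 0) == pid ? 0 : -1;
}
```

The child ends either through the default SIGTERM action or, failing that,
through _exit(1), so the process goes away in both cases and can be
restarted.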

Cheers,
Nik

Analysis was sponsored by Teckids e.V. and tarent solutions GmbH.

-- System Information:
Debian Release: bullseye/sid
  APT prefers testing-debug
  APT policy: (500, 'testing-debug'), (500, 'testing')
Architecture: amd64 (x86_64)

Kernel: Linux 5.3.0-2-amd64 (SMP w/4 CPU cores)
Locale: LANG=de_DE.UTF-8, LC_CTYPE=de_DE.UTF-8 (charmap=UTF-8), LANGUAGE=de_DE.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages sssd depends on:
ii  python3-sss  2.2.2-1+b1
ii  sssd-ad      2.2.2-1+b1
ii  sssd-common  2.2.2-1+b1
ii  sssd-ipa     2.2.2-1+b1
ii  sssd-krb5    2.2.2-1+b1
ii  sssd-ldap    2.2.2-1+b1
ii  sssd-proxy   2.2.2-1+b1

sssd recommends no packages.

sssd suggests no packages.

-- no debconf information

--- End Message ---
--- Begin Message ---
Source: sssd
Source-Version: 2.2.3-1.1
Done: Thorsten Glaser <[email protected]>

We believe that the bug you reported is fixed in the latest version of
sssd, which is due to be installed in the Debian FTP archive.

A summary of the changes between this version and the previous one is
attached.

Thank you for reporting the bug, which will now be closed.  If you
have further comments please address them to [email protected],
and the maintainer will reopen the bug report if appropriate.

Debian distribution maintenance software
pp.
Thorsten Glaser <[email protected]> (supplier of updated sssd package)

(This message was generated automatically at their request; if you
believe that there is a problem with it please contact the archive
administrators by mailing [email protected])


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA384

Format: 1.8
Date: Fri, 21 Feb 2020 14:04:25 +0100
Source: sssd
Architecture: source
Version: 2.2.3-1.1
Distribution: unstable
Urgency: medium
Maintainer: Debian SSSD Team <[email protected]>
Changed-By: Thorsten Glaser <[email protected]>
Closes: 946847
Changes:
 sssd (2.2.3-1.1) unstable; urgency=medium
 .
   * Non-maintainer upload with maintainer permission (natureshadow).
   * Fix sssd_be busy-looping when LDAP connection flickers.
     (Closes: #946847)
Checksums-Sha1:
 209144fcaf2f1c87d61fc334a550c82e8768573e 4956 sssd_2.2.3-1.1.dsc
 cc9134867258cfcbb19210cc254fcef4f3c5d4a0 118300 sssd_2.2.3-1.1.diff.gz
Checksums-Sha256:
 7c640708aa3c5c6d5fb329bf1e8dfd985eafb5f3e02cd1e90c6bc47ac161bbac 4956 sssd_2.2.3-1.1.dsc
 c48ae4ccb63924222e43a96df17c12d9966a6d0044f1fef7a75383f331625054 118300 sssd_2.2.3-1.1.diff.gz
Files:
 ddf46e06639cffefd6b05c07e09e8b6a 4956 utils optional sssd_2.2.3-1.1.dsc
 7c4a9b2b1b7fce1015e0ab1e3101b2fd 118300 utils optional sssd_2.2.3-1.1.diff.gz

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.14 (MirBSD)
Comment: ☃ ЦΤℱ—8 ☕☂☄

iQIcBAEBCQAGBQJeT9rSAAoJEHa1NLLpkAfg9wsQAKxmkhoAcaaeFl8fw9uZxSwC
7bu/ShNllmRsnl+YXmyZBiwmPUIRUtZlleEz3+qt+CapNkiBfJRzw1q8RIRSp9vM
Rbb966Bw9xTr+Z04eJ5n9KfkPHv1j2cW+UAsAfKeL5pjkBmZlzdUwDqwVg9cxoqT
/lc6Zb9NVtq/1ZqWamYpdBYD3iTQRYG3Us+gt0zLNh6SN3O/3z4M1n/VFUY+8D4w
rHwU+OVDsmvZmLVt65DCme27G93qd9hFAk9+VDK5UVqmUJOm/F449MaMYS28Axgh
OxmKCIDXOQSyDiTqmxRD7Sz0RRc+J2/kjoPEUhuxOXtXenPcTCEipgSFktXkxXLR
E1WDLwKS/KGAlUAZ0jA8kXOqRIt1MO6cEhaJF+SI0oQXDkvFPQNYBP6OIcaXj0Sk
8YcQeSsE7YwHlHI1MUaKhLbTAUcYcBVkRdyegz+F8dJ8C+EPTL0opQ4PKC10OnXd
CxnXM13JpqSUkh0kPfdDNZDDpsis1u3VCyPd6fZOy1nYlov9pWSDYMHHCoYtzFG0
ZrEkthZPVHywCdcAGwdRvl20oNcQ0O8juNjCBOK9nuBhzFkKCf//fPgjf9LuUOLv
gINXdkWVPDncDsSDKItSjWB1tlYnt2dEL1x5SUHHDoqMAgKuI9fkfFZ+u6/L8uR7
aavaxUqrz6RzYlqTUQzS
=Y+ip
-----END PGP SIGNATURE-----

--- End Message ---
