Public bug reported: On sssd 1.13.4-1ubuntu1.1 in an Active Directory domain, configured for dynamic SRV record lookup (not statically defined in krb5.conf or in sssd.conf), the lookup fails if the first nameserver in /etc/resolv.conf is unavailable.
I noticed the issue when one of our DCs (LDAP/DNS... etc) went offline and I was suddenly unable to log into several systems running sssd. Found the following from the RHEL/Fedora folks: https://fedorahosted.org/sssd/ticket/1966

From the sssd_<DOMAIN>.log:

(Wed Nov 16 18:01:47 2016) [sssd[be[DOMAIN.NET]]] [be_process_init] (0x0080): No SUDO module provided for [DOMAIN.NET] !!
(Wed Nov 16 18:01:47 2016) [sssd[be[DOMAIN.NET]]] [be_process_init] (0x0020): No selinux module provided for [DOMAIN.NET] !!
(Wed Nov 16 18:01:47 2016) [sssd[be[DOMAIN.NET]]] [be_process_init] (0x0020): No host info module provided for [DOMAIN.NET] !!
(Wed Nov 16 18:01:53 2016) [sssd[be[DOMAIN.NET]]] [fo_resolve_service_timeout] (0x0080): Service resolving timeout reached
(Wed Nov 16 18:01:53 2016) [sssd[be[DOMAIN.NET]]] [sdap_id_op_connect_done] (0x0020): Failed to connect, going offline (5 [Input/output error])
(Wed Nov 16 18:01:53 2016) [sssd[be[DOMAIN.NET]]] [be_run_offline_cb] (0x0080): Going offline. Running callbacks.
(Wed Nov 16 18:01:53 2016) [sssd[be[DOMAIN.NET]]] [ad_subdomains_get_conn_done] (0x0080): No AD server is available, cannot get the subdomain list while offline
(Wed Nov 16 18:01:59 2016) [sssd[be[DOMAIN.NET]]] [fo_resolve_service_timeout] (0x0080): Service resolving timeout reached
(Wed Nov 16 18:01:59 2016) [sssd[be[DOMAIN.NET]]] [sdap_id_op_connect_done] (0x0020): Failed to connect, going offline (5 [Input/output error])
(Wed Nov 16 18:01:59 2016) [sssd[be[DOMAIN.NET]]] [be_ptask_enable] (0x0080): Task [Check if online (periodic)]: already enabled
(Wed Nov 16 18:01:59 2016) [sssd[be[DOMAIN.NET]]] [be_run_offline_cb] (0x0080): Going offline. Running callbacks.
(Wed Nov 16 18:01:59 2016) [sssd[be[DOMAIN.NET]]] [ad_subdomains_get_conn_done] (0x0080): No AD server is available, cannot get the subdomain list while offline
(Wed Nov 16 18:03:05 2016) [sssd[be[DOMAIN.NET]]] [fo_resolve_service_timeout] (0x0080): Service resolving timeout reached
(Wed Nov 16 18:04:14 2016) [sssd[be[DOMAIN.NET]]] [fo_resolve_service_timeout] (0x0080): Service resolving timeout reached
(Wed Nov 16 18:06:19 2016) [sssd[be[DOMAIN.NET]]] [fo_resolve_service_timeout] (0x0080): Service resolving timeout reached
(Wed Nov 16 18:10:23 2016) [sssd[be[DOMAIN.NET]]] [fo_resolve_service_timeout] (0x0080): Service resolving timeout reached
(Wed Nov 16 18:18:50 2016) [sssd[be[DOMAIN.NET]]] [fo_resolve_service_timeout] (0x0080): Service resolving timeout reached
(Wed Nov 16 18:34:51 2016) [sssd[be[DOMAIN.NET]]] [fo_resolve_service_timeout] (0x0080): Service resolving timeout reached
(Wed Nov 16 19:07:10 2016) [sssd[be[DOMAIN.NET]]] [fo_resolve_service_timeout] (0x0080): Service resolving timeout reached
(Wed Nov 16 19:39:10 2016) [sssd[be[DOMAIN.NET]]] [fo_resolve_service_timeout] (0x0080): Service resolving timeout reached
(Wed Nov 16 20:11:16 2016) [sssd[be[DOMAIN.NET]]] [fo_resolve_service_timeout] (0x0080): Service resolving timeout reached
(Wed Nov 16 20:43:39 2016) [sssd[be[DOMAIN.NET]]] [fo_resolve_service_timeout] (0x0080): Service resolving timeout reached
(Wed Nov 16 21:15:52 2016) [sssd[be[DOMAIN.NET]]] [fo_resolve_service_timeout] (0x0080): Service resolving timeout reached
(Wed Nov 16 21:48:17 2016) [sssd[be[DOMAIN.NET]]] [fo_resolve_service_timeout] (0x0080): Service resolving timeout reached
(Wed Nov 16 22:20:31 2016) [sssd[be[DOMAIN.NET]]] [fo_resolve_service_timeout] (0x0080): Service resolving timeout reached
(Wed Nov 16 22:38:44 2016) [sssd[be[DOMAIN.NET]]] [fo_resolve_service_timeout] (0x0080): Service resolving timeout reached
(Wed Nov 16 22:38:44 2016) [sssd[be[DOMAIN.NET]]] [be_ptask_enable] (0x0080): Task [Check if online (periodic)]: already enabled
(Wed Nov 16 22:38:44 2016) [sssd[be[DOMAIN.NET]]] [be_run_offline_cb] (0x0080): Going offline. Running callbacks.

I've confirmed that reordering /etc/resolv.conf to place a working nameserver first corrects the issue.

From apt-cache show sssd:

Package: sssd
Priority: extra
Section: utils
Installed-Size: 29
Maintainer: Ubuntu Developers <ubuntu-devel-disc...@lists.ubuntu.com>
Original-Maintainer: Debian SSSD Team <pkg-sssd-de...@lists.alioth.debian.org>
Architecture: amd64
Version: 1.13.4-1ubuntu1.1
Depends: python-sss (= 1.13.4-1ubuntu1.1), sssd-ad (= 1.13.4-1ubuntu1.1), sssd-common (= 1.13.4-1ubuntu1.1), sssd-ipa (= 1.13.4-1ubuntu1.1), sssd-krb5 (= 1.13.4-1ubuntu1.1), sssd-ldap (= 1.13.4-1ubuntu1.1), sssd-proxy (= 1.13.4-1ubuntu1.1)
Filename: pool/main/s/sssd/sssd_1.13.4-1ubuntu1.1_amd64.deb
Size: 4310
MD5sum: 30343a1c72b2d1c64e3cfa666377d9b5
SHA1: 687dde4b2010a7c16424e9d0793241d18335b332
SHA256: fe7df88e69e5907e5d19d9d064cb4fb394c97284883ad0f57234dc285430eb17
Description-en: System Security Services Daemon -- metapackage
 Provides a set of daemons to manage access to remote directories and
 authentication mechanisms. It provides an NSS and PAM interface toward
 the system and a pluggable backend system to connect to multiple different
 account sources. It is also the basis to provide client auditing and policy
 services for projects like FreeIPA.
 .
 This package is a metapackage which installs the daemon and existing
 authentication back ends.
Description-md5: fbc7eaa314ae2423fee9d2943b3f4223
Multi-Arch: foreign
Homepage: https://fedorahosted.org/sssd/
Bugs: https://bugs.launchpad.net/ubuntu/+filebug
Origin: Ubuntu
Supported: 5y

Package: sssd
Priority: extra
Section: utils
Installed-Size: 29
Maintainer: Ubuntu Developers <ubuntu-devel-disc...@lists.ubuntu.com>
Original-Maintainer: Debian SSSD Team <pkg-sssd-de...@lists.alioth.debian.org>
Architecture: amd64
Version: 1.13.4-1ubuntu1
Depends: python-sss (= 1.13.4-1ubuntu1), sssd-ad (= 1.13.4-1ubuntu1), sssd-common (= 1.13.4-1ubuntu1), sssd-ipa (= 1.13.4-1ubuntu1), sssd-krb5 (= 1.13.4-1ubuntu1), sssd-ldap (= 1.13.4-1ubuntu1), sssd-proxy (= 1.13.4-1ubuntu1)
Filename: pool/main/s/sssd/sssd_1.13.4-1ubuntu1_amd64.deb
Size: 4304
MD5sum: dddc14b5d833eb161161aaf8d3d0987e
SHA1: b24d8b3795591338ba9ccc0ced4d653e91a196a1
SHA256: f580cc0036efe7fd3a95ce13715f1bc82690fe7ac01eb1aee174e79bfe6f06d4
Description-en: System Security Services Daemon -- metapackage
 Provides a set of daemons to manage access to remote directories and
 authentication mechanisms. It provides an NSS and PAM interface toward
 the system and a pluggable backend system to connect to multiple different
 account sources. It is also the basis to provide client auditing and policy
 services for projects like FreeIPA.
 .
 This package is a metapackage which installs the daemon and existing
 authentication back ends.
Description-md5: fbc7eaa314ae2423fee9d2943b3f4223
Multi-Arch: foreign
Homepage: https://fedorahosted.org/sssd/
Bugs: https://bugs.launchpad.net/ubuntu/+filebug
Origin: Ubuntu
Supported: 5y

root@eris-ubnt:~# lsb_release -rd
Description: Ubuntu 16.04.1 LTS
Release: 16.04

Expected behavior: sssd falls back to a secondary DNS server from /etc/resolv.conf to query for the SRV records it needs.

Actual behavior: sssd goes offline and fails to authenticate.
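
To make the expected behavior concrete, here is a minimal Python sketch of the failover I have in mind: walk the nameserver entries in resolv.conf order and move on to the next one when a query times out. This is only an illustration and not SSSD's actual resolver code. The names parse_nameservers and lookup_with_failover, and the injected query callable, are hypothetical and exist only for this example.

```python
from typing import Callable, List, Sequence


def parse_nameservers(resolv_conf_text: str) -> List[str]:
    """Return nameserver addresses in the order they appear in resolv.conf."""
    servers = []
    for line in resolv_conf_text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines ('#' or ';').
        if not line or line[0] in "#;":
            continue
        fields = line.split()
        if fields[0] == "nameserver" and len(fields) >= 2:
            servers.append(fields[1])
    return servers


def lookup_with_failover(
    name: str,
    nameservers: Sequence[str],
    query: Callable[[str, str], List[str]],
) -> List[str]:
    """Ask each nameserver in turn and return the first successful answer.

    `query(server, name)` is a hypothetical callable that performs one DNS
    query against one server and raises OSError (which includes
    TimeoutError) when that server cannot be reached.
    """
    errors = []
    for server in nameservers:
        try:
            return query(server, name)
        except OSError as exc:
            # Remember the failure and fall through to the next nameserver
            # instead of giving up after the first one.
            errors.append((server, exc))
    raise LookupError(f"all nameservers failed for {name}: {errors}")
```

With a resolv.conf that lists an unreachable server first and a working one second, lookup_with_failover("_ldap._tcp.domain.net", servers, query) would return the second server's answer (the record name here is just an example), which is the behavior I expected from sssd without having to reorder the file.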
** Affects: sssd (Ubuntu)
     Importance: Undecided
         Status: New

https://bugs.launchpad.net/bugs/1642486

Title:
  SSSD SRV Lookup Failover Fails when primary DNS Server is unresponsive