** Summary changed:

- ad_use_ldaps error could not start tls encryption
+ ldap_install_tls occasionally fails due to watchdog timeout when using ad_use_ldaps with tls

** Description changed:

- New sssd.conf variable ad_use_ldaps not working. On starting sssd it
- errors with "sssd[be[13765]: Could not start TLS encryption. (unknown
- error code)"
+ [Impact]
  
- # lsb_release -rd
- Description:    Ubuntu 18.04.5 LTS
- Release:        18.04
- Note: problem also seen with Ubuntu 20.04.2
- # apt-cache policy sssd | grep Installed
-   Installed: 1.16.1-1ubuntu1.7
+ If you enable ad_use_ldaps in your sssd config and configure sssd to use
+ TLS instead of the regular GSS-SPNEGO or GSSAPI encryption, a slow AD
+ server or a busy network can cause the watchdog to time out the call to
+ ldap_install_tls() before it completes. The TLS handshake then fails and
+ you cannot connect to the AD server.
  
- Expectation
- Adding ad_use_ldaps to a working AD integrated /etc/sssd/sssd.conf to use
- port 636 instead of port 389 due ADV 190023. Reference
- https://bugs.launchpad.net/ubuntu/focal/+source/sssd/+bug/1868703/
+ If you set debug_level to 4 or higher, you will see the following in
+ sssd_ldap_server.log:
  
- Problem
- Added a working Public root CA cert to the common ca-certificate
- (/etc/ssl/ca-certificates) and /etc/ldap/ldap.conf has following set:
- TLS_CACERT      /etc/ssl/certs/ca-certificates.crt
- An ldapsearch using the above certificate bundle against LDAPS is successful:
- 
- # openssl s_client -connect company-ad-server.company.com:636
- CONNECTED(00000005)
- # ldapsearch -v -H ldaps://company-ad-server.company.com:636 -b "dc=company,dc=com" "(sAMAccountName=superduperuser)"
- ldap_initialize( ldaps://company-ad-server.company.com:636/??base )
- SASL/GSSAPI authentication started
- SASL username: superduperu...@company.com
- SASL SSF: 0
- filter: (sAMAccountName=superduperuser)
- requesting: All userApplication attributes
- <snip>
- # Duperuser\2C Super ADM, Users, Admin, company.com
- dn: CN=Duperuser\, Super ADM,OU=Internal,OU=Users,OU=Admin,DC=company,DC=com
- <snip>
- 
- sssd.conf is configured with:
- [sssd]
- domains = company.com
- config_file_version = 2
- services = nss, pam
- 
- [domain/company.com]
- ad_domain = company.com
- krb5_realm = company.com
- realmd_tags = manages-system joined-with-adcli
- cache_credentials = True
- id_provider = ad
- krb5_store_password_if_offline = True
- default_shell = /bin/bash
- use_fully_qualified_names = True
- fallback_homedir = /home/%u@%d
- ldap_id_mapping = True
- ad_use_ldaps = True
- ldap_tls_cacert = /etc/ssl/certs/ca-certificates.crt
- auth_provider = ad
- access_provider = simple
- simple_allow_groups = linux-admins
- 
- Stopping sssd, clearing sssd cache, starting sssd returns following error:
- sssd[be[13765]: Could not start TLS encryption. (unknown error code)
- 
- Setting debug_level = 4 (or higher) returns following around this unknown error:
  [set_server_common_status] (0x0100): Marking server 'ad-server.company.com' as 'name resolved'
  [be_resolve_server_process] (0x0200): Found address for server ad-server.company.com: [y.y.y.y] TTL 3600
  [ad_resolve_callback] (0x0100): Constructed uri 'ldaps://ad-server.company.com'
  [ad_resolve_callback] (0x0100): Constructed GC uri 'ldaps://ad-server.company.com'
  [sssd_async_socket_init_send] (0x0400): Setting 6 seconds timeout for connecting
  [sss_ldap_init_sys_connect_done] (0x0020): ldap_install_tls failed: [Connect error] [(unknown error code)]
  [sss_ldap_init_state_destructor] (0x0400): calling ldap_unbind_ext for ldap:[0x55d1149ef6e0] sd:[18]
  [sss_ldap_init_state_destructor] (0x0400): closing socket [18]
  [sdap_sys_connect_done] (0x0020): sdap_async_connect_call request failed: [5]: Input/output error.
  [fo_set_port_status] (0x0100): Marking port 389 of server 'ad-server.company.com' as 'not working'
  [fo_set_port_status] (0x0400): Marking port 389 of duplicate server 'ad-server.company.com' as 'not working'
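
To check a captured log for this failure signature, something like the following Python sketch may help. It only matches the "ldap_install_tls failed" message shown in the excerpt above; the function name tls_install_failures and the sample list are hypothetical, and the exact log formatting may differ between sssd versions.

```python
import re

# Pattern based on the log excerpt above. Matching is deliberately loose,
# since the exact line layout is an assumption and may vary by version.
FAILURE = re.compile(r"\[sss_ldap_init_sys_connect_done\].*ldap_install_tls failed")


def tls_install_failures(lines):
    """Return the log lines that report a failed ldap_install_tls() call."""
    return [line.rstrip("\n") for line in lines if FAILURE.search(line)]


# Sample lines copied from the excerpt above.
SAMPLE_LOG = [
    "[sssd_async_socket_init_send] (0x0400): Setting 6 seconds timeout for connecting\n",
    "[sss_ldap_init_sys_connect_done] (0x0020): ldap_install_tls failed: "
    "[Connect error] [(unknown error code)]\n",
    "[sss_ldap_init_state_destructor] (0x0400): closing socket [18]\n",
]

for hit in tls_install_failures(SAMPLE_LOG):
    print(hit)
```

In practice you would pass an open file object for the domain log instead of SAMPLE_LOG.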
+ 
+ ldapsearch with ldaps will work correctly in the same environment:
+ 
+ # openssl s_client -connect company-ad-server.company.com:636
+ CONNECTED(00000005)
+ # ldapsearch -v -H ldaps://company-ad-server.company.com:636 -b "dc=company,dc=com" "(sAMAccountName=superduperuser)"
+ ldap_initialize( ldaps://company-ad-server.company.com:636/??base )
+ SASL/GSSAPI authentication started
+ SASL username: superduperu...@company.com
+ SASL SSF: 0
+ filter: (sAMAccountName=superduperuser)
+ requesting: All userApplication attributes
+ <snip>
+ # Duperuser\2C Super ADM, Users, Admin, company.com
+ dn: CN=Duperuser\, Super ADM,OU=Internal,OU=Users,OU=Admin,DC=company,DC=com
+ <snip>
+ 
+ A workaround is to simply try again: since this is a race condition,
+ you might beat the watchdog on subsequent retries. Otherwise, disable
+ ad_use_ldaps until a fix is available.
+ 
+ [Testcase]
+ 
+ You will need a Windows Server 2019 machine with Active Directory
+ installed and configured, and some users created in Active Directory.
+ 
+ On the Ubuntu client, join the AD server using realm. You will need to
+ import the AD certificate too.
+ 
+ When importing the TLS certificate, you can add it to
+ /etc/ssl/ca-certificates, and edit /etc/ldap/ldap.conf and set:
+ TLS_CACERT      /etc/ssl/certs/ca-certificates.crt
+ 
+ Edit /etc/sssd/sssd.conf and ensure that ldap_tls_cacert is set
+ correctly to "ldap_tls_cacert = /etc/ssl/certs/ca-certificates.crt", and
+ enable "ad_use_ldaps = True".
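
As a quick sanity check of those two settings, a Python sketch along these lines could be used. It assumes sssd.conf is close enough to INI syntax for the standard configparser module to read it; the function name ldaps_tls_settings and the trimmed SAMPLE text are hypothetical.

```python
import configparser


def ldaps_tls_settings(conf_text):
    """Return {section: (ad_use_ldaps_enabled, ldap_tls_cacert)} per domain section."""
    # interpolation=None keeps values such as "/home/%u@%d" from being
    # treated as configparser interpolation syntax.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    found = {}
    for section in parser.sections():
        if section.startswith("domain/"):
            use_ldaps = parser.get(section, "ad_use_ldaps", fallback="False")
            cacert = parser.get(section, "ldap_tls_cacert", fallback=None)
            found[section] = (use_ldaps.strip().lower() == "true", cacert)
    return found


# Trimmed-down example configuration, for illustration only.
SAMPLE = """\
[sssd]
domains = company.com

[domain/company.com]
id_provider = ad
fallback_homedir = /home/%u@%d
ad_use_ldaps = True
ldap_tls_cacert = /etc/ssl/certs/ca-certificates.crt
"""

print(ldaps_tls_settings(SAMPLE))
```

To check a real system, read /etc/sssd/sssd.conf (as root) and pass its contents to the function.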
+ 
+ Then restart sssd with:
+ 
+ $ sudo systemctl restart sssd.service
+ 
+ If you have a slow server or busy network, the watchdog will kill the
+ call to ldap_install_tls() before it completes, and sssd will fail to
+ start. You may need several attempts to reproduce. Just keep restarting
+ sssd.service.
+ 
+ [Where problems could occur]
+ 
+ The changes only affect users who enable ad_use_ldaps, and only those
+ who use TLS. Users of GSS-SPNEGO with ad_use_ldaps are not affected,
+ nor are those who do not use ad_use_ldaps.
+ 
+ The patch checks for failure of TLS handshake with the AD server, and
+ adds a retry if the failure was caused by the watchdog killing the call
+ to ldap_install_tls(). This happens very early on in sssd service
+ startup, and if a regression were to occur, a system administrator would
+ notice almost immediately and downgrade the package.
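
To illustrate the retry idea described above, here is a minimal Python sketch. It is not the upstream C patch. It assumes, as a simplification, that a watchdog interruption can be recognised by a tick counter that advances while the handshake call is running; install_tls, get_ticks, max_attempts and HandshakeInterrupted are hypothetical names introduced only for this example.

```python
class HandshakeInterrupted(Exception):
    """Hypothetical error: every attempt was interrupted by the watchdog."""


def install_tls_with_retry(install_tls, get_ticks, max_attempts=3):
    """Run install_tls(), retrying only when the watchdog fired during a failure.

    install_tls: callable returning True on success, False on failure.
    get_ticks:   callable returning the current watchdog tick count
                 (an assumed way of detecting the interruption).
    Returns the number of the attempt that succeeded.
    """
    for attempt in range(1, max_attempts + 1):
        ticks_before = get_ticks()
        ok = install_tls()
        ticks_after = get_ticks()
        if ok:
            return attempt
        if ticks_after == ticks_before:
            # The watchdog did not fire, so this is a genuine TLS failure
            # and retrying is unlikely to help.
            raise ConnectionError("TLS handshake failed")
        # Otherwise the watchdog interrupted the call: loop and try again.
    raise HandshakeInterrupted("gave up after %d attempts" % max_attempts)


# Usage example with a fake handshake that is interrupted twice.
ticks = {"n": 0}
calls = []


def fake_install():
    calls.append(1)
    if len(calls) < 3:
        ticks["n"] += 1  # simulate the watchdog firing mid-handshake
        return False
    return True


print(install_tls_with_retry(fake_install, lambda: ticks["n"]))
```

Retrying only on interruption keeps genuine certificate or connection errors visible instead of hiding them behind repeated attempts.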
+ 
+ If a regression were to occur, a workaround is to 1) change from TLS to
+ GSS-SPNEGO, or 2) disable ad_use_ldaps.
+ 
+ [Other info]
+ 
+ This is reported upstream in:
+ 
+ https://github.com/SSSD/sssd/issues/5531
+ 
+ The commit which fixes the issue is:
+ 
+ commit da55e3e69707de416b7949d08c165c950090bbb6
+ From: Iker Pedrosa <ipedr...@redhat.com>
+ Date: Wed, 3 Mar 2021 15:34:49 +0100
+ Subject: ldap: retry ldap_install_tls() when watchdog interruption
+ Link: https://github.com/SSSD/sssd/commit/da55e3e69707de416b7949d08c165c950090bbb6
+ 
+ This landed in sssd 2.5.0, so Bionic, Focal, Hirsute and Impish all
+ require fixing. The commit is a cherry pick to Focal, Hirsute and
+ Impish, while Bionic requires a backport for minor context adjustments.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1921494

Title:
  ldap_install_tls occasionally fails due to watchdog timeout when using
  ad_use_ldaps with tls

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/sssd/+bug/1921494/+subscriptions


-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
