I am having some extreme difficulty with a setup for work, and I'm hoping that someone on the list can help...
My employer has two sites, New York and London (details are anonymized, but they are internally consistent). At each physical location, we have servers which we wish to configure with AD logins against an existing on-prem AD domain, example.com. Each location has two domain controllers which are properly categorized in Active Directory Sites and Services, and the Windows servers log into the local DCs for their site. This can be confirmed because no communication is allowed from one site's servers to the other site's DCs: you cannot telnet from one side to the other, although AD replication between the two sites is configured correctly.

The Linux servers running sssd are Ubuntu 20.04 with sssd 2.2.3. The Windows DCs are Server 2019 running a supported domain functional level. Each Linux server has the IP addresses of its local DCs configured as its DNS resolvers.

I was unable to use "realm discover" to discover a default domain, but I understand this is because the servers are not using DHCP, so they have no default DNS domain. However, when I run "realm discover example.com", I get the expected output, including the "ad" provider statement. I am able to join the domain, and I can see the computer object being created on the DCs, so the domain join appears to be working correctly. I can also run "getent passwd <user>@example.com" and get a passwd entry exactly as I would expect.

The next test was to attempt to log in using the AD provider. I have tried SSH, sudo, and directly running "login" from a root shell. In all three cases, the login works once or twice and then fails over and over again, with each failure taking roughly 90 seconds to time out. Having found nothing obviously wrong apart from the timeout, I double-checked DNS to make sure all DNS names were available from the DCs, and they were.
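
For reference, the DNS check above can be narrowed to the SRV records involved in site-aware DC discovery. The following is only a minimal sketch: the helper name site_srv_names is hypothetical, and it assumes the usual Active Directory naming pattern of _service._tcp.<Site>._sites.<domain> for site-scoped records. It only builds the names to pass to a tool such as dig or nslookup; it does not query DNS itself.

```python
def site_srv_names(domain, site, services=("ldap", "kerberos")):
    """Build site-scoped and domain-wide SRV record names for an AD domain.

    Hypothetical helper for illustration; it performs no DNS lookups.
    """
    names = []
    for service in services:
        # Site-scoped record: expected to list only the DCs in that site.
        names.append(f"_{service}._tcp.{site}._sites.{domain}")
        # Domain-wide record: expected to list every DC in the domain.
        names.append(f"_{service}._tcp.{domain}")
    return names


if __name__ == "__main__":
    for name in site_srv_names("example.com", "NewYork"):
        print(name)
```

Comparing the answers for the site-scoped names against the domain-wide names should show whether the resolvers hand out DCs from the other site, which may help explain where a remote DC address comes from.
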
I also confirmed with our network team that no failures are seen while the login is happening, so the traffic is not traversing the network and being blocked at the firewall.

Nearly at my wits' end, I decided to do a local packet capture with Wireshark on the Linux server. To my surprise, sssd was trying to connect to the opposite datacenter. A server in New York was trying to connect to London for authentication, and this is not allowed by the firewall rules. I see the first packet fail, and then four more attempts, with the interval doubling each time, until the connection times out completely.

I have spent the last couple of days trying various combinations of parameters with no luck. I have added the settings below to "sssd.conf" and restarted sssd:

* ad_site = NewYork
* ad_server = dc1.example.com, dc2.example.com
* ad_enable_dns_sites = false
* dns_discovery_domain = NewYork._sites.example.com

I added each of these to the section for example.com before restarting sssd. Each change has resulted in the same symptoms.

What am I missing? What other things can I try to diagnose or resolve the issue?

Thanks in advance,
Rob
