I am having some extreme difficulty with a setup for work, and I'm hoping
that someone on the list can help...

My employer has two sites, New York and London (details are anonymized,
but they are internally consistent).  At each physical location, we
have servers which we wish to configure with AD logins against an existing
on-prem AD domain, example.com.  Each of these locations has two domain
controllers which are properly categorized in Active Directory Sites and
Services, and the Windows servers will log into their local DCs for their
site.  This can be confirmed because no communication is allowed from one
site's servers to the other site's DCs, and you cannot telnet from one side
to the other, although AD replication between the two sites is configured
correctly.  The Linux servers running sssd are Ubuntu 20.04 with sssd
2.2.3.  The Windows DCs are Server 2019 running a supported domain
functional level.  Each Linux server has the IP addresses of its local DCs
configured as its DNS resolvers.

I was unable to use "realm discover" to discover a default domain, but I
understand this is because the servers are not using DHCP, so they have no
default DNS domain.  However, when I "realm discover example.com", I get
the expected output, including the "ad" provider statement.  I am able to
join the domain, and I can see the computer object getting created on the
DCs, so it looks like the domain join is working correctly.  I can also run
"getent passwd <user>@example.com" and get a passwd entry exactly as I
would expect it.

The next test is to attempt to log in using the AD provider.  I have tried
using SSH, sudo, and directly running "login" from a 'root' shell.  In all
three cases, the login works once or twice and then fails over and over
again, with each failure taking roughly 90 seconds to time out.

Having found nothing obviously wrong apart from the timeout, I
double-checked DNS to make sure all DNS names were available from the DCs,
and they were.  I also confirmed with our network team that they see no
failures while the login is happening, so it did not appear to be
traversing the network and getting blocked at the firewall.  At nearly my
wits' end, I decided to do a local packet capture with Wireshark on the
Linux server.  To my surprise, sssd was trying to
connect to the opposite datacenter.  A server in New York was trying to
connect to London for authentication, and this is not allowed by firewall
rules.  I see the first packet fail, followed by 4 retries with the
interval doubling each time, until it times out completely.
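
To make that retry pattern concrete, here is a minimal sketch of the
timeline it implies.  It assumes an initial retransmission timeout of
1 second (the usual Linux default, but not something read off the capture)
and that the stack waits one more doubled interval after the last retry
before giving up.  The function name syn_schedule is made up for this
illustration.

```shell
# Print when each connection attempt would be sent, given an initial
# timeout in seconds ($1) and a number of retries ($2), with the wait
# doubling after every attempt.  Both inputs are assumptions.
syn_schedule() {
    rto=$1
    retries=$2
    t=0
    i=0
    echo "SYN at ${t}s"
    while [ "$i" -lt "$retries" ]; do
        t=$((t + rto))          # next attempt fires after the current wait
        i=$((i + 1))
        echo "retransmit ${i} at ${t}s"
        rto=$((rto * 2))        # the wait doubles each time
    done
    echo "gives up at $((t + rto))s"
}
```

Running "syn_schedule 1 4" prints the send times for the first packet and
the four retries, then the point at which the attempt would be abandoned
under those assumptions.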

I have spent the last couple of days trying various combinations of
parameters with no luck.

I have added the following settings to "sssd.conf" and restarted sssd:
* ad_site = NewYork
* ad_server = dc1.example.com, dc2.example.com
* ad_enable_dns_sites = false
* dns_discovery_domain = NewYork._sites.example.com

I added each of these to the section for example.com and restarted sssd
after each change.  Every change has resulted in the same symptoms.
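
For anyone who wants to reproduce the DNS side of this, here is a rough
sketch of comparing the site-scoped SRV records with the domain-wide ones.
The record names follow the standard layout AD registers in DNS; the
function names srv_names and check_srv are made up for this illustration,
and "dig" is assumed to be installed (dnsutils on Ubuntu).

```shell
# Print the SRV record names for an AD site ($1) and DNS domain ($2):
# first the site-scoped names, then the domain-wide ones.
srv_names() {
    site=$1
    domain=$2
    echo "_ldap._tcp.${site}._sites.${domain}"
    echo "_kerberos._tcp.${site}._sites.${domain}"
    echo "_ldap._tcp.${domain}"
    echo "_kerberos._tcp.${domain}"
}

# Resolve each name against the configured resolvers and show the targets.
check_srv() {
    for name in $(srv_names "$1" "$2"); do
        echo "== ${name}"
        dig +short SRV "${name}"
    done
}
```

Usage would be "check_srv NewYork example.com".  If the site-scoped names
return only the local DCs while the domain-wide names also list the remote
ones, then any component doing a domain-wide lookup could end up picking a
DC at the other site.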

What am I missing?  What other things can I try to diagnose or resolve the
issue?

Thanks in advance,
Rob