On 2017-04-06 20:18, Jakub Hrozek wrote:
On Thu, Apr 06, 2017 at 07:21:01PM +0200, m...@chinewalking.com wrote:
Hi,

My IPA<->AD trust setup experiences intermittent failures during login
events. The AD subdomain goes into an inactive/offline state, and users logging in are put into a 'delayed authentication' queue. Logging in again after a minute or so usually succeeds, as the subdomain is reset and the user is cached for subsequent logins. getent/id and kinit succeed at all times, even
with a purged sssd cache.
SRV records are correctly resolved, except for _kerberos-master.

I have not been able to troubleshoot the intermittent failures any further.
Traffic captures show no strange behaviour, yet the sssd_domain log
clearly shows AD to be unreachable at times. All AD servers are W2012. To rule out faulty Windows configs, I masked the _ldap and _kerberos DNS records so that they
point to single nodes, but so far this has had no effect (would it?).

In sssd's data_provider_fo.c, be_fo_reset_svc() calls fo_get_service(), which returns EOK. I'm not yet familiar with the variables at play; would adding
debug statements here reveal faults that may cause this?

Could you paste a bit more context? I think what would work is to trim
the logs (truncate --size 0), then reproduce the issue and search for
the first occurrence of a "NOT_WORKING" message from any of the fo_*
functions.
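
The search step described above could be sketched roughly as follows. This is only an illustration, not part of sssd: the helper names first_occurrence and scan_directory are hypothetical, and the assumption is that the debug logs are plain-text *.log files in a single directory (commonly /var/log/sssd, which may differ on your system).

```python
from pathlib import Path
from typing import Dict, Optional, Tuple


def first_occurrence(log_path, needle: str = "NOT_WORKING") -> Optional[Tuple[int, str]]:
    """Return (line_number, line) for the first line containing needle, or None."""
    # errors="replace" keeps the scan going if a log contains odd bytes.
    with open(log_path, "r", errors="replace") as fh:
        for lineno, line in enumerate(fh, start=1):
            if needle in line:
                return lineno, line.rstrip("\n")
    return None


def scan_directory(log_dir, needle: str = "NOT_WORKING") -> Dict[str, Tuple[int, str]]:
    """Map each *.log file in log_dir to its first hit; files without a hit are skipped."""
    hits = {}
    for path in sorted(Path(log_dir).glob("*.log")):
        hit = first_occurrence(path, needle)
        if hit is not None:
            hits[path.name] = hit
    return hits
```

Calling scan_directory("/var/log/sssd") as root right after reproducing the failure would list, per log file, the earliest line mentioning NOT_WORKING, which is usually a better starting point than the later "offline" messages.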

After truncating the logs I noticed an error similar to one I fixed earlier today. During the initial deployment of IPA I created a number of existing groups (sudo, app, etc.) with low GIDs. One of those groups caused issues and I deleted it earlier. Now another group triggered exactly the same sequence of errors:

[{"CODE_FILE=src/providers/ipa/ipa_id.c", 36}{"CODE_FUNC=ipa_initgr_get_overrides_step"{"The group name=s...@unix.foo.local,cn=groups,cn=unix.foo.local,cn=sysdb has no UUID attribute objectSIDString, error!\n"
[{"CODE_FILE=src/providers/ipa/ipa_subdomains_id.c", 47}{"CODE_FUNC=ipa_id_get_groups_overrides_done", 42}{"IPA resolve user groups overrides failed [22].\n"
[{"CODE_FUNC=be_mark_dom_offline", 29}{"Marking subdomain foo.local offline\n"

With all these troublesome groups removed I have not been able to reproduce the issues. I will test further with different users and mapped groups. I guess the main fault was in how I read the logs: with multiple logins happening at once, I overlooked the real error and only saw the mentions of offline AD backends and subdomains.
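
To avoid overlooking the real error among many "offline" messages, one option is to print the few lines that precede each "Marking subdomain ... offline" entry. The following is a minimal sketch under that assumption; context_before is a hypothetical helper, not an sssd tool, and the marker string is simply taken from the log excerpt above.

```python
from collections import deque
from typing import Iterable, List, Tuple


def context_before(lines: Iterable[str],
                   marker: str = "Marking subdomain",
                   before: int = 5) -> List[Tuple[List[str], str]]:
    """For every line containing marker, return (preceding lines, matching line)."""
    recent = deque(maxlen=before)  # rolling window of the last `before` lines
    results = []
    for raw in lines:
        line = raw.rstrip("\n")
        if marker in line:
            # Snapshot the window so later appends do not alter this result.
            results.append((list(recent), line))
        recent.append(line)
    return results
```

Passing an open log file to context_before() yields, for each time the subdomain was marked offline, the lines logged just before it; in the case above that would have surfaced the "has no UUID attribute objectSIDString" message.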

I am not sure why these POSIX groups had no objectSIDString while others did.

Thank you,

Mike

--
Manage your subscription for the Freeipa-users mailing list:
https://www.redhat.com/mailman/listinfo/freeipa-users
Go to http://freeipa.org for more info on the project
