On Thu, Feb 27, 2025 at 12:54 AM Alexey Tikhonov <[email protected]> wrote:
>
> Hi.
>
> I'll take a stab at providing some info, but please don't take this as
> a definitive answer.
>
> If you take a glance at https://sssd.io/_images/architecture.svg
> you'll see that SSSD is built around its cache (/var/lib/sss/db/*)
>
> The problem of a "slow `id` of a user that is a member of a bunch of
> big groups" is a very prominent SSSD problem in large environments,
> culminating in the IPA-AD trust scenario.
> And quite long standing:
> https://jhrozek.wordpress.com/2015/08/19/performance-tuning-sssd-for-large-ipa-ad-trust-deployments/
> So far 'ignore_group_members = true' is by far the best
> response available.
>
> On a high level, on a client side the problem is two-fold:
> (1) slow cache write operations (by "backends", 'sssd_be' process)
> (2) slow cache read operations (by "responders", 'sssd_nss' in your case)
>
> (2) is being addressed to some extent: I currently have patches posted
> for review - https://github.com/SSSD/sssd/pull/7841 - that show some
> promise. Depending on specifics of your setup and workflow, those
> patches might, or might not, provide you some alleviation.
> Typical scenario where pronounced benefits are expected: busy server
> with a hot and *huge* cache that performs tons of identity operations.
> If you see it worth and could give those patches a try and then
> provide feedback - that would be great.

Thank you for the detailed reply. I will take a look at your patch, see whether I can apply it, and if so report back on whether I see any performance improvement.

> (1) is more tricky. We have profiling results that show that most of
> CPU time is consumed in:
>  - https://github.com/SSSD/sssd/blob/master/src/ldb_modules/memberof.c
> This is a plugin for a 3rd party library - `libldb` - that on the fly
> adds 'memberof: group-dn' attributes to user objects being written to
> the cache.
>  - otherwise CPU consumption really depends on a backend being used -
> IPA, AD, LDAP with or without nested groups, etc. There is no single
> bottleneck.

I suspect my issue might be more with sssd_be, the backend, and thanks to the jhrozek post you shared, I will consider mounting the sssd cache on tmpfs.

>
> Now getting to your ideas, if I understood it correctly.
>
> What you describe is more or less what already happens when
> 'id_provider = ldap' is used.
> When one does `getent -s sss group $group` with 'ignore_group_members
> = false', it will return all group members. But inspection of
> /var/lib/sss/db/cache_$domain.ldb will show only the group object
> being cached, containing all members as "ghost" and "orig_member"
> attributes.
>
> With IPA it doesn't work this way, if I understand correctly, to
> properly support IPA views (server side overrides) - user objects need
> to be resolved, so that the group could return overridden members
> properly.
>
> Honestly, I don't remember right now how it works exactly with
> "id_provider = ad". If you are curious, you can stop SSSD, wipe cache,
> start SSSD, resolve single group (`getent group ...`), stop SSSD and
> inspect cache content using 'ldbsearch' tool.
> I'm talking about `getent group` because this is - `getgrgid()` - what
> takes time when you call `id`.
> `id` first resolves user (fast), list of groups user is member of
> (fast* using tokenGroups), and then it needs to convert every GID to
> groupname using `getgrgid()` - this loop is what typically hammers
> SSSD.

I'll take a closer look at this too. In my naive thinking, I was wondering why `id <account_with_many_group_memberships>` is so slow when it seems only 2 LDAP calls are necessary (possibly more if the user is a member of 1000+ groups and the filter must be chunked into acceptable sizes). The first call gets memberOf for the samaccountname, and the results are parsed into a single LDAP filter of the form:

(&(objectclass=group)(|(CN=groupA)(CN=groupB)(CN=groupC)(etc...)))

The second call uses that filter to retrieve each group's name and gidNumber. Basically, do the following:

#!/bin/bash
ACCT=bob
LDAP_URI=org.corp.com:3268
LDAP_BASE="dc=org,dc=corp,dc=com"

# Query 1: read the user's memberOf values and build one OR filter from the group RDNs.
LDAP_FILTER=$(ldapsearch -o ldif-wrap=no -QLLLH ldap://"$LDAP_URI" -b "$LDAP_BASE" \
    "(samaccountname=${ACCT})" memberof |
  awk -F,OU 'BEGIN { printf "(&(objectclass=group)(|" }
    { if ($1 ~ "memberOf:") { sub(/memberOf: /, ""); printf "(%s)", $1 } }
    END { printf "))\n" }')

# Query 2: fetch cn and gidNumber for every group matched by the filter.
ldapsearch -o ldif-wrap=no -QLLLH ldap://"$LDAP_URI" -b "$LDAP_BASE" \
    "$LDAP_FILTER" gidnumber cn |
  awk '{ if ($1 == "cn:") { printf "%s ", $NF } else if ($1 == "gidNumber:") { printf "%s\n", $NF } }'

For a member of ~200 groups this takes a second or so. sssd might then cache this minimal group information, which at this point is only the group samaccountname, the dn (possibly, but I don't parse it in my example), the gidNumber, and the fact that the user's samaccountname is a member.

Perhaps sssd could then "fill out" this minimal cached group information, either through a deferred or background process that queries each of the initially cached groups for its full membership, or more actively once someone runs `getent group <group>`?

Regardless, I will follow up on your suggestions, and thank you very much for taking the time to respond.
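
For the 1000+ group case mentioned above, here is a rough sketch of how the filter could be split into chunks. The function name build_group_filters and the default chunk size of 500 are placeholders I made up for illustration, not anything taken from sssd or AD, and group names containing LDAP filter special characters such as parentheses are not escaped.

```bash
#!/bin/bash
# Hypothetical helper: reads one group RDN per line on stdin
# (for example "CN=groupA") and prints one LDAP filter per chunk
# of at most $1 groups. The default of 500 is an assumed value.
build_group_filters() {
    local chunk_size
    local count
    local terms
    local rdn
    chunk_size=${1:-500}
    count=0
    terms=""
    while IFS= read -r rdn; do
        [ -n "$rdn" ] || continue            # skip blank lines
        terms="${terms}(${rdn})"             # NOTE: no filter escaping is done here
        count=$((count + 1))
        if [ "$count" -ge "$chunk_size" ]; then
            # Chunk is full: emit it and start a new one.
            printf '(&(objectclass=group)(|%s))\n' "$terms"
            terms=""
            count=0
        fi
    done
    # Emit the last, partially filled chunk if anything is left.
    if [ -n "$terms" ]; then
        printf '(&(objectclass=group)(|%s))\n' "$terms"
    fi
}
```

Each line it prints could then be passed as the filter to a separate ldapsearch call, so a member of N groups would need roughly N divided by the chunk size extra queries instead of one.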

Bob
