Hello all,

during this month I have been slowly working on a set of patches to move from storing information in two different formats (legacy and member/memberOf based) to just one (member/memberOf based). While doing this I had to address a few problems, such as what to do when a group has to be stored but its members have not been stored yet.

All the while I have been testing enumerations against a server that has more than 3k users and 3k groups. This is a medium-sized database, and yet getting groups from scratch (startup after deleting the .ldb database) can take up to a minute. Granted, the operation is quite a bit faster if the database just needs updating rather than creating from scratch, but I still think it takes too long.
I've been thinking hard about how to address this problem and how to get rid of the few hacks we have in the code when it comes to enumeration caching and retrieval. We have always said that enumerations are evil (and they are indeed), and in fact we even have options that disable enumerations by default. Yet I think this is not necessarily the right way to go.

I think we have 2 major problems in our current architecture when it comes to enumerations:

1) We try to hit the wire when an enumeration request comes in from a process and the (small) timeout for the previous enumeration has expired.
2) We run the enumeration in a single transaction (and yes, I introduced this recently), which means any other operation is blocked until the enumeration is finished.

The problem I actually see is that user space apps may have to wait far too long, and this *will* turn out to be a problem. Even if we give the option to turn off enumeration, I think that for apps that need it the penalty has simply become too big. I also think the way we perform updates in this model is largely inefficient, as we basically perform a full new search potentially every few minutes.

After some hard thinking I wrote down a few points I'd like the list's opinion on. If people agree I will start acting on them.
* Stop performing enumerations on demand, and perform them in the background if enumerations are activated (change the enumeration parameter from a bitfield to a true/false boolean).
* Perform a full user+group enumeration at startup (possibly using a paged or VLV search).
* When possible, request the modifyTimestamp attribute and save the highest modifyTimestamp into the domain entry as originalMaxTimestamp.
* On a tunable interval, run a task that refreshes all users and groups in the background using a search filter that adds a (modifyTimestamp>=$originalMaxTimestamp) condition.
* Still do a full refresh every X minutes/hours.
* Stop using a single huge transaction for enumerations (we might be OK doing a transaction for each page search if pages are small; otherwise just revert to the previous behavior of having a transaction per stored object).
* Every time we update an entry, store originalModifyTimestamp on it as a copy of the remote modifyTimestamp. This lets us know whether we need to touch the cached entry at all upon refresh (for example when getpwnam() is called), which speeds up operations for entries that need no refresh: we avoid both the data transformation and a write to ldb.
* Every time we run the general refresh task or save a changed entry, store a LastUpdatedTimestamp.
* When the refresh task has completed successfully, run another cleanup task that searches our LDB for any entry whose LastUpdatedTimestamp is too old. If any is found, double check against the remote server whether the entry still exists; update it if it does, and delete it otherwise.

NOTE: this means that until the first background enumeration is complete, a getent passwd or a getent group call may return incomplete results. I think this is acceptable as it will really happen only at startup, when the daemon caches are empty.

NOTE2: Of course the scheduled refreshes and cleanup tasks are always rescheduled if we are offline or if a fatal error occurs during the task.
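To make the refresh and cleanup points above a bit more concrete, here is a rough sketch of how the timestamps could interact. It is only an illustration in Python with in-memory dictionaries standing in for the LDAP server and the ldb cache, not actual sssd code. All names in it (Cache, build_refresh_filter, remote_search, store, refresh, cleanup, max_age) are hypothetical, and so is the choice to bump LastUpdatedTimestamp on an entry that a search returned unchanged. It assumes modifyTimestamp values are generalized-time strings such as 20090901110000Z, which sort correctly as plain strings, and it uses >= in the filter because LDAP search filters have no strict greater-than operator.

```python
from dataclasses import dataclass, field


@dataclass
class Cache:
    """Hypothetical stand-in for the ldb cache of one domain."""
    entries: dict = field(default_factory=dict)  # name -> cached entry
    original_max_timestamp: str = ""             # highest remote modifyTimestamp seen


def build_refresh_filter(base_filter, max_timestamp):
    """Build the LDAP filter for a refresh (illustrative only)."""
    if not max_timestamp:
        return base_filter  # nothing cached yet: full enumeration
    return "(&%s(modifyTimestamp>=%s))" % (base_filter, max_timestamp)


def remote_search(remote, since=""):
    """Simulate the server side: entries with modifyTimestamp >= since."""
    return {n: e for n, e in remote.items() if e["modifyTimestamp"] >= since}


def store(cache, name, entry, now):
    """Write an entry to the cache, recording both timestamps."""
    cache.entries[name] = {
        "data": entry["data"],
        "originalModifyTimestamp": entry["modifyTimestamp"],
        "lastUpdatedTimestamp": now,
    }


def refresh(cache, remote, now, full=False):
    """Run one background refresh; return how many entries were rewritten."""
    since = "" if full else cache.original_max_timestamp
    rewritten = 0
    for name, entry in remote_search(remote, since).items():
        cached = cache.entries.get(name)
        if cached and cached["originalModifyTimestamp"] == entry["modifyTimestamp"]:
            # Unchanged on the server: skip the data transformation and
            # only note that we have seen it (an assumption of this sketch).
            cached["lastUpdatedTimestamp"] = now
        else:
            store(cache, name, entry, now)
            rewritten += 1
        cache.original_max_timestamp = max(
            cache.original_max_timestamp, entry["modifyTimestamp"])
    return rewritten


def cleanup(cache, remote, now, max_age):
    """Re-check stale entries against the server; return the names deleted."""
    removed = []
    for name in list(cache.entries):
        if now - cache.entries[name]["lastUpdatedTimestamp"] <= max_age:
            continue  # fresh enough, leave it alone
        if name in remote:
            store(cache, name, remote[name], now)  # still exists: update it
        else:
            del cache.entries[name]                # gone remotely: delete it
            removed.append(name)
    return removed
```

In this sketch an incremental refresh never sees entries that were deleted on the server, which is why the cleanup pass is needed: the deleted entry keeps an old LastUpdatedTimestamp, fails the double check against the server, and is removed.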
NOTE3: I am proposing to change only the way enumerations are handled; single user or group lookups will remain unchanged for now and will be dealt with later if needed.

Please provide comments or questions if you think anything in the proposed items is unclear, or if you think I forgot to take some important aspect into account.

Simo.

_______________________________________________
sssd-devel mailing list
sssd-devel@lists.fedorahosted.org
https://fedorahosted.org/mailman/listinfo/sssd-devel