On Wed, 2012-04-04 at 19:52 +0200, Pavel Březina wrote:
> First of all, I have never thought of the in-memory cache as a
> performance issue (we are only sorting them), but as a nice way to
> handle rule expiration.
>
> I will try to explain why the sysdb is not enough in this case.
>
> Because it is a security issue,
Ok Pavel, stop right there. Explain exactly what the security issue
would be, and why sysdb implies it while a second in-memory cache
wouldn't.

> when user runs SUDO, we want to always
> go to an LDAP server and download current rules - not all of them, just
> the ones that apply to him to minimize the traffic.

No, we do not always want to; that's not necessarily the case. I can
well see admins decide that updating every X minutes is good enough.
It's a balance between performance and freshness of rules.

> But running SUDO multiple times in a short period of time is not an
> exception, therefore we cannot afford this approach because it would be
> too slow. It would make SUDO unusable.

Correct, a period of time within which sudo rules are considered valid
is quite ok.

> Therefore we need to implement expiration mechanism so we can reuse the
> rules in a short period (currently set to 180 seconds).

Why do we need a new expiration mechanism? In sysdb we already store an
"expiration time" with the entries normally. At least we do that for
users/groups. If it is not done for sudo rules it is an error, as you
wouldn't be able to assess rule validity on sssd process restart.

> The question is *how*?
>
> 1. *Download every time all rules instead of per user*
>    This would probably be too costly to do, similar to users and
>    groups enumeration. There is already an option implemented that
>    enables periodic updates of all rules.
>
>    I admit I have never seen an enterprise sudoers. I assume that it
>    can get very big with hundreds of rules. If this is not a valid
>    assumption, just tell me that I am stupid and we can do it this
>    way :-)

Sudo rules can get very large indeed; whether it is ok to enumerate
them or not is a different thing. Enumerating a few hundreds of rules
is not a big deal. For users it is different, because we expect up to
hundreds of thousands of users/groups.
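Just to illustrate what I mean by reusing the sysdb expiration: the
check itself is trivial. A minimal sketch, with invented names (the
real sysdb stores the expiration as an attribute on the entry, this is
not its actual API):

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical cached-rule record; the real sysdb keeps the
 * expiration timestamp as an attribute of the stored entry. */
struct cached_rule {
    const char *name;
    time_t expire;   /* absolute expiration timestamp */
};

/* An expired entry is treated exactly like a cache miss: the caller
 * falls back to an LDAP refresh, same as we do for users/groups. */
bool rule_is_valid(const struct cached_rule *r, time_t now)
{
    return r->expire > now;
}
```

Because the timestamp lives in sysdb, it also survives an sssd
restart, which the in-memory hash table does not.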
Perhaps we should have a settable threshold under which we do enumerate
frequently, and over which we automatically stop doing that unless
forced to.

> 2. *Store the rules in sysdb per user*
>    There would be much data duplication and the database may get very
>    huge.

Why per-user? It doesn't make sense to me.

> 3. Store the expiration time with each rule
>    *update only those rules that are expired*

I totally expect this to be done already; if it is not done I consider
it a bug.

> We wouldn't be able to decide whether a new rule was created without
> downloading all rules that apply for the user.

This assumption and logic are incorrect.

The assumption is incorrect because we already do smart updates for
users/groups enumeration by downloading only the entries that were
changed since the last time they were enumerated. So you can easily
download only newer rules by using the changetime of the newest rule
you have in sysdb.

But the logic here also seems incorrect. The main issue I see is not so
much in finding whether new rules have been added, but in invalidating
rules that have been removed. However, this too could be done in a
smart way.

First refresh only for new rules. Then apply rule matching. For the
handful of rules that match the sudo request, check how old they are.
For each rule older than X minutes, do active validation by trying to
fetch it individually. This way you refresh only the matching rules.
If any rule is changed/missing, you update/remove it from the
evaluation set.

> (And we might be delivering to SUDO half of the updated rules and
> half of the old rules.)

I do not see how this would happen.

> 4. Store the expiration time with each rule
>    *if one rule has different timestamp than others, perform an update*

I do not think this makes much sense. You want to be able to update
only individual rules; always fetching all rules that apply to a user
seems a waste of bandwidth, and of time spent waiting by the user.
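To make the per-request flow above concrete, here is a rough sketch
(all names are invented for illustration, this is not actual sssd
code): after the changetime-based refresh and the rule matching, only
the matched rules older than the validity window get an individual
LDAP lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

struct sudo_rule {
    const char *name;
    time_t last_refreshed;  /* when we last confirmed it against LDAP */
};

/* Out of the rules matching this sudo request, pick the ones older
 * than the validity window; only those get an individual LDAP fetch,
 * after which each is updated or removed from the evaluation set. */
size_t select_stale(struct sudo_rule **matched, size_t n,
                    time_t now, time_t validity,
                    struct sudo_rule **stale_out)
{
    size_t n_stale = 0;
    for (size_t i = 0; i < n; i++) {
        if (now - matched[i]->last_refreshed > validity)
            stale_out[n_stale++] = matched[i];
    }
    return n_stale;
}
```

The point is that the expensive validation is bounded by the handful
of rules that actually match, not by the whole rule set.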
> Here we may end up with an incomplete set of rules and a race
> condition where the cache will not be used at all. Details are below
> (*).
>
> 5. *Storing set of rules per user in in-memory cache* (current approach)
>
>    Once the rules that apply for the user are downloaded, they are
>    stored in the sysdb for the offline times, but they are also stored
>    in a hash table. We set a tevent timer that will remove them from
>    the hash table after the expiration time is exceeded.

What's the point here? You can achieve exactly the same result by
setting an expiration date on each sysdb entry.

> We will refresh the rules in case of in-memory cache miss.

You can do the same with sysdb rules: just consider any expired one as
missing.

> There is a possible duplication of the rules in the memory, but only
> for a short time and for a small amount of users (I don't expect
> many administrators to be using sudo at one moment).

I wouldn't count on such assumptions; however, I do not see how a
duplicated in-memory cache is helping here. You can do exactly the same
operations with exactly the same effects using sysdb entries, so why is
it not done that way?

> 6. If there is some approach I have not thought of, please tell me.

See above.

> =======================================================================
> (*)
>
> **Rules in LDAP**
>
> cn=rule1
> sudoUser: A
>
> cn=rule2
> sudoUser: B
>
> cn=rule3
> sudoUser: A
> sudoUser: B
>
> **Timeline**
>
> *Timestamp: 0*
> user A does: sudo ls
>
> Downloaded rules: rule1 and rule3
> Sysdb contains:
>
> cn=rule1
> expire: 180
>
> cn=rule3
> expire: 180
>
> *Timestamp: 10*
> user B does: sudo ls
>
> rule1 and rule3 are found in sysdb
> expiration time equals
> return them
>
> !!! rule2 is missing !!!
>
> *Timestamp: 190*
> user B does: sudo ls
>
> rule3 in sysdb is expired
> perform an LDAP search for user B
> sysdb contains:
>
> cn=rule1
> expire: 180
>
> cn=rule2
> expire: 370
>
> cn=rule3
> expire: 370
>
> *Timestamp: 200*
> user A does: sudo ls
>
> rule1 and rule3 have different expiration time
> perform an LDAP search for user A
> sysdb contains:
>
> cn=rule1
> expire: 380
>
> cn=rule2
> expire: 370
>
> cn=rule3
> expire: 380
>
> *Timestamp: 210*
> user B does: sudo ls
>
> rule2 and rule3 have different expiration time
> perform an LDAP search for user B
> sysdb contains:
>
> cn=rule1
> expire: 380
>
> cn=rule2
> expire: 390
>
> cn=rule3
> expire: 390
>
> !!! always performing an LDAP search !!!

The error, it seems to me, is in trying to download rules per-user. I
do not think you can ever attain decent performance this way, and it
will always be fraught with errors. For example, if sssd goes offline
right after the first step, then user B will always miss rules. This is
not expected, very difficult to explain to admins, and, frankly,
probably unacceptable as it is very non-deterministic.

What you really want to do is to enumerate the sudo rules that apply to
the machine. You effectively want to cache them all if possible and
have a reasonable tunable expiration time (5/10/15 minutes ...).

The reason is quite simple: sudo rules change very rarely, but you want
them all even when you are offline. You do not want a normal user to be
left without the sudo rules for his laptop only because he rarely needs
them.

Assume the case where sudo is used to run some VPN software only when
user X is out of the office. Now, user X has never used the VPN before
and therefore never needed to run sudo. Because you never downloaded
the rules, he goes home, opens the machine, runs the command to bring
up the VPN and ... it doesn't work.

I think the nature of sudo requires you to download all the rules that
apply to a specific machine.
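For concreteness, the per-machine query is a single filter keyed on
sudoHost rather than sudoUser (sudoRole and sudoHost are the attributes
from the standard sudo LDAP schema). The helper below just builds that
filter string; it is only a sketch, with simplified host matching and
no netgroup/wildcard handling:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build a per-machine filter: one search returns every rule that can
 * apply on this host, including rules no user has triggered yet, so
 * they are all cached before the machine goes offline. */
int build_host_filter(char *buf, size_t len,
                      const char *shortname, const char *fqdn)
{
    return snprintf(buf, len,
                    "(&(objectClass=sudoRole)"
                    "(|(sudoHost=ALL)(sudoHost=%s)(sudoHost=%s)))",
                    shortname, fqdn);
}
```

One such search per refresh interval replaces the per-user searches
that the timeline above keeps triggering.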
If we have a concern about the number of rules, the admin can easily
cut the rules down by properly restricting the set of rules used on
specific machines. But in general I do not think it will be a big deal;
I do not expect to see cases where multiple thousands of rules apply to
the same host. It would be unmanageable anyway.

So I think that the problem, in the end, is the approach where you
decided to download rules per-user instead of per-machine; it leads you
into all sorts of corner cases.

The in-memory cache at this point is just a red herring, but I am glad
I dug in here, as it uncovered a bigger design issue I didn't notice
before. The in-memory cache remains an additional layer that should
simply be avoided, and it becomes completely useless if you download
rules per-machine instead of per-user.

Simo.

-- 
Simo Sorce * Red Hat, Inc * New York

_______________________________________________
sssd-devel mailing list
sssd-devel@lists.fedorahosted.org
https://fedorahosted.org/mailman/listinfo/sssd-devel