Re: [SSSD] Netgroups in SSSD
On 09/10/2010 09:42 AM, Dmitri Pal wrote:
> Stephen Gallagher wrote:
>> On 09/08/2010 09:04 AM, Stephen Gallagher wrote:
>>> I've also been thinking about how we're going to handle processing the nested groups, and I think what I'm going to do is take advantage of some of the nicer features of libcollection. Internal processing of setnetgrent_send() will recursively call out a subrequest to setnetgrent_send() again for each of the named nested groups. The setnetgrent_recv() function will return a libcollection object containing all of the results from that request (as well as any additional subrequests called). When the results come back up, they can be added together trivially using the col_add_collection_to_collection() interface with the col_add_mode_clone mode.
>>
>> I've added some additional details about how I would like to do the nesting and loop detection to the wiki page. Comments welcome.
>
> Sorry, I am having trouble understanding this algorithm, but maybe it is because I do not understand the tevent_req interface to the level needed here. AFAIU the struct tevent_req *setnetgrent_send(char *netgroupname, hash_table_t *nesting) call will ask for the netgroups, while another call, errno_t setnetgrent_recv(struct tevent_req *req, struct collection **entries), is the call that will be executed when the response from the server is received.

No, this is where you are mistaken. The call that will be executed when the request is finished is specified by the caller right after invoking setnetgrent_send(), using the tevent_req_set_callback() function. That callback must then invoke setnetgrent_recv() in order to read out the final result data that is available.

> The problems I have are with item 4). If it removes the netgroup from the hash, how does the hash ever grow?

You're confusing the hash with the result set (which will be a libcollection object).
I'm thinking about changing the way I do the nested invocation so that the toplevel hides the need for the hash.

The idea behind the hash is actually to have it double as a reference count and a loop-detection mechanism. It doesn't need to be a hash (it could just as easily be a b-tree), but since we already have an efficient hash available, I was just going to use that. The idea is that every time we recurse down a level, we will add the name of that netgroup to the hash. Before recursing down again, we'll make sure that the new name is not a key in the hash. If it is, we know we've hit a loop and should break processing. When we recurse up a level, we need to remove this entry from the tracking hash so that it's possible to recurse down into it again in a different branch of the tree.

There is a pro and a con to this approach.

Pro: We can store the complete result sets of all of the member netgroups individually, so if they're requested directly or indirectly again later, we don't have to go back to the server. This is a very real, tangible advantage.

Con: It does mean that if the same member netgroup appears twice in the nest, we will return additional copies. This is allowable by the standard and is more of a configuration bug than anything else, so I don't think it necessarily makes a lot of sense to try optimizing it away at this point.

Trying to graph it (netgroupA has nested members netgroupB and netgroupC; netgroupB has nested member netgroupD; netgroupC has nested member netgroupD):

    netgroupA
    |- netgroupB
    |- netgroupC

    netgroupB
    |- netgroupD

    netgroupC
    |- netgroupD

In this situation, my result set will ACTUALLY be:

    netgroupA
    |- netgroupB
    |  |- netgroupD
    |- netgroupC
       |- netgroupD

The result set for netgroupA WILL have two copies of netgroupD.

> It should grow when the requests and responses (!) are processed in a nested way. But I do not think that this is possible with the interface we have (at least how it is described).
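The scheme described above can be sketched as a small, self-contained C program. This is not SSSD code: the tracking dhash is modeled as a flat array of names, the sysdb/LDAP lookup is a static table, and all identifiers are illustrative. It shows the three behaviors discussed in the thread: loop detection, a nesting limit based on the size of the tracking set, and the duplicate copy of netgroupD in the unrolled result.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_NESTING 10

/* Toy membership table standing in for the sysdb/LDAP lookup:
 * each netgroup lists its nested member netgroups. */
struct netgroup { const char *name; const char *members[4]; };

static const struct netgroup table[] = {
    { "netgroupA", { "netgroupB", "netgroupC", NULL } },
    { "netgroupB", { "netgroupD", NULL } },
    { "netgroupC", { "netgroupD", NULL } },
    { "netgroupD", { NULL } },
    /* A deliberate loop for testing: X -> Y -> X */
    { "netgroupX", { "netgroupY", NULL } },
    { "netgroupY", { "netgroupX", NULL } },
};

static const struct netgroup *lookup(const char *name)
{
    for (size_t i = 0; i < sizeof(table) / sizeof(table[0]); i++)
        if (strcmp(table[i].name, name) == 0) return &table[i];
    return NULL;
}

/* The "tracking hash" modeled as the set of names currently on the
 * recursion path; in_set() stands in for a dhash key lookup. */
struct tracker { const char *names[MAX_NESTING]; int count; };

static int in_set(const struct tracker *t, const char *name)
{
    for (int i = 0; i < t->count; i++)
        if (strcmp(t->names[i], name) == 0) return 1;
    return 0;
}

/* Recursively expand 'name', appending every visited netgroup to
 * 'result'. Returns 0 on success, -1 on a loop or nesting overflow. */
static int expand(const char *name, struct tracker *t,
                  const char *result[], int *nresult)
{
    if (in_set(t, name)) return -1;          /* loop detected */
    if (t->count >= MAX_NESTING) return -1;  /* nesting limit */

    const struct netgroup *ng = lookup(name);
    if (ng == NULL) return -1;

    t->names[t->count++] = name;             /* recurse down: add */
    result[(*nresult)++] = name;

    for (int i = 0; ng->members[i] != NULL; i++)
        if (expand(ng->members[i], t, result, nresult) != 0) {
            t->count--;
            return -1;
        }

    t->count--;                              /* recurse up: remove */
    return 0;
}
```

Because the name is removed on the way back up, netgroupD is visited once under netgroupB and again under netgroupC, which is exactly the duplicate-copy behavior described as acceptable above, while the X/Y cycle aborts cleanly.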
> If the request for a netgroup is sent and then the response is received, and we are processing a response and find that the netgroup has nested netgroups, what do we do?

You misunderstood. We're doing the nesting internally and only responding once all the recursive calls are complete. So when the callback invokes setnetgrent_recv(), it's going to receive a libcollection object that is 100% complete.

> Am I missing something?

Yes, see above :)

> Also there is no design for int innetgr(const char *netgroup, const char *host, const char *user, const char *domain); Is this intentional or just an omission?

It's intentional. It would be really nice if there was actually an interface for this, but unfortunately libc internally wraps this by calling setnetgrent(), looping through getnetgrent(), then endnetgrent(), and then manually searching the result list for the 3-tuple specified. It would be much more efficient if we could handle it internally, but it's a
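The libc behavior described above (enumerate the netgroup and search for the 3-tuple) can be sketched like this. This is a hedged illustration, not glibc source: the enumeration calls are mocked over a static table (the real code would use libc's setnetgrent(), getnetgrent(), and endnetgrent()), and my_innetgr() and the mock_* names are ours.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Mocked enumeration state: a static triple list standing in for what
 * setnetgrent()/getnetgrent() would stream out of libc. */
struct triple { const char *host, *user, *domain; };

static const struct triple mock_entries[] = {
    { "alpha.example.com", "alice", "example.com" },
    { "beta.example.com",  "bob",   "example.com" },
};
static size_t mock_pos;

static void mock_setnetgrent(const char *netgroup) { (void)netgroup; mock_pos = 0; }

static int mock_getnetgrent(const char **h, const char **u, const char **d)
{
    if (mock_pos >= sizeof(mock_entries) / sizeof(mock_entries[0])) return 0;
    *h = mock_entries[mock_pos].host;
    *u = mock_entries[mock_pos].user;
    *d = mock_entries[mock_pos].domain;
    mock_pos++;
    return 1;
}

static void mock_endnetgrent(void) { mock_pos = 0; }

/* NULL in the query acts as a wildcard, as with the real innetgr(). */
static int field_matches(const char *query, const char *entry)
{
    return query == NULL || (entry != NULL && strcmp(query, entry) == 0);
}

static int my_innetgr(const char *netgroup, const char *host,
                      const char *user, const char *domain)
{
    const char *h, *u, *d;
    int found = 0;

    mock_setnetgrent(netgroup);       /* begin enumeration */
    while (mock_getnetgrent(&h, &u, &d)) {
        if (field_matches(host, h) && field_matches(user, u) &&
            field_matches(domain, d)) {
            found = 1;                /* the 3-tuple was located */
            break;
        }
    }
    mock_endnetgrent();               /* release enumeration state */
    return found;
}
```

The full-enumeration loop is what makes this expensive: every innetgr() call unrolls the whole (possibly nested) netgroup, which is why a native backend-side membership test would be preferable.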
Re: [SSSD] Netgroups in SSSD
On 09/13/2010 08:13 AM, Stephen Gallagher wrote:
> The idea behind the hash is actually to have it double as a reference count and a loop-detection mechanism. It doesn't need to be a hash (it could just as easily be a b-tree), but since we already have an efficient hash available, I was just going to use that. The idea is that every time we recurse down a level, we will add the name of that netgroup to the hash. Before recursing down again, we'll make sure that the new name is not a key in the hash. If it is, we know we've hit a loop and should break processing. When we recurse up a level, we need to remove this entry from the tracking hash so that it's possible to recurse down into it again in a different branch of the tree. There is a pro and a con to this approach.

I forgot to mention that the other use of the hash is that it provides an easy way to handle nesting limits. A dhash table provides an interface for requesting the number of entries in the hash. If one more subrequest is going to exceed the maximum nesting level, then we can cancel the request. So using a tracking dhash here makes the nesting limit and loop detection very easy.

-- 
Stephen Gallagher
RHCE 804006346421761

Delivering value year after year. Red Hat ranks #1 in value among software vendors. http://www.redhat.com/promo/vendor/

___
sssd-devel mailing list
sssd-devel@lists.fedorahosted.org
https://fedorahosted.org/mailman/listinfo/sssd-devel
Re: [SSSD] Netgroups in SSSD
Stephen Gallagher wrote:
> On 09/10/2010 09:42 AM, Dmitri Pal wrote:
>> Stephen Gallagher wrote:
>>> On 09/08/2010 09:04 AM, Stephen Gallagher wrote:
>>>> I've also been thinking about how we're going to handle processing the nested groups, and I think what I'm going to do is take advantage of some of the nicer features of libcollection. Internal processing of setnetgrent_send() will recursively call out a subrequest to setnetgrent_send() again for each of the named nested groups. The setnetgrent_recv() function will return a libcollection object containing all of the results from that request (as well as any additional subrequests called). When the results come back up, they can be added together trivially using the col_add_collection_to_collection() interface with the col_add_mode_clone mode.
>>>
>>> I've added some additional details about how I would like to do the nesting and loop detection to the wiki page. Comments welcome.
>>
>> Sorry, I am having trouble understanding this algorithm, but maybe it is because I do not understand the tevent_req interface to the level needed here. AFAIU the struct tevent_req *setnetgrent_send(char *netgroupname, hash_table_t *nesting) call will ask for the netgroups, while another call, errno_t setnetgrent_recv(struct tevent_req *req, struct collection **entries), is the call that will be executed when the response from the server is received.
>
> No, this is where you are mistaken. The call that will be executed when the request is finished is specified by the caller right after invoking setnetgrent_send(), using the tevent_req_set_callback() function. That callback must then invoke setnetgrent_recv() in order to read out the final result data that is available.

If I read it right, you are saying that the callback is invoked first and it in turn invokes setnetgrent_recv() from itself. If this is the case then my statement in general is true.
And the problem I see is that you can't assume that the order of the responses from the server will be the same as the order of the requests. What if the request got rejected due to a bad connection but the next request was satisfied? Or there was a glitch in a router or something. It does not matter. My point is that generally it is possible that the responses will come in reverse order, so relying on the order of the responses in the algorithm is a mistake.

>> The problems I have are with item 4). If it removes the netgroup from the hash, how does the hash ever grow?
>
> You're confusing the hash with the result set (which will be a libcollection object).
>
> I'm thinking about changing the way I do the nested invocation so that the toplevel hides the need for the hash. The idea behind the hash is actually to have it double as a reference count and a loop-detection mechanism. It doesn't need to be a hash (it could just as easily be a b-tree), but since we already have an efficient hash available, I was just going to use that.

It does not make a difference what actual implementation (hash, b-tree, collection) you are going to use for it. It is a set, and a check against a set needs to be made.

> The idea is that every time we recurse down a level, we will add the name of that netgroup to the hash. Before recursing down again, we'll make sure that the new name is not a key in the hash. If it is, we know we've hit a loop and should break processing.

I agree with this part 100%. I was concerned about the moment you plan to remove an item from the set.

> When we recurse up a level, we need to remove this entry from the tracking hash so that it's possible to recurse down into it again in a different branch of the tree. There is a pro and a con to this approach.

I see what you are trying to accomplish, but since you can't rely on the order of the responses, you can't be sure if this is a response for a nested group or a group on a different branch. We need to think of a better safeguard here.
Thanks
Dmitri
Re: [SSSD] Netgroups in SSSD
Stephen Gallagher wrote:
> On 09/13/2010 08:13 AM, Stephen Gallagher wrote:
>> The idea behind the hash is actually to have it double as a reference count and a loop-detection mechanism. It doesn't need to be a hash (it could just as easily be a b-tree), but since we already have an efficient hash available, I was just going to use that. The idea is that every time we recurse down a level, we will add the name of that netgroup to the hash. Before recursing down again, we'll make sure that the new name is not a key in the hash. If it is, we know we've hit a loop and should break processing. When we recurse up a level, we need to remove this entry from the tracking hash so that it's possible to recurse down into it again in a different branch of the tree. There is a pro and a con to this approach.
>
> I forgot to mention that the other use of the hash is that it provides an easy way to handle nesting limits. A dhash table provides an interface for requesting the number of entries in the hash. If one more subrequest is going to exceed the maximum nesting level, then we can cancel the request. So using a tracking dhash here makes the nesting limit and loop detection very easy.

Yes, I agree with this. A hash is definitely the best object for the task. It is just that the algorithm has a flaw I pointed out in the other mail.

-- 
Thank you,
Dmitri Pal
Engineering Manager IPA project, Red Hat Inc.

Looking to carve out IT costs? www.redhat.com/carveoutcosts/
Re: [SSSD] Netgroups in SSSD
On 09/13/2010 12:07 PM, Dmitri Pal wrote:
> If I read it right, you are saying that the callback is invoked first and it in turn invokes setnetgrent_recv() from itself. If this is the case then my statement in general is true. And the problem I see is that you can't assume that the order of the responses from the server will be the same as the order of the requests. What if the request got rejected due to a bad connection but the next request was satisfied? Or there was a glitch in a router or something. It does not matter. My point is that generally it is possible that the responses will come in reverse order, so relying on the order of the responses in the algorithm is a mistake.

No, it's not possible. The way requests with subrequests work is that the toplevel request will NEVER return until all of its subrequests have been processed (or an error occurs, which will cancel the subrequests). I'll try to explain in pseudocode a little better:

    setnetgrent()
    {
        req = setnetgrent_send(data);
        tevent_req_set_callback(req, setnetgrent_done);
    }

    setnetgrent_send()
    {
        subreq = setnetgrent_internal_send(data, nesting=0);
        tevent_req_set_callback(subreq, setnetgrent_internal_done);
    }

The toplevel will NEVER call setnetgrent_done() until any and all individual setnetgrent_internal_send/done pairs have completed. That's a high-level view. In a real implementation, the setnetgrent_internal_done() call would call a setnetgrent_internal_step() function whose purpose would be to call setnetgrent_internal_send() again in series, so that we process all the lookups. We don't parallelize the individual netgroup lookups. They're always done in a serial manner (though asynchronously, so we don't block if they have to go to LDAP).

>>> The problems I have are with item 4). If it removes the netgroup from the hash, how does the hash ever grow?
>>
>> You're confusing the hash with the result set (which will be a libcollection object).
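The ordering guarantee in the pseudocode above can be modeled with a toy, self-contained version of the send/done/step chain. None of this is the real tevent API; the names mirror the pseudocode, the "async" completion is synchronous here, and the completion log exists only to show that the toplevel callback always fires last.

```c
#include <assert.h>
#include <string.h>

#define MAX_LOG 16
static const char *completion_log[MAX_LOG];
static int log_len;

static void log_done(const char *what) { completion_log[log_len++] = what; }

/* A toy toplevel request: one internal subrequest per netgroup,
 * issued strictly in series. */
struct toplevel_req {
    const char *groups[8];                    /* netgroups still to look up */
    int next;
    int ngroups;
    void (*callback)(struct toplevel_req *req); /* fires after the LAST one */
};

static void internal_send(struct toplevel_req *req);

static void internal_done(struct toplevel_req *req)
{
    log_done(req->groups[req->next]);
    req->next++;
    if (req->next < req->ngroups)
        internal_send(req);   /* the "step": next lookup, in series */
    else
        req->callback(req);   /* all subrequests finished */
}

static void internal_send(struct toplevel_req *req)
{
    /* In real code the lookup of req->groups[req->next] would complete
     * asynchronously; here we complete it immediately. */
    internal_done(req);
}

static void toplevel_done(struct toplevel_req *req)
{
    (void)req;
    log_done("toplevel");
}

static void setnetgrent_model(void)
{
    struct toplevel_req req = {
        .groups  = { "netgroupB", "netgroupC", "netgroupD" },
        .next    = 0,
        .ngroups = 3,
        .callback = toplevel_done,
    };
    internal_send(&req);
}
```

Because each lookup is started only from the previous lookup's done handler, response reordering cannot occur: the completion order is fixed by construction, which is the point Stephen makes above.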
>> I'm thinking about changing the way I do the nested invocation so that the toplevel hides the need for the hash. The idea behind the hash is actually to have it double as a reference count and a loop-detection mechanism. It doesn't need to be a hash (it could just as easily be a b-tree), but since we already have an efficient hash available, I was just going to use that.
>
> It does not make a difference what actual implementation (hash, b-tree, collection) you are going to use for it. It is a set, and a check against a set needs to be made.

Yes, and also the dhash has a built-in function that can report the total number of keys available, which I'll use to identify the nesting level.

>> The idea is that every time we recurse down a level, we will add the name of that netgroup to the hash. Before recursing down again, we'll make sure that the new name is not a key in the hash. If it is, we know we've hit a loop and should break processing.
>
> I agree with this part 100%. I was concerned about the moment you plan to remove an item from the set.

I have to remove it from the set when we traverse down a separate section of the nesting tree because (as an optimization) we're saving the results of the nested netgroups in their entirety into the cache (since we're already processing them). So if we call netgroupA, which has netgroupB and netgroupC as members, each of which ALSO has netgroupD as a member, then we have complete copies of netgroupB and netgroupC available in the lookup cache already, in case a request is made directly against them. It DOES result in duplicate data when reporting netgroupA (which is acceptable), but it means that we have complete data to cache for the other netgroups.

>> When we recurse up a level, we need to remove this entry from the tracking hash so that it's possible to recurse down into it again in a different branch of the tree. There is a pro and a con to this approach.
> I see what you are trying to accomplish, but since you can't rely on the order of the responses, you can't be sure if this is a response for a nested group or a group on a different branch. We need to think of a better safeguard here.

See above. We can always guarantee the order of the responses.

-- 
Stephen Gallagher
RHCE 804006346421761
Re: [SSSD] Netgroups in SSSD
Stephen Gallagher wrote:
> On 09/08/2010 09:04 AM, Stephen Gallagher wrote:
>> I've also been thinking about how we're going to handle processing the nested groups, and I think what I'm going to do is take advantage of some of the nicer features of libcollection. Internal processing of setnetgrent_send() will recursively call out a subrequest to setnetgrent_send() again for each of the named nested groups. The setnetgrent_recv() function will return a libcollection object containing all of the results from that request (as well as any additional subrequests called). When the results come back up, they can be added together trivially using the col_add_collection_to_collection() interface with the col_add_mode_clone mode.
>
> I've added some additional details about how I would like to do the nesting and loop detection to the wiki page. Comments welcome.

Sorry, I am having trouble understanding this algorithm, but maybe it is because I do not understand the tevent_req interface to the level needed here. AFAIU the struct tevent_req *setnetgrent_send(char *netgroupname, hash_table_t *nesting) call will ask for the netgroups, while another call, errno_t setnetgrent_recv(struct tevent_req *req, struct collection **entries), is the call that will be executed when the response from the server is received.

The problems I have are with item 4). If it removes the netgroup from the hash, how does the hash ever grow? It should grow when the requests and responses (!) are processed in a nested way. But I do not think that this is possible with the interface we have (at least how it is described). If the request for a netgroup is sent and then the response is received, and we are processing a response and find that the netgroup has nested netgroups, what do we do? Issue another request for the nested group? Fine, but then we continue with the processing of the parent group and would delete it from the hash table before the result for the nested group gets back.
My point is that the hash table should probably be created per setnetgrent call and cleaned when the fetching of all nested netgroups is complete. It should never be cleaned in the middle. Am I missing something?

Also there is no design for int innetgr(const char *netgroup, const char *host, const char *user, const char *domain); Is this intentional or just an omission?

Also, I think we should have the following optimization: each fetched netgroup goes into the cache with a timestamp. If the expiration is, say, 30 sec and there is a netgroup C nested into two independent netgroups A and B, and A is fetched and then B is fetched before the expiration timeout of C, then netgroup C should be taken from the cache rather than refetched.

Another thing I just realized is that you create a flat result set collection by appending nested groups, rather than creating a collection with a tree structure and iterating it as a flat collection. While your approach is probably the right one, I wanted to draw attention to the fact that the option of having a tree-style collection with nested referenced (or copied) subcollections, and then traversing the tree as if it were a flat collection, is also available. I do not know if you looked at such a structure and whether it would help if we need to do some optimization (now or later). Just something to consider.

Thanks
Dmitri
Re: [SSSD] Netgroups in SSSD
On 09/08/2010 09:04 AM, Stephen Gallagher wrote:
> I've also been thinking about how we're going to handle processing the nested groups, and I think what I'm going to do is take advantage of some of the nicer features of libcollection. Internal processing of setnetgrent_send() will recursively call out a subrequest to setnetgrent_send() again for each of the named nested groups. The setnetgrent_recv() function will return a libcollection object containing all of the results from that request (as well as any additional subrequests called). When the results come back up, they can be added together trivially using the col_add_collection_to_collection() interface with the col_add_mode_clone mode.

I've added some additional details about how I would like to do the nesting and loop detection to the wiki page. Comments welcome.

-- 
Stephen Gallagher
RHCE 804006346421761
Re: [SSSD] Netgroups in SSSD
On 09/08/2010 02:12 AM, Stephen Gallagher wrote:
> I have written up a problem statement and a brief overview of my plans regarding netgroup support in the SSSD. Please read through and comment as needed: https://fedorahosted.org/sssd/wiki/DesignDocs/Netgroups My goal is to have this work completed before the end of September.

I asked my question as another bullet point on the wiki, but Stephen reminded me there's probably a wider audience on this list. This was the question and answer:

Q: Maybe this is too low-level at this time, but is a cleanup task planned?

A: Netgroups should be handled in the same way that users and groups are handled, so I will probably have to extend the existing cleanup task to also address the netgroup entries in the cache.
Re: [SSSD] Netgroups in SSSD
On 09/08/2010 08:26 AM, Jakub Hrozek wrote:
> On 09/08/2010 02:12 AM, Stephen Gallagher wrote:
>> I have written up a problem statement and a brief overview of my plans regarding netgroup support in the SSSD. Please read through and comment as needed: https://fedorahosted.org/sssd/wiki/DesignDocs/Netgroups My goal is to have this work completed before the end of September.
>
> I asked my question as another bullet point on the wiki, but Stephen reminded me there's probably a wider audience on this list. This was the question and answer:
>
> Q: Maybe this is too low-level at this time, but is a cleanup task planned?
>
> A: Netgroups should be handled in the same way that users and groups are handled, so I will probably have to extend the existing cleanup task to also address the netgroup entries in the cache.

I've also been thinking about how we're going to handle processing the nested groups, and I think what I'm going to do is take advantage of some of the nicer features of libcollection. Internal processing of setnetgrent_send() will recursively call out a subrequest to setnetgrent_send() again for each of the named nested groups. The setnetgrent_recv() function will return a libcollection object containing all of the results from that request (as well as any additional subrequests called). When the results come back up, they can be added together trivially using the col_add_collection_to_collection() interface with the col_add_mode_clone mode.

-- 
Stephen Gallagher
RHCE 804006346421761
Re: [SSSD] Netgroups in SSSD
On 09/08/2010 12:00 PM, Dmitri Pal wrote:
> Stephen Gallagher wrote:
>> On 09/08/2010 08:26 AM, Jakub Hrozek wrote:
>>> On 09/08/2010 02:12 AM, Stephen Gallagher wrote:
>>>> I have written up a problem statement and a brief overview of my plans regarding netgroup support in the SSSD. Please read through and comment as needed: https://fedorahosted.org/sssd/wiki/DesignDocs/Netgroups My goal is to have this work completed before the end of September.
>>>
>>> I asked my question as another bullet point on the wiki, but Stephen reminded me there's probably a wider audience on this list. This was the question and answer:
>>>
>>> Q: Maybe this is too low-level at this time, but is a cleanup task planned?
>>>
>>> A: Netgroups should be handled in the same way that users and groups are handled, so I will probably have to extend the existing cleanup task to also address the netgroup entries in the cache.
>>
>> I've also been thinking about how we're going to handle processing the nested groups, and I think what I'm going to do is take advantage of some of the nicer features of libcollection. Internal processing of setnetgrent_send() will recursively call out a subrequest to setnetgrent_send() again for each of the named nested groups. The setnetgrent_recv() function will return a libcollection object containing all of the results from that request (as well as any additional subrequests called). When the results come back up, they can be added together trivially using the col_add_collection_to_collection() interface with the col_add_mode_clone mode.
>
> I think this would work. I started thinking that potentially there should be a way to not clone the subcollections but rather use references to nested groups. I was thinking that if you keep a hash table with the names of netgroups as keys and pointers (or structures with one member being a pointer) as values, you would be able to construct collections from the cache without copying data, but rather by reference.
> However, this seems to be an optimization that might not be worth it, at least in the first implementation. So let us start with what you propose and see if it scales. If not, we will see how it can be improved.

I thought about this approach originally, but the problem is that each of the netgroup result objects has its own lifetime. For example, if the lifetime is 30 seconds, and we do a lookup for netgroup1 at t=0, then do a lookup for netgroup2 at t=29, where netgroup2 has netgroup1 as a nested member, then the reference for that first netgroup is only going to be valid for one more second. There are ways we can force updates on read, but this is very expensive computationally. I'd rather take a memory hit by copying the data to guarantee that it remains intact than add the complexity of trying to manage references with different lifespans. So in this case, if netgroup2 is looked up at t=29, its internal copy of netgroup1 is still valid until t=59, when netgroup2 expires.

-- 
Stephen Gallagher
RHCE 804006346421761
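The copy-versus-reference lifetime argument above can be shown in miniature. This is an illustrative sketch, not SSSD code: each cached object carries only a fetch timestamp, an embedded copy of a member inherits the parent's timestamp, and the 30-second lifetime and all names are ours.

```c
#include <assert.h>

#define LIFETIME 30   /* illustrative cache lifetime in seconds */

/* A cache entry is represented only by its fetch timestamp. */
struct entry { long fetched; };

/* An entry is valid while less than LIFETIME seconds have elapsed
 * since it was fetched. */
static int is_valid(const struct entry *e, long now)
{
    return (now - e->fetched) < LIFETIME;
}
```

A shared reference to netgroup1 (fetched at t=0) dies at t=30 regardless of who holds it, while a copy embedded in netgroup2 (fetched at t=29) stays valid until netgroup2 itself expires at t=59: exactly the trade-off Stephen describes.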
Re: [SSSD] Netgroups in SSSD
Stephen Gallagher wrote:
> On 09/08/2010 12:00 PM, Dmitri Pal wrote:
>> Stephen Gallagher wrote:
>>> On 09/08/2010 08:26 AM, Jakub Hrozek wrote:
>>>> On 09/08/2010 02:12 AM, Stephen Gallagher wrote:
>>>>> I have written up a problem statement and a brief overview of my plans regarding netgroup support in the SSSD. Please read through and comment as needed: https://fedorahosted.org/sssd/wiki/DesignDocs/Netgroups My goal is to have this work completed before the end of September.
>>>>
>>>> I asked my question as another bullet point on the wiki, but Stephen reminded me there's probably a wider audience on this list. This was the question and answer:
>>>>
>>>> Q: Maybe this is too low-level at this time, but is a cleanup task planned?
>>>>
>>>> A: Netgroups should be handled in the same way that users and groups are handled, so I will probably have to extend the existing cleanup task to also address the netgroup entries in the cache.
>>>
>>> I've also been thinking about how we're going to handle processing the nested groups, and I think what I'm going to do is take advantage of some of the nicer features of libcollection. Internal processing of setnetgrent_send() will recursively call out a subrequest to setnetgrent_send() again for each of the named nested groups. The setnetgrent_recv() function will return a libcollection object containing all of the results from that request (as well as any additional subrequests called). When the results come back up, they can be added together trivially using the col_add_collection_to_collection() interface with the col_add_mode_clone mode.
>>
>> I think this would work. I started thinking that potentially there should be a way to not clone the subcollections but rather use references to nested groups. I was thinking that if you keep a hash table with the names of netgroups as keys and pointers (or structures with one member being a pointer) as values, you would be able to construct collections from the cache without copying data, but rather by reference.
>> However, this seems to be an optimization that might not be worth it, at least in the first implementation. So let us start with what you propose and see if it scales. If not, we will see how it can be improved.
>
> I thought about this approach originally, but the problem is that each of the netgroup result objects has its own lifetime. For example, if the lifetime is 30 seconds, and we do a lookup for netgroup1 at t=0, then do a lookup for netgroup2 at t=29, where netgroup2 has netgroup1 as a nested member, then the reference for that first netgroup is only going to be valid for one more second. There are ways we can force updates on read, but this is very expensive computationally. I'd rather take a memory hit by copying the data to guarantee that it remains intact than add the complexity of trying to manage references with different lifespans. So in this case, if netgroup2 is looked up at t=29, its internal copy of netgroup1 is still valid until t=59, when netgroup2 expires.

Yes, I agree, though I feel there is a way to do it nicely. I just can't visualize it. But...

So what will you do if you have two netgroups, A and B, where B is a member of A? The caller fetched B and then, a second before B expires, fetched A. Are you going to refetch B or just copy it? If you refetch, you effectively refetch each time; if you copy, you just extended its life.

So I was thinking that the hash I described would keep a structure. One member is a pointer. Another member is the timestamp of the original fetch, and another is a delta. Then we would have two config values: max cache lifetime and min cache lifetime. The max cache lifetime controls whether the netgroup needs to be unconditionally refetched; effectively it is the expiration time. The min cache lifetime is the expiration time of a netgroup entry that was not touched.

Here is the example:

    min = 30 sec
    max = 90 sec

B is fetched at t = 0. A is fetched at t = 20. B is not refetched; instead its life is extended by saving delta = 20.
There is another netgroup C that nests B that is fetched at t = X. If (X - B_fetch_timestamp) > max or (X - B_fetch_timestamp - B_delta) > min, then refetch B; otherwise set B_delta = X - B_fetch_timestamp.

But this does not solve the problem of multiple iterations happening at the same time, since you do not want to reconstruct the collection while an iterator is already defined against it, so you still need to make a copy. However, using this logic you probably would be able to share more and go to the server less. But again, it is an optimization that might not be needed.

___
sssd-devel mailing list
sssd-devel@lists.fedorahosted.org
https://fedorahosted.org/mailman/listinfo/sssd-devel

--
Thank you,
Dmitri Pal
Engineering Manager IPA project, Red Hat Inc.
Looking to carve out IT costs? www.redhat.com/carveoutcosts
Re: [SSSD] Netgroups in SSSD
Stephen Gallagher wrote: On 09/08/2010 02:07 PM, Dmitri Pal wrote: Yes, I agree, though I feel there is a way to do it nicely. I just can't visualize it. But... So what will you do if you have two netgroups: A and B. B is a member of A. The caller fetched B, and then a second before B expires, fetched A. Are you going to refetch B or just copy it?

Copy it.

If you refetch, you effectively refetch each time; if you copy, you just extended its life.

Yes, you extend the life of the instance of B within A, not B itself. What this means is that if netgroup C comes in five seconds later which ALSO has B as a member, it will fetch a new copy of B into the memory cache.

So I was thinking that the hash I described would keep a structure. One member is a pointer. Another member is the timestamp of the original fetch, and another is a delta. Then we will have two config values: max cache lifetime and min cache lifetime. Max cache lifetime will control whether the netgroup needs to be unconditionally refetched - effectively it is the expiration time. Min cache lifetime is the expiration time of a netgroup entry that was not touched. Here is the example: min = 30 sec, max = 90 sec. B is fetched at t = 0. A is fetched at t = 20. B is not refetched. Instead its life is extended by saving delta = 20.

The problem with this approach is that it can result in B never being refetched, if A happens to be refreshed, for example, every 29 seconds.

This is not the case. If you look at the if statement below you will see that it is always refreshed after the max interval, but this is a moot point and I generally agree with your approach.

By copying the data and leaving B alone to refresh itself as needed, we guarantee that the worst-case situation is that a sub-entry within an unrolled netgroup will stick around twice as long as the timeout, but that primary entries will always be refreshed as expected.
There is another netgroup C that nests B that is fetched at t = X. If (X - B_fetch_timestamp) > max or (X - B_fetch_timestamp - B_delta) > min, then refetch B; otherwise set B_delta = X - B_fetch_timestamp. But this does not solve the problem of multiple iterations happening at the same time, since you do not want to reconstruct the collection while an iterator is already defined against it, so you still need to make a copy. However, using this logic you probably would be able to share more and go to the server less. But again, it is an optimization that might not be needed.
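Dmitri's min/max refetch rule quoted above can be sketched as a small predicate (names and types invented for illustration; this is not SSSD code):

```c
#include <stdbool.h>
#include <time.h>

/* Refetch when the entry's absolute age exceeds max_life, or when the
 * time since it was last touched (age minus the accumulated delta)
 * exceeds min_life.  Otherwise the caller extends the entry's life by
 * setting delta = now - fetched. */
static bool needs_refetch(time_t now, time_t fetched, time_t delta,
                          time_t min_life, time_t max_life)
{
    return (now - fetched) > max_life ||
           (now - fetched - delta) > min_life;
}
```

Walking through the thread's numbers (min = 30, max = 90, B fetched at t = 0): a reference at t = 20 does not trigger a refetch and sets delta = 20; a further reference at t = 45 is still within min of the last touch (45 - 20 = 25), so B's life is extended again; but once the absolute age passes max = 90, B is refetched unconditionally, which is why B cannot live forever even under constant touching.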
[SSSD] Netgroups in SSSD
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1

I have written up a problem statement and a brief overview of my plans regarding netgroup support in the SSSD. Please read through and comment as needed: https://fedorahosted.org/sssd/wiki/DesignDocs/Netgroups

My goal is to have this work completed before the end of September.

- --
Stephen Gallagher
RHCE 804006346421761
Delivering value year after year. Red Hat ranks #1 in value among software vendors. http://www.redhat.com/promo/vendor/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org/

iEYEARECAAYFAkyG1OEACgkQeiVVYja6o6PIwQCgh62juMYdDpeGcXaYoGCf5/WG
mC4An0jwqIf+C22owuI4gE8M6CiPLHnZ
=iN+7
-END PGP SIGNATURE-