Re: [SSSD] Netgroups in SSSD

2010-09-13 Thread Stephen Gallagher
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 09/10/2010 09:42 AM, Dmitri Pal wrote:
 Stephen Gallagher wrote:
 On 09/08/2010 09:04 AM, Stephen Gallagher wrote:
 I've also been thinking about how we're going to handle processing the
 nested groups, and I think what I'm going to do is take advantage of
 some of the nicer features of libcollection.

 Internal processing of setnetgrent_send() will recursively call out a
 subrequest to setnetgrent_send() again for each of the named nested
 groups. The setnetgrent_recv() function will return a libcollection
 object containing all of the results from that request (as well as any
 additional subrequests called). When the results come back up, they can
 be added together trivially using the col_add_collection_to_collection()
 interface with the col_add_mode_clone mode.


 I've added some additional details about how I would like to do the
 nesting and loop detection to the wiki page. Comments welcome.

 
 Sorry I am having trouble understanding this algorithm.
 But maybe it is because I do not understand the tevent_req interface to
 the level needed here.
 AFAIU the
 struct tevent_req *setnetgrent_send(char *netgroupname, hash_table_t
 *nesting)
 call will ask for the netgroups, while another call
 errno_t setnetgrent_recv(tevent_req *req, struct collection **entries)
 is the call that will be executed when the response from the server is
 received.

No, this is where you are mistaken. The call that will be executed when
the request is finished is specified by the caller right after invoking
setnetgrent_send(), by using the tevent_req_set_callback() function.

This callback must then invoke setnetgrent_recv() in order to read out
the final result data that is available.

 
 The problems I have are with item 4).
 If it removes the netgroup from the hash, how does the hash ever grow?

You're confusing the hash with the result set (which will be a
libcollection object). I'm thinking about changing the way I do the
nested invocation so that the toplevel hides the need for the hash.

The idea behind the hash is actually to have it double as a reference
count and a loop-detection mechanism. It doesn't need to be a hash (it
could just as easily be a b-tree), but since we already have an
efficient hash available, I was just going to use that.

The idea is that for every time we recurse down a level, we will add the
name of that netgroup to the hash. Before recursing down again, we'll
make sure that the new name is not a key in the hash. If it is, we know
we've hit a loop and should break processing.

When we recurse up a level, we need to remove this entry from the
tracking hash so that it's possible to recurse down into it again in a
different branch of the tree. There is a pro and con to this approach.

Pro: We can store the complete result sets of all of the member
netgroups individually, so if they're requested directly or indirectly
again later, we don't have to go back to the server. This is a very
real, tangible advantage.

Con: It does mean that if the same member netgroup appears twice in the
nest that we will return additional copies. This is allowable by the
standard and is more of a configuration bug than anything else, so I
don't think it necessarily makes a lot of sense to try optimizing it
away at this point.


Trying to graph it:

(netgroupA has nested members netgroupB and netgroupC)
netgroupA
|
+- netgroupB
|
+- netgroupC

(netgroupB has nested member netgroupD)
netgroupB
|
+- netgroupD

(netgroupC has nested member netgroupD)
netgroupC
|
+- netgroupD



In this situation, my result set will ACTUALLY be:

netgroupA
|
+- netgroupB
|  |
|  +- netgroupD
|
+- netgroupC
   |
   +- netgroupD


The result set for netgroupA WILL have two copies of netgroupD.
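The behavior described above (add the name on the way down, remove it on the way up, and allow duplicates across branches) can be modeled in a short Python sketch. This is an illustration only, not SSSD code; the real implementation uses a dhash table and libcollection objects, and the `members` mapping here is a stand-in for the cache lookup:

```python
def flatten(name, members, tracking=None, result=None):
    """Recursively flatten a netgroup, using a tracking set for
    loop detection. `members` maps a netgroup name to the list of
    netgroups nested inside it."""
    if tracking is None:
        tracking = set()
    if result is None:
        result = []
    if name in tracking:
        return result        # loop detected: break this branch
    tracking.add(name)       # recurse down: record the name
    result.append(name)
    for nested in members.get(name, []):
        flatten(nested, members, tracking, result)
    tracking.discard(name)   # recurse up: other branches may revisit it
    return result

# netgroupA nests B and C; both B and C nest D
members = {"A": ["B", "C"], "B": ["D"], "C": ["D"]}
print(flatten("A", members))  # ['A', 'B', 'D', 'C', 'D'], D appears twice
```

Note that a true loop (X nests Y, Y nests X) simply stops expanding rather than recursing forever, which is the "break processing" case above.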

 It should grow when the requests and responses (!) are processed in a
 nested way.
 But I do not think that this is possible with the interface we have (at
 least as it is described).
 If the request for a netgroup is sent, and then the response is received
 and we are processing it and find that the netgroup has nested
 netgroups, what do we do?

You misunderstood. We're doing the nesting internally and only
responding once all the recursive calls are complete. So when the
callback invokes setnetgrent_recv(), it's going to receive a
libcollection object that is 100% complete.

 Am I missing something?

Yes, see above :)

 
 Also there is no design for the
  int innetgr(const char *netgroup, const char *host,
const char *user, const char *domain);
 
 Is this intentional or just an omission?
 

It's intentional. It would be really nice if there was actually an
interface for this, but unfortunately libc internally wraps this by
calling setnetgrent(), looping through getnetgrent(), then calling
endnetgrent(), and then manually searching the result list for the
3-tuple specified.

It would be much more efficient if we could handle it internally, but
it's a 
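The wrapping that libc does can be sketched as follows. This is a Python model of the iterate-and-compare behavior, not real libc code; the `triples` mapping is a hypothetical stand-in for the flattened netgroup that the setnetgrent()/getnetgrent() loop would produce:

```python
def innetgr(netgroup, host, user, domain, triples):
    """Model of libc's innetgr(): walk the flattened netgroup and
    compare each (host, user, domain) triple. A None argument acts
    as a wildcard matching any value, as in the real call."""
    for h, u, d in triples.get(netgroup, []):  # stands in for the getnetgrent() loop
        if ((host is None or h == host) and
                (user is None or u == user) and
                (domain is None or d == domain)):
            return True
    return False

triples = {"admins": [("host1", "alice", "example.com"),
                      ("host2", "bob", "example.com")]}
print(innetgr("admins", None, "bob", None, triples))  # True
```

The linear scan over the full result set is exactly why handling the membership test inside the responder would be cheaper than letting libc do it.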

Re: [SSSD] Netgroups in SSSD

2010-09-13 Thread Stephen Gallagher

On 09/13/2010 08:13 AM, Stephen Gallagher wrote:

 The idea behind the hash is actually to have it double as a reference
 count and a loop-detection mechanism. It doesn't need to be a hash (it
 could just as easily be a b-tree), but since we already have an
 efficient hash available, I was just going to use that.
 
 The idea is that for every time we recurse down a level, we will add the
 name of that netgroup to the hash. Before recursing down again, we'll
 make sure that the new name is not a key in the hash. If it is, we know
 we've hit a loop and should break processing.
 
 When we recurse up a level, we need to remove this entry from the
 tracking hash so that it's possible to recurse down into it again in a
 different branch of the tree. There is a pro and con to this approach.
 


I forgot to mention that the other use of the hash is that it provides
an easy way to handle nesting limits. A dhash table provides an
interface for requesting the number of entries in the hash. If one more
subrequest is going to exceed the maximum nesting level, then we can
cancel the request.

So using a tracking dhash here just makes the nesting limit and
loop-detection very easy.

- -- 
Stephen Gallagher
RHCE 804006346421761

Delivering value year after year.
Red Hat ranks #1 in value among software vendors.
http://www.redhat.com/promo/vendor/
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org/

iEYEARECAAYFAkyOFnoACgkQeiVVYja6o6OaagCffR1twxELT7PAKYsNRya5RjZW
MFsAn2NdOiN4pnvB0eHp6sRPj8XGdRds
=Hwqr
-END PGP SIGNATURE-
___
sssd-devel mailing list
sssd-devel@lists.fedorahosted.org
https://fedorahosted.org/mailman/listinfo/sssd-devel


Re: [SSSD] Netgroups in SSSD

2010-09-13 Thread Dmitri Pal
Stephen Gallagher wrote:
 On 09/10/2010 09:42 AM, Dmitri Pal wrote:
  Stephen Gallagher wrote:
  On 09/08/2010 09:04 AM, Stephen Gallagher wrote:
  I've also been thinking about how we're going to handle processing the
  nested groups, and I think what I'm going to do is take advantage of
  some of the nicer features of libcollection.
  Internal processing of setnetgrent_send() will recursively call out a
  subrequest to setnetgrent_send() again for each of the named nested
  groups. The setnetgrent_recv() function will return a libcollection
  object containing all of the results from that request (as well as any
  additional subrequests called). When the results come back up,
 they can
  be added together trivially using the
 col_add_collection_to_collection()
  interface with the col_add_mode_clone mode.
 
  I've added some additional details about how I would like to do the
  nesting and loop detection to the wiki page. Comments welcome.
 
  Sorry I am having trouble understanding this algorithm.
  But maybe it is because I do not understand the tevent_req interface to
  the level needed here.
  AFAIU the
  struct tevent_req setnetgrent_send(char *netgroupname, hash_table_t
  *nesting)
  call will ask for the netgroups, while another call
  errno_t setnetgrent_recv(tevent_req *req, struct collection **entries)
  is the call that will be executed when the response from the server is
  received.

 No, this is where you are mistaken. The call that will be executed when
 the request is finished is specified by the caller right after invoking
 setnetgrent_send(), by using the tevent_req_set_callback() function.

 This function that is called must then invoke setnetgrent_recv() in
 order to read out the final result data that is available.


If I read it right you are saying that the callback is invoked first and
it in turn invokes setnetgrent_recv() from itself. If this is the case
then my statement is, in general, true. And the problem I see is that you
can't assume that the order of the responses from the server will be the
same as the order of the requests. What if one request got rejected due
to a bad connection but the next request was satisfied? Or there was a
glitch in a router or something. It does not matter. My point is that,
in general, it is possible that the responses will come in reverse order,
so relying on the order of the responses in the algorithm is a mistake.

  The problems I have are with item 4).
  If it removes the netgroup from the hash, how does the hash ever grow?

 You're confusing the hash with the result set (which will be a
 libcollection object). I'm thinking about changing the way I do the
 nested invocation so that the toplevel hides the need for the hash.

 The idea behind the hash is actually to have it double as a reference
 count and a loop-detection mechanism. It doesn't need to be a hash (it
 could just as easily be a b-tree), but since we already have an
 efficient hash available, I was just going to use that.


It does not make a difference what actual implementation (hash, b-tree,
collection) you are going to use for it. It is a set, and a check against
a set needs to be made.

 The idea is that for every time we recurse down a level, we will add the
 name of that netgroup to the hash. Before recursing down again, we'll
 make sure that the new name is not a key in the hash. If it is, we know
 we've hit a loop and should break processing.
I agree with this part 100%. I was concerned about the moment you plan
to remove the item from the set.



 When we recurse up a level, we need to remove this entry from the
 tracking hash so that it's possible to recurse down into it again in a
 different branch of the tree. There is a pro and con to this approach.
I see what you are trying to accomplish, but since you can't rely on the
order of the responses, you can't be sure if this is a response for a
nested group or a group on a different branch. We need to think of a
better safeguard here.


Thanks
Dmitri


Re: [SSSD] Netgroups in SSSD

2010-09-13 Thread Dmitri Pal
Stephen Gallagher wrote:
 On 09/13/2010 08:13 AM, Stephen Gallagher wrote:

  The idea behind the hash is actually to have it double as a reference
  count and a loop-detection mechanism. It doesn't need to be a hash (it
  could just as easily be a b-tree), but since we already have an
  efficient hash available, I was just going to use that.

  The idea is that for every time we recurse down a level, we will add the
  name of that netgroup to the hash. Before recursing down again, we'll
  make sure that the new name is not a key in the hash. If it is, we know
  we've hit a loop and should break processing.

  When we recurse up a level, we need to remove this entry from the
  tracking hash so that it's possible to recurse down into it again in a
  different branch of the tree. There is a pro and con to this approach.



 I forgot to mention that the other use of the hash is that it provides
 an easy way to handle nesting limits. A dhash table provides an
 interface for requesting the number of entries in the hash. If one more
 subrequest is going to exceed the maximum nesting level, then we can
 cancel the request.

 So using a tracking dhash here just makes the nesting limit and
 loop-detection very easy.


Yes, I agree with this. A hash is definitely the best object for the task.
It is just that the algorithm has a flaw I pointed out in the other mail.


-- 
Thank you,
Dmitri Pal

Engineering Manager IPA project,
Red Hat Inc.


---
Looking to carve out IT costs?
www.redhat.com/carveoutcosts/



Re: [SSSD] Netgroups in SSSD

2010-09-13 Thread Stephen Gallagher

On 09/13/2010 12:07 PM, Dmitri Pal wrote:
 
 If I read it right you are saying that the callback is invoked first and
 it in turn invokes setnetgrent_recv() from itself. If this is the case
 then my statement in general is true. And the problem I see is that you
 can't assume that the order of the responses from the server will be
 same as the order of the requests. What if the request got rejected due
 to a bad connection but next request was satisfied. Or there was a
 glitch in router or something. It does not matter. My point that
 generally it is possible that the responses will come in reverse order
 so relying on the order of the responses in the algorithm is a mistake.
 

No, it's not possible. The way requests with subrequests work is that
the toplevel request will NEVER return until all of its subrequests have
been processed (or an error occurs, which will cancel the subrequests).

I'll try to explain in pseudocode a little better.

setnetgrent() {
    req = setnetgrent_send(data);
    tevent_req_set_callback(req, setnetgrent_done);
}

setnetgrent_send() {
    subreq = setnetgrent_internal_send(data, nesting=0);
    tevent_req_set_callback(subreq, setnetgrent_internal_done);
}

The toplevel will NEVER call setnetgrent_done() until any and all
individual setnetgrent_internal_send/done pairs have completed.


That's a high-level view. In a real implementation, the
setnetgrent_internal_done() callback would call a
setnetgrent_internal_step() function whose purpose would be to call
setnetgrent_internal_send() again in series, so that we process all the
lookups.

We don't parallelize the individual netgroup lookups. They're always
done in a serial manner (though asynchronously, so we don't block if
they have to go to LDAP).
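The send/done/step pattern can be modeled synchronously to show why response order is guaranteed. This is a Python sketch of the control flow only (the names mirror the pseudocode above; the real code is asynchronous tevent-based C, but the ordering property is the same):

```python
def process_serially(names, lookup):
    """Model of the _send/_done/_step pattern: each lookup completes
    before the next one is issued, so results always arrive in request
    order even though the real implementation is asynchronous."""
    results = []
    pending = list(names)

    def step():
        if not pending:
            return                 # all subrequests finished
        name = pending.pop(0)
        done(lookup(name))         # the "subrequest" completes...

    def done(result):
        results.append(result)
        step()                     # ...and only then is the next one issued

    step()
    return results

print(process_serially(["B", "C"], lambda n: n.lower()))  # ['b', 'c']
```

Because a new subrequest is only issued from the completion callback of the previous one, out-of-order responses simply cannot occur at this layer.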

 The problems I have are with item 4).
 If it removes the netgroup from the hash, how does the hash ever grow?

 You're confusing the hash with the result set (which will be a
 libcollection object). I'm thinking about changing the way I do the
 nested invocation so that the toplevel hides the need for the hash.

 The idea behind the hash is actually to have it double as a reference
 count and a loop-detection mechanism. It doesn't need to be a hash (it
 could just as easily be a b-tree), but since we already have an
 efficient hash available, I was just going to use that.

 
 It does not make a difference what actual implementation (hash, b-tree,
 collection you are going to use for it). It is a set and a check against
 a set needs to be made.
 

Yes, and also the dhash has a built-in function that can report the
total number of keys available, which I'll use to identify the nesting
level.

 The idea is that for every time we recurse down a level, we will add the
 name of that netgroup to the hash. Before recursing down again, we'll
 make sure that the new name is not a key in the hash. If it is, we know
 we've hit a loop and should break processing.
 I agree with this part 100%. I was concerned about the moment you plan
 to remove item from the set.
 

I have to remove it from the set when we traverse down a separate
section of the nesting tree because (as an optimization) we're saving
the results of the nested netgroups in their entirety into the cache
(since we're already processing them). So if we call netgroupA, which
has netgroupB and netgroupC as members, each of which ALSO has
netgroupD as a member, then we have complete copies of netgroupB and
netgroupC available in the lookup cache already, in case a request is
made directly against them.

It DOES result in duplicate data when reporting netgroupA (which is
acceptable), but it means that we have complete data to cache for the
other netgroups.
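The side effect described here, populating the cache with every member netgroup's complete result during a single toplevel walk, can be sketched as follows. A plain dict stands in for the SSSD cache; the function name and shape are illustrative:

```python
def resolve(name, members, cache, tracking=None):
    """Resolve a netgroup, caching each member netgroup's complete
    flattened result along the way, so later direct or indirect
    lookups are served from the cache without going to the server."""
    if name in cache:
        return list(cache[name])   # copy, so callers can't mutate the cache
    if tracking is None:
        tracking = set()
    if name in tracking:
        return []                  # loop: this branch contributes nothing
    tracking.add(name)
    result = [name]
    for nested in members.get(name, []):
        result.extend(resolve(nested, members, cache, tracking))
    tracking.discard(name)
    cache[name] = list(result)
    return result

members = {"A": ["B", "C"], "B": ["D"], "C": ["D"]}
cache = {}
resolve("A", members, cache)
print(sorted(cache))   # every member netgroup is now cached
print(cache["B"])      # ['B', 'D'], complete and usable directly
```

Note that when netgroupD is reached the second time (via C), it is already served from the cache, and the flattened result for A still contains two copies of D, matching the "acceptable duplicates" trade-off above.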

 

 When we recurse up a level, we need to remove this entry from the
 tracking hash so that it's possible to recurse down into it again in a
 different branch of the tree. There is a pro and con to this approach.
 I see what you are trying to accomplish but since you can't rely on the
 order of the responses you can't be sure if this is a response for a
 nested group or group on a different branch. We need to think of a
 better safeguard here.
 

See above. We can always guarantee the order of the responses.




Re: [SSSD] Netgroups in SSSD

2010-09-10 Thread Dmitri Pal
Stephen Gallagher wrote:
 On 09/08/2010 09:04 AM, Stephen Gallagher wrote:
  I've also been thinking about how we're going to handle processing the
  nested groups, and I think what I'm going to do is take advantage of
  some of the nicer features of libcollection.

  Internal processing of setnetgrent_send() will recursively call out a
  subrequest to setnetgrent_send() again for each of the named nested
  groups. The setnetgrent_recv() function will return a libcollection
  object containing all of the results from that request (as well as any
  additional subrequests called). When the results come back up, they can
  be added together trivially using the col_add_collection_to_collection()
  interface with the col_add_mode_clone mode.


 I've added some additional details about how I would like to do the
 nesting and loop detection to the wiki page. Comments welcome.


Sorry I am having trouble understanding this algorithm.
But maybe it is because I do not understand the tevent_req interface to
the level needed here.
AFAIU the
struct tevent_req *setnetgrent_send(char *netgroupname, hash_table_t
*nesting)
call will ask for the netgroups, while another call
errno_t setnetgrent_recv(tevent_req *req, struct collection **entries)
is the call that will be executed when the response from the server is
received.

The problems I have are with item 4).
If it removes the netgroup from the hash, how does the hash ever grow?
It should grow when the requests and responses (!) are processed in a
nested way.
But I do not think that this is possible with the interface we have (at
least as it is described).
If the request for a netgroup is sent, and then the response is received
and we are processing it and find that the netgroup has nested
netgroups, what do we do?
Issue another request for the nested group? Fine, but then we continue
with the processing of the parent group and would delete it from the
hash table before the result for the nested group gets back.
My point is that the hash table should probably be created per setnetgrent
call and cleaned when fetching of all nested netgroups is complete. It
should never be cleaned in the middle.

Am I missing something?

Also there is no design for the
 int innetgr(const char *netgroup, const char *host,
   const char *user, const char *domain);

Is this intentional or just an omission?

Also I think we should have the following optimization:
Each fetched netgroup goes to the cache with a timestamp. If the
expiration is, say, 30 sec, and there is a netgroup C nested into two
independent netgroups A and B, and A is fetched and then B is fetched
before the expiration timeout of C, then netgroup C should be
taken from the cache rather than refetched.

Another thing that I just realized is that you create a flat result set
collection by appending nested groups, rather than creating a collection
with a tree structure and iterating it as a flat collection. While your
approach is probably the right one, I wanted to draw attention to the
fact that the option of having a tree-style collection with nested
referenced (or copied) subcollections, and then traversing the tree as if
it were a flat collection, is also available. I do not know if you looked
at such a structure and whether it would help if we need to do some
optimization (now or later). Just something to consider.
 

Thanks
Dmitri


Re: [SSSD] Netgroups in SSSD

2010-09-09 Thread Stephen Gallagher

On 09/08/2010 09:04 AM, Stephen Gallagher wrote:
 
 I've also been thinking about how we're going to handle processing the
 nested groups, and I think what I'm going to do is take advantage of
 some of the nicer features of libcollection.
 
 Internal processing of setnetgrent_send() will recursively call out a
 subrequest to setnetgrent_send() again for each of the named nested
 groups. The setnetgrent_recv() function will return a libcollection
 object containing all of the results from that request (as well as any
 additional subrequests called). When the results come back up, they can
 be added together trivially using the col_add_collection_to_collection()
 interface with the col_add_mode_clone mode.
 

I've added some additional details about how I would like to do the
nesting and loop detection to the wiki page. Comments welcome.



Re: [SSSD] Netgroups in SSSD

2010-09-08 Thread Jakub Hrozek

On 09/08/2010 02:12 AM, Stephen Gallagher wrote:
 I have written up a problem statement and a brief overview of my plans
 regarding netgroup support in the SSSD.
 
 Please read through and comment as needed:
 https://fedorahosted.org/sssd/wiki/DesignDocs/Netgroups
 
 My goal is to have this work completed before the end of September.
 

I asked my question as another bullet point to the wiki, but Stephen
reminded me there's probably a wider audience on this list.

This was the question and answer:

Q: Maybe this is too low-level at this time, but is a cleanup task planned?
A: Netgroups should be handled in the same way that users and groups are
handled, so I will probably have to extend the existing cleanup task to
also address the netgroup entries in the cache.


Re: [SSSD] Netgroups in SSSD

2010-09-08 Thread Stephen Gallagher

On 09/08/2010 08:26 AM, Jakub Hrozek wrote:
 On 09/08/2010 02:12 AM, Stephen Gallagher wrote:
 I have written up a problem statement and a brief overview of my plans
 regarding netgroup support in the SSSD.
 
 Please read through and comment as needed:
 https://fedorahosted.org/sssd/wiki/DesignDocs/Netgroups
 
 My goal is to have this work completed before the end of September.
 
 
 I asked my question as another bullet point to the wiki, but Stephen
 reminded me there's probably a wider audience on this list.
 
 This was the question and answer:
 
 Q: Maybe this is too low-level at this time, but is a cleanup task planned?
 A: Netgroups should be handled in the same way that users and groups are
 handled, so I will probably have to extend the existing cleanup task to
 also address the netgroups entries in the cache


I've also been thinking about how we're going to handle processing the
nested groups, and I think what I'm going to do is take advantage of
some of the nicer features of libcollection.

Internal processing of setnetgrent_send() will recursively call out a
subrequest to setnetgrent_send() again for each of the named nested
groups. The setnetgrent_recv() function will return a libcollection
object containing all of the results from that request (as well as any
additional subrequests called). When the results come back up, they can
be added together trivially using the col_add_collection_to_collection()
interface with the col_add_mode_clone mode.



Re: [SSSD] Netgroups in SSSD

2010-09-08 Thread Stephen Gallagher

On 09/08/2010 12:00 PM, Dmitri Pal wrote:
 Stephen Gallagher wrote:
 On 09/08/2010 08:26 AM, Jakub Hrozek wrote:
 On 09/08/2010 02:12 AM, Stephen Gallagher wrote:
 I have written up a problem statement and a brief overview of my plans
 regarding netgroup support in the SSSD.
 Please read through and comment as needed:
 https://fedorahosted.org/sssd/wiki/DesignDocs/Netgroups
 My goal is to have this work completed before the end of September.

 I asked my question as another bullet point to the wiki, but Stephen
 reminded me there's probably a wider audience on this list.

 This was the question and answer:

 Q: Maybe this is too low-level at this time, but is a cleanup task
 planned?
 A: Netgroups should be handled in the same way that users and groups are
 handled, so I will probably have to extend the existing cleanup task to
 also address the netgroups entries in the cache


 I've also been thinking about how we're going to handle processing the
 nested groups, and I think what I'm going to do is take advantage of
 some of the nicer features of libcollection.

 Internal processing of setnetgrent_send() will recursively call out a
 subrequest to setnetgrent_send() again for each of the named nested
 groups. The setnetgrent_recv() function will return a libcollection
 object containing all of the results from that request (as well as any
 additional subrequests called). When the results come back up, they can
 be added together trivially using the col_add_collection_to_collection()
 interface with the col_add_mode_clone mode.

 I think this would work.
 I started thinking that potentially there should be a way to not clone
 the subcollections but rather use references to nested groups.
 I was thinking that if you keep a hash table of the names of netgroups
 as keys and pointers (or structures with one member being pointer) as
 values you would be able to construct collections from cache without
 copying data but rather by reference.
 However this seems to be an optimization that might not be worth it at
 least in the first implementation. So let us start with what you propose
 and see if it scales. If not we will see how it can be improved.


I thought about this approach originally, but the problem is that each
of the netgroup result objects has its own lifetime. For example, if
the lifetime is 30 seconds, and we do a lookup for netgroup1 at t=0,
then do a lookup for netgroup2 at t=29, where netgroup2 has netgroup1 as
a nested member, then the reference for that first netgroup is only
going to be valid for one more second. There are ways we can force
updates on read, but this is very expensive computationally.

I'd rather take a memory hit with copying the data to guarantee that it
remains intact than add the complexity of trying to manage references
with different life-spans.

So in this case, if netgroup2 is looked up at t=29, its internal copy
of netgroup1 is still valid until t=59, when netgroup2 expires.
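The t=0/t=29/t=59 arithmetic can be made concrete with a small sketch. The cache structure and function names here are illustrative (not the SSSD cache API); only the copy-versus-reference semantics match the argument above:

```python
LIFETIME = 30  # seconds, per the example in the mail

def store(cache, name, data, now):
    # Each entry gets its own expiry. Nested members are *copied* in,
    # so the embedded data lives as long as the entry that embeds it.
    cache[name] = {"data": list(data), "expires": now + LIFETIME}

def is_valid(cache, name, now):
    return name in cache and now < cache[name]["expires"]

cache = {}
store(cache, "netgroup1", ["triple1"], now=0)
# at t=29, netgroup2 embeds a *copy* of netgroup1's data
store(cache, "netgroup2", ["triple2"] + cache["netgroup1"]["data"], now=29)

print(is_valid(cache, "netgroup1", now=31))  # False: netgroup1 expired at t=30
print(is_valid(cache, "netgroup2", now=31))  # True: its copy lives until t=59
```

With references instead of copies, the netgroup2 entry would hold a pointer into data that becomes invalid at t=30, which is exactly the lifetime-management complexity being avoided.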



Re: [SSSD] Netgroups in SSSD

2010-09-08 Thread Dmitri Pal
Stephen Gallagher wrote:
 On 09/08/2010 12:00 PM, Dmitri Pal wrote:
  Stephen Gallagher wrote:
  On 09/08/2010 08:26 AM, Jakub Hrozek wrote:
  On 09/08/2010 02:12 AM, Stephen Gallagher wrote:
  I have written up a problem statement and a brief overview of my
 plans
  regarding netgroup support in the SSSD.
  Please read through and comment as needed:
  https://fedorahosted.org/sssd/wiki/DesignDocs/Netgroups
  My goal is to have this work completed before the end of September.
  I asked my question as another bullet point to the wiki, but Stephen
  reminded me there's probably a wider audience on this list.
  This was the question and answer:
  Q: Maybe this is too low-level at this time, but is a cleanup task
  planned?
  A: Netgroups should be handled in the same way that users and
 groups are
  handled, so I will probably have to extend the existing cleanup
 task to
  also address the netgroups entries in the cache
 
  I've also been thinking about how we're going to handle processing the
  nested groups, and I think what I'm going to do is take advantage of
  some of the nicer features of libcollection.
 
  Internal processing of setnetgrent_send() will recursively call out a
  subrequest to setnetgrent_send() again for each of the named nested
  groups. The setnetgrent_recv() function will return a libcollection
  object containing all of the results from that request (as well as any
  additional subrequests called). When the results come back up, they can
  be added together trivially using the
 col_add_collection_to_collection()
  interface with the col_add_mode_clone mode.
 
  I think this would work.
  I started thinking that potentially there should be a way to not clone
  the subcollections but rather use references to nested groups.
  I was thinking that if you keep a hash table of the names of netgroups
  as keys and pointers (or structures with one member being pointer) as
  values you would be able to construct collections from cache without
  copying data but rather by reference.
  However this seems to be an optimization that might not be worth it at
  least in the first implementation. So let us start with what you propose
  and see if it scales. If not we will see how it can be improved.


 I thought about this approach originally, but the problem is that each
 of the netgroup result objects have their own lifetime. For example, if
 the lifetime is 30 seconds, and we do a lookup for netgroup1 at t=0,
 then do a lookup for netgroup2 at t=29, where netgroup2 has netgroup1 as
 a nested member, then the reference for that first netgroup is only
 going to be valid for one more second. There are ways we can force
 updates on read, but this is very expensive computationally.

 I'd rather take a memory hit with copying the data to guarantee that it
 remains intact than add the complexity of trying to manage references
 with different life-spans.

 So in this case, if netgroup2 is looked up at t=29, its internal copy
 of netgroup1 is still valid until t=59, when netgroup2 expires.

Yes, I agree, though I feel there is a way to do it nicely. I just can't
visualize it.

But...
So what will you do if you have two netgroups, A and B, where B is a member
of A? The caller fetched B and then, a second before B expires, fetched A.
Are you going to refetch B or just copy it?
If you refetch, you effectively refetch each time; if you copy, you just
extend its life.
So I was thinking that the hash I described would keep a structure: one
member is a pointer, another is the timestamp of the original fetch, and
another is a delta.
Then we would have two config values: max cache lifetime and min cache
lifetime.
The max cache lifetime controls whether the netgroup needs to be
unconditionally refetched; effectively it is the expiration time.
The min cache lifetime is the expiration time of a netgroup entry that
was not touched.
Here is the example:
min = 30 sec
max = 90 sec

B is fetched at t = 0
A is fetched at t = 20
B is not refetched. Instead its life is extended by saving delta = 20.

There is another netgroup C that nests B that is fetched at t = X
If (X - B_fetch_timestamp) > max or (X - B_fetch_timestamp - B_delta) >
min, then refetch B;
otherwise B_delta = X - B_fetch_timestamp

But this does not solve the problem of multiple iterations happening
at the same time, since you do not want to reconstruct the collection
while an iterator is already defined against it, so you still need to
make a copy.
However, using this logic you would probably be able to share more and go
to the server less.
But again it is an optimization that might not be needed.




___
sssd-devel mailing list
sssd-devel@lists.fedorahosted.org
https://fedorahosted.org/mailman/listinfo/sssd-devel

-- 
Thank you,
Dmitri Pal

Engineering Manager IPA project,
Red Hat Inc.


---
Looking to carve out IT costs?
www.redhat.com/carveoutcosts

Re: [SSSD] Netgroups in SSSD

2010-09-08 Thread Dmitri Pal
Stephen Gallagher wrote:
 On 09/08/2010 02:07 PM, Dmitri Pal wrote:
  Yes, I agree, though I feel there is a way to do it nicely. I just can't
  visualize it.

  But...
  So what will you do if you have two netgroups, A and B, where B is a
  member of A? The caller fetched B and then, a second before B expires,
  fetched A. Are you going to refetch B or just copy it?

 Copy it.

  If you refetch, you effectively refetch each time; if you copy, you just
  extend its life.

 Yes, you extend the life of the instance of B within A, not of B itself.
 What this means is that if netgroup C comes in five seconds later and
 ALSO has B as a member, it will fetch a new copy of B into the memory
 cache.

  So I was thinking that the hash I described would keep a structure: one
  member is a pointer, another is the timestamp of the original fetch, and
  another is a delta.
  Then we would have two config values: max cache lifetime and min cache
  lifetime.
  The max cache lifetime controls whether the netgroup needs to be
  unconditionally refetched; effectively it is the expiration time.
  The min cache lifetime is the expiration time of a netgroup entry that
  was not touched.
  Here is the example:
  min = 30 sec
  max = 90 sec

  B is fetched at t = 0
  A is fetched at t = 20
  B is not refetched. Instead its life is extended by saving delta = 20.


 The problem with this approach is that it can result in B never being
 refetched if A happens to be refreshed, for example, every 29 seconds.

This is not the case. If you look at the if statement below, you will see
that it is always refreshed after the max interval, but this is a moot
point and I generally agree with your approach.

 By copying the data and leaving B alone to refresh itself as needed, we
 guarantee that the worst-case situation is that a sub-entry within an
 unrolled netgroup will stick around twice as long as the timeout, but
 that primary entries will always be refreshed as expected.

  There is another netgroup C that nests B that is fetched at t = X
  If (X - B_fetch_timestamp) > max or (X - B_fetch_timestamp - B_delta) >
  min, then refetch B;
  otherwise B_delta = X - B_fetch_timestamp

  But this does not solve the problem of multiple iterations happening
  at the same time, since you do not want to reconstruct the collection
  while an iterator is already defined against it, so you still need to
  make a copy.
  However, using this logic you would probably be able to share more and go
  to the server less.
  But again it is an optimization that might not be needed.



-- 
Thank you,
Dmitri Pal

Engineering Manager IPA project,
Red Hat Inc.





[SSSD] Netgroups in SSSD

2010-09-07 Thread Stephen Gallagher


I have written up a problem statement and a brief overview of my plans
regarding netgroup support in the SSSD.

Please read through and comment as needed:
https://fedorahosted.org/sssd/wiki/DesignDocs/Netgroups

My goal is to have this work completed before the end of September.

-- 
Stephen Gallagher
RHCE 804006346421761

Delivering value year after year.
Red Hat ranks #1 in value among software vendors.
http://www.redhat.com/promo/vendor/
