Re: Bottleneck with Winbind and NT ACLs in 2.2.7a

2003-02-05 Thread Michael Steffens
Hi Jeremy,

[EMAIL PROTECTED] wrote:

Damn good idea ! I think I'll look into applying some version
of this - thanks !


Many thanks to you!

Our "big boy" revealed another problem with winbind and a large
number of clients (most of them smbds, but also other processes,
of course): winbindd becomes an excessive consumer of file
descriptors for client sockets.

Each smbd wants two of them. And as long as client processes are
alive, their connections stay open even when idle.

It is possible to increase the maxfiles kernel parameter (we
have set it to 300). But as every process can potentially become
a winbind client, it's hard to tell what the actual limit should
be. During the last three days our winbindd was already pretty
close to 300 open files under peak load :)

I think winbindd could use some housekeeping of client
connections. In the attached patch I have tried to apply a
threshold method: as soon as the maximum number of clients is
exceeded, the oldest idle connection is looked up and shut
down. A connection is considered "idle" when it has

 - empty read and write buffers
 - no get??ent environments

In case all connections are actually active, exceeding the
threshold is allowed (in the hope that it's temporary).

Together with smbds caching id mappings, which reduces the
frequency of queries, this could work without too much impact
on client processes (they simply re-open connections winbindd
has closed, when required).

What do you think about it?

Cheers!
Michael

Index: source/nsswitch/winbindd.h
===================================================================
RCS file: /cvsroot/samba/source/nsswitch/winbindd.h,v
retrieving revision 1.3.4.9
diff -u -r1.3.4.9 winbindd.h
--- source/nsswitch/winbindd.h  13 Sep 2002 23:46:27 -  1.3.4.9
+++ source/nsswitch/winbindd.h  5 Feb 2003 12:48:02 -
@@ -42,6 +42,7 @@
 struct winbindd_response response;/* Respose to client */
 struct getent_state *getpwent_state;  /* State for getpwent() */
 struct getent_state *getgrent_state;  /* State for getgrent() */
+time_t access;/* Time of last access (read or write) */
 };
 
 /* State between get{pw,gr}ent() calls */
@@ -189,6 +190,7 @@
 
 #define WINBINDD_ESTABLISH_LOOP 30
 #define DOM_SEQUENCE_NONE ((uint32)-1)
+#define WINBINDD_MAX_CLIENTS 100
 
 /* SETENV */
 #if HAVE_SETENV
Index: source/nsswitch/winbindd.c
===================================================================
RCS file: /cvsroot/samba/source/nsswitch/winbindd.c,v
retrieving revision 1.3.2.35
diff -u -r1.3.2.35 winbindd.c
--- source/nsswitch/winbindd.c  3 Oct 2002 21:00:10 -   1.3.2.35
+++ source/nsswitch/winbindd.c  5 Feb 2003 12:48:03 -
@@ -343,6 +343,10 @@

ZERO_STRUCTP(state);
state->sock = sock;
+
+   /* give it a date of birth, such that it doesn't become a removal
+  candidate immediately */
+   state->access = time(NULL);

/* Add to connection list */

@@ -380,6 +384,36 @@
}
 }
 
+/* Shutdown client connection which has been idle for the longest time */
+
+static BOOL remove_idle_client(void) {
+   struct winbindd_cli_state *state, *remove_state = NULL;
+   time_t access = 0;
+   int nidle = 0;
+
+   for (state = client_list; state; state = state->next) {
+
+   if (state->read_buf_len == 0 && state->write_buf_len == 0 &&
+   !state->getpwent_state && !state->getgrent_state) {
+
+   nidle++;
+   if (!access || state->access < access) {
+   access = state->access;
+   remove_state = state;
+   }
+   }
+   }
+
+   if (remove_state) {
+   DEBUG(5,("Found %d idle client connections, shutting down sock %d, pid %d\n",
+nidle, remove_state->sock, remove_state->pid));
+   remove_client(remove_state);
+   return True;
+   }
+
+   return False;
+}
+
 /* Process a complete received packet from a client */
 
 static void process_packet(struct winbindd_cli_state *state)
@@ -427,6 +461,7 @@
/* Update client state */

state->read_buf_len += n;
+   state->access = time(NULL);
 }
 
 /* Write some data to a client connection */
@@ -479,6 +514,7 @@
/* Update client state */

state->write_buf_len -= num_written;
+   state->access = time(NULL);

/* Have we written all data? */

@@ -597,8 +633,15 @@
 
if (selret > 0) {
 
-   if (FD_ISSET(accept_sock, &r_fds))
+   if (FD_ISSET(accept_sock, &r_fds)) {
+   while (num_clients > WINBINDD_MAX_CLIENTS - 1)
+   if (!remove_idle_client()) {
+   DEBUG(0,("Exceeding %d client connec

Re: Bottleneck with Winbind and NT ACLs in 2.2.7a

2003-02-04 Thread jra
On Tue, Feb 04, 2003 at 02:12:29PM +0100, Michael Steffens wrote:
> Hi,
> 
> we have been running a big Samba 2.2.7a server with Winbind since
> last weekend (>100 concurrent users, >600 id mappings created so far).
> 
> It's running quite well! :)
> 
> However, users are complaining about Samba being very slow when
> NT ACL support is enabled. I'm suspecting that winbindd is the
> bottleneck.
> 
> In winbindd's log I can see that it has to serve "[ug]id to sid"
> requests at a very high frequency. Most of them seem to be triggered
> by smbd daemons working on POSIX ACLs, and the UIDs and GIDs
> requested are almost always those of the user running a session.
> Requests are keeping winbindd busy all the time. And when winbindd
> is busy talking to DCs, user sessions have to wait for ACL settings
> to complete.
> 
> As the ID mappings are static - as soon as they exist - wouldn't it
> be a good idea to have smbd cache those it has come across in the
> current session?
> 
> In the attached patch I have tried to implement such a local
> mapping cache for smbd. What do you think about it?

Damn good idea ! I think I'll look into applying some version
of this - thanks !

Jeremy.