Re: [patch] perchild MPM bug fixes (+ open problem)

2002-10-24 Thread Jeff Trawick
Johannes Erdfelt [EMAIL PROTECTED] writes:

 Problem 1:
 In worker_thread, there is a variable called csd that is used to get
 the new socket from lr->accept_func(). If that variable is NULL, then
 the memory for the new socket is allocated in the per-transaction pool.
 Unfortunately, the code neglected to reset the variable to NULL after
 servicing a request. The result is that the first request for each
 thread worked fine, but subsequent requests may have that memory
 overwritten, resulting in an invalid FD.
 
 Solution:
 Set it to NULL before calling lr->accept_func(). Implemented in the patch
 below.

I cleared that storage just before the call to apr_os_sock_put() so
that the reason is more obvious.

 Problem 2:
 pass_request fills out an iovec with the headers and the body of the
 request it wants to pass to another process. It unfortunately uses the
 wrong variable for the length, causing it to always be 0.
 
 Solution:
 Use the correct variable, l. len is never set to anything, so I removed
 it and used 0 in the one reference to it. Implemented in the patch below.

applied, but I made another fix (?) too... see my note below, and let
me know if it is bad :)

 Problem 3:
 receive_from_other_child assumes that the iovec is the same on read as
 it is on write. This isn't true: recvmsg() follows readv() semantics.
 The iovec is a scatter/gather list, and as a result the two send buffers
 are merged into the first receive buffer, with the second always left
 untouched. It also trusted the lengths in iov.iov_len, which will be the
 size of the original buffer, not the size of the data actually received.
 
 Solution:
 Merge the two buffers into one and find the null terminators for the two strings.
 Implemented in the patch below.

fix applied as-is

 Index: perchild.c
 @@ -1635,7 +1645,6 @@
 apr_bucket_brigade *bb = apr_brigade_create(r->pool, c->bucket_alloc);
  apr_bucket_brigade *sockbb;
 char request_body[HUGE_STRING_LEN] = "\0";
 -apr_off_t len = 0;

looks to me like len should be initialized to sizeof(request_body)
since on input to apr_brigade_flatten it should have the maximum
length of the char array
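
In sketch form, the in/out convention I mean is this (my reading of the
apr_brigade_flatten() API, not the perchild.c code itself; note the real
API takes an apr_size_t *, not the apr_off_t the patch declared):

    char request_body[HUGE_STRING_LEN];
    apr_size_t len = sizeof(request_body);  /* in: capacity of the buffer */

    rv = apr_brigade_flatten(bb, request_body, &len);
    /* out: len now holds the number of bytes actually copied, which
     * may be less than the capacity */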

Thanks!

-- 
Jeff Trawick | [EMAIL PROTECTED]
Born in Roswell... married an alien...



Re: [patch] perchild MPM bug fixes (+ open problem)

2002-10-24 Thread Johannes Erdfelt
On Thu, Oct 24, 2002, Jeff Trawick [EMAIL PROTECTED] wrote:
 Johannes Erdfelt [EMAIL PROTECTED] writes:
  Problem 2:
  pass_request fills out an iovec with the headers and the body of the
  request it wants to pass to another process. It unfortunately uses the
  wrong variable for the length, causing it to always be 0.
  
  Solution:
  Use the correct variable, l. len is never set to anything, so I removed
  it and used 0 in the one reference to it. Implemented in the patch below.
 
 applied, but I made another fix (?) too... see my note below, and let
 me know if it is bad :)
 
  Index: perchild.c
  @@ -1635,7 +1645,6 @@
   apr_bucket_brigade *bb = apr_brigade_create(r->pool, c->bucket_alloc);
   apr_bucket_brigade *sockbb;
   char request_body[HUGE_STRING_LEN] = "\0";
  -apr_off_t len = 0;
 
 looks to me like len should be initialized to sizeof(request_body)
 since on input to apr_brigade_flatten it should have the maximum
 length of the char array

I'll have to defer to you here. I'm not familiar enough with the brigade
implementation to know what the appropriate fix for this is.

However, it most certainly is better than the current behaviour so I'm
not against it. If it's wrong, I'll find out pretty soon :)

Thanks for applying these patches!

JE




Re: [patch] perchild MPM bug fixes (+ open problem)

2002-10-21 Thread Johannes Erdfelt
On Mon, Oct 21, 2002, [EMAIL PROTECTED] wrote:
 On Mon, 21 Oct 2002, Johannes Erdfelt wrote:
  Perhaps I misunderstood. The patch I had developed (which is broken
  because of the problems with the accept lock) just didn't listen on the
  socket if it had no chance of answering the query itself.
  
  That's what I thought you meant, but reading what you said again I may
  have misunderstood.
 
 Nope, that's exactly what I meant.  (see below)
 
  Did you mean closing the client socket or the listening socket? Wouldn't
  closing the client socket just cause the client to treat it as an error?
 
 Just close the listening socket in the child_init phase of the
 module.  You want to do this before you start accepting requests.  You
 could also do this by just removing the socket from the accept array.  The
 reason I like having the child close the socket is that it eliminates
 a potential bad config.
 
 Consider the case where an admin configures the server to listen on
 www.foo.com:8080, but he never assigns a child process to listen to that
 port.  If you just don't accept the connections, the user will hang
 forever.  If every child process, however, actively closes the sockets
 that it can't serve, then the client won't even be able to connect.

That's a good point too.
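
Concretely, I imagine something along these lines in the child_init hook
(a sketch only: child_can_serve() is a made-up name for whatever
uid/gid-to-vhost test perchild ends up using, and the listener fields are
from the 2.0 ap_listen_rec API):

    /* child_init: walk the inherited listeners and close every socket
     * this child can never serve, so a misconfigured port is refused
     * instead of hanging the client forever */
    ap_listen_rec *lr;

    for (lr = ap_listeners; lr; lr = lr->next) {
        if (!child_can_serve(lr)) {
            apr_socket_close(lr->sd);
            lr->active = 0;    /* and keep it out of the accept set */
        }
    }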

  What I was thinking was to add an artificial limitation that you can't
  share an IP:port pair across two different uid/gids since that's the
  only case you want to pass a connection.
 
 That limitation should already exist, although it may not.

That limitation doesn't exist either in the documentation or in the
implementation.

I figured that was part of the point of connection passing. I know it's
also useful for passing between siblings for a particular uid/gid, but
why not just require 1 child per uid/gid and then not worry about
connection passing at all?

JE




Re: [patch] perchild MPM bug fixes (+ open problem)

2002-10-21 Thread Johannes Erdfelt
On Mon, Oct 21, 2002, [EMAIL PROTECTED] wrote:
 
   As long as you are doing all this work, there is one more thought that I
   have been meaning to implement, but that I never got around to.  Currently
   perchild doesn't work with SSL, because of when the request is passed off,
   and how SSL works.  The easy solution to this, is to have the child
   processes close the sockets for any requests that they cannot
   handle.  This will also improve the chance that a request won't be passed
   if you have vhosts with different ports.  Consider the following:
   
   <VHost www.foo.com:80>
   AssignChildPerUidGid  rbb rbb
   </VHost>
   
   <VHost www.foo.com:81>
   AssignChildPerUidGid  foo foo
   </VHost>
   
   There is no reason for the foo/foo child process to be listening on port
   80.
   
   Just a thought for how to get SSL to work.
  
  I actually have a patch for this already :) Although I implemented it
  only as an optimization and not because of the issue with SSL. I hadn't
  tried to do SSL yet.
  
  I'd imagine this SSL limitation will have to be clearly documented since
  it may not be obvious to everyone.
 
 Actually, as long as you have a patch to do this, then SSL should just
 work, and no docs should be necessary.   :-)

Perhaps I misunderstood. The patch I had developed (which is broken
because of the problems with the accept lock) just didn't listen on the
socket if it had no chance of answering the query itself.

That's what I thought you meant, but reading what you said again I may
have misunderstood.

Did you mean closing the client socket or the listening socket? Wouldn't
closing the client socket just cause the client to treat it as an error?

What I was thinking was to add an artificial limitation that you can't
share an IP:port pair across two different uid/gids since that's the
only case you want to pass a connection.

JE




Re: [patch] perchild MPM bug fixes (+ open problem)

2002-10-21 Thread rbb

  Consider the case where an admin configures the server to listen on
  www.foo.com:8080, but he never assigns a child process to listen to that
  port.  If you just don't accept the connections, the user will hang
  forever.  If every child process, however, actively closes the sockets
  that it can't serve, then the client won't even be able to connect.
 
 That's a good point too.
 
   What I was thinking was to add an artificial limitation that you can't
   share an IP:port pair across two different uid/gids since that's the
   only case you want to pass a connection.
  
  That limitation should already exist, although it may not.
 
 That limitation doesn't exist either in the documentation or in the
 implementation.
 
 I figured that was part of the point of connection passing. I know it's
 also useful for passing between siblings for a particular uid/gid, but
 why not just require 1 child per uid/gid and then not worry about
 connection passing at all?

A couple of reasons.  Remember first of all that the more threads per
process, the less stable your server.  This is because if a module
seg-faults, the whole process dies.  For that reason, many admins will
want to have multiple child processes with the same uid/gid to create some
failover.

The second reason, however, is that the file descriptor passing is meant
to solve the case of multiple Name-based Vhosts on the same ip:port all
with different uid/gid requirements.  In that case, you will have multiple
child processes all listening to the same ip:port, but you will need to do
the file descriptor passing.

Closing the file descriptors of ip:port combinations that can't be served
is a great optimization, but that is all it is, an optimization for the
non-Name-based vhost case.
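
For anyone unfamiliar with the mechanism, the passing itself is plain
SCM_RIGHTS over a Unix domain socket.  A minimal standalone sketch of the
sending side (not the perchild code, which also ships the already-read
headers and request body alongside the descriptor, as the patch in this
thread shows):

    #include <string.h>
    #include <sys/socket.h>
    #include <sys/uio.h>

    /* Send fd to the sibling child listening on the connected Unix
     * domain socket uds.  The kernel duplicates the descriptor into
     * the receiving process. */
    static int send_fd(int uds, int fd)
    {
        struct msghdr msg = { 0 };
        struct iovec iov;
        struct cmsghdr *cmsg;
        char ctrl[CMSG_SPACE(sizeof(int))];
        char byte = 0;

        iov.iov_base = &byte;      /* must carry at least one data byte */
        iov.iov_len = 1;
        msg.msg_iov = &iov;
        msg.msg_iovlen = 1;
        msg.msg_control = ctrl;
        msg.msg_controllen = sizeof(ctrl);

        cmsg = CMSG_FIRSTHDR(&msg);
        cmsg->cmsg_level = SOL_SOCKET;
        cmsg->cmsg_type = SCM_RIGHTS;
        cmsg->cmsg_len = CMSG_LEN(sizeof(int));
        memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

        return sendmsg(uds, &msg, 0) < 0 ? -1 : 0;
    }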

Ryan

___
Ryan Bloom  [EMAIL PROTECTED]
550 Jean St
Oakland CA 94610
---




Re: [patch] perchild MPM bug fixes (+ open problem)

2002-10-21 Thread Johannes Erdfelt
On Mon, Oct 21, 2002, [EMAIL PROTECTED] wrote:
What I was thinking was to add an artificial limitation that you can't
share an IP:port pair across two different uid/gids since that's the
only case you want to pass a connection.
   
   That limitation should already exist, although it may not.
  
  That limitation doesn't exist either in the documentation or in the
  implementation.
  
  I figured that was part of the point of connection passing. I know it's
  also useful for passing between siblings for a particular uid/gid, but
  why not just require 1 child per uid/gid and then not worry about
  connection passing at all?
 
 A couple of reasons.  Remember first of all that the more threads per
 process, the less stable your server.  This is because if a module
 seg-faults, the whole process dies.  For that reason, many admins will
 want to have multiple child processes with the same uid/gid to create some
 failover.
 
 The second reason, however, is that the file descriptor passing is meant
 to solve the case of multiple Name-based Vhosts on the same ip:port all
 with different uid/gid requirements.  In that case, you will have multiple
 child processes all listening to the same ip:port, but you will need to do
 the file descriptor passing.

Well, that's what I said before, but you implied that the limitations
should prevent this. :)

 Closing the file descriptors of ip:port combinations that can't be served
 is a great optimization, but that is all it is, an optimization for the
 non-Name-based vhost case.

Yes.

This brings it back to what I was saying before, you can't pass SSL
connections, so as a result, you can't share an SSL port across two
uid/gids. So this should be documented somewhere, or enforced in the
server (which may be difficult because of the layering), so people don't
do that :)

Anyway, I think I understand the path to follow now. I'm going to
implement the necessary fixes to make perchild finally usable.

JE




Re: [patch] perchild MPM bug fixes (+ open problem)

2002-10-21 Thread rbb
On Thu, 17 Oct 2002, Johannes Erdfelt wrote:

 Ryan, I've CC'd you on this just to let you see the patch. If you don't
 want me to involve you in this, please accept my apologies and let me
 know and I won't CC you in any further patches.

I have no problem being CC'ed on patches, although for the most part, I am
unlikely to respond.  In this case however, I will.  I want to say that
this is great.  I have reviewed the patch and the logic, and it all looks
fantastic.  I'm really glad that the MPM is making progress.  I will also
provide a possible solution to the one problem that you were unable to
solve.  :-)

 Problem 4:
 Occasionally, when a request needs to be passed to another process, it will
 hang until another connection is accepted. perchild uses the same process
 accept mutex that the other MPMs seem to need. This results in
 serialization of all accept() calls, as is the intent of the mutex. The
 problem is that another process might have acquired the mutex from the
 process that actually needs the mutex to hit accept() and see the passed
 request. This isn't normally a problem because all processes accept
 connections on all listening ports. The problem in this case is that not
 all processes accept connections on the per-process unix domain socket
 used to pass connections around, for obvious reasons.
 
 Solution:
 I don't have one :) I'm not sure what the best way of solving this is. I
 don't fully understand the semantics of the process accept mutex yet. Any
 suggestions?

There is only one solution that I can see for this; unfortunately, it is a
rather big change.  :-(  The first step is to migrate the perchild MPM to
the same model as the worker MPM, where one thread accepts requests and
puts them in a queue, and the rest of the threads remove the requests from
the queue, and actually serve them.  I've been meaning to do this for a
long time, but I never got around to it.

Once that is done, the solution to this problem is easy.  Every child
process needs to have two acceptor threads.  The first accepts on the TCP
socket, and the second accepts on the Unix Domain Socket.  For the TCP
socket, the thread should just use the same accept_mutex that it is using
now.  However, for the Unix Domain Socket, each thread that is accepting on
the socket should just use apr_file_lock to lock the socket before
accepting.  This should remove the starvation that you are seeing, because
each socket will have its own lock.
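
In rough pseudo-code, the per-child layout would be (sketch only: the
queue helpers and the uds_lock()/uds_unlock() wrappers are hypothetical
stand-ins, and whether apr_file_lock can really be pointed at a socket
like this is exactly the part that needs verifying):

    /* acceptor #1: the TCP listeners, serialized across all children
     * by the existing cross-process accept mutex */
    apr_proc_mutex_lock(accept_mutex);
    rv = lr->accept_func((void **)&csd, lr, ptrans);
    apr_proc_mutex_unlock(accept_mutex);
    if (rv == APR_SUCCESS)
        fd_queue_push(worker_queue, csd);    /* workers pop and serve */

    /* acceptor #2: this child's own Unix domain socket, guarded by a
     * per-socket lock so another child's accept can never starve it */
    uds_lock(my_child_sock);
    csd = receive_passed_connection(my_child_sock);
    uds_unlock(my_child_sock);
    if (csd)
        fd_queue_push(worker_queue, csd);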

This is really great work, I hope that it gets applied ASAP.

Ryan




Re: [patch] perchild MPM bug fixes (+ open problem)

2002-10-21 Thread rbb

  As long as you are doing all this work, there is one more thought that I
  have been meaning to implement, but that I never got around to.  Currently
  perchild doesn't work with SSL, because of when the request is passed off,
  and how SSL works.  The easy solution to this, is to have the child
  processes close the sockets for any requests that they cannot
  handle.  This will also improve the chance that a request won't be passed
  if you have vhosts with different ports.  Consider the following:
  
  <VHost www.foo.com:80>
  AssignChildPerUidGid  rbb rbb
  </VHost>
  
  <VHost www.foo.com:81>
  AssignChildPerUidGid  foo foo
  </VHost>
  
  There is no reason for the foo/foo child process to be listening on port
  80.
  
  Just a thought for how to get SSL to work.
 
 I actually have a patch for this already :) Although I implemented it
 only as an optimization and not because of the issue with SSL. I hadn't
 tried to do SSL yet.
 
 I'd imagine this SSL limitation will have to be clearly documented since
 it may not be obvious to everyone.

Actually, as long as you have a patch to do this, then SSL should just
work, and no docs should be necessary.   :-)

Ryan

___
Ryan Bloom  [EMAIL PROTECTED]
550 Jean St
Oakland CA 94610
---




Re: [patch] perchild MPM bug fixes (+ open problem)

2002-10-21 Thread rbb
On Mon, 21 Oct 2002, Johannes Erdfelt wrote:

 On Mon, Oct 21, 2002, [EMAIL PROTECTED] wrote:
 What I was thinking was to add an artificial limitation that you can't
 share an IP:port pair across two different uid/gids since that's the
 only case you want to pass a connection.

That limitation should already exist, although it may not.
   
   That limitation doesn't exist either in the documentation or in the
   implementation.
   
   I figured that was part of the point of connection passing. I know it's
   also useful for passing between siblings for a particular uid/gid, but
   why not just require 1 child per uid/gid and then not worry about
   connection passing at all?
  
  A couple of reasons.  Remember first of all that the more threads per
  process, the less stable your server.  This is because if a module
  seg-faults, the whole process dies.  For that reason, many admins will
  want to have multiple child processes with the same uid/gid to create some
  failover.
  
  The second reason, however, is that the file descriptor passing is meant
  to solve the case of multiple Name-based Vhosts on the same ip:port all
  with different uid/gid requirements.  In that case, you will have multiple
  child processes all listening to the same ip:port, but you will need to do
  the file descriptor passing.
 
 Well, that's what I said before, but you implied that the limitations
 should prevent this. :)

yeah, you're right, I wasn't thinking when I wrote that.  Sorry.

  Closing the file descriptors of ip:port combinations that can't be served
  is a great optimization, but that is all it is, an optimization for the
  non-Name-based vhost case.
 
 Yes.
 
 This brings it back to what I was saying before, you can't pass SSL
 connections, so as a result, you can't share an SSL port across two
 uid/gids. So this should be documented somewhere, or enforced in the
 server (which may be difficult because of the layering), so people don't
 do that :)

You don't need that doc.  In order to share an ip:port combination across
child processes, you need to be using name-based vhosts.  SSL doesn't
support name-based vhosts today, so no new docs are required.  Of
course, if my patch to provide Upgrade support goes in, then you will be
able to share ip:port combinations that support SSL across child
processes.  So, the docs really aren't necessary.

 Anyway, I think I understand the path to follow now. I'm going to
 implement the necessary fixes to make perchild finally usable.

Great!

Ryan

___
Ryan Bloom  [EMAIL PROTECTED]
550 Jean St
Oakland CA 94610
---




Re: [patch] perchild MPM bug fixes (+ open problem)

2002-10-20 Thread Jeff Trawick
Johannes Erdfelt [EMAIL PROTECTED] writes:

 I spent some time debugging the perchild MPM since it wasn't working for
 me, nor anyone else it seems. I've found a few problems:

just FYI so you don't feel ignored...

if nobody beats me to it, I plan to take a more detailed look at this
with the intent to commit...  nag me directly in a few days if I
haven't found the time to play with it now...

no need to rework this now, but patches that are smaller and solve
exactly one problem at a time get committed sooner...

-- 
Jeff Trawick | [EMAIL PROTECTED]
Born in Roswell... married an alien...



[patch] perchild MPM bug fixes (+ open problem)

2002-10-18 Thread Johannes Erdfelt
Ryan, I've CC'd you on this just to let you see the patch. If you don't
want me to involve you in this, please accept my apologies and let me
know and I won't CC you in any further patches.

I spent some time debugging the perchild MPM since it wasn't working for
me, nor anyone else it seems. I've found a few problems:

Problem 1:
In worker_thread, there is a variable called csd that is used to get
the new socket from lr->accept_func(). If that variable is NULL, then
the memory for the new socket is allocated in the per-transaction pool.
Unfortunately, the code neglected to reset the variable to NULL after
servicing a request. The result is that the first request for each
thread worked fine, but subsequent requests may have that memory
overwritten, resulting in an invalid FD.

Solution:
Set it to NULL before calling lr->accept_func(). Implemented in the patch
below.
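
In sketch form (simplified; the loop and helper names follow the
description above rather than the exact perchild.c source):

    /* worker_thread accept loop, simplified.  csd must be forgotten on
     * every iteration: if it still points at the previous request's
     * socket, the accept function reuses that per-transaction-pool
     * memory and the stale FD gets overwritten. */
    apr_socket_t *csd = NULL;
    apr_status_t rv;

    while (!workers_may_exit) {
        apr_pool_clear(ptrans);    /* recycle the per-transaction pool */
        csd = NULL;                /* the fix: drop last request's socket */
        rv = lr->accept_func((void **)&csd, lr, ptrans);
        if (rv == APR_SUCCESS) {
            process_socket(ptrans, csd);
        }
    }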

Problem 2:
pass_request fills out an iovec with the headers and the body of the
request it wants to pass to another process. It unfortunately uses the
wrong variable for the length, causing it to always be 0.

Solution:
Use the correct variable, l. len is never set to anything, so I removed
it and used 0 in the one reference to it. Implemented in the patch below.
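
To make the bug concrete, the assignment in question looks roughly like
this (an illustration based on the description above, not a verbatim
quote of pass_request):

    iov[1].iov_base = request_body;
    iov[1].iov_len = l;    /* previously "= len", which was never set */

so the body half of the message was always advertised as zero bytes long.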

Problem 3:
receive_from_other_child assumes that the iovec is the same on read as
it is on write. This isn't true: recvmsg() follows readv() semantics.
The iovec is a scatter/gather list, and as a result the two send buffers
are merged into the first receive buffer, with the second always left
untouched. It also trusted the lengths in iov.iov_len, which will be the
size of the original buffer, not the size of the data actually received.

Solution:
Merge the two buffers into one and find the null terminators for the two strings.
Implemented in the patch below.
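
A standalone demonstration of the semantics (not perchild code; it
compiles and runs as-is):

    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <sys/uio.h>

    int main(void)
    {
        int sv[2];
        char headers[] = "GET / HTTP/1.0";    /* '\0' terminators are */
        char body[]    = "hello";             /* sent too, see below  */
        struct iovec out[2] = {
            { headers, sizeof(headers) },
            { body,    sizeof(body)    },
        };
        char buffer[256];
        struct iovec in = { buffer, sizeof(buffer) };
        struct msghdr omsg = { 0 }, imsg = { 0 };
        ssize_t n;

        if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
            return 1;

        omsg.msg_iov = out;                   /* gather: 2 buffers out */
        omsg.msg_iovlen = 2;
        sendmsg(sv[0], &omsg, 0);

        imsg.msg_iov = &in;                   /* but only 1 buffer in:  */
        imsg.msg_iovlen = 1;                  /* both strings land here */
        n = recvmsg(sv[1], &imsg, 0);

        /* The receiver sees one byte stream; split it on the '\0'
         * terminators, exactly as the patch below does. */
        printf("got %zd bytes: \"%s\" then \"%s\"\n",
               n, buffer, buffer + strlen(buffer) + 1);
        return 0;
    }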

Problem 4:
Occasionally, when a request needs to be passed to another process, it will
hang until another connection is accepted. perchild uses the same process
accept mutex that the other MPMs seem to need. This results in
serialization of all accept() calls, as is the intent of the mutex. The
problem is that another process might have acquired the mutex from the
process that actually needs the mutex to hit accept() and see the passed
request. This isn't normally a problem because all processes accept
connections on all listening ports. The problem in this case is that not
all processes accept connections on the per-process unix domain socket
used to pass connections around, for obvious reasons.

Solution:
I don't have one :) I'm not sure what the best way of solving this is. I
don't fully understand the semantics of the process accept mutex yet. Any
suggestions?

Anyway, this patch has been tested lightly and makes things work MUCH better
for the perchild MPM now, at least for me :)

If this patch is acceptable by the Apache development team, can it be
applied?

JE

Index: perchild.c
===
RCS file: /home/cvspublic/httpd-2.0/server/mpm/experimental/perchild/perchild.c,v
retrieving revision 1.135
diff -u -r1.135 perchild.c
--- perchild.c  11 Oct 2002 15:41:52 -  1.135
+++ perchild.c  18 Oct 2002 01:31:22 -
@@ -679,9 +679,9 @@
 {
 struct msghdr msg;
 struct cmsghdr *cmsg;
-char headers[HUGE_STRING_LEN];
-char request_body[HUGE_STRING_LEN];
-struct iovec iov[2];
+char buffer[HUGE_STRING_LEN * 2], *headers, *body;
+int headerslen, bodylen;
+struct iovec iov;
 int ret, dp;
 apr_os_sock_t sd;
 apr_bucket_alloc_t *alloc = apr_bucket_alloc_create(ptrans);
@@ -690,15 +690,13 @@
 
 apr_os_sock_get(&sd, lr->sd);
 
-iov[0].iov_base = headers;
-iov[0].iov_len = HUGE_STRING_LEN;
-iov[1].iov_base = request_body;
-iov[1].iov_len = HUGE_STRING_LEN;
+iov.iov_base = buffer;
+iov.iov_len = sizeof(buffer);
 
 msg.msg_name = NULL;
 msg.msg_namelen = 0;
-msg.msg_iov = iov;
-msg.msg_iovlen = 2;
+msg.msg_iov = &iov;
+msg.msg_iovlen = 1;
 
 cmsg = apr_palloc(ptrans, sizeof(*cmsg) + sizeof(sd));
 cmsg->cmsg_len = sizeof(*cmsg) + sizeof(sd);
@@ -715,14 +713,25 @@
 APR_BRIGADE_INSERT_HEAD(bb, bucket);
 bucket = apr_bucket_socket_create(*csd, alloc);
 APR_BRIGADE_INSERT_HEAD(bb, bucket);
-bucket = apr_bucket_heap_create(iov[1].iov_base, 
-iov[1].iov_len, NULL, alloc);
+
+body = strchr(iov.iov_base, 0);
+if (!body) {
+return 1;
+}
+
+body++;
+bodylen = strlen(body);
+
+headers = iov.iov_base;
+headerslen = body - headers;
+
+bucket = apr_bucket_heap_create(body, bodylen, NULL, alloc);
 APR_BRIGADE_INSERT_HEAD(bb, bucket);
-bucket = apr_bucket_heap_create(iov[0].iov_base, 
-iov[0].iov_len, NULL, alloc);
+bucket = apr_bucket_heap_create(headers, headerslen, NULL, alloc);
 APR_BRIGADE_INSERT_HEAD(bb, bucket);
 
 apr_pool_userdata_set(bb, "PERCHILD_SOCKETS", NULL,