Great, great, great! Thank you, Sean Conner and Andrew Warnier.  This topic
was covered in full without the sarcasm that is so common in other
Usenet groups.

My heartfelt thanks to the participating parties.

Thanks a lot.

Nagrik

On Fri, Jun 12, 2009 at 12:41 AM, Sean Conner <s...@conman.org> wrote:

> It was thus said that the Great Vinay Nagrik once stated:
> > Thank you Andrew and Tom,
> >
> > Thank you for your insightful replies.  These have definitely helped me in
> > understanding the major issues.
> >
> > At this moment I cannot understand "how a 'connection' is passed from the
> > parent process to the child process."
> >
> > My understanding is
> >
> > "a connection is a combination of (IP address + port.) and the parent
> > process listens at one such address or multiple such addresses in virtual
> > host interfaces."
> >
> > Let us assume the parent process is listening to only one such address,
> > i.e. (IP address + port).  Then if this connection is passed to the child,
> > this connection must be blocked, and this is the only connection which will
> > be multiplexed among several child processes as well as the parent process.
> > My point is that concurrency can not be achieved on a single connection (IP
> > address + port) unless I am missing something fundamental about the
> > definition of "Connection."
> >
> > Secondly if a connection is passed to the child then once again the child
> > process will have to make a three way handshake to the original client to
> > service the request.
> >
> > I hope Andrew or someone from the group can clear my doubts.
>
>  This is for Unix and Unix-like operating systems.  Your mileage will vary
> with other operating systems.
>
>  When Apache starts, it creates a listening TCP socket.  In the kernel,
> this socket will look something like:
>
>        protocol localhost      port    remotehost      remoteport
>           TCP   192.168.1.23   80      0.0.0.0         0
>
>  In other words, a half connected socket (in reality, the localhost portion
> can also be 0.0.0.0, which means listen on all interfaces that support an IP
> address, but for the sake of argument, let's say we only want Apache to
> listen on a particular interface).  The code to create this typically (if
> not spread out) looks like:
>
>        struct sockaddr addr;
>        int             sock;
>
>        /*
>         * this creates space for a TCP socket
>         */
>
>        sock = socket(AF_INET,SOCK_STREAM,0);
>
>        /*
>         * fill in the address we want to listen on; it
>         * is not quite this way, but the actual details
>         * would only get in the way ...
>         */
>
>        addr.family = AF_INET;
>        addr.host   = 192.168.1.23;
>        addr.port   = 80;
>
>        /*
>         * now, connect the address/port to the socket
>         * we just created
>         */
>
>        bind(sock,&addr,sizeof(addr));
>
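For anyone who wants the "actual details" that the simplified snippet above leaves out, here is a minimal sketch of the same step using the real IPv4 address structure. It assumes IPv4 only; the helper name bind_tcp_socket and the SO_REUSEADDR option are my own additions for illustration, not something taken from Apache.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from Apache): create a TCP socket and bind it
 * to ip:port.  Returns the socket descriptor, or -1 on failure.
 */
int bind_tcp_socket(const char *ip, unsigned short port)
{
  struct sockaddr_in addr;
  int                sock;
  int                on = 1;

  assert(ip != NULL);

  /* this creates space for a TCP socket */
  sock = socket(AF_INET, SOCK_STREAM, 0);
  if (sock < 0)
    return -1;

  /* assumption: allow a restarted server to rebind the port right away */
  setsockopt(sock, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on));

  /* fill in the address; the port must be in network byte order */
  memset(&addr, 0, sizeof(addr));
  addr.sin_family = AF_INET;
  addr.sin_port   = htons(port);
  if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1)
  {
    close(sock);
    return -1;
  }

  /* now, connect the address/port to the socket we just created */
  if (bind(sock, (struct sockaddr *)&addr, sizeof(addr)) < 0)
  {
    close(sock);
    return -1;
  }

  return sock;
}
```

Calling bind_tcp_socket("192.168.1.23", 80) would correspond to the example in the message, although binding to a port below 1024 normally requires root privileges. Passing port 0 asks the kernel to pick any free port, which is handy for experiments.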
>  So now we have our side of the socket created (see above).  Now, onto real
> work.  Apache (and I'm assuming the pre-fork version here) will create the
> child processes to handle actual requests.  This is done via the fork()
> call (which creates a duplicate of the calling process).  As part of this
> fork() call, the child process will see this socket as well [1], but since
> it doesn't handle incoming connections, the child can then close its copy of
> the socket (which won't affect the socket in the main parent process).  The
> child process then changes its effective user id to some lower privileged
> account, and then waits for the parent to give it some work to do.
>
>  The parent process, however, continues on and tells Unix it is ready
> to accept network connections.
>
>        /*
>         * tell Unix we want to accept connections on this port.  The
>         * 5 value is the size of the backlog---the number of incoming
>         * connections the kernel will queue up for us while we're busy
>         * doing other stuff ... more on this in a bit
>         */
>
>        listen(sock,5);
>
>  So now Unix knows the main Apache process wants to accept connections on
> TCP port 80.  Then the main Apache process enters a loop that looks
> like:
>
>        struct sockaddr remote_addr;
>        socklen_t       remote_size;    /* size of remote address */
>        int             connection;
>
>        for ( ; ; )     /* ever */
>        {
>          /*
>           * we'll accept connections from anywhere, and from any port
>           */
>
>          remote_addr.family = AF_INET;
>          remote_addr.host   = ANY_IP;
>          remote_addr.port   = ANY_PORT;
>          remote_size        = sizeof(remote_addr);
>          connection         = accept(sock,&remote_addr,&remote_size);
>
>          /*
>           * between now and the time we get back to the accept()
>           * call, the Unix kernel will queue up to five connection
>           * requests.  More on this below ...
>           * Meanwhile, pass this socket to a child process ...
>           */
>
>          pass_connection_to_child(connection);
>
>          /*
>           * now that we have passed the socket on, the parent
>           * no longer needs its copy of the socket, so it can
>           * close it, and cycle back for another connection.
>           */
>
>          close(connection);
>        }
>
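The loop above is schematic, so here is a minimal sketch of a single accept() call with the real IPv4 structures. Note that accept() does not take a filter for the remote address: the kernel fills in the address of whoever connected. The helper name accept_peer and the "ip:port" output format are assumptions of mine for illustration.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: wait for one connection on a listening socket.
 * Returns the new connected socket (or -1 on failure) and writes the
 * remote side as "ip:port" into peer.
 */
int accept_peer(int listener, char *peer, size_t peer_len)
{
  struct sockaddr_in remote_addr;
  socklen_t          remote_size = sizeof(remote_addr);
  char               ip[INET_ADDRSTRLEN];
  int                connection;

  assert(peer != NULL && peer_len > 0);

  /* the kernel overwrites remote_addr and remote_size on return */
  memset(&remote_addr, 0, sizeof(remote_addr));
  connection = accept(listener, (struct sockaddr *)&remote_addr, &remote_size);
  if (connection < 0)
    return -1;

  /* turn the binary address into printable text */
  if (inet_ntop(AF_INET, &remote_addr.sin_addr, ip, sizeof(ip)) == NULL)
    ip[0] = '\0';
  snprintf(peer, peer_len, "%s:%u", ip, (unsigned)ntohs(remote_addr.sin_port));

  return connection;
}
```

The listening socket is untouched by this call and can be handed to accept() again right away; the returned descriptor is a separate socket that refers only to the one client.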
>  The accept() call blocks Apache until an incoming connection to port 80 is
> initiated (or one or more are pending).  It then returns a new socket for
> this connection (the remote address is stored in remote_addr, and the size
> of this structure is also returned in remote_size; the network stack under
> Unix can work with more than just IP, and different network protocols have
> different size addresses.  For instance, while an IPv4 address:port is 6
> bytes (four for address, two for port), an IPv6 address:port is 18 bytes).
> So, now we have:
>
> var     protocol localhost      port    remotehost      remoteport process
> -------------------------------------------------------------------
> sock       TCP   192.168.1.23   80      0.0.0.0         0          main
> connection TCP   192.168.1.23   80      173.45.15.4     45234      main
>
>  The parent process then takes the connection socket, and passes it on to
> an available child process to handle.  Once the socket is passed on to the
> child (and no, the three-way TCP handshake does not have to happen again,
> the connected socket is passed from the parent to the child process), the
> parent can then close its copy of the connection socket (which won't affect
> the connection, nor the connection socket the child process now has), and go
> back to handle a new connection by calling accept() on the half connected
> listening socket.
>
> var     protocol localhost      port    remotehost      remoteport process
> -------------------------------------------------------------------
> sock       TCP   192.168.1.23   80      0.0.0.0         0          main
> connection TCP   192.168.1.23   80      173.45.15.4     45234      child
>
>  It's the time between accept() requests that the queue limit given in the
> listen() call comes into play.  During the time between calls to accept(),
> the Unix kernel will queue up pending connection requests (the value 5 is
> the traditional value for this, but the early BSD kernels pretty much
> assumed this value would always be 5, and acted oddly if it wasn't, so that's
> why you see this value used in much sample network code, but I digress).  It
> has nothing to do with the total number of requests that can be handled,
> just the number of requests that will be held between calls to accept().
>
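To convince myself of the queueing described above, I put together a small experiment, sketched below. It assumes the loopback interface is available; the function name connect_before_accept is made up for this example. The client's connect() succeeds even though the server side never calls accept(), because the kernel completes the handshake and holds the connection in the listen queue.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical demo: returns 0 if a client's connect() completes while
 * the server has called listen() but has not called accept().
 */
int connect_before_accept(int backlog)
{
  struct sockaddr_in addr;
  socklen_t          len = sizeof(addr);
  int                listener;
  int                client;
  int                rc = -1;

  assert(backlog > 0);

  listener = socket(AF_INET, SOCK_STREAM, 0);
  client   = socket(AF_INET, SOCK_STREAM, 0);
  if (listener < 0 || client < 0)
    goto done;

  memset(&addr, 0, sizeof(addr));
  addr.sin_family      = AF_INET;
  addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
  addr.sin_port        = 0;            /* let the kernel pick a free port */

  if (bind(listener, (struct sockaddr *)&addr, sizeof(addr)) < 0)
    goto done;
  if (listen(listener, backlog) < 0)
    goto done;

  /* find out which port the kernel picked */
  if (getsockname(listener, (struct sockaddr *)&addr, &len) < 0)
    goto done;

  /* no accept() anywhere: the kernel queues the finished connection */
  rc = connect(client, (struct sockaddr *)&addr, sizeof(addr));

done:
  if (client >= 0)
    close(client);
  if (listener >= 0)
    close(listener);
  return rc;
}
```

A call such as connect_before_accept(5) mirrors the listen(sock,5) from the message. The backlog only bounds how many such finished-but-unaccepted connections the kernel will hold, and the exact limit applied is up to the operating system.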
>  How the socket is passed from the parent to the child will not be covered
> (as it would only cloud the issue since it takes 11 pages in _Advanced
> Programming in the Unix Environment_ to cover this particular issue---it's
> ... messy) but just assume It Works.
>
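For the curious, one mechanism Unix offers for handing an open descriptor to an already running process is an SCM_RIGHTS control message sent over a Unix domain socket. The sketch below is only that, a sketch: the names send_fd and recv_fd are hypothetical, it passes exactly one descriptor along with one dummy byte, and I am not claiming this is what Apache itself does.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* control buffer big enough for one descriptor, suitably aligned */
union fd_control
{
  struct cmsghdr align;
  char           buf[CMSG_SPACE(sizeof(int))];
};

/* Hypothetical helper: send descriptor fd over a Unix domain socket. */
int send_fd(int channel, int fd)
{
  struct msghdr    msg;
  struct iovec     iov;
  struct cmsghdr  *cmsg;
  union fd_control control;
  char             byte = 'F';         /* at least one data byte is sent */

  assert(fd >= 0);

  memset(&msg, 0, sizeof(msg));
  memset(&control, 0, sizeof(control));
  iov.iov_base       = &byte;
  iov.iov_len        = 1;
  msg.msg_iov        = &iov;
  msg.msg_iovlen     = 1;
  msg.msg_control    = control.buf;
  msg.msg_controllen = sizeof(control.buf);

  /* the descriptor rides in the ancillary data, not in the byte stream */
  cmsg             = CMSG_FIRSTHDR(&msg);
  cmsg->cmsg_level = SOL_SOCKET;
  cmsg->cmsg_type  = SCM_RIGHTS;
  cmsg->cmsg_len   = CMSG_LEN(sizeof(int));
  memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

  return sendmsg(channel, &msg, 0) == 1 ? 0 : -1;
}

/* Hypothetical helper: receive one descriptor; returns it, or -1. */
int recv_fd(int channel)
{
  struct msghdr    msg;
  struct iovec     iov;
  struct cmsghdr  *cmsg;
  union fd_control control;
  char             byte;
  int              fd;

  memset(&msg, 0, sizeof(msg));
  memset(&control, 0, sizeof(control));
  iov.iov_base       = &byte;
  iov.iov_len        = 1;
  msg.msg_iov        = &iov;
  msg.msg_iovlen     = 1;
  msg.msg_control    = control.buf;
  msg.msg_controllen = sizeof(control.buf);

  if (recvmsg(channel, &msg, 0) != 1)
    return -1;

  cmsg = CMSG_FIRSTHDR(&msg);
  if (cmsg == NULL
      || cmsg->cmsg_level != SOL_SOCKET
      || cmsg->cmsg_type  != SCM_RIGHTS
      || cmsg->cmsg_len   != CMSG_LEN(sizeof(int)))
    return -1;

  /* the kernel has installed a new descriptor in the receiving process */
  memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
  return fd;
}
```

In a parent/child arrangement, the two ends of the channel would typically come from socketpair(AF_UNIX, SOCK_STREAM, 0, ...) called before fork(). The receiver gets its own descriptor number referring to the same open connection, so no new TCP handshake is involved.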
>  -spc (I hope this clears things up ... )
>
> [1]     The socket is technically an open file descriptor, and open file
>        descriptors are "duplicated" [2] during a call to fork(); the process
>        (parent or child) that doesn't need access can close its copy without
>        affecting the other.
>
> [2]     The file descriptor is really an index into a table of open files a
>        process can use.  This table is maintained by the kernel (the
>        process can't "see" this table at all), and what's really duplicated
>        is this table, which contains references to other structures that
>        define the location of the file on disk (or, for a socket, the
>        state of the network connection).
>
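The point made in footnotes [1] and [2] is easy to check with a small program. The sketch below uses a pipe rather than a socket to stay short, and the function name parent_copy_survives_child_close is made up for this example. The child closes its copies of the descriptors and exits, and the parent's copies keep working.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical demo: returns 0 if the parent can still use its
 * descriptors after a forked child has closed its own copies.
 */
int parent_copy_survives_child_close(void)
{
  int   fds[2];
  pid_t pid;
  int   status;
  char  byte = 0;
  int   ok;

  if (pipe(fds) < 0)
    return -1;

  pid = fork();
  if (pid < 0)
    return -1;

  if (pid == 0)
  {
    /* child: close its copies of both descriptors and leave */
    close(fds[0]);
    close(fds[1]);
    _exit(0);
  }

  /* parent: wait until the child has closed its copies and exited */
  if (waitpid(pid, &status, 0) != pid)
    return -1;
  assert(WIFEXITED(status));

  /* the parent's descriptors are unaffected by the child's close() */
  ok = write(fds[1], "x", 1) == 1
    && read(fds[0], &byte, 1) == 1
    && byte == 'x';

  close(fds[0]);
  close(fds[1]);
  return ok ? 0 : -1;
}
```

The same reasoning applies to the listening socket: each process holds its own entry in its own descriptor table, and the underlying socket stays open until the last reference to it is closed.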
> ---------------------------------------------------------------------
> The official User-To-User support forum of the Apache HTTP Server Project.
> See <URL:http://httpd.apache.org/userslist.html> for more info.
> To unsubscribe, e-mail: users-unsubscr...@httpd.apache.org
>   "   from the digest: users-digest-unsubscr...@httpd.apache.org
> For additional commands, e-mail: users-h...@httpd.apache.org
>
>


-- 
Thanks

Nagrik
