#43610 [Com]: fastcgi socket dies on high concurrency
ID: 43610
Comment by: olafvdspek at gmail dot com
Reported By: oliver at realtsp dot com
Status: Open
Bug Type: CGI related
Operating System: FreeBSD 6.2
PHP Version: 5.2.5

New Comment:

> Since the parent process manages this queue

Eh, are you sure it does? As far as I know that's not true.

Previous Comments:

[2007-12-23 11:31:54] oliver at realtsp dot com

@olafvdspek at gmail dot com

That is not in keeping with the FastCGI spec:

# FCGI_OVERLOADED: rejecting a new request. This happens when the application runs out of some resource, e.g. database connections.

The situation I am talking about here is a severely overloaded condition, ie all php worker (child) processes are already busy and there is a queue of, in my case, an additional 200+ connections.

My suggestion is that the php parent process allows a max_fastcgi_queue of, say, 200 and then rejects further connections with FCGI_OVERLOADED. Since the parent process manages this queue it should know its size, and it "should" be easy to place a max limit on that size. The limit could be configured in php.ini.

[2007-12-22 15:10:55] olafvdspek at gmail dot com

> Could you explain or perhaps review PHP's behaviour under overloaded conditions.

I'm no PHP developer and haven't looked at the code, but my guess:

A PHP process has C children, each being able to handle one connection. When that connection is closed, the child does an accept() to handle a new connection. When a web server opens more than C connections, the extra ones will not be accepted until an existing connection is closed, which may take a long time. So a web server should never open more than C connections to one PHP process.

[2007-12-17 13:05:41] oliver at realtsp dot com

Actually.. It turns out that the php parent is not dead at all.
Even with stable 5.2.5 (rather than 5.2-latest), if you set up the fastcgi server to be started separately from lighty, ie with lighty config like this:

fastcgi.server = ( ".php" =>
  ( "localhost" =>
    ( "socket" => "/tmp/php-fastcgi.sock" )
  )
)

and then use spawn_fcgi to start the php fcgi server manually, then all behaves as expected, ie you get some (not all!!) 500s while the overload condition exists, and when the load drops away you get all normal 200 responses again, ie elastic/tolerant performance as hoped for.

After some investigation into the lighty source it turns out that lighty is confused by the fact that PHP just fails to respond (ie times out) rather than returning FCGI_OVERLOADED. Refer to this:

http://bugs.php.net/bug.php?id=39809

where dmitry said: "PHP cannot return FCGI_OVERLOADED, because all PHP processes are busy and nobody accepts new connection. The only way to detect this situation - use connection timeout."

lighty however is sticking to the fastcgi spec and expecting the php parent to be in shutdown mode (ie its PID to disappear) when it does not respond (after which it would then respawn a new parent). But because the PHP parent is just busy and not actually shutting down, the PID never disappears and lighty gets stuck in a loop.

I have posted a workaround involving starting PHP separately here:

http://trac.lighttpd.net/trac/ticket/1488

which also proposes a "patch" to deal with PHP's non-standard behaviour regarding FCGI_OVERLOADED.

However, the fundamental problem remains: it is very difficult for a FastCGI client to determine what is going on, and therefore what to do, when php just times out on connections rather than returning the correct FCGI_OVERLOADED response.

I did not understand dmitry's original reason for this: "PHP cannot return FCGI_OVERLOADED, because all PHP processes are busy and nobody accepts new connection."

Could you explain or perhaps review PHP's behaviour under overloaded conditions.
Thanks
Oliver

[2007-12-17 10:44:55] oliver at realtsp dot com

We have tried with http://snaps.php.net/php5.2-latest.tar.gz

Result is unchanged.

NOTE that the php workers and parent processes are still showing in ps after the crash (same as before the crash), but lighty cannot get a sensible response from them.

[EMAIL PROTECTED] /usr/ports/lang/php5]# pstree
...
 |-+- 25262 www /usr/local/sbin/lighttpd -f /usr/local/etc/lighttpd.conf
 | \-+= 25263 www /usr/local/bin/php-cgi
 |   |--- 25264 www /usr/local/bin/php-cgi
 |   |--- 25265 www /usr/local/bin/php-cgi
 |   |--- 25266 www /usr/local/bin/php-cgi
 |   |--- 25267 www /usr/local/bin/php-cgi
 |   |--- 25268 www
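
For readers following the max_fastcgi_queue suggestion above, here is a minimal Python sketch of what such a rejection could look like on the wire. The record layout and the numeric values (record type FCGI_END_REQUEST = 3, protocolStatus FCGI_OVERLOADED = 2) come from the FastCGI 1.0 specification. Everything else is an assumption made for illustration: the function names, the admit() decision, and in particular the premise that some process is free to accept the connection and read its request id, which is exactly what dmitry says PHP cannot do when every worker is busy. It is not PHP's or lighty's code.

```python
import struct

# Values defined by the FastCGI 1.0 specification.
FCGI_VERSION_1 = 1
FCGI_END_REQUEST = 3   # record type
FCGI_OVERLOADED = 2    # protocolStatus: "rejecting a new request"


def end_request_overloaded(request_id, app_status=0):
    """Build an FCGI_END_REQUEST record carrying FCGI_OVERLOADED."""
    # Body: appStatus (4 bytes, big-endian), protocolStatus (1 byte),
    # then 3 reserved bytes.
    body = struct.pack("!IB3x", app_status, FCGI_OVERLOADED)
    # Header: version, type, requestId, contentLength, paddingLength,
    # then 1 reserved byte.
    header = struct.pack("!BBHHBx", FCGI_VERSION_1, FCGI_END_REQUEST,
                         request_id, len(body), 0)
    return header + body


def admit(busy_workers, workers, queued, max_fastcgi_queue):
    """Hypothetical admission decision for one new connection.

    max_fastcgi_queue is the limit proposed in this thread; it is not
    an existing php.ini setting.
    """
    if busy_workers < workers:
        return "handle"
    if queued < max_fastcgi_queue:
        return "queue"
    return "overloaded"


if __name__ == "__main__":
    # 16 busy workers out of 16, 200 already queued, limit of 200.
    if admit(16, 16, 200, 200) == "overloaded":
        print(end_request_overloaded(request_id=1).hex())
```

The sketch only shows the decision and the reply format. Whether the parent process or the kernel's listen backlog actually holds the waiting connections is the open question in this thread, and the sketch does not settle it.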
#43610 [Com]: fastcgi socket dies on high concurrency
ID: 43610
Comment by: olafvdspek at gmail dot com
Reported By: oliver at realtsp dot com
Status: Open
Bug Type: CGI related
Operating System: FreeBSD 6.2
PHP Version: 5.2.5

New Comment:

> Could you explain or perhaps review PHP's behaviour under overloaded conditions.

I'm no PHP developer and haven't looked at the code, but my guess:

A PHP process has C children, each being able to handle one connection. When that connection is closed, the child does an accept() to handle a new connection. When a web server opens more than C connections, the extra ones will not be accepted until an existing connection is closed, which may take a long time. So a web server should never open more than C connections to one PHP process.

Previous Comments:

[2007-12-17 13:05:41] oliver at realtsp dot com

Actually.. It turns out that the php parent is not dead at all.

Even with stable 5.2.5 (rather than 5.2-latest), if you set up the fastcgi server to be started separately from lighty, ie with lighty config like this:

fastcgi.server = ( ".php" =>
  ( "localhost" =>
    ( "socket" => "/tmp/php-fastcgi.sock" )
  )
)

and then use spawn_fcgi to start the php fcgi server manually, then all behaves as expected, ie you get some (not all!!) 500s while the overload condition exists, and when the load drops away you get all normal 200 responses again, ie elastic/tolerant performance as hoped for.

After some investigation into the lighty source it turns out that lighty is confused by the fact that PHP just fails to respond (ie times out) rather than returning FCGI_OVERLOADED. Refer to this:

http://bugs.php.net/bug.php?id=39809

where dmitry said: "PHP cannot return FCGI_OVERLOADED, because all PHP processes are busy and nobody accepts new connection. The only way to detect this situation - use connection timeout."

lighty however is sticking to the fastcgi spec and expecting the php parent to be in shutdown mode (ie its PID to disappear) when it does not respond (after which it would then respawn a new parent).
But because the PHP parent is just busy and not actually shutting down, the PID never disappears and lighty gets stuck in a loop.

I have posted a workaround involving starting PHP separately here:

http://trac.lighttpd.net/trac/ticket/1488

which also proposes a "patch" to deal with PHP's non-standard behaviour regarding FCGI_OVERLOADED.

However, the fundamental problem remains: it is very difficult for a FastCGI client to determine what is going on, and therefore what to do, when php just times out on connections rather than returning the correct FCGI_OVERLOADED response.

I did not understand dmitry's original reason for this: "PHP cannot return FCGI_OVERLOADED, because all PHP processes are busy and nobody accepts new connection."

Could you explain or perhaps review PHP's behaviour under overloaded conditions.

Thanks
Oliver

[2007-12-17 10:44:55] oliver at realtsp dot com

We have tried with http://snaps.php.net/php5.2-latest.tar.gz

Result is unchanged.

NOTE that the php workers and parent processes are still showing in ps after the crash (same as before the crash), but lighty cannot get a sensible response from them.

[EMAIL PROTECTED] /usr/ports/lang/php5]# pstree
...
 |-+- 25262 www /usr/local/sbin/lighttpd -f /usr/local/etc/lighttpd.conf
 | \-+= 25263 www /usr/local/bin/php-cgi
 |   |--- 25264 www /usr/local/bin/php-cgi
 |   |--- 25265 www /usr/local/bin/php-cgi
 |   |--- 25266 www /usr/local/bin/php-cgi
 |   |--- 25267 www /usr/local/bin/php-cgi
 |   |--- 25268 www /usr/local/bin/php-cgi
 |   |--- 25269 www /usr/local/bin/php-cgi
 |   |--- 25270 www /usr/local/bin/php-cgi
 |   |--- 25271 www /usr/local/bin/php-cgi
 |   |--- 25272 www /usr/local/bin/php-cgi
 |   |--- 25273 www /usr/local/bin/php-cgi
 |   |--- 25274 www /usr/local/bin/php-cgi
 |   |--- 25275 www /usr/local/bin/php-cgi
 |   |--- 25276 www /usr/local/bin/php-cgi
 |   |--- 25277 www /usr/local/bin/php-cgi
 |   |--- 25278 www /usr/local/bin/php-cgi
 |   \--- 25279 www /usr/local/bin/php-cgi

[2007-12-17 09:17:30] [EMAIL PROTECTED]

Please try using this CVS snapshot:

  http://snaps.php.net/php5.2-latest.tar.gz

For Windows (zip):

  http://snaps.php.net/win32/php5.2-win32-latest.zip

For Windows (installer):

  http://snaps.php.net/win32/php5.2-win32-installer-latest.msi

[2007-12-16 21:55:00] oliver at realtsp dot com

Description:
Version information below.

When I load the server with sieg