Hello, PJ. Perhaps your prefork settings are the cause of the issue.
Look, you have StartServers set to 80 and MaxSpareServers set to 120. With those settings, Apache can spawn 9600 (80*120) children. However, your ServerLimit and MaxClients (3500) are far lower than that. I've had similar issues when the number of children Apache could spawn was higher than the ServerLimit/MaxClients value.

Try raising ServerLimit and MaxClients to 9600 (make sure you have enough memory to do so) and check what happens. If you can't afford that many children, lower StartServers and MaxSpareServers, but keep them consistent with MaxClients and ServerLimit.

Hope this helps.

Luis Alen

On Wed, May 2, 2012 at 11:38 AM, P J <pauljfli...@gmail.com> wrote:

> On Tue, May 1, 2012 at 7:26 AM, P J <pauljfli...@gmail.com> wrote:
>
>> On Tue, May 1, 2012 at 7:22 AM, P J <pauljfli...@gmail.com> wrote:
>>
>>> On Mon, Apr 30, 2012 at 10:37 AM, P J <pauljfli...@gmail.com> wrote:
>>>
>>>>
>>>> On Mon, Apr 30, 2012 at 9:13 AM, Alexandr Normuradov <
>>>> norma...@gmail.com> wrote:
>>>>
>>>>> To troubleshoot that you should have
>>>>> at least two additional outputs from
>>>>>
>>>>> netstat -pant, with connection states
>>>>> and
>>>>> service httpd fullstatus, listing current state of all the apache
>>>>> procs/threads.
>>>>>
>>>>> What applications is your Apache serving?
>>>>> PHP? Is it mod_php, mod_python, mod_perl?
>>>>>
>>>>> What is the vhost access log file for the most accessed vhost showing?
>>>>> Any pattern of a slow, connection-consuming attack?
>>>>> If there is, and all tasks are in the Keep Alive wait, then disable Keep
>>>>> Alive and lower the general timeout to just 7 seconds.
>>>>>
>>>>> The error "connect to listener on [::]:80" is quite unusual.
>>>>>
>>>>> ETIMEDOUT
>>>>> Timeout while attempting connection. The server may be too busy to
>>>>> accept new connections. Note that for IP sockets the timeout may be
>>>>> very long when syncookies are enabled on the server.
>>>>>
>>>>> cat /proc/sys/fs/file-nr
>>>>>
>>>>> cat /proc/$(pidof -s httpd)/limits
>>>>>
>>>>>
>>>>> Sincerely,
>>>>> Alexandr Normalex
>>>>>
>>>>
>>>> Hi Alexandr, thanks for taking a look at this with me.
>>>>
>>>> The traffic pattern for this website is that at certain times of the
>>>> day it receives huge spikes of traffic in very short periods of time.
>>>> We are trying to tune Apache to accommodate it as best we can.
>>>>
>>>> cat /proc/$(pidof -s httpd)/limits
>>>>
>>>> Limit                  Soft Limit   Hard Limit   Units
>>>> Max cpu time           unlimited    unlimited    seconds
>>>> Max file size          unlimited    unlimited    bytes
>>>> Max data size          unlimited    unlimited    bytes
>>>> Max stack size         10485760     unlimited    bytes
>>>> Max core file size     0            unlimited    bytes
>>>> Max resident set       unlimited    unlimited    bytes
>>>> Max processes          55296        55296        processes
>>>> Max open files         1024         1024         files
>>>> Max locked memory      32768        32768        bytes
>>>> Max address space      unlimited    unlimited    bytes
>>>> Max file locks         unlimited    unlimited    locks
>>>> Max pending signals    55296        55296        signals
>>>> Max msgqueue size      819200       819200       bytes
>>>> Max nice priority      0            0
>>>> Max realtime priority  0            0
>>>>
>>>> cat /proc/sys/fs/file-nr
>>>> 1530 0 560543
>>>>
>>>> Looking at Max open files I see what is likely the problem :)
>>>> Max open files 1024
>>>>
>>>> I swear I modified this to 4096! I've changed the limit to 4096 now,
>>>> I'll double check it tomorrow. Hopefully this will be the obvious fix!
>>>>
>>>> I will check service httpd fullstatus and netstat -pant tomorrow
>>>> morning when this happens again. It happens at the same time every day.
>>>> It is not an attack; the customer's application receives massive amounts
>>>> of connections at certain times of the day.
>>>>
>>>> I've been working with Apache for 15 years and I've never seen the
>>>> "connect to listener on [::]:80" error message before. I hope it's
>>>> related to reaching Max open files.
>>>>
>>>> Thanks again for your help.
>>>>
>>>> --
>>>> PJ
>>>>
>>>>
>>> I was hoping this would be fixed now that Max open files has been
>>> updated, but I had the same issue this morning.
>>>
>>> cat /proc/$(pidof -s httpd)/limits
>>> Limit                  Soft Limit   Hard Limit   Units
>>> Max cpu time           unlimited    unlimited    seconds
>>> Max file size          unlimited    unlimited    bytes
>>> Max data size          unlimited    unlimited    bytes
>>> Max stack size         10485760     unlimited    bytes
>>> Max core file size     0            unlimited    bytes
>>> Max resident set       unlimited    unlimited    bytes
>>> Max processes          55296        55296        processes
>>> Max open files         1024         1024         files
>>> Max locked memory      32768        32768        bytes
>>> Max address space      unlimited    unlimited    bytes
>>> Max file locks         unlimited    unlimited    locks
>>> Max pending signals    55296        55296        signals
>>> Max msgqueue size      819200       819200       bytes
>>> Max nice priority      0            0
>>> Max realtime priority  0            0
>>>
>>> Once it reaches 1000 total children
>>>
>>> [info] server seems busy, (you may need to increase StartServers, or
>>> Min/MaxSpareServers), spawning 32 children, there are 17 idle, and 1002
>>> total children
>>>
>>> After 1000 total children
>>>
>>> mpm_common.c(663): (70007)The timeout specified has expired: connect to
>>> listener on [::]:80
>>> mpm_common.c(663): (70007)The timeout specified has expired: connect to
>>> listener on [::]:80
>>> mpm_common.c(663): (70007)The timeout specified has expired: connect to
>>> listener on [::]:80
>>>
>>> Until apache is restarted.
>>>
>>> I tried to run service httpd fullstatus during this time but
>>> it wasn't able to connect:
>>>
>>> ELinks: Connection refused.
>>>
>>> I did capture the output of netstat -pant which shows many connections
>>> to the MySQL DB as well.
>>> I've double checked MySQL has not reached max connections and that it's
>>> still working during this time.
>>>
>>> netstat output is so big I have to put it up on pastebin:
>>> http://pastebin.com/0DjvDnJp
>>>
>>> I don't understand why this is happening at 1000 children. What limit is
>>> it hitting?
>>>
>>> Apache config:
>>>
>>> Timeout 30
>>>
>>> KeepAlive On
>>> MaxKeepAliveRequests 10000
>>> KeepAliveTimeout 3
>>>
>>> <IfModule prefork.c>
>>> StartServers 80
>>> MinSpareServers 50
>>> MaxSpareServers 120
>>> ServerLimit 3500
>>> MaxClients 3500
>>> MaxRequestsPerChild 4000
>>> </IfModule>
>>>
>>>
>>> Any help would be greatly appreciated.
>>>
>>> --
>>> PJ
>>>
>>>
>> Haha, Max open files still says 1024!! I hardcoded it to 16384 yesterday,
>> something keeps resetting it!
>>
>> Let me figure this out before I keep bugging the list :)
>>
>> Thanks,
>>
>> --
>> PJ
>>
>>
> Same issue this morning:
>
> [Wed May 02 07:01:57 2012] [info] server seems busy, (you may need to
> increase StartServers, or Min/MaxSpareServers), spawning 32 children, there
> are 48 idle, and 1004 total children
>
> [Wed May 02 07:02:16 2012] [debug] mpm_common.c(663): (70007)The timeout
> specified has expired: connect to listener on [::]:80
> [Wed May 02 07:02:23 2012] [debug] mpm_common.c(663): (70007)The timeout
> specified has expired: connect to listener on [::]:80
> [Wed May 02 07:02:30 2012] [debug] mpm_common.c(663): (70007)The timeout
> specified has expired: connect to listener on [::]:80
>
> --snip--
>
> And the site was down.
>
> I've confirmed the Max open files setting has been fixed:
>
> Max open files 16384 16384 files
>
> Does anyone else have any insight into what the "(70007)The timeout
> specified has expired: connect to listener on [::]:80" error is and why it
> happens every day after reaching 1000 children?
>
> Not sure where else to look.
>
> Thanks in advance.
>
> --
> PJ
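
Since the Max open files value kept reverting in this thread, it may help to script the check of /proc/<pid>/limits instead of reading it by eye. The following is only a minimal Python sketch, assuming a Linux /proc filesystem and the usual layout of the limits file (columns separated by runs of spaces). The function names parse_proc_limits and soft_limit are made up for illustration and are not part of Apache or any library.

```python
import re
import sys


def parse_proc_limits(text):
    """Parse the text of /proc/<pid>/limits into {name: (soft, hard, units)}.

    Assumes columns are separated by two or more spaces, while limit
    names such as "Max open files" contain only single spaces.
    """
    limits = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the "Limit  Soft Limit  Hard Limit  Units" header.
        if not line or line.startswith("Limit"):
            continue
        fields = re.split(r"\s{2,}", line)
        if len(fields) < 3:
            continue
        # Some rows (e.g. "Max nice priority") have no units column.
        units = fields[3] if len(fields) > 3 else ""
        limits[fields[0]] = (fields[1], fields[2], units)
    return limits


def soft_limit(text, name):
    """Return the soft limit for `name` as an int, or None if unlimited."""
    soft = parse_proc_limits(text)[name][0]
    return None if soft == "unlimited" else int(soft)


if __name__ == "__main__":
    # Usage (hypothetical script name): python3 check_limits.py <pid>
    pid = sys.argv[1] if len(sys.argv) > 1 else "self"
    with open(f"/proc/{pid}/limits") as fh:
        print("Max open files (soft):", soft_limit(fh.read(), "Max open files"))
```

Passing the PID printed by pidof -s httpd checks the running parent process, which is the value that matters here, since a limit changed in a shell or config file does not apply until httpd is restarted under it.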