Hello, PJ.

Perhaps your prefork settings are the cause of the issue.

Look, you have 80 StartServers and 120 MaxSpareServers, and with such
settings, Apache can spawn 9600 (80*120) children.

However, your ServerLimit and MaxClients (3500) are far lower than that.

I've had similar issues when the number of children Apache could spawn was
higher than the ServerLimit/MaxClients value.

Try raising the ServerLimit and MaxClients values to 9600 (make sure you
have enough memory to do so) and check what happens.
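
For the memory check, a rough sketch in Python: divide the RAM you can give
Apache by the resident size of one child. The figures used here (16000 MB
available, 40 MB per child) are made-up placeholders, not measurements from
your server, and the function names are only illustrative. Per-child resident
size counts shared pages more than once, so the result errs on the cautious
side.

```python
def max_children_for_memory(available_mb: int, per_child_mb: int) -> int:
    """Largest number of prefork children that fits in available_mb."""
    if per_child_mb <= 0:
        raise ValueError("per_child_mb must be positive")
    return available_mb // per_child_mb


def fits_in_memory(children: int, available_mb: int, per_child_mb: int) -> bool:
    """True if `children` processes of per_child_mb each fit in available_mb."""
    return children * per_child_mb <= available_mb


# Hypothetical numbers: 16000 MB for Apache, 40 MB resident per child.
print(max_children_for_memory(16000, 40))  # 400
print(fits_in_memory(9600, 16000, 40))     # False
```

Replace both numbers with what ps or top reports on the real machine before
drawing any conclusion.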

In case you can't afford such a high number of children, lower the values of
StartServers and MaxSpareServers, but keep them in line with MaxClients and
ServerLimit.

Hope this helps.


Luis Alen

On Wed, May 2, 2012 at 11:38 AM, P J <pauljfli...@gmail.com> wrote:

> On Tue, May 1, 2012 at 7:26 AM, P J <pauljfli...@gmail.com> wrote:
>
>> On Tue, May 1, 2012 at 7:22 AM, P J <pauljfli...@gmail.com> wrote:
>>
>>> On Mon, Apr 30, 2012 at 10:37 AM, P J <pauljfli...@gmail.com> wrote:
>>>
>>>>
>>>> On Mon, Apr 30, 2012 at 9:13 AM, Alexandr Normuradov <
>>>> norma...@gmail.com> wrote:
>>>>
>>>>> To troubleshoot that you should have
>>>>> at least two additional outputs from
>>>>>
>>>>> netstat -pant, with connections states
>>>>> and
>>>>> service httpd fullstatus, listing current state of all the apache
>>>>> procs/threads.
>>>>>
>>>>> What applications is your Apache serving?
>>>>> PHP? Is it mod_php, mod_python, mod_perl?
>>>>>
>>>>> What is the access log for the most accessed vhost showing?
>>>>> Any pattern of a slow, connection-consuming attack?
>>>>> If there is, and all tasks are in the Keep Alive wait state, then disable
>>>>> Keep Alive and lower the general timeout to just 7 seconds.
>>>>>
>>>>> The "connect to listener on [::]:80" error is quite unusual.
>>>>>
>>>>> ETIMEDOUT
>>>>>    Timeout while attempting connection. The server may be too busy to
>>>>> accept new connections. Note that for IP sockets the timeout may be
>>>>> very long when syncookies are enabled on the server.
>>>>>
>>>>> cat /proc/sys/fs/file-nr
>>>>>
>>>>> cat /proc/$(pidof -s httpd)/limits
>>>>>
>>>>>
>>>>> Sincerely,
>>>>> Alexandr Normalex
>>>>>
>>>>
>>>> Hi Alexandr, thanks for taking a look at this with me.
>>>>
>>>> The traffic pattern for this website is that at certain times of the day
>>>> it receives huge spikes of traffic in very short periods of time. We are
>>>> trying to tune Apache to accommodate it as best we can.
>>>>
>>>> cat /proc/$(pidof -s httpd)/limits
>>>>
>>>> Limit                     Soft Limit           Hard Limit           Units
>>>> Max cpu time              unlimited            unlimited            seconds
>>>> Max file size             unlimited            unlimited            bytes
>>>> Max data size             unlimited            unlimited            bytes
>>>> Max stack size            10485760             unlimited            bytes
>>>> Max core file size        0                    unlimited            bytes
>>>> Max resident set          unlimited            unlimited            bytes
>>>> Max processes             55296                55296                processes
>>>> Max open files            1024                 1024                 files
>>>> Max locked memory         32768                32768                bytes
>>>> Max address space         unlimited            unlimited            bytes
>>>> Max file locks            unlimited            unlimited            locks
>>>> Max pending signals       55296                55296                signals
>>>> Max msgqueue size         819200               819200               bytes
>>>> Max nice priority         0                    0
>>>> Max realtime priority     0                    0
>>>>
>>>> cat /proc/sys/fs/file-nr
>>>> 1530    0       560543
>>>>
>>>> Looking at Max open files I see what is likely the problem :)
>>>> Max open files            1024
>>>>
>>>> I swear I modified this to 4096! I've changed the limit to 4096 now,
>>>> I'll double check it tomorrow. Hopefully this will be the obvious fix!
>>>>
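
To confirm that a new limit really reached the running process, the limits
file can be parsed rather than read by eye. This is a minimal Python sketch,
assuming the column layout shown in the output above; parse_limits and
open_files_limit are illustrative names, not part of any Apache tooling.

```python
import re


def parse_limits(text):
    """Map each limit name to its (soft, hard) strings from /proc/<pid>/limits text."""
    limits = {}
    for line in text.splitlines():
        # Columns are separated by runs of two or more spaces.
        fields = re.split(r"\s{2,}", line.strip())
        if len(fields) < 3 or fields[0] == "Limit":
            continue  # skip blank lines and the header row
        limits[fields[0]] = (fields[1], fields[2])
    return limits


def open_files_limit(pid):
    """Return the (soft, hard) 'Max open files' values for a running process."""
    with open(f"/proc/{pid}/limits") as f:
        return parse_limits(f.read())["Max open files"]


sample = """\
Limit                     Soft Limit           Hard Limit           Units
Max processes             55296                55296                processes
Max open files            1024                 1024                 files
"""
print(parse_limits(sample)["Max open files"])  # ('1024', '1024')
```

Calling open_files_limit with the PID of each httpd process right after a
restart shows whether the change took effect.
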
>>>> I will check service httpd fullstatus and netstat -pant tomorrow
>>>> morning when this happens again. It happens at the same time every day.
>>>> It is not an attack; the customer's application receives massive amounts
>>>> of connections at certain times of the day.
>>>>
>>>> I've been working with Apache for 15 years and I've never seen the
>>>> "connect to listener on [::]:80" error message before. I hope it's
>>>> related to reaching Max open files.
>>>>
>>>> Thanks again for your help.
>>>>
>>>> --
>>>> PJ
>>>>
>>>>
>>> I was hoping this would be fixed now that Max open files has been
>>> updated, but we had the same issue this morning.
>>>
>>> cat /proc/$(pidof -s httpd)/limits
>>> Limit                     Soft Limit           Hard Limit           Units
>>> Max cpu time              unlimited            unlimited            seconds
>>> Max file size             unlimited            unlimited            bytes
>>> Max data size             unlimited            unlimited            bytes
>>> Max stack size            10485760             unlimited            bytes
>>> Max core file size        0                    unlimited            bytes
>>> Max resident set          unlimited            unlimited            bytes
>>> Max processes             55296                55296                processes
>>> Max open files            1024                 1024                 files
>>> Max locked memory         32768                32768                bytes
>>> Max address space         unlimited            unlimited            bytes
>>> Max file locks            unlimited            unlimited            locks
>>> Max pending signals       55296                55296                signals
>>> Max msgqueue size         819200               819200               bytes
>>> Max nice priority         0                    0
>>> Max realtime priority     0                    0
>>>
>>> Once it reaches 1000 total children
>>>
>>> [info] server seems busy, (you may need to increase StartServers, or
>>> Min/MaxSpareServers), spawning 32 children, there are 17 idle, and 1002
>>> total children
>>>
>>> After 1000 total children
>>>
>>> mpm_common.c(663): (70007)The timeout specified has expired: connect to
>>> listener on [::]:80
>>> mpm_common.c(663): (70007)The timeout specified has expired: connect to
>>> listener on [::]:80
>>> mpm_common.c(663): (70007)The timeout specified has expired: connect to
>>> listener on [::]:80
>>>
>>> Until apache is restarted.
>>>
>>> I tried to run service httpd fullstatus during this time but
>>> it wasn't able to connect:
>>>
>>> ELinks: Connection refused.
>>>
>>> I did capture the output of netstat -pant which shows many connections
>>> to the MySQL DB as well.
>>> I've double checked MySQL has not reached max connections and that it's
>>> still working during this time.
>>>
>>> netstat output is so big I have to put it up on pastebin:
>>> http://pastebin.com/0DjvDnJp
>>>
>>> I don't understand why this is happening at 1000 children. What limit is
>>> it hitting?
>>>
>>> Apache config:
>>>
>>> Timeout 30
>>>
>>> KeepAlive On
>>> MaxKeepAliveRequests 10000
>>> KeepAliveTimeout 3
>>>
>>> <IfModule prefork.c>
>>> StartServers      80
>>> MinSpareServers   50
>>> MaxSpareServers  120
>>> ServerLimit     3500
>>> MaxClients      3500
>>> MaxRequestsPerChild  4000
>>> </IfModule>
>>>
>>>
>>> Any help would be greatly appreciated.
>>>
>>> --
>>> PJ
>>>
>>>
>> Haha, Max open files still says 1024!! I hardcoded it to 16384 yesterday,
>> something keeps resetting it!
>>
>> Let me figure this out before I keep bugging the list :)
>>
>> Thanks,
>>
>> --
>> PJ
>>
>>
> Same issue this morning:
>
> [Wed May 02 07:01:57 2012] [info] server seems busy, (you may need to
> increase StartServers, or Min/MaxSpareServers), spawning 32 children, there
> are 48 idle, and 1004 total children
>
> [Wed May 02 07:02:16 2012] [debug] mpm_common.c(663): (70007)The timeout
> specified has expired: connect to listener on [::]:80
> [Wed May 02 07:02:23 2012] [debug] mpm_common.c(663): (70007)The timeout
> specified has expired: connect to listener on [::]:80
> [Wed May 02 07:02:30 2012] [debug] mpm_common.c(663): (70007)The timeout
> specified has expired: connect to listener on [::]:80
>
> --snip--
>
> And the site was down.
>
> I've confirmed the Max open files setting has been fixed:
>
> Max open files            16384                16384                files
>
> Anyone else have any insight on what the "(70007)The timeout specified has
> expired: connect to listener on [::]:80" error is and why it happens every
> day after reaching 1000 children?
>
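
To see exactly where the child count stalls each day, the "server seems busy"
lines can be pulled out of the error log. This is a small Python sketch,
assuming each entry sits on one line in the real log file (the wrapping above
comes from the mail client); busy_events is an illustrative name and the
sample lines are copied from the excerpt above.

```python
import re

# Matches the counters in the prefork "server seems busy" info message.
BUSY = re.compile(
    r"spawning (\d+) children, there are (\d+) idle, and (\d+) total children"
)


def busy_events(lines):
    """Return (spawning, idle, total) tuples for every matching log line."""
    events = []
    for line in lines:
        match = BUSY.search(line)
        if match:
            spawning, idle, total = map(int, match.groups())
            events.append((spawning, idle, total))
    return events


sample = [
    "[Wed May 02 07:01:57 2012] [info] server seems busy, (you may need to "
    "increase StartServers, or Min/MaxSpareServers), spawning 32 children, "
    "there are 48 idle, and 1004 total children",
    "[Wed May 02 07:02:16 2012] [debug] mpm_common.c(663): (70007)The timeout "
    "specified has expired: connect to listener on [::]:80",
]
print(busy_events(sample))  # [(32, 48, 1004)]
```

Feeding it the whole error log and looking at the last few totals before the
first (70007) line shows whether the count always stops at the same value.
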
> Not sure where else to look.
>
> Thanks in advance.
>
> --
> PJ
>
