Daniel, all,
Just wanted to post the solution to this issue. I waited a significant 
amount of time to make sure we really had this solved.
The root cause was the LDAP caching mechanism. I am guessing there is a bug in 
that code that causes the server to go haywire after some number of items have 
been cached or looked up. Or perhaps a memory leak.

By disabling LDAP caching the server has been stable for 60+ days.

The last changes I made to httpd.conf were these:
< LDAPSharedCacheSize 500000
< LDAPCacheEntries 1024
---
> LDAPSharedCacheSize 0
> LDAPCacheEntries 16
352c352
< LDAPOpCacheEntries 1024
---
> LDAPOpCacheEntries 0
386c386

Hope this helps some other poor souls out there.
MJ
 


On Thursday, January 29, 2015 6:35 PM, Daniel <dferra...@gmail.com> wrote:

2015-01-30 1:03 GMT+01:00 Mark Jacquet <mark_jacq...@yahoo.com.invalid>:

Problem: The Apache server stays up for a random amount of time, usually days, 
but eventually enters a hung state. When hung, the CPU load on the machine 
gradually climbs and the server stops responding to new requests.

Error logs typically contain lots of these:

    [Wed Jan 28 16:06:58.667188 2015] [mpm_event:error] [pid 25336:tid 1] AH00485: scoreboard is full, not at MaxRequestWorkers

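To see when the hang actually begins, it can help to bucket these messages by hour. The following is only a minimal sketch, assuming the standard Apache 2.4 error-log timestamp format with a leading bracket; `STAMP` and `count_scoreboard_full` are hypothetical names, and the second and third sample lines are made up for illustration.

```python
import re
from collections import Counter

# Timestamp prefix of an Apache 2.4 error-log line, e.g.
# [Wed Jan 28 16:06:58.667188 2015] [mpm_event:error] ...
STAMP = re.compile(r"^\[\w{3} (\w{3} +\d+) (\d{2}):\d{2}:\d{2}\.\d+ (\d{4})\]")

def count_scoreboard_full(lines):
    """Count AH00485 ("scoreboard is full") lines per date-and-hour bucket."""
    counts = Counter()
    for line in lines:
        if "AH00485" not in line:
            continue  # ignore unrelated log lines
        match = STAMP.match(line.strip())
        if match:
            day, hour, year = match.groups()
            counts[f"{day} {year} {hour}:00"] += 1
    return counts

MESSAGE = ("[mpm_event:error] [pid 25336:tid 1] "
           "AH00485: scoreboard is full, not at MaxRequestWorkers")
sample = [
    f"[Wed Jan 28 16:06:58.667188 2015] {MESSAGE}",  # line from this thread
    f"[Wed Jan 28 16:07:01.000000 2015] {MESSAGE}",  # hypothetical repeat
    f"[Wed Jan 28 17:02:10.000000 2015] {MESSAGE}",  # hypothetical repeat
    "some unrelated log line",
]

for bucket, hits in count_scoreboard_full(sample).items():
    print(bucket, hits)
```

On a real server you would pass the open error log file instead of `sample`; the log path depends on your ErrorLog setting.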
I have done a lot of web research on this topic and have found many cases where 
others have had the same or a similar issue, but no real solutions. It seems 
very close to this bug report: 
https://issues.apache.org/bugzilla/show_bug.cgi?id=53555

Environment:

LDOM (VM) SunOS myhostname 5.10 Generic_118833-36 sun4v sparc SUNW,Sun-Fire-T200
8G RAM

http Conf:

StartServers                8
MinSpareServers             Not set
MaxSpareServers             Not set
ServerLimit                 256
MaxRequestWorkers           100
MaxConnectionsPerChild      1000
KeepAlive                   On
TimeOut                     3000
MaxKeepAliveRequests        50
KeepAliveTimeout            2

Current non-hung Score Board:

Server Version: Apache/2.4.10 (Unix)
Server MPM: event
Server Built: Oct 30 2014 16:29:03

Current Time: Wednesday, 28-Jan-2015 10:59:39 PST
Restart Time: Wednesday, 28-Jan-2015 09:49:21 PST
Parent Server Config. Generation: 1
Parent Server MPM Generation: 0
Server uptime: 1 hour 10 minutes 17 seconds
Server load: 0.60 0.46 0.41
Total accesses: 1134 - Total Traffic: 2.2 GB
CPU Usage: u9.07 s16.94 cu609.51 cs69.31 - 16.7% CPU load
.269 requests/sec - 0.5 MB/second - 2.0 MB/request
1 requests currently being processed, 99 idle workers

        Connections         Threads       Async connections
PID     total  accepting    busy  idle    writing  keep-alive  closing
25337   0      yes          1     24      0        0           0
25338   1      yes          0     25      1        0           0
25339   1      yes          0     25      0        0           1
25340   1      yes          0     25      0        0           1
Sum     3                   1     99      1        0           2

Any thoughts/comments on http conf tuning, OS patches, apache bug fixes 
appreciated.

This is a production server, so you can imagine, having it go down at random 
times (usually when I am asleep) is not fun!
Thanks.
MJ





Hello,
you have some odd values.
First, you don't specify ThreadsPerChild, which defaults to 25 for the event 
MPM (your status page shows 25 threads per process). Yet you do specify 
MaxRequestWorkers, which is the total number of threads across all child 
processes together, while allowing a maximum of 256 processes.
By simple math, 256 processes * 25 threads per process would yield 6400 threads 
in total, yet you are allowing a maximum of just 100, so effectively your 
server can only run 4 processes and thus, every time you restart, having no 
"spare" processes available, you will get the "scoreboard is full" message.
Consider something more logical like this for starters:
StartServers            1         <-- starts with 1 process
ServerLimit             5         <-- 4 more processes available, 5 x 200 max threads = 1000 (as you can see below, the math matches MaxRequestWorkers)
MinSpareThreads         25
MaxSpareThreads         100
ThreadsPerChild         200       <-- threads per child process
ThreadLimit             200       <-- max threads per child process
MaxRequestWorkers       1000      <-- a total of 1000 threads
MaxConnectionsPerChild  10000000
This is an example, adjust to your needs.
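
The arithmetic behind these values can be checked with a few lines of Python. This is only a rough sketch: `mpm_capacity` is a hypothetical helper, and the ThreadsPerChild value of 25 used for the original settings is an assumption taken from the status output in this thread (1 busy + 24 idle threads per process), since the directive was not set.

```python
import math

def mpm_capacity(server_limit, threads_per_child, max_request_workers):
    """Return (processes_needed, thread_ceiling) for a threaded MPM.

    processes_needed: child processes required to reach MaxRequestWorkers.
    thread_ceiling:   threads that ServerLimit alone would permit.
    """
    processes_needed = math.ceil(max_request_workers / threads_per_child)
    thread_ceiling = server_limit * threads_per_child
    return processes_needed, thread_ceiling

# Original settings: ServerLimit 256, MaxRequestWorkers 100,
# ThreadsPerChild unset (assumed 25, as seen on the status page).
print(mpm_capacity(256, 25, 100))   # (4, 6400)

# Suggested settings: ServerLimit 5, ThreadsPerChild 200, MaxRequestWorkers 1000.
print(mpm_capacity(5, 200, 1000))   # (5, 1000)
```

When the two numbers line up, as in the second call where 5 processes give exactly 1000 threads, ServerLimit and MaxRequestWorkers describe the same capacity instead of contradicting each other.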


-- 
Daniel Ferradal
IT Specialist
email         dferradal@gmail.com
linkedin      es.linkedin.com/in/danielferradal

  
