Hello,

From my experience, I think the problem may be related to running a 64-bit build.
I have servers running 32-bit AOLserver and 64-bit AOLserver, and I have seen the 64-bit one grow faster in memory (and never shrink over time) until it takes a considerable amount of memory, at which point I have to restart it. Also, roughly every 2-3 hours, AOLserver goes wild and eats 100% of one CPU core for around 1 minute; it keeps serving requests during that time, but more slowly than usual.

My 32-bit server runs FreeBSD 7, and my 64-bit server runs an up-to-date Debian Linux. I don't know whether the cause is the OS or the 32/64-bit difference, but the fact is that my 64-bit Debian Linux server has problems that the 32-bit FreeBSD server doesn't.

Regards,

Juan José

-
Juan José del Río | (+34) 616 512 340 | [EMAIL PROTECTED]
Simple Option S.L.
Tel: (+34) 951 930 122
Fax: (+34) 951 930 122
http://www.simpleoption.com

On Wed, 2008-10-29 at 06:53 -0400, Scott Goodwin wrote:
> It appears that you have the same problem in all of your servers; the
> goal is to find out what part of the code is failing and under what
> conditions. Three things stand out: failed servers are under a heavier
> load than those that don't exhibit the failure; the failure happens
> shortly after shifting a load via pound onto an already running
> aolserver instance; the failure happens after a reload of your procs
> on that server. Since you've said that none of the loads is heavy I
> don't think this problem is triggered by aolserver being overwhelmed
> with traffic. This leaves two things: the shifting of the load itself
> to an already running aolserver instance and the reloading of the
> procs on that aolserver instance. I suspect the problem is related to
> reloading the procs with ns_eval, and not due to load shifting or load
> volume, but we need to confirm that.
>
> Is there a way you can run an aolserver instance directly answering
> queries without using pound? Maybe you could set up a test server that
> you then use http_load or apache bench on.
> Once running, hit it with a
> load and see if it stays up for at least 10-20 minutes. If it does, do
> a reload of your procs on that server without doing anything else --
> what I expect is that the aolserver instance will crash shortly after
> doing the proc reload. You can then restart the server and try it
> again, this time reloading the procs immediately. Then repeat, but
> reload the procs after 5 minutes or so. In each case, determine how
> long it takes the server to crash after the proc reload (make sure the
> aolserver instance has started and continues to serve connections
> before, during and after the reload).
>
> If anyone else is experiencing the same problems, please post your
> information along with your configuration.
>
> /s.
>
> On Oct 29, 2008, at 5:29 AM, Rami Jadaa wrote:
>
> > Hi Scott,
> > Thanks for your reply.
> >
> > I don't think that I can send the log as it will be so big, as
> > AOLserver initiates and loads a lot of ACS code...
> >
> > And for the checksum, we did the following:
> > Using pound, we shifted the load going to this webserver to another
> > server on another machine where it uses a different local copy of
> > the same application, and then after the reload, the server we
> > shifted the load to crashed, and the old one didn't!!
> > So I can rule out file corruption, right?
> >
> > On Tue, Oct 28, 2008 at 7:50 PM, Scott Goodwin <[EMAIL PROTECTED]>
> > wrote:
> >     Rami,
> >
> >     Tcl is attempting to create a new hash table entry on a hash
> >     table that was either never created or was created but has
> >     ceased to exist -- most likely the pointer to that hash
> >     table is null or corrupted. This could be something in
> >     AOLserver that uses the Tcl_Hash* API. First steps:
> >
> >     1. Send a copy of the nslog output for a clean startup
> >     through to the point where it crashes; that might indicate
> >     where it's getting fouled up.
> >     If that portion of the nslog
> >     is not very long (say no more than 100-150 lines) you can
> >     cut and paste into the message; otherwise attach it as a
> >     separate file (but limit it to the smallest necessary size
> >     -- don't want multimegabyte files).
> >
> >     2. Do a checksum of all your own Tcl code files used by
> >     AOLserver on a known good machine and those same Tcl files
> >     on the bad one; compare the two outputs to see what Tcl
> >     files on the bad machine differ from the good one.
> >     Investigate those differences.
> >
> >     /s.
> >
> >     On Oct 28, 2008, at 10:48 AM, Rami Jadaa wrote:
> >
> > > Hello Everyone,
> > >
> > > We are running multiple instances of AOLserver on
> > > different machines, and I am enjoying the reload
> > > functionality to reload the proc libraries using ns_eval
> > > source {fileName} in each one of them...
> > >
> > > However, one of the AOLservers crashes a few minutes
> > > after the reload.
> > >
> > > The strange thing is that this is the only AOLserver that
> > > crashes, while others don't!!! and I noticed that just
> > > before the crash, the following error happens (which means
> > > something in the C breaks, and I am assuming that it could
> > > be in the Tcl interpreter, currently Tcl 8.4.16, not
> > > AOLserver... but this is only an assumption):
> > >
> > > "called Tcl_CreateHashEntry on deleted table"
> > >
> > > We use this server to serve multiple domains and have a
> > > pound load balancer in front. For example, if the
> > > request comes for www.xyz.com we serve xyz service related
> > > site and contents and if the request comes for www.abc.com
> > > we serve abc related contents and site. In total we are
> > > serving around 25 different sites like this. We are not
> > > using any virtual hosting module or feature of AOLserver.
> > > The total traffic of the server is not high.
> > >
> > > Any idea anybody!!!
> > > Has anyone using the reload
> > > functionality noticed that it could crash the AOLserver?
> > >
> > > Environment:
> > > AOLserver 4.0.10, fetched from CVS almost 6 months back.
> > > nsoracle Oracle Driver version 2.8a1
> > > nsmysql CVS
> > > Oracle 10gR2 Libraries
> > > AMD x86_64 RHEL 4
> > > Currently Tcl 8.4.16, also tried Tcl 8.4.11
> > >
> > > Please help as this is driving me crazy :(
> > >
> > > Thanks in advance
> > >
> > > --
> > > AOLserver - http://www.aolserver.com/
> > >
> > > To Remove yourself from this list, simply send an email to
> > > <[EMAIL PROTECTED]> with the body of "SIGNOFF AOLSERVER" in the
> > > email message. You can leave the Subject: field of your email blank.
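
P.S. For anyone who wants to repeat the checksum comparison from step 2 of Scott's quoted message, here is a rough sketch in Python (not Tcl, simply because it is easy to run on both machines). Everything in it is an assumption for illustration: the function names checksum_tree and diff_checksums are made up, the "*.tcl" filter may not match how your libraries are named, and SHA-256 is just one reasonable choice of checksum. It only tells you which files differ, not why the server crashes.

```python
import hashlib
from pathlib import Path


def checksum_tree(root, pattern="*.tcl"):
    """Map each matching file (path relative to root) to its SHA-256 hex digest.

    The "*.tcl" pattern is an assumption; adjust it to your own layout.
    """
    root = Path(root)
    sums = {}
    for path in sorted(root.rglob(pattern)):
        if path.is_file():
            rel = path.relative_to(root).as_posix()
            sums[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return sums


def diff_checksums(good, bad):
    """Compare two checksum maps (known-good machine vs. suspect machine)."""
    return {
        # Present on both machines but with different contents.
        "changed": sorted(f for f in good.keys() & bad.keys() if good[f] != bad[f]),
        # Present on the good machine only.
        "missing_on_bad": sorted(good.keys() - bad.keys()),
        # Present on the suspect machine only.
        "extra_on_bad": sorted(bad.keys() - good.keys()),
    }
```

One way to use it: run checksum_tree on each machine against the directory holding your own Tcl libraries, save the resulting dictionaries (for example as JSON), copy one of them across, and pass both to diff_checksums. Any file listed under "changed" is a candidate for closer inspection.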