On 17 Aug 2006, at 14:14, David Haggerty wrote:

Here is our current setup:

All servers are up-to-date running OS X 10.4.5 and the most recent version of WebObjects:

- Web Server also running WOMonitor

- App Server 1 [running production app]

- App Server 2 [running production app]

- App Server 3 [running test and batch app]

We statically reference the four servers in the apache config:

WebObjectsConfig http://172.16.1.11:1085,http://17.... Etc.

For some reason App Server 3’s wotaskd will stop responding. Sometimes this corresponds with high CPU usage, other times it’s somewhat random. However, when App Server 3’s wotaskd stops responding, it causes the web server to become extremely sluggish. My guess is that it is waiting to communicate with wotaskd on all app servers and this is causing the slowdown.

Yes. It's consulting each in turn and the one on App Server 3 is wedged. I used to see this lots on WO451 single-box deployments. My fix was as follows:

[1] edit /System/Library/WebObjects/Adaptors/Apache/apache.conf on the webserver to change to a static file-based configuration.
/Library/WebObjects/Configuration/WOConfig.xml is a good filename.
To obtain the XML you need, for each of your app servers do something like

curl http://172.16.1.11:1085/cgi-bin/WebObjects/wotaskd.woa/wa/woconfig > file-for-172.16.1.11

and edit it all together.
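
If you have more than a couple of app servers, the fetch step can be looped. This is only a rough sketch: the host list is whatever you pass on the command line, the "fetch" argument and the 10-second timeout are my own choices for illustration, and you still have to merge the resulting files by hand.

```sh
#!/bin/sh
# Sketch: fetch the wotaskd configuration XML from each app server.
# Assumptions: hosts are given as arguments; the timeout is arbitrary.

# Build the woconfig URL for one host (port 1085 as in the setup above).
woconfig_url() {
    printf 'http://%s:1085/cgi-bin/WebObjects/wotaskd.woa/wa/woconfig\n' "$1"
}

# Fetch one file per host into the current directory.
fetch_woconfigs() {
    for host in "$@"; do
        # --max-time stops a wedged wotaskd from hanging the loop;
        # -f makes curl return non-zero on an HTTP error.
        curl -s -f --max-time 10 "$(woconfig_url "$host")" > "file-for-$host" \
            || echo "could not fetch config from $host" >&2
    done
}

# Only do network work when asked to: script.sh fetch host1 host2 ...
if [ "${1:-}" = "fetch" ]; then
    shift
    fetch_woconfigs "$@"
fi
```

Run it as, for example, 'sh fetch-woconfig.sh fetch 172.16.1.11' followed by your other app server addresses (the script name is made up).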

[2] Write a shell script, run periodically on each app server by cron or a similar scheduler, that runs 'netstat -n' and counts the connections to port 1085; if there are lots, it looks up wotaskd's pid and kills it. Something (depending on the age of your installation, either woservice or launchd) will then restart it.
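
Something along these lines would do for the watchdog. Treat it as a sketch under assumptions rather than what I actually ran: the threshold of 50 is a made-up number, the "run" argument is just a convenience, and finding the pid by grepping 'ps ax' for "wotaskd" assumes that string appears in the process's command line on your installation.

```sh
#!/bin/sh
# Sketch: kill wotaskd when too many connections pile up on port 1085.
# Assumptions: THRESHOLD is a guess; the pid lookup pattern may need
# adjusting for your installation.

PORT=1085
THRESHOLD=50   # hypothetical value, tune for your traffic

# Read 'netstat -n' output on stdin and print how many TCP connections
# have the given port as their local port. BSD/OS X netstat prints
# addr.port, Linux prints addr:port; both are accepted.
count_port_connections() {
    awk -v port="$1" '
        $1 ~ /^tcp/ && $6 != "LISTEN" && $4 ~ ("[.:]" port "$") { n++ }
        END { print n + 0 }
    '
}

# Succeeds when the count ($1) is greater than the limit ($2).
exceeds_threshold() {
    [ "$1" -gt "$2" ]
}

# Print pids of processes whose command line mentions wotaskd.
# The [w] trick keeps grep from matching itself. Do not give this
# script a name containing "wotaskd" or it will match itself too.
find_wotaskd_pids() {
    ps ax | grep '[w]otaskd' | awk '{ print $1 }'
}

watchdog_main() {
    n=$(netstat -n | count_port_connections "$PORT")
    if exceeds_threshold "$n" "$THRESHOLD"; then
        pids=$(find_wotaskd_pids)
        # Plain kill first; a badly wedged process may need kill -9.
        [ -n "$pids" ] && kill $pids
    fi
}

# Only act when invoked as: script.sh run
if [ "${1:-}" = "run" ]; then
    watchdog_main
fi
```

From cron you would call it with the "run" argument every few minutes. Running it without arguments does nothing, which makes it safe to source and try the counting function against saved netstat output first.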

WO apps will happily run without wotaskd, but long term you do need wotaskd to start/stop/add/remove instances etc.

As soon as I do one of the following, it recovers: restart Apache on the web server, restart wotaskd on app server 3, restart app server 3.

I tried swapping the instances on App Server 2 and 3 (I made 2 run the test and batch apps and 3 run production) and the same wotaskd problems occurred but on App Server 2 now instead of 3. Now it is probably because of high CPU usage on the batch instance, which we need to look into. But why shouldn’t the apache module be able to handle one server that is not responding and keep processing requests quickly on the remaining good app servers?

Because at some point the module in Apache on the webserver needs to renew its cache of the complete WO configuration, and it will block until this completes. So your request for a page from a woapp on one application server can cause request/response handling via the module to block if the wotaskd on a different application server is wedged.



Thanks in advance!

David



P.S. I also noticed in WOAdaptorInfo that it says the WebObjects Server Adaptor is 4.5.1. Is this correct and the most up-to-date adaptor?

Yes. The adaptor got rewritten for WO4.5 and the WO4.5.1 module is essentially the same for WO 5.x too. The 5.x adaptor has the same string in it; go to /System/Library/WebObjects/Adaptors/Apache and do 'strings mod_WebObjects.so | sort -u | grep 4'. Other strings in there show the module was built for Apache 1.3.33.

---
Regards Patrick
OneStep Solutions Plc
www.onestep.co.uk
