I think I found the problem (... and it's not the cgid exiting problem).
The problem was that the default listen backlog in mod_cgid was a little small (100 outstanding connections). I got the following tusc log for the httpd processes:
{62717} connect(13, 0x9fffffffbebbfe60, 94) .............. ERR#239 ECONNREFUSED
{62849} connect(29, 0x9fffffffbdcbfea0, 94) .............. ERR#239 ECONNREFUSED
I increased the listen backlog and the processes are a little happier now :)
BTW, I had to increase the "ListenBacklog" for httpd also - the default 512 caused too many "Can't connect" errors. I increased it to 1024.
-Madhu
I'm glad you're making progress. But I'm wondering why raising the mod_cgid listen backlog was so important. If 100 pending mod_cgid connections weren't enough at some point, either the workload is spiky or the steady-state arrival rate of CGI requests is too high for one daemon plus your SPECweb99 CGI program to keep up with.
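To make the two cases concrete, here is a rough back-of-the-envelope sketch in Python. It is only a simple fluid model, not anything taken from mod_cgid: the function names and the request rates are made up for illustration, and the only number from this thread is the backlog of 100.

```python
def seconds_until_overflow(arrival_rate, service_rate, backlog=100):
    """Seconds until the pending-connection queue exceeds `backlog`
    under a steady load, or None if the queue never grows.

    Rates are in requests per second; both are hypothetical inputs.
    """
    growth = arrival_rate - service_rate  # net connections queued per second
    if growth <= 0:
        return None  # the daemon keeps up, so the backlog never fills
    return backlog / growth


def peak_queue_during_burst(burst_rate, service_rate, burst_seconds):
    """Peak queue depth reached by one burst that starts with an empty queue."""
    return max(0.0, (burst_rate - service_rate) * burst_seconds)


if __name__ == "__main__":
    # Steady state slightly too fast: the backlog fills and then stays full.
    print(seconds_until_overflow(220, 200))      # 5.0
    # Steady state slow enough: no overflow from sustained load alone.
    print(seconds_until_overflow(180, 200))      # None
    # Spiky load: a half-second burst can still exceed a backlog of 100.
    print(peak_queue_during_burst(500, 200, 0.5))  # 150.0
```

In the first case a bigger backlog only delays the refusals; in the spiky case it can absorb the burst outright, which is why it matters which of the two you are seeing.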
If the latter is true, you should see more and more CGI requests building up over time in server-status with ExtendedStatus on, and a big improvement in throughput if you set DYNAMIC_CGI_GET=0 in the SPEC rc file. Then it would be worthwhile using tusc/strace/truss with some timing options set to look for unnecessary delays in mod_cgid. If that doesn't show a problem, perhaps we should have multiple cgid daemons running in parallel for best throughput. Someone brought up that idea a while back on [EMAIL PROTECTED]; cross-posting there.
Greg