You may want to double-check any soft_timeout()s. If one of them goes off, no further timeout is left pending, and the code is expected to check connection->aborted properly before doing anything else. A quick glance doesn't show any obvious place where that check is missing, but it is possible. Temporarily replacing all soft_timeout()s with hard_timeout()s would tell you whether this is the problem, although it could break some things; I haven't really looked at the code...
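
To make the soft_timeout() concern concrete, here is a rough standalone sketch. It is not Apache code: struct conn, current_conn, soft_timeout_handler() and copy_chunks() are made-up names, and I'm assuming a soft timeout boils down to "set the aborted flag and return" rather than jumping out of the request. Under that assumption, any loop that runs after the timer has fired has to notice the flag itself, because nothing else will interrupt it.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-connection state; only the aborted
 * flag matters for this sketch. */
struct conn {
    volatile sig_atomic_t aborted;
};

/* Connection the pending timer applies to (hypothetical global). */
static struct conn *current_conn;

/* Soft-timeout style handler: record the abort and return, so the
 * interrupted code keeps running and must notice the flag itself. */
static void soft_timeout_handler(int sig)
{
    (void)sig;
    if (current_conn != NULL)
        current_conn->aborted = 1;
}

/* Handle up to nchunks pieces, checking the flag before each one.
 * step may be NULL; otherwise it stands in for one read/write.
 * Returns the number of chunks handled before the abort was seen. */
static int copy_chunks(struct conn *c, int nchunks, void (*step)(int))
{
    int done = 0;

    assert(c != NULL);
    while (done < nchunks) {
        if (c->aborted)
            break;          /* no timer is left to rescue us past here */
        if (step != NULL)
            step(done);
        done++;
    }
    return done;
}
```

A loop that skips the c->aborted test would carry on into the next blocking read with no alarm pending, which is the kind of hang worth looking for.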
On Sun, 13 Apr 1997, Brian Moore wrote:

> Nope, it's an http transfer (but a large one). Not sure what it is, though:
> it seems to be that alarm(0) is getting called [which in my looking at the
> code is a bad thing to do] somewhere. The client request on this is on the
> other side of a packet filtering router, but at 10Mbps, so it shouldn't be
> a client timeout.
>
> Since it got through the flush-the-buffer stuff in saferead(), I think it's
> not the speed, just the dropped alarm. See, I printed out the value of
> alarms_blocked, which in theory should mean it's not blocked. :) I've left
> a couple of these children running (there were three transferring files
> via http from www.cdrom.com). The specific URLs (though I doubt it matters):
> GET http://www.cdrom.com/pub/quake/quakec/weapons/mini20.zip
> and
> GET http://www.cdrom.com/pub/quake/quakec/weapons/pnc1_02b.zip
> I killed the third demon-child -- those two are still running.
>
> Since we have about 50 machines on the far side of this router using illegal
> IP's, it may be hard to spot: they do a good amount of web/ftp access and
> it all runs through Apache, so it's a rare occurrence.
>
> When tracking down the missing unblocks, I did insert some code to
> whine... something like:
>
>     alarm_save = alarm(0);
>     if (!alarm_save)
>         syslog(LOG_DAEMON | LOG_EMERG, "saferead, no alarm! %ld", (long) getpid());
>     else
>         alarm(alarm_save);
>
> When that code was in place in 1.2b7 I did see a bunch of times saferead
> was getting called with no alarm, which shouldn't happen (though I will
> confess I don't know what alarm() returns if say 1/2 a second is remaining
> on Solaris).
>
> I'll see if I can find who was downloading quake this week and if they did
> anything like abort it or anything.
>
> On Sun, 13 Apr 1997, Chuck Murcko wrote:
>
> > Thanks for the report, Brian. It looks like a large file transfer is
> > indeed punching through a soft timeout. I assume these are FTP
> > transfers? I can duplicate your environment, so I should see the problem
> > when I test for it.
> >
> > Brian Moore wrote:
> > >
> > > >Number: 374
> > > >Category: mod_proxy
> > > >Synopsis: mod_proxy(?) seems to alarm(0) somewhere
> > > >Confidential: no
> > > >Severity: serious
> > > >Priority: medium
> > > >Responsible: apache (Apache HTTP Project)
> > > >State: open
> > > >Class: sw-bug
> > > >Submitter-Id: apache
> > > >Arrival-Date: Sun Apr 13 12:00:01 1997
> > > >Originator: [EMAIL PROTECTED]
> > > >Organization:
> > > apache
> > > >Release: 1.2b8
> > > >Environment:
> > > Solaris 2.5, all recommended patches, gcc 2.7.2
> > > >Description:
> > > Looks like there's one other problem in mod_proxy with alarms being
> > > turned off (not blocked via the block_alarms() call, but alarm(0)'d for
> > > some reason). I'm guessing on the module involved, since the three dead
> > > children this morning were all doing proxy stuff.
> > >
> > > The backtrace of a child that's been waiting for 110k seconds:
> > > #0 0xef67792c in _read ()
> > > #1 0x29364 in saferead ()
> > > #2 0x29480 in bread ()
> > > #3 0x488b0 in proxy_send_fb ()
> > > #4 0x47e78 in proxy_http_handler ()
> > > #5 0x432c0 in proxy_handler ()
> > > #6 0x1f040 in invoke_handler ()
> > > #7 0x21dc0 in process_request_internal ()
> > > #8 0x21df4 in process_request ()
> > > #9 0x1bf30 in child_main ()
> > > #10 0x1c0cc in make_child ()
> > > #11 0x1c8c8 in standalone_main ()
> > > #12 0x1cb88 in main ()
> > > (gdb) up
> > > #1 0x29364 in saferead ()
> > > (gdb) print alarms_blocked
> > > $1 = 0
> > >
> > > So this seems to be something calling alarm(0) somewhere instead of a
> > > 'logical' alarms-off via the official mechanism.
> > >
> > > >How-To-Repeat:
> > > Not sure: virtually all of our proxy users are on a 10Mbps ethernet but
> > > behind a firewall. This usage may or may not be relevant. The children I
> > > found dead this morning were fetching files from cdrom.com via http, so
> > > it should be normal; the only odd thing is that these were quake files
> > > so they were no doubt huge.
> > > >Fix:
> > > Will be looking at the code myself this week
> > > >Audit-Trail:
> > > >Unformatted:
> >
> > --
> > chuck
> > Chuck Murcko
> > The Topsail Group, West Chester PA USA
> > [EMAIL PROTECTED]
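
On the question above of what alarm() returns with under a second remaining: POSIX only promises a non-zero value while a previous alarm is still pending, and I don't know offhand how Solaris rounds it. A check that avoids the question, and never cancels the alarm in the first place, is to read the timer with getitimer(). This is only a sketch: alarm_pending() is a made-up helper, and it assumes alarm() is built on the ITIMER_REAL timer on the platform in question, which POSIX does not guarantee.

```c
#include <assert.h>
#include <sys/time.h>
#include <unistd.h>

/* Report whether the real-time interval timer is armed, without
 * disturbing it. On success stores the time left in *left and returns
 * 1 (armed) or 0 (not armed); returns -1 if getitimer() fails. */
static int alarm_pending(struct timeval *left)
{
    struct itimerval it;

    assert(left != NULL);
    if (getitimer(ITIMER_REAL, &it) != 0)
        return -1;
    *left = it.it_value;
    return (left->tv_sec != 0 || left->tv_usec != 0) ? 1 : 0;
}
```

Calling this at the top of saferead() and logging when it returns 0 would give the same "no alarm!" warning as the alarm(0)/alarm(alarm_save) pair, but without the window in which the alarm is actually switched off.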
