Hi Pete,

Well, I eliminated WeightGate for the time being, just to do my "due diligence".
Also, since there is a fixed-size buffer, I assume that actually LOWERING the 3rd number (the allocation for each non-interactive process) would allow MORE parallel processes to run (as long as the value is still large enough to support each of the applications that rely on it).

Of course, I assume the "heap" issue is in reality a SECONDARY problem (a symptom of too many non-interactive tasks being launched and not completing). Since the "heap" space is finite, there is a hard limit on how many processes can be in a wait state at the same time. The problem to focus on is not the known, limited heap, but rather the reason why these processes were unable to complete, which eventually left too many processes active.

Best Regards,
Andy

From: Pete McNeil [mailto:[EMAIL PROTECTED]
Sent: Saturday, October 04, 2008 10:07 PM
To: Andy Schmidt
Cc: [EMAIL PROTECTED]
Subject: Re: FW: [sniffer] Re: Sniffer 3.0 Froze Mail Server

Hello Andy,

Saturday, October 4, 2008, 9:22:39 PM, you wrote:

> Hi Pete,
>
> Here are the log files. I can't tell you WHEN the problem was triggered. I was off site and was alerted around noon that the SMTP service had become unresponsive. I assumed it had crashed, but found it running. Thus I tried restarting the SMTP service, but after shutting down, it wouldn't allow me to restart. That's when I started looking a bit more closely.
>
> Once I realized that I had all these SNFclient processes running, I checked the event log to see if it would give me any clue. But since the errors had been occurring for a while, my system event log had wrapped around, so I couldn't tell when it actually started or how long it took between the actual problem and the SMTP service becoming unresponsive.
>
> This Imail server is a PowerEdge 2950, quad CPU, 3 GHz, with 2 GB of RAM. It normally uses about 1.5 GB of virtual RAM, and on weekends CPU load is usually below 10%.
> When this was going on, I didn't pay close attention because I wasn't quite sure yet what was going on and was trying to figure out how to get out of it. But, based on the memory use graph, I would guess it had maxed out 4 GB of virtual RAM, which eventually starved the SMTP service and prevented it from accepting more connections. As soon as I flushed the command line programs, the memory curve dropped very sharply by easily half.
>
> Sorry - don't have anything more specific.

I've been watching your telemetry and I don't think the problem was triggered by an ordinary overload. Your message rate is not high enough to cause that -- SNFClients will only wait about 30 seconds or so at most if they are unable to make contact -- even on the busiest systems.

The other thing that strikes me is that you had to kill a lot of imailsrv.exe instances as well -- this is new and very different.

Once the "mystery heap" was exhausted I would expect SNFClient instances to build up in a broken state (0x0000142) but there is no good reason for imailsrv instances to build up that I can think of -- except maybe some kind of list processing event? (IIRC, imailsrv is called to handle list processing requests through an alias -- it's been a while).

I will check the SNF log to see if I can identify anything useful.

Thanks,

_M

--
Pete McNeil
Chief Scientist,
Arm Research Labs, LLC.
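
The fixed-size buffer reasoning in Andy's reply (lowering the 3rd number lets more processes fit) comes down to simple division. A minimal sketch in Python, assuming one fixed pool that is carved into equal allocations; the pool size and the candidate values below are hypothetical illustration numbers, not settings read from this server, and the function name is made up for this note:

```python
def max_parallel_allocations(total_kb: int, per_allocation_kb: int) -> int:
    """Return how many equal-size allocations fit in a fixed-size pool."""
    if total_kb < 0 or per_allocation_kb <= 0:
        raise ValueError("sizes must be positive")
    # Integer division: a partial allocation cannot host a process.
    return total_kb // per_allocation_kb


# Hypothetical pool size, chosen only to make the arithmetic visible.
TOTAL_POOL_KB = 48 * 1024

# Hypothetical candidates for the 3rd number (per-allocation size in KB).
for per_kb in (512, 256, 128):
    count = max_parallel_allocations(TOTAL_POOL_KB, per_kb)
    print(f"{per_kb:>4} KB each: room for {count} allocations")
```

Halving the per-allocation size doubles the count, but only raises the ceiling. If processes never complete, as the thread suspects, any ceiling is reached eventually, and a value that is too small may break the applications that rely on it.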
