Thanks all for the insight...

And just as Charlie predicted, the event happened again without tripping the alert.

One benefit, even though I was not actively watching what happened, I did have the server monitor running. The event happened when the server was only using 440MB, with 1.2GB free in the jvm allocation. So it definitely is not a memory issue. (It also happened in between GC cycles, so that isn't the issue either.)

As for the possibility of the CPU, I won't discount this, but I doubt it would be from CF. We do use CFDocument/CFPDF, which I know grab resources, but normally those pages run during the morning, and the event actually happened twice last night at a time when I would not expect it.

I'll have to gather more information. I'm starting to think that the cause may be outside CF. I'm going to try to look at all the system logs and try to piece together exactly what was happening at the time of the event.
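For piecing the timeline together, here is a minimal Python sketch (not something from this thread) that pulls the timestamps of the "too busy or out of memory" notices out of Apache error-log text, so they can be lined up against other system logs. The pattern is based only on the two log entries quoted further down in this thread; the names `parse_events`, `events_per_hour`, and `SAMPLE` are made up for illustration, and the timestamp parsing assumes an English locale.

```python
import re
from collections import Counter
from datetime import datetime

# Pattern modelled on the two jrApache notices quoted in this thread.
EVENT_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] \[notice\] jrApache\[[^\]]*\]\s+"
    r"returning error page\s+for JRun too busy or out of memory"
)

def parse_events(text):
    """Return a datetime for each 'too busy or out of memory' notice."""
    # Some mail archives wrap an entry before "for JRun"; rejoin such lines.
    joined = re.sub(r"\s*\n(?=for JRun)", " ", text)
    events = []
    for line in joined.splitlines():
        match = EVENT_RE.match(line)
        if match:
            stamp = datetime.strptime(match.group("ts"), "%a %b %d %H:%M:%S %Y")
            events.append(stamp)
    return events

def events_per_hour(events):
    """Count notices per clock hour, to compare against other logs."""
    return Counter(e.strftime("%Y-%m-%d %H:00") for e in events)

# Sample input: the two entries from this thread, one wrapped, one not.
SAMPLE = (
    "[Thu Aug 08 14:40:14 2013] [notice] jrApache[2937: 31182]  returning error page \n"
    "for JRun too busy or out of memory\n"
    "[Thu Aug 08 15:50:09 2013] [notice] jrApache[1787: 63699]  "
    "returning error page for JRun too busy or out of memory\n"
)

for hour, count in sorted(events_per_hour(parse_events(SAMPLE)).items()):
    print(hour, count)
```

In practice the text would come from reading the Apache error log file rather than from `SAMPLE`; the resulting timestamps are what to search for in /var/log/messages and the CF logs.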

Another question though... FusionReactor monitors the entire system, not just CF, right? (i.e. it can track running system processes, not just what CF is doing.) If this is true, this may be the next step if my efforts are fruitless.

Thanks,
Frank


On 08/09/2013 12:55 AM, Charlie Arehart wrote:

Like you, I would think this is not memory related. I think that's just a really old error message, from the days when even the then Macromedia engineers could only throw up their hands and guess when something was amiss.

I recently saw this error message happening for a client where we found (since they were on IIS) that the jrun_iis6_wildcardxxxx.logs (in [ColdFusion9]\runtime\lib\wsconfig\nn\LogFiles) had indications of errors also. I realize you're on Apache, and you say you looked at all the logs, but did you check out those logs in that wsconfig dir and its subdirs? It's just a stab in the dark whether any log messages there (around the same time) will be useful.

I would focus on something making the CF instance not responsive. I know you said you raised the simult request threads from 10 to 40, and it seemed fine at 10. But maybe you have new load, or a new problem that makes requests hang.

As Ajas said, FR (or as you're using it, the CF Server Monitor) can show you any running requests (the CFSM only shows them if you turn on "start monitoring"). If you can be on when it happens you may be surprised what you find. If all 10 (or now 40) are hung, even if only for a while, that could lead to the error: not that CF's down, but that the connector thinks it can't be reached.
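A rough way to reason about how a few hung requests can exhaust the request threads is Little's law: the average number of requests in flight equals the arrival rate times the average time each request takes. The sketch below is only an illustration. The limits of 10 and 40 come from this thread, while the traffic rate and request durations are invented numbers, and the function names are hypothetical.

```python
def threads_in_use(requests_per_sec, avg_seconds_per_request):
    """Little's law: average concurrent requests = arrival rate x time per request."""
    return requests_per_sec * avg_seconds_per_request

def saturates(limit, requests_per_sec, avg_seconds_per_request):
    """True when the average demand meets or exceeds the simultaneous-request limit."""
    return threads_in_use(requests_per_sec, avg_seconds_per_request) >= limit

# Hypothetical traffic: 5 requests/second.
# Fast requests (0.2 s each) keep about 1 thread busy, far below either limit.
print(saturates(10, 5, 0.2), saturates(40, 5, 0.2))

# If requests start hanging for 10 s each, about 50 threads are needed,
# which is more than both the old limit of 10 and the new limit of 40.
print(saturates(10, 5, 10), saturates(40, 5, 10))
```

The point of the arithmetic is that raising the limit from 10 to 40 helps with more traffic of normal speed, but a period of slow or hung requests can still use up every thread.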

And as you noted in a later message, turning on the alerts will help (in either CFSM, again where "start monitoring" must be enabled for them to work, or in FR, or SeeFusion), as that will give you info even when you can't be "watching" the monitors. Since you're using the CFSM, and you say you configured the alerts, did you confirm that you get the email they send? There's no test feature. What I do is set the memory alert to below the current memory used, which should trigger an alert within a few minutes. But then I turn that alert off. I find it useless, since the JVM (since 1.5) can often let used memory climb to the max before deciding to do a major GC, so you can get those memory alerts when there's no real problem, if indeed a GC at that point would have collected a lot of "not really used" memory.

But I do recommend that slow server alert in the CFSM, or the running requests alert in FR. For almost everyone, if you have many requests running at once, that's a "canary in the coal mine" indicating that problems may be afoot. The question then is whether the alert shows many slow requests. If it just shows many fast ones, then that is just a sign of a lot of traffic, and if it's being handled fast, you need to increase the number of "max simult requests", and the alert level in whatever monitor you're using.

And be careful about setting the other values in "request tuning" so low (web services, flash remoting, and remote cfcs). There's never a harm in them being more than you need. But if they are less than you need, that could be where a bottleneck happens. I know you say you don't serve web services, but I've seen shops have their own cf pages calling their own CFCs as web services. And if that request limit was low, then that becomes a single-threading bottleneck. Or maybe you DO have code calling CFCs remotely (via ajax). As for flash remoting, the monitor (and FR) use it, and your own code may as well (even if unexpectedly). Again, why constrict them? If you don't use them, there's no harm in them being larger (like 5 or 10, each).

Finally, note that you could have cf requests using either cfthread or reporting, and there are limits for each of those (configurable in the admin). And though you are not using CF Standard, I'll say for other readers that they could have all this sort of problem caused by using some tag that is itself single-threaded in CF Standard, as are many tags, including cfdocument, cfpdf, and more. That could cause a "low traffic" site to still have hung requests.

Let us know if any of that helps, or not. But yes, if it remains and you don't solve it, I am available for consulting, and with my satisfaction guarantee, you don't have to pay for time you don't feel is valuable.

/charlie

*From:*ad...@acfug.org [mailto:ad...@acfug.org] *On Behalf Of *Frank Moorman
*Sent:* Thursday, August 08, 2013 7:42 PM
*To:* discussion@acfug.org
*Subject:* [ACFUG Discuss] Out of Memory?!?

All,

I'm trying to track down the cause of a JRun "out of memory" error. I get the following in my logs:

[Thu Aug 08 14:40:14 2013] [notice] jrApache[2937: 31182]  returning error page for JRun too busy or out of memory
[Thu Aug 08 15:50:09 2013] [notice] jrApache[1787: 63699]  returning error page for JRun too busy or out of memory


It doesn't happen often, (maybe once or occasionally twice a business day) but as everyone understands, users aren't happy when it happens to them.

This is a Linux box: 64-bit CentOS 6, CF9 Enterprise, 64-bit JVM version 1.7. (The JVM was installed separately from CF for security, and ColdFusion uses it.)

I doubt it is actually an out of memory condition (though I could be wrong). The server has 6GB of physical memory and another 6GB of swap. It rarely needs to use swap. (i.e. I have not observed it.)

The jvm is given significant memory to use as well. It is a 64bit jvm with a 1GB min heap and a 3GB max heap. When I look through the server monitor, it is normal to see 1 to 1.5GB allocated and between 100-750MB used. (I see a normal sawtooth pattern in the memory usage, so it looks like what I would expect from the garbage collection routine.) It does spike occasionally, but I have never seen it close to the 3GB max. (I've never even seen it hit 2GB used.)

The server is set for 40 template requests (I recently upped it from 10 to see if that was the problem and it still occurred with the same frequency.) Flash remoting is set to 2, webservice 1, CFC 1. (These remote settings are only set for the monitor, as the server does not provide any webservices outside the running application) Jrun is set to 50 requests, and 1000 queued. (Enough to cover the CF requests.)

I looked at Charlie's blog... I have checked the logs, and other than the apache error log (above) I do not see anything. I've checked the system /var/log/messages, and I've checked all the CF logs. (I also archived everything yesterday, and the cf logs are practically empty even after today's occurrence.) I did not find any of the jvm abort logs that Charlie mentioned in his blog. (I checked in the CF directory mentioned as well as the system logs and the actual JVM directory.) I also checked the Jrun log (in /opt/jrun4/logs/cfusion-event.log) and was surprised because the only entries were months ago. (Because of the age of the log, I'm curious if I am looking at the right place for it.)



Does anyone have any ideas on what might be happening? or something else that I should check?



I have searched the web and found different ideas (even the rare "add more memory")

Another mentions the requests being overloaded, but I honestly do not believe that the 10 simultaneous template requests was low for the traffic for this site. After quadrupling it, with the problem still occurring, it is even less likely.

I've seen some mentioning client variable storage, but the server is set to use cookies for that, not a database. While I do not use client storage, I know there are items like the last time visited etc, so I may just turn it off completely.

Another one I found interesting mentions a bug with the MySQL drivers and the "Maintain Connections" setting, and suggests unchecking that box. I searched for this and found the bug mentioned; one site even speculated it was still a problem with CF9, but I could not find any details. Does anyone know of this issue? I've seen it mentioned, but with no details other than that it's bad to have that box checked. (The page that mentioned it did say it ate memory.)


I'd love more ideas. I know this is not an easy or straightforward error. I may try removing the client storage next, but other ideas are welcome. (i.e. I'm not very convinced that the other things I found on the web will be effective.)

Thanks,
Frank


-------------------------------------------------------------
To unsubscribe from this list, manage your profile @
http://www.acfug.org?fa=login.edituserform

For more info, see http://www.acfug.org/mailinglists
Archive @ http://www.mail-archive.com/discussion%40acfug.org/
List hosted by FusionLink <http://www.fusionlink.com>
-------------------------------------------------------------