[cfaussie] Re: JRUN hanging
But what do you mean by the stack trace showing that "the CFINVOKE was fine so where is it caught up and not resolving"? A stack trace in and of itself isn't valuable. You need to see it over time.

You also said, "I also monitored several of the long running threads (around 2 minutes) and capture several stack traces. They were all the same." So do you really mean they all (all what? All requests running? And over several thread dumps?) showed being stuck with java.util.HashMap.get(HashMap.java:325) as the first line in the stack trace? If not, then you've not really got anything to conclude from the stack traces.

Sorry if it sounds like I'm being contentious. I just don't know what you do or don't understand about all this, so I'm going only on your words.

As for the sudden burden of processing at startup, that could indeed be from a bunch of requests all coming in at once. Perhaps you could queue them up in such a case. There's no built-in singleton processing in CF. You could use CFLOCK around some troublesome code so that only one request at a time runs it, but your problem is that you only want to do that at startup, or when things are going bad, so you need something other than a plain CFLOCK. You could code something in CFML, though, that acts like a singleton (letting only one request at a time run the code during a period you detect as a trouble time). Just thinking off the top of my head here.

But I'll share one other thought: others who experience slow startup times often find that it's because that flush of new requests is triggering a lot of Java class loading, and there's a known issue in the 1.6 JVM shipped with CF. Some moved back to 1.5 to solve it, while others updated to 1.6.0_10 (Java 6 update 10), which fixed the problem. Have you tried either? There are various blogs about this.
The first here explains the problem and how to fix it:

http://www.ghidinelli.com/2008/08/18/java16-u10-gives-big-boost-to-modelglue-transfer-coldspring-performance

Others with more detail are:

http://www.compoundtheory.com/?action=displayPost&ID=270
http://corfield.org/blog/index.cfm/do/blog.entry/entry/Java_6_and_ColdFusion_8

Hope that's helpful.

/charlie

-Original Message-
From: cfaussie@googlegroups.com [mailto:cfaus...@googlegroups.com] On Behalf Of Matthew
Sent: Wednesday, January 14, 2009 1:30 AM
To: cfaussie
Subject: [cfaussie] Re: JRUN hanging

Hi guys

We've just had a server hang. It was a little different to usual (perhaps because Fusion Reactor is installed). This time the website was showing the JRun error message. When I looked at the JRun process the RAM was up at 615MB!!! I restarted JRun.exe. As the website came back up it started to choke again. There is obviously extra load as all the objects, web services etc run for the first time. I know that with the 3rd party web service I've spoken of in previous posts, the first call always takes a good couple of minutes. Unfortunately when it happens on the live server you usually get at least half a dozen users trying to hit the page which consumes this web service, so all requests take a long time and chew up a lot of resources.

I was watching things through FR and the Task Manager: the CPU was up near 100% and the RAM was quite high (it normally sits around 200MB but was 350+). Meanwhile I couldn't even hit a very lightweight page on the site. I used the KILL feature in FR to terminate about 20 threads and eventually the CPU usage and RAM dropped down to stable levels.

I also monitored several of the long running threads (around 2 minutes) and captured several stack traces. They were all the same. Perhaps someone will be able to explain it to me so I can understand what I'm reading. For example, I read the below as saying that the CFINVOKE was fine, so where is it caught up and not resolving?
    at java.util.HashMap.get(HashMap.java:325)
    at org.apache.axis.utils.JavaUtils.isEnumClass(JavaUtils.java:1040)
    at org.apache.axis.encoding.ser.BeanSerializerFactory.init(BeanSerializerFactory.java:49)
    at org.apache.axis.encoding.ser.BeanSerializerFactory.<init>(BeanSerializerFactory.java:42)
    at org.apache.axis.encoding.ser.BaseSerializerFactory.createFactory(BaseSerializerFactory.java:235)
    at org.apache.axis.client.Call.registerTypeMapping(Call.java:2296)
    at com.raileurope.web.ws.soap.RailEuropeWebServiceSoapBindingStub.createCall(RailEuropeWebServiceSoapBindingStub.java:2454)
    - locked 0x1534d630 (a com.raileurope.web.ws.soap.RailEuropeWebServiceSoapBindingStub)
    at com.raileurope.web.ws.soap.RailEuropeWebServiceSoapBindingStub.doBuildPackageForCityPair(RailEuropeWebServiceSoapBindingStub.java:2663)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at coldfusion.bootstrap.BootstrapServlet.service(BootstrapServlet.java:78)
    at jrun.servlet.FilterChain.doFilter(FilterChain.java:86)
    at com.intergral.fusionreactor.filter.FusionReactorFilter.B(Unknown Source)
    at com.intergral.fusionreactor.filter.FusionReactorFilter.A(Unknown Source)
    at com.intergral.fusionreactor.filter.FusionReactorFilter.doFilter(Unknown Source)
    at jrun.servlet.FilterChain.doFilter(FilterChain.java:94)
    at jrun.servlet.FilterChain.service(FilterChain.java:101)
    at jrun.servlet.ServletInvoker.invoke(ServletInvoker.java:91)
    at jrun.servlet.JRunInvokerChain.invokeNext(JRunInvokerChain.java:42)
    at jrun.servlet.JRunRequestDispatcher.invoke(JRunRequestDispatcher.java:257)
    at jrun.servlet.ServletEngineService.dispatch(ServletEngineService.java:541)
    at jrun.servlet.jrpp.JRunProxyService.invokeRunnable(JRunProxyService.java:204)
    at jrunx.scheduler.ThreadPool$DownstreamMetrics.invokeRunnable(ThreadPool.java:318)
    at jrunx.scheduler.ThreadPool$ThreadThrottle.invokeRunnable(ThreadPool.java:426)
    at jrunx.scheduler.ThreadPool$UpstreamMetrics.invokeRunnable(ThreadPool.java:264)
    at jrunx.scheduler.WorkerThread.run(WorkerThread.java:66)

By the way: this doesn't really reflect what I raised in my first thread, because this shows what happens straight after a CF restart, whereas the original issue happens when everything is stable and then suddenly CF hangs. I'm still waiting for that to happen.

Cheers
Matthew

On Jan 8, 6:27 pm, charlie arehart charlie_li...@carehart.org wrote:

Well, before you get too excited, be careful that you're not misunderstanding the report. The memory reported there is NOT for the current request, but rather for the entire CF server. So that number alone isn't too meaningful. But if it ROSE by 32 meg when you ran a request, that would be different (but even then you can't be positive that a given request caused the rise. Other things can be running to increase memory besides the request you're running.)
Rather than view the memory in the stack trace, to observe how it's changing over time, I'd recommend instead you watch either the graphical interface they offer for the memory graph, or view the info in the resource-x.log file, both of which report it at 5 second intervals. But if that indeed is a stack trace of the call to the web service, notice the top line which says: java.net.SocketInputStream.socketRead0(Native Method) That would be the kind of thing (a native method) that CF's request timeout feature can't interrupt normally, but again I have confirmed that in my test where the TIMEOUT of a CFINVOKE webservice call works, it is indeed timing out while sitting in that same state. My assertion is that CF has added some additional timeout callback mechanism in such a case, which it doesn't do as a matter of course on other code. Makes sense to me. The question is why it doesn't timeout for you--but that will be the thing to confirm now. When you have a request that you see runs long, and you KNOW it has a timeout, if you sit there refreshing the stack trace, does it remain in this socketread0 method beyond that timeout time? It doesn't for me. :-) /charlie -Original Message- From: cfaussie@googlegroups.com [mailto:cfaus...@googlegroups.com] On Behalf Of Matthew Sent: Wednesday, January 07, 2009 11:32 PM To: cfaussie Subject: [cfaussie] Re: JRUN hanging Hi Charlie, Out of interest (while I await a hung server) I've configured FR 2.0.4 on my dev PC and run a couple of calls to the web service whilst I watch them in FR. I was alarmed to see that my single request was costing 32MB of memory (32,438KB) This snapshot was taken at about the 30th second (the whole request finished after 55s). 
Here is the top part of the Thread Stack Trace:

Thread Stack Trace
Trace Time: 15:26:34.224 08-Jan-2009
Request ID: 25
Script Name: http://foo
Started: 15:26:04.177 08-Jan-2009
Exec Time: 30047ms
Memory Used: (6%)32,438KB
Memory Free: 472,457KB
Thread ID: jrpp-8
Priority: 5
Hashcode: 30900283

jrpp-8 prio=5 tid=0x03c5f468 nid=0x3c4 runnable [559d000..559fd90]
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.read(SocketInputStream.java:129)
    at com.sun.net.ssl.internal.ssl.InputRecord.a(DashoA12275)
    at com.sun.net.ssl.internal.ssl.InputRecord.a(DashoA12275)
    at com.sun.net.ssl.internal.ssl.InputRecord.read(DashoA12275)
    at com.sun.net.ssl.internal.ssl.SSLSocketImpl.a(DashoA12275)
    - locked 0x158afd20 (a java.lang.Object)
    at com.sun.net.ssl.internal.ssl.SSLSocketImpl.a(DashoA12275)
    at com.sun.net.ssl.internal.ssl.AppInputStream.read(DashoA12275)
    - locked 0x158afc88 (a com.sun.net.ssl.internal.ssl.AppInputStream)
    at java.io.BufferedInputStream.read1
[cfaussie] Re: JRUN hanging
I've got Fusion Reactor 2.0.0 (this is the only licence we have) up and running and am looking at the Request History. We've not had a hang yet but I just wanted to get familiar with the tool. I'm looking at the requests which are taking 45+ seconds and they are all pages which involve a call to the web service. The pages are not timing out, so I'm not getting error logs showing where the timeout occurred. However, is there any way to see which lines of code are the slow bits? I'm trying to work out if the call to the web service is the slow bit or if it's the unpacking and trying to display the data which may be slow. I'll report back what I'm seeing when the next hang happens.

@Charlie: I've tried your idea of browsing to the WSDL while a hang is occurring and I've had mixed results. Sometimes it won't come up, sometimes it comes up but when I try to submit to a method it times out, sometimes it doesn't time out but it does take ages. Anyway, once I get the next hang I'll be able to report back on what I see.

Cheers
Matthew

On Jan 7, 2:48 pm, charlie arehart charlie_li...@carehart.org wrote:

Yes, Mark makes a good point. You are saying these are the same problems, right? I didn't pick up on it earlier, but when you said that when things hang, there's high CPU, that really does seem to point to a very different problem. I'll go back to my previous point: I'd be VERY interested if you can confirm, when things hang, whether the stack traces for running requests really are doing the web service call you say, because as Mark says, if CF's just waiting for them to come back, that shouldn't take up much CPU.

Or, thinking outside the box, maybe in fact this IS where CPU time is being spent (and it may explain why the TIMEOUT isn't working). What if you find that CF is indeed running the CFINVOKE when things hang, but instead of just waiting for the web service result, the problem is that the web service returns a HUGE amount of data for some reason?
Perhaps there's an error in the web service, or perhaps the variation depends on the kind of data users request. Just as a CFQUERY could be written to bring back a million records from a database (and user input might vary that result from 1 to a million), so too could a web service call that basically returns query-driven data that varies with user input. Or maybe it's not so much huge, but it's complex, and CF is spending time in the Axis Java code called by CFINVOKE trying to render the result, converting it from whatever format it's served in to a form that CF can process. That would show up as being hung on the CFINVOKE, and possibly not interruptible because it's processing file system I/O. That's just a guess.

But all this again speaks to the value of logging the web service calls. If you add in logging of the input being passed, you may find that there's a pattern where the requests that hang pass in some particular value. That's just a guess. I realize you may contend that you know there are times when it hangs on something you think should bring back one record.

But here's another idea: if you do log it, and you do get a situation where CF hangs that you think is due to the web service calls (indeed, even if you get the stack trace and prove it), then I would recommend you try to browse the web service yourself, separately from that CF server. Some web services can be called entirely in a browser, if the input arguments are simple (as in http://url/service.wsdl?method=somemethod&inputarg1=value1). Or if that can't work, set up a CF page that you run on a different server (not the one that's hung), such as your laptop, and run the request there. See if you get anything back, and what it looks like. Again, all just some ideas to consider.
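The idea of logging each web service call together with the input passed can be sketched roughly as below. This is only an illustration in Java for a modern JVM (it uses lambdas, so it would not run on the Java 1.4 JVM under CF7), not CFML and not something ColdFusion or FusionReactor provides. `TimedCallLogger`, `formatEntry`, and the `webservice.calls` logger name are hypothetical names made up for the example.

```java
import java.util.concurrent.Callable;
import java.util.logging.Logger;

/** Hypothetical helper: times a call and logs its input, duration, and outcome. */
final class TimedCallLogger {
    private static final Logger LOG = Logger.getLogger("webservice.calls");

    /** Builds one log line; kept separate so the format is easy to check. */
    static String formatEntry(String method, String input, long elapsedMs, String outcome) {
        return "method=" + method + " input=" + input
                + " elapsedMs=" + elapsedMs + " outcome=" + outcome;
    }

    /** Runs the call and always logs one line, whether it returns or throws an Exception. */
    static <T> T call(String method, String input, Callable<T> body) throws Exception {
        long start = System.nanoTime();
        String outcome = "ok";
        try {
            return body.call();
        } catch (Exception e) {
            outcome = "error:" + e.getClass().getSimpleName();
            throw e;
        } finally {
            long elapsedMs = (System.nanoTime() - start) / 1_000_000L;
            LOG.info(formatEntry(method, input, elapsedMs, outcome));
        }
    }
}
```

Wrapping every call this way leaves one line per call in the log, so slow or failing calls can later be grouped by the input that was passed.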
/charlie

-Original Message-
From: cfaussie@googlegroups.com [mailto:cfaus...@googlegroups.com] On Behalf Of Mark Mandel
Sent: Tuesday, January 06, 2009 10:29 PM
To: cfaussie@googlegroups.com
Subject: [cfaussie] Re: JRUN hanging

Honestly, I'd be surprised if it was the problem you described (although I could be wrong), simply because it's maxing out the CPU, which tends to point towards an infinite loop. If it was simply locking at the point of the web service, then there would be no (or very little) CPU activity, but JRun just wouldn't do anything.

Mark

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups cfaussie group.
To post to this group, send email to cfaussie@googlegroups.com
To unsubscribe from this group, send email to cfaussie+unsubscr...@googlegroups.com
For more options, visit this group at http://groups.google.com/group/cfaussie?hl=en
-~--~~~~--~~--~--~---
[cfaussie] Re: JRUN hanging
No, with FR you can't get any sort of profile of which lines were slowest within a request. Only the CF8 Server Monitor has the depth of insight to report that. What I had been referring to is catching the request WHILE it's running, in which case you can get a stack trace by clicking a button to the left of the request while you see it running.

One other thing: you refer to my earlier suggestion as browsing to the WSDL, but to be clear, I wasn't proposing that (though it's not a bad idea). I was proposing actually running the web service, by adding the method attribute and passing in any needed simple arguments. But the fact that you sometimes can't even get back the WSDL alone is certainly worrisome, and something to bring to the attention of whoever runs the server being called. This is just like how an end user would call any of us if our servers weren't responding to their requests. :-)

/charlie

-Original Message-
From: cfaussie@googlegroups.com [mailto:cfaus...@googlegroups.com] On Behalf Of Matthew
Sent: Wednesday, January 07, 2009 8:04 PM
To: cfaussie
Subject: [cfaussie] Re: JRUN hanging

I've got Fusion Reactor 2.0.0 (this is the only licence we have) up and running and am looking at the Request History. We've not had a hang yet but I just wanted to get familiar with the tool. I'm looking at the requests which are taking 45+ seconds and they are all pages which involve a call to the web service. The pages are not timing out, so I'm not getting error logs showing where the timeout occurred. However, is there any way to see which lines of code are the slow bits? I'm trying to work out if the call to the web service is the slow bit or if it's the unpacking and trying to display the data which may be slow. I'll report back what I'm seeing when the next hang happens.

@Charlie: I've tried your idea of browsing to the WSDL while a hang is occurring and I've had mixed results.
Sometimes it won't come up, sometimes it comes up but when I try to submit to a method it times out, sometimes it doesn't time out but it does take ages. Anyway, once I get the next hang I'll be able to report back on what I see.

Cheers
Matthew
[cfaussie] Re: JRUN hanging
Hi Charlie,

Out of interest (while I await a hung server) I've configured FR 2.0.4 on my dev PC and run a couple of calls to the web service whilst I watch them in FR. I was alarmed to see that my single request was costing 32MB of memory (32,438KB). This snapshot was taken at about the 30th second (the whole request finished after 55s). Here is the top part of the Thread Stack Trace:

Thread Stack Trace
Trace Time: 15:26:34.224 08-Jan-2009
Request ID: 25
Script Name: http://foo
Started: 15:26:04.177 08-Jan-2009
Exec Time: 30047ms
Memory Used: (6%)32,438KB
Memory Free: 472,457KB
Thread ID: jrpp-8
Priority: 5
Hashcode: 30900283

jrpp-8 prio=5 tid=0x03c5f468 nid=0x3c4 runnable [559d000..559fd90]
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.read(SocketInputStream.java:129)
    at com.sun.net.ssl.internal.ssl.InputRecord.a(DashoA12275)
    at com.sun.net.ssl.internal.ssl.InputRecord.a(DashoA12275)
    at com.sun.net.ssl.internal.ssl.InputRecord.read(DashoA12275)
    at com.sun.net.ssl.internal.ssl.SSLSocketImpl.a(DashoA12275)
    - locked 0x158afd20 (a java.lang.Object)
    at com.sun.net.ssl.internal.ssl.SSLSocketImpl.a(DashoA12275)
    at com.sun.net.ssl.internal.ssl.AppInputStream.read(DashoA12275)
    - locked 0x158afc88 (a com.sun.net.ssl.internal.ssl.AppInputStream)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:220)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:277)
    - locked 0x158afad8 (a java.io.BufferedInputStream)
    at java.io.FilterInputStream.read(FilterInputStream.java:111)
    at org.apache.xerces.impl.XMLEntityManager$RewindableInputStream.read(Unknown Source)

Cheers
Matthew

On Jan 8, 12:33 pm, charlie arehart charlie_li...@carehart.org wrote:

No, you can't with FR get any sort of profile of what lines were slowest within a request. Only the CF8 Server monitor has the depth of insight to report that.
What I had been referring to is catching the request WHILE it's running, in which case you can get a stack trace by clicking a button to the left of the request while you see it running.

One other thing: you refer to my earlier suggestion as browsing to the WSDL, but to be clear, I wasn't proposing that (though it's not a bad idea). I was proposing actually running the web service, by adding the method attribute and passing in any needed simple arguments. But the fact that you sometimes can't even get back the WSDL alone is certainly worrisome, and something to bring to the attention of whoever runs the server being called. This is just like how an end user would call any of us if our servers weren't responding to their requests. :-)

/charlie

-Original Message-
From: cfaussie@googlegroups.com [mailto:cfaus...@googlegroups.com] On Behalf Of Matthew
Sent: Wednesday, January 07, 2009 8:04 PM
To: cfaussie
Subject: [cfaussie] Re: JRUN hanging

I've got Fusion Reactor 2.0.0 (this is the only licence we have) up and running and am looking at the Request History. We've not had a hang yet but I just wanted to get familiar with the tool. I'm looking at the requests which are taking 45+ seconds and they are all pages which involve a call to the web service. The pages are not timing out, so I'm not getting error logs showing where the timeout occurred. However, is there any way to see which lines of code are the slow bits? I'm trying to work out if the call to the web service is the slow bit or if it's the unpacking and trying to display the data which may be slow. I'll report back what I'm seeing when the next hang happens.

@Charlie: I've tried your idea of browsing to the WSDL while a hang is occurring and I've had mixed results. Sometimes it won't come up, sometimes it comes up but when I try to submit to a method it times out, sometimes it doesn't time out but it does take ages. Anyway, once I get the next hang I'll be able to report back on what I see.
Cheers
Matthew
[cfaussie] Re: JRUN hanging
Well, before you get too excited, be careful that you're not misunderstanding the report. The memory reported there is NOT for the current request, but rather for the entire CF server. So that number alone isn't too meaningful. But if it ROSE by 32 meg when you ran a request, that would be different (but even then you can't be positive that a given request caused the rise. Other things can be running to increase memory besides the request you're running.) Rather than view the memory in the stack trace, to observe how it's changing over time, I'd recommend instead you watch either the graphical interface they offer for the memory graph, or view the info in the resource-x.log file, both of which report it at 5 second intervals. But if that indeed is a stack trace of the call to the web service, notice the top line which says: java.net.SocketInputStream.socketRead0(Native Method) That would be the kind of thing (a native method) that CF's request timeout feature can't interrupt normally, but again I have confirmed that in my test where the TIMEOUT of a CFINVOKE webservice call works, it is indeed timing out while sitting in that same state. My assertion is that CF has added some additional timeout callback mechanism in such a case, which it doesn't do as a matter of course on other code. Makes sense to me. The question is why it doesn't timeout for you--but that will be the thing to confirm now. When you have a request that you see runs long, and you KNOW it has a timeout, if you sit there refreshing the stack trace, does it remain in this socketread0 method beyond that timeout time? It doesn't for me. 
:-)

/charlie

-Original Message-
From: cfaussie@googlegroups.com [mailto:cfaus...@googlegroups.com] On Behalf Of Matthew
Sent: Wednesday, January 07, 2009 11:32 PM
To: cfaussie
Subject: [cfaussie] Re: JRUN hanging

Hi Charlie,

Out of interest (while I await a hung server) I've configured FR 2.0.4 on my dev PC and run a couple of calls to the web service whilst I watch them in FR. I was alarmed to see that my single request was costing 32MB of memory (32,438KB). This snapshot was taken at about the 30th second (the whole request finished after 55s). Here is the top part of the Thread Stack Trace:

Thread Stack Trace
Trace Time: 15:26:34.224 08-Jan-2009
Request ID: 25
Script Name: http://foo
Started: 15:26:04.177 08-Jan-2009
Exec Time: 30047ms
Memory Used: (6%)32,438KB
Memory Free: 472,457KB
Thread ID: jrpp-8
Priority: 5
Hashcode: 30900283

jrpp-8 prio=5 tid=0x03c5f468 nid=0x3c4 runnable [559d000..559fd90]
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.read(SocketInputStream.java:129)
    at com.sun.net.ssl.internal.ssl.InputRecord.a(DashoA12275)
    at com.sun.net.ssl.internal.ssl.InputRecord.a(DashoA12275)
    at com.sun.net.ssl.internal.ssl.InputRecord.read(DashoA12275)
    at com.sun.net.ssl.internal.ssl.SSLSocketImpl.a(DashoA12275)
    - locked 0x158afd20 (a java.lang.Object)
    at com.sun.net.ssl.internal.ssl.SSLSocketImpl.a(DashoA12275)
    at com.sun.net.ssl.internal.ssl.AppInputStream.read(DashoA12275)
    - locked 0x158afc88 (a com.sun.net.ssl.internal.ssl.AppInputStream)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:220)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:277)
    - locked 0x158afad8 (a java.io.BufferedInputStream)
    at java.io.FilterInputStream.read(FilterInputStream.java:111)
    at org.apache.xerces.impl.XMLEntityManager$RewindableInputStream.read(Unknown Source)

Cheers
Matthew
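The point about a thread sitting in socketRead0 can be illustrated at the plain Java level: a blocking socket read returns only when data arrives, the peer closes, or a read timeout set on the socket expires. The sketch below is an assumption-laden illustration, not a description of how CFINVOKE or Axis applies its timeout. `ReadTimeoutDemo` is a hypothetical class, and the silent local server merely stands in for an unresponsive remote web service.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

/** Hypothetical demo: a read against a silent peer ends only via the socket's own timeout. */
final class ReadTimeoutDemo {

    /** Returns true if the blocked read gave up with SocketTimeoutException. */
    static boolean readTimesOut(int timeoutMs) throws IOException {
        // The server socket completes the TCP handshake but never sends a byte.
        try (ServerSocket silent = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(silent.getInetAddress(), silent.getLocalPort())) {
            client.setSoTimeout(timeoutMs); // without this the read would block indefinitely
            InputStream in = client.getInputStream();
            try {
                in.read(); // blocks inside the native socket read
                return false;
            } catch (SocketTimeoutException expected) {
                return true;
            }
        }
    }
}
```

If a request stays in the socket read well past its configured timeout, one possible explanation is that no read timeout ever reached the underlying socket, which is worth checking before assuming the remote service is at fault.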
[cfaussie] Re: JRUN hanging
What makes you think this is the case? What version of CF are you on? If you are on CF8, you can use snapshots via the server monitor, a tool like JConsole, or thread dumps to find out exactly what is causing the infinite loop.

Mark

On Tue, Jan 6, 2009 at 5:20 PM, Matthew matthewbchamb...@gmail.com wrote:

Hi everyone

From time to time our website appears to hang. I believe it is CF (or JRun) choking because each time it happens I can still connect to the web server via FTP, RDP, DB access. When I look at the Task Manager jrun.exe is using max CPU. If I restart the CF service or End Process on jrun.exe everything recovers.

I'm pretty sure I know which bit of code is causing the problem. It's a web service call to a 3rd party (used quite a lot by users on the website). Most of the time when the website hangs I can't browse to the 3rd party's web service. I've tried setting timeouts on the web service call by using CFINVOKE, CreateObject, CFOBJECT etc but no improvement. I need to allow for up to 60s because some of the calls are quite complex.

I'm trying to work out why it's hanging. I suspect that because CF is allowing 8 simultaneous requests and each gets 60 seconds, once all these slots are taken up everyone else goes on a queue, and with all that load the server appears to hang because it's in constant use. That is, after 60 seconds user 1 gets his timeout so user 9 takes position 8, and round and round it goes. It never stops, because with constant traffic taking up 8 request slots and all of them taking up resources for 60 seconds it appears to hang. Does this make sense?

So the question is: how do I save the server from getting into this state? I'm thinking I need some sort of scheduled task which probes the web service every 10 seconds to work out if it's up. Perhaps if it times out (3 times just to be sure) I shut off the web service (which means the rest of the website is still up).
The scheduled task keeps running until no timeout occurs and then it re-activates the web service for everyone else. Has anyone been through something like this before? How did you work around it?

Cheers
Matthew

--
E: mark.man...@gmail.com
W: www.compoundtheory.com
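The probe-and-disable idea in the quoted message (switch the web service off after three timed-out probes in a row, switch it back on once a probe succeeds) amounts to a small state machine. A minimal sketch in Java follows, assuming a scheduler outside this class calls `recordProbe` every 10 seconds with the outcome of one probe. `ServiceBreaker`, `recordProbe`, and `isServiceEnabled` are hypothetical names, and this is not an existing ColdFusion feature.

```java
/** Hypothetical gate: disables a service after N consecutive failed probes. */
final class ServiceBreaker {
    private final int failureThreshold;
    private int consecutiveFailures;
    private boolean disabled;

    ServiceBreaker(int failureThreshold) {
        if (failureThreshold < 1) {
            throw new IllegalArgumentException("failureThreshold must be at least 1");
        }
        this.failureThreshold = failureThreshold;
    }

    /** Called by the scheduled probe with the result of one attempt. */
    synchronized void recordProbe(boolean succeeded) {
        if (succeeded) {
            consecutiveFailures = 0;
            disabled = false;            // one good probe re-enables the service
        } else {
            consecutiveFailures++;
            if (consecutiveFailures >= failureThreshold) {
                disabled = true;         // e.g. the third timeout in a row
            }
        }
    }

    /** Pages check this before calling the web service and show a notice if false. */
    synchronized boolean isServiceEnabled() {
        return !disabled;
    }
}
```

Pages that depend on the web service would check `isServiceEnabled()` first and fail fast when it is false, so that request slots are not tied up waiting on a service already known to be down.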
[cfaussie] Re: JRUN hanging
I don't believe JConsole will natively hook into Java 1.4. Sounds like you're going to be doing thread dumps. Do a Google search for them, and you will find several articles.

Mark

On Wed, Jan 7, 2009 at 9:24 AM, Matthew matthewbchamb...@gmail.com wrote:

Hi guys,

Sorry, I should have given more info.

@Steve: I'm not sure if it's a CF web service but I'm pretty sure it isn't. I think it may be Java.

@Mark: not sure which point you were referring to with your first question? It's running on a CF7 server. So I'm guessing I'll need to use JConsole. Where do I start?

Cheers
Matthew

On Jan 7, 7:42 am, Mark Mandel mark.man...@gmail.com wrote:

What makes you think this is the case? What version of CF are you on? If you are on CF8, you can use snapshots via the server monitor, a tool like JConsole, or thread dumps to find out exactly what is causing the infinite loop.

Mark

On Tue, Jan 6, 2009 at 5:20 PM, Matthew matthewbchamb...@gmail.com wrote:

Hi everyone

From time to time our website appears to hang. I believe it is CF (or JRun) choking because each time it happens I can still connect to the web server via FTP, RDP, DB access. When I look at the Task Manager jrun.exe is using max CPU. If I restart the CF service or End Process on jrun.exe everything recovers.

I'm pretty sure I know which bit of code is causing the problem. It's a web service call to a 3rd party (used quite a lot by users on the website). Most of the time when the website hangs I can't browse to the 3rd party's web service. I've tried setting timeouts on the web service call by using CFINVOKE, CreateObject, CFOBJECT etc but no improvement. I need to allow for up to 60s because some of the calls are quite complex.

I'm trying to work out why it's hanging. I suspect that because CF is allowing 8 simultaneous requests and each gets 60 seconds then once all these slots are taken up everyone else goes on a queue and with all that load the server appears to hang because it's in constant use i.e.
after 60 seconds user 1 gets his timeout so user 9 takes position 8 and round and round it goes and it never stops because with constant traffic taking up 8 request slots and all taking up resources for 60 seconds it appears to hang. Does this make sense?

So the question is: how do I save the server from getting into this state? I'm thinking I need some sort of scheduled task which probes the web service every 10 seconds to work out if it's up. Perhaps if it times out (3 times just to be sure) I shut off the web service (which means the rest of the website is still up). The scheduled task keeps running until no timeout occurs and then it re-activates the web service for everyone else. Has anyone been through something like this before? How did you work around it?

Cheers
Matthew

--
E: mark.man...@gmail.com
W: www.compoundtheory.com

--
E: mark.man...@gmail.com
W: www.compoundtheory.com
[cfaussie] Re: JRUN hanging
As far as I can remember JConsole will only run on Java 5+. That being said, you should be able to run it on a Java 5 JVM and hook it remotely to inspect an app on 1.4, though I'm pretty sure you'd be limited in which information you can get.

There's also a knowledgebase article somewhere on Adobe.com on how to create a thread dump.

Cheers
Kai

I don't believe JConsole will natively hook into Java 1.4. Sounds like you're going to be doing thread dumps. Do a Google search for them, and you will find several articles.

Mark

On Wed, Jan 7, 2009 at 9:24 AM, Matthew matthewbchamb...@gmail.com wrote:

Hi guys,

Sorry, I should have given more info.

@Steve: I'm not sure if it's a CF web service but I'm pretty sure it isn't. I think it may be Java.

@Mark: not sure which point you were referring to with your first question? It's running on a CF7 server. So I'm guessing I'll need to use JConsole. Where do I start?

Cheers
Matthew

On Jan 7, 7:42 am, Mark Mandel mark.man...@gmail.com wrote:

What makes you think this is the case? What version of CF are you on? If you are on CF8, you can use snapshots via the server monitor, a tool like JConsole, or thread dumps to find out exactly what is causing the infinite loop.

Mark

On Tue, Jan 6, 2009 at 5:20 PM, Matthew matthewbchamb...@gmail.com wrote:

Hi everyone

From time to time our website appears to hang. I believe it is CF (or JRun) choking because each time it happens I can still connect to the web server via FTP, RDP, DB access. When I look at the Task Manager jrun.exe is using max CPU. If I restart the CF service or End Process on jrun.exe everything recovers.

I'm pretty sure I know which bit of code is causing the problem. It's a web service call to a 3rd party (used quite a lot by users on the website). Most of the time when the website hangs I can't browse to the 3rd party's web service. I've tried setting timeouts on the web service call by using CFINVOKE, CreateObject, CFOBJECT etc but no improvement.
I need to allow for up to 60s because some of the calls are quite complex. I'm trying to work out why it's hanging. I suspect that because CF is allowing 8 simaltaneous requests and each gets 60 seconds then once all these slots are taken up everyone else goes on a queue and with all that load the server appears to hang because it's in constant use i.e. after 60 seconds user 1 gets his timeout so user 9 takes position 8 and round and round it goes and it never stops because with constant traffic taking up 8 request slots and all taking up resources for 60 seconds it appears to hang. Does this make sense? So the question is: how do I save the server from getting into this state. I'm thinking I need some sort of scheduled task which probes the web service every 10 seconds to work out if it's up. Perhaps if it times out (3 times just to be sure) I shut of the web service (which means the rest of the website is still up). The scheduled task keeps running until no timeout occurs and then it re-activates the web service for everyone else. Has anyone been through something like this before? How did you work around it? Cheers Matthew _ Kai Koenig - Ventego Creative Ltd ph: +64 4 476 6781 - mob: +64 21 928 365 / +61 450 132 117 web: http://www.ventego-creative.co.nz blog: http://www.bloginblack.de --~--~-~--~~~---~--~~ You received this message because you are subscribed to the Google Groups cfaussie group. To post to this group, send email to cfaussie@googlegroups.com To unsubscribe from this group, send email to cfaussie+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/cfaussie?hl=en -~--~~~~--~~--~--~---
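For what it's worth, the probe-and-disable idea in Matthew's original post (three consecutive timeouts switch the web service off, one successful probe switches it back on) could be sketched roughly as below. This is Python rather than CFML, purely for illustration, and every name in it (ServiceBreaker, record_probe, call_service) is made up for the sketch; none of it is a CF or JRun feature. It only models the bookkeeping, not the scheduled task or the probe itself.

```python
class ServiceBreaker:
    """Tracks probe results and decides whether the web service may be called."""

    def __init__(self, failure_threshold=3):
        # "3 times just to be sure" from the original post (assumed default).
        self.failure_threshold = failure_threshold
        self.consecutive_failures = 0
        self.enabled = True

    def record_probe(self, succeeded):
        """Feed in the result of one scheduled probe of the web service."""
        if succeeded:
            # Any successful probe re-activates the service for everyone.
            self.consecutive_failures = 0
            self.enabled = True
        else:
            self.consecutive_failures += 1
            if self.consecutive_failures >= self.failure_threshold:
                # Enough timeouts in a row: stop sending users to the service.
                self.enabled = False


def call_service(breaker, real_call, fallback):
    """Page code: skip the slow call entirely while the service is disabled."""
    if not breaker.enabled:
        return fallback()
    return real_call()
```

The point of the design is that page requests only read a flag, so while the service is down they return a "temporarily unavailable" result straight away instead of each holding a request slot for 60 seconds.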
[cfaussie] Re: JRUN hanging
Hi Kai

Thanks for the tip. I have already tried connecting JConsole from my local dev machine to the live server but can't get it to connect. Perhaps I've got the port wrong. How do you work out which port the CF JVM is listening on?

I'm looking into the thread dump idea (I've just finished reading the Adobe article you spoke of), however there is a lot to understand. Anyway, I've switched everything on so I can take a thread dump next time the server appears to hang (or what is really called SPINNING), however I don't think this will help in any great way. I'm pretty certain that whatever monitoring solution I find will tell me what I already know: because a big part of the website relies on a web service call to a 3rd party which goes down sometimes, and because CFINVOKE won't obey the timeout parameter, I'll always have this problem. I need to work out a way to detect that the web service is out and disable it before too many users on the website instantiate a connection which won't time out properly and cause the server to go into a spin.

I will continue to investigate. If anyone else wants to chime in please feel free!

By the way, is there any way to get a report/graph which shows how much memory each variable is taking up? I.e. can you take a snapshot at a moment in time to show all the session variables and how much memory they are costing?

Cheers
Matthew
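On the complaint that the call won't obey its own timeout: one general pattern, sketched here in Python only as an illustration, is to run the call on a worker thread and have the caller give up waiting after a deadline. The names call_with_deadline and slow_service are hypothetical, and whether something equivalent is practical on CF7 is an open question this sketch does not answer.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

# Shared pool; its size caps how many slow calls can be in flight at once.
_pool = ThreadPoolExecutor(max_workers=8)


def slow_service(delay):
    """Stand-in for the 3rd party web service call (hypothetical)."""
    time.sleep(delay)
    return "done"


def call_with_deadline(func, seconds, *args):
    """Run func(*args) on a worker thread and wait at most `seconds` for it.

    Returns (True, result) on success or (False, None) if the deadline passed.
    """
    future = _pool.submit(func, *args)
    try:
        return True, future.result(timeout=seconds)
    except FutureTimeout:
        # The caller is released, but the worker thread keeps running until
        # func returns on its own; cancel() only helps if it had not started.
        future.cancel()
        return False, None
```

Note the limitation in the comment: the user's request comes back on time, but the abandoned call still occupies a worker until it finishes, so this bounds how long users wait rather than how much work the server does.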
[cfaussie] Re: JRUN hanging
Matthew mentions the challenge of getting thread dumps in this later note. Just to clarify (as I wrote my last note before reading this one), the thread dump referred to here is in fact the same general idea as the stack trace I referred to. The good news again, if you use any of the 3 monitors, is that it's just a push-button operation, or you can even get it sent to you in an email when a problem situation is detected (in FusionReactor and the CF8 Monitor, I know).

Different folks use "thread dump" and "stack trace" to mean the same thing, or they can differ. In my sense of the words (which is also how FusionReactor and the CF8 Monitor use them), a thread dump is a stack trace for every thread, and a stack trace for one thread is what tells you what line of code is running and what Java objects/methods are being called on the basis of that line of code.

Without those tools, as has been alluded to here, you can get a full thread dump by some other methods that are a bit more complicated, and then you have to hunt through it to find the stack trace for a thread of interest. All doable, just easier with the CF monitor tools. Hope that's helpful.

To answer your last question, Matthew, the ability to view how large variables are (or all vars in a given scope) is something that's enabled only in the CF8 Monitor (which runs only on CF8 Enterprise and Developer, sadly). Neither FusionReactor nor SeeFusion will tell you. There are also ways to get at the information by using undocumented objects like coldfusion.runtime.SessionTracker. A Google search for those will turn up ways to access that info.

/charlie
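To make the "thread dump is a stack trace for every thread" distinction concrete, here is a small Python analogy (not a JVM or CF tool). It builds a dump as one stack trace per live thread, then hunts through it for the threads sitting in a function of interest. The helper names thread_dump and threads_running are invented for this sketch, and sys._current_frames() is a CPython-specific call.

```python
import sys
import threading
import traceback


def thread_dump():
    """Return {thread_name: stack_trace_text} for every live thread."""
    names = {t.ident: t.name for t in threading.enumerate()}
    dump = {}
    for ident, frame in sys._current_frames().items():
        name = names.get(ident, "thread-%d" % ident)
        # One stack trace: the chain of calls this thread is inside right now.
        dump[name] = "".join(traceback.format_stack(frame))
    return dump


def threads_running(dump, needle):
    """Pick out the threads whose stack trace mentions `needle`."""
    return sorted(name for name, trace in dump.items() if needle in trace)
```

As noted earlier in the thread, one dump is only a snapshot. Taking several a few seconds apart and checking whether the same threads keep showing the same frames is what tells you where a request is really stuck.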
[cfaussie] Re: JRUN hanging
Honestly, I'd be surprised if it was the problem you described (although I could be wrong), simply because it's maxing out the CPU, which tends to point towards an infinite loop. If it was simply locking at the point of the web service call, then there would be no (or very little) CPU activity, but JRun just wouldn't do anything.

Mark
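Mark's point that a thread waiting on a remote call uses almost no CPU, while a runaway loop uses all it can get, is easy to check in miniature. The Python sketch below is only an illustration of that general behaviour, with made-up function names; it says nothing about what JRun itself is doing.

```python
import time


def cpu_used_while(action):
    """Return the CPU seconds this process burned while running action()."""
    before = time.process_time()
    action()
    return time.process_time() - before


def blocked_wait(seconds=0.2):
    # Stand-in for a thread parked on a slow remote call: it just sleeps.
    time.sleep(seconds)


def busy_loop(seconds=0.2):
    # Stand-in for a runaway loop: it spins until the deadline passes.
    deadline = time.monotonic() + seconds
    while time.monotonic() < deadline:
        pass
```

Comparing cpu_used_while(blocked_wait) with cpu_used_while(busy_loop) should show the sleeping call costing next to nothing and the loop costing roughly its whole wall-clock time, which is why max CPU on jrun.exe suggests work being done rather than plain waiting.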
[cfaussie] Re: JRUN hanging
Yes, Mark makes a good point. You are saying these are the same problems, right? I didn't pick up on it earlier, but when you said that there's high CPU when things hang, that really does seem to point to a very different problem.

I'll go back to my previous point: I'd be VERY interested if you can confirm, when things hang, whether the stack traces for running requests really show them doing the web service call you describe, because as Mark says, if CF's just waiting for them to come back, that shouldn't take up much CPU.

Or, thinking outside the box, maybe in fact this IS where CPU time is being spent (and it may explain why the TIMEOUT isn't working). What if you find that CF is indeed running the CFINVOKE when things hang, but instead of just waiting for the web service result, the problem is that the web service returns a HUGE amount of data for some reason? Perhaps there's an error in the web service, or perhaps the variation depends on the kind of data users request. Just as a CFQUERY could be written to bring back a million records from a database (and user input might vary that result from 1 to a million), so too could a web service call that basically returns query-driven data that varies with user input.

Or maybe it's not so much huge as complex, and CF is spending time in the Axis Java code called by CFINVOKE trying to render the result, converting it from whatever format it's served in to a form that CF can process. That would show up as being hung on the CFINVOKE, and possibly not interruptible because it's processing file system I/O. That's just a guess.

But all this again speaks to the value of logging the web service calls. If you add in logging of the input being passed, you may find that there's a pattern where the requests that hang all pass in some particular value. That's just a guess. I realize you may contend that you know there are times when it hangs on something you think should bring back one record.

But here's another idea: if you do log it, and you do get a situation where CF hangs that you think is due to the web service calls (indeed, even if you get the stack trace and prove it), then I would recommend you go and try to browse the web service yourself, separately from that CF server. Some web services can be called entirely in a browser, if the input arguments are simple (as in http://url/service.wsdl?method=somemethod&inputarg1=value1). Or if that can't work, set up a CF page that you run on a different server (not the one that's hung), such as your laptop, and run the request there. See if you get anything back, and what it looks like.

Again, all just some ideas to consider.

/charlie
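The logging suggestion could look something like the sketch below: write a "start" line with the inputs before the call and an "end" line after it, so that a start with no matching end points at the request (and the input values) that hung. It is Python for illustration only; logged_call and the "webservice" logger name are assumptions of this sketch, not part of CF.

```python
import itertools
import logging
import time

log = logging.getLogger("webservice")
_ids = itertools.count(1)  # simple per-process call counter


def logged_call(func, **inputs):
    """Call func(**inputs), logging the inputs before and the timing after."""
    call_id = next(_ids)
    # Logged BEFORE the call, so a hung request still leaves a trace.
    log.info("call %d start inputs=%r", call_id, inputs)
    started = time.monotonic()
    try:
        result = func(**inputs)
    except Exception:
        log.exception("call %d failed after %.2fs",
                      call_id, time.monotonic() - started)
        raise
    log.info("call %d end after %.2fs", call_id, time.monotonic() - started)
    return result
```

With the log in hand, the inputs from any unmatched "start" line are the ones worth replaying by hand from another machine, as suggested above.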
[cfaussie] Re: JRUN hanging
Is this a CF webservice?