Re: Cocoon 2.1.7 hang
* Ralph Goers: That update included some new calculators written in flowscript. This version of Cocoon is using rhino1.5r4-continuations-20040629T1232.jar. Ralph, Have you tried to replace this jar with Cocoon trunk's Rhino jar? We are successfully running it inside Cocoon 2.1. -- Jean-Baptiste Quenot http://caraldi.com/jbq/
Re: Cocoon 2.1.7 hang
On 1/19/06, Ralph Goers [EMAIL PROTECTED] wrote: I looked at what I believe is the right version of ContinuationInterpreter (http://svn.cocoondev.org/repos/rhino+cont/branches/BEFORE_PACKAGENAME_CHANGE/rhino1_5R4pre/src/org/mozilla/javascript/continuations/ContinuationInterpreter.java) and found that it has a while(true) loop and that both line 657 and line 1134 are within it. The loop has a really large switch statement (line 657 is TokenStream.SETNAME and line 1134 is NON_TAIL_CALL). Unfortunately, I have no idea what it is trying to do, but apparently it never breaks out of the loop. Random guess: continuation cleanup? Is it possible that you somehow have a looping continuations tree? Do you use createWebContinuation in your calculator to manually bookmark things? If so, could there be a loop in the resultant continuations structure? -- Peter Hunsberger
Re: Cocoon 2.1.7 hang
No. I might suggest they test it in our development environment after they get the system stabilized. Jean-Baptiste Quenot wrote: * Ralph Goers: That update included some new calculators written in flowscript. This version of Cocoon is using rhino1.5r4-continuations-20040629T1232.jar. Ralph, Have you tried to replace this jar with Cocoon trunk's Rhino jar? We are successfully running it inside Cocoon 2.1.
Re: Cocoon 2.1.7 hang
I thought I'd update you all on the problems we have been having with our production deployment. We finally rolled out an update that included proper pool sizes for the components and the fix to the Castor mapping file. However, before we did that our system engineer provided some valuable insight. He provided a graph that shows one CPU going into a hard loop. About 7 minutes later the system became completely congested and ran out of threads. The pool and mapping file changes helped in that when the first failure after the changes occurred it took about 30 minutes for the system to run out of threads.

However, that information caused me to go back and look at my stack traces. It turns out that every single one (with the exception noted below) showed one thread doing the same thing. Now, we first deployed this product in March of 2005 and experienced no failures until a product update was released in August. That update included some new calculators written in flowscript. This version of Cocoon is using rhino1.5r4-continuations-20040629T1232.jar. The stack traces indicate that these are going into a loop and causing the system to die. At first we thought that the calculators were not doing proper input validation and causing the weird things to happen.
The stack traces kind of supported this in that they look like:

http-8080-Processor18 daemon prio=1 tid=0x30e38df8 nid=0x51a8 runnable [2d351000..2d35387c]
    at java.lang.Class.isPrimitive(Native Method)
    at org.mozilla.javascript.NativeJavaObject.getConversionWeight(NativeJavaObject.java:324)
    at org.mozilla.javascript.NativeJavaObject.canConvert(NativeJavaObject.java:259)
    at org.mozilla.javascript.NativeJavaMethod.findFunction(NativeJavaMethod.java:356)
    at org.mozilla.javascript.NativeJavaMethod.call(NativeJavaMethod.java:193)
    at org.mozilla.javascript.ScriptRuntime.call(ScriptRuntime.java:1244)
    at org.mozilla.javascript.continuations.ContinuationInterpreter.interpret(ContinuationInterpreter.java:1134)
    at org.mozilla.javascript.continuations.ContinuationInterpreter.interpret(ContinuationInterpreter.java:190)
    at org.mozilla.javascript.continuations.ContinuationInterpreter.interpret(ContinuationInterpreter.java:138)
    at org.mozilla.javascript.continuations.InterpretedFunctionImpl.call(InterpretedFunctionImpl.java:121)
    at org.mozilla.javascript.ScriptRuntime.call(ScriptRuntime.java:1244)
    at org.mozilla.javascript.ScriptableObject.callMethod(ScriptableObject.java:1591)
    at org.apache.cocoon.components.flow.javascript.fom.FOM_JavaScriptInterpreter.handleContinuation(FOM_JavaScriptInterpreter.java:812)
    - locked 0x66005778 (a org.apache.cocoon.components.flow.javascript.fom.FOM_JavaScriptInterpreter$ThreadScope)
    at org.apache.cocoon.components.treeprocessor.sitemap.CallFunctionNode.invoke(CallFunctionNode.java:123)

However, we have been able to recreate the loop without entering bad data. In addition, we got a trace that was close to the start of the loop and it is somewhat different. It seems to imply that there is something wrong with Continuation handling, but I have no idea. Two traces taken a few minutes later both looked like the one above.
    at org.mozilla.javascript.Interpreter.doubleWrap(Interpreter.java:2491)
    at org.mozilla.javascript.continuations.ContinuationInterpreter.interpret(ContinuationInterpreter.java:657)
    at org.mozilla.javascript.continuations.ContinuationInterpreter.interpret(ContinuationInterpreter.java:190)
    at org.mozilla.javascript.continuations.ContinuationInterpreter.interpret(ContinuationInterpreter.java:138)
    at org.mozilla.javascript.continuations.InterpretedFunctionImpl.call(InterpretedFunctionImpl.java:121)
    at org.mozilla.javascript.ScriptRuntime.call(ScriptRuntime.java:1244)
    at org.mozilla.javascript.ScriptableObject.callMethod(ScriptableObject.java:1591)
    at org.apache.cocoon.components.flow.javascript.fom.FOM_JavaScriptInterpreter.handleContinuation(FOM_JavaScriptInterpreter.java:812)
    - locked 0x66005880 (a org.apache.cocoon.components.flow.javascript.fom.FOM_JavaScriptInterpreter$ThreadScope)
    at org.apache.cocoon.components.treeprocessor.sitemap.CallFunctionNode.invoke(CallFunctionNode.java:123)

We have a way to get around this problem by replacing the flowscript calculators with CGIs for the time being. However, we will want to do something about this problem in the future. One difficulty in debugging this problem though is that we have no idea which calculators are running or where they are at the time of the failure because interpreted javascript doesn't show up in the stack trace. As a consequence I will probably recommend that they be rewritten as JSR-168 portlets instead of using flow - unless someone has a better idea. Thoughts and comments are welcome. Ralph
Re: Cocoon 2.1.7 hang
On 1/18/06, Ralph Goers [EMAIL PROTECTED] wrote: However, we have been able to recreate the loop without entering bad data. How? Just random pounding on the calculator? Thoughts and comments are welcome. Looks to me like both times you've caught the process of a continuation being trapped and a flow script being executed as a result. Slightly different exit out of the continuations handler, however: ContinuationInterpreter.interpret(ContinuationInterpreter.java:657) may provide the clue you need? Breakpoints on one of the earlier ContinuationInterpreter points might also help if you can reproduce with a debugger attached? -- Peter Hunsberger
Re: Cocoon 2.1.7 hang
Peter Hunsberger wrote: On 1/18/06, Ralph Goers [EMAIL PROTECTED] wrote: However, we have been able to recreate the loop without entering bad data. How? Just random pounding on the calculator? Yup. With normal input. Thoughts and comments are welcome. Looks to me like both times you've caught the process of a continuation being trapped and a flow script being executed as a result. Slightly different exit out of the continuations handler, however: ContinuationInterpreter.interpret(ContinuationInterpreter.java:657) may provide the clue you need? Breakpoints on one of the earlier ContinuationInterpreter points might also help if you can reproduce with a debugger attached? I looked at what I believe is the right version of ContinuationInterpreter (http://svn.cocoondev.org/repos/rhino+cont/branches/BEFORE_PACKAGENAME_CHANGE/rhino1_5R4pre/src/org/mozilla/javascript/continuations/ContinuationInterpreter.java) and found that it has a while(true) loop and that both line 657 and line 1134 are within it. The loop has a really large switch statement (line 657 is TokenStream.SETNAME and line 1134 is NON_TAIL_CALL). Unfortunately, I have no idea what it is trying to do, but apparently it never breaks out of the loop. Ralph
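[Editor's note] A dispatch loop of the shape Ralph describes usually looks something like the sketch below (hypothetical opcodes and a toy stack machine, not Rhino's actual instruction set): a while(true) around a large switch, where the only exits are explicit returns. If any opcode handler fails to advance the program counter or reach a return, the loop spins forever, which would match the 100% CPU symptom.

```java
// Hypothetical sketch of an interpreter dispatch loop of the kind found in
// ContinuationInterpreter.interpret: while(true) over a big switch.
// The opcodes and the tiny stack machine here are invented for illustration.
public class DispatchLoopSketch {
    static final int PUSH = 0, ADD = 1, RETURN = 2;

    static int interpret(int[] code, int[] operands) {
        int[] stack = new int[16];
        int sp = 0, pc = 0;
        while (true) {                 // never exits except via RETURN
            switch (code[pc]) {
            case PUSH:                 // push a constant, advance pc
                stack[sp++] = operands[pc];
                pc++;
                break;
            case ADD:                  // pop two, push the sum, advance pc
                sp--;
                stack[sp - 1] += stack[sp];
                pc++;
                break;
            case RETURN:               // the only way out of the loop
                return stack[sp - 1];
            default:
                throw new IllegalStateException("bad opcode: " + code[pc]);
            }
        }
    }
}
```

A handler that hit its `break` without advancing `pc` would re-execute the same case indefinitely, which is the kind of hang the traces through lines 657 and 1134 suggest.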
Re: Cocoon 2.1.7 hang
* Ralph Goers: OK. I ran some basic tests on one of my machines. Just for basic info it is a P4 2.5 GHz with 1 GB of memory running RHEL 3. The only thing I did was set up JMeter to login to the portal as user cocoon. In all the tests the computer was maxed at 100% CPU.

Before the change:
5 threads, login repeated 10 times: Avg 3.4 seconds, Max 27 seconds
10 threads, login repeated 5 times: Avg 6.760 seconds, Max 22 seconds

After the change:
5 threads, login repeated 10 times: Avg 1.3 seconds, Max 2.6 seconds
5 threads, login repeated 20 times: Avg 1.2 seconds, Max 2.5 seconds
10 threads, login repeated 5 times: Avg 2.4 seconds, Max 10 seconds
10 threads, login repeated 10 times: Avg 2.1 seconds, Max 13 seconds

The change has been checked into 2.1. I'll test it on 2.2 and check it in also.

Hello Ralph, Happy new year! Your change seems very interesting, thank you very much. However, I have a more radical solution to this portal login problem, see: Speedup portal loading http://issues.apache.org/jira/browse/COCOON-1709 Also requires: Allow CopletInstanceDataManager to be cloneable http://issues.apache.org/jira/browse/COCOON-1708 -- Jean-Baptiste Quenot Systèmes d'Information ANYWARE TECHNOLOGIES Tel : +33 (0)5 61 00 52 90 Fax : +33 (0)5 61 00 51 46 http://www.anyware-tech.com/
Re: Cocoon 2.1.7 hang
I replied to that days ago in the issue (1709 I believe). In short, this is a good idea for sites (like mine) that only use anonymous users. However, the idea of permanently caching millions of user profiles in memory is very scary and will be considered to be a memory leak by many people. So, I'm -1 on just checking that patch in as is. However, if there was a way to enable it for only anonymous users I'd be all for that. Ralph Jean-Baptiste Quenot wrote: * Ralph Goers: OK. I ran some basic tests on one of my machines. Just for basic info it is a P4 2.5 GHz with 1 GB of memory running RHEL 3. The only thing I did was set up JMeter to login to the portal as user cocoon. In all the tests the computer was maxed at 100% CPU. Before the change: 5 threads login repeated 10 times: Avg 3.4 seconds, Max 27 seconds. 10 threads login repeated 5 times: Avg 6.760 seconds, Max 22 seconds After the change: 5 threads login repeated 10 times: Avg 1.3 seconds, Max 2.6 seconds 5 threads login repeated 20 times: Avg 1.2 seconds, Max 2.5 seconds 10 threads login repeated 5 times: Avg 2.4 seconds, Max 10 seconds. 10 threads login repeated 10 times: Avg 2.1 seconds, Max 13 seconds. The change has been checked into 2.1. I'll test it on 2.2 and check it in also. Hello Ralph, Happy new year! Your change seems very interesting, thank you very much. However, I have a more radical solution to this portal login problem, see: Speedup portal loading http://issues.apache.org/jira/browse/COCOON-1709 Also requires: Allow CopletInstanceDataManager to be cloneable http://issues.apache.org/jira/browse/COCOON-1708
Re: Cocoon 2.1.7 hang
* Ralph Goers: I replied to that days ago in the issue (1709 I believe). Sorry, I didn't notice your comment on JIRA, strangely. I will followup to your comment. -- Jean-Baptiste Quenot Systèmes d'Information ANYWARE TECHNOLOGIES Tel : +33 (0)5 61 00 52 90 Fax : +33 (0)5 61 00 51 46 http://www.anyware-tech.com/
Re: Cocoon 2.1.7 hang
Ralph Goers wrote: OK. I ran some basic tests on one of my machines. Just for basic info it is a P4 2.5 GHz with 1 GB of memory running RHEL 3. The only thing I did was set up JMeter to login to the portal as user cocoon. In all the tests the computer was maxed at 100% CPU. Before the change: 5 threads login repeated 10 times: Avg 3.4 seconds, Max 27 seconds. 10 threads login repeated 5 times: Avg 6.760 seconds, Max 22 seconds After the change: 5 threads login repeated 10 times: Avg 1.3 seconds, Max 2.6 seconds 5 threads login repeated 20 times: Avg 1.2 seconds, Max 2.5 seconds 10 threads login repeated 5 times: Avg 2.4 seconds, Max 10 seconds. 10 threads login repeated 10 times: Avg 2.1 seconds, Max 13 seconds. The change has been checked into 2.1. I'll test it on 2.2 and check it in also. Did you use the source from 2.1.8 in both tests? I'm just curious if the changes I did to the CastorConverter from 2.1.7 to 2.1.8 improve performance as well. Carsten -- Carsten Ziegeler - Open Source Group, SN AG http://www.s-und-n.de http://www.osoco.org/weblogs/rael/
Re: Cocoon 2.1.7 hang
I used the latest source for both 2.1 and trunk. Carsten Ziegeler wrote: Ralph Goers wrote: OK. I ran some basic tests on one of my machines. Just for basic info it is a P4 2.5 GHz with 1 GB of memory running RHEL 3. The only thing I did was set up JMeter to login to the portal as user cocoon. In all the tests the computer was maxed at 100% CPU. Before the change: 5 threads login repeated 10 times: Avg 3.4 seconds, Max 27 seconds. 10 threads login repeated 5 times: Avg 6.760 seconds, Max 22 seconds After the change: 5 threads login repeated 10 times: Avg 1.3 seconds, Max 2.6 seconds 5 threads login repeated 20 times: Avg 1.2 seconds, Max 2.5 seconds 10 threads login repeated 5 times: Avg 2.4 seconds, Max 10 seconds. 10 threads login repeated 10 times: Avg 2.1 seconds, Max 13 seconds. The change has been checked into 2.1. I'll test it on 2.2 and check it in also. Did you use the source from 2.1.8 in both tests? I'm just curious if the changes I did to the CastorConverter from 2.1.7 to 2.1.8 improve performance as well. Carsten
Re: Cocoon 2.1.7 hang
OK. I ran some basic tests on one of my machines. Just for basic info it is a P4 2.5 GHz with 1 GB of memory running RHEL 3. The only thing I did was set up JMeter to login to the portal as user cocoon. In all the tests the computer was maxed at 100% CPU.

Before the change:
5 threads, login repeated 10 times: Avg 3.4 seconds, Max 27 seconds
10 threads, login repeated 5 times: Avg 6.760 seconds, Max 22 seconds

After the change:
5 threads, login repeated 10 times: Avg 1.3 seconds, Max 2.6 seconds
5 threads, login repeated 20 times: Avg 1.2 seconds, Max 2.5 seconds
10 threads, login repeated 5 times: Avg 2.4 seconds, Max 10 seconds
10 threads, login repeated 10 times: Avg 2.1 seconds, Max 13 seconds

The change has been checked into 2.1. I'll test it on 2.2 and check it in also. Ralph

Carsten Ziegeler wrote: Castor seems to have a lot of useful little hacks - I just found out that we can prevent Castor from checking for default constructors, which I really needed for 2.2 - it's there, you only have to find out how to configure it :). I'm curious what the matches configuration looks like :) Carsten Ralph Goers wrote: OK. I figured out how to use the matches attribute and was able to verify that it doesn't throw ClassNotFoundExceptions all the time. I'll do a little load testing next to see what kind of difference it makes on throughput.
Re: Cocoon 2.1.7 hang
We took some thread dumps of our product when it was running normally. It was interesting in that we still saw in almost every stack trace the portal calling Castor, which was in the class loader throwing a ClassNotFoundException. I then stepped through the sample site and have discovered that when bind-xml auto-naming="deriveByClass" is used, Castor starts making up names and trying to load them. For example, when a named item is specified inside a composite layout, Castor takes org.apache.cocoon.portal.layout.impl.CompositeLayoutImpl, strips off CompositeLayoutImpl and replaces it with NamedItem. It then tries to load that class and gets a ClassNotFoundException because NamedItem isn't in the same package. It eventually uses the correct class name. This makes login extremely slow as every item is throwing an exception. And, I expect, when the resource pool is exceeded the class loader is completely overstressed and the system comes to a grinding halt. It doesn't actually stop, but from then on it moves so slowly that it might as well be dead. The code in Castor suggests using the matches attribute to bypass this. Unfortunately, there are no examples to be found on how matches could solve this problem. So the bottom line is, unless all your classes are in the same package do not use deriveByClass. Or don't use Castor. Ralph
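[Editor's note] The name derivation Ralph describes can be sketched as follows. This mimics the observed behaviour, not Castor's actual code: take the package of the class being bound and append the element name, which can yield a class that does not exist.

```java
// Sketch of the "deriveByClass" name guessing described above; this is an
// illustration of the observed behaviour, not Castor's implementation.
public class DeriveByClassSketch {
    // e.g. ("org.apache.cocoon.portal.layout.impl.CompositeLayoutImpl", "NamedItem")
    //   -> "org.apache.cocoon.portal.layout.impl.NamedItem"
    static String deriveCandidate(String contextClass, String elementName) {
        int lastDot = contextClass.lastIndexOf('.');
        String pkg = lastDot < 0 ? "" : contextClass.substring(0, lastDot + 1);
        return pkg + elementName;   // derived name: same package, new simple name
    }
}
```

Since the real NamedItem does not live in that package, every lookup of the derived name ends in a ClassNotFoundException before the correct class is finally used.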
Re: Cocoon 2.1.7 hang
OK. I figured out how to use the matches attribute and was able to verify that it doesn't throw ClassNotFoundExceptions all the time. I'll do a little load testing next to see what kind of difference it makes on throughput. Ralph Goers wrote: We took some thread dumps of our product when it was running normally. It was interesting in that we still saw in almost every stack trace the portal calling castor which was in the class loader throwing a ClassNotFoundException. I then stepped through the sample site and have discovered that when bind-xml auto-naming=deriveByClass is used, Castor starts making up names and trying to load them. For example, when a named item is specified inside a composite layout castor takes org.apache.cocoon.portal.layout.impl.CompositeLayoutImpl strips off CompositeLayoutImpl and replaces it with NamedItem. It then tries to load that class and gets a ClassNotFoundException because NamedItem isn't in the same package. It eventually uses the correct class name. This makes login extremely slow as every item is throwing an exception. And, I expect, when the resource pool is exceeded the class loader is completely overstressed and the system comes to a grinding halt. It doesn't actually stop, but from then on it moves so slowly that it might as well be dead. The code in Castor suggests using the matches attribute to bypass this. Unfortunately, there are no examples to be found on how matches could solve this problem. So the bottom line is, unless all your classes are in the same package do not use deriveByClass. Or don't use Castor. Ralph
Re: Cocoon 2.1.7 hang
Castor seems to have a lot of useful little hacks - I just found out that we can prevent Castor from checking for default constructors, which I really needed for 2.2 - it's there, you only have to find out how to configure it :). I'm curious what the matches configuration looks like :) Carsten Ralph Goers wrote: OK. I figured out how to use the matches attribute and was able to verify that it doesn't throw ClassNotFoundExceptions all the time. I'll do a little load testing next to see what kind of difference it makes on throughput. Ralph Goers wrote: We took some thread dumps of our product when it was running normally. It was interesting in that we still saw in almost every stack trace the portal calling castor which was in the class loader throwing a ClassNotFoundException. I then stepped through the sample site and have discovered that when bind-xml auto-naming=deriveByClass is used, Castor starts making up names and trying to load them. For example, when a named item is specified inside a composite layout castor takes org.apache.cocoon.portal.layout.impl.CompositeLayoutImpl strips off CompositeLayoutImpl and replaces it with NamedItem. It then tries to load that class and gets a ClassNotFoundException because NamedItem isn't in the same package. It eventually uses the correct class name. This makes login extremely slow as every item is throwing an exception. And, I expect, when the resource pool is exceeded the class loader is completely overstressed and the system comes to a grinding halt. It doesn't actually stop, but from then on it moves so slowly that it might as well be dead. The code in Castor suggests using the matches attribute to bypass this. Unfortunately, there are no examples to be found on how matches could solve this problem. So the bottom line is, unless all your classes are in the same package do not use deriveByClass. Or don't use Castor. Ralph -- Carsten Ziegeler - Open Source Group, SN AG http://www.s-und-n.de http://www.osoco.org/weblogs/rael/
Re: Cocoon 2.1.7 hang
Ralph Goers wrote: ... when bind-xml auto-naming=deriveByClass is used, Castor starts making up names and trying to load them. ... I expect, when the resource pool is exceeded the class loader is completely overstressed and the system comes to a grinding halt. It doesn't actually stop, but from then on it moves so slowly that it might as well be dead. Hm, just to clarify, you suggest that if a ClassLoader is repeatedly asked to load a lot of non-existing classes, it eventually slows down a lot? Hm, wouldn't that be a bug on the ClassLoader's part... Vadim
Re: Cocoon 2.1.7 hang
Vadim Gritsenko wrote: Ralph Goers wrote: ... when bind-xml auto-naming=deriveByClass is used, Castor starts making up names and trying to load them. ... I expect, when the resource pool is exceeded the class loader is completely overstressed and the system comes to a grinding halt. It doesn't actually stop, but from then on it moves so slowly that it might as well be dead. Hm, just to clarify, you suggest that if a ClassLoader is repeatedly asked to load a lot of non-existing classes, it eventually slows down a lot? Hm, wouldn't that be a bug on the ClassLoader's part... Class loading involves searching parent class loaders. If the class is found, most class loaders put it in a map so they don't have to get it again. Thus loading the same class over and over again should be pretty cheap. But they don't keep track of what classes they didn't find. So constantly looking for non-existent classes is going to be very expensive. Since it is done in a synchronized method it is going to become a huge system bottleneck. Vadim
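[Editor's note] One way to see the asymmetry described here is a wrapper loader that adds the missing negative cache. This is a hypothetical sketch for illustration, not a fix applied anywhere in this thread: repeated lookups of the same missing class pay for the parent-chain search only once.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: a class loader that remembers failed lookups, so
// asking for the same missing class repeatedly only searches the parent
// chain once. Stock loaders cache hits but not misses, as described above.
public class NegativeCachingLoader extends ClassLoader {
    private final Set<String> knownMisses = new HashSet<String>();
    int parentSearches = 0;   // counts how often we delegate to the parent

    public NegativeCachingLoader(ClassLoader parent) {
        super(parent);
    }

    @Override
    protected synchronized Class<?> loadClass(String name, boolean resolve)
            throws ClassNotFoundException {
        if (knownMisses.contains(name)) {
            // Cached miss: fail fast without searching the parent chain again.
            throw new ClassNotFoundException(name + " (cached miss)");
        }
        try {
            parentSearches++;
            return super.loadClass(name, resolve);
        } catch (ClassNotFoundException e) {
            knownMisses.add(name);
            throw e;
        }
    }

    // Convenience for callers: true if the class could be loaded.
    public boolean tryLoad(String name) {
        try {
            loadClass(name);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }
}
```

Whether such a cache belongs in the loader or in the caller (here, Castor) is debatable; the sketch just shows why the repeated misses Ralph saw are so expensive without one.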
Re: Cocoon 2.1.7 hang
I'll check them in as soon as I get some testing done. Shouldn't be too long. Carsten Ziegeler wrote: Castor seems to have a lot of useful little hacks - I just found out, that we can prevent castor from checking for default constructors which I really needed for 2.2 - it's there, you only have to find out how to configure it :). Im curious how the matches configuration looks like :) Carsten
Re: Cocoon 2.1.7 hang
Ralph Goers wrote: We tried to deploy an update to our product. Pretty much the only thing we did to Cocoon was to replace Xalan with XSLTC, which produces a dramatic performance improvement. However, Cocoon is now consistently hanging. I tried to attach the thread dumps but they are too big. They also don't make any sense to me. I've reduced them down and pasted them below. It shows many calls to XMLFileModule waiting for a lock. The thread that has the lock is waiting for ResourceLimitingPool to get a lock. Don't know if there's a problem in ResourceLimitingPool, but looking at your stacktrace and XMLFileModule, I suspect a potential deadlock when one of the documents searched by XMLFileModule is cocoon: and that module is used when doing some parallel CInclude. The synchronized block in DocumentHelper.getDocument() can be way smaller, and avoid the lock by _not_ resolving URIs in a synch'ed block. Sylvain -- Sylvain Wallez, Anyware Technologies http://bluxte.net http://www.anyware-tech.com Apache Software Foundation Member Research Technology Director
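[Editor's note] Sylvain's suggestion of resolving outside the lock can be sketched like this. It is a generic illustration with made-up names, not the actual DocumentHelper code: the expensive resolve step, which may itself trigger a cocoon: sub-request that needs a pooled component, runs with no monitor held.

```java
import java.util.HashMap;
import java.util.Map;

// Illustration of narrowing a synchronized block: look up and publish the
// cache entry under the lock, but run the (possibly re-entrant) resolve
// step outside it. All names here are invented; this is not DocumentHelper.
public class NarrowSyncCache {
    private final Map<String, String> cache = new HashMap<String, String>();

    String get(String uri) {
        synchronized (cache) {          // short critical section: lookup only
            String doc = cache.get(uri);
            if (doc != null) {
                return doc;
            }
        }
        // Resolve with no lock held, so a sub-request that blocks on another
        // pool cannot deadlock against this monitor.
        String doc = resolve(uri);
        synchronized (cache) {          // short critical section: publish
            String prior = cache.get(uri);
            if (prior == null) {
                cache.put(uri, doc);
                prior = doc;
            }
            return prior;
        }
    }

    // Stand-in for URI resolution and parsing; may do slow or re-entrant work.
    private String resolve(String uri) {
        return "document-for:" + uri;
    }
}
```

The trade-off is that two threads may resolve the same URI concurrently, but duplicated work is usually preferable to holding a lock across a call that can block.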
Re: Cocoon 2.1.7 hang
On Dec 24, 2005, at 12:16 PM, Ralph Goers wrote: Has anyone had problems with ehcache? I suspect that our problems are being caused by problems with data being returned from the cache when the cache starts writing to disk. Do you really need the persistence store? If not, replace the store with the default implementation - should be much more efficient. Vadim
Re: Cocoon 2.1.7 hang
Has anyone had problems with ehcache? I suspect that our problems are being caused by problems with data being returned from the cache when the cache starts writing to disk. Ralph
Re: Cocoon 2.1.7 hang
Does anyone have any experience with these binding frameworks? https://bindmark.dev.java.net/ It sure looks like Castor is a poor choice for what the portal is doing. Notably missing from the list is JaxMe. Ralph Carsten Ziegeler wrote: The latest version of Castor has bugs in the reference handling (as far as I remember). As we don't use the castor references in 2.1.x (but in 2.2), this shouldn't affect you, so a newer version should work. HTH Carsten
Re: Cocoon 2.1.7 hang
We are running Sun JDK 1.4.2_05 on RHEL 3. The Tomcat is 5.0 something (I'm out of town at the moment so I don't have access to the info). Frankly, I'm suspecting that ehcache is returning a bad document to Castor although I don't have any proof. But if that is the case I really would have expected Castor to get or throw an exception, not just call the classloader over and over. The lock at 0x60b19148 was held by the last thread, which was:

http-8080-Processor26 daemon prio=1 tid=0x0821b148 nid=0x1e8b waiting for monitor entry [2cafd000..2caff87c]
    at java.lang.String.replace(String.java:1555)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:190)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:187)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:289)
    - locked 0x60b1d920 (a sun.misc.Launcher$ExtClassLoader)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:282)
    - locked 0x60b19148 (a sun.misc.Launcher$AppClassLoader)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:274)
    - locked 0x60b19148 (a sun.misc.Launcher$AppClassLoader)

Ralph

Pier Fumagalli wrote: On 22 Dec 2005, at 18:16, Ralph Goers wrote: We finally got some thread dumps from our production server. It shows something very different than what we were seeing in testing. First, this happens under light load after running for days. To summarize, many threads are waiting for the ResourceLimitingPool and several are waiting for the class loader. This system hasn't had the pools tuned so I'm not surprised about pool contention, but I don't believe that is the issue. That is because the thread holding the lock is simply waiting for the class loader. We took two traces and both were similar, but not identical. Different threads were holding the class loader lock in both. However, in both cases the threads holding the class loader lock were called from Castor while creating the portal layout.
So far, we have been speculating that the problem is due to a problem with the NPTL threads on Enterprise Linux 3. However, I'm wondering if perhaps Castor is having problems and simply calling the class loader over and over. I'd appreciate any ideas. Ok, as far as I can see down the dumps you might have some problems with Catalina's classloader implementation locking up at 0x60b19148:

    at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1255)

That seems odd though... I thought that code was debugged pretty thoroughly, unless a secondary lock at 0x60cd9970 prevents the first one from being released... Anyhow, from my experience, NPTL doesn't cause any problems whatsoever under Linux, but that said, I'm running on Jetty 4 with BEA JRockit 1.4.2. What VM and what container are you actually using? Pier
Re: Cocoon 2.1.7 hang
Thanks for the info. Do you think we should rewrite this using Digester instead of Castor? Carsten Ziegeler wrote: Someone told me some months ago that they experienced several strange issues with Castor under load - I think he mentioned class loading and synchronization problems. I still can't remember *who* it was :( So, chances are that this is really caused somehow by Castor. I improved the Castor portal implementation for 2.1.8 slightly - the version in 2.1.7 parsed the mapping each time a profile was loaded. I guess that through this operation the classes are loaded as well. In 2.1.8 the mapping is only read once on startup. So my suggestion is to use the CastorSourceConverter from 2.1.8 in your 2.1.7 environment and see what happens. Carsten
Re: Cocoon 2.1.7 hang
Ralph Goers wrote: Thanks for the info. Do you think we should rewrite this using Digester instead of Castor? I never really liked Castor; at the time when we started with the portal, Digester was not usable for us as there were some problems with complex types like lists and maps (at least as far as I remember). But if Digester meets your needs, I think it makes sense to give it a try. Carsten -- Carsten Ziegeler - Open Source Group, SN AG http://www.s-und-n.de http://www.osoco.org/weblogs/rael/
Re: Cocoon 2.1.7 hang
Ralph Goers wrote: Thanks for the info. Do you think we should rewrite this using Digester instead of Castor? Commons Digester is only a one-way tool: XML to Java. Castor could be replaced by XMLBeans (if you need to use XML Schema), XStream (if you only want to de/serialize XML to Java) or JAXB (which also provides separate mapping files). You could also try to upgrade Castor to the latest release. If there are problems, I'm sure the Castor community will be very helpful. -- Reinhard Pötz Independent Consultant, Trainer (IT)-Coach {Software Engineering, Open Source, Web Applications, Apache Cocoon} web(log): http://www.poetz.cc
Re: Cocoon 2.1.7 hang
What were the problems you had with the last version of Castor you tried? (I believe a new version is now available). Reinhard pointed out a potential problem with Digester - we have to write out the instance data when preferences are updated. I wish I understood what exactly the problem is. We have run this under load in our QA environment with no problems but it hangs up in production with a fairly light load. Ralph Carsten Ziegeler wrote: Ralph Goers wrote: Thanks for the info. Do you think we should rewrite this using Digester instead of Castor? I never really liked Castor; at the time when we started with the portal Digester was not usable for us as there were some problems with complex types like lists and maps (at least as far as i remember). But if Digester meets your needs, I think it makes sense to give it a try. Carsten
Re: Cocoon 2.1.7 hang
Ralph Goers wrote: What were the problems you had with the last version of Castor you tried? (I believe a new version is now available). Reinhard pointed out a potential problem with Digester - we have to write out the instance data when preferences are updated. Does the data follow an XML schema or some sort of contract? If not, and you just want to serialize them in a human-readable format, use XStream, which is *very* simple to use. Of course this leads to backwards incompatibilities ... -- Reinhard Pötz Independent Consultant, Trainer (IT)-Coach {Software Engineering, Open Source, Web Applications, Apache Cocoon} web(log): http://www.poetz.cc
Re: Cocoon 2.1.7 hang
Ralph Goers wrote: What were the problems you had with the last version of Castor you tried? (I believe a new version is now available). Reinhard pointed out a potential problem with Digester - we have to write out the instance data when preferences are updated. I wish I understood what exactly the problem is. We have run this under load in our QA environment with no problems but it hangs up in production with a fairly light load. The latest version of Castor has bugs in the reference handling (as far as I remember). As we don't use the castor references in 2.1.x (but in 2.2), this shouldn't affect you, so a newer version should work. HTH Carsten -- Carsten Ziegeler - Open Source Group, SN AG http://www.s-und-n.de http://www.osoco.org/weblogs/rael/
Re: Cocoon 2.1.7 hang
We finally got some thread dumps from our production server. It shows something very different than what we were seeing in testing. First, this happens under light load after running for days. To summarize, many threads are waiting for the ResourceLimitingPool and several are waiting for the class loader. This system hasn't had the pools tuned so I'm not surprised about pool contention, but I don't believe that is the issue. That is because the thread holding the lock is simply waiting for the class loader. We took two traces and both were similar, but not identical. Different threads were holding the class loader lock in both. However, in both cases the threads holding the class loader lock were called from Castor while creating the portal layout. So far, we have been speculating that the problem is due to a problem with the NPTL threads on Enterprise Linux 3. However, I'm wondering if perhaps castor is having problems and simply calling the class loader over and over. I'd appreciate any ideas. We see many threads like this one... 
http-8080-Processor155 daemon prio=1 tid=0x083e3378 nid=0x1e8b waiting for monitor entry [22dc..22dc187c]
    at org.apache.avalon.excalibur.pool.ResourceLimitingPool.get(ResourceLimitingPool.java:262)
    - waiting to lock 0x60cd9970 (a java.lang.Object)
    at org.apache.avalon.excalibur.component.PoolableComponentHandler.doGet(PoolableComponentHandler.java:198)
    at org.apache.avalon.excalibur.component.ComponentHandler.get(ComponentHandler.java:381)
    at org.apache.avalon.excalibur.component.ExcaliburComponentSelector.select(ExcaliburComponentSelector.java:213)
    at org.apache.cocoon.components.ExtendedComponentSelector.select(ExtendedComponentSelector.java:260)
    at org.apache.cocoon.components.pipeline.AbstractProcessingPipeline.addTransformer(AbstractProcessingPipeline.java:267)
    at org.apache.cocoon.components.pipeline.impl.AbstractCachingProcessingPipeline.addTransformer(AbstractCachingProcessingPipeline.java:143)
    at org.apache.cocoon.components.treeprocessor.sitemap.TransformNode.invoke(TransformNode.java:59)
    at org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:46)
    at org.apache.cocoon.components.treeprocessor.sitemap.PreparableMatchNode.invoke(PreparableMatchNode.java:130)
    at org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:68)
    at org.apache.cocoon.components.treeprocessor.sitemap.PipelineNode.invoke(PipelineNode.java:138)
    at org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:68)
    at org.apache.cocoon.components.treeprocessor.sitemap.PipelinesNode.invoke(PipelinesNode.java:89)
    at org.apache.cocoon.components.treeprocessor.ConcreteTreeProcessor.process(ConcreteTreeProcessor.java:240)
    at org.apache.cocoon.components.treeprocessor.ConcreteTreeProcessor.process(ConcreteTreeProcessor.java:180)
    at org.apache.cocoon.components.treeprocessor.TreeProcessor.process(TreeProcessor.java:243)
    at org.apache.cocoon.Cocoon.process(Cocoon.java:606)
    at org.apache.cocoon.servlet.CocoonServlet.service(CocoonServlet.java:1119)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:237)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:157)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:214)
    at org.apache.catalina.core.StandardValveContext.invokeNext(StandardValveContext.java:104)
    at org.apache.catalina.core.StandardPipeline.invoke(StandardPipeline.java:520)
    at org.apache.catalina.core.StandardContextValve.invokeInternal(StandardContextValve.java:198)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:152)
    at org.apache.catalina.core.StandardValveContext.invokeNext(StandardValveContext.java:104)
    at org.apache.catalina.core.StandardPipeline.invoke(StandardPipeline.java:520)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:137)
    at org.apache.catalina.core.StandardValveContext.invokeNext(StandardValveContext.java:104)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:118)
    at org.apache.catalina.core.StandardValveContext.invokeNext(StandardValveContext.java:102)
    at org.apache.catalina.core.StandardPipeline.invoke(StandardPipeline.java:520)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
    at org.apache.catalina.core.StandardValveContext.invokeNext(StandardValveContext.java:104)
    at org.apache.catalina.core.StandardPipeline.invoke(StandardPipeline.java:520)
    at org.apache.catalina.core.ContainerBase.invoke(ContainerBase.java:929)
    ...
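The shape Ralph describes — pool waiters queued behind a thread that is itself stuck on the class loader monitor — is the classic signature of a lock-ordering deadlock. Below is a minimal, self-contained sketch (all class, lock, and thread names are invented for illustration; this is not Cocoon or Excalibur code) of two monitors taken in opposite orders, using `ThreadMXBean` to confirm the deadlock the same way a thread dump would. Note `java.lang.management` is Java 5+, so this particular detection API was not available on the 1.4-era JVMs discussed in this thread.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

public class DeadlockSketch {
    private static final Object poolLock = new Object();        // stands in for ResourceLimitingPool's monitor
    private static final Object classLoaderLock = new Object(); // stands in for the WebappClassLoader monitor

    public static void main(String[] args) throws Exception {
        CountDownLatch bothHeld = new CountDownLatch(2);

        Thread t1 = new Thread(() -> {
            synchronized (poolLock) {
                bothHeld.countDown();
                await(bothHeld);                       // wait until the other lock is also held
                synchronized (classLoaderLock) { }     // blocks forever: opposite acquisition order
            }
        }, "pool-then-loader");

        Thread t2 = new Thread(() -> {
            synchronized (classLoaderLock) {
                bothHeld.countDown();
                await(bothHeld);
                synchronized (poolLock) { }            // blocks forever
            }
        }, "loader-then-pool");

        t1.setDaemon(true);                            // daemon so the JVM can still exit
        t2.setDaemon(true);
        t1.start();
        t2.start();

        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long[] deadlocked = null;
        while (deadlocked == null) {                   // poll until the JVM reports the monitor deadlock
            Thread.sleep(50);
            deadlocked = mx.findMonitorDeadlockedThreads();
        }
        System.out.println("deadlocked threads: " + deadlocked.length);
    }

    private static void await(CountDownLatch latch) {
        try { latch.await(); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
    }
}
```

Under `jstack`, the two daemon threads here would show exactly the "waiting to lock" / "locked" pairing seen in the production traces above.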
Re: Cocoon 2.1.7 hang
On 22 Dec 2005, at 18:16, Ralph Goers wrote: We finally got some thread dumps from our production server. It shows something very different than what we were seeing in testing. First, this happens under light load after running for days. To summarize, many threads are waiting for the ResourceLimitingPool and several are waiting for the class loader. This system hasn't had the pools tuned so I'm not surprised about pool contention, but I don't believe that is the issue. That is because the thread holding the lock is simply waiting for the class loader. We took two traces and both were similar, but not identical. Different threads were holding the class loader lock in both. However, in both cases the threads holding the class loader lock were called from Castor while creating the portal layout. So far, we have been speculating that the problem is due to a problem with the NPTL threads on Enterprise Linux 3. However, I'm wondering if perhaps castor is having problems and simply calling the class loader over and over. I'd appreciate any ideas. Ok, as far as I can see down the dumps you might have some problems with Catalina's classloader implementation locking up at 0x60b19148: at org.apache.catalina.loader.WebappClassLoader.loadClass (WebappClassLoader.java:1255) That seems odd though... I thought that code was debugged pretty thoroughly, unless a secondary lock at 0x60cd9970 prevents the first one from being released... Anyhow, from my experience, NPTL doesn't cause any problems whatsoever under Linux, but that said, I'm running on Jetty 4 with BEA JRockit 1.4.2. What VM and what container are you actually using? Pier
Re: Cocoon 2.1.7 hang
Ralph Goers wrote: We finally got some thread dumps from our production server. It shows something very different than what we were seeing in testing. First, this happens under light load after running for days. To summarize, many threads are waiting for the ResourceLimitingPool and several are waiting for the class loader. This system hasn't had the pools tuned so I'm not surprised about pool contention, but I don't believe that is the issue. That is because the thread holding the lock is simply waiting for the class loader. We took two traces and both were similar, but not identical. Different threads were holding the class loader lock in both. However, in both cases the threads holding the class loader lock were called from Castor while creating the portal layout. So far, we have been speculating that the problem is due to a problem with the NPTL threads on Enterprise Linux 3. However, I'm wondering if perhaps castor is having problems and simply calling the class loader over and over. I'd appreciate any ideas. Someone told me some months ago that they experienced several strange issues with castor under load - I think he mentioned class loading and synchronization problems. I still can't remember *who* it was :( So, chances are that this is really caused somehow by castor. I improved the castor portal implementation for 2.1.8 slightly - the version in 2.1.7 parsed the mapping each time a profile was loaded. I guess that through this operation the classes are loaded as well. In 2.1.8 the mapping is only read once on startup. So my suggestion is to use the CastorSourceConverter from 2.1.8 in your 2.1.7 environment and see what happens. Carsten -- Carsten Ziegeler - Open Source Group, SN AG http://www.s-und-n.de http://www.osoco.org/weblogs/rael/
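Carsten's fix — read the mapping once at startup rather than on every profile load — is essentially memoizing an expensive parse. A generic, hypothetical sketch of that pattern (this is not the actual CastorSourceConverter code; `parseMapping` stands in for the Castor mapping parse and the class loading it triggers):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class MappingCache {
    private final Map<String, Object> cache = new ConcurrentHashMap<>();
    private int parseCount = 0; // exposed only so the demo can show the parse happened once

    public Object getMapping(String source) {
        // computeIfAbsent parses at most once per key, even under concurrent callers
        return cache.computeIfAbsent(source, this::parseMapping);
    }

    private synchronized Object parseMapping(String source) {
        parseCount++; // the expensive XML parse + class loading would happen here
        return "mapping:" + source;
    }

    public int parseCount() { return parseCount; }

    public static void main(String[] args) {
        MappingCache c = new MappingCache();
        c.getMapping("portal.xml");
        c.getMapping("portal.xml");
        c.getMapping("portal.xml");
        System.out.println(c.parseCount()); // parsed once despite three lookups
    }
}
```

Beyond the saved CPU, caching like this also keeps repeated class-loader traffic (and the lock it takes) off the per-request path, which is exactly the contention the dumps show.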
Re: Cocoon 2.1.7 hang
We are still experiencing this hang. Does anyone have any idea what may be causing these threads to hang? It seems to be hanging at the same spot, but admittedly I'm not sure what I'm looking at.

http-8080-Processor1 daemon prio=1 tid=0x082ca7d8 nid=0x574c runnable [2e6fe000..2e6ff87c]
    at java.net.SocketOutputStream.socketWrite0(Native Method)
    at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
    at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
    at org.apache.coyote.http11.InternalOutputBuffer.realWriteBytes(InternalOutputBuffer.java:714)
    at org.apache.tomcat.util.buf.ByteChunk.flushBuffer(ByteChunk.java:398)
    at org.apache.coyote.http11.InternalOutputBuffer.flush(InternalOutputBuffer.java:304)
    at org.apache.coyote.http11.Http11Processor.action(Http11Processor.java:921)
    at org.apache.coyote.Response.action(Response.java:182)
    at org.apache.coyote.tomcat5.OutputBuffer.doFlush(OutputBuffer.java:326)
    at org.apache.coyote.tomcat5.OutputBuffer.flush(OutputBuffer.java:297)
    at org.apache.coyote.tomcat5.CoyoteOutputStream.flush(CoyoteOutputStream.java:85)
    at org.apache.cocoon.util.BufferedOutputStream.realFlush(BufferedOutputStream.java:128)
    at org.apache.cocoon.environment.AbstractEnvironment.commitResponse(AbstractEnvironment.java:512)
    at org.apache.cocoon.Cocoon.process(Cocoon.java:630)
    at org.apache.cocoon.servlet.CocoonServlet.service(CocoonServlet.java:1119)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:237)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:157)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:214)
    at org.apache.catalina.core.StandardValveContext.invokeNext(StandardValveContext.java:104)
    at org.apache.catalina.core.StandardPipeline.invoke(StandardPipeline.java:520)
    at org.apache.catalina.core.StandardContextValve.invokeInternal(StandardContextValve.java:198)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:152)
    at org.apache.catalina.core.StandardValveContext.invokeNext(StandardValveContext.java:104)
    at org.apache.catalina.core.StandardPipeline.invoke(StandardPipeline.java:520)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:137)
    at org.apache.catalina.core.StandardValveContext.invokeNext(StandardValveContext.java:104)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:118)
    at org.apache.catalina.core.StandardValveContext.invokeNext(StandardValveContext.java:102)
    at org.apache.catalina.core.StandardPipeline.invoke(StandardPipeline.java:520)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
    at org.apache.catalina.core.StandardValveContext.invokeNext(StandardValveContext.java:104)
    at org.apache.catalina.core.StandardPipeline.invoke(StandardPipeline.java:520)
    at org.apache.catalina.core.ContainerBase.invoke(ContainerBase.java:929)
    at org.apache.coyote.tomcat5.CoyoteAdapter.service(CoyoteAdapter.java:160)
    at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:799)
    at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.processConnection(Http11Protocol.java:705)
    at org.apache.tomcat.util.net.TcpWorkerThread.runIt(PoolTcpEndpoint.java:577)
    at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:683)
    at java.lang.Thread.run(Thread.java:534)
    ...
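One thing worth knowing when reading this dump: a thread blocked in native I/O such as `socketWrite0` still reports its state as `runnable`, so these threads are most likely stuck writing a response to a client that has stopped reading, rather than burning CPU. A small hypothetical demo (all names invented) that reproduces that state: the writer fills the loopback socket buffers against a peer that never reads, ends up blocked in the native write, and yet still shows as RUNNABLE.

```java
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class BlockedWriteDemo {
    public static void main(String[] args) throws Exception {
        ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
        Socket client = new Socket(InetAddress.getLoopbackAddress(), server.getLocalPort());
        Socket accepted = server.accept();
        accepted.setSendBufferSize(4096); // keep the OS buffers small so the write stalls quickly

        Thread writer = new Thread(() -> {
            try (OutputStream out = accepted.getOutputStream()) {
                byte[] chunk = new byte[8192];
                while (true) out.write(chunk); // peer never reads: this blocks in socketWrite0
            } catch (Exception ignored) { }
        }, "writer");
        writer.setDaemon(true);
        writer.start();

        Thread.sleep(2000); // give the buffers time to fill
        // Blocking native I/O does not count as WAITING/BLOCKED to the JVM:
        System.out.println("writer state: " + writer.getState());
        client.close();
        server.close();
    }
}
```

If this is what is happening in production, the usual suspects are slow or stalled clients; SO_TIMEOUT only covers reads on blocking sockets of that era, so a connector-level or front-end timeout is the typical mitigation.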
Re: Cocoon 2.1.7 hang
Java version? BTW, the src are here: http://apache.secsup.org/dist/avalon/excalibur-pool/source/ Best Regards, Antonio Gallardo. Ralph Goers wrote: We tried to deploy an update to our product. Pretty much the only thing we did to Cocoon was to replace Xalan with XSLTC which produces a dramatic performance improvement. However, Cocoon is now consistently hanging. I tried to attach the thread dumps but they are too big. They also don't make any sense to me. I've reduced them down and pasted them below. It shows many calls to XMLFileModule waiting for a lock. The thread that has the lock is waiting for ResourceLimitingPool to get a lock. However, the thread dump doesn't show any threads holding that lock. Does anyone have any ideas on this? This is cocoon 2.1.7-dev svn revision 122686. Cocoon-2.1.7 is using excalibur-1.2. I can't find the source at excalibur. Anyone know where I can get it? Lots of these waiting:

http-8080-Processor25 daemon prio=1 tid=0x2dffa660 nid=0x19eb waiting for monitor entry [2d3eb000..2d3ed87c]
    at org.apache.cocoon.components.modules.input.XMLFileModule$DocumentHelper.getDocument(XMLFileModule.java:155)
    - waiting to lock 0x5f1824b8 (a org.apache.cocoon.components.modules.input.XMLFileModule$DocumentHelper)
    at org.apache.cocoon.components.modules.input.XMLFileModule.getContextObject(XMLFileModule.java:357)
    at org.apache.cocoon.components.modules.input.XMLFileModule.getAttribute(XMLFileModule.java:380)
    at org.apache.cocoon.components.modules.input.XMLFileModule.getAttribute(XMLFileModule.java:368)
    at org.apache.cocoon.components.treeprocessor.variables.PreparedVariableResolver.processModule(PreparedVariableResolver.java:256)
    at org.apache.cocoon.components.treeprocessor.variables.PreparedVariableResolver.resolve(PreparedVariableResolver.java:207)
    at org.apache.cocoon.components.treeprocessor.sitemap.TransformNode.invoke(TransformNode.java:59)
    at org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:68)
    at org.apache.cocoon.components.treeprocessor.sitemap.SelectNode.invoke(SelectNode.java:97)
    at org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:46)
    at org.apache.cocoon.components.treeprocessor.sitemap.PreparableMatchNode.invoke(PreparableMatchNode.java:130)
    at org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:68)
    at org.apache.cocoon.components.treeprocessor.sitemap.PipelineNode.invoke(PipelineNode.java:138)
    at org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:68)
    at org.apache.cocoon.components.treeprocessor.sitemap.PipelinesNode.invoke(PipelinesNode.java:89)
    at org.apache.cocoon.components.treeprocessor.ConcreteTreeProcessor.process(ConcreteTreeProcessor.java:240)
    at org.apache.cocoon.components.treeprocessor.ConcreteTreeProcessor.buildPipeline(ConcreteTreeProcessor.java:198)
    at org.apache.cocoon.components.treeprocessor.TreeProcessor.buildPipeline(TreeProcessor.java:256)
    at org.apache.cocoon.components.treeprocessor.sitemap.MountNode.invoke(MountNode.java:108)
    at org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:46)
    at org.apache.cocoon.components.treeprocessor.sitemap.PreparableMatchNode.invoke(PreparableMatchNode.java:130)
    at org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:68)
    at org.apache.cocoon.components.treeprocessor.sitemap.PipelineNode.invoke(PipelineNode.java:138)
    at org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:68)
    at org.apache.cocoon.components.treeprocessor.sitemap.PipelinesNode.invoke(PipelinesNode.java:89)
    at org.apache.cocoon.components.treeprocessor.ConcreteTreeProcessor.process(ConcreteTreeProcessor.java:240)
    at org.apache.cocoon.components.treeprocessor.ConcreteTreeProcessor.buildPipeline(ConcreteTreeProcessor.java:198)
    at org.apache.cocoon.components.treeprocessor.TreeProcessor.buildPipeline(TreeProcessor.java:256)
    at org.apache.cocoon.components.treeprocessor.sitemap.MountNode.invoke(MountNode.java:108)
    at org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:46)
    at org.apache.cocoon.components.treeprocessor.sitemap.PreparableMatchNode.invoke(PreparableMatchNode.java:130)
    at org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:68)
    at org.apache.cocoon.components.treeprocessor.sitemap.PipelineNode.invoke(PipelineNode.java:138)
    ...
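The dump above shows every request serializing on the synchronized `XMLFileModule$DocumentHelper.getDocument`. For a read-mostly cache like that, a read/write lock keeps concurrent readers out of each other's way and only serializes the occasional reload. A hypothetical sketch of that alternative (this is not Cocoon's actual code, and `ReentrantReadWriteLock` is Java 5+, newer than the JDK 1.4 used in this thread):

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class DocumentHolder {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private Object document;   // the parsed DOM in the real module
    private long loadedAt = -1;

    public Object getDocument(long lastModified) {
        lock.readLock().lock(); // many readers may hold this at once
        try {
            if (document != null && loadedAt >= lastModified) return document;
        } finally {
            lock.readLock().unlock();
        }
        lock.writeLock().lock(); // only a stale/missing document takes the exclusive path
        try {
            if (document == null || loadedAt < lastModified) { // re-check under the write lock
                document = parse();
                loadedAt = lastModified;
            }
            return document;
        } finally {
            lock.writeLock().unlock();
        }
    }

    private Object parse() { return new Object(); } // stands in for the XML parse

    public static void main(String[] args) {
        DocumentHolder h = new DocumentHolder();
        Object first = h.getDocument(0);
        Object second = h.getDocument(0); // served from cache, no re-parse
        System.out.println(first == second);
    }
}
```

The re-check under the write lock matters: between dropping the read lock and acquiring the write lock, another thread may already have reloaded the document.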
Re: Cocoon 2.1.7 hang
Ralph Goers wrote: We tried to deploy an update to our product. Pretty much the only thing we did to Cocoon was to replace Xalan with XSLTC which produces a dramatic performance improvement. However, Cocoon is now consistently hanging. I tried to attach the thread dumps but they are too big. They also don't make any sense to me. I've reduced them down and pasted them below. It shows many calls to XMLFileModule waiting for a lock. The thread that has the lock is waiting for ResourceLimitingPool to get a lock. However, the thread dump doesn't show any threads holding that lock. Does anyone have any ideas on this? This is cocoon 2.1.7-dev svn revision 122686. Cocoon-2.1.7 is using excalibur-1.2. I can't find the source at excalibur. Anyone know where I can get it? the src are here: http://apache.secsup.org/dist/avalon/excalibur-pool/source/ BTW, java version?, xalan version? Best Regards, Antonio Gallardo. snip/
Re: Cocoon 2.1.7 hang
It is running on AIX so it is IBM's JDK 1.4. Thanks, I found the code. I don't see a whole lot different but there is clearly a problem in this code. Ralph Antonio Gallardo wrote: Java version? BTW, the src are here: http://apache.secsup.org/dist/avalon/excalibur-pool/source/ Best Regards, Antonio Gallardo. Ralph Goers wrote: We tried to deploy an update to our product. Pretty much the only thing we did to Cocoon was to replace Xalan with XSLTC which produces a dramatic performance improvement. However, Cocoon is now consistently hanging. I tried to attach the thread dumps but they are too big. They also don't make any sense to me. I've reduced them down and pasted them below. It shows many calls to XMLFileModule waiting for a lock. The thread that has the lock is waiting for ResourceLimitingPool to get a lock. However, the thread dump doesn't show any threads holding that lock. Does anyone have any ideas on this? This is cocoon 2.1.7-dev svn revision 122686. Cocoon-2.1.7 is using excalibur-1.2. I can't find the source at excalibur. Anyone know where I can get it?
snip/
Re: Cocoon 2.1.7 hang
Antonio Gallardo wrote: Ralph Goers wrote: We tried to deploy an update to our product. Pretty much the only thing we did to Cocoon was to replace Xalan with XSLTC which produces a dramatic performance improvement. However, Cocoon is now consistently hanging. I tried to attach the thread dumps but they are too big. They also don't make any sense to me. I've reduced them down and pasted them below. It shows many calls to XMLFileModule waiting for a lock. The thread that has the lock is waiting for ResourceLimitingPool to get a lock. However, the thread dump doesn't show any threads holding that lock. Does anyone have any ideas on this? This is cocoon 2.1.7-dev svn revision 122686. Cocoon-2.1.7 is using excalibur-1.2. I can't find the source at excalibur. Anyone know where I can get it? the src are here: http://apache.secsup.org/dist/avalon/excalibur-pool/source/ BTW, java version?, xalan version? xalan version is 2.6.1. As I recall, the revision of Cocoon I am using was very close to the final 2.1.7 release. Ralph