Re: Cocoon 2.1.7 hang

2006-01-19 Thread Jean-Baptiste Quenot
* Ralph Goers:

 That update included some new calculators written
 in flowscript. This version of Cocoon is using
 rhino1.5r4-continuations-20040629T1232.jar.

Ralph,

Have you tried replacing this jar with Cocoon trunk's Rhino jar?
We are successfully running it inside Cocoon 2.1.
-- 
Jean-Baptiste Quenot
http://caraldi.com/jbq/


Re: Cocoon 2.1.7 hang

2006-01-19 Thread Peter Hunsberger
On 1/19/06, Ralph Goers [EMAIL PROTECTED] wrote:
 
 I looked at what I believe is the right version of
 ContinuationInterpreter
 (http://svn.cocoondev.org/repos/rhino+cont/branches/BEFORE_PACKAGENAME_CHANGE/rhino1_5R4pre/src/org/mozilla/javascript/continuations/ContinuationInterpreter.java)
 and found that it has a while(true) loop and that both line 657 and line
 1134 are within it.  The loop has a really large switch statement (line
 657 is TokenStream.SETNAME and 1134 is NON_TAIL_CALL). Unfortunately, I
 have no idea what it is trying to do, but apparently it never breaks out
 of the loop.

Random guess: continuation cleanup?  Is it possible that you somehow
have a looping continuations tree? Do you use createWebContinuation
in your calculator to manually bookmark things?  If so, could there
be a loop in the resultant continuations structure?
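
If that guess is worth testing, a loop in a parent-linked structure can be detected mechanically. The following is only a sketch: the `Node` class and its `parent` field are hypothetical stand-ins, not Cocoon's continuation classes, and the check uses the generic tortoise-and-hare walk rather than anything taken from the Cocoon source.

```java
final class ContinuationCycleCheck {

    // Hypothetical stand-in for a continuation node; not Cocoon's actual class.
    static final class Node {
        final String id;
        Node parent; // null at the root

        Node(String id) {
            this.id = id;
        }
    }

    // Returns true if following parent links from start never reaches a root.
    static boolean hasParentCycle(Node start) {
        Node slow = start;
        Node fast = start;
        while (fast != null && fast.parent != null) {
            slow = slow.parent;        // advance one step
            fast = fast.parent.parent; // advance two steps
            if (slow == fast) {
                return true;           // the two walkers met: the chain loops
            }
        }
        return false;                  // the fast walker ran off the root
    }

    public static void main(String[] args) {
        Node root = new Node("root");
        Node child = new Node("child");
        child.parent = root;
        System.out.println("plain chain loops: " + hasParentCycle(child));
        root.parent = child; // deliberately introduce a loop
        System.out.println("looped chain loops: " + hasParentCycle(child));
    }
}
```

Running a check like this over the live continuations (through whatever accessor the real classes offer) would confirm or rule out the looping-tree theory without a debugger attached.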

--
Peter Hunsberger


Re: Cocoon 2.1.7 hang

2006-01-19 Thread Ralph Goers
No. I might suggest they test it in our development environment after 
they get the system stabilized.


Jean-Baptiste Quenot wrote:


* Ralph Goers:

 


That update included some new calculators written
in flowscript. This version of Cocoon is using
rhino1.5r4-continuations-20040629T1232.jar.
   



Ralph,

Have you tried replacing this jar with Cocoon trunk's Rhino jar?
We are successfully running it inside Cocoon 2.1.
 



Re: Cocoon 2.1.7 hang

2006-01-18 Thread Ralph Goers
I thought I'd update you all on the problems we have been having with
our production deployment.  We finally rolled out an update that included
proper pool sizes for the components and the fix to the Castor mapping
file.  However, before we did that our system engineer provided some
valuable insight.  He provided a graph that shows one CPU going into a
hard loop.  About 7 minutes later the system became completely congested
and ran out of threads.  The pool and mapping file changes helped in
that when the first failure after the changes occurred it took about 30
minutes for the system to run out of threads.


However, that information caused me to go back and look at my stack
traces.  It turns out that every single one (with the exception noted
below) showed one thread doing the same thing.  Now, we first deployed
this product in March of 2005 and experienced no failures until a
product update was released in August.  That update included some new
calculators written in flowscript.  This version of Cocoon is using
rhino1.5r4-continuations-20040629T1232.jar. The stack traces indicate
that these are going into a loop and causing the system to die.  At
first we thought that the calculators were not doing proper input
validation and that this was causing weird things to happen. The stack
traces kind of supported this in that they look like:


http-8080-Processor18 daemon prio=1 tid=0x30e38df8 nid=0x51a8 runnable [2d351000..2d35387c]
   at java.lang.Class.isPrimitive(Native Method)
   at org.mozilla.javascript.NativeJavaObject.getConversionWeight(NativeJavaObject.java:324)
   at org.mozilla.javascript.NativeJavaObject.canConvert(NativeJavaObject.java:259)
   at org.mozilla.javascript.NativeJavaMethod.findFunction(NativeJavaMethod.java:356)
   at org.mozilla.javascript.NativeJavaMethod.call(NativeJavaMethod.java:193)
   at org.mozilla.javascript.ScriptRuntime.call(ScriptRuntime.java:1244)
   at org.mozilla.javascript.continuations.ContinuationInterpreter.interpret(ContinuationInterpreter.java:1134)
   at org.mozilla.javascript.continuations.ContinuationInterpreter.interpret(ContinuationInterpreter.java:190)
   at org.mozilla.javascript.continuations.ContinuationInterpreter.interpret(ContinuationInterpreter.java:138)
   at org.mozilla.javascript.continuations.InterpretedFunctionImpl.call(InterpretedFunctionImpl.java:121)
   at org.mozilla.javascript.ScriptRuntime.call(ScriptRuntime.java:1244)
   at org.mozilla.javascript.ScriptableObject.callMethod(ScriptableObject.java:1591)
   at org.apache.cocoon.components.flow.javascript.fom.FOM_JavaScriptInterpreter.handleContinuation(FOM_JavaScriptInterpreter.java:812)
   - locked 0x66005778 (a org.apache.cocoon.components.flow.javascript.fom.FOM_JavaScriptInterpreter$ThreadScope)
   at org.apache.cocoon.components.treeprocessor.sitemap.CallFunctionNode.invoke(CallFunctionNode.java:123)


However, we have been able to recreate the loop without entering bad 
data.  In addition, we got a trace that was close to the start of the 
loop and it is somewhat different.  It seems to imply that there is 
something wrong with Continuation handling, but I have no idea. Two 
traces taken a few minutes later both looked like the one above.


   at org.mozilla.javascript.Interpreter.doubleWrap(Interpreter.java:2491)
   at org.mozilla.javascript.continuations.ContinuationInterpreter.interpret(ContinuationInterpreter.java:657)
   at org.mozilla.javascript.continuations.ContinuationInterpreter.interpret(ContinuationInterpreter.java:190)
   at org.mozilla.javascript.continuations.ContinuationInterpreter.interpret(ContinuationInterpreter.java:138)
   at org.mozilla.javascript.continuations.InterpretedFunctionImpl.call(InterpretedFunctionImpl.java:121)
   at org.mozilla.javascript.ScriptRuntime.call(ScriptRuntime.java:1244)
   at org.mozilla.javascript.ScriptableObject.callMethod(ScriptableObject.java:1591)
   at org.apache.cocoon.components.flow.javascript.fom.FOM_JavaScriptInterpreter.handleContinuation(FOM_JavaScriptInterpreter.java:812)
   - locked 0x66005880 (a org.apache.cocoon.components.flow.javascript.fom.FOM_JavaScriptInterpreter$ThreadScope)
   at org.apache.cocoon.components.treeprocessor.sitemap.CallFunctionNode.invoke(CallFunctionNode.java:123)


We have a way to get around this problem by replacing the flowscript 
calculators with CGIs for the time being. However, we will want to do 
something about this problem in the future.  One difficulty in debugging 
this problem though is that we have no idea which calculators are 
running or where they are at the time of the failure because interpreted 
javascript doesn't show up in the stack trace.  As a consequence I will 
probably recommend that they be rewritten as JSR-168 portlets instead of 
using flow - unless someone has a better idea.


Thoughts and comments are welcome.

Ralph






Re: Cocoon 2.1.7 hang

2006-01-18 Thread Peter Hunsberger
On 1/18/06, Ralph Goers [EMAIL PROTECTED] wrote:

 However, we have been able to recreate the loop without entering bad
 data.

How ?  Just random pounding on the calculator?


 Thoughts and comments are welcome.

Looks to me like both times you've caught the process of a
continuation being trapped and a flow script being executed as a
result.  Slightly different exit out of the continuations handler
however: ContinuationInterpreter.interpret(ContinuationInterpreter.java:657)
may provide the clue you need?  Break points on one of the earlier
ContinuationInterpreter points might also help if you can reproduce
with a debugger attached?

--
Peter Hunsberger


Re: Cocoon 2.1.7 hang

2006-01-18 Thread Ralph Goers

Peter Hunsberger wrote:


On 1/18/06, Ralph Goers [EMAIL PROTECTED] wrote:

 


However, we have been able to recreate the loop without entering bad
data.
   



How ?  Just random pounding on the calculator?
 


Yup. With normal input.

 


Thoughts and comments are welcome.
   



Looks to me like both times you've caught the process of a
continuation being trapped and a flow script being executed as a
result.  Slightly different exit out of the continuations handler
however: ContinuationInterpreter.interpret(ContinuationInterpreter.java:657)
may provide the clue you need?  Break points on one of the earlier
ContinuationInterpreter points might also help if you can reproduce
with a debugger attached?
 

I looked at what I believe is the right version of 
ContinuationInterpreter 
(http://svn.cocoondev.org/repos/rhino+cont/branches/BEFORE_PACKAGENAME_CHANGE/rhino1_5R4pre/src/org/mozilla/javascript/continuations/ContinuationInterpreter.java) 
and found that it has a while(true) loop and that both line 657 and line
1134 are within it.  The loop has a really large switch statement (line
657 is TokenStream.SETNAME and 1134 is NON_TAIL_CALL). Unfortunately, I
have no idea what it is trying to do, but apparently it never breaks out
of the loop.


Ralph


Re: Cocoon 2.1.7 hang

2006-01-02 Thread Jean-Baptiste Quenot
* Ralph Goers:
 OK. I ran some basic tests on one of my machines.  Just for basic info it is 
 a P4 2.5 GHz with 1 GB of memory 
 running RHEL 3.
 The only thing I did was set up JMeter to login to the portal as user cocoon. 
  In all the tests the computer was 
 maxed at 100% cpu.
 
 Before the change:
 5 threads login repeated 10 times:  Avg 3.4 seconds, Max 27 seconds.
 10 threads login repeated 5 times:  Avg 6.760 seconds, Max 22 seconds
 
 After the change:
 5 threads login repeated 10 times: Avg 1.3 seconds, Max 2.6 seconds
 5 threads login repeated 20 times: Avg 1.2 seconds, Max 2.5 seconds
 10 threads login repeated 5 times: Avg 2.4 seconds, Max 10 seconds.
 10 threads login repeated 10 times: Avg 2.1 seconds, Max 13 seconds.
 
 The change has been checked into 2.1.  I'll test it on 2.2 and check it in 
 also.

Hello Ralph,

Happy new year!

Your change seems very interesting, thank you very much.  However,
I have a more radical solution to this portal login problem, see:

Speedup portal loading
http://issues.apache.org/jira/browse/COCOON-1709

Also requires:

Allow CopletInstanceDataManager to be cloneable
http://issues.apache.org/jira/browse/COCOON-1708
-- 
Jean-Baptiste Quenot
Systèmes d'Information
ANYWARE TECHNOLOGIES
Tel : +33 (0)5 61 00 52 90
Fax : +33 (0)5 61 00 51 46
http://www.anyware-tech.com/


Re: Cocoon 2.1.7 hang

2006-01-02 Thread Ralph Goers
I replied to that days ago in the issue (1709 I believe).  In short,
this is a good idea for sites (like mine) that only use anonymous
users.  However, the idea of permanently caching millions of user
profiles in memory is very scary and will be considered a memory
leak by many people.  So, I'm -1 on just checking that patch in as is.
However, if there were a way to enable it for only anonymous users I'd be
all for that.


Ralph

Jean-Baptiste Quenot wrote:


* Ralph Goers:
 

OK. I ran some basic tests on one of my machines.  Just for basic info it is a P4 2.5 GHz with 1 GB of memory 
running RHEL 3.
The only thing I did was set up JMeter to login to the portal as user cocoon.  In all the tests the computer was 
maxed at 100% cpu.


Before the change:
5 threads login repeated 10 times:  Avg 3.4 seconds, Max 27 seconds.
10 threads login repeated 5 times:  Avg 6.760 seconds, Max 22 seconds

After the change:
5 threads login repeated 10 times: Avg 1.3 seconds, Max 2.6 seconds
5 threads login repeated 20 times: Avg 1.2 seconds, Max 2.5 seconds
10 threads login repeated 5 times: Avg 2.4 seconds, Max 10 seconds.
10 threads login repeated 10 times: Avg 2.1 seconds, Max 13 seconds.

The change has been checked into 2.1.  I'll test it on 2.2 and check it in also.
   



Hello Ralph,

Happy new year!

Your change seems very interesting, thank you very much.  However,
I have a more radical solution to this portal login problem, see:

Speedup portal loading
http://issues.apache.org/jira/browse/COCOON-1709

Also requires:

Allow CopletInstanceDataManager to be cloneable
http://issues.apache.org/jira/browse/COCOON-1708
 



Re: Cocoon 2.1.7 hang

2006-01-02 Thread Jean-Baptiste Quenot
* Ralph Goers:

 I replied to that days ago in the issue (1709 I believe).

Sorry, strangely I didn't notice your comment on JIRA.  I will
follow up on your comment.
-- 
Jean-Baptiste Quenot
Systèmes d'Information
ANYWARE TECHNOLOGIES
Tel : +33 (0)5 61 00 52 90
Fax : +33 (0)5 61 00 51 46
http://www.anyware-tech.com/


Re: Cocoon 2.1.7 hang

2006-01-02 Thread Carsten Ziegeler
Ralph Goers wrote:

OK. I ran some basic tests on one of my machines.  Just for basic info it is 
a P4 2.5 GHz with 1 GB of memory 
running RHEL 3.
The only thing I did was set up JMeter to login to the portal as user 
cocoon.  In all the tests the computer was 
maxed at 100% cpu.

Before the change:
5 threads login repeated 10 times:  Avg 3.4 seconds, Max 27 seconds.
10 threads login repeated 5 times:  Avg 6.760 seconds, Max 22 seconds

After the change:
5 threads login repeated 10 times: Avg 1.3 seconds, Max 2.6 seconds
5 threads login repeated 20 times: Avg 1.2 seconds, Max 2.5 seconds
10 threads login repeated 5 times: Avg 2.4 seconds, Max 10 seconds.
10 threads login repeated 10 times: Avg 2.1 seconds, Max 13 seconds.

The change has been checked into 2.1.  I'll test it on 2.2 and check it in 
also.
   
Did you use the source from 2.1.8 in both tests? I'm just curious whether
the changes I made to the CastorConverter from 2.1.7 to 2.1.8 improve
performance as well.

Carsten

-- 
Carsten Ziegeler - Open Source Group, SN AG
http://www.s-und-n.de
http://www.osoco.org/weblogs/rael/


Re: Cocoon 2.1.7 hang

2006-01-02 Thread Ralph Goers
I used the latest source for both 2.1 and trunk. 


Carsten Ziegeler wrote:


Ralph Goers wrote:
 

OK. I ran some basic tests on one of my machines.  Just for basic info it is a P4 2.5 GHz with 1 GB of memory 
running RHEL 3.
The only thing I did was set up JMeter to login to the portal as user cocoon.  In all the tests the computer was 
maxed at 100% cpu.


Before the change:
5 threads login repeated 10 times:  Avg 3.4 seconds, Max 27 seconds.
10 threads login repeated 5 times:  Avg 6.760 seconds, Max 22 seconds

After the change:
5 threads login repeated 10 times: Avg 1.3 seconds, Max 2.6 seconds
5 threads login repeated 20 times: Avg 1.2 seconds, Max 2.5 seconds
10 threads login repeated 5 times: Avg 2.4 seconds, Max 10 seconds.
10 threads login repeated 10 times: Avg 2.1 seconds, Max 13 seconds.

The change has been checked into 2.1.  I'll test it on 2.2 and check it in also.
 
   


Did you use the source from 2.1.8 in both tests? I'm just curious whether
the changes I made to the CastorConverter from 2.1.7 to 2.1.8 improve
performance as well.

Carsten

 



Re: Cocoon 2.1.7 hang

2005-12-31 Thread Ralph Goers
OK. I ran some basic tests on one of my machines.  Just for basic info 
it is a P4 2.5 GHz with 1 GB of memory running RHEL 3.
The only thing I did was set up JMeter to login to the portal as user 
cocoon.  In all the tests the computer was maxed at 100% cpu.


Before the change:
5 threads login repeated 10 times:  Avg 3.4 seconds, Max 27 seconds.
10 threads login repeated 5 times:  Avg 6.760 seconds, Max 22 seconds

After the change:
5 threads login repeated 10 times: Avg 1.3 seconds, Max 2.6 seconds
5 threads login repeated 20 times: Avg 1.2 seconds, Max 2.5 seconds
10 threads login repeated 5 times: Avg 2.4 seconds, Max 10 seconds.
10 threads login repeated 10 times: Avg 2.1 seconds, Max 13 seconds.

The change has been checked into 2.1.  I'll test it on 2.2 and check it 
in also.


Ralph

Carsten Ziegeler wrote:


Castor seems to have a lot of useful little hacks - I just found out,
that we can prevent castor from checking for default constructors which
I really needed for 2.2 - it's there, you only have to find out how to
configure it :). I'm curious what the matches configuration looks like :)

Carsten

Ralph Goers wrote:
 

OK. I figured out how to use the matches attribute and was able to 
verify that it doesn't throw ClassNotFoundExceptions all the time.  I'll 
do a little load testing next to see what kind of difference it makes on 
throughput.



   

 



Re: Cocoon 2.1.7 hang

2005-12-30 Thread Ralph Goers
We took some thread dumps of our product when it was running normally.  
It was interesting in that we still saw in almost every stack trace the 
portal calling castor which was in the class loader throwing a 
ClassNotFoundException. I then stepped through the sample site and have 
discovered that when

bind-xml auto-naming=deriveByClass
is used, Castor starts making up names and trying to load them.  For 
example, when a named item is specified inside a composite layout castor 
takes

org.apache.cocoon.portal.layout.impl.CompositeLayoutImpl
strips off CompositeLayoutImpl and replaces it with NamedItem.  It then 
tries to load that class and gets a ClassNotFoundException because 
NamedItem isn't in the same package. It eventually uses the correct 
class name.  This makes login extremely slow as every item is throwing 
an exception.  And, I expect, when the resource pool is exceeded the 
class loader is completely overstressed and the system comes to a 
grinding halt.  It doesn't actually stop, but from then on it moves so 
slowly that it might as well be dead.
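
To make the failure mode concrete, here is a minimal sketch of the kind of guess described above. It is an illustration only and is not taken from Castor's source: `guessClassName` and `isLoadable` are hypothetical helpers that mimic "keep the package of the containing class, swap in the element name, then try to load it".

```java
final class DerivedNameDemo {

    // Illustration only: builds a candidate name from the parent's package
    // plus the element name, as described in the message above.
    static String guessClassName(String parentClassName, String elementName) {
        int lastDot = parentClassName.lastIndexOf('.');
        String packagePrefix =
                lastDot < 0 ? "" : parentClassName.substring(0, lastDot + 1);
        return packagePrefix + elementName;
    }

    // True if the candidate can be loaded. A miss still costs a full search
    // through the class loader chain before the exception is thrown.
    static boolean isLoadable(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String guess = guessClassName(
                "org.apache.cocoon.portal.layout.impl.CompositeLayoutImpl",
                "NamedItem");
        System.out.println(guess + " loadable: " + isLoadable(guess));
    }
}
```

Every wrong guess ends in a ClassNotFoundException, so a profile with many items pays for one failed class loader search per item on every load.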


The code in Castor suggests using the matches attribute to bypass this.  
Unfortunately, there are no examples to be found on how matches could 
solve this problem.


So the bottom line is, unless all your classes are in the same package 
do not use deriveByClass.  Or don't use Castor.


Ralph


Re: Cocoon 2.1.7 hang

2005-12-30 Thread Ralph Goers
OK. I figured out how to use the matches attribute and was able to 
verify that it doesn't throw ClassNotFoundExceptions all the time.  I'll 
do a little load testing next to see what kind of difference it makes on 
throughput.


Ralph Goers wrote:

We took some thread dumps of our product when it was running 
normally.  It was interesting in that we still saw in almost every 
stack trace the portal calling castor which was in the class loader 
throwing a ClassNotFoundException. I then stepped through the sample 
site and have discovered that when

bind-xml auto-naming=deriveByClass
is used, Castor starts making up names and trying to load them.  For 
example, when a named item is specified inside a composite layout 
castor takes

org.apache.cocoon.portal.layout.impl.CompositeLayoutImpl
strips off CompositeLayoutImpl and replaces it with NamedItem.  It 
then tries to load that class and gets a ClassNotFoundException 
because NamedItem isn't in the same package. It eventually uses the 
correct class name.  This makes login extremely slow as every item is 
throwing an exception.  And, I expect, when the resource pool is 
exceeded the class loader is completely overstressed and the system 
comes to a grinding halt.  It doesn't actually stop, but from then on 
it moves so slowly that it might as well be dead.


The code in Castor suggests using the matches attribute to bypass 
this.  Unfortunately, there are no examples to be found on how matches 
could solve this problem.


So the bottom line is, unless all your classes are in the same package 
do not use deriveByClass.  Or don't use Castor.


Ralph




Re: Cocoon 2.1.7 hang

2005-12-30 Thread Carsten Ziegeler
Castor seems to have a lot of useful little hacks - I just found out,
that we can prevent castor from checking for default constructors which
I really needed for 2.2 - it's there, you only have to find out how to
configure it :). I'm curious what the matches configuration looks like :)

Carsten

Ralph Goers wrote:
 OK. I figured out how to use the matches attribute and was able to 
 verify that it doesn't throw ClassNotFoundExceptions all the time.  I'll 
 do a little load testing next to see what kind of difference it makes on 
 throughput.
 
 Ralph Goers wrote:
 
 
We took some thread dumps of our product when it was running 
normally.  It was interesting in that we still saw in almost every 
stack trace the portal calling castor which was in the class loader 
throwing a ClassNotFoundException. I then stepped through the sample 
site and have discovered that when
bind-xml auto-naming=deriveByClass
is used, Castor starts making up names and trying to load them.  For 
example, when a named item is specified inside a composite layout 
castor takes
org.apache.cocoon.portal.layout.impl.CompositeLayoutImpl
strips off CompositeLayoutImpl and replaces it with NamedItem.  It 
then tries to load that class and gets a ClassNotFoundException 
because NamedItem isn't in the same package. It eventually uses the 
correct class name.  This makes login extremely slow as every item is 
throwing an exception.  And, I expect, when the resource pool is 
exceeded the class loader is completely overstressed and the system 
comes to a grinding halt.  It doesn't actually stop, but from then on 
it moves so slowly that it might as well be dead.

The code in Castor suggests using the matches attribute to bypass 
this.  Unfortunately, there are no examples to be found on how matches 
could solve this problem.

So the bottom line is, unless all your classes are in the same package 
do not use deriveByClass.  Or don't use Castor.

Ralph
 
 
 


-- 
Carsten Ziegeler - Open Source Group, SN AG
http://www.s-und-n.de
http://www.osoco.org/weblogs/rael/


Re: Cocoon 2.1.7 hang

2005-12-30 Thread Vadim Gritsenko

Ralph Goers wrote:
...

when
bind-xml auto-naming=deriveByClass
is used, Castor starts making up names and trying to load them.

...
I expect, when the resource pool is exceeded the 
class loader is completely overstressed and the system comes to a 
grinding halt.  It doesn't actually stop, but from then on it moves so 
slowly that it might as well be dead.


Hm, just to clarify, you suggest that if ClassLoader is repeatedly asked to load 
a lot of non-existing classes, it eventually slows down a lot?


Hm, wouldn't it be a bug on ClassLoader part...

Vadim


Re: Cocoon 2.1.7 hang

2005-12-30 Thread Ralph Goers



Vadim Gritsenko wrote:


Ralph Goers wrote:
...


when
bind-xml auto-naming=deriveByClass
is used, Castor starts making up names and trying to load them.


...

I expect, when the resource pool is exceeded the class loader is 
completely overstressed and the system comes to a grinding halt.  It 
doesn't actually stop, but from then on it moves so slowly that it 
might as well be dead.



Hm, just to clarify, you suggest that if ClassLoader is repeatedly 
asked to load a lot of non-existing classes, it eventually slows down 
a lot?


Hm, wouldn't it be a bug on ClassLoader part...


Class loading involves searching parent class loaders. If the class is
found, most class loaders put it in a map so they don't have to get it
again.  Thus loading the same class over and over again should be pretty
cheap.  But they don't keep track of which classes they didn't find.  So
constantly looking for nonexistent classes is going to be very
expensive.  Since it is done in a synchronized method it is going to
become a huge system bottleneck.
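
The asymmetry between hits and misses can be shown with a small wrapper that does remember misses. This is a sketch under assumptions, not a proposed fix: `MissCachingLoader` is a hypothetical class, it uses modern Java collections rather than anything available on JDK 1.4, and caching misses is only safe when classes cannot appear later (which is not true for every container).

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch only: remembers failed lookups so repeated misses skip the parent search.
final class MissCachingLoader extends ClassLoader {
    private final Set<String> knownMisses = ConcurrentHashMap.newKeySet();
    final AtomicInteger fullLookups = new AtomicInteger(); // counts real searches

    MissCachingLoader(ClassLoader parent) {
        super(parent);
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve)
            throws ClassNotFoundException {
        if (knownMisses.contains(name)) {
            // Fail fast without entering the parent chain or its locks.
            throw new ClassNotFoundException(name);
        }
        fullLookups.incrementAndGet();
        try {
            return super.loadClass(name, resolve);
        } catch (ClassNotFoundException e) {
            knownMisses.add(name); // remember the miss for next time
            throw e;
        }
    }

    // Convenience wrapper: true if the class could be loaded.
    boolean canLoad(String name) {
        try {
            loadClass(name);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        MissCachingLoader loader =
                new MissCachingLoader(ClassLoader.getSystemClassLoader());
        for (int i = 0; i < 1000; i++) {
            loader.canLoad("org.example.DoesNotExist"); // hypothetical missing class
        }
        System.out.println("full lookups: " + loader.fullLookups.get());
    }
}
```

Without the `knownMisses` set, each of those repeated requests would walk the whole parent chain again, which is the cost described above.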




Vadim




Re: Cocoon 2.1.7 hang

2005-12-30 Thread Ralph Goers
I'll check them in as soon as I get some testing done.  Shouldn't be too 
long.


Carsten Ziegeler wrote:


Castor seems to have a lot of useful little hacks - I just found out,
that we can prevent castor from checking for default constructors which
I really needed for 2.2 - it's there, you only have to find out how to
configure it :). I'm curious what the matches configuration looks like :)

Carsten

 



Re: Cocoon 2.1.7 hang

2005-12-29 Thread Sylvain Wallez

Ralph Goers wrote:
We tried to deploy an update to our product. Pretty much the only 
thing we did to Cocoon was to replace Xalan with XSLTC which produces 
a dramatic performance improvement.  However, Cocoon is now 
consistently hanging.  I tried to attach the thread dumps but they are 
too big. They also don't make any sense to me. I've reduced them down 
and pasted them below.  It shows many calls to XMLFileModule waiting 
for a lock. The thread that has the lock is waiting for 
ResourceLimitingPool to get a lock.


Don't know if there's a problem in ResourceLimitingPool, but looking at 
your stacktrace and XMLFileModule, I suspect a potential deadlock when 
one of the documents searched by XMLFileModule is cocoon: and that 
module is used when doing some parallel CInclude.


The synchronized block in DocumentHelper.getDocument() can be way 
smaller, and avoid the lock by _not_ resolving URIs in a synch'ed block.
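
As a rough sketch of that idea (not the actual DocumentHelper code): hold the lock only for the cache lookup and the cache update, and do the slow resolution outside it. `NarrowLockCache` and its `resolver` function are hypothetical names, and documents are represented as plain strings to keep the example self-contained.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Sketch only: the lock guards the map, never the (slow) resolution step.
final class NarrowLockCache {
    private final Map<String, String> cache = new HashMap<>();
    private final Function<String, String> resolver; // stands in for URI resolution and parsing

    NarrowLockCache(Function<String, String> resolver) {
        this.resolver = resolver;
    }

    String getDocument(String uri) {
        synchronized (cache) {
            String cached = cache.get(uri);
            if (cached != null) {
                return cached;
            }
        }
        // Resolved with no lock held, so other threads are not blocked while
        // this one waits on a pipeline, a pool, or the network.
        String loaded = resolver.apply(uri);
        synchronized (cache) {
            String existing = cache.get(uri); // another thread may have finished first
            if (existing != null) {
                return existing;
            }
            cache.put(uri, loaded);
            return loaded;
        }
    }

    public static void main(String[] args) {
        NarrowLockCache docs = new NarrowLockCache(uri -> "<doc src='" + uri + "'/>");
        System.out.println(docs.getDocument("cocoon:/example")); // hypothetical URI
    }
}
```

The trade-off is that two threads may occasionally resolve the same URI at the same time and one result is discarded, which is usually cheaper than holding a lock across a call that may itself need pooled components.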


Sylvain

--
Sylvain Wallez                        Anyware Technologies
http://bluxte.net                     http://www.anyware-tech.com
Apache Software Foundation Member     Research & Technology Director



Re: Cocoon 2.1.7 hang

2005-12-28 Thread Vadim Gritsenko

On Dec 24, 2005, at 12:16 PM, Ralph Goers wrote:

Has anyone had problems with ehcache?  I suspect that our problems  
are being caused by problems with data being returned from the  
cache when the cache starts writing to disk.


Do you really need a persistent store? If not, replace the store with the
default implementation, which should be much more efficient.


Vadim



Re: Cocoon 2.1.7 hang

2005-12-24 Thread Ralph Goers
Has anyone had problems with ehcache?  I suspect that our problems are 
being caused by problems with data being returned from the cache when 
the cache starts writing to disk.


Ralph


 



Re: Cocoon 2.1.7 hang

2005-12-24 Thread Ralph Goers

Does anyone have any experience with these binding frameworks?

https://bindmark.dev.java.net/

It sure looks like Castor is a poor choice for what the portal is 
doing.  Notably missing from the list is JaxMe.


Ralph

Carsten Ziegeler wrote:


The latest version of Castor has bugs in the reference handling (as far
as I remember). As we don't use the castor references in 2.1.x (but in
2.2), this shouldn't affect you, so a newer version should work.

HTH
Carsten

 



Re: Cocoon 2.1.7 hang

2005-12-23 Thread Ralph Goers
We are running Sun JDK 1.4.2_05 on RHEL 3. Tomcat is 5.0 something
(I'm out of town at the moment so I don't have access to the info).


Frankly, I'm suspecting that ehcache is returning a bad document to 
Castor although I don't have any proof.  But if that is the case I 
really would have expected Castor to get or throw an exception, not just 
call the classloader over and over.


The lock at 0x60b19148 was held by the last thread which was

http-8080-Processor26 daemon prio=1 tid=0x0821b148 nid=0x1e8b waiting for monitor entry [2cafd000..2caff87c]

  at java.lang.String.replace(String.java:1555)
  at java.net.URLClassLoader$1.run(URLClassLoader.java:190)
  at java.security.AccessController.doPrivileged(Native Method)
  at java.net.URLClassLoader.findClass(URLClassLoader.java:187)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:289)
  - locked 0x60b1d920 (a sun.misc.Launcher$ExtClassLoader)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:282)
  - locked 0x60b19148 (a sun.misc.Launcher$AppClassLoader)
  at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:274)
  - locked 0x60b19148 (a sun.misc.Launcher$AppClassLoader)

Ralph

Pier Fumagalli wrote:


On 22 Dec 2005, at 18:16, Ralph Goers wrote:

We finally got some thread dumps from our production server.   It  
shows something very different than what we were seeing in testing.  
First, this happens under light load after running for days.  To  
summarize, many threads are waiting for the ResourceLimitingPool  and 
several are waiting for the class loader. This system hasn't  had the 
pools tuned so I'm not surprised about pool contention, but  I don't 
believe that is the issue. That is because the thread  holding the 
lock is simply waiting for the class loader.
We took two traces and both were similar, but not identical.  
Different threads were holding the class loader lock in both.   
However, in both cases the threads holding the class loader lock  
were called from Castor while creating the portal layout.


So far, we have been speculating that the problem is due to a  
problem with the NPTL threads on Enterprise Linux 3.  However, I'm  
wondering if perhaps castor is  having problems and simply calling  
the class loader over and over.


I'd appreciate any ideas.



Ok, as far as I can see down the dumps you might have some problems  
with Catalina's classloader implementation locking up at 0x60b19148:


   at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1255)


That seems odd though... I thought that code was debugged pretty
thoroughly, unless a secondary lock at 0x60cd9970 prevents the first
one from being released...


Anyhow, in my experience, NPTL doesn't cause any problems whatsoever
under Linux, but that said, I'm running on Jetty 4 with BEA JRockit
1.4.2. What VM and what container are you actually using?


Pier




Re: Cocoon 2.1.7 hang

2005-12-23 Thread Ralph Goers
Thanks for the info.  Do you think we should rewrite this using Digester 
instead of Castor?


Carsten Ziegeler wrote:



Someone told me some months ago that they experienced several strange
issues with Castor under load - I think he mentioned class loading and
synchronization problems. I still can't remember *who* it was :( So,
chances are that this is really caused somehow by Castor. I improved the
Castor portal implementation for 2.1.8 slightly - the version in
2.1.7 parsed the mapping each time a profile was loaded. I guess that
through this operation the classes are loaded as well. In 2.1.8 the
mapping is only read once on startup.
So my suggestion is to use the CastorSourceConverter from 2.1.8 in your
2.1.7 environment and see what happens.

Carsten

 



Re: Cocoon 2.1.7 hang

2005-12-23 Thread Carsten Ziegeler
Ralph Goers wrote:
 Thanks for the info.  Do you think we should rewrite this using Digester 
 instead of Castor?
 
I never really liked Castor; at the time when we started with the
portal Digester was not usable for us as there were some problems
with complex types like lists and maps (at least as far as I remember).
But if Digester meets your needs, I think it makes sense to give it a try.

Carsten


-- 
Carsten Ziegeler - Open Source Group, SN AG
http://www.s-und-n.de
http://www.osoco.org/weblogs/rael/


Re: Cocoon 2.1.7 hang

2005-12-23 Thread Reinhard Poetz

Ralph Goers wrote:
Thanks for the info.  Do you think we should rewrite this using Digester 
instead of Castor?


Commons Digester is only a one-way tool: XML -> Java.
Castor could be replaced by XMLBeans (if you need to use XML Schema), XStream
(if you only want to serialize and deserialize between XML and Java) or JAXB
(also provides separate mapping files).


You could also try to upgrade Castor to the latest release. If there are 
problems, I'm sure the Castor community will be very helpful.


--
Reinhard Pötz   Independent Consultant, Trainer  (IT)-Coach 


{Software Engineering, Open Source, Web Applications, Apache Cocoon}

   web(log): http://www.poetz.cc



Re: Cocoon 2.1.7 hang

2005-12-23 Thread Ralph Goers
What were the problems you had with the last version of Castor you 
tried? (I believe a new version is now available).


Reinhard pointed out a potential problem with Digester - we have to 
write out the instance data when preferences are updated.


I wish I understood what exactly the problem is.  We have run this under 
load in our QA environment with no problems but it hangs up in 
production with a fairly light load. 


Ralph

Carsten Ziegeler wrote:


Ralph Goers wrote:
 

Thanks for the info.  Do you think we should rewrite this using Digester 
instead of Castor?


   


I never really liked Castor; at the time when we started with the
portal Digester was not usable for us as there were some problems
with complex types like lists and maps (at least as far as I remember).
But if Digester meets your needs, I think it makes sense to give it a try.

Carsten


 



Re: Cocoon 2.1.7 hang

2005-12-23 Thread Reinhard Poetz

Ralph Goers wrote:
What were the problems you had with the last version of Castor you 
tried? (I believe a new version is now available).


Reinhard pointed out a potential problem with Digester - we have to 
write out the instance data when preferences are updated.


Does the data follow an XML schema or some sort of contract? If not and you just 
want to serialize them in a human-readable format, use XStream which is *very* 
simple to use. Of course this leads to backwards incompatibilities ...
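
To make the "write out the instance data, read it back" requirement concrete, here is a minimal round-trip sketch. It does not use XStream (a third-party jar); as a stand-in it uses the JDK's own java.beans.XMLEncoder/XMLDecoder, which produce a different XML format. The class name PreferencesRoundTrip and the plain Map of preferences are assumptions made for illustration, not the portal's real profile classes.

```java
import java.beans.XMLDecoder;
import java.beans.XMLEncoder;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: writes preferences to XML and reads them back. */
final class PreferencesRoundTrip {

    /** Serializes the preferences to human-readable XML (UTF-8 bytes). */
    static byte[] write(Map<String, String> prefs) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // Closing the encoder flushes the XML document to the stream.
        try (XMLEncoder encoder = new XMLEncoder(out)) {
            encoder.writeObject(new HashMap<String, String>(prefs));
        }
        return out.toByteArray();
    }

    /** Reads back a map previously produced by write(). */
    @SuppressWarnings("unchecked")
    static Map<String, String> read(byte[] xml) {
        try (XMLDecoder decoder = new XMLDecoder(new ByteArrayInputStream(xml))) {
            return (Map<String, String>) decoder.readObject();
        }
    }
}
```

A two-way tool of this kind covers both directions with one pair of calls, which is the property Digester alone does not give you.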


--
Reinhard Pötz   Independent Consultant, Trainer  (IT)-Coach 


{Software Engineering, Open Source, Web Applications, Apache Cocoon}

   web(log): http://www.poetz.cc



Re: Cocoon 2.1.7 hang

2005-12-23 Thread Carsten Ziegeler
Ralph Goers wrote:
 What were the problems you had with the last version of Castor you 
 tried? (I believe a new version is now available).
 
 Reinhard pointed out a potential problem with Digester - we have to 
 write out the instance data when preferences are updated.
 
 I wish I understood what exactly the problem is.  We have run this under 
 load in our QA environment with no problems but it hangs up in 
 production with a fairly light load. 
 
The latest version of Castor has bugs in the reference handling (as far
as I remember). As we don't use the castor references in 2.1.x (but in
2.2), this shouldn't affect you, so a newer version should work.

HTH
Carsten

-- 
Carsten Ziegeler - Open Source Group, SN AG
http://www.s-und-n.de
http://www.osoco.org/weblogs/rael/


Re: Cocoon 2.1.7 hang

2005-12-22 Thread Ralph Goers
We finally got some thread dumps from our production server.   It shows 
something very different than what we were seeing in testing. First, 
this happens under light load after running for days.  To summarize, 
many threads are waiting for the ResourceLimitingPool and several are 
waiting for the class loader. This system hasn't had the pools tuned so 
I'm not surprised about pool contention, but I don't believe that is the 
issue. That is because the thread holding the lock is simply waiting for 
the class loader. 

We took two traces and both were similar, but not identical. Different 
threads were holding the class loader lock in both.  However, in both 
cases the threads holding the class loader lock were called from Castor 
while creating the portal layout.
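
For anyone who wants to answer the "which thread holds the lock?" question programmatically rather than by reading dumps, here is a rough sketch. It assumes a Java 6 or later JVM, where java.lang.management.ThreadMXBean is available; it would not run on the 1.4 VM the traces come from. The class name LockOwnerReport and the two demo threads are invented for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

/** Hypothetical helper: asks the JVM who owns the monitor a thread is blocked on. */
final class LockOwnerReport {

    /** Returns the name of the thread owning the lock that 'blocked' waits for, or null. */
    static String ownerOfLockBlocking(Thread blocked) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        ThreadInfo info = mx.getThreadInfo(blocked.getId());
        return info == null ? null : info.getLockOwnerName();
    }

    /** Small demo: "holder" keeps a monitor while "waiter" blocks on it. */
    static String demo() {
        final Object lock = new Object();
        final CountDownLatch held = new CountDownLatch(1);
        final CountDownLatch release = new CountDownLatch(1);
        Thread holder = new Thread(() -> {
            synchronized (lock) {
                held.countDown();
                try {
                    release.await();            // keep the monitor until told to let go
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }, "holder");
        Thread waiter = new Thread(() -> {
            synchronized (lock) { /* only needs to enter the monitor */ }
        }, "waiter");
        try {
            holder.start();
            held.await();                       // holder now owns the monitor
            waiter.start();
            while (waiter.getState() != Thread.State.BLOCKED) {
                Thread.sleep(5);                // wait until waiter is really blocked
            }
            String owner = ownerOfLockBlocking(waiter);
            release.countDown();
            holder.join();
            waiter.join();
            return owner;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

Run against a hung server (for example from a JSP or a JMX client), the same getLockOwnerName() call would name the thread to look at first in the dump.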


So far, we have been speculating that the problem is due to a problem 
with the NPTL threads on Enterprise Linux 3.  However, I'm wondering if 
perhaps castor is  having problems and simply calling the class loader 
over and over.


I'd appreciate any ideas.

We see many threads like this one...

http-8080-Processor155 daemon prio=1 tid=0x083e3378 nid=0x1e8b waiting 
for monitor entry [22dc..22dc187c]
   at 
org.apache.avalon.excalibur.pool.ResourceLimitingPool.get(ResourceLimitingPool.java:262)

   - waiting to lock 0x60cd9970 (a java.lang.Object)
   at 
org.apache.avalon.excalibur.component.PoolableComponentHandler.doGet(PoolableComponentHandler.java:198)
   at 
org.apache.avalon.excalibur.component.ComponentHandler.get(ComponentHandler.java:381)
   at 
org.apache.avalon.excalibur.component.ExcaliburComponentSelector.select(ExcaliburComponentSelector.java:213)
   at 
org.apache.cocoon.components.ExtendedComponentSelector.select(ExtendedComponentSelector.java:260)
   at 
org.apache.cocoon.components.pipeline.AbstractProcessingPipeline.addTransformer(AbstractProcessingPipeline.java:267)
   at 
org.apache.cocoon.components.pipeline.impl.AbstractCachingProcessingPipeline.addTransformer(AbstractCachingProcessingPipeline.java:143)
   at 
org.apache.cocoon.components.treeprocessor.sitemap.TransformNode.invoke(TransformNode.java:59)
   at 
org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:46)
   at 
org.apache.cocoon.components.treeprocessor.sitemap.PreparableMatchNode.invoke(PreparableMatchNode.java:130)
   at 
org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:68)
   at 
org.apache.cocoon.components.treeprocessor.sitemap.PipelineNode.invoke(PipelineNode.java:138)
   at 
org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:68)
   at 
org.apache.cocoon.components.treeprocessor.sitemap.PipelinesNode.invoke(PipelinesNode.java:89)
   at 
org.apache.cocoon.components.treeprocessor.ConcreteTreeProcessor.process(ConcreteTreeProcessor.java:240)
   at 
org.apache.cocoon.components.treeprocessor.ConcreteTreeProcessor.process(ConcreteTreeProcessor.java:180)
   at 
org.apache.cocoon.components.treeprocessor.TreeProcessor.process(TreeProcessor.java:243)

   at org.apache.cocoon.Cocoon.process(Cocoon.java:606)
   at 
org.apache.cocoon.servlet.CocoonServlet.service(CocoonServlet.java:1119)

   at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
   at 
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:237)
   at 
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:157)
   at 
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:214)
   at 
org.apache.catalina.core.StandardValveContext.invokeNext(StandardValveContext.java:104)
   at 
org.apache.catalina.core.StandardPipeline.invoke(StandardPipeline.java:520)
   at 
org.apache.catalina.core.StandardContextValve.invokeInternal(StandardContextValve.java:198)
   at 
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:152)
   at 
org.apache.catalina.core.StandardValveContext.invokeNext(StandardValveContext.java:104)
   at 
org.apache.catalina.core.StandardPipeline.invoke(StandardPipeline.java:520)
   at 
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:137)
   at 
org.apache.catalina.core.StandardValveContext.invokeNext(StandardValveContext.java:104)
   at 
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:118)
   at 
org.apache.catalina.core.StandardValveContext.invokeNext(StandardValveContext.java:102)
   at 
org.apache.catalina.core.StandardPipeline.invoke(StandardPipeline.java:520)
   at 
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
   at 
org.apache.catalina.core.StandardValveContext.invokeNext(StandardValveContext.java:104)
   at 
org.apache.catalina.core.StandardPipeline.invoke(StandardPipeline.java:520)

   at org.apache.catalina.core.ContainerBase.invoke(ContainerBase.java:929)
   at 

Re: Cocoon 2.1.7 hang

2005-12-22 Thread Pier Fumagalli

On 22 Dec 2005, at 18:16, Ralph Goers wrote:

We finally got some thread dumps from our production server.   It  
shows something very different than what we were seeing in testing.  
First, this happens under light load after running for days.  To  
summarize, many threads are waiting for the ResourceLimitingPool  
and several are waiting for the class loader. This system hasn't  
had the pools tuned so I'm not surprised about pool contention, but  
I don't believe that is the issue. That is because the thread  
holding the lock is simply waiting for the class loader.
We took two traces and both were similar, but not identical.  
Different threads were holding the class loader lock in both.   
However, in both cases the threads holding the class loader lock  
were called from Castor while creating the portal layout.


So far, we have been speculating that the problem is due to a  
problem with the NPTL threads on Enterprise Linux 3.  However, I'm  
wondering if perhaps castor is  having problems and simply calling  
the class loader over and over.


I'd appreciate any ideas.


Ok, as far as I can see down the dumps you might have some problems  
with Catalina's classloader implementation locking up at 0x60b19148:


   at org.apache.catalina.loader.WebappClassLoader.loadClass 
(WebappClassLoader.java:1255)


That seems odd though... I thought that code was debugged pretty  
thoroughly, unless a secondary lock at 0x60cd9970 prevents the first  
one from being released...


Anyhow, from my experience, NPTL doesn't cause any problems whatsoever  
under Linux, but that said, I'm running on Jetty 4 with BEA JRockit  
1.4.2. What VM and what container are you actually using?


Pier






Re: Cocoon 2.1.7 hang

2005-12-22 Thread Carsten Ziegeler
Ralph Goers wrote:
 We finally got some thread dumps from our production server.   It shows 
 something very different than what we were seeing in testing. First, 
 this happens under light load after running for days.  To summarize, 
 many threads are waiting for the ResourceLimitingPool and several are 
 waiting for the class loader. This system hasn't had the pools tuned so 
 I'm not surprised about pool contention, but I don't believe that is the 
 issue. That is because the thread holding the lock is simply waiting for 
 the class loader. 
 
 We took two traces and both were similar, but not identical. Different 
 threads were holding the class loader lock in both.  However, in both 
 cases the threads holding the class loader lock were called from Castor 
 while creating the portal layout.
 
 So far, we have been speculating that the problem is due to a problem 
 with the NPTL threads on Enterprise Linux 3.  However, I'm wondering if 
 perhaps castor is  having problems and simply calling the class loader 
 over and over.
 
 I'd appreciate any ideas.
 
Someone told me some months ago that they experienced several strange
issues with castor under load - I think he mentioned class loading and
synchronization problems. I still can't remember *who* it was :( So,
chances are that this is really caused somehow by castor. I improved the
castor portal implementation for 2.1.8 slightly - the version in
2.1.7 parsed the mapping each time a profile is loaded. I guess that
through this operation the classes are loaded as well. In 2.1.8 the
mapping is only read once on startup.
So my suggestion is to use the CastorSourceConverter from 2.1.8 in your
2.1.7 environment and see what happens.
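
The idea behind the 2.1.8 change can be sketched roughly as follows. This is not the actual CastorSourceConverter code and it does not touch the Castor API; MappingCache, the parser function and the file name are all made up for illustration, and it assumes Java 8 or later.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical cache: parses each mapping once and reuses the result afterwards. */
final class MappingCache<M> {
    private final ConcurrentMap<String, M> cache = new ConcurrentHashMap<>();
    private final Function<String, M> parser;   // the expensive parse step

    MappingCache(Function<String, M> parser) {
        this.parser = parser;
    }

    /** Returns the parsed mapping, invoking the parser only on the first request per URI. */
    M get(String mappingUri) {
        return cache.computeIfAbsent(mappingUri, parser);
    }
}

/** Demo: counts how often the parser runs across many profile loads. */
final class MappingCacheDemo {
    static int parseCountAfterLoads(int loads) {
        AtomicInteger parses = new AtomicInteger();
        MappingCache<String> cache = new MappingCache<>(uri -> {
            parses.incrementAndGet();            // stands in for parsing the mapping file
            return "parsed:" + uri;
        });
        for (int i = 0; i < loads; i++) {
            cache.get("layout-mapping.xml");     // one lookup per simulated profile load
        }
        return parses.get();
    }
}
```

With the 2.1.7 behaviour the parser would run once per load; with a cache like this it runs once in total, so any class loading triggered by parsing happens only on the first request.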

Carsten

-- 
Carsten Ziegeler - Open Source Group, SN AG
http://www.s-und-n.de
http://www.osoco.org/weblogs/rael/


Re: Cocoon 2.1.7 hang

2005-12-19 Thread Erron Austin
We are still experiencing this hang. Does anyone have any idea what may be causing these threads to hang? It seems to be hanging at the same spot, but admittedly I'm not sure what I'm looking at.

http-8080-Processor1 daemon prio=1 tid=0x082ca7d8 nid=0x574c runnable [2e6fe000..2e6ff87c]
   at java.net.SocketOutputStream.socketWrite0(Native Method)
   at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
   at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
   at org.apache.coyote.http11.InternalOutputBuffer.realWriteBytes(InternalOutputBuffer.java:714)
   at org.apache.tomcat.util.buf.ByteChunk.flushBuffer(ByteChunk.java:398)
   at org.apache.coyote.http11.InternalOutputBuffer.flush(InternalOutputBuffer.java:304)
   at org.apache.coyote.http11.Http11Processor.action(Http11Processor.java:921)
   at org.apache.coyote.Response.action(Response.java:182)
   at org.apache.coyote.tomcat5.OutputBuffer.doFlush(OutputBuffer.java:326)
   at org.apache.coyote.tomcat5.OutputBuffer.flush(OutputBuffer.java:297)
   at org.apache.coyote.tomcat5.CoyoteOutputStream.flush(CoyoteOutputStream.java:85)
   at org.apache.cocoon.util.BufferedOutputStream.realFlush(BufferedOutputStream.java:128)
   at org.apache.cocoon.environment.AbstractEnvironment.commitResponse(AbstractEnvironment.java:512)
   at org.apache.cocoon.Cocoon.process(Cocoon.java:630)
   at org.apache.cocoon.servlet.CocoonServlet.service(CocoonServlet.java:1119)
   at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
   at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:237)
   at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:157)
   at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:214)
   at org.apache.catalina.core.StandardValveContext.invokeNext(StandardValveContext.java:104)
   at org.apache.catalina.core.StandardPipeline.invoke(StandardPipeline.java:520)
   at org.apache.catalina.core.StandardContextValve.invokeInternal(StandardContextValve.java:198)
   at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:152)
   at org.apache.catalina.core.StandardValveContext.invokeNext(StandardValveContext.java:104)
   at org.apache.catalina.core.StandardPipeline.invoke(StandardPipeline.java:520)
   at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:137)
   at org.apache.catalina.core.StandardValveContext.invokeNext(StandardValveContext.java:104)
   at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:118)
   at org.apache.catalina.core.StandardValveContext.invokeNext(StandardValveContext.java:102)
   at org.apache.catalina.core.StandardPipeline.invoke(StandardPipeline.java:520)
   at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
   at org.apache.catalina.core.StandardValveContext.invokeNext(StandardValveContext.java:104)
   at org.apache.catalina.core.StandardPipeline.invoke(StandardPipeline.java:520)
   at org.apache.catalina.core.ContainerBase.invoke(ContainerBase.java:929)
   at org.apache.coyote.tomcat5.CoyoteAdapter.service(CoyoteAdapter.java:160)
   at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:799)
   at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.processConnection(Http11Protocol.java:705)
   at org.apache.tomcat.util.net.TcpWorkerThread.runIt(PoolTcpEndpoint.java:577)
   at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:683)
   at java.lang.Thread.run(Thread.java:534)
...

http-8080-Processor1 daemon prio=1 tid=0x082ca7d8 nid=0x574c runnable [2e6fe000..2e6ff87c]
   at java.net.SocketOutputStream.socketWrite0(Native Method)
   at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
   at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
   at org.apache.coyote.http11.InternalOutputBuffer.realWriteBytes(InternalOutputBuffer.java:714)
   at org.apache.tomcat.util.buf.ByteChunk.flushBuffer(ByteChunk.java:398)
   at org.apache.coyote.http11.InternalOutputBuffer.flush(InternalOutputBuffer.java:304)
   at org.apache.coyote.http11.Http11Processor.action(Http11Processor.java:921)
   at org.apache.coyote.Response.action(Response.java:182)
   at org.apache.coyote.tomcat5.OutputBuffer.doFlush(OutputBuffer.java:326)
   at org.apache.coyote.tomcat5.OutputBuffer.flush(OutputBuffer.java:297)
   at org.apache.coyote.tomcat5.CoyoteOutputStream.flush(CoyoteOutputStream.java:85)
   at org.apache.cocoon.util.BufferedOutputStream.realFlush(BufferedOutputStream.java:128)
   at org.apache.cocoon.environment.AbstractEnvironment.commitResponse(AbstractEnvironment.java:512)
   at org.apache.cocoon.Cocoon.process(Cocoon.java:630)
   at org.apache.cocoon.servlet.CocoonServlet.service(CocoonServlet.java:1119)
   at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
   at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:237)
   at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:157)
   at 

Re: Cocoon 2.1.7 hang

2005-12-15 Thread Antonio Gallardo

Java version?

BTW, the src are here: 
http://apache.secsup.org/dist/avalon/excalibur-pool/source/


Best Regards,

Antonio Gallardo.

Ralph Goers wrote:

We tried to deploy an update to our product. Pretty much the only 
thing we did to Cocoon was to replace Xalan with XSLTC which produces 
a dramatic performance improvement.  However, Cocoon is now 
consistently hanging.  I tried to attach the thread dumps but they are 
too big. They also don't make any sense to me. I've reduced them down 
and pasted them below.  It shows many calls to XMLFileModule waiting 
for a lock. The thread that has the lock is waiting for 
ResourceLimitingPool to get a lock. However, the thread dump doesn't 
show any threads holding that lock.  Does anyone have any ideas on 
this?  This is cocoon 2.1.7-dev svn revision 122686.
Cocoon-2.1.7 is using excalibur-1.2.  I can't find the source at 
excalibur.  Anyone know where I can get it?
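
The pattern described above (one thread holds a monitor while it waits on a bounded pool, and everyone else queues on the monitor) can be reproduced in miniature. This is only an illustrative sketch: NestedLockPileUp is an invented name, a plain Object stands in for the synchronized DocumentHelper, and a java.util.concurrent.Semaphore stands in for ResourceLimitingPool, whose real implementation is not shown here. It assumes Java 8 or later.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Semaphore;

/** Hypothetical demo of threads piling up behind a monitor held during a pool wait. */
final class NestedLockPileUp {

    /** Returns how many callers are BLOCKED on the monitor while one waits on the pool. */
    static int blockedBehindHolder(int callers) {
        final Object helper = new Object();        // stands in for the synchronized helper
        final Semaphore pool = new Semaphore(0);   // stands in for an exhausted bounded pool
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < callers; i++) {
            Thread t = new Thread(() -> {
                synchronized (helper) {
                    pool.acquireUninterruptibly(); // waits for the pool while holding the monitor
                    pool.release();                // hand the pooled instance back
                }
            }, "caller-" + i);
            t.start();
            threads.add(t);
        }
        try {
            int blocked;
            int parked;
            do {                                   // poll until the pile-up has fully formed
                Thread.sleep(5);
                blocked = 0;
                parked = 0;
                for (Thread t : threads) {
                    Thread.State s = t.getState();
                    if (s == Thread.State.BLOCKED) {
                        blocked++;
                    } else if (s == Thread.State.WAITING) {
                        parked++;
                    }
                }
            } while (parked != 1 || blocked != callers - 1);
            pool.release();                        // free one instance so every caller can drain
            for (Thread t : threads) {
                t.join();
            }
            return blocked;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

In a dump of this demo, the callers would show "waiting to lock" on the helper, while the single holder is parked on the pool with no other thread shown as owning anything, which resembles the picture in the traces below.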


Lots of these waiting

http-8080-Processor25 daemon prio=1 tid=0x2dffa660 nid=0x19eb 
waiting for monitor entry [2d3eb000..2d3ed87c]
at 
org.apache.cocoon.components.modules.input.XMLFileModule$DocumentHelper.getDocument(XMLFileModule.java:155) 

- waiting to lock 0x5f1824b8 (a 
org.apache.cocoon.components.modules.input.XMLFileModule$DocumentHelper)
at 
org.apache.cocoon.components.modules.input.XMLFileModule.getContextObject(XMLFileModule.java:357) 

at 
org.apache.cocoon.components.modules.input.XMLFileModule.getAttribute(XMLFileModule.java:380) 

at 
org.apache.cocoon.components.modules.input.XMLFileModule.getAttribute(XMLFileModule.java:368) 

at 
org.apache.cocoon.components.treeprocessor.variables.PreparedVariableResolver.processModule(PreparedVariableResolver.java:256) 

at 
org.apache.cocoon.components.treeprocessor.variables.PreparedVariableResolver.resolve(PreparedVariableResolver.java:207) 

at 
org.apache.cocoon.components.treeprocessor.sitemap.TransformNode.invoke(TransformNode.java:59) 

at 
org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:68) 

at 
org.apache.cocoon.components.treeprocessor.sitemap.SelectNode.invoke(SelectNode.java:97) 

at 
org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:46) 

at 
org.apache.cocoon.components.treeprocessor.sitemap.PreparableMatchNode.invoke(PreparableMatchNode.java:130) 

at 
org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:68) 

at 
org.apache.cocoon.components.treeprocessor.sitemap.PipelineNode.invoke(PipelineNode.java:138) 

at 
org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:68) 

at 
org.apache.cocoon.components.treeprocessor.sitemap.PipelinesNode.invoke(PipelinesNode.java:89) 

at 
org.apache.cocoon.components.treeprocessor.ConcreteTreeProcessor.process(ConcreteTreeProcessor.java:240) 

at 
org.apache.cocoon.components.treeprocessor.ConcreteTreeProcessor.buildPipeline(ConcreteTreeProcessor.java:198) 

at 
org.apache.cocoon.components.treeprocessor.TreeProcessor.buildPipeline(TreeProcessor.java:256) 

at 
org.apache.cocoon.components.treeprocessor.sitemap.MountNode.invoke(MountNode.java:108) 

at 
org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:46) 

at 
org.apache.cocoon.components.treeprocessor.sitemap.PreparableMatchNode.invoke(PreparableMatchNode.java:130) 

at 
org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:68) 

at 
org.apache.cocoon.components.treeprocessor.sitemap.PipelineNode.invoke(PipelineNode.java:138) 

at 
org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:68) 

at 
org.apache.cocoon.components.treeprocessor.sitemap.PipelinesNode.invoke(PipelinesNode.java:89) 

at 
org.apache.cocoon.components.treeprocessor.ConcreteTreeProcessor.process(ConcreteTreeProcessor.java:240) 

at 
org.apache.cocoon.components.treeprocessor.ConcreteTreeProcessor.buildPipeline(ConcreteTreeProcessor.java:198) 

at 
org.apache.cocoon.components.treeprocessor.TreeProcessor.buildPipeline(TreeProcessor.java:256) 

at 
org.apache.cocoon.components.treeprocessor.sitemap.MountNode.invoke(MountNode.java:108) 

at 
org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:46) 

at 
org.apache.cocoon.components.treeprocessor.sitemap.PreparableMatchNode.invoke(PreparableMatchNode.java:130) 

at 
org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:68) 

at 
org.apache.cocoon.components.treeprocessor.sitemap.PipelineNode.invoke(PipelineNode.java:138) 

at 

Re: Cocoon 2.1.7 hang

2005-12-15 Thread Antonio Gallardo

Ralph Goers wrote:

We tried to deploy an update to our product. Pretty much the only 
thing we did to Cocoon was to replace Xalan with XSLTC which produces 
a dramatic performance improvement.  However, Cocoon is now 
consistently hanging.  I tried to attach the thread dumps but they are 
too big. They also don't make any sense to me. I've reduced them down 
and pasted them below.  It shows many calls to XMLFileModule waiting 
for a lock. The thread that has the lock is waiting for 
ResourceLimitingPool to get a lock. However, the thread dump doesn't 
show any threads holding that lock.  Does anyone have any ideas on 
this?  This is cocoon 2.1.7-dev svn revision 122686.
Cocoon-2.1.7 is using excalibur-1.2.  I can't find the source at 
excalibur.  Anyone know where I can get it?


the src are here: 
http://apache.secsup.org/dist/avalon/excalibur-pool/source/


BTW, java version?, xalan version?

Best Regards,

Antonio Gallardo.

<snip/>


Re: Cocoon 2.1.7 hang

2005-12-15 Thread Ralph Goers

It is running on AIX so it is IBM's JDK 1.4

Thanks, I found the code. I don't see a whole lot different but there is 
clearly a problem in this code.


Ralph

Antonio Gallardo wrote:


Java version?

BTW, the src are here: 
http://apache.secsup.org/dist/avalon/excalibur-pool/source/


Best Regards,

Antonio Gallardo.

Ralph Goers wrote:

We tried to deploy an update to our product. Pretty much the only 
thing we did to Cocoon was to replace Xalan with XSLTC which produces 
a dramatic performance improvement.  However, Cocoon is now 
consistently hanging.  I tried to attach the thread dumps but they 
are too big. They also don't make any sense to me. I've reduced them 
down and pasted them below.  It shows many calls to XMLFileModule 
waiting for a lock. The thread that has the lock is waiting for 
ResourceLimitingPool to get a lock. However, the thread dump doesn't 
show any threads holding that lock.  Does anyone have any ideas on 
this?  This is cocoon 2.1.7-dev svn revision 122686.
Cocoon-2.1.7 is using excalibur-1.2.  I can't find the source at 
excalibur.  Anyone know where I can get it?


Lots of these waiting

http-8080-Processor25 daemon prio=1 tid=0x2dffa660 nid=0x19eb 
waiting for monitor entry [2d3eb000..2d3ed87c]
at 
org.apache.cocoon.components.modules.input.XMLFileModule$DocumentHelper.getDocument(XMLFileModule.java:155) 

- waiting to lock 0x5f1824b8 (a 
org.apache.cocoon.components.modules.input.XMLFileModule$DocumentHelper)
at 
org.apache.cocoon.components.modules.input.XMLFileModule.getContextObject(XMLFileModule.java:357) 

at 
org.apache.cocoon.components.modules.input.XMLFileModule.getAttribute(XMLFileModule.java:380) 

at 
org.apache.cocoon.components.modules.input.XMLFileModule.getAttribute(XMLFileModule.java:368) 

at 
org.apache.cocoon.components.treeprocessor.variables.PreparedVariableResolver.processModule(PreparedVariableResolver.java:256) 

at 
org.apache.cocoon.components.treeprocessor.variables.PreparedVariableResolver.resolve(PreparedVariableResolver.java:207) 

at 
org.apache.cocoon.components.treeprocessor.sitemap.TransformNode.invoke(TransformNode.java:59) 

at 
org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:68) 

at 
org.apache.cocoon.components.treeprocessor.sitemap.SelectNode.invoke(SelectNode.java:97) 

at 
org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:46) 

at 
org.apache.cocoon.components.treeprocessor.sitemap.PreparableMatchNode.invoke(PreparableMatchNode.java:130) 

at 
org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:68) 

at 
org.apache.cocoon.components.treeprocessor.sitemap.PipelineNode.invoke(PipelineNode.java:138) 

at 
org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:68) 

at 
org.apache.cocoon.components.treeprocessor.sitemap.PipelinesNode.invoke(PipelinesNode.java:89) 

at 
org.apache.cocoon.components.treeprocessor.ConcreteTreeProcessor.process(ConcreteTreeProcessor.java:240) 

at 
org.apache.cocoon.components.treeprocessor.ConcreteTreeProcessor.buildPipeline(ConcreteTreeProcessor.java:198) 

at 
org.apache.cocoon.components.treeprocessor.TreeProcessor.buildPipeline(TreeProcessor.java:256) 

at 
org.apache.cocoon.components.treeprocessor.sitemap.MountNode.invoke(MountNode.java:108) 

at 
org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:46) 

at 
org.apache.cocoon.components.treeprocessor.sitemap.PreparableMatchNode.invoke(PreparableMatchNode.java:130) 

at 
org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:68) 

at 
org.apache.cocoon.components.treeprocessor.sitemap.PipelineNode.invoke(PipelineNode.java:138) 

at 
org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:68) 

at 
org.apache.cocoon.components.treeprocessor.sitemap.PipelinesNode.invoke(PipelinesNode.java:89) 

at 
org.apache.cocoon.components.treeprocessor.ConcreteTreeProcessor.process(ConcreteTreeProcessor.java:240) 

at 
org.apache.cocoon.components.treeprocessor.ConcreteTreeProcessor.buildPipeline(ConcreteTreeProcessor.java:198) 

at 
org.apache.cocoon.components.treeprocessor.TreeProcessor.buildPipeline(TreeProcessor.java:256) 

at 
org.apache.cocoon.components.treeprocessor.sitemap.MountNode.invoke(MountNode.java:108) 

at 
org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:46) 

at 
org.apache.cocoon.components.treeprocessor.sitemap.PreparableMatchNode.invoke(PreparableMatchNode.java:130) 

at 
org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:68) 

at 

Re: Cocoon 2.1.7 hang

2005-12-15 Thread Ralph Goers

Antonio Gallardo wrote:


Ralph Goers wrote:

We tried to deploy an update to our product. Pretty much the only 
thing we did to Cocoon was to replace Xalan with XSLTC which produces 
a dramatic performance improvement.  However, Cocoon is now 
consistently hanging.  I tried to attach the thread dumps but they 
are too big. They also don't make any sense to me. I've reduced them 
down and pasted them below.  It shows many calls to XMLFileModule 
waiting for a lock. The thread that has the lock is waiting for 
ResourceLimitingPool to get a lock. However, the thread dump doesn't 
show any threads holding that lock.  Does anyone have any ideas on 
this?  This is cocoon 2.1.7-dev svn revision 122686.
Cocoon-2.1.7 is using excalibur-1.2.  I can't find the source at 
excalibur.  Anyone know where I can get it?


the src are here: 
http://apache.secsup.org/dist/avalon/excalibur-pool/source/


BTW, java version?, xalan version?


xalan version is 2.6.1. As I recall, the revision of Cocoon I am using 
was very close to the final 2.1.7 release.


Ralph