Hi Johan,

Does the same problem occur with mod_proxy (plain HTTP) instead of
mod_proxy_ajp?

We have encountered some problems with mod_proxy_ajp that were solved by
using plain mod_proxy.

Cheers, buddy.

André


-----Original Message-----
From: Johan Cwiklinski <johan.cwiklin...@ajlsm.com>
Date: Wed, 22 Dec 2010 08:51:51 
To: <users@cocoon.apache.org>
Reply-To: users@cocoon.apache.org
Subject: Tomcat6/Cocoon 2.1.10 using 100% CPU on windows

Hello,

I have a problem with a Cocoon 2.1.10 webapp running under Tomcat 6.0.26
on Windows Server 2003 64-bit with Oracle's JDK 1.6.0_21.

This application is installed on a 'background' server; an application
on another server queries it via AJP using Apache mod_proxy_ajp.

For some reason, the application often ends up using 100% of the CPU,
and we then need to kill and restart Tomcat. The logs say absolutely
nothing :(
I have not yet been able to reproduce the issue in my dev environment.

This application mainly uses some classes we've developed on top of
Cocoon that will:
- search for an image file in several directories on different hard disks
(mainly by testing each directory + image path and checking whether the
file exists),
- retrieve and show the image,
- additionally use ImageMagick to resize, rotate, etc.
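
The directory lookup described in the first item can be sketched roughly
as follows. This is only an illustration of the "test each directory +
image path" idea; the class name ImageLocator and the method findFirst
are hypothetical and are not taken from the actual application code.

```java
import java.io.File;
import java.util.List;

// Hypothetical helper, not part of Cocoon or of the application.
final class ImageLocator {

    // Joins each base directory with the relative image path and returns
    // the first candidate that exists as a regular file, or null if the
    // image is not found in any of the directories.
    static File findFirst(List<File> baseDirs, String relativePath) {
        for (File dir : baseDirs) {
            File candidate = new File(dir, relativePath);
            if (candidate.isFile()) {
                return candidate;
            }
        }
        return null;
    }
}
```

Each miss costs one file system call, so the order of the directories
matters when the disks have different response times.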

The 'main' class extends Cocoon's ResourceReader.

Using the jvisualvm tool provided with Oracle's JDK, I can observe that:
- AJP threads are sometimes running and sometimes waiting; OK, that
seems normal,
- when the 100% CPU issue occurs, some AJP threads keep running (they
never go back to the waiting state). At the beginning, only one or two
threads are affected; many more will be if we wait. I can also observe
that some threads (only a few, unfortunately) still behave normally.
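
The same observation could also be collected from inside the JVM with
the standard java.lang.management API, for example to log it
periodically. The sketch below is only an illustration: the class name
RunnableThreadReport is hypothetical, and it assumes the code is called
from within the Tomcat JVM (for instance from a small diagnostic
servlet), which is not something the current setup does.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic helper, not part of Tomcat or Cocoon.
final class RunnableThreadReport {

    // Returns one line per RUNNABLE thread whose name starts with the
    // given prefix (e.g. "ajp-"), showing the top stack frame.
    static List<String> runnableThreads(String namePrefix) {
        List<String> report = new ArrayList<String>();
        ThreadInfo[] infos =
                ManagementFactory.getThreadMXBean().dumpAllThreads(false, false);
        for (ThreadInfo info : infos) {
            if (info == null) {
                continue; // thread ended while the dump was taken
            }
            if (info.getThreadState() == Thread.State.RUNNABLE
                    && info.getThreadName().startsWith(namePrefix)) {
                StackTraceElement[] stack = info.getStackTrace();
                String top = stack.length > 0 ? stack[0].toString() : "(no stack)";
                report.add(info.getThreadName() + " at " + top);
            }
        }
        return report;
    }
}
```

Calling runnableThreads("ajp-") a few minutes apart and comparing the
results would show which threads never leave the RUNNABLE state.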

All running threads using our class (ImageMagickReader) seem to be
stuck in the super.setup or super.generate methods:

"ajp-9009-9" - Thread t...@65
    java.lang.Thread.State: RUNNABLE
  at java.util.HashMap.get(HashMap.java:303)
  at org.apache.cocoon.reading.ResourceReader.getLastModified(ResourceReader.java:242)
  at org.apache.cocoon.reading.ResourceReader.setupHeaders(ResourceReader.java:177)
  at org.apache.cocoon.reading.ResourceReader.setup(ResourceReader.java:157)
  at org.pleade.reading.ImageMagickReader.setup(ImageMagickReader.java:272)
[...]

Line 242 of ResourceReader.java is:
final String systemId = (String) documents.get(request.getRequestURI());

"ajp-9009-8" - Thread t...@102
    java.lang.Thread.State: RUNNABLE
  at java.util.HashMap.transfer(HashMap.java:484)
  at java.util.HashMap.resize(HashMap.java:463)
  at java.util.HashMap.addEntry(HashMap.java:755)
  at java.util.HashMap.put(HashMap.java:385)
  at org.apache.cocoon.reading.ResourceReader.generate(ResourceReader.java:346)
  at org.pleade.reading.ImageMagickReader.generate(ImageMagickReader.java:584)
[...]

Line 346 of ResourceReader.java is:
documents.put(request.getRequestURI(), inputSource.getURI());

Those two examples are taken from the first threads that never get
released. I do not know whether a HashMap can become corrupted in some
way, or maybe the HTTP headers? I'm not even sure whether what we're
seeing is the cause or the consequence of the issue :(
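
On the HashMap question: java.util.HashMap is documented as not
synchronized, and with the JDK 6 implementation, concurrent put() calls
that trigger a resize can leave a bucket chain cyclic, after which get()
and transfer() can loop forever in the RUNNABLE state. That would match
the two traces above, assuming the `documents` map in ResourceReader is
shared between request threads without synchronization (an assumption
that has not been confirmed here). Below is a minimal sketch of a
thread-safe alternative. The class SharedUriCache and its method names
are hypothetical and are not part of Cocoon:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a request URI -> system id cache that is
// shared by all request threads.
final class SharedUriCache {

    // ConcurrentHashMap supports concurrent get/put; a plain HashMap
    // must not be modified by several threads without locking.
    private static final Map<String, String> DOCUMENTS =
            new ConcurrentHashMap<String, String>();

    static void remember(String requestUri, String systemId) {
        DOCUMENTS.put(requestUri, systemId);
    }

    static String lookup(String requestUri) {
        return DOCUMENTS.get(requestUri);
    }

    static int size() {
        return DOCUMENTS.size();
    }

    // Fills and reads the cache from several threads at once. Returns
    // true if all workers finish within 30 seconds.
    static boolean stress(int threads, final int entriesPerThread) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            final int id = t;
            pool.execute(new Runnable() {
                public void run() {
                    for (int i = 0; i < entriesPerThread; i++) {
                        String uri = "/img/" + id + "/" + i;
                        remember(uri, "file:/data" + uri);
                        lookup(uri);
                    }
                }
            });
        }
        pool.shutdown();
        try {
            return pool.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("finished=" + stress(8, 10000) + " size=" + size());
    }
}
```

Where the map type itself cannot be changed, wrapping the existing map
with Collections.synchronizedMap(...) is the other standard option.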

The same issue was observed in the past on another server, which is
now running under GNU/Linux and seems to be OK (about two weeks
under Linux, and no more 100% CPU!).
We've tried several Tomcat and Java versions; that does not change
anything.

The issue can occur after several hours of uptime, or after only a few
minutes! With many connections, the issue occurs more often, but it
still appears with just a few connections.

I really do not know where the problem lies... Is it our code? Is
it Cocoon? Is it Tomcat? Or, more probably, something one of them is
doing that Windows dislikes?

It's difficult to know exactly when the problem happens (we've asked
the system administrators but got no answer), so I've not yet tried to
log in debug mode (well, I tried it once, but it is really verbose...).

Any ideas? I do not know what to try or where to look now :/

Many similar issues I could find on the web were related to a bug in
tcnative under Tomcat 5.5; I guess that is resolved now, as I did not
find any similar bug under Tomcat 6. I also found a few reports with
Tomcat 6, but some were related to the apps, and the others were not
resolved (at least there is no information on the forums/mailing lists
saying that they were resolved or what the issue was).

You can take a look at the whole thread dump, taken after about 10-15
minutes of 100% CPU usage:
http://ouessant2.ajlsm.com/cocoon_app_cpu_issue

The two threads I gave as examples (ajp-9009-8 and ajp-9009-9) are ones
that were started approximately when the server ran out of CPU, and
they are still in the same state 10-15 minutes later.

Thank you.

Regards,
-- 
Johan Cwiklinski
AJLSM

