Hi,
I have pasted at the bottom a previous inquiry to this group
about a long 'hang' we were seeing in the XMLBeans
XmlObject.validate() method on WebSphere. It seemed random, and even
when we found a message that caused the hang, we couldn't reproduce
it by replaying that same message.
Since that inquiry, we have isolated the problem with IBM
WebSphere Support, and IBM has issued a fix for the issue.
The problem was that the IBM JDK that is embedded in WebSphere 5.x
and 6.x was JIT compiling the following class or method, which would
sometimes take 4-5 minutes to complete:
org.apache.xmlbeans.impl.store.Saver$ValidatorSaver.textIsWhitespace(Saver.java:3985)
When this occurred, all of WebSphere came to a complete halt. All
threads in the container stopped executing, and you couldn't even
get to the WebSphere Admin Console web application page. The reason
was that for the 4-5 minutes the compile was taking, the JIT was
holding on to a lock called the JIT Global Compile Lock. Every other
WebSphere thread was waiting for this lock while the compile was
going on, hence WebSphere was dead for the duration.
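For anyone who wants to picture the effect, here is a minimal sketch of one
thread holding a single shared lock while every other thread queues up behind
it. It is only an analogy: GLOBAL_LOCK and measureWaitMillis are hypothetical
names made up for this illustration, the sleep stands in for the long compile,
and none of it reflects how the IBM JIT is actually implemented. It also uses
current Java syntax rather than JDK 1.4.2.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

public class GlobalLockStall {
    // Hypothetical stand-in for the JIT Global Compile Lock; not an IBM API.
    private static final ReentrantLock GLOBAL_LOCK = new ReentrantLock();

    /**
     * Starts a "compiler" thread that holds the lock for holdMillis, then
     * returns how long the calling thread had to wait to acquire the same lock.
     */
    public static long measureWaitMillis(final long holdMillis) throws InterruptedException {
        final CountDownLatch lockHeld = new CountDownLatch(1);

        Thread compiler = new Thread(() -> {
            GLOBAL_LOCK.lock();
            try {
                lockHeld.countDown();      // signal that the lock is now owned
                Thread.sleep(holdMillis);  // stands in for the 4-5 minute compile
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                GLOBAL_LOCK.unlock();
            }
        });
        compiler.start();
        lockHeld.await();                  // wait until the "compiler" owns the lock

        long start = System.nanoTime();
        GLOBAL_LOCK.lock();                // a "request thread" blocks here
        try {
            return (System.nanoTime() - start) / 1_000_000;
        } finally {
            GLOBAL_LOCK.unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("request thread waited ~" + measureWaitMillis(500) + " ms");
    }
}
```

The waiting thread does no work of its own; it is stalled purely because of
who holds the lock, which matches what we saw in the lock dumps.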
IBM issued a fix for this with their JDK 1.4.2 SR7 release, which
can be found here:
http://www-1.ibm.com/support/docview.wss?uid=swg24011133
This problem is not specific to any version of XMLBeans: we tried
1.0.3, 1.0.4, and 2.x, and they all experienced the issue.
Once we figured this out via core dumps, stack traces, and lock
dumps, we actually found out that there was another WebSphere
customer out there that was having the exact same issue as us. The
other customer confirmed their problem went away with this SR7. We
applied the update yesterday and are testing it out now. So far we
have not been able to recreate the issue.
Hope this helps anyone having similar issues.
Thanks,
Joe
Here are the contents of my earlier posts to this thread:
Radu, thanks for the thoughts. I will investigate some of those
ideas...since we are out of ideas at this point.
Thanks,
Joe
-----Original Message-----
From: Radu Preotiuc-Pietro [mailto:[EMAIL PROTECTED]
Sent: Monday, October 16, 2006 9:56 AM
To: [email protected]
Subject: RE: Wierd "Freeze" in Validate on WebSphere 5.1
Well, I got to say I have no idea where in the XMLBeans code this problem
might be. You say it's related to validation and it's not around the
SchemaTypeVisitorImpl.visit() method (which is the "trickiest"
validation method).
On the other hand, you are saying that you don't see this problem on
Windows, which hints that this is not an "algorithmic" problem, but
something environment-related (the one that we fixed for 1.0.4 was
happening regardless of the environment used if I remember correctly).
I guess one simple test that you could do is try to validate the same
document using the command-line tool "validate" or even better, your code
running outside of WebSphere.
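A rough sketch of how such a standalone test could time each call is below.
The TimedCall and Timed names are hypothetical helpers invented for this
example, and the lambda in main is only placeholder work; the real validation
call from the application would go in its place. It uses current Java syntax,
so it would need adapting for a 1.4.2 JDK.

```java
import java.util.concurrent.Callable;

public class TimedCall {
    /** Hypothetical holder for a call's result and how long it took. */
    public static final class Timed<T> {
        public final T value;
        public final long elapsedMillis;

        Timed(T value, long elapsedMillis) {
            this.value = value;
            this.elapsedMillis = elapsedMillis;
        }
    }

    /** Runs the call once and records the wall-clock time it took. */
    public static <T> Timed<T> time(Callable<T> call) throws Exception {
        long start = System.nanoTime();
        T value = call.call();
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
        return new Timed<>(value, elapsedMillis);
    }

    public static void main(String[] args) throws Exception {
        // Placeholder work only; substitute the application's own
        // validation call here when running the test for real.
        Timed<Boolean> result = time(() -> {
            Thread.sleep(50);
            return Boolean.TRUE;
        });
        System.out.println("valid=" + result.value
                + " elapsed=" + result.elapsedMillis + " ms");
    }
}
```

Running the same document through this in a loop, both inside and outside the
container, would show whether the long delay follows the document or the
environment.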
One thing which _might_ be happening is that the loading of all the Schema
type information off the classpath during validation triggers some weird
classloading behaviour from WAS (that could also explain why the whole
server is frozen). By the way, XMLBeans uses the context classloader (by
default) to load this type information.
Radu
-----Original Message-----
From: Joseph Mihalich [mailto:[EMAIL PROTECTED]
Sent: Tuesday, October 10, 2006 6:15 PM
To: [email protected]
Subject: RE: Wierd "Freeze" in Validate on WebSphere 5.1
Hi all,
We are using XMLBeans 1.0.4, the latest binary release. We're
having a really horrible problem on WebSphere 5.1 trying to validate
XML requests coming in on our web services.
Seemingly at random, different components (in our EARs) that
deserialize XML received from Axis will call into the XmlObject
validate method and just sit there for anywhere between 2 and 4
minutes. The XML is usually 1K or less, so it's really small. And
the same XML document doesn't cause the long delay if I try to
replay the message.
During the time that validate is taking so long, the entire
WebSphere AppServer is frozen...all execution and logging stops
until this call returns. (Not sure how that can happen, but that's
another story).
I noticed a bug logged against 1.0.3 where the exact same behavior
was noted, and an optimization was put into 1.0.4 to fix it. We were
on 1.0.3, upgraded to 1.0.4, and still experienced the problem.
I pulled the 1.0.4 source, and added logging around the fix
that was made, and verified that in fact, the "delay" was NOT being
observed in that method, so the problem is somewhere else.
I attempted to add more logging to try to isolate where the
delay was coming from, but of course, once I did that I could not
reproduce it...hinting at some kind of timing issue. Then I ran out
of time, and had to put this problem aside for a while.
This happens on every single box we install on. This occurs on
SUSE Linux 9.1 Ent, WAS 5.1, IBM JDK 1.4.2. Also, we do NOT see this
on our Windows platform, which runs Windows Server 2003, JBoss 4.0.4,
and Sun JDK 1.4.2.
Anyway, I was wondering if anyone else has experienced this issue
on WAS 5.1, and if anyone has any suggestions on what or where the
problem might be in the XMLBeans code?
Thanks for any help you can provide.
Thanks,
Joe