My test case has been posted to JIRA Muse-270:

https://issues.apache.org/jira/browse/MUSE-270
 

-----Original Message-----
From: Vinh Nguyen (vinguye2) 
Sent: Wednesday, September 05, 2007 12:54 PM
To: [email protected]
Subject: RE: EMPTY_DOC thread stability issues

Minor correction to point (b):

SimpleNotificationMessage actually calls EndpointReference.toXML(Document), 
which correctly returns a new Element copy.  So I made updates to 
SimpleNotificationMessage to use a new Document instead of EMPTY_DOC when 
calling EndpointReference.toXML(Document), since there can be several 
SimpleNotificationMessage objects trying to simultaneously access the child 
nodes of the toXML(Document) result.

The EndpointReference.toXML() method (no Document parameter) returns the same 
internal Element reference.  So callers must be careful to synchronize access 
to the result Element's children from multiple threads.  I think this can 
easily be done by applying a "synchronized(element)" lock on the Element object 
before accessing it.


-----Original Message-----
From: Vinh Nguyen (vinguye2)
Sent: Wednesday, September 05, 2007 11:40 AM
To: [email protected]
Subject: RE: EMPTY_DOC thread stability issues

To sum up, the two problems are:

1) A Document is not thread safe.
2) A Node/Element is not thread safe.

For #1, EMPTY_DOC can still be used in cases where an Element is created and 
only used in a single thread.  Or, with the help of ThreadLocal, it can be used 
in multiple threads...but not if the Element's child nodes will be read from 
multiple threads.

#2 requires extra care when creating an Element.  This means not just knowing 
when to use a new Document, but knowing if the Element itself needs to be 
created anew each time it needs to be accessed.  So, take the EndpointReference 
class for example.

a) Since the toXML() method is called from multiple threads  (i.e. Muse gets 
the producer resource's
   EPR to set it in a notification message), it should always use 
XmlUtils.createDocument() instead
   of XmlUtils.EMPTY_DOC.
b) Since toXML() returns an Element whose children will be accessed from 
multiple threads, then the
   Element itself should be created anew for each thread/caller.  So ideally, 
EndpointReference.toXML()
   should not return the same internal Element reference, but create a new 
Element each time.  This prevents
   errors from occurring when two SimpleNotificationMessage classes need to 
simultaneously read the child
   nodelist of the same resource EPR.

These rules apply not just to EMPTY_DOC, but to any class that needs to 
use/create a Document or Element.  The use of a shared EMPTY_DOC  increases 
performance, but it also can cause code confusion.  For example, a lot of Muse 
code will obtain an Element and automatically assume the owner document is 
EMPTY_DOC (i.e. serializers need to return an Element whose owner is 
EMPTY_DOC).  But, if my custom code doesn't follow this undocumented 
convention, I will run into errors.  It took me awhile to figure out the 
serializer restriction.

I'm also still concerned about ThreadLocal.  It does allow the EMPTY_DOC to be 
reused across multiple threads.  But, if Xerces is in fact storing an internal 
cache for the document and is depending on the thread to expire in order for 
the cache to clear, then that might be a problem.  ThreadLocal will not expire 
until the app shuts down, which means the Xerces cache will continue to build 
up for as many times EMPTY_DOC is used, and this might cause unexpected 
problems later on during runtime.
-Vinh



-----Original Message-----
From: Oliver Waeldrich [mailto:[EMAIL PROTECTED]
Sent: Wednesday, September 05, 2007 6:16 AM
To: [email protected]
Subject: RE: EMPTY_DOC thread stability issues

Hi all,

> point 1) affects this directly, the problem is that both the 
> getChildNodes optimisation and the importing (as Vinh just showed) 
> aren't thread safe.

I agree. The point is that using ThreadLocal is just an optimized version 
compared to the approach of creating new documents. It can be used for several 
cases were the XML fragments are only used temporally. Remember the initial 
idea why EMPTY_DOC was introduced was to minimize document creation. It should 
be used as a factory for creating XML fragments. 
In a lot of cases you create XML fragments (e.g. SOAP envelope, header, body), 
use them (e.g. de-/serialize) and discard them. 

Therefore, these fragments only exist for a short time within one worker thread 
of a servlet container, or more concrete they only exist within a 
reqest/response cycle. These fragments are also only accessed by the current 
thread. However, since document is not thread safe we can not share EMPTY_DOC 
over multiple worker threads.

MUSE would profit from using ThreadLocal and not creating new XML documents all 
the time, since every worker thread has its own factory document. Hence we do 
not have the threading issues.

If we need to use objects over multiple requests, we need to create them in 
thread safe representations, which potentially do not include DOM elements. But 
this is only the case for a minority of the XML related objects created in muse.

Again, the TreadLocal approach does not solve our problem alone. It is just an 
optimization. Since some refactoring has to be done, I would suggest not to 
simply search and replace EMPTY_DOC by createDocument(), since the basic idea 
of reusing the document object was totally fine.

> point 2) is exactly the same problem just pushed further down the line. 
> The resource EPR with its seperate document is then read 
> simultaneously by seperate threads wishing to publish events leading to same 
> issue.

I think this is really the root of the matter (at least for the EPRs). 
Potentially we must really use raw strings as internal representations of EPRs 
instead of DOM elements. 

Oliver


-------- Original-Nachricht --------
> Datum: Wed, 5 Sep 2007 13:20:21 +0200
> Von: [EMAIL PROTECTED]
> An: [email protected]
> Betreff: RE: RE: EMPTY_DOC thread stability issues

> Hi All,
> 
> Any piece of code that uses nodelists will be thread unsafe as soon as 
> a document is shared across threads.  It doesn't matter if the 
> original EPR has a seperate document or not.  The problem with TLSing 
> the Document itself is that the standard says its not thread safe 
> either, all that happens is push the problem further down the line 
> until two threads try to nodelist the same document.
> 
> point 1) affects this directly, the problem is that both the 
> getChildNodes optimisation and the importing (as Vinh just showed) aren't 
> thread safe.
> 
> For EPRs at least the only way to be truly thread safe is to store the 
> actual information in raw xml strings (or objects) and re-parse it 
> every time you use an epr.  Its expensive to then reparse but at least safe.
> 
> point 2) is exactly the same problem just pushed further down the line. 
> The resource EPR with its seperate document is then read 
> simultaneously by seperate threads wishing to publish events leading to same 
> issue.
> getFirstChild/getNextSibling help with this but only as another hack 
> tied to a specific version of xerces.
> 
> Please upload the test case, the more test cases we can use the better.
> 
> cheers,
> Chris
> 
> -----Original Message-----
> From: Oliver Waeldrich [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, September 05, 2007 12:52 PM
> To: [email protected]
> Subject: Re: RE: EMPTY_DOC thread stability issues
> 
> 
> Hi all,
> 
> I did some additional testing related to the EMPTY_DOC problem. 
> Therefore I set up a test where multiple threads created new EPRs 
> based on a set of existing (shared) EPRs. Such shared EPRs are resource EPRs 
> (e.g.
> notification producer EPR). In this test two different kinds of exceptions 
> are thrown:
> 
> 1.)
> java.lang.NullPointerException
>     at org.apache.xerces.dom.ParentNode.nodeListItem(Unknown Source)
>     at org.apache.xerces.dom.ParentNode.item(Unknown Source)
>     at org.apache.muse.util.xml.XmlUtils.getAllElements(XmlUtils.java:932)
>     at org.apache.muse.util.xml.XmlUtils.getElement(XmlUtils.java:1144)
>     at org.apache.muse.util.xml.XmlUtils.getElement(XmlUtils.java:1119)
>     at
> org.apache.muse.test.case1.TestEndpointReference.initializeFromXML(TestEndpointReference.java:560)
>     at
> org.apache.muse.test.case1.TestEndpointReference.<init>(TestEndpointReference.java:193)
>     at org.apache.muse.test.case1.TestThread.run(TestThread.java:60)
> 
> This exception relates to the following code parts in the EPR
> implementation:
>    
>         ...
>         Document doc = XmlUtils.EMPTY_DOC;
>         _xml = XmlUtils.createElement(doc, typeName);
> 
>         NodeList children = copy.toXML().getChildNodes();
>         int length = children.getLength();
>         
>         for (int n = 0; n < length; ++n)
>         {
>             Node next = doc.importNode(children.item(n), true);
>             _xml.appendChild(next);
>         }
> 
>         try
>         {
>             initializeFromXML();
>         }
>         ...
>         
> The reason for the exception is that multiple threads import nodes 
> into the SHARED document. Later on in initializeFromXML() the 
> different element nodes within the SHARED document are traversed 
> simulations, which leads to the threading issue. The usage of 
> ThreadLocal WILL eliminate such errors, since no SHARED document is 
> used between the different threads to create XML elements. In fact each 
> thread has its own document that is used for this.
> This is shown in the sample thread local.
> 
> 2.)
> 
> java.lang.NullPointerException
>     at org.apache.xerces.dom.ParentNode.nodeListItem(Unknown Source)
>     at org.apache.xerces.dom.ParentNode.item(Unknown Source)
>     at
> org.apache.muse.test.case1.TestEndpointReference.<init>(TestEndpointReference.java:187)
>     at org.apache.muse.test.case1.TestThread.run(TestThread.java:60)
> 
> The second exception has a similar background. In the test case the 
> initial EPRs are created in the class Test via the
>     
>         TestEndpointReference(URI address, QName typeName)
>          
> constructor. This constructor creates the internal XML representation 
> by using EMPTY_DOC. The original EPR is copied by the following code:
> 
>     NodeList children = copy.toXML().getChildNodes();
>     int length = children.getLength();
>     
>     for (int n = 0; n < length; ++n)
>     {
>         Node next = doc.importNode(children.item(n), true);
>         _xml.appendChild(next);
>     }
> 
> The copy.toXML() method simply returns the XML element created with 
> EMPTY_DOC. Here again multiple thread try to traverse the same node
> concurrently which leads to the exception.        
>         
> Therefore, in my point of view we have two different problems:
> 
>      
>  1.) Multiple threads must not use the same target document to import 
> nodes.
>   
>      This means XML elements, that are only used within a request/response
>      cycle (e.g. for de-/serialization) can use ThreadLocal mechnisms.
>      It is not necessary to create new documents all the way for this 
> purpose.
>      
>  2.) XML elements, that have a lifetime longer than a request/response
>      cycle, must be created by using new (not shared) documents. This 
> I already
>      pointed out in my first mail. This is especially true for EPRs, 
> that are
>      
> 
> I attached a number of test cases that illustrate my point.
> 
> To Rich:
> 
>   >>The default implementation creates a thread for processing the
>   >>notify request, and then immediately releases the thread that 
> called it.
>   
>   You are right: in this special case the objects need to be copied
>   by using a separate document, since here the message objects leave the
>   request/response cycle. This has to be taken into account.
>   
> Finally, the usage of createDocument() compared to the solution I 
> presented adds a performance overhead of more than 50%. This can 
> simply be verified with the tests I attached. However, I admit that my 
> solution requires more attention by the programmers than simply replacing 
> EMPTY_DOC, but I think
> it is worth.   
> 
> Interestingly, I still found that sometimes exceptions will occur, 
> even when using new documents for every XML operation. This is really 
> rare, but in my opinion this results from the “still” shared usage 
> of DOM objects within the EPR (the _xml element). Therefore, 
> concurrent read access to this instance variable can still occour. To 
> solve the problem with the EPR completely, this instance variable 
> should perhaps be removed from the EPR implementation.
>   
> Regards,
> Oliver
> 
> p.s. unfortunatelly I could not attach the test files, since this is 
> rejected by the mailer deamon. However, I could provide the test in the JIRA.
> --
> Psssst! Schon vom neuen GMX MultiMessenger gehört?
> Der kanns mit allen: http://www.gmx.net/de/go/multimessenger
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
> 

--
Psssst! Schon vom neuen GMX MultiMessenger gehört?
Der kanns mit allen: http://www.gmx.net/de/go/multimessenger

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to