Improve handling of input in XSLTMediator
-----------------------------------------
Key: SYNAPSE-213
URL: https://issues.apache.org/jira/browse/SYNAPSE-213
Project: Synapse
Issue Type: Improvement
Components: Core
Affects Versions: 1.1, NIGHTLY
Reporter: Andreas Veithen
Priority: Minor
Currently XSLTMediator uses two different strategies to feed the XML input into
the XSLT processor:
* When useDOMSourceAndResults is set to false, the Axiom tree will be
serialized to a byte stream (in memory or to a temporary file for large
documents) and then fed into the XSLT processor using a StreamSource object.
* When useDOMSourceAndResults is set to true, the code will call
ElementHelper.importOMElement to get a DOM-compliant copy of the Axiom tree.
The resulting DOM tree is then passed to the XSLT processor using a DOMSource.
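
The two input strategies above can be sketched with plain JAXP as follows. This
is only an illustration, not the XSLTMediator code: the class name, the sample
stylesheet and the method names are made up, and the DOM tree is obtained by
parsing the bytes with a DocumentBuilder, standing in for the copy that
ElementHelper.importOMElement would build from the Axiom tree.

```java
import java.io.ByteArrayInputStream;
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;

// Hypothetical demo class; not part of Synapse.
final class XsltInputStrategies {
    // Sample stylesheet (an assumption): prints the text of /order/item.
    static final String XSLT =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/'>"
      + "<xsl:value-of select='/order/item'/>"
      + "</xsl:template></xsl:stylesheet>";

    // Runs the sample stylesheet against whatever Source is handed in.
    static String transform(Source input) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            t.transform(input, new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // useDOMSourceAndResults = false: feed the serialized bytes as a StreamSource.
    static String viaStreamSource(byte[] serialized) {
        return transform(new StreamSource(new ByteArrayInputStream(serialized)));
    }

    // useDOMSourceAndResults = true: feed a DOM tree as a DOMSource.
    // The DOM is built by parsing here; the mediator would copy the Axiom tree.
    static String viaDomSource(byte[] serialized) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Document dom = dbf.newDocumentBuilder()
                .parse(new ByteArrayInputStream(serialized));
            return transform(new DOMSource(dom));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

For the input <order><item>widget</item></order>, both methods return the
string "widget". They differ only in which intermediate representation is
materialized before the XSLT processor builds its own model.
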
First it should be noted that using a temporary file for the XML input (in
contrast to the output of the transformation) doesn't eliminate the need to
keep the entire input document in memory. Indeed:
* When the input is read, Axiom will build the entire tree and keep it in memory.
* Due to the way XSLT works, the XSLT processor also requires a complete
in-memory representation of the input document. The only exception is for XSLT
processors that support streaming, which is not the case for Xalan. Xalan uses
its own object model called DTM (Document Table Model) to store the input
document in memory.
Since the input document must be kept in memory anyway, the only question is
how to efficiently feed the original Axiom tree into the XSLT processor without
creating too much overhead and consuming too much memory. Assuming that Xalan
is used, the current situation is as follows:
* When useDOMSourceAndResults is set to false, three copies of the XML input
will be built: the Axiom tree, the serialized byte stream and Xalan's DTM
representation. When temporary files are used for large documents, only two
copies will coexist in memory. However, using temporary files introduces a
large overhead.
* When useDOMSourceAndResults is set to true, at least two copies of the input
will be built: the Axiom tree and the DOM tree. Indeed, from the code in
ElementHelper.importOMElement it can be seen that an entirely new copy of the
input tree will be created. In addition, Xalan will create a DTM representation
of the DOM tree. The document at http://xml.apache.org/xalan-j/dtm.html
suggests that this representation is not a complete copy of the DOM tree, but a
wrapper/adapter that is backed by the original DOM tree.
Both strategies used by XSLTMediator are far from optimal. There are at least
two strategies that should give better results (and at least one of them is
actually simpler):
* Trick Axis2 into producing a DOM-compatible tree from the outset, by using a
StAXSOAPModelBuilder with a DOMSOAPFactory (this produces objects that
implement both the Axiom and DOM interfaces). This might require some
tweaking, however. The advantage is that there is no longer any need to create
a copy. Xalan will only create a DTM wrapper around the existing tree.
* Make sure that a DTM representation is created directly from the Axiom tree
without an intermediate copy (byte stream or DOM tree). With Java 6/JAXP 1.4 this
would be very easy because it has support for StAXSource, which integrates
nicely with Axiom. In the meantime, the solution is to pull StAX events from
Axiom, convert them to SAX events and push them to the XSLT processor. The
Spring WS project has a utility class StaxSource (extending SAXSource) that
does this in a completely transparent way (new
StaxSource(omElement.getXMLStreamReader())). By using
getXMLStreamReaderWithoutCaching instead of getXMLStreamReader, this could
probably be further optimized to instruct Axiom not to create the tree for the
part of the input message that is being transformed (unless it has already been
constructed at that moment).
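
The Java 6/JAXP 1.4 variant of the second strategy could look roughly like the
sketch below. It is only an illustration under stated assumptions: the class
and method names are made up, and the XMLStreamReader comes from the JDK's
XMLInputFactory rather than from omElement.getXMLStreamReader(), so that the
example has no Axiom dependency. It also assumes that the TransformerFactory in
use accepts StAXSource, which the transformer bundled with the JDK does but
another processor on the classpath may not.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamReader;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stax.StAXSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo class; not part of Synapse or Axiom.
final class StaxSourceSketch {
    // Sample stylesheet (an assumption): prints the text of /order/item.
    static final String XSLT =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/'>"
      + "<xsl:value-of select='/order/item'/>"
      + "</xsl:template></xsl:stylesheet>";

    // Stand-in for omElement.getXMLStreamReader(): a reader over a string.
    static XMLStreamReader readerFor(String xml) {
        try {
            return XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Hands the StAX reader to the XSLT processor without first serializing
    // the input to bytes or copying it into a DOM tree.
    static String transform(XMLStreamReader reader, String xslt) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            // The reader must be positioned at START_DOCUMENT or START_ELEMENT.
            t.transform(new StAXSource(reader), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With Axiom, the call would presumably become
transform(omElement.getXMLStreamReader(), xslt), so the processor pulls events
from the Axiom tree and builds its own model directly from them.
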