Hi,

If you look into the XPathBuilder in Camel (specifically the doInEvaluateAs() 
method), you see that the data to be evaluated with the XPath expression (a 
header or the body) is first converted into the data type defined by the 
documentType attribute of the XPath builder. Afterwards the expression is 
evaluated against that object (or against its node attribute if it is a 
DOMSource).

The default for the documentType is Document (DOM), which consumes a lot of 
memory. On large XML documents (e.g. 100 MB), building a DOM may lead to an 
OutOfMemoryError. If Saxon is used for the evaluation, the implementation is 
capable of using a TinyTree instead of a Xerces DOM, which is much smaller. 
However, that doesn't help if the JVM runs out of memory while the Xerces 
parser is building the DOM tree, before the XPath expression is even 
evaluated.

In the Java DSL it is possible to set the documentType on an XPath expression, 
as in:
        from("direct:setbody")
            .setBody(xpath("/a/b/c", Document.class)
                    .documentType(SAXSource.class)
                    .factory(new XPathFactoryImpl())
                    );

This route is capable of processing much larger documents than the same route 
without the .documentType(SAXSource.class) statement (InputSource will also 
work if the incoming data has a type converter to InputSource).

In the XML DSL there is unfortunately no way to set the documentType.

I have some questions about that:

1.       Does anybody know why Document was taken as a default documentType?

2.       Why is the documentType not configurable in XML DSL? What would I need 
to do in order to add an extra attribute to the XML DSL?

3.       Wouldn't a more dynamic approach be better? E.g. if the data is a DOM 
tree from the beginning, use that; if it's a SAXSource, use that one; and if 
it's something like an InputStream or String, use an InputSource?
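
To make question 3 more concrete, here is a minimal sketch of the selection 
logic I have in mind. It is not how XPathBuilder behaves today; the class and 
method names (DocumentTypeChooser, chooseDocumentType) are made up for 
illustration, and it only uses JDK types:

```java
import javax.xml.transform.sax.SAXSource;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of Camel: picks the type the message
// data would be converted to before the XPath expression is evaluated.
class DocumentTypeChooser {

    static Class<?> chooseDocumentType(Object body) {
        if (body instanceof Document) {
            // Already a DOM tree: no additional parsing needed.
            return Document.class;
        }
        if (body instanceof SAXSource) {
            // Already a SAXSource: hand it to the XPath engine as is.
            return SAXSource.class;
        }
        // InputStream, String, byte[] and so on: assume a type converter
        // to InputSource exists and let the XPath engine build its own
        // tree (e.g. a Saxon TinyTree) from it.
        return InputSource.class;
    }
}
```

Something like chooseDocumentType(exchange.getIn().getBody()) would then 
replace the fixed documentType inside the builder, assuming an explicitly 
configured documentType still takes precedence.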

What do you think about this?

Best regards
Stephan
