Re: StAX Parser progress

Eran Chinthaka Tue, 20 Jun 2006 02:26:22 -0700

Oshani Seneviratne wrote:
> Hi devs,
> 
> This mail is mainly to update you with the progress I've made with the
> StAX based parser for Woden and to get your comments on the approach
> I'm taking (especially if I'm not heading in the expected direction).
> 
> I initially started parsing a WSDL purely with a StAX XMLStreamReader
> to build the Woden element model. My idea was to cache the
> XMLStreamReader at each top-level element as it is accessed, so that
> later elements could use the cached parser when they needed
> information from previously accessed elements.
> However, I realized that when there are so many nested elements, this
> approach created many parser instances even when it was not required
> (i.e. when those elements could have been accessed with the current
> parser). And this was a major problem when it came to the schema
> validation.
> So IMHO, we need to have a proper object model based on StAX, rather
> than building with a pure cursor/event based StAX approach.


+1. You cannot always create multiple parsers if you are trying to
build from *any* stream. It's better to abandon this approach early :).
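
To illustrate, here is a minimal sketch using only the JDK's StAX API
(it is not Woden or AXIOM code, and the class and method names are made
up for the example). It runs two parsers over the same InputStream; the
first one consumes the stream, so the second finds nothing left to read.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical demo class, not part of Woden or AXIOM.
final class SinglePassStream {

    // Creates a new StAX reader over the stream and counts its start tags.
    // Returns 0 when the stream holds no parseable document.
    static int countElements(InputStream in) {
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(in);
            int count = 0;
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    count++;
                }
            }
            reader.close(); // releases the reader, not the underlying stream
            return count;
        } catch (XMLStreamException e) {
            // Typically reached when the stream has already been consumed.
            return 0;
        }
    }

    public static void main(String[] args) {
        byte[] xml = "<a><b/><c/></a>".getBytes(StandardCharsets.UTF_8);
        InputStream in = new ByteArrayInputStream(xml);
        System.out.println("first parser:  " + countElements(in));
        // Same stream object again: the bytes are gone, so nothing is counted.
        System.out.println("second parser: " + countElements(in));
    }
}
```

A file or a URL can be reopened for each parser, but an arbitrary
stream handed in by a caller cannot, which is why an object model built
once from the stream is the safer design.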

> 
> Therefore, how about using AXIOM instead? Since AXIOM is based on
> StAX, the resulting implementation would be fast and efficient, as it
> is expected from a StAX parser. If one of the objectives of Woden is
> to be used in Axis2, I suppose using AXIOM in Woden would not be much
> of a problem :).
> 
> I implemented a prototype OMWSDLReader as an alternative to the
> DOMWSDLReader, and at the moment it can correctly read some of the
> components in the hotel-reservation.wsdl. Parsing for extension
> attributes, imports and includes is yet to be added.
> 
> In the case of schema, I suppose we can stick with XMLSchema as in the
> current DOM impl. However, the arguments to the XmlSchemaCollection's
> read method posed a problem, and I could only come up with the
> following:
> 
> <code snippet>
> 
> //omElement is an OMElement which contains the <xs:schema> element
> String elementString = omElement.toString();

Use toStringWithConsume, not toString. The former will not build the
object model during the serialization phase.

> byte[] bytes = elementString.getBytes();
> 
> //Deserialize from the byte array
> InputStream inputStream = new ByteArrayInputStream(bytes);
> InputSource inputSource = new InputSource(inputStream);
> 
> XmlSchemaCollection xsc = new XmlSchemaCollection();
> XmlSchema schemaDef = xsc.read(inputSource, null);
> 
> </code snippet>
> 
> This returned the correct XMLSchema as it was there in the WSDL.
> However, unlike in the DOM impl, apart from the targetNamespace,
> the other namespaces were not there as attributes to <xs:schema>. I
> wonder whether this could lead to a bug later in the model for schema
> in Woden!

I think there is a way to feed additional namespaces into the schema
model. I need to check.
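
In the meantime, one possible way to get hold of the missing
declarations, sketched with the JDK's StAX API only: remember the
namespace declarations of every open element and merge them when the
<xs:schema> start tag is reached. The class name, the method name and
the sample URIs are hypothetical, and how the result would be handed to
XmlSchema is left open here.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, not part of Woden, AXIOM or XmlSchema.
final class InScopeNamespaces {

    // Returns prefix -> URI for every namespace in scope at the first
    // element named localName ("" is used as the key for the default
    // namespace). Returns an empty map if no such element is found.
    static Map<String, String> inScopeAt(String xml, String localName) {
        Deque<Map<String, String>> scopes = new ArrayDeque<>();
        try {
            XMLStreamReader r = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            while (r.hasNext()) {
                int event = r.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    // Only the declarations written on this very element.
                    Map<String, String> declared = new LinkedHashMap<>();
                    for (int i = 0; i < r.getNamespaceCount(); i++) {
                        String prefix = r.getNamespacePrefix(i);
                        declared.put(prefix == null ? "" : prefix,
                                     r.getNamespaceURI(i));
                    }
                    scopes.addLast(declared);
                    if (localName.equals(r.getLocalName())) {
                        // Merge outermost to innermost so that inner
                        // declarations override outer ones.
                        Map<String, String> merged = new LinkedHashMap<>();
                        for (Map<String, String> scope : scopes) {
                            merged.putAll(scope);
                        }
                        return merged;
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT) {
                    scopes.removeLast();
                }
            }
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("XML is not well-formed", e);
        }
        return new LinkedHashMap<>();
    }

    public static void main(String[] args) {
        String wsdl =
            "<description xmlns='http://www.w3.org/ns/wsdl'"
          + " xmlns:tns='http://example.org/hotel'>"
          + "<types><xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'"
          + " targetNamespace='http://example.org/hotel'/></types>"
          + "</description>";
        // tns is declared on <description>, not on <xs:schema>,
        // yet it is part of the merged result.
        System.out.println(inScopeAt(wsdl, "schema"));
    }
}
```

The merged map could then be written back as xmlns attributes on the
extracted <xs:schema> element before it is serialized, so that QNames
inside the schema still resolve once it is detached from the WSDL.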

-- Chinthaka
