SAX: need to parse attribute values with namespace prefixes

Michael Klepikov Thu, 03 Apr 2003 16:02:02 -0800

Hi All,

It seems there may be a need to expose some things in the Xerces-C++
public API that are currently considered private.


We are writing a SAX based SOAP C++ engine (why our own is a separate
story), and we need to interpret attribute *values* that contain
namespace prefixes:

<soapenv:Envelope xmlns:xsd="http://www.w3.org/2001/XMLSchema" ...>
..
 <inputStruct xsi:type="ns2:SOAPStruct" xmlns:ns2="http://soapinterop.org/xsd">
  <varString xsi:type="xsd:string">Hello</varString>
 </inputStruct>

That is, we basically need the ability to create a QName corresponding
to an arbitrary string ("ns2:SOAPStruct", "xsd:string"), such that the
string is interpreted according to the current parser state with respect
to namespaces and prefixes. Currently there doesn't seem to be a public
API in Xerces-C++ that would allow a SAX based program to do that.

The Xerces-C++ XMLScanner class has just the method for this:

    /**
      * This method separate the passed QName into prefix
      * and local part, and then return the URI id by resolving
      * the prefix.
      *
      * mode: Indicate if this QName comes from an Element or Attribute
      */
    unsigned int resolveQName
    (
        const   XMLCh* const        qName
        ,       XMLBuffer&          nameBufToFill
        ,       XMLBuffer&          prefixBufToFill
        , const ElemStack::MapModes mode
    );

and then we can construct a Xerces QName using the returned URI ID,
prefix, and local part.

But:

1. We don't have access to the scanner from the ContentHandler during
SAX parsing, and we couldn't find any other way to get the parser's
current scanner.

2. XMLScanner is not an official documented public interface, even
though it is used in XMLValidator::setScannerInfo, which is a public
method that may be overridden by users, as far as I understand.

(2) wouldn't prevent us from using XMLScanner (although we would have
to live with some guilt and in fear of changing API:), but (1) is the
real show stopper.

We looked at moving our SOAP code from being a ContentHandler to being
an XMLValidator (which is given a scanner), but it just seems
inappropriate architecturally, and awkward to implement.

I also looked at what Axis does, since it is also SAX based. In Axis-Java
they appear to be building their own stack of DOM nodes and using them
to resolve namespace references. This is kind of inefficient, because
it's more or less double allocation of the deserialized data structure
-- once for the DOM nodes, and once for the actual user classes. I
haven't looked too closely though, because we are primarily interested
in C++ at this point. In Axis-C++ (currently just an independent
contrib effort) they do what we also thought of doing: maintain their
own stack of namespace maps:

http://cvs.apache.org/viewcvs.cgi/xml-axis/contrib/Axis-C%2B%2B/src/Xml/XMLDeSerializer.cpp?rev=1.4&content-type=text/vnd.viewcvs-markup
(see XMLDeSerializer::GetQNameFromStr, XMLDeSerializer::RegisterPrefixForURI)

http://cvs.apache.org/viewcvs.cgi/xml-axis/contrib/Axis-C%2B%2B/src/Util/NsStack.hpp?rev=1.3&content-type=text/vnd.viewcvs-markup

That's also a duplication, because Xerces must surely be maintaining
a similar stack internally. Axis-C++ was also forced to create its
own QName structure, presumably because a Xerces QName requires a URI ID
to create, and in the Xerces-C++ SAX API there doesn't seem to be a way to
get that ID without a scanner, which isn't accessible, as I mentioned
earlier. It's a bummer to have this useful functionality right under
our noses inside Xerces, yet not be able to get to it.
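
To make the "own stack of namespace maps" idea concrete, here is a
minimal sketch of it. It assumes the parser reports namespace
declarations through the SAX2 ContentHandler callbacks
startPrefixMapping/endPrefixMapping, which would simply forward to this
class. NamespaceStack, ResolvedQName and example() are hypothetical
names made up for illustration; they are not part of Xerces-C++ or
Axis-C++, and std::string stands in for XMLCh strings to keep the
sketch self-contained. It also assumes that an unprefixed value picks
up the default namespace, if one is in scope.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical result type: a namespace URI plus a local name.
struct ResolvedQName {
    std::string uri;
    std::string localName;
};

// Hypothetical helper: tracks which URI each prefix is bound to right now.
class NamespaceStack {
public:
    // Forward ContentHandler::startPrefixMapping here.
    void startPrefixMapping(const std::string& prefix, const std::string& uri) {
        bindings_[prefix].push_back(uri);   // inner declarations shadow outer ones
    }

    // Forward ContentHandler::endPrefixMapping here.
    void endPrefixMapping(const std::string& prefix) {
        std::map<std::string, std::vector<std::string> >::iterator it =
            bindings_.find(prefix);
        if (it == bindings_.end()) return;
        it->second.pop_back();              // restore the outer binding, if any
        if (it->second.empty()) bindings_.erase(it);
    }

    // Split "prefix:local" and resolve the prefix against the current bindings.
    // Returns false if the value uses a prefix that is not in scope.
    bool resolve(const std::string& qname, ResolvedQName& out) const {
        const std::string::size_type colon = qname.find(':');
        const bool hasPrefix = (colon != std::string::npos);
        const std::string prefix = hasPrefix ? qname.substr(0, colon) : std::string();
        const std::string local = hasPrefix ? qname.substr(colon + 1) : qname;

        std::map<std::string, std::vector<std::string> >::const_iterator it =
            bindings_.find(prefix);
        if (it == bindings_.end()) {
            if (hasPrefix) return false;    // unbound prefix
            out.uri.clear();                // no default namespace in scope
        } else {
            out.uri = it->second.back();    // innermost binding wins
        }
        out.localName = local;
        return true;
    }

private:
    std::map<std::string, std::vector<std::string> > bindings_;
};

// Replays the events for the <inputStruct> element from the message above.
inline void example() {
    NamespaceStack ns;
    ns.startPrefixMapping("xsd", "http://www.w3.org/2001/XMLSchema");
    ns.startPrefixMapping("ns2", "http://soapinterop.org/xsd");

    ResolvedQName q;
    assert(ns.resolve("ns2:SOAPStruct", q));
    assert(q.uri == "http://soapinterop.org/xsd");
    assert(q.localName == "SOAPStruct");

    ns.endPrefixMapping("ns2");             // </inputStruct> has been closed
    assert(!ns.resolve("ns2:SOAPStruct", q));
}
```

Since SAX2 delivers startPrefixMapping before the corresponding
startElement, a declaration made on the same element as the attribute
(as with xmlns:ns2 above) is already in scope when the attribute value
is examined. This still duplicates bookkeeping the parser does
internally, which is exactly the complaint here.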

The same problem may exist in Xerces-Java, but I haven't checked it
thoroughly.

The questions are:

1. Am I missing something -- is there a public way to resolve a
namespace reference from a raw string?

2. If there isn't such a way today, can we expect it to appear in
the reasonably near future?

3. Any ideas on other resolution methods for namespace references?

Any thoughts are appreciated. 

--Michael Klepikov
