Author: veithen
Date: Sun Jul 26 13:41:30 2009
New Revision: 797928

URL: http://svn.apache.org/viewvc?rev=797928&view=rev
Log:
Added some StAX related information to the dev guide.

Modified:
    webservices/commons/trunk/modules/axiom/src/docbkx/devguide.xml

Modified: webservices/commons/trunk/modules/axiom/src/docbkx/devguide.xml
URL: http://svn.apache.org/viewvc/webservices/commons/trunk/modules/axiom/src/docbkx/devguide.xml?rev=797928&r1=797927&r2=797928&view=diff
==============================================================================
--- webservices/commons/trunk/modules/axiom/src/docbkx/devguide.xml (original)
+++ webservices/commons/trunk/modules/axiom/src/docbkx/devguide.xml Sun Jul 26 13:41:30 2009
@@ -85,4 +85,230 @@
             </variablelist>
         </section>
     </chapter>
+    
+    <chapter>
+        <title>The StAX specification</title>
+        <para>
+            The StAX specification comprises two parts: a specification document titled <quote>Streaming API
+            For XML JSR-173 Specification</quote> and a Javadoc describing the API. Both can be downloaded from the
+            <ulink url="http://jcp.org/en/jsr/detail?id=173">JSR-173 page</ulink>. Since StAX is part of Java 6,
+            the Javadocs can also be viewed
+            <ulink url="http://java.sun.com/javase/6/docs/api/javax/xml/stream/package-summary.html">online</ulink>.
+        </para>
+        <section>
+            <title>Semantics of the <methodname>setPrefix</methodname> 
method</title>
+            <para>
+                One of the more obscure parts of the StAX specification is the meaning of the
+                <methodname>setPrefix</methodname><footnote><para>For simplicity, we only discuss
+                <methodname>setPrefix</methodname> here. The same remarks also apply to
+                <methodname>setDefaultNamespace</methodname>.</para></footnote> method defined by <classname>XMLStreamWriter</classname>.
+                To understand how this method works, it is necessary to look at different parts of the specification:
+            </para>
+            <itemizedlist>
+                <listitem>
+                    <para>
+                        The Javadoc of the <methodname>setPrefix</methodname> 
method.
+                    </para>
+                </listitem>
+                <listitem>
+                    <para>
+                        The table shown in the Javadoc of the 
<classname>XMLStreamWriter</classname> class
+                        in Java 6<footnote><para>This table is not included in 
the Javadoc in the original StAX
+                        specification.</para></footnote>.
+                    </para>
+                </listitem>
+                <listitem>
+                    <para>
+                        Section 5.2.2, <quote>Binding Prefixes</quote> of the 
specification.
+                    </para>
+                </listitem>
+                <listitem>
+                    <para>
+                        The example shown in section 5.3.2, 
<quote>XMLStreamWriter</quote> of the specification.
+                    </para>
+                </listitem>
+            </itemizedlist>
+            <para>
+                In addition, it is important to note the following facts:
+            </para>
+            <itemizedlist>
+                <listitem>
+                    <para>
+                        The terms <firstterm>defaulting prefixes</firstterm> 
used in section 5.2.2 of the
+                        specification and <firstterm>namespace 
repairing</firstterm> used in the Javadocs
+                        of <classname>XMLStreamWriter</classname> are synonyms.
+                    </para>
+                </listitem>
+                <listitem>
+                    <para>
+                        The methods writing namespace qualified information 
items, i.e.
+                        <methodname>writeStartElement</methodname>, 
<methodname>writeEmptyElement</methodname>
+                        and <methodname>writeAttribute</methodname> all come 
in two variants: one that
+                        takes a namespace URI and a prefix as arguments and 
one that only takes a
+                        namespace URI, but no prefix.
+                    </para>
+                </listitem>
+            </itemizedlist>
+            <para>
+                The purpose of the <methodname>setPrefix</methodname> method 
is simply to define the prefixes that
+                will be used by the variants of the 
<methodname>writeStartElement</methodname>,
+                <methodname>writeEmptyElement</methodname> and 
<methodname>writeAttribute</methodname> methods
+                that only take a namespace URI (and the local name). This 
becomes clear by looking at the
+                table in the <classname>XMLStreamWriter</classname> Javadoc. 
Note that a call to
+                <methodname>setPrefix</methodname> doesn't cause any output 
and it is still necessary
+                to use <methodname>writeNamespace</methodname> to actually 
write the necessary
+                namespace declarations. Otherwise the produced document will 
not be well formed with
+                respect to namespaces.
+            </para>
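+            <para>
+                The behaviour described above can be observed with a small self-contained sketch. The class
+                and method names are invented for this illustration and are not part of the StAX API, and the
+                sketch assumes that the StAX implementation bundled with the JRE is in use. It captures the
+                writer output once right after <methodname>setPrefix</methodname> and once at the end of the
+                document.
+            </para>

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical demo class, not part of StAX or Axiom.
class SetPrefixDemo {
    /** Returns what has been written right after setPrefix, followed by the final document. */
    static String[] run() throws XMLStreamException {
        StringWriter out = new StringWriter();
        XMLStreamWriter writer = XMLOutputFactory.newInstance().createXMLStreamWriter(out);

        // setPrefix only records the binding in the writer; nothing is emitted.
        writer.setPrefix("p", "urn:ns1");
        writer.flush();
        String afterSetPrefix = out.toString();

        // The URI-only variant looks up the prefix recorded by setPrefix ...
        writer.writeStartElement("urn:ns1", "root");
        // ... but the namespace declaration still has to be written explicitly.
        writer.writeNamespace("p", "urn:ns1");
        writer.writeEndElement();
        writer.close();

        return new String[] { afterSetPrefix, out.toString() };
    }

    public static void main(String[] args) throws XMLStreamException {
        String[] result = run();
        System.out.println("after setPrefix: [" + result[0] + "]");
        System.out.println("final output:    " + result[1]);
    }
}
```

+            <para>
+                The first captured string is expected to be empty, while the final document should contain
+                both the prefixed element name and the declaration written by
+                <methodname>writeNamespace</methodname>.
+            </para>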
+            <para>
+                The Javadoc of the <methodname>setPrefix</methodname> method also clearly defines the scope
+                of the prefix bindings defined using that method: a prefix bound using
+                <methodname>setPrefix</methodname> remains valid until the invocation of
+                <methodname>writeEndElement</methodname> corresponding to the last invocation of
+                <methodname>writeStartElement</methodname>. While not explicitly mentioned in the
+                specification, it is clear that a prefix binding may be masked by another binding
+                for the same prefix defined in a nested element.
+            </para>
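+            <para>
+                The scoping and masking rules can be probed through the namespace context of the writer.
+                The following sketch is an illustration only: the class name is invented, and the behaviour
+                shown is what one would expect from the StAX implementation bundled with the JRE. It binds
+                <literal>p</literal> on an outer element, rebinds it on a nested element, and queries the
+                binding inside and after the nested element.
+            </para>

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical demo class, not part of StAX or Axiom.
class PrefixScopeDemo {
    /** Returns the URI bound to "p" inside the nested element and after it has been closed. */
    static String[] run() throws XMLStreamException {
        XMLStreamWriter writer =
                XMLOutputFactory.newInstance().createXMLStreamWriter(new StringWriter());

        writer.writeStartElement("root");
        writer.setPrefix("p", "urn:ns1");   // scope: the root element

        writer.writeStartElement("child");
        writer.setPrefix("p", "urn:ns2");   // masks the outer binding inside child
        String inner = writer.getNamespaceContext().getNamespaceURI("p");
        writer.writeEndElement();           // the binding made inside child ends here

        String outer = writer.getNamespaceContext().getNamespaceURI("p");
        writer.writeEndElement();
        writer.close();

        return new String[] { inner, outer };
    }

    public static void main(String[] args) throws XMLStreamException {
        String[] uris = run();
        System.out.println("inside child: " + uris[0]);
        System.out.println("after child:  " + uris[1]);
    }
}
```

+            <para>
+                No namespace declarations are written here, because the sketch only inspects the bindings
+                tracked by the writer and discards the document it produces.
+            </para>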
+            <para>
+                An aspect that may cause confusion is the fact that in the example shown in section
+                5.3.2 of the specification, the calls to <methodname>setPrefix</methodname> (and
+                <methodname>setDefaultNamespace</methodname>) all appear 
immediately before a
+                call to <methodname>writeStartElement</methodname> or 
<methodname>writeEmptyElement</methodname>.
+                This may lead people to incorrectly believe that a prefix 
binding defined using
+                <methodname>setPrefix</methodname> only applies to the next 
element
+                written<footnote><para>Another factor that contributes to the 
confusion is that in SAX,
+                prefix mappings are always generated before the corresponding 
<methodname>startElement</methodname>
+                event and that their scope ends with the corresponding 
<methodname>endElement</methodname>
+                event. This is so because the 
<classname>ContentHandler</classname> interface specifies that
+                <quote>all <methodname>startPrefixMapping</methodname> events 
will occur immediately before the
+                corresponding <methodname>startElement</methodname> event, and 
all <methodname>endPrefixMapping</methodname>
+                events will occur immediately after the corresponding 
<methodname>endElement</methodname>
+                event</quote>.</para></footnote>.
+                This interpretation is clearly in contradiction with the 
<methodname>setPrefix</methodname>
+                Javadoc, unless one assumes that <quote>the current 
START_ELEMENT / END_ELEMENT pair</quote>
+                means the element opened by a call to 
<methodname>writeStartElement</methodname> immediately following
+                the call to <methodname>setPrefix</methodname>. This however 
would be a very arbitrary interpretation
+                of the Javadoc.
+            </para>
+            <para>
+                The correctness of the comments in the previous paragraph can 
be checked using the following
+                code snippet:
+            </para>
+<programlisting>XMLOutputFactory f = XMLOutputFactory.newInstance();
+XMLStreamWriter writer = f.createXMLStreamWriter(System.out);
+writer.writeStartElement("root");
+writer.setPrefix("p", "urn:ns1");
+writer.writeEmptyElement("urn:ns1", "element1");
+writer.writeEmptyElement("urn:ns1", "element2");
+writer.writeEndElement();
+writer.flush();
+writer.close();</programlisting>
+            <para>
+                This produces the following output<footnote><para>This has 
been tested with
+                Woodstox 3.2.9, SJSXP 1.0.1 and version 1.2.0 of the reference
+                implementation.</para></footnote>:
+            </para>
+<screen><![CDATA[<root><p:element1/><p:element2/></root>]]></screen>
+            <para>
+                Since the code doesn't call 
<methodname>writeNamespace</methodname>, the output is obviously not
+                well formed with respect to namespaces, but it also clearly 
shows that the scope of the
+                prefix binding for <literal>p</literal> extends to the end of 
the
+                <sgmltag class="element">root</sgmltag> element and is not 
limited to
+                <sgmltag class="element">element1</sgmltag>.
+            </para>
+            <para>
+                To avoid unexpected results and keep the code maintainable, it 
is in general advisable to keep
+                the calls to <methodname>setPrefix</methodname> and 
<methodname>writeNamespace</methodname> aligned,
+                i.e. to make sure that the scope (in 
<classname>XMLStreamWriter</classname>) of the prefix binding
+                defined by <methodname>setPrefix</methodname> is compatible 
with the scope (in the produced
+                document) of the namespace declaration written by the 
corresponding call
+                to <methodname>writeNamespace</methodname>. This makes it 
necessary to write code like this:
+            </para>
+<programlisting>writer.writeStartElement("p", "element1", "urn:ns1");
+writer.setPrefix("p", "urn:ns1");
+writer.writeNamespace("p", "urn:ns1");</programlisting>
+            <para>
+                As can be seen from this code snippet, keeping the two scopes in sync makes it necessary to use
+                the <methodname>writeStartElement</methodname> variant which takes an explicit prefix. Note that
+                this somewhat conflicts with the purpose of the <methodname>setPrefix</methodname> method;
+                one may consider this a flaw in the design of the StAX API.
+            </para>
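+            <para>
+                For completeness, the three calls above can be embedded in a runnable sketch. The class name
+                and the nested <sgmltag class="element">element2</sgmltag> are assumptions made for this
+                illustration. The nested element is written with the variant that only takes a namespace
+                URI, so it relies on the binding established by <methodname>setPrefix</methodname>.
+            </para>

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical demo class, not part of StAX or Axiom.
class AlignedScopesDemo {
    /** Writes a small document in which binding scope and declaration scope coincide. */
    static String run() throws XMLStreamException {
        StringWriter out = new StringWriter();
        XMLStreamWriter writer = XMLOutputFactory.newInstance().createXMLStreamWriter(out);

        // Open the element with an explicit prefix, then bind and declare that prefix,
        // so that both scopes start at the same element.
        writer.writeStartElement("p", "element1", "urn:ns1");
        writer.setPrefix("p", "urn:ns1");
        writer.writeNamespace("p", "urn:ns1");

        // Descendants can rely on the binding through the URI-only variants.
        writer.writeEmptyElement("urn:ns1", "element2");

        writer.writeEndElement();   // ends the binding scope and the declaration scope
        writer.close();
        return out.toString();
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(run());
    }
}
```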
+        </section>
+        <section>
+            <title>The three <classname>XMLStreamWriter</classname> usage 
patterns</title>
+            <para>
+                Drawing the conclusions from the previous section and taking 
into account that
+                <classname>XMLStreamWriter</classname> also has a 
<quote>namespace repairing</quote>
+                mode, one can see that there are in fact three different ways 
to use
+                <classname>XMLStreamWriter</classname>. These usage patterns 
correspond to the
+                three bullets in section 5.2.2 of the StAX specification<footnote><para>The content
+                of this section is largely based on a <ulink url="http://markmail.org/message/olsdl3p3gciqqeob">reply
+                posted by Tatu Saloranta on the Axiom mailing list</ulink>. Tatu is the main developer of the
+                Woodstox project.</para></footnote>:
+            </para>
+            <orderedlist>
+                <listitem>
+                    <para>
+                        In the <quote>namespace repairing</quote> mode 
(enabled by the
+                        
<varname>javax.xml.stream.isRepairingNamespaces</varname> property), the writer
+                        takes care of all namespace bindings and declarations, 
with minimal help from
+                        the calling code. This will always produce output that 
is well-formed with respect
+                        to namespaces. On the other hand, this adds some 
overhead and the result may
+                        depend on the particular StAX implementation (though 
the result produced by
+                        different implementations will be equivalent).
+                    </para>
+                    <para>
+                        In repairing mode the calling code should avoid 
writing namespaces explicitly
+                        and leave that job to the writer. There is also no 
need to call
+                        <methodname>setPrefix</methodname>, except to suggest 
a preferred prefix for
+                        a namespace URI. All variants of 
<methodname>writeStartElement</methodname>,
+                        <methodname>writeEmptyElement</methodname> and 
<methodname>writeAttribute</methodname>
+                        may be used in this mode, but the implementation can 
choose whatever prefix mapping
+                        it wants, as long as the output results in proper URI 
mapping for elements and
+                        attributes.
+                    </para>
+                </listitem>
+                <listitem>
+                    <para>
+                        Only use the variants of the writer methods that take 
an explicit prefix together
+                        with the namespace URI. In this usage pattern, 
<methodname>setPrefix</methodname>
+                        is not used at all and it is the responsibility of the 
calling code to keep
+                        track of prefix bindings.
+                    </para>
+                    <para>
+                        Note that this approach is difficult to implement when different parts of the output document
+                        will be produced by different components (or even different libraries). Indeed, when
+                        passing the <classname>XMLStreamWriter</classname> from one method or component
+                        to the other, it will also be necessary to pass additional information about the
+                        prefix mappings in scope at that moment, unless it is acceptable to let the
+                        called method write (potentially redundant) namespace declarations for all namespaces
+                        it uses.
+                    </para>
+                </listitem>
+                <listitem>
+                    <para>
+                        Use <methodname>setPrefix</methodname> to keep track 
of prefix bindings and make sure that
+                        the bindings are in sync with the namespace 
declarations that have been written,
+                        i.e. always use <methodname>setPrefix</methodname> 
immediately before or immediately
+                        after each call to 
<methodname>writeNamespace</methodname>. Note that the code is
+                        still free to use all variants of 
<methodname>writeStartElement</methodname>,
+                        <methodname>writeEmptyElement</methodname> and 
<methodname>writeAttribute</methodname>;
+                        it only needs to make sure that the usage it makes of 
these methods is consistent with
+                        the prefix bindings in scope.
+                    </para>
+                    <para>
+                        The advantage of this approach is that it makes it possible to write modular code: when a
+                        method receives an <classname>XMLStreamWriter</classname> object (to write
+                        part of the document), it can use
+                        the namespace context of that writer (i.e. <methodname>getPrefix</methodname>
+                        and <methodname>getNamespaceContext</methodname>) to determine which namespace
+                        declarations are currently in scope in the output document and to avoid
+                        redundant or conflicting namespace declarations. Note that in order to do so,
+                        such code will have to check for an existing prefix binding before starting
+                        to use a namespace.
+                    </para>
+                </listitem>
+            </orderedlist>
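+            <para>
+                The third usage pattern can be sketched as follows. The helper
+                <methodname>startElement</methodname> and the class around it are hypothetical and only
+                meant as an illustration. The helper checks for an existing binding with
+                <methodname>getPrefix</methodname> and calls <methodname>setPrefix</methodname> and
+                <methodname>writeNamespace</methodname> together only when the namespace is not yet in
+                scope. For brevity, it does not check whether the preferred prefix is already bound to a
+                different namespace URI, which real code would have to handle.
+            </para>

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical demo class, not part of StAX or Axiom.
class ModularWriterDemo {
    /**
     * Opens an element in the given namespace, reusing a prefix that is already bound
     * in the writer and declaring the preferred prefix only when none is in scope.
     */
    static void startElement(XMLStreamWriter writer, String preferredPrefix,
            String uri, String localName) throws XMLStreamException {
        String prefix = writer.getPrefix(uri);
        if (prefix != null) {
            // A binding is in scope: no new declaration is needed.
            writer.writeStartElement(prefix, localName, uri);
        } else {
            // Keep the binding and the declaration in sync on the new element.
            writer.writeStartElement(preferredPrefix, localName, uri);
            writer.setPrefix(preferredPrefix, uri);
            writer.writeNamespace(preferredPrefix, uri);
        }
    }

    static String run() throws XMLStreamException {
        StringWriter out = new StringWriter();
        XMLStreamWriter writer = XMLOutputFactory.newInstance().createXMLStreamWriter(out);

        startElement(writer, "p", "urn:ns1", "root");    // nothing bound yet: declares p
        startElement(writer, "q", "urn:ns1", "child");   // reuses p, the preferred q is ignored
        writer.writeEndElement();
        writer.writeEndElement();
        writer.close();
        return out.toString();
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(run());
    }
}
```

+            <para>
+                Because the second call finds the binding created by the first one, the nested element is
+                expected to reuse the existing prefix and no second namespace declaration should be written.
+            </para>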
+        </section>
+    </chapter>
 </book>
\ No newline at end of file

