[PATCH]: DOM Level 3 Serializer Bug Fixes and Improvements
----------------------------------------------------------

                 Key: XALANJ-2337
                 URL: http://issues.apache.org/jira/browse/XALANJ-2337
             Project: XalanJ2
          Issue Type: Bug
          Components: DOM, Serialization
    Affects Versions: Latest Development Code
            Reporter: Michael Glavassevich
         Attachments: dom3-ls-serializer-fixes.patch.txt

The attached patch addresses the following issues:

The DOM Level 3 Load and Save specification requires that implementations 
choose a default end-of-line sequence [1] which matches one allowed by XML 1.0 
(or XML 1.1). The current code blindly accepts whatever the value of 
"line.separator" is. If that value isn't one of the XML 1.0 end-of-line 
sequences, we should select "\n" as the default.
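
A minimal sketch of that fallback, not the code from the attached patch. The 
class and method names are hypothetical, and the accepted set ("\n", "\r\n", 
"\r") is my reading of the XML 1.0 end-of-line sequences:

```java
// Hypothetical helper; the names are illustrative and not taken from the patch.
final class DefaultNewLine {
    private DefaultNewLine() {}

    /**
     * Returns sep if it is one of "\n", "\r\n" or "\r" (assumed here to be
     * the XML 1.0 end-of-line sequences); otherwise falls back to "\n".
     */
    static String chooseDefault(String sep) {
        if ("\n".equals(sep) || "\r\n".equals(sep) || "\r".equals(sep)) {
            return sep;
        }
        return "\n";
    }

    /** Reads "line.separator", treating a SecurityException as "not set". */
    static String fromSystemProperty() {
        String sep;
        try {
            sep = System.getProperty("line.separator");
        } catch (SecurityException e) {
            sep = null;
        }
        return chooseDefault(sep);
    }
}
```

The same value could also serve as the one restored when the newLine 
attribute is reset.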

Setting the newLine attribute [1] to null on an LSSerializer is supposed to 
restore the default value. The current code isn't doing that.

Support for the format-pretty-print feature is broken. Setting it to true with 
setParameter() has no effect. The wrong properties are being set internally. We 
need to set {http://www.w3.org/TR/DOM-Level-3-LS}format-pretty-print in order 
to enable the feature.
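
For reference, this is roughly how a caller would request the feature through 
the standard DOM Level 3 API. It is a sketch that runs against whatever 
LSSerializer the JDK provides, not specifically against the patched 
LSSerializerImpl; the class name is hypothetical, and the exact whitespace in 
the output depends on the implementation:

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.DOMConfiguration;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSSerializer;

// Hypothetical example class; only the org.w3c.dom calls are standard API.
final class PrettyPrintExample {
    private PrettyPrintExample() {}

    static String serialize(boolean pretty) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        Element root = doc.createElement("root");
        root.appendChild(doc.createElement("child"));
        doc.appendChild(root);

        // Assumes the DOM implementation supports the "LS" 3.0 feature.
        DOMImplementationLS ls =
                (DOMImplementationLS) doc.getImplementation().getFeature("LS", "3.0");
        LSSerializer serializer = ls.createLSSerializer();

        // "format-pretty-print" is the public parameter name; the internal
        // property it maps to is the serializer's own business.
        DOMConfiguration config = serializer.getDomConfig();
        Boolean value = Boolean.valueOf(pretty);
        if (config.canSetParameter("format-pretty-print", value)) {
            config.setParameter("format-pretty-print", value);
        }
        return serializer.writeToString(doc);
    }
}
```

With the bug described above, the setParameter() call would be accepted but 
the output would look the same for true and false.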

Revision 474947 to DOM3TreeWalker fixed a bug where a char[] was being 
converted to a String using toString(). This probably crept in due to the 
repeated conversions of the end-of-line sequence (String -> char[] -> String -> 
char[]). I don't understand why we keep flipping back and forth between String 
and char[], but it's a waste to create the temp string every time in the 
TreeWalker. I moved the conversion into the setter on DOM3SerializerImpl.
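
To illustrate the toString() pitfall and the "convert once in the setter" 
idea, here is a small standalone sketch. The class and its members are 
hypothetical and do not mirror the actual fields of DOM3SerializerImpl or 
DOM3TreeWalker:

```java
// Hypothetical holder for the end-of-line sequence; names are illustrative.
final class EndOfLineHolder {
    private char[] endOfLine = {'\n'};

    /** Converts the String to char[] once, when the value is set. */
    void setNewLine(String newLine) {
        endOfLine = newLine.toCharArray();
    }

    /** Appends the stored chars directly; no temporary String is created. */
    void writeNewLine(StringBuilder out) {
        out.append(endOfLine);
    }

    /** The bug: Object.toString() on an array yields "[C@" plus a hash code. */
    static String wrong(char[] chars) {
        return chars.toString();
    }

    /** The correct conversion copies the characters into a new String. */
    static String right(char[] chars) {
        return new String(chars);
    }
}
```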

LSSerializerImpl is creating OutputStreamWriters directly. This bypasses the 
underlying serializer's encoding handling and prevents it from using its 
optimized writers for UTF-8 and ASCII. I've changed the code so that it just 
sets the OutputStream and lets the real serializer deal with the encoding 
issues.

When writing to a file URI, LSSerializerImpl needs to decode the URI escape 
sequences before it creates the FileOutputStream. For a URI like 
"file:///D:/My%20Documents/file.xml" we should be writing to 
"D:\My Documents\file.xml", not "D:\My%20Documents\file.xml". The proposed 
change is based on code which has been in the Xerces DOM Level 3 serializer 
for over two years.
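
The difference between the escaped and decoded forms can be seen with 
java.net.URI. Note that this is a swapped-in illustration: the patch uses 
decoding code derived from Xerces, not java.net.URI, and the class name below 
is hypothetical. Mapping the decoded path to a platform file name (for 
example, dropping the leading slash before a Windows drive letter) is left out:

```java
import java.net.URI;

// Hypothetical helper showing raw vs. decoded URI paths.
final class FileUriDecoding {
    private FileUriDecoding() {}

    /** Path with %XX escapes decoded, e.g. "%20" becomes a space. */
    static String decodedPath(String fileUri) {
        return URI.create(fileUri).getPath();
    }

    /** Path exactly as written in the URI, escapes left in place. */
    static String rawPath(String fileUri) {
        return URI.create(fileUri).getRawPath();
    }
}
```

Opening a FileOutputStream on the raw form is what produces a file literally 
named with "%20" in it.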

On Java 1.4 we should be using the exception chaining mechanism to capture the 
cause of the LSException. This was implemented in Xerces 2.8.0 and should be 
carried forward into the Xalan-based serializer.
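
A minimal sketch of the chaining, assuming a runtime where 
Throwable.initCause() can be called directly. The helper class and method are 
hypothetical; the patch may well do this differently (for instance via 
reflection, to stay compatible with older runtimes):

```java
import org.w3c.dom.ls.LSException;

// Hypothetical factory; only LSException and initCause() are standard API.
final class LSExceptions {
    private LSExceptions() {}

    /** Wraps cause in an LSException with code SERIALIZE_ERR and chains it. */
    static LSException serializeError(Throwable cause) {
        LSException e = new LSException(LSException.SERIALIZE_ERR, cause.getMessage());
        // LSException has no (code, message, cause) constructor, so the
        // cause is attached afterwards.
        e.initCause(cause);
        return e;
    }
}
```

A caller catching the LSException can then reach the original failure through 
getCause() instead of relying on a stack trace printed to stderr.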

There are several printStackTrace() calls scattered around LSSerializerImpl. We 
should never be making these. I've removed all of them.

[1] 
http://www.w3.org/TR/2004/REC-DOM-Level-3-LS-20040407/load-save.html#LS-LSSerializer-newLine
