Fatal error is called for valid surrogate characters 
-----------------------------------------------------

         Key: XERCESJ-1063
         URL: http://issues.apache.org/jira/browse/XERCESJ-1063
     Project: Xerces2-J
        Type: Bug
  Components: Serialization  
    Versions: 2.6.2    
    Reporter: Serghei Balaban


I use XMLSerializer and XML11Serializer to generate XML messages. While 
testing my application, I found that valid surrogate characters can be 
misinterpreted in the printText( char[],int,int,boolean,boolean) methods of 
the XMLSerializer and XML11Serializer classes. In the following excerpt from 
XMLSerializer.java (CVS apache.org), the variable length is decremented while 
the variable start is incremented, so once start is no longer less than length 
the surrogate check is never called and the fatal error is raised instead:

    protected void printText( char[] chars, int start, int length,
                              boolean preserveSpace, boolean unescaped ) throws IOException {
        int index;
        char ch;

        if ( preserveSpace ) {
            // Preserving spaces: the text must print exactly as it is,
            // without breaking when spaces appear in the text and without
            // consolidating spaces. If a line terminator is used, a line
            // break will occur.
            while ( length-- > 0 ) {
                ch = chars[ start ];
                ++start;
                if (!XMLChar.isValid(ch)) {
                    // check if it is surrogate
                    if (++start <length) {
                        surrogates(ch, chars[start]);
                    } else {
                        fatalError("The character '"+(char)ch+"' is an invalid XML character");
                    }
                    continue;
                }
                if ( unescaped )
                    _printer.printText( ch );
                else
                    printXMLChar( ch );
            }
        } else {
            // Not preserving spaces: print one part at a time, and
            // use spaces between parts to break them into different
            // lines. Spaces at beginning of line will be stripped
            // by printing mechanism. Line terminator is treated
            // no different than other text part.
            while ( length-- > 0 ) {
                ch = chars[ start ];
                ++start;

                if (!XMLChar.isValid(ch)) {
                    // check if it is surrogate
                    if (++start <length) {
                        surrogates(ch, chars[start]);
                    } else {
                        fatalError("The character '"+(char)ch+"' is an invalid XML character");
                    }
                    continue;
                }
                if ( unescaped )
                    _printer.printText( ch );
                else
                    printXMLChar( ch );
            }
        }
    }
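
To make the index arithmetic in the excerpt easier to follow, here is a minimal standalone sketch. It is not Xerces code and not a proposed patch: the class and method names are hypothetical, Character.isSurrogate stands in for the !XMLChar.isValid(ch) test, and the calls to surrogates() and fatalError() are replaced by a boolean result. The first method copies the loop shape from the excerpt; the second shows one possible alternative that compares the index against a fixed end bound.

```java
// Hypothetical sketch; none of these names exist in Xerces.
final class SurrogateLoopSketch {

    // Same loop shape as the excerpt: length counts down while start counts up,
    // and the surrogate branch compares the two against each other.
    // Returns true where the excerpt would call fatalError().
    static boolean decrementingLoopReportsFatal(char[] chars, int start, int length) {
        while (length-- > 0) {
            char ch = chars[start];
            ++start;
            if (Character.isSurrogate(ch)) {   // assumption: stand-in for !XMLChar.isValid(ch)
                if (++start < length) {
                    // the excerpt calls surrogates(ch, chars[start]) here
                } else {
                    return true;               // the excerpt calls fatalError(...) here
                }
                continue;
            }
        }
        return false;
    }

    // One possible alternative: compute the end index once and compare against it.
    static boolean fixedBoundLoopReportsFatal(char[] chars, int start, int length) {
        int end = start + length;              // fixed bound, never modified
        while (start < end) {
            char ch = chars[start++];
            if (Character.isSurrogate(ch)) {
                if (Character.isHighSurrogate(ch)
                        && start < end
                        && Character.isLowSurrogate(chars[start])) {
                    start++;                   // consume the low half of the pair
                } else {
                    return true;               // unpaired surrogate
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // 'a', 'b', then a valid surrogate pair (U+1F600).
        char[] text = {'a', 'b', '\uD83D', '\uDE00'};
        System.out.println(decrementingLoopReportsFatal(text, 0, text.length)); // true
        System.out.println(fixedBoundLoopReportsFatal(text, 0, text.length));   // false
    }
}
```

With the four-character input above, the first loop reaches the high surrogate with start already at 3 and length already at 1, so the comparison fails even though the pair is valid. The second loop accepts the pair and still rejects a high surrogate that ends the buffer.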
