Assuming I have the following code, where I try to send the pound symbol as a numeric character reference:


    Vector    params = new Vector();
    Hashtable hashParams = new Hashtable();
    hashParams.put( "msg", "&#163;" );
    params.add( hashParams );

    client.execute( "XMLRPCHandler.getContent", params );

The problem is that the XML-RPC API sends
   &#163;
as
   &amp;#163;

Of course, this problem will happen to any entity-encoded character, not just the pound symbol.
The root of the problem is the org.apache.xmlrpc.XmlWriter.chardata() method, which does the following:


    /**
     * Writes text as <code>PCDATA</code>.
     *
     * @param text The data to write.
     * @exception XmlRpcException Unsupported character data found.
     * @exception IOException Problem writing data.
     */
    protected void chardata(String text)
        throws XmlRpcException, IOException
    {
        int l = text.length ();
        for (int i = 0; i < l; i++)
        {
            char c = text.charAt (i);
            switch (c)
            {
            case '\t':
            case '\r':
            case '\n':
                write(c);
                break;
            case '<':
                write(LESS_THAN_ENTITY);
                break;
            case '>':
                write(GREATER_THAN_ENTITY);
                break;
            case '&':
                write(AMPERSAND_ENTITY);
                break;
            default:
                if (c < 0x20 || c > 0xff)
                {
                    // Though the XML-RPC spec allows any ASCII
                    // characters except '<' and '&', the XML spec
                    // does not allow this range of characters,
                    // resulting in a parse error from most XML
                    // parsers.
                    throw new XmlRpcException(0, "Invalid character data " +
                                              "corresponding to XML entity &#" +
                                              String.valueOf((int) c) + ';');
                }
                else
                {
                    write(c);
                }
            }
        }
    }
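To make the effect concrete, here is a minimal standalone sketch of the same escaping switch. It is not the library code: the class name ChardataEscapeDemo is hypothetical, the entity strings are assumed to be the usual "&lt;", "&gt;" and "&amp;", the range check is left out, and the result is returned as a String rather than written to a stream.

```java
public class ChardataEscapeDemo {

    // Hypothetical stand-in for the switch in chardata(): escapes the three
    // markup characters and copies everything else through unchanged.
    static String escape(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
            case '<':
                out.append("&lt;");   // assumed value of LESS_THAN_ENTITY
                break;
            case '>':
                out.append("&gt;");   // assumed value of GREATER_THAN_ENTITY
                break;
            case '&':
                out.append("&amp;");  // assumed value of AMPERSAND_ENTITY
                break;
            default:
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The '&' that starts the character reference is itself escaped,
        // so the receiver no longer sees a reference at all.
        System.out.println(escape("&#163;")); // prints &amp;#163;
    }
}
```

Any string that already contains a character reference goes through the '&' case first, which is why the reference never survives.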



Instead, chardata() should follow the same logic as Apache Xerces.
In org.apache.xml.serialize.BaseMarkupSerializer:


    /**
     * Escapes a string so it may be printed as text content or attribute
     * value. Non printable characters are escaped using character references.
     * Where the format specifies a default entity reference, that reference
     * is used (e.g. <tt>&amp;lt;</tt>).
     *
     * @param source The string to escape
     */
    protected void printEscaped( String source )
        throws IOException
    {
        for ( int i = 0 ; i < source.length() ; ++i ) {
            int ch = source.charAt(i);
            if ((ch & 0xfc00) == 0xd800 && i+1 < source.length()) {
                int lowch = source.charAt(i+1);
                if ((lowch & 0xfc00) == 0xdc00) {
                    ch = 0x10000 + ((ch-0xd800)<<10) + lowch-0xdc00;
                    i++;
                }
            }
            printEscaped(ch);
        }
    }


    protected void printEscaped( int ch )
        throws IOException
    {
        String charRef;
        // If there is a suitable entity reference for this
        // character, print it. The list of available entity
        // references is almost but not identical between
        // XML and HTML.
        charRef = getEntityRef( ch );
        if ( charRef != null ) {
            _printer.printText( '&' );
            _printer.printText( charRef );
            _printer.printText( ';' );
        } else if ( ( ch >= ' ' && _encodingInfo.isPrintable((char)ch) && ch != 0xF7 ) ||
                    ch == '\n' || ch == '\r' || ch == '\t' ) {
            // Non printables are below ASCII space but not tab or line
            // terminator, ASCII delete, or above a certain Unicode threshold.
            if (ch < 0x10000) {
                _printer.printText((char)ch );
            } else {
                _printer.printText((char)(((ch-0x10000)>>10)+0xd800));
                _printer.printText((char)(((ch-0x10000)&0x3ff)+0xdc00));
            }
        } else {
            // The character is not printable, print as character reference.
            _printer.printText( "&#x" );
            _printer.printText(Integer.toHexString(ch));
            _printer.printText( ';' );
        }
    }
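As a rough illustration of what that logic could look like if applied to chardata(), here is a self-contained sketch. It is only an assumption of how a fix might be shaped, not a patch against XmlWriter: the class name CharRefEscaper is hypothetical, it returns a String instead of writing to the stream, and it replaces the encoding-aware isPrintable() check with a simple "printable ASCII" test.

```java
public class CharRefEscaper {

    // Hypothetical escaper following the Xerces approach: markup characters
    // become entity references, printable ASCII is copied through, and
    // everything else is written as a hexadecimal character reference.
    static String escape(String source) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < source.length(); i++) {
            int ch = source.charAt(i);
            // Combine a surrogate pair into a single code point.
            if ((ch & 0xfc00) == 0xd800 && i + 1 < source.length()) {
                int low = source.charAt(i + 1);
                if ((low & 0xfc00) == 0xdc00) {
                    ch = 0x10000 + ((ch - 0xd800) << 10) + (low - 0xdc00);
                    i++;
                }
            }
            switch (ch) {
            case '<':
                out.append("&lt;");
                break;
            case '>':
                out.append("&gt;");
                break;
            case '&':
                out.append("&amp;");
                break;
            case '\t':
            case '\r':
            case '\n':
                out.append((char) ch);
                break;
            default:
                if (ch >= 0x20 && ch < 0x7f) {
                    // Assumption: only printable ASCII is written literally.
                    out.append((char) ch);
                } else {
                    // Like the Xerces code, this does not reject characters
                    // that are illegal in XML 1.0; it only references them.
                    out.append("&#x").append(Integer.toHexString(ch)).append(';');
                }
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The caller passes the literal pound sign (U+00A3).
        System.out.println(escape("\u00a3")); // prints &#xa3;
    }
}
```

With escaping done this way, the caller would pass the literal character and leave the choice of character reference to the writer, instead of encoding the reference by hand.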



--
Jesus M. Salvo Jr.
Mobile Internet Group Pty Ltd
(formerly Softgame International Pty Ltd)
M: +61 409 126699
T: +61 2 94604777
F: +61 2 94603677

PGP Public key: http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xC0BA5348




