StringEscapeUtils.unescapeXml(str) does not support supplemental characters.
----------------------------------------------------------------------------

                 Key: LANG-729
                 URL: https://issues.apache.org/jira/browse/LANG-729
             Project: Commons Lang
          Issue Type: Improvement
          Components: lang.*
    Affects Versions: 2.6
            Reporter: Taro Yabuki
            Priority: Trivial
         Attachments: lang_2_6_unescapexml_20110716.diff

StringEscapeUtils.unescapeXml(str) does not unescape numeric character 
references of supplementary characters (code points above U+FFFF):

String str2 = StringEscapeUtils.unescapeXml("&#144308;");
System.out.println(str2.codePointAt(0));
// prints 38, the code point of '&'

The output should be 144308 (U+233B4).
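
For illustration, here is a minimal sketch of how a numeric character 
reference above U+FFFF could be decoded with plain JDK calls. It is not the 
attached patch and not the Commons Lang implementation; the class 
NumericEntityDemo and the method unescapeNumericEntities are hypothetical 
names introduced for this example only.

```java
public class NumericEntityDemo {

    // Hypothetical helper, not part of Commons Lang: decodes decimal (&#NNN;)
    // and hexadecimal (&#xHHH;) numeric character references.
    public static String unescapeNumericEntities(String input) {
        StringBuilder out = new StringBuilder(input.length());
        int i = 0;
        while (i < input.length()) {
            // Look for "&#" at the current position and its closing ';'.
            int end = input.startsWith("&#", i) ? input.indexOf(';', i + 2) : -1;
            if (end > i + 2) {
                String digits = input.substring(i + 2, end);
                boolean hex = digits.startsWith("x") || digits.startsWith("X");
                try {
                    int codePoint = hex
                            ? Integer.parseInt(digits.substring(1), 16)
                            : Integer.parseInt(digits, 10);
                    if (Character.isValidCodePoint(codePoint)) {
                        // appendCodePoint writes a surrogate pair when the
                        // code point is above U+FFFF.
                        out.appendCodePoint(codePoint);
                        i = end + 1;
                        continue;
                    }
                } catch (NumberFormatException e) {
                    // Not a well-formed reference: copy the '&' literally below.
                }
            }
            out.append(input.charAt(i));
            i++;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String decoded = unescapeNumericEntities("&#144308;");
        System.out.println(decoded.codePointAt(0)); // 144308
        System.out.println(decoded.length());       // 2 (one surrogate pair)
    }
}
```

The key point is that the decoded value is appended as a code point 
(StringBuilder.appendCodePoint) rather than cast to a single char, so values 
above 0xFFFF become a surrogate pair instead of being truncated or rejected.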

Currently, StringEscapeUtils.unescapeXml(StringEscapeUtils.escapeXml(str)) is 
equal to str, so nothing appears to be wrong. However, as we reported in 
LANG-728, StringEscapeUtils.escapeXml(str) has a bug, and once that bug is 
fixed, StringEscapeUtils.unescapeXml(StringEscapeUtils.escapeXml(str)) will no 
longer be equal to str for input containing supplementary characters. That is 
not what we expect. (Of course, we do not expect 
StringEscapeUtils.unescapeXml(StringEscapeUtils.escapeXml(str)) to be equal to 
str in every case.)
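
To show the kind of input the unescaping side would then have to handle, here 
is a sketch of escaping by code point rather than by UTF-16 char. It assumes 
that a fix for LANG-728 would emit a single reference per supplementary 
character; that is an assumption, not something confirmed here. The class 
CodePointEscapeDemo and the method escapeNonAscii are hypothetical, and the 
helper deliberately ignores '&', '<' and '>' to keep the example short.

```java
public class CodePointEscapeDemo {

    // Hypothetical helper, not the Commons Lang implementation: writes every
    // code point above 0x7F as one decimal numeric character reference.
    public static String escapeNonAscii(String input) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < input.length(); ) {
            // codePointAt combines a surrogate pair into one code point.
            int codePoint = input.codePointAt(i);
            if (codePoint > 0x7F) {
                out.append("&#").append(codePoint).append(';');
            } else {
                out.append((char) codePoint);
            }
            // Advance by 2 chars for a supplementary character, else by 1.
            i += Character.charCount(codePoint);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String original = "a" + new String(Character.toChars(0x233B4));
        System.out.println(escapeNonAscii(original)); // a&#144308;
    }
}
```

An unescaper that only accepts values up to 0xFFFF would leave "&#144308;" 
untouched, which is the round-trip failure described above.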

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
