[ 
https://issues.apache.org/jira/browse/LANG-898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Martin Barrs updated LANG-898:
------------------------------

    Description: 
{noformat} 
In an escaped XML string with escaped whitespace, in this case linefeed ( 
&#10; ), escapexml and unescapexml treat the linefeed inconsistently. 

unescape converts &#10; to a linefeed, yet escapexml does not convert 
linefeed back to &#10;

I've put spaces between the & and # and 10 in this bug as Jira will interpret 
it: 
& # 10;

Here's code and output...


public static void main(String[] args) {
        String escaped =
                "&lt;?xml version=&quot;1.0&quot; 
encoding=&quot;iso-8859-1&quot;?&gt;&#10;&lt;?xml version=&quot;1.0&quot; 
encoding=&quot;iso-8859-1&quot;?&gt;";

        System.out.println(escaped);
        System.out.println();
        System.out.println(StringEscapeUtils.unescapeXml(escaped));
        System.out.println();
        System.out.println(StringEscapeUtils.escapeXml(StringEscapeUtils
                .unescapeXml(escaped)));

    }
    

Output:

&lt;?xml version=&quot;1.0&quot; 
encoding=&quot;iso-8859-1&quot;?&gt;&#10;&lt;?xml version=&quot;1.0&quot; 
encoding=&quot;iso-8859-1&quot;?&gt;

<?xml version="1.0" encoding="iso-8859-1"?>
<?xml version="1.0" encoding="iso-8859-1"?>

&lt;?xml version=&quot;1.0&quot; encoding=&quot;iso-8859-1&quot;?&gt;
&lt;?xml version=&quot;1.0&quot; encoding=&quot;iso-8859-1&quot;?&gt;
{noformat} 

  was:
{noformat} 
In an escaped XML string with escaped whitespace, in this case linefeed ( & # 
10; ), escapexml and unescapexml treat the linefeed inconsistently. 

unescape converts & # 10; to a linefeed, yet escapexml does not convert 
linefeed back to &#10;

I've put spaces between the & and # and 10 in this bug as Jira will interpret 
it: 
& # 10;

Here's code and output...


public static void main(String[] args) {
        String escaped =
                "&lt;?xml version=&quot;1.0&quot; 
encoding=&quot;iso-8859-1&quot;?&gt;& #10 ;&lt;?xml version=&quot;1.0&quot; 
encoding=&quot;iso-8859-1&quot;?&gt;";

        System.out.println(escaped);
        System.out.println();
        System.out.println(StringEscapeUtils.unescapeXml(escaped));
        System.out.println();
        System.out.println(StringEscapeUtils.escapeXml(StringEscapeUtils
                .unescapeXml(escaped)));

    }
    

Output:

&lt;?xml version=&quot;1.0&quot; 
encoding=&quot;iso-8859-1&quot;?&gt;&#10;&lt;?xml version=&quot;1.0&quot; 
encoding=&quot;iso-8859-1&quot;?&gt;

<?xml version="1.0" encoding="iso-8859-1"?>
<?xml version="1.0" encoding="iso-8859-1"?>

&lt;?xml version=&quot;1.0&quot; encoding=&quot;iso-8859-1&quot;?&gt;
&lt;?xml version=&quot;1.0&quot; encoding=&quot;iso-8859-1&quot;?&gt;
{noformat} 

    
> StringEscapeUtils un/escapexml inconsistant with escaped whitespace
> -------------------------------------------------------------------
>
>                 Key: LANG-898
>                 URL: https://issues.apache.org/jira/browse/LANG-898
>             Project: Commons Lang
>          Issue Type: Bug
>          Components: lang.*
>    Affects Versions: 3.1
>         Environment: Windows 7, Java 7
>            Reporter: Martin Barrs
>
> {noformat} 
> In an escaped XML string with escaped whitespace, in this case linefeed ( 
> &#10; ), escapexml and unescapexml treat the linefeed inconsistently. 
> unescape converts &#10; to a linefeed, yet escapexml does not convert 
> linefeed back to &#10;
> I've put spaces between the & and # and 10 in this bug as Jira will interpret 
> it: 
> & # 10;
> Here's code and output...
> public static void main(String[] args) {
>         String escaped =
>                 "&lt;?xml version=&quot;1.0&quot; 
> encoding=&quot;iso-8859-1&quot;?&gt;&#10;&lt;?xml version=&quot;1.0&quot; 
> encoding=&quot;iso-8859-1&quot;?&gt;";
>         System.out.println(escaped);
>         System.out.println();
>         System.out.println(StringEscapeUtils.unescapeXml(escaped));
>         System.out.println();
>         System.out.println(StringEscapeUtils.escapeXml(StringEscapeUtils
>                 .unescapeXml(escaped)));
>     }
>     
> Output:
> &lt;?xml version=&quot;1.0&quot; 
> encoding=&quot;iso-8859-1&quot;?&gt;&#10;&lt;?xml version=&quot;1.0&quot; 
> encoding=&quot;iso-8859-1&quot;?&gt;
> <?xml version="1.0" encoding="iso-8859-1"?>
> <?xml version="1.0" encoding="iso-8859-1"?>
> &lt;?xml version=&quot;1.0&quot; encoding=&quot;iso-8859-1&quot;?&gt;
> &lt;?xml version=&quot;1.0&quot; encoding=&quot;iso-8859-1&quot;?&gt;
> {noformat} 
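
For illustration only, here is a minimal, dependency-free sketch of the round trip the report appears to expect. It does not use Commons Lang; the class LinefeedRoundTrip and the methods escapeKeepingLinefeeds and unescape are hypothetical names invented for this example, and they handle only the five predefined XML entities plus the linefeed reference.

```java
public class LinefeedRoundTrip {

    // Hypothetical helper, NOT part of Commons Lang: escapes the five
    // predefined XML entities and also writes each linefeed as &#10;.
    static String escapeKeepingLinefeeds(String input) {
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                case '\n': out.append("&#10;");  break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    // Hypothetical inverse that understands only the references produced above.
    static String unescape(String input) {
        return input.replace("&#10;", "\n")
                .replace("&lt;", "<")
                .replace("&gt;", ">")
                .replace("&quot;", "\"")
                .replace("&apos;", "'")
                // &amp; is handled last so that "&amp;lt;" becomes "&lt;", not "<".
                .replace("&amp;", "&");
    }

    public static void main(String[] args) {
        // Same input as in the report, written as a valid Java literal.
        String escaped =
                "&lt;?xml version=&quot;1.0&quot; encoding=&quot;iso-8859-1&quot;?&gt;"
                + "&#10;"
                + "&lt;?xml version=&quot;1.0&quot; encoding=&quot;iso-8859-1&quot;?&gt;";

        String roundTripped = escapeKeepingLinefeeds(unescape(escaped));

        // Prints "true": the linefeed survives as &#10; after the round trip.
        System.out.println(roundTripped.equals(escaped));
    }
}
```

With a real library the same effect could presumably be had by post-processing the escaped output, but that is an assumption and is not something the report confirms.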

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
