On Tuesday 30 December 2008 08:24, j16sdiz at freenetproject.org wrote:
> Author: j16sdiz
> Date: 2008-12-30 08:24:13 +0000 (Tue, 30 Dec 2008)
> New Revision: 24846
> 
> Modified:
>    trunk/freenet/src/freenet/support/HTMLEncoder.java
> Log:
> encodeXML() method
> 
> Modified: trunk/freenet/src/freenet/support/HTMLEncoder.java
> ===================================================================
> --- trunk/freenet/src/freenet/support/HTMLEncoder.java        2008-12-30 07:26:32 UTC (rev 24845)
> +++ trunk/freenet/src/freenet/support/HTMLEncoder.java        2008-12-30 08:24:13 UTC (rev 24846)
> @@ -14,7 +14,7 @@
>  public class HTMLEncoder {
>       public final static CharTable charTable = 
>               new CharTable(HTMLEntities.encodeMap);
> -
> +     
>       public static String encode(String s) {
>               int n = s.length();
>               StringBuilder sb = new StringBuilder(n);
> @@ -41,6 +41,28 @@
>               }
>               
>       }
> +
> +     /**
> +      * Encode String so it is safe to be used in XML attribute value and text.
> +      * 
> +      * HTMLEncode.encode() use some HTML-specific entities (e.g. &) hence not suitable for
> +      * generic XML.
> +      */
> +     public static String encodeXML(String s) {
> +             // Extensible Markup Language (XML) 1.0 (Fifth Edition)
> +             // [10]         AttValue           ::=          '"' ([^<&"] | Reference)* '"'
> +             //                                                              |   "'" ([^<&'] | Reference)* "'"
> +             // [14]         CharData           ::=          [^<&]* - ([^<&]* ']]>' [^<&]*)
> +             s = s.replace("&", "&#38;");
> +
> +             s = s.replace("\"", "&#34;");
> +             s = s.replace("'", "&#39;");
> +
> +             s = s.replace("<", "&#60;");
> +             s = s.replace(">", "&#62;"); // CharData can't contain ']]>'
> +
> +             return s;
> +     }

Why is this a blacklist rather than a whitelist? Why does it not encode double 
quotes, or newlines? In other words is it safe if fed arbitrary 
attacker-specified data? If not, please clearly label it.
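
To illustrate what I mean by a whitelist, here is a rough sketch, not a proposed patch. The class name XmlWhitelistEncoder and the particular set of "safe" characters are my own assumptions and do not exist in HTMLEncoder; the idea is only that a small known-safe set passes through unchanged and everything else, including quotes and newlines, becomes a numeric character reference.

```java
// Hypothetical sketch of a whitelist-based XML encoder.
// Nothing here is part of freenet.support.HTMLEncoder.
final class XmlWhitelistEncoder {
    private XmlWhitelistEncoder() {}

    // Assumed safe set: ASCII letters, digits and a little punctuation.
    private static boolean isSafe(int cp) {
        return (cp >= 'a' && cp <= 'z')
            || (cp >= 'A' && cp <= 'Z')
            || (cp >= '0' && cp <= '9')
            || cp == ' ' || cp == '.' || cp == ',' || cp == '-' || cp == '_';
    }

    // Code points allowed by the XML 1.0 Char production.
    private static boolean isXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    static String encode(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            i += Character.charCount(cp);
            if (isSafe(cp)) {
                sb.appendCodePoint(cp);
            } else if (isXmlChar(cp)) {
                // Anything not explicitly whitelisted becomes a decimal reference.
                sb.append("&#").append(cp).append(';');
            } else {
                // Not legal in XML 1.0 even as a reference (e.g. NUL or an
                // unpaired surrogate): substitute U+FFFD.
                sb.append("&#65533;");
            }
        }
        return sb.toString();
    }
}
```

With this shape, a newline in an attribute value comes out as &#10; and so survives attribute-value normalization, and a character nobody thought about is escaped by default rather than passed through.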

>               
>       private final static class CharTable{
>               private char[] chars;