Re: [gwt-contrib] Re: Added all safehtml packages. (issue771801)

Arthur Kalmenson Thu, 19 Aug 2010 07:44:20 -0700

I'm just wondering, the idea of this is to display potentially unsafe
HTML we got from the server, right? Is there a way to discourage using
this to sanitizing HTML on the client before sending it to the server?
I can definitely see newer programmers assuming that doing this
sanitization on the client side is safe enough, when it's clearly not.
I guess some documentation to that effect would be enough.


--
Arthur Kalmenson



On Wed, Aug 18, 2010 at 10:11 AM, Christoph Kern <x...@google.com> wrote:
>
>
> On Wed, Aug 18, 2010 at 01:44, <t.bro...@gmail.com> wrote:
>>
>> On 2010/08/17 23:23:39, xtof wrote:
>>>
>>> On 2010/08/17 23:05:06, tbroyer wrote:
>>> >
>>> > Looking at the code more closely it would merely "fail" by overly
>>> > rejecting tags that are whitelisted: i.e. "<b foo=<i>should be
>>> > bold" would be sanitized to "&lt;b foo=<i>should be bold" and the
>>> > end part would be italicized instead of bold.
>>
>>> And that is exactly correct behavior for this class as document. It
>>> only claims to accept HTML with attribute-free tags within the
>>> whitelist. It doesn't claim to do anything particularly sensible
>>> with input that doesn't fit this constraint; it does however claim
>>> that whatever it outputs is safe (will not result in XSS/script
>>> execution).
>>
>> Oops, yes, sorry, I can't tell how it happened but I misread the
>> "whitelisting" code (matches the whole thing between '<' and '>', so any
>> attribute, or even whitespace, as in "<b >bold</b>", would make it fail
>> and thus be escaped).
>> It's fine then. Sorry again for the noise.
>
> No prob, always better to err on the side of caution!
>
>>
>> Still, there's a small issue with the fact that
>> SafeHtmlTemplatesGenerator doesn't use the HTML5 serialization algorithm
>> (or any similar one): @Template("<br/>") will result in "<br></br>"
>> which is interpreted by browsers as "<br><br>" [1], which makes it
>> impossible to generate a single "line break" in a SafeHtmlTemplates.
>>
>> [1] http://www.w3.org/TR/html5/tokenization.html#parsing-main-inbody
>> (search for « An end tag whose tag name is "br" », it's there for
>> "compat with the Web")
>
> Agreed.  Another potential problem with this parser is that it's too strict
> -- it insists on well-formed XHTML, but that's a much stronger constraint
> than needed to ensure the SafeHtml type contract.
> For example,
> �...@template("<a href=\"{0}\">")
>   SafeHtml openATag(String href);
> is perfectly safe, and might be useful in something where one needs to
> conditionally assemble various pieces of HTML markup,  like,
>   SafeHtmlBuilder shb;
>   //...
>   shb.append(openATag(someUrl));
>   if (...) {
>     shb.append(someThing)
>   } else {
>    shb.append(someThingElse);
>   }
> To ensure the SafeHtml type contract, the parser need only record the
> HTML-context ("inner text", "url-valued attribute", "style", etc) of the
> positional parameters, and require that the template ends in "inner text"
> context (to ensure that SafeHtmls are safely appendable to each other).
> Note that a bug in code like the above could result in non-well-formed HTML
> (like unbalanced tags), but not unsafe HTML that might cause XSS (that is of
> course absent bugs in the SafeHtml generators themselves).
> As I'd mentioned, I'm proposing to replace the current XML based parser with
> the java streamhtmlparser, which has this property.
> I'd be fine with removing the templating code from this change and land it
> separately along with the new parser.
> cheers,
> --xtof
>
>
>>
>>
>> http://gwt-code-reviews.appspot.com/771801/show
>
> --
> http://groups.google.com/group/Google-Web-Toolkit-Contributors

-- 
http://groups.google.com/group/Google-Web-Toolkit-Contributors

Re: [gwt-contrib] Re: Added all safehtml packages. (issue771801)

Reply via email to