[gwt-contrib] Re: Add SafeUri type, similar to SafeHtml but for values in a URL attribute context. (issue1380806)

Christoph Kern Mon, 11 Apr 2011 16:18:18 -0700

On Sun, Apr 10, 2011 at 17:02, <t.bro...@gmail.com> wrote:

> On 2011/04/10 18:32:38, xtof wrote:
>
>> I think this is pretty much ready, except for one thing I just thought
>>
> of.
>
>> Sorry, I should have thought of that earlier :/
>>
>
>  In ClippedImageImpl, we're using a SafeUri in the context of a url()
>>
> style
>
>> expression ("background: url({3})"). However, the contract of SafeUri
>>
> is
>
>> currently not tight enough for this to be safe I think:  SafeUri would
>>
> allow
>
>> e.g., parentheses, colons and dashes in un-escaped form in the URL.
>>
> In a
>
>> template where the URL appears in a url() css expression as above, a
>>
> URL like,
>
>> http://harmless.com/foo);background:url(javascript:evil());
>> would pass UriUtils#sanitizeUri (without getting URI-escaped or
>>
> anything), and
>
>> would result in script execution via CSS injection.
>>
>
> I hadn't looked closely at the SafeStyles API and thought it would be
> covered there (and ClippedImageImpl would of course be updated to use
> SafeStyles).
> But note that, similarly, http://harmless.com"onclick="evil()" would
> pass sanitizeUri and could result in script execution via HTML injection
> (not through SafeHtml/SafeHtmlTemplates though, as the quotes would be
> escaped).
>


Ok, I think you've convinced me :)

I think what you're saying is essentially this: The SafeUri contract only
promises that a URL will not result in script execution if dereferenced
as-is (which in practice means that its scheme is on a whitelist of ones
that don't cause script exec, and in particular isn't javascript:, etc).
However, when a SafeUri is embedded in a specific context, it *still* needs
to be appropriately escaped.

Which is what happens for uri-valued attribute context in SafeHtmlTemplates,
and I agree with you that we should treat URI-in-CSS context the same way,
and deal with any necessary escaping (or sanitization) there.

I do think that it might be worth to constrain SafeUri to only allow
syntactically valid URIs, or at least only strings consisting of characters
valid in URIs per the RFC.  Unfortunately, GWT doesn't implement
java.net.URI and the former might be hard to do efficiently in browser-side
code; I can't think of any reason the latter isn't sufficient.

If we're both in agreement on that, I think we should change the doc of the
SafeUri contract to be a bit more specific about that.  I.e., instead of

 18 /**
 19  * An object that implements this interface encapsulates a URI that is
 20  * guaranteed to be safe to use (with respect to potential
Cross-Site-Scripting
 21  * vulnerabilities) in a URL context, for example in a URL-typed
attribute in
 22  * an HTML document.

more something along the lines:

/**
An object that implements this interface encapsulates a lexically valid
(with respect to the set of valid characters specified RFC 3986) URI that
when dereferenced, will not result in browser-side execution of script that
is not under program control.

<p>
Note that this type's contract does not constrain the set of characters that
may appear in the URI beyond the requirements of RFC 3986. If the value of a
SafeUri object is used in certain contexts of a HTML document, appropriate
escaping must be applied.


>
> The CSS spec says whitespace, quotes (single and double) and parentheses
> can be escaped using a backslash:
> http://www.w3.org/TR/CSS2/syndata.html#uri

I believe I've read somewhere that this doesn't work in IE, but I can't find
the reference now...

One problem I'd worried about is this:  parentheses are legal in URIs per
the RFC.  At the same time, \xxxxxx escaping them in the CSS seems to not
actually be sufficient; such escaping doesn't prevent the string from being
interpreted as CSS syntax (what were they thinking???), see
http://code.google.com/p/browsersec/wiki/Part1#CSS_character_encoding.
I think we'll just have to %-escape parentheses and single quotes when using
a SafeUri in a CSS url(). I'm not really aware of them meaning anything
special in *http* URIs; even if RFC 3986 says that %-escaping parentheses
won't result in an equivalent URI I don't know of any reason doing so would
result in a different resource being addressed on a web server.


>
> So maybe we could special-case SafeUri in CSS contexts in the generator?
>
I think we should just ban it (or at least warn), since we don't parse the
CSS and can't really tell what context in the CSS the URI is used in.  A
safe style builder API would know, and would escape the URI correctly.

(we already special-case it for it's .asString(), and special-case URL
> contexts for non-SafeUri values to sanitize them, so...)
>
>
>  So I think we need to,
>> - tighten the SafeUri contract so it explicitly specifies which
>>
> character set
>
>> can occur in a SafeUri
>> - add a sanitizeAndEscapeUri method to UriUtils that both checks for
>>
> an allowed
>
>> scheme and URI/%-escapes characters not in the allowed charset (or
>>
> maybe rejects
>
>> URLs with certain funny characters in them).
>>
>
> We can quite simply %-escape those chars that are special to CSS (see
> above), or use the more radical approach of re-%-encode the whole URI:
> i.e. %-encode any "funny char" (except % chars that are not followed by
> a pair of hex digits; similar to SafeHtmlUtils#htmlEscapeAllowEntities,
> but for %-escaping).
> Or we can reverse-engineer the (proposed) Web Address algorithm to only
> %-encode those chars that would cause issues:
> http://www.w3.org/html/wg/href/draft continued at
> http://tools.ietf.org/html/draft-ietf-iri-3987bis-05#section-7.2
>
> But I wonder if this shouldn't be the rule for sanitizeUri (instead of
> introducing a new method and leaving a "hole" in sanitizeUri under
> certain circumstances).
>
I think you're right; sanitizeUri should %-escape characters not allowed by
RFC 3986.  Looks like passing the string through
http://google-web-toolkit.googlecode.com/svn/javadoc/latest/com/google/gwt/http/client/URL.html#encode(java.lang.String)would
be sufficient?  Anyway, I'll take a TODO item for this.

I'm still wondering if we should properly parse the whole URL and insist
it's syntactically valid; when I wrote sanitizeUri I chose not to because
there's no easily available way to do it in GWT (since java.net.URI isn't
implemented), and it would seem to be fairly costly to do.

There should probably be an encodeAllowEscapes too, similar to
> htmlEscapeAllowEntities, when e.g. you know your URL is safe, but you're
> unsure if it's "encoded enough" for every context (including CSS), i.e.
> the URL could contain unescaped parentheses or single quotes.
>
> Should we defer this patch? or tackle this in a following one? (maybe
> adding a TODO in
> fromTrustedString/fromString/fromUntrustedString/sanitizeUri?)
>
>
>
> http://gwt-code-reviews.appspot.com/1380806/
>

-- 
http://groups.google.com/group/Google-Web-Toolkit-Contributors

[gwt-contrib] Re: Add SafeUri type, similar to SafeHtml but for values in a URL attribute context. (issue1380806)

Reply via email to