On Sun, Apr 10, 2011 at 17:02, <t.bro...@gmail.com> wrote: > On 2011/04/10 18:32:38, xtof wrote: > >> I think this is pretty much ready, except for one thing I just thought >> > of. > >> Sorry, I should have thought of that earlier :/ >> > > In ClippedImageImpl, we're using a SafeUri in the context of a url() >> > style > >> expression ("background: url({3})"). However, the contract of SafeUri >> > is > >> currently not tight enough for this to be safe I think: SafeUri would >> > allow > >> e.g., parentheses, colons and dashes in un-escaped form in the URL. >> > In a > >> template where the URL appears in a url() css expression as above, a >> > URL like, > >> http://harmless.com/foo);background:url(javascript:evil()); >> would pass UriUtils#sanitizeUri (without getting URI-escaped or >> > anything), and > >> would result in script execution via CSS injection. >> > > I hadn't looked closely at the SafeStyles API and thought it would be > covered there (and ClippedImageImpl would of course be updated to use > SafeStyles). > But note that, similarly, http://harmless.com"onclick="evil()" would > pass sanitizeUri and could result in script execution via HTML injection > (not through SafeHtml/SafeHtmlTemplates though, as the quotes would be > escaped). >
Ok, I think you've convinced me :) I think what you're saying is essentially this: The SafeUri contract only promises that a URL will not result in script execution if dereferenced as-is (which in practice means that its scheme is on a whitelist of ones that don't cause script exec, and in particular isn't javascript:, etc). However, when a SafeUri is embedded in a specific context, it *still* needs to be appropriately escaped. Which is what happens for uri-valued attribute context in SafeHtmlTemplates, and I agree with you that we should treat URI-in-CSS context the same way, and deal with any necessary escaping (or sanitization) there. I do think that it might be worth to constrain SafeUri to only allow syntactically valid URIs, or at least only strings consisting of characters valid in URIs per the RFC. Unfortunately, GWT doesn't implement java.net.URI and the former might be hard to do efficiently in browser-side code; I can't think of any reason the latter isn't sufficient. If we're both in agreement on that, I think we should change the doc of the SafeUri contract to be a bit more specific about that. I.e., instead of 18 /** 19 * An object that implements this interface encapsulates a URI that is 20 * guaranteed to be safe to use (with respect to potential Cross-Site-Scripting 21 * vulnerabilities) in a URL context, for example in a URL-typed attribute in 22 * an HTML document. more something along the lines: /** An object that implements this interface encapsulates a lexically valid (with respect to the set of valid characters specified RFC 3986) URI that when dereferenced, will not result in browser-side execution of script that is not under program control. <p> Note that this type's contract does not constrain the set of characters that may appear in the URI beyond the requirements of RFC 3986. If the value of a SafeUri object is used in certain contexts of a HTML document, appropriate escaping must be applied. > > The CSS spec says whitespace, quotes (single and double) and parentheses > can be escaped using a backslash: > http://www.w3.org/TR/CSS2/syndata.html#uri I believe I've read somewhere that this doesn't work in IE, but I can't find the reference now... One problem I'd worried about is this: parentheses are legal in URIs per the RFC. At the same time, \xxxxxx escaping them in the CSS seems to not actually be sufficient; such escaping doesn't prevent the string from being interpreted as CSS syntax (what were they thinking???), see http://code.google.com/p/browsersec/wiki/Part1#CSS_character_encoding. I think we'll just have to %-escape parentheses and single quotes when using a SafeUri in a CSS url(). I'm not really aware of them meaning anything special in *http* URIs; even if RFC 3986 says that %-escaping parentheses won't result in an equivalent URI I don't know of any reason doing so would result in a different resource being addressed on a web server. > > So maybe we could special-case SafeUri in CSS contexts in the generator? > I think we should just ban it (or at least warn), since we don't parse the CSS and can't really tell what context in the CSS the URI is used in. A safe style builder API would know, and would escape the URI correctly. (we already special-case it for it's .asString(), and special-case URL > contexts for non-SafeUri values to sanitize them, so...) > > > So I think we need to, >> - tighten the SafeUri contract so it explicitly specifies which >> > character set > >> can occur in a SafeUri >> - add a sanitizeAndEscapeUri method to UriUtils that both checks for >> > an allowed > >> scheme and URI/%-escapes characters not in the allowed charset (or >> > maybe rejects > >> URLs with certain funny characters in them). >> > > We can quite simply %-escape those chars that are special to CSS (see > above), or use the more radical approach of re-%-encode the whole URI: > i.e. %-encode any "funny char" (except % chars that are not followed by > a pair of hex digits; similar to SafeHtmlUtils#htmlEscapeAllowEntities, > but for %-escaping). > Or we can reverse-engineer the (proposed) Web Address algorithm to only > %-encode those chars that would cause issues: > http://www.w3.org/html/wg/href/draft continued at > http://tools.ietf.org/html/draft-ietf-iri-3987bis-05#section-7.2 > > But I wonder if this shouldn't be the rule for sanitizeUri (instead of > introducing a new method and leaving a "hole" in sanitizeUri under > certain circumstances). > I think you're right; sanitizeUri should %-escape characters not allowed by RFC 3986. Looks like passing the string through http://google-web-toolkit.googlecode.com/svn/javadoc/latest/com/google/gwt/http/client/URL.html#encode(java.lang.String)would be sufficient? Anyway, I'll take a TODO item for this. I'm still wondering if we should properly parse the whole URL and insist it's syntactically valid; when I wrote sanitizeUri I chose not to because there's no easily available way to do it in GWT (since java.net.URI isn't implemented), and it would seem to be fairly costly to do. There should probably be an encodeAllowEscapes too, similar to > htmlEscapeAllowEntities, when e.g. you know your URL is safe, but you're > unsure if it's "encoded enough" for every context (including CSS), i.e. > the URL could contain unescaped parentheses or single quotes. > > Should we defer this patch? or tackle this in a following one? (maybe > adding a TODO in > fromTrustedString/fromString/fromUntrustedString/sanitizeUri?) > > > > http://gwt-code-reviews.appspot.com/1380806/ > -- http://groups.google.com/group/Google-Web-Toolkit-Contributors