On Tue, 14 Mar 2006 14:03:42 +0200, Alexey Feldgendler <[EMAIL PROTECTED]> wrote:

To access the nodes inside sandboxes, the script in the parent document can either "manually" traverse the DOM tree or do the following: first find all relevant elements in the main document (starting from the root node), then find all sandboxes with getElementsByTagName() (which doesn't dive inside sandboxes, but is able to return the sandboxes themselves), then continue recursively from each sandbox found. This involves somewhat more coding work, but I expect that finding all matching elements across sandbox boundaries will be a significantly more unusual task than finding elements in the parent document (outside sandboxes) or within a given sandbox.
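
A rough sketch of the recursive search described above. No browser implements <sandbox>, so the tree here is built from plain objects, and node(), getByTagNameNoCross() and findAcrossSandboxes() are hypothetical names made up for illustration; getByTagNameNoCross() only imitates the proposed getElementsByTagName() behavior of returning sandboxes without descending into them.

```javascript
// Hypothetical stand-in for DOM nodes: a tag name plus child nodes.
function node(tag, children = []) {
  return { tag, children };
}

// Imitates the proposed getElementsByTagName(): returns matching
// descendants (sandbox elements included) but never looks inside a sandbox.
function getByTagNameNoCross(root, tag) {
  const found = [];
  for (const child of root.children) {
    if (child.tag === tag) found.push(child);
    if (child.tag !== "sandbox") {
      found.push(...getByTagNameNoCross(child, tag));
    }
  }
  return found;
}

// Collect matches in the current scope, then continue from each sandbox found.
function findAcrossSandboxes(root, tag) {
  const found = getByTagNameNoCross(root, tag);
  for (const sandbox of getByTagNameNoCross(root, "sandbox")) {
    found.push(...findAcrossSandboxes(sandbox, tag));
  }
  return found;
}
```

Calling findAcrossSandboxes(root, "p") would gather the paragraphs of the parent document first, then those of each sandbox in turn, nested sandboxes included.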

Yes, I saw Ric's reply. A nice suggestion, but that implies <sandbox> is a documentElement by itself, or is it a DOMSandbox needing to be defined?

<...>
I hope that defining getElement(s)By* to not cross sandbox boundaries will do the work.

Yes.

<...>
Anyway, even if there are cases when "sandbox {overflow: hidden}" is not enough, the possible extent of damage from misplaced content that visually "jumps" out of the sandbox is an order of magnitude smaller than the extent of damage from the exploit shown in my original message. It's more important to handle the latter.

A side note: it may help to specify a set of default styling rules for the sandbox element so that it doesn't allow visual leakage of content.

I agree.

The spec can't do much in these situations. Shall the spec provide a way for CSS files to *not* be applied in <sandbox>ed content?

*:not(sandbox) p { text-align: left; }

Yes, very interesting. I was aware of this, but I had forgotten about it.

This would be better used coupled with a suggestion made in a thread "styling the unstylable" (on www-style): style-blocks.

Sorry, I must have completely missed that thread... Can you give me the link?

http://lists.w3.org/Archives/Public/www-style/2006Mar/thread.html
http://lists.w3.org/Archives/Public/www-style/2006Mar/0035.html

See my last reply. Theoretically it's not even remotely related to this thread, but if you think of it: style-blocks would help with sandboxing user styling (I can explain it to you on ICQ or private emails).

<...>
Wikis are a somewhat outstanding example. These traditionally use custom markup languages (mainly to make hyperlinking easier), but many of them, like MediaWiki, allow a subset of HTML as well. (MediaWiki uses the "whitelist" approach, but it seems to be at least theoretically vulnerable to the duplicate ID trick.)

Very good point.

<...>
This is true, but there is a problem with the whitelisting approach: the set of elements and attributes isn't in one-to-one correspondence with the set of browser features. For example, one can't define a set of elements and attributes which must be removed to prohibit scripting: it's not enough to just remove <script> elements and on* attributes, one must also check attributes which contain URIs to filter out "javascript:". (I know it's a bad example because one would need to convert javascript: to safe-javascript: anyway, but you've got the idea, right?)
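
To make the point concrete, here is a minimal sketch of such a syntax-level cleaner, working on plain objects rather than a real DOM. Everything in it is an assumption for illustration: the names cleanElement() and isScriptUri(), the { tag, attrs, children } node shape, and above all the URI_ATTRS list, which is deliberately short and incomplete. It is not a usable sanitizer; it only shows the three separate checks the paragraph above mentions.

```javascript
// Attributes assumed to carry URIs. Illustrative only: a real list is
// longer and grows as browsers add features.
const URI_ATTRS = new Set(["href", "src", "action"]);

// Conservative check: drop whitespace and control characters before
// comparing, so " JavaScript:" or "java\tscript:" are also caught.
function isScriptUri(value) {
  const compact = value.replace(/[\u0000-\u0020]/g, "").toLowerCase();
  return compact.startsWith("javascript:");
}

// Returns a cleaned copy of the element, or null if it must be removed.
function cleanElement(el) {
  if (el.tag === "script") return null; // check 1: script elements
  const attrs = {};
  for (const [name, value] of Object.entries(el.attrs)) {
    const lower = name.toLowerCase();
    if (lower.startsWith("on")) continue; // check 2: on* handlers
    if (URI_ATTRS.has(lower) && isScriptUri(value)) continue; // check 3: URIs
    attrs[lower] = value;
  }
  const children = el.children.map(cleanElement).filter((c) => c !== null);
  return { tag: el.tag, attrs, children };
}
```

Any script vector that doesn't fit one of these three checks passes through untouched, which is the weakness being described.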

While filtering the DOM tree by the HTML cleaner is easy, it approaches the problem from the syntax point of view, not semantic. It's more robust to write something like <sandbox scripting="disallow"> to disallow all scripting within the sandbox, including any obscure or future flavors of scripts as well as those enabled by proprietary extensions (like MSIE's "expression()" in CSS). Browser developers know better what makes "all possible kinds of scripts" than the web application developers.

Likewise, other browser features are better controlled explicitly ("I want to disable all external content within this sandbox") than by filtering the DOM tree. At least because not all new features, like new ways to load external content, come with new elements or attributes which aren't on the whitelist. Some features reuse existing syntax in elegant ways.

Again, good point, but this is not entirely related to "duplicate ID as a security issue". Meaning, you are advocating for the <sandbox> element. That's something I also do, depending on the way it's going to be defined (of course).

The <sandbox> element would make securing a web application from common security holes and other pitfalls much easier and more elegant. Of course, it would also solve the duplicate IDs issue.

IDs are useful to make anchors for navigation to sections of the page, and class names are useful to style the content in uniformity with the rest of the site (for example, Wikipedia's skins define the class "wikitable" to make user tables look the same throughout the site). These two features are good for the web. Taking them away for security reasons would lower the quality of the web content. For example, if Wikipedia disallowed the class attribute, then each such table would have to bear physical formatting attached to it, which is a step backwards.

Of course, comments on forums don't need these features. But I'm talking more of your "grade 2" applications.

Wikipedia is a special case I forgot about. It's very close to "grade 3", but not quite.

<...>
As for scripting, if there's any user wanting to post his/her script in a forum, then that's a problem. I wouldn't ever allow it (except probably for research purposes, such as "how users act when they are given all power" :) ).

Scripting isn't useful for forum posts, but it is useful in blogs/CMS/wikis, mainly because today's HTML sucks. People want things like collapsible sections, popup menus, tables with changeable sort order etc. (Some of these tasks won't require scripting according to WA1).

I have to somewhat disagree with this, because blogs, CMS and wiki applications must provide the scripts, the "toys", in a WYSIWYG environment. Those can be secured by the application authors in a proper way, and user scripts should not be allowed. Table sorting, popup menus and similar are all toys. Does Wikipedia allow full scripting access? I believe they allow access to some toys only.

I've mentioned it in the original message. Though I find it too strict to strip all id and class attributes from user-supplied text. They usually do more good than bad.

I don't. It's not too strict at all. I actually find it very loose to allow these specific attributes. They should be allowed *only* when there are real requirements (especially IDs).

Navigational anchors are a real use case for IDs.

Classes have many use cases, the primary being to avoid presentational in favor of semantic formatting. Another harmless but useful way to apply classes is the so-called microformats (see http://microformats.org/).

True.

<...>
Yes, this is good. Web-based viruses don't yet exist, but it's only a matter of time.

Java applets have existed for many years, but there aren't any viruses distributed this way. The framework for the Java applets is so well-defined that it's just not possible.

I'd say it's just like viruses for Linux. Not many want to write a virus for Linux, they all make viruses for Windows. If we all switched to Linux, we'd have many viruses for Linux too (it's not impossible, as you should already know). The same goes for the web, Java applets, etc. But this is off-topic and it's a very different story.

Returning to the duplicate IDs, I think we should define some standard behavior for getElementById() when there is more than one element with the given ID. To lower the possible extent of duplicate ID attacks, I propose that getElementById() should throw an exception in that case. It's better to crash the script than to make it do what the attacker wants.
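
For illustration only, the proposed behavior could look roughly like this on a plain-object tree. The names collectById() and getElementByIdStrict() and the { id, children } node shape are invented for the sketch; this is not how any browser's getElementById() works today.

```javascript
// Gather every node in the tree whose id equals the requested one.
function collectById(root, id, out = []) {
  if (root.id === id) out.push(root);
  for (const child of root.children) collectById(child, id, out);
  return out;
}

// Proposed behavior: refuse to pick one element when the ID is ambiguous.
function getElementByIdStrict(root, id) {
  const matches = collectById(root, id);
  if (matches.length > 1) {
    throw new Error(`Duplicate ID "${id}": ${matches.length} elements found`);
  }
  return matches.length === 1 ? matches[0] : null;
}
```

A script that looks up a duplicated ID would then stop with an exception instead of silently operating on whichever element happens to come first.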

Bad idea. I've just worked with a guy on a web application done the "industrial way" (as in "get it done ASAP, no matter how"). This was done entirely with copy/pasted frameworks, with Java on the server side, DOJO on the client side and some more frameworks (5 to 10!!!). It was horrible: many duplicate IDs, slow loading ("web 2.0 ready with AJAX"), etc. I was amazed it even worked :). The guy wasn't fully aware of the "behind the scenes" (he didn't even see how bad the generated DOM looks in the browser).

Point is, web applications currently do rely on duplicate ID support. Throwing errors (thus breaking scripts) also badly breaks backwards compatibility. That web application is not the only one with such a badly coded backend, it's one of many (look at most corporate web sites done in "a snap" by "gurus").

<...>
- grade 2
Full-blown ones: for blog articles, CMSs, ...

Scripting: none
Styling: yes
Tags and attributes: same as grade 1, with the exception that these must allow class and style attributes.

For these applications, user-supplied JavaScript is highly demanded, and it can't be fulfilled by a limited set of predefined JavaScript toys.

They also need IDs for navigational purposes.

Predefined toys are enough. It's almost useless to allow scripts to run in a sandboxed "frame-like" environment: in your blog article, without being able to interact with the page navigation (which is outside the sandbox), and do other stuff.

- grade 3
Web authoring tools: similar to NVU, Dreamweaver, ...

Scripts, styling, tags and attributes: everything.

Security concerns regarding scripting are eliminated in grade 1 and grade 2 WYSIWYG editors, because you can't really expect average Jane and Joe to want to do any scripting for their articles and pages in CMSs. If they wanted to, they'd make their own site "by hand".

They probably don't want to do "scripting", they just want these interactive things like tables with changeable sort order. If they were given the ability to use scripts in their articles, they would find a nice JavaScript through a search engine and paste it on the site.

I disagree. Online web authoring tools must allow full scripting support, exactly as NVU does. As for grade 1 and 2, I said it above: just toys are enough (like table sorting, menus, etc).

P.S. You have sent the reply only to me. I suppose it's by mistake (nothing personal was in it). I have sent my reply to your email back to WHATWG (I expect your future replies to also do so - it's a public discussion).

You're right, I've hit the wrong button. Thanks.

No problem, it has already happened to me twice on these mailing lists :(.


--
http://www.robodesign.ro
ROBO Design - We bring you the future
