Re: [whatwg] The problem of duplicate ID as a security issue

Mihai Sucan Tue, 14 Mar 2006 12:35:37 -0800

On Tue, 14 Mar 2006 14:03:42 +0200, Alexey Feldgendler <[EMAIL PROTECTED]> wrote:

To access the nodes inside sandboxes, the script in the parent document can either "manually" traverse the DOM tree or do the following: first find all relevant elements in the main document (starting from the root node), then find all sandboxes with getElementsByTagName() (which doesn't dive inside sandboxes, but is able to return the sandboxes themselves), then continue recursively from each sandbox found. This involves somewhat more coding work, but I expect that finding all matching elements across sandbox boundaries will be a significantly more unusual task than finding elements in the parent document (outside sandboxes) or within a given sandbox.
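
For illustration, the recursive search described above might look roughly like the sketch below. It is only a model under stated assumptions: nodes are plain objects of the form { tag, id, children }, and findOutsideSandboxes() and findAcrossSandboxes() are hypothetical names standing in for a sandbox-aware getElementsByTagName(), which no browser provides.

```javascript
// Hypothetical stand-in for DOM nodes: { tag, id, children }.
// Models the proposed getElementsByTagName(): it may return a sandbox
// element itself, but it never descends into one.
function findOutsideSandboxes(root, tag) {
  const found = [];
  for (const child of root.children || []) {
    if (child.tag === tag) found.push(child);
    if (child.tag !== "sandbox") {
      found.push(...findOutsideSandboxes(child, tag));
    }
  }
  return found;
}

// Collects matches in the current scope, then continues recursively
// from each sandbox found in that scope.
function findAcrossSandboxes(root, tag) {
  const found = findOutsideSandboxes(root, tag);
  for (const sandbox of findOutsideSandboxes(root, "sandbox")) {
    found.push(...findAcrossSandboxes(sandbox, tag));
  }
  return found;
}
```

The ordinary lookup stays cheap and safe by default; only a caller that explicitly wants to cross sandbox boundaries pays for the extra recursion.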

Yes, I saw Ric's reply. A nice suggestion, but that implies <sandbox> is a documentElement by itself, or is it a DOMSandbox needing to be defined?


<...>

I hope that defining getElement(s)By* to not cross sandbox boundaries will do the work.


Yes.

<...>

Anyway, even if there are cases when "sandbox {overflow: hidden}" is not enough, the possible extent of damage from misplaced content that visually "jumps" out of the sandbox is a whole order less than the extent of damage from the exploit shown in my original message. It's more important to handle the latter.
A side note: it may help to specify a set of default styling rules for the sandbox element so that it doesn't allow visual leakage of content.


I agree.

The spec can't do much in these situations. Shall the spec provide a way for CSS files to *not* be applied in <sandbox>ed content?
*:not(sandbox) p { text-align: left; }
Yes, very interesting. I was aware of this, but I forgot about it.
This would be better used coupled with a suggestion made in a thread "styling the unstylable" (on www-style): style-blocks.
Sorry, I must have completely missed that thread... Can you give me the link?


http://lists.w3.org/Archives/Public/www-style/2006Mar/thread.html
http://lists.w3.org/Archives/Public/www-style/2006Mar/0035.html

See my last reply. Theoretically it's not even remotely related to this thread, but if you think about it: style-blocks would help with sandboxing user styling (I can explain it to you on ICQ or in private emails).


<...>

Wikis are a somewhat outstanding example. These traditionally use custom markup languages (mainly to make hyperlinking easier), but many of them, like MediaWiki, allow a subset of HTML as well. (MediaWiki uses the "whitelist" approach, but it seems to be at least theoretically vulnerable to the duplicate ID trick.)


Very good point.

<...>

This is true, but there is a problem with the whitelisting approach: the set of elements and attributes isn't in one-to-one correspondence with the set of browser features. For example, one can't define a set of elements and attributes which must be removed to prohibit scripting: it's not enough to just remove <script> elements and on* attributes, one must also check attributes which contain URIs to filter out "javascript:". (I know it's a bad example because one would need to convert javascript: to safe-javascript: anyway, but you've got the idea, right?)
While filtering the DOM tree by the HTML cleaner is easy, it approaches the problem from the syntax point of view, not the semantic one. It's more robust to write something like <sandbox scripting="disallow"> to disallow all scripting within the sandbox, including any obscure or future flavors of scripts as well as those enabled by proprietary extensions (like MSIE's "expression()" in CSS). Browser developers know better what makes up "all possible kinds of scripts" than web application developers do.
Likewise, other browser features are better controlled explicitly ("I want to disable all external content within this sandbox") than by filtering the DOM tree, not least because not all new features, like new ways to load external content, come with new elements or attributes which aren't on the whitelist. Some features reuse existing syntax in elegant ways.
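
To make the limitation concrete, here is a minimal sketch of such a syntax-level cleaner. Everything in it is an assumption for illustration: nodes are plain objects of the form { tag, attrs, children } with lowercase names, the whitelist contents are hypothetical, and clean() and isScriptUri() are made-up names rather than any real library's API.

```javascript
// Hypothetical whitelist; a real application would choose its own.
const ALLOWED_TAGS = new Set(["p", "a", "b", "i", "img"]);
const ALLOWED_ATTRS = new Set(["href", "src", "title", "alt"]);
const URI_ATTRS = new Set(["href", "src"]);

// True if the value looks like a "javascript:" URI. Whitespace and
// control characters are stripped before the check, since padding the
// scheme with them is a common filter-evasion trick.
function isScriptUri(value) {
  const compact = value.replace(/[\u0000-\u0020]/g, "").toLowerCase();
  return compact.startsWith("javascript:");
}

// Returns a cleaned copy of the node, or null if its tag is not allowed.
function clean(node) {
  if (!ALLOWED_TAGS.has(node.tag)) return null; // drops <script> etc.
  const attrs = {};
  for (const [name, value] of Object.entries(node.attrs || {})) {
    if (!ALLOWED_ATTRS.has(name)) continue; // drops on*, id, style, ...
    if (URI_ATTRS.has(name) && isScriptUri(value)) continue;
    attrs[name] = value;
  }
  const children = (node.children || [])
    .map(clean)
    .filter((child) => child !== null);
  return { tag: node.tag, attrs, children };
}
```

Note that the cleaner only knows the script vectors its author thought of. If "style" were added to the whitelist, something like MSIE's expression() would pass straight through, which is the weakness described above.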

Again, good point, but this is not entirely related to "duplicate ID as a security issue". Meaning, you are advocating for the <sandbox> element. That's something I also do, depending on the way it's going to be defined (of course).

The <sandbox> element would make securing a web application from common security holes and other pitfalls much easier and more elegant. Of course, it would also solve the duplicate IDs issue.

IDs are useful to make anchors for navigation to sections of the page, and class names are useful to style the content in uniformity with the rest of the site (for example, Wikipedia's skins define the class "wikitable" to make user tables look the same throughout the site). These two features are good for the web. Taking them away for security reasons would lower the quality of the web content. For example, if Wikipedia disallowed the class attribute, then each such table would have to bear physical formatting attached to it, which is a step backwards.
Of course, comments on forums don't need these features. But I'm talking more of your "grade 2" applications.

Wikipedia is a special case I forgot about. It's very close to "grade 3", but not quite.


<...>

As for scripting, if there's any user wanting to post his/her script in a forum, then that's a problem. I wouldn't ever allow it (except probably for research purposes, such as "how users act when they are given all power" :) ).
Scripting isn't useful for forum posts, but it is useful in blogs/CMS/wikis, mainly because today's HTML sucks. People want things like collapsible sections, popup menus, tables with changeable sort order etc. (Some of these tasks won't require scripting according to WA1).

I have to somewhat disagree with this, because blogs, CMS and wiki applications must provide the scripts, the "toys", in a WYSIWYG environment. Those can be secured by the application authors in a proper way, and user scripts should not be allowed. Table sorting, popup menus and similar are all toys. Does Wikipedia allow full scripting access? I believe they allow access to some toys only.

I've mentioned it in the original message. Though I find it too strict to strip all id and class attributes from user-supplied text. They usually do more good than bad.
I don't. It's not too strict at all. I actually find it very loose to allow these specific attributes. They should be allowed *only* when there are real requirements (especially IDs).
Navigational anchors are a real use case for IDs.
Classes have many use cases, the primary being to avoid presentational in favor of semantic formatting. Another harmless but useful way to apply classes is the so-called microformats (see http://microformats.org/).


True.

<...>

Yes, this is good. Web-based viruses don't yet exist, but it's only a matter of time.
Java applets have existed for many years, but there aren't any viruses distributed this way. The framework for the Java applets is so well-defined that it's just not possible.

I'd say it's just like viruses for Linux. Not many want to write a virus for Linux, they all make viruses for Windows. If we all switched to Linux, we'd have many viruses for Linux too (it's not impossible, as you should already know). The same goes for the web, Java applets, etc. But this is off-topic and it's a very different story.

Returning to the duplicate IDs, I think we should define some standard behavior for getElementById() when there is more than one element with the given ID. To lower the possible extent of duplicate ID attacks, I propose that getElementById() should throw an exception in that case. It's better to crash the script than to make it do what the attacker wants.
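
As a rough sketch of that proposed behavior (a model only, not how any browser implements getElementById()): nodes are assumed to be plain objects of the form { id, children }, and strictGetElementById() and collectById() are hypothetical names.

```javascript
// Walks the whole tree and collects every node carrying the given id.
function collectById(node, id, out = []) {
  if (node.id === id) out.push(node);
  for (const child of node.children || []) {
    collectById(child, id, out);
  }
  return out;
}

// Models the proposal: fail loudly on a duplicate ID instead of
// silently returning whichever element happens to come first.
function strictGetElementById(root, id) {
  const matches = collectById(root, id);
  if (matches.length > 1) {
    throw new Error(`Duplicate ID "${id}": ${matches.length} elements`);
  }
  return matches.length === 1 ? matches[0] : null;
}
```

The trade-off is visible in the code: the lookup has to scan the whole tree to prove uniqueness, and every existing page that already contains duplicate IDs would start throwing.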

Bad idea. I've just worked with a guy on a web application done the "industrial way" (as in "get it done ASAP, no matter how"). This was done entirely with copy/pasted frameworks with Java on the server-side, DOJO client-side and some more frameworks (5 to 10!!!). It was horrible: many duplicate IDs, slowly loading ("web 2.0 ready with AJAX"), etc. I was amazed it even worked :). The guy wasn't fully aware of the "behind the scenes" (he didn't even see how badly the generated DOM looks in the browser).

Point is, web applications currently do rely on duplicate IDs support. Throwing errors (thus breaking scripts) also badly breaks backwards compatibility. That web application is not the only one having such a badly coded backend, it's one of many (look at most corporate web sites done in "a snap" by "gurus").


<...>

- grade 2
Full-blown ones: for blog articles, CMSs, ...

Scripting: none
Styling: yes
Tags and attributes: same as grade 2, with the exception that these must allow class and style attributes.
For these applications, user-supplied JavaScript is highly demanded, and it can't be fulfilled by a limited set of predefined JavaScript toys.
They also need IDs for navigational purposes.

Predefined toys are enough. It's almost useless to allow scripts to run in a sandboxed "frame-like" environment: in your blog article, without being able to interact with the page navigation (which is outside the sandbox), and do other stuff.

- grade 3
Web authoring tools: similar to NVU, Dreamweaver, ...

Scripts, styling, tags and attributes: everything.
Security concerns regarding scripting are eliminated in grade 1 and grade 2 WYSIWYG editors, because you can't really expect average Jane and Joe to want to do scripting for their articles and pages in CMSs. If they wanted to, they'd make their own site "by hand".
They probably don't want to do "scripting", they just want these interactive things like tables with changeable sort order. If they were given the ability to use scripts in their articles, they would find a nice JavaScript through a search engine and paste it on the site.

I disagree. Online web authoring tools must allow full scripting support, exactly as NVU does. As for grade 1 and 2, I said it above: just toys are enough (like table sorting, menus, etc).

P.S. You have sent the reply only to me. I suppose it's by mistake (nothing personal was in it). I have sent my reply to your email back to WHATWG (I expect your future replies to also do so - it's a public discussion).
You're right, I've hit the wrong button. Thanks.


No problem, it has already happened to me twice on these mailing lists :(.


--
http://www.robodesign.ro
ROBO Design - We bring you the future
