On Wed, 15 Mar 2006 02:36:51 +0600, Mihai Sucan <[EMAIL PROTECTED]>
wrote:
To access the nodes inside sandboxes, the script in the parent document
can either "manually" traverse the DOM tree or do the following: first
find all relevant elements in the main document (starting from the root
node), then find all sandboxes with getElementsByTagName() (which
doesn't dive inside sandboxes, but is able to return the sandboxes
themselves), then continue recursively from each sandbox found. This
involves somewhat more coding work, but I expect that finding all
matching elements across sandbox boundaries will be a significantly more
unusual task than finding elements in the parent document (outside
sandboxes) or within a given sandbox.
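For illustration, the two-phase lookup described above can be sketched over plain objects standing in for DOM nodes. The tag/children field names, and a getElementsByTagName() lookalike that does not descend into sandboxes, are assumptions taken from this proposal, not any shipping API:

```javascript
// Mimics the proposed behavior: matches elements in the current tree,
// returns <sandbox> elements themselves, but does not descend into them.
function getByTagName(node, tag, results = []) {
  for (const child of node.children || []) {
    if (child.tag === tag) results.push(child);
    if (child.tag !== 'sandbox') getByTagName(child, tag, results);
  }
  return results;
}

// Phase two: collect matches in the current tree, then continue
// recursively from each sandbox found, crossing the boundaries.
function getByTagNameDeep(node, tag) {
  const found = getByTagName(node, tag);
  for (const sandbox of getByTagName(node, 'sandbox')) {
    found.push(...getByTagNameDeep(sandbox, tag));
  }
  return found;
}
```

With this split, the common case (lookup within one tree) stays cheap, and the rarer cross-boundary search is an explicit, slightly more expensive call.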
Yes, I saw Ric's reply. A nice suggestion, but does that imply <sandbox> is
a documentElement by itself, or is it a DOMSandbox that needs to be defined?
Sandboxes are quite special things, so we'll need a DOMSandbox anyway. But
instead of adding things like getElementById() to the DOMSandbox
interface, I tend to make the "fake document" which is visible from inside
the sandbox a member of the sandbox itself. The call will look like
sandbox.document.getElementById().
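Since no browser implements DOMSandbox, the proposed shape can only be modeled here with plain objects; all names below are assumptions drawn from this thread:

```javascript
// Hypothetical DOMSandbox shape: the "fake document" visible from inside
// the sandbox is exposed as a member of the sandbox object itself.
function makeSandbox(idMap) {
  return {
    document: {
      // Lookup is scoped to the sandbox's own subtree only.
      getElementById: id => (id in idMap ? idMap[id] : null),
    },
  };
}

// IDs outside the sandbox are invisible from within it, and vice versa.
const sandbox = makeSandbox({ menu: { tagName: 'UL' } });
sandbox.document.getElementById('menu');  // the sandboxed element
sandbox.document.getElementById('main');  // null: outer IDs don't leak in
```

The appeal of this shape is that the sandbox's scope rules live in one place (the fake document) rather than being duplicated across every lookup method on DOMSandbox.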
This is true, but there is a problem with the whitelisting approach:
the set of elements and attributes isn't in one-to-one correspondence
with the set of browser features. For example, one can't define a set
of elements and attributes which must be removed to prohibit scripting:
it's not enough to just remove <script> elements and on* attributes,
one must also check attributes which contain URIs to filter out
"javascript:". (I know it's a bad example because one would need to
convert javascript: to safe-javascript: anyway, but you've got the idea,
right?)
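The point above can be made concrete with a sketch of what an attribute filter has to do beyond dropping <script> and on* attributes. The attribute list and function name here are illustrative, not from any real HTML cleaner:

```javascript
// Attributes whose values are URIs and must be scheme-checked
// (an illustrative subset; a real cleaner would need the full list).
const URI_ATTRS = new Set(['href', 'src', 'action']);

// Removing on* handlers is not enough to prohibit scripting: every
// URI-valued attribute must also be filtered for the javascript: scheme.
function stripScripting(attrs) {
  const clean = {};
  for (const [name, value] of Object.entries(attrs)) {
    if (name.toLowerCase().startsWith('on')) continue;  // event handlers
    if (URI_ATTRS.has(name) &&
        /^\s*javascript:/i.test(value)) continue;       // javascript: URIs
    clean[name] = value;
  }
  return clean;
}

stripScripting({ href: 'javascript:alert(1)', onclick: 'x()', title: 'ok' });
// → { title: 'ok' }
```

Even this sketch already shows the mismatch: the filter encodes knowledge about *features* (what counts as scripting) that a plain element/attribute whitelist cannot express.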
While filtering the DOM tree with an HTML cleaner is easy, it approaches
the problem from the syntactic point of view, not the semantic one. It's more
robust to write something like <sandbox scripting="disallow"> to
disallow all scripting within the sandbox, including any obscure or
future flavors of scripts as well as those enabled by proprietary
extensions (like MSIE's "expression()" in CSS). Browser developers know
better what makes "all possible kinds of scripts" than the web
application developers.
Likewise, other browser features are better controlled explicitly ("I
want to disable all external content within this sandbox") than by
filtering the DOM tree. At least because not all new features, like new
ways to load external content, come with new elements or attributes
which aren't on the whitelist. Some features reuse existing syntax in
elegant ways.
Again, good point, but this is not entirely related to "duplicate ID as
a security issue". Meaning, you are advocating for the <sandbox>
element. That's something I also do, depending on the way it's going to be
defined (of course).
Yes, really. I actually gave the thread the wrong subject. It should
have been titled "Sandboxing can make contained HTML harmless in more ways
than just isolating scripts".
The <sandbox> element would make securing a web application from common
security holes and other pitfalls much easier and more elegant. Of course, it
would also solve the duplicate IDs issue.
Actually, now it seems the only solution to me because, as you say below,
the behavior on duplicate IDs cannot be changed to a safe way without
breaking backward compatibility.
I have to somewhat disagree with this, because blogs, CMS and wiki
applications must provide the scripts, the "toys" in a WYSIWYG
environment. Those can be secured by the application authors in a proper
way, and user scripts should not be allowed. Table sorting, popup menus
and similar are all toys. Does Wikipedia allow full-scripting access? I
believe they allow access to some toys only.
They don't provide any JavaScript toys. I hope they'll do it someday.
Returning to the duplicate IDs, I think we should define some standard
behavior for getElementById() when there is more than one element with
the given ID. To lower the possible extent of duplicate ID attacks, I
propose that getElementById() should throw an exception in that case.
It's better to crash the script than to make it do what the attacker
wants.
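The proposed strict behavior could look something like this sketch, again over plain objects in place of real DOM nodes (the id/children field names and the helper's name are assumptions, not a real API):

```javascript
// Strict variant of getElementById: collect every match first, and
// throw on duplicates instead of silently returning the first one.
function strictGetElementById(root, id) {
  const matches = [];
  (function walk(node) {
    if (node.id === id) matches.push(node);
    for (const child of node.children || []) walk(child);
  })(root);
  if (matches.length > 1) {
    // Crash the script rather than let it act on the attacker's element.
    throw new Error(`duplicate id: ${id}`);
  }
  return matches[0] || null;
}
```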
Bad idea. I've just worked with a guy on a web application done the
"industrial way" (as in "get it done ASAP, no matter how"). This was
done entirely with copy/pasted frameworks with Java on the server-side,
DOJO client-side and some more frameworks (5 to 10!!!). It was horrible:
many duplicate IDs, slowly loading ("web 2.0 ready with AJAX"), etc. I
was amazed it even worked :). The guy wasn't fully aware of the "behind
the scenes" (he didn't even see how bad the generated DOM looked in the
browser).
Point is, web applications currently do rely on duplicate IDs support.
Throwing errors (thus breaking scripts) also badly breaks backwards
compatibility. That web application is not the only one with such a
badly coded backend; it's one of many (look at most corporate web sites
done in "a snap" by "gurus").
Well, if browsers did throw exceptions on duplicate IDs, there wouldn't be
any duplicate IDs in existing applications. The problem is that there are
already such applications.
(A wild thought: maybe enforce ID uniqueness only for <!DOCTYPE html>?)
For these applications, user-supplied JavaScript is highly demanded,
and it can't be fulfilled by a limited set of predefined JavaScript
toys.
They also need IDs for navigational purposes.
Predefined toys are enough. It's almost useless to allow scripts to run
in a sandboxed, frame-like environment inside a blog entry if they cannot
interact with the page navigation (which is outside the sandbox) or do
other things.
Someone could post a JavaScript game in his blog, a horoscope calculator
etc.
And, by the way, blog entries aren't the only place where sandboxing can
be applied in blogs. For example, LiveJournal allows user-defined journal
styles which are written by the users in a self-invented programming
language which outputs HTML. That HTML goes through the HTML cleaner
afterwards, of course. Many people would love to add dynamic menus, AJAX
comments folding etc to their styles. This could be partly solved with a
set of predefined "toys", but actually the entire LiveJournal styling
system is about user-initiated development. Those with programming skills
write new styles, and other users may take and use them.
-- Opera M2 9.0 TP2 on Debian Linux 2.6.12-1-k7
* Origin: X-Man's Station at SW-Soft, Inc. [ICQ: 115226275]
<[EMAIL PROTECTED]>