On Wed, 15 Mar 2006 02:36:51 +0600, Mihai Sucan <[EMAIL PROTECTED]>
wrote:
To access the nodes inside sandboxes, the script in the parent document
can either "manually" traverse the DOM tree or do the following: first
find all relevant elements in the main document (starting from the root
node), then find all sandboxes with getElementsByTagName() (which
doesn't dive inside sandboxes, but is able to return the sandboxes
themselves), then continue recursively from each sandbox found. This
involves somewhat more coding work, but I expect that finding all
matching elements across sandbox boundaries will be a significantly more
unusual task than finding elements in the parent document (outside
sandboxes) or within a given sandbox.
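For illustration, the two-phase lookup described above can be sketched over plain objects standing in for DOM nodes. The tag/children field names, and a getElementsByTagName() lookalike that does not descend into sandboxes, are assumptions taken from this proposal, not any shipping API:

```javascript
// Mimics the proposed behavior: matches elements in the current tree,
// returns <sandbox> elements themselves, but does not descend into them.
function getByTagName(node, tag, results = []) {
  for (const child of node.children || []) {
    if (child.tag === tag) results.push(child);
    if (child.tag !== 'sandbox') getByTagName(child, tag, results);
  }
  return results;
}

// Phase two: collect matches in the current tree, then continue
// recursively from each sandbox found, crossing the boundaries.
function getByTagNameDeep(node, tag) {
  const found = getByTagName(node, tag);
  for (const sandbox of getByTagName(node, 'sandbox')) {
    found.push(...getByTagNameDeep(sandbox, tag));
  }
  return found;
}
```

With this split, the common case (lookup within one tree) stays cheap, and the rarer cross-boundary search is an explicit, slightly more expensive call.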
Yes, I saw Ric's reply. A nice suggestion, but does that imply <sandbox> is
a documentElement by itself, or is it a DOMSandbox that needs to be defined?
Sandboxes are quite special things, so we'll need a DOMSandbox anyway. But
instead of adding things like getElementById() to the DOMSandbox
interface, I tend to make the "fake document" which is visible from inside
the sandbox a member of the sandbox itself. The call will look like
sandbox.document.getElementById().
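Since no browser implements DOMSandbox, the proposed shape can only be modeled here with plain objects; all names below are assumptions drawn from this thread:

```javascript
// Hypothetical DOMSandbox shape: the "fake document" visible from inside
// the sandbox is exposed as a member of the sandbox object itself.
function makeSandbox(idMap) {
  return {
    document: {
      // Lookup is scoped to the sandbox's own subtree only.
      getElementById: id => (id in idMap ? idMap[id] : null),
    },
  };
}

// IDs outside the sandbox are invisible from within it, and vice versa.
const sandbox = makeSandbox({ menu: { tagName: 'UL' } });
sandbox.document.getElementById('menu');  // the sandboxed element
sandbox.document.getElementById('main');  // null: outer IDs don't leak in
```

The appeal of this shape is that the sandbox's scope rules live in one place (the fake document) rather than being duplicated across every lookup method on DOMSandbox.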
This is true, but there is a problem with the whitelisting approach:
the set of elements and attributes isn't in one-to-one correspondence
with the set of browser features. For example, one can't define a set
of elements and attributes which must be removed to prohibit scripting:
it's not enough to just remove <script> elements and on* attributes,
one must also check attributes which contain URIs to filter out
"javascript:". (I know it's a bad example because one would need to
convert javascript: to safe-javascript: anyway, but you've got the idea,
right?)
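The point above can be made concrete with a sketch of what an attribute filter has to do beyond dropping <script> and on* attributes. The attribute list and function name here are illustrative, not from any real HTML cleaner:

```javascript
// Attributes whose values are URIs and must be scheme-checked
// (an illustrative subset; a real cleaner would need the full list).
const URI_ATTRS = new Set(['href', 'src', 'action']);

// Removing on* handlers is not enough to prohibit scripting: every
// URI-valued attribute must also be filtered for the javascript: scheme.
function stripScripting(attrs) {
  const clean = {};
  for (const [name, value] of Object.entries(attrs)) {
    if (name.toLowerCase().startsWith('on')) continue;  // event handlers
    if (URI_ATTRS.has(name) &&
        /^\s*javascript:/i.test(value)) continue;       // javascript: URIs
    clean[name] = value;
  }
  return clean;
}

stripScripting({ href: 'javascript:alert(1)', onclick: 'x()', title: 'ok' });
// → { title: 'ok' }
```

Even this sketch already shows the mismatch: the filter encodes knowledge about *features* (what counts as scripting) that a plain element/attribute whitelist cannot express.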
While filtering the DOM tree with an HTML cleaner is easy, it approaches
the problem from the syntactic point of view, not the semantic one. It's more
robust to write something like <sandbox scripting="disallow"> to
disallow all scripting within the sandbox, including any obscure or
future flavors of scripts as well as those enabled by proprietary
extensions (like MSIE's "expression()" in CSS). Browser developers know
better what makes "all possible kinds of scripts" than the web
application developers.
Likewise, other browser features are better controlled explicitly ("I
want to disable all external content within this sandbox") than by
filtering the DOM tree. At least because not all new features, like new
ways to load external content, come with new elements or attributes
which aren't on the whitelist. Some features reuse existing syntax in
elegant ways.
Again, good point, but this is not entirely related to "duplicate ID as
a security issue". Meaning, you are advocating for the <sandbox>
element. That's something I also do, depending on the way it's going to be
defined (of course).
Yes, really. I actually gave the thread the wrong subject. It should
have been titled "Sandboxing can make contained HTML harmless in more ways
than just isolating scripts".
The <sandbox> element would make securing a web application from common
security holes and other pitfalls much easier and more elegant. Of course, it
would also solve the duplicate IDs issue.
Actually, now it seems the only solution to me because, as you say below,
the behavior on duplicate IDs cannot be changed to a safe way without
breaking backward compatibility.
I have to somewhat disagree with this, because blogs, CMS and wiki
applications must provide the scripts, the "toys" in a WYSIWYG
environment. Those can be secured by the application authors in a proper
way, and user scripts should not be allowed. Table sorting, popup menus
and similar are all toys. Does Wikipedia allow full-scripting access? I
believe they allow access to some toys only.
They don't provide any JavaScript toys. I hope they'll do it someday.
Returning to the duplicate IDs, I think we should define some standard
behavior for getElementById() when there is more than one element with
the given ID. To lower the possible extent of duplicate ID attacks, I
propose that getElementById() should throw an exception in that case.
It's better to crash the script than to make it do what the attacker
wants.
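The proposed strict behavior could look something like this sketch, again over plain objects in place of real DOM nodes (the id/children field names and the helper's name are assumptions, not a real API):

```javascript
// Strict variant of getElementById: collect every match first, and
// throw on duplicates instead of silently returning the first one.
function strictGetElementById(root, id) {
  const matches = [];
  (function walk(node) {
    if (node.id === id) matches.push(node);
    for (const child of node.children || []) walk(child);
  })(root);
  if (matches.length > 1) {
    // Crash the script rather than let it act on the attacker's element.
    throw new Error(`duplicate id: ${id}`);
  }
  return matches[0] || null;
}
```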
Bad idea. I've just worked with a guy on a web application done the
"industrial way" (as in "get it done ASAP, no matter how"). This was
done entirely with copy/pasted frameworks with Java on the server-side,
DOJO client-side and some more frameworks (5 to 10!!!). It was horrible:
many duplicate IDs, slowly loading ("web 2.0 ready with AJAX"), etc. I
was amazed it even worked :). The guy wasn't fully aware of the "behind
the scenes" (he didn't even see how bad the generated DOM looked in the
browser).
Point is, web applications currently do rely on duplicate IDs support.
Throwing errors (thus breaking scripts) also badly breaks backwards
compatibility. That web application is not the only one with such a
badly coded backend; it's one of many (look at most corporate web sites
done in "a snap" by "gurus").
Well, if browsers did throw exceptions on duplicate IDs, there wouldn't be
any duplicate IDs in existing applications. The problem is that there are
already such applications.
(A wild thought: maybe enforce ID uniqueness only for <!DOCTYPE html>?)
For these applications, user-supplied JavaScript is highly demanded,
and it can't be fulfilled by a limited set of predefined JavaScript
toys.
They also need IDs for navigational purposes.
Predefined toys are enough. It's almost useless to allow scripts to run
in a sandboxed, frame-like environment inside a blog entry if they cannot
interact with the page navigation (which is outside the sandbox) or do
other things.
Someone could post a JavaScript game in his blog, a horoscope calculator
etc.
And, by the way, blog entries aren't the only place where sandboxing can
be applied in blogs. For example, LiveJournal allows user-defined journal
styles which are written by the users in a self-invented programming
language which outputs HTML. That HTML goes through the HTML cleaner
afterwards, of course. Many people would love to add dynamic menus, AJAX
comments folding etc to their styles. This could be partly solved with a
set of predefined "toys", but actually the entire LiveJournal styling
system is about user-initiated development. Those with programming skills
write new styles, and other users may take and use them.
-- Opera M2 9.0 TP2 on Debian Linux 2.6.12-1-k7
* Origin: X-Man's Station at SW-Soft, Inc. [ICQ: 115226275]
<[EMAIL PROTECTED]>