[whatwg] Built Firefox, time to get cracking
Title says it all really; it only took me a few days of trying, heh. There's little to no chance that anything I stick in will make it into the trunk (esp. as I'm only building Fx, not SeaMonkey), but it should all be good clean fun anyway. What does anyone think I should toy with first? I'm quite tempted to have a go at some of the forms stuff, specifically the submit button overrides (action, etc.). Ric Hardacre http://www.cyclomedia.co.uk/
Re: [whatwg] Internal character encoding declaration
On Mar 14, 2006, at 15:07, Peter Karlsson wrote: Henri Sivonen on 2006-03-14: Transcoding is very popular, especially in Russia. In *proxies* *today*? What's the point considering that browsers have supported the Cyrillic encoding soup *and* UTF-8 for years? The mod_charset is not proxying, it's on the server level. Right. So, as a data point, it neither proves nor disproves the legends about transcoding *proxies* around Russia and Japan. How could proxies properly transcode form submissions coming back without messing everything up spectacularly? That's why the hidden-string technique was invented. Introduce a hidden input with a character string that will get encoded differently depending on the encoding used. When data comes in, use this character string to determine what encoding was used. I thought that method was for detecting broken browsers and users meddling with the encoding menu, and I thought using that method was relatively rare. In order for deploying a transcoding proxy to be safe for a Russian ISP, virtually every form handler in Russia would have to take countermeasures against the adverse effects of transcoding proxies. Are the countermeasures ubiquitous? Easy: parse errors are not fatal in browsers. Surely it is OK for a conformance checker to complain that much at server operators whose HTTP layer and meta do not match. I just reacted at the notion of calling such documents invalid. It is the transport layer that defines the encoding; whatever the document says or how it looks is irrelevant, and is just something that you can look at if the transport layer neglects to say anything. If two layers disagree, it suggests there is a problem and, in my opinion, it should be flagged as an error. (Especially considering Ruby's Postulate[1].) Operators of transcoding origin servers (or reverse proxies, which viewed from the Web count as origin servers) are free not to send a disagreeing charset meta. 
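The hidden-string technique described above can be sketched roughly as follows. This is a hypothetical illustration, not anyone's production code; it uses Latin-1 vs. UTF-8 for the probe because Node's Buffer has no windows-1251 codec, whereas a Russian site would use a Cyrillic probe string and windows-1251/KOI8-R:

```javascript
// Sketch of the hidden-string technique. The form carries something like
// <input type="hidden" name="enc" value="é">, and the form handler inspects
// the raw bytes of that field to infer which encoding the browser (or an
// intervening transcoding proxy) actually used for the submission.
function detectEncoding(rawBytes) {
  // "é" is the two bytes 0xC3 0xA9 in UTF-8, but the single byte 0xE9
  // in Latin-1.
  if (rawBytes.length === 2 && rawBytes[0] === 0xc3 && rawBytes[1] === 0xa9) {
    return "utf-8";
  }
  if (rawBytes.length === 1 && rawBytes[0] === 0xe9) {
    return "latin1";
  }
  return "unknown";
}

console.log(detectEncoding(Buffer.from("é", "utf8")));   // "utf-8"
console.log(detectEncoding(Buffer.from("é", "latin1"))); // "latin1"
```

As the thread notes, for a transcoding proxy to be harmless, essentially every form handler in its path would need a check like this; the technique only repairs the damage where it is deployed.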
[1] http://intertwingly.net/slides/2004/devcon/69.html -- Henri Sivonen [EMAIL PROTECTED] http://hsivonen.iki.fi/
Re: [whatwg] Internal character encoding declaration
Henri Sivonen on 2006-03-16: Right. So, as a data point, it neither proves nor disproves the legends about transcoding *proxies* around Russia and Japan. The only transcoding proxies I know about are WAP gateways. They tend to do interesting things with input, especially when the source doesn't specify what it is. In order for deploying a transcoding proxy to be safe for a Russian ISP, virtually every form handler in Russia would have to take countermeasures against the adverse effects of transcoding proxies. Are the countermeasures ubiquitous? I haven't investigated, so I don't have a reply to that. -- \\// Peter, software engineer, Opera Software The opinions expressed are my own, and not those of my employer. Please reply only by follow-ups on the mailing list.
Re: [whatwg] The problem of duplicate ID as a security issue
On Wed, 15 Mar 2006 19:26:03 +0600, Mihai Sucan [EMAIL PROTECTED] wrote: Sandboxes are quite special things, so we'll need a DOMSandbox anyway. But instead of adding things like getElementById() to the DOMSandbox interface, I tend to make the fake document which is visible from inside the sandbox a member of the sandbox itself. The call will look like sandbox.document.getElementById(). As Ric said, treating sandboxes too similarly to a document is overkill. A DOMDocument interface has to be exposed to the contained scripts anyway, why not also make it accessible from the outside? (A wild thought: maybe enforce ID uniqueness only for !DOCTYPE html?) I think enforcing ID uniqueness in standards mode would be good, and it would still probably break only (very?) few pages. Those web authors would have to live with it, because they want standards-compliant sites. I'm not speaking about enforcing ID uniqueness at the time of parsing the page, but only at the time of calling getElementById(). I believe it will break very few pages, if any. I know that many web applications have bugs like this: they have a CSS rule like #titlebar { font-weight: bold; } and a single titlebar on the page. After that, the requirements change, and they have more than one titlebar on a page. To make the rule apply to all titlebars, they give them all the same ID (instead of using class, not ID, in CSS rules). While such documents are non-conforming, they should not, in my opinion, cause parse errors even in standards mode. Here is why: duplicate IDs are wrong, but it's obvious what the author means, and it's easy to do what the author intended. Usually in such applications the scripts don't call getElementById() for those ID values which occur more than once. If they occasionally do, it's really a programming bug. 
I don't believe that there are applications that really rely on the particular behavior in this case, though I admit that there are possibly some that have this bug unnoticed and still work. I think that this case should trigger an exception in standards mode because, for this bug, there is no obvious fix to apply, and we don't know what the author meant -- does he want to do something to the first element with the specified ID, the second, or both? Side note and wild guess: we are probably forgetting that the beauty of the web is actually allowing everyone to contribute, be it bad code or better code. Wanting something *that* strict is like rejecting one of the essential concepts contributing to the success of the web. Simply picking the last matching node is actually hiding a bug and letting it go unnoticed. (Why the last one? Why not the first, for example?) And, by the way, blog entries aren't the only place in blogs where sandboxing can be applied. For example, LiveJournal allows user-defined journal styles which are written by the users in a self-invented programming language which outputs HTML. That HTML goes through the HTML cleaner afterwards, of course. Many people would love to add dynamic menus, AJAX comment folding, etc. to their styles. This could be partly solved with a set of predefined toys, but actually the entire LiveJournal styling system is about user-initiated development. Those with programming skills write new styles, and other users may take and use them. I did not see LiveJournal, so I don't know what kind of features they offer. sandbox would probably do the trick (would help a lot with security in this case also). Yes, I think so. Actually, my activity around the sandboxing idea has been inspired by several recent security incidents with LiveJournal and its styling system, which failed to filter out some patterns of dangerous HTML. Take HTML, for example: it's a markup language greatly appreciated by many and despised by others. 
Even you said in one reply to this thread that today's HTML sucks - advocating the need to allow user scripts in pages, for table sorting, popup menus, etc. A few minutes later, in another reply, you say we already have a great markup language, which is HTML - advocating for allowing users to write HTML instead of custom markup. Yeah, really, I sound a bit contradictory. Actually, in my opinion, HTML is better than most ad-hoc markup languages, and HTML with scripts is still better than just HTML. And another thing: HTML 5 is about to make HTML pages more powerful; there are going to be menus, datagrids and such, but most of these features are useless without scripting, aren't they? For example, a datagrid isn't really sortable at the client side without a script, which makes it useless in blogs and CMSes unless they allow some scripting. So, sandbox may be designed to help tighten up security on the web, but we should also try to think of how it's actually used, side effects, etc. It definitely solves problems, but will it cause other problems? How important are they? Of course, there is a lot more to think and talk about.
Re: [whatwg] Internal character encoding declaration
Peter Karlsson wrote: Transcoding is very popular, especially in Russia. Ahem... I wouldn't say it is. Only the most, shall we say, conservative hosters still insist on these archaic setups and refuse to understand that trying to stick everything into windows-1251 has long been unnecessary. Overall, the nasty thing called 'Russian Apache' is really going away, and for good.
Re: [whatwg] The problem of duplicate ID as a security issue
On Thu, 16 Mar 2006 13:45:54 +0200, Alexey Feldgendler [EMAIL PROTECTED] wrote: ... A DOMDocument interface has to be exposed to the contained scripts anyway, why not also make it accessible from the outside? Yes, but I'm afraid it's a technical challenge for implementors. Their browser engines might need some rewrites to properly support sandboxing content. Therefore, instead of rewrites, they'll hack the sandboxes, opening a wide variety of security holes competing for the crown of the first web virus. ... I'm not speaking about enforcing ID uniqueness at the time of parsing the page, but only at the time of calling getElementById(). I believe it will break very few pages, if any. I know that many web applications have bugs like this: they have a CSS rule like #titlebar { font-weight: bold; } and a single titlebar on the page. After that, the requirements change, and they have more than one titlebar on a page. To make the rule apply to all titlebars, they give them all the same ID (instead of using class, not ID, in CSS rules). While such documents are non-conforming, they should not, in my opinion, cause parse errors even in standards mode. Here is why: duplicate IDs are wrong, but it's obvious what the author means, and it's easy to do what the author intended. Usually in such applications the scripts don't call getElementById() for those ID values which occur more than once. If they occasionally do, it's really a programming bug. I don't believe that there are applications that really rely on the particular behavior in this case, though I admit that there are possibly some that have this bug unnoticed and still work. I think that this case should trigger an exception in standards mode because, for this bug, there is no obvious fix to apply, and we don't know what the author meant -- does he want to do something to the first element with the specified ID, the second, or both? In no way should this happen. 
This adds confusion for an already over-confused web author (as in: a web author who doesn't know much web development). Therefore, it's clear nothing has to be changed in quirks mode, but in standards mode: 1. break during parsing; 2. break JS code if it sets the id of a node to a duplicate ID. Or simply leave it as it is: quirks mode behaviour. ... Simply picking the last matching node is actually hiding a bug and letting it go unnoticed. (Why the last one? Why not the first, for example?) That's true, but this happens in many, many other cases. ... I did not see LiveJournal, so I don't know what kind of features they offer. sandbox would probably do the trick (would help a lot with security in this case also). Yes, I think so. Actually, my activity around the sandboxing idea has been inspired by several recent security incidents with LiveJournal and its styling system, which failed to filter out some patterns of dangerous HTML. True. As you said, there are risks with buggy sandbox implementations, but that's actually an advantage: relying on browser fixes instead of site-by-site fixes. I prefer to get a single patch from the implementor rather than wait for hundreds of sites to fix themselves. Yet this is an advantage to malicious users too: distribution of a virus/exploit can be very powerful and fast. ... Yeah, really, I sound a bit contradictory. Actually, in my opinion, HTML is better than most ad-hoc markup languages, and HTML with scripts is still better than just HTML. Exactly. And another thing: HTML 5 is about to make HTML pages more powerful; there are going to be menus, datagrids and such, but most of these features are useless without scripting, aren't they? For example, a datagrid isn't really sortable at the client side without a script, which makes it useless in blogs and CMSes unless they allow some scripting. True. 
So, sandbox may be designed to help tighten up security on the web, but we should also try to think of how it's actually used, side effects, etc. It definitely solves problems, but will it cause other problems? How important are they? Of course, there is a lot more to think and talk about. I suppose there are going to be problems with particular buggy implementations of sandboxing and exploits specifically targeted at holes in such implementations. I suspect that web application authors and site administrators will be hesitant to allow user scripting even in sandboxes because of the possible browser bugs. Though, because sandboxes can be useful even if scripting inside them is completely disallowed, I hope that the use of sandboxes becomes somewhat popular even before site administrators decide to allow scripting. True, but I'd test. If it works in major browsers as I want, then why not? Especially in the case of intranet web applications. -- http://www.robodesign.ro ROBO Design - We bring you the future
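The strict getElementById() behaviour debated in this thread could be sketched like this. This is a hypothetical illustration, not any engine's real code; it models elements as plain objects so the lookup logic stands on its own:

```javascript
// Hypothetical strict lookup: `elements` is an array of parsed nodes.
// Normal case: return the single match (or null). Duplicate case: raise
// an error, as proposed for standards mode, instead of silently picking
// the first or last match and hiding the bug.
function getElementByIdStrict(elements, id) {
  const matches = elements.filter(el => el.id === id);
  if (matches.length > 1) {
    throw new Error(`Duplicate id "${id}": ${matches.length} elements`);
  }
  return matches.length === 1 ? matches[0] : null;
}

const doc = [
  { id: "titlebar", tag: "div" },
  { id: "titlebar", tag: "div" }, // the CSS-reuse bug described above
  { id: "footer", tag: "div" },
];

console.log(getElementByIdStrict(doc, "footer").tag); // "div"
// getElementByIdStrict(doc, "titlebar") would throw.
```

Note this only breaks pages at lookup time, matching Alexey's point: documents with duplicate IDs still parse, and only a script that actually asks for a duplicated ID hits the error.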
Re: [whatwg] JSONRequest
On 3/16/06, Hallvord R M Steen [EMAIL PROTECTED] wrote: On 3/11/06, Jim Ley [EMAIL PROTECTED] wrote: Accessing JSON resources on a local intranet which are secured by nothing more than the requesting IP address. While this is a valid concern, I think the conclusion no *new* security vulnerabilities is correct. If you today embed data on an intranet in JavaScript, I can create a page that loads that script in a SCRIPT tag and steal the data. Could you please describe how exactly? The contents of remote script elements are not typically available (and if they are, it's a large security hole today) unless valid JavaScript objects are produced that can be queried, which is not the case with bare JSON. Jim.
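Jim's point about bare JSON can be checked directly: evaluated as a program, a bare JSON object body is not even a valid script (the braces parse as a block, and a string-literal label is a syntax error), and a bare JSON array parses but yields a value nothing on the attacker's page can reach. A small sketch, with eval standing in for what a cross-site script tag does with the response body:

```javascript
// eval here stands in for the browser evaluating a <script src="...">
// response body as a program.
function evaluateAsScript(body) {
  try {
    (0, eval)(body); // indirect eval: global scope, like a script tag
    return "parses, but the resulting value is discarded";
  } catch (e) {
    return "syntax error: not even a valid script";
  }
}

console.log(evaluateAsScript('{"user": "alice"}')); // "syntax error: not even a valid script"
console.log(evaluateAsScript('[1, 2, 3]'));         // "parses, but the resulting value is discarded"
```

By contrast, an intranet resource that wraps its data in an assignment (var data = {...}) really is readable cross-site via a script tag, which is the case Hallvord describes.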
Re: [whatwg] [html5] tags, elements and generated DOM
On Feb 25, 2006, at 01:06, Ian Hickson wrote: On Thu, 7 Apr 2005, Henri Sivonen wrote: I am very hostile towards the idea of requiring UAs to implement any XML parsing features that are in the realm of the XML 1.0 spec but that the XML 1.0 spec does not require. This means processing the DTD beyond checking the internal subset for well-formedness. I would rather suggest that WHATWG specs explicitly discourage people from using a doctype on the XHTML side and point out that authors should not expect UAs to process the DTD. Those who want to use entities for input should parse and reserialize as UTF-8 in their own lair and not expose their entity references (or parochial legacy encodings) to the public network. The spec has text to this effect in places now; let me know if you have more specific text you'd like to see. I don't want to be too strong, since if you're using XML, exactly how you do so is the problem of the XML spec, not the Web Apps / XHTML5 spec. At the end of section 1.8 it says: These XML documents may contain a DOCTYPE if desired, but this is not required to conform to this specification. I'd like to see a note here. Something like this: Note: According to [XML], XML processors are not guaranteed to process the external DTD subset referenced in the DOCTYPE. This means, for example, that using entities for characters is unsafe (except for &lt;, &gt;, &amp;, &quot; and &apos;). For interoperability, authors are advised to avoid optional features of XML. -- Henri Sivonen [EMAIL PROTECTED] http://hsivonen.iki.fi/
Re: [whatwg] JSONRequest
Hallvord R M Steen wrote: You are right, if no variables are created one can't see the data by loading it in a SCRIPT tag. Are you aware of intranets/CMSes that use this as a security mechanism? That's not actually right. I'm pretty sure this came across a public security list, so... You can override the constructor on the prototype of the Object object and get access to JSON objects before the JavaScript engine throws them away when it realises they don't get assigned to a variable. Or something like that, anyway. I can't remember exactly how it worked. But I'm pretty sure that it's true that you can get JSON data if it's not protected. Gerv
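The trick Gerv half-remembers is most likely the prototype-hijacking attack reported around that time: plant a setter on Object.prototype, then load the JSON resource in a script tag so that constructing its object literals fires the setter and leaks the values. A sketch of its shape (hypothetical names; note that engines implementing ES5-and-later semantics define literal properties with [[DefineOwnProperty]], bypassing prototype setters, so this exact version no longer fires in modern engines):

```javascript
// Shape of the historical JSON-hijacking attack. In some 2006-era
// engines, creating {"secret": ...} from a <script>-loaded response
// could trigger a setter planted on Object.prototype.
let stolen = null;
Object.prototype.__defineSetter__("secret", v => { stolen = v; });

// eval stands in for the script tag evaluating the JSON response.
(0, eval)('[{"secret": "s3cr3t"}]');

// Modern engines define literal properties directly, so the setter is
// bypassed and nothing leaks here:
console.log(stolen); // null
delete Object.prototype.secret;
```

This is why unassigned JSON was never a reliable protection on its own: it depended on the JavaScript engine's literal-construction semantics, which is exactly the kind of side channel JSONRequest was proposed to avoid.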