[whatwg] Built Firefox, time to get cracking

2006-03-16 Thread Ric Hardacre
Title says it all really, only took me a few days of trying, heh.
There's little to no chance that anything I do stick in will make it
into the trunk (esp. as I'm only building FX, not SeaMonkey) but it should
all be good clean fun anyway. What does anyone think I should toy with
first? Quite tempted to have a go at some of the forms stuff,
specifically the submit button overrides (action, etc.)



Ric Hardacre
http://www.cyclomedia.co.uk/


Re: [whatwg] Internal character encoding declaration

2006-03-16 Thread Henri Sivonen

On Mar 14, 2006, at 15:07, Peter Karlsson wrote:


Henri Sivonen on 2006-03-14:



Transcoding is very popular, especially in Russia.
In *proxies* *today*? What's the point considering that browsers  
have supported the Cyrillic encoding soup *and* UTF-8 for years?


The mod_charset is not proxying, it's on the server level.


Right. So, as a data point, it neither proves nor disproves the  
legends about transcoding *proxies* around Russia and Japan.


How could proxies properly transcode form submissions coming back  
without messing everything up spectacularly?


That's why the hidden-string technique was invented. Introduce a  
hidden input with a character string that will get encoded  
differently depending on the encoding used. When data comes in, use  
this character string to determine what encoding was used.
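
A minimal sketch of the hidden-string technique described above, as a form handler might apply it. The sentinel character, the list of candidate encodings and the function names are all assumptions made for illustration; none of them comes from this thread.

```python
from typing import Dict, List, Optional

# Hypothetical sentinel and candidate encodings; neither is taken from the thread.
SENTINEL = "Ж"  # Cyrillic capital Zhe, encoded differently by every candidate below
CANDIDATES = ["utf-8", "cp1251", "koi8-r", "iso8859-5", "cp866"]


def build_signatures(sentinel: str, candidates: List[str]) -> Dict[bytes, str]:
    """Map each byte form of the sentinel to the encoding that produced it."""
    signatures: Dict[bytes, str] = {}
    for name in candidates:
        encoded = sentinel.encode(name)
        if encoded in signatures:
            # The sentinel is useless if two encodings produce the same bytes.
            raise ValueError(f"{name} and {signatures[encoded]} encode the sentinel identically")
        signatures[encoded] = name
    return signatures


SIGNATURES = build_signatures(SENTINEL, CANDIDATES)


def detect_encoding(raw_sentinel: bytes) -> Optional[str]:
    """Return the encoding whose byte form matches the submitted hidden field."""
    return SIGNATURES.get(raw_sentinel)


def decode_field(raw_value: bytes, raw_sentinel: bytes) -> str:
    """Decode another field of the same submission with the detected encoding."""
    encoding = detect_encoding(raw_sentinel)
    if encoding is None:
        raise ValueError("sentinel bytes match no candidate encoding")
    return raw_value.decode(encoding)
```

The handler reads the raw bytes of the hidden input first, then decodes every other field with whatever encoding those bytes identify. This only works if the handler sees the undecoded request body.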


I thought that method was for detecting broken browsers and users  
meddling with the encoding menu, and I thought using that method was  
relatively rare.


In order for deploying a transcoding proxy to be safe for a Russian  
ISP, virtually every form handler in Russia would have to take  
countermeasures against the adverse effects of transcoding proxies.  
Are the countermeasures ubiquitous?


Easy parse errors are not fatal in browsers. Surely it is OK for a  
conformance checker to complain that much at server operators  
whose HTTP layer and meta do not match.


I just reacted to the notion of calling such documents invalid. It  
is the transport layer that defines the encoding; whatever the  
document says or what it looks like is irrelevant, and is just  
something that you can look at if the transport layer neglects to  
say anything.


If two layers disagree, it suggests there is a problem and, in my  
opinion, it should be flagged as an error. (Especially considering  
Ruby's Postulate[1].) Operators of transcoding origin servers (or  
reverse proxies which viewed from the Web count as origin servers)  
are free not to send a disagreeing charset meta.


[1] http://intertwingly.net/slides/2004/devcon/69.html
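
A rough sketch of the check argued for above: compare the charset declared at the HTTP layer with the one declared in a meta element, and report a disagreement. The function names and the regular expression are assumptions for illustration. The regex is a crude stand-in for real HTML parsing, and encoding aliases are not normalised.

```python
import re
from email.message import Message
from typing import Optional

# Crude pattern, not the HTML parsing algorithm: finds charset=... inside a <meta> tag.
META_CHARSET = re.compile(
    rb"<meta[^>]+charset\s*=\s*[\"']?([A-Za-z0-9_.:-]+)", re.IGNORECASE
)


def http_charset(content_type_header: str) -> Optional[str]:
    """Extract the charset parameter from a Content-Type header value."""
    message = Message()
    message["Content-Type"] = content_type_header
    return message.get_content_charset()  # lower-cased, or None if absent


def meta_charset(body: bytes) -> Optional[str]:
    """Extract the charset declared in the first matching <meta> element."""
    match = META_CHARSET.search(body)
    return match.group(1).decode("ascii").lower() if match else None


def check_layers(content_type_header: str, body: bytes) -> Optional[str]:
    """Return an error message if both layers declare a charset and they differ."""
    from_http = http_charset(content_type_header)
    from_meta = meta_charset(body)
    if from_http and from_meta and from_http != from_meta:
        return f"HTTP says {from_http!r} but meta says {from_meta!r}"
    return None
```

If only one layer declares an encoding there is nothing to compare, so the sketch reports nothing in that case.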

--
Henri Sivonen
[EMAIL PROTECTED]
http://hsivonen.iki.fi/




Re: [whatwg] Internal character encoding declaration

2006-03-16 Thread Peter Karlsson

Henri Sivonen on 2006-03-16:

Right. So, as a data point, it neither proves nor disproves the legends 
about transcoding *proxies* around Russia and Japan.


The only transcoding proxies I know about are WAP gateways. They tend to do 
interesting things with input, especially when the source doesn't specify 
what it is.


In order for deploying a transcoding proxy to be safe for a Russian ISP, 
virtually every form handler in Russia would have to take countermeasures 
against the adverse effects of transcoding proxies. Are the 
countermeasures ubiquitous?


I haven't investigated, so I don't have a reply to that.

--
\\//
Peter, software engineer, Opera Software

 The opinions expressed are my own, and not those of my employer.
 Please reply only by follow-ups on the mailing list.


Re: [whatwg] The problem of duplicate ID as a security issue

2006-03-16 Thread Alexey Feldgendler
On Wed, 15 Mar 2006 19:26:03 +0600, Mihai Sucan [EMAIL PROTECTED]  
wrote:


Sandboxes are quite special things, so we'll need a DOMSandbox anyway.  
But instead of adding things like getElementById() to the DOMSandbox  
interface, I tend to make the fake document which is visible from  
inside the sandbox a member of the sandbox itself. The call will look  
like sandbox.document.getElementById().
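
A toy model of the scoping proposed above, only to make the idea concrete. No such DOM interface exists; the class names, the `sandbox` element name and the lookup rules are assumptions. The outer document does not look inside a sandbox, and the sandbox's own document sees only its contents.

```python
import xml.etree.ElementTree as ET
from typing import Optional


class ScopedDocument:
    """Looks up IDs below `root` without descending into <sandbox> elements."""

    def __init__(self, root: ET.Element) -> None:
        self._root = root

    def get_element_by_id(self, element_id: str) -> Optional[ET.Element]:
        pending = list(self._root)  # start with the children of the root
        while pending:
            node = pending.pop(0)
            if node.get("id") == element_id:
                return node
            if node.tag != "sandbox":
                # Visit children next, preserving document order.
                pending[0:0] = list(node)
        return None


class Sandbox:
    """Hypothetical sandbox object exposing its own document to the outside."""

    def __init__(self, element: ET.Element) -> None:
        self.element = element
        self.document = ScopedDocument(element)
```

In this model an ID used inside the sandbox cannot shadow the same ID in the host page, whichever comes first in the markup.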


As Ric said, having sandboxes treated too similarly to a document is  
overkill.


A DOMDocument interface has to be exposed to the contained scripts anyway,  
why not also make it accessible from the outside?



(A wild thought: maybe enforce ID uniqueness only for <!DOCTYPE html>?)


I think enforcing ID uniqueness in standards mode would be good, but  
that would still probably break (very?) few pages. Those web authors  
should have to live with it, because they want standards-compliant  
sites.


I'm not speaking about enforcing ID uniqueness at the time of parsing the  
page, but only at the time of calling getElementById(). I believe it will  
break very few pages, if any.


I know that many web applications have bugs like this: they have a CSS  
rule like #titlebar { font-weight: bold; } and a single titlebar on the  
page. After that, the requirements change, and they have more than one  
titlebar on a page. To make the rule apply to all titlebars, they give  
them all the same ID (instead of using class, not ID, in CSS rules). While  
such documents are non-conforming, they should not, in my opinion, cause  
parse errors even in standards mode. Here is why: duplicate IDs are wrong,  
but it's obvious what the author means, and it's easy to do what the  
author intended.


Usually in such applications the scripts don't call getElementById() for  
those ID values which occur more than once. If they occasionally do, it's  
really a programming bug. I don't believe that there are applications that  
really rely on the particular behavior in this case, though I admit that  
there are possibly some that have this bug unnoticed and still work. I  
think that this case should trigger an exception in standards mode  
because, for this bug, there is no obvious fix to apply, and we don't know  
what the author meant -- does he want to do something to the first  
element with the specified ID, the second, or both.
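
A small sketch of the behaviour proposed here, assuming a simplified model in which elements are represented by their tag names. Duplicate IDs are accepted while parsing, and only an ambiguous lookup raises. The class and exception names are made up for the example.

```python
from collections import defaultdict
from html.parser import HTMLParser
from typing import Optional


class AmbiguousIdError(LookupError):
    """Raised when a lookup matches more than one element."""


class IdIndex(HTMLParser):
    """Collects IDs while parsing; duplicates are recorded, not rejected."""

    def __init__(self) -> None:
        super().__init__()
        self.by_id = defaultdict(list)

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "id" and value:
                self.by_id[value].append(tag)

    def get_element_by_id(self, element_id: str) -> Optional[str]:
        matches = self.by_id.get(element_id, [])
        if len(matches) > 1:
            # There is no obvious choice between the candidates, so refuse to guess.
            raise AmbiguousIdError(f"{len(matches)} elements share id {element_id!r}")
        return matches[0] if matches else None
```

A page with several `#titlebar` elements still parses and can still be styled. Only a script that asks for `titlebar` by ID gets the exception.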


Side note and wild guess: We are probably forgetting that the beauty of  
the web is actually allowing everyone to contribute, be it bad code or  
better code. Wanting something *that* strict is like disproving one of  
the essential concepts contributing to the success of the web.


Simply picking the last matching node is actually hiding a bug and letting  
it go unnoticed. (Why the last one? Why not the first, for example?)


And, by the way, blog entries aren't the only place where sandboxing  
can be applied in blogs. For example, LiveJournal allows user-defined  
journal styles which are written by the users in a self-invented  
programming language which outputs HTML. That HTML goes through the  
HTML cleaner afterwards, of course. Many people would love to add  
dynamic menus, AJAX comment folding, etc. to their styles. This could be  
partly solved with a set of predefined toys, but actually the entire  
LiveJournal styling system is about user-initiated development. Those  
with programming skills write new styles, and other users may take and  
use them.


I did not see LiveJournal, so I don't know what kind of features they  
offer.


sandbox would probably do the trick (would help a lot with security  
in this case also).


Yes, I think so. Actually, my activity around the sandboxing idea has been  
inspired by several recent security incidents with LiveJournal and its  
styling system which failed to filter out some patterns of dangerous HTML.


Take HTML, for example, it's a markup language greatly appreciated by  
many and despised by others. Even you said in one reply to this thread  
today's HTML sucks - advocating for the need of allowing user-scripts  
in pages, for having table sorting, popup menus, etc. A few minutes  
later in another reply you say we already have a great markup language,  
which is HTML - advocating for allowing users to write HTML, instead of  
custom markup.


Yeah, really, I sound a bit contradictory. Actually, in my opinion, HTML  
is better than most of ad-hoc markup languages, and HTML with scripts is  
still better than just HTML.


And another thing: HTML 5 is about to make HTML pages more powerful, there  
are going to be menus, datagrids and such, but most of these features are  
useless without scripting, aren't they? For example, a datagrid isn't  
really sortable at client side without a script, which makes it useless in  
blogs and CMS unless they allow some scripting.


So, sandbox may be designed to help tighten up security on the web,  
but we should also try to think of how it is actually used,  
side-effects, etc. It definitely solves 

Re: [whatwg] Internal character encoding declaration

2006-03-16 Thread Ivan Sagalaev

Peter Karlsson wrote:


Transcoding is very popular, especially in Russia.


Ahem... I wouldn't say it is. Only the most, shall we say, conservative 
hosters still insist on these archaic setups and refuse to understand 
that trying to stick everything into windows-1251 has long been unnecessary. But 
overall the nasty thing called 'Russian Apache' is really going away, 
and for good.


Re: [whatwg] The problem of duplicate ID as a security issue

2006-03-16 Thread Mihai Sucan
Le Thu, 16 Mar 2006 13:45:54 +0200, Alexey Feldgendler  
[EMAIL PROTECTED] a écrit:


...
A DOMDocument interface has to be exposed to the contained scripts  
anyway, why not also make it accessible from the outside?


Yes, but I'm afraid it's a technical challenge to implementors. Their  
browser engines might need some rewrites to properly support sandboxing  
content. Therefore, instead of rewrites, they'll hack the sandboxes,  
opening a wide variety of security holes competing for the crown of the  
first web virus.


...
I'm not speaking about enforcing ID uniqueness at the time of parsing  
the page, but only at the time of calling getElementById(). I believe it  
will break very few pages, if any.


I know that many web applications have bugs like this: they have a CSS  
rule like #titlebar { font-weight: bold; } and a single titlebar on  
the page. After that, the requirements change, and they have more than  
one titlebar on a page. To make the rule apply to all titlebars, they  
give them all the same ID (instead of using class, not ID, in CSS  
rules). While such documents are non-conforming, they should not, in my  
opinion, cause parse errors even in standards mode. Here is why:  
duplicate IDs are wrong, but it's obvious what the author means, and  
it's easy to do what the author intended.


Usually in such applications the scripts don't call getElementById() for  
those ID values which occur more than once. If they occasionally do,  
it's really a programming bug. I don't believe that there are  
applications that really rely on the particular behavior in this case,  
though I admit that there are possibly some that have this bug unnoticed  
and still work. I think that this case should trigger an exception in  
standards mode because, for this bug, there is no obvious fix to apply,  
and we don't know what the author meant -- does he want to do  
something to the first element with the specified ID, the second, or  
both.


In no way should this happen. This is adding confusion to an already  
over-confused web author (as in: a web author who doesn't know much web  
development).


Therefore, it's clear nothing has to be changed in quirks mode, but in  
standards mode:


1. break during parsing.
2. break JS code if it sets the id of a node to a duplicate ID.

Or simply leave it as it is: quirks mode behaviour.
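
For contrast with a lookup-time exception, the two standards-mode options listed above can be sketched as follows. This is an illustration under simplifying assumptions, not a description of any browser: IDs are tracked in a plain set, and the class, exception and method names are invented.

```python
from html.parser import HTMLParser


class DuplicateIdError(ValueError):
    """Raised as soon as a second element claims an ID already in use."""


class StrictIdParser(HTMLParser):
    def __init__(self) -> None:
        super().__init__()
        self.seen_ids = set()

    def _claim(self, element_id: str) -> None:
        if element_id in self.seen_ids:
            raise DuplicateIdError(f"duplicate id {element_id!r}")
        self.seen_ids.add(element_id)

    def handle_starttag(self, tag, attrs):
        # Option 1: break during parsing.
        for name, value in attrs:
            if name == "id" and value:
                self._claim(value)

    def set_id(self, old_id: str, new_id: str) -> None:
        # Option 2: refuse a scripted change that would create a duplicate.
        if new_id != old_id:
            self._claim(new_id)
            self.seen_ids.discard(old_id)
```

Here the failure surfaces where the duplicate is introduced rather than where it is looked up, which is the trade-off being argued over in this thread.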

...
Simply picking the last matching node is actually hiding a bug and  
letting it go unnoticed. (Why the last one? Why not the first, for  
example?)


That's true, but this happens in many, many other cases.

...
I did not see LiveJournal, so I don't know what kind of features they  
offer.


sandbox would probably do the trick (would help a lot with security  
in this case also).


Yes, I think so. Actually, my activity around the sandboxing idea has  
been inspired by several recent security incidents with LiveJournal and  
its styling system which failed to filter out some patterns of dangerous  
HTML.


True. As you said, there are risks with buggy sandbox implementations,  
but that's an advantage actually: relying on browser fixes, instead of  
site-by-site fixes. I prefer to get a single patch from the implementor,  
than wait for hundreds of sites to fix them. Yet, this is an advantage to  
malicious users too: distribution of the virus/exploit can be very  
powerful and fast.


...
Yeah, really, I sound a bit contradictory. Actually, in my opinion, HTML  
is better than most of ad-hoc markup languages, and HTML with scripts is  
still better than just HTML.


Exactly.

And another thing: HTML 5 is about to make HTML pages more powerful,  
there are going to be menus, datagrids and such, but most of these  
features are useless without scripting, aren't they? For example, a  
datagrid isn't really sortable at client side without a script, which  
makes it useless in blogs and CMS unless they allow some scripting.


True.

So, sandbox may be designed to help tighten up security on the web,  
but we should also try to think of how it is actually used,  
side-effects, etc. It definitely solves problems, but will it cause  
other problems? How important are they?


Of course, there is a lot more to think and talk about. I suppose there  
are going to be problems with particular buggy implementations of  
sandboxing and exploits specifically targeted at holes in such  
implementations. I suspect that web application authors and site  
administrators will be hesitant to allow user scripting even in  
sandboxes because of the possible browser bugs. Though, because  
sandboxes can be useful even if scripting inside them is completely  
disallowed, I hope that the use of sandboxes becomes somewhat popular  
even before site administrators decide to allow scripting.


True, but I'd test. If it works in major browsers as I want, then why not?  
Especially in the case of intranet web applications.



--
http://www.robodesign.ro
ROBO Design - We bring you the future


Re: [whatwg] JSONRequest

2006-03-16 Thread Jim Ley
On 3/16/06, Hallvord R M Steen [EMAIL PROTECTED] wrote:
 On 3/11/06, Jim Ley [EMAIL PROTECTED] wrote:

  Accessing JSON resources on a local intranet which are
  secured by nothing more than the requesting IP address.

 While this is a valid concern, I think the conclusion "no *new*
 security vulnerabilities" is correct. If you today embed data on an
 intranet in JavaScript I can create a page that loads that script in a
 SCRIPT tag and steal the data.

Could you please describe how exactly? The contents of remote script
elements are not typically available (and if they are, it's a large
security hole today) unless valid JavaScript objects are produced to
be queried; that is not the case with bare JSON.

Jim.


Re: [whatwg] [html5] tags, elements and generated DOM

2006-03-16 Thread Henri Sivonen

On Feb 25, 2006, at 01:06, Ian Hickson wrote:


On Thu, 7 Apr 2005, Henri Sivonen wrote:


I am very hostile towards the idea of requiring UAs to implement any
XML parsing features that are in the realm of the XML 1.0 spec but
that the XML 1.0 spec does not require. This means processing the DTD
beyond checking the internal subset for well-formedness.

I would rather suggest that What WG specs explicitly discourage people
from using a doctype on the XHTML side and point out that authors
should not expect UAs to process the DTD.

Those who want to use entities for input, should parse and reserialize
as UTF-8 in their own lair and not expose their entity references (or
parochial legacy encodings) to the public network.


The spec has text to this effect in places now; let me know if you
have more specific text you'd like to see. I don't want to be too
strong, since if you're using XML, exactly how you do so is the
problem of the XML spec, not the Web Apps / XHTML5 spec.


At the end of section 1.8 it says:
These XML documents may contain a DOCTYPE if desired, but this is  
not required to conform to this specification.


I'd like to see a note here. Something like this:
Note: According to [XML], XML processors are not guaranteed to  
process the external DTD subset referenced in the DOCTYPE. This  
means, for example, that using entities for characters is unsafe  
(except for &lt;, &gt;, &amp;, &quot; and &apos;). For  
interoperability, authors are advised to avoid optional features of XML.
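
To illustrate the risk the proposed note warns about, here is a small sketch using Python's `xml.etree.ElementTree`, which does not fetch the external DTD subset. The DOCTYPE and the `parses` helper are chosen for the example only. The predefined entities and numeric character references parse, while a DTD-defined entity such as `&nbsp;` does not.

```python
import xml.etree.ElementTree as ET

# Example DOCTYPE pointing at an external DTD that this parser never reads.
XHTML_DOCTYPE = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">'
)


def parses(body: str) -> bool:
    """Report whether a document containing `body` parses without the external DTD."""
    document = f"{XHTML_DOCTYPE}<html><body><p>{body}</p></body></html>"
    try:
        ET.fromstring(document)
    except ET.ParseError:
        # Entities declared only in the external subset are unknown to the parser.
        return False
    return True
```

With this helper, `parses("a &lt; b")` and `parses("&#160;")` succeed, while `parses("&nbsp;")` fails, so a numeric reference or the literal character in UTF-8 is the safer choice.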


--
Henri Sivonen
[EMAIL PROTECTED]
http://hsivonen.iki.fi/




Re: [whatwg] JSONRequest

2006-03-16 Thread Gervase Markham
Hallvord R M Steen wrote:
 You are right, if no variables are created one can't see the data by
 loading it in a  SCRIPT tag. Are you aware of intranets/CMSes that use
 this as a security mechanism?

That's not actually right. I'm pretty sure this came across a public
security list, so...

You can override the constructor on the prototype of the Object object
and get access to JSON objects before the JavaScript engine throws them
away when it realises they don't get assigned to a variable.

Or something like that, anyway. I can't remember exactly how it worked.
But I'm pretty sure that it's true that you can get JSON data if it's
not protected.

Gerv