Re: [whatwg] HTML 5 vs. XHTML 2.0

2008-02-12 Thread James Graham

Brian Smith wrote:

Ian Hickson wrote:
  

On Sat, 13 Nov 2004, Henri Sivonen wrote:

Anyway, I do think it's a problem for styling, automatic content 
extraction and non-CSS presentation that HTML lacks the markup for 
indicating which parts of the page are content proper and which are 
navigation and other chrome. Therefore, a footer element for isolating 
navigation and legal stuff from content would make sense. (Already 
suggested in 
http://lists.w3.org/Archives/Public/www-html/2002Aug/0229.html at the 
end of the message.)
  

I hope the nav, footer, and article elements help this case.



How should advertisements be marked up?
  
It's worth considering that an advert element (or banner or whatever 
you decide to call it) would just cause style rules like advert 
{display:none;} to become widespread (e.g. by integration into Adblock 
and equivalent). Therefore I can't see this type of markup being used by 
most advertisers.


Re: [whatwg] Fixed a security problem with postMessage()

2008-02-12 Thread Jeff Walden

Ian Hickson wrote:

 * message.domain isn't actually enough to verify any security, given that
on shared hosts one IP address can map to several hostnames and thus people 
can end up running servers on different ports that respond to requests from 
domains they don't own.

 * message.uri can leak information, e.g. if the user's password is in the
query component of the URI.


Good catches on both; I agree these changes make sense.



I've replaced both with .origin, which is intended to return the 
scheme://hostname/ or scheme://hostname:port/ (when the port is non-standard) 
of the origin of the source document.


I assume you meant without the trailing slash, given that that's actually part 
of the path?


This doesn't sound like it should be too hard to implement, although the manual 
splicing-out of the username/password from the origin is slightly worrying (if 
entirely necessary) from a careful-manipulation-is-tricky point of view.  I 
don't see any other option, tho, on that point.
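
For illustration only, here is a minimal sketch of the origin string discussed 
above: scheme://hostname, plus :port when the port is non-standard, with no 
trailing slash and no username/password. It assumes a modern runtime that 
provides the URL class; the helper name originOf is hypothetical and is not 
taken from the spec.

```javascript
// Hypothetical helper: derive "scheme://host[:port]" from a full URI.
function originOf(uri) {
  const parsed = new URL(uri);
  // parsed.protocol includes the trailing colon (e.g. "http:").
  // parsed.host omits the port when it is the scheme's default and never
  // contains the username or password, so nothing is spliced out by hand.
  return `${parsed.protocol}//${parsed.host}`;
}

console.log(originOf("http://user:secret@example.com:80/path?pw=1"));
console.log(originOf("https://example.com:8443/a/b"));
```

Letting the URL parser do the work avoids the manual string manipulation that 
the message above is worried about.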

Jeff


Re: [whatwg] postMessage and serialization

2008-02-12 Thread Ian Hickson
On Mon, 11 Feb 2008, Aaron Boodman wrote:

 Has the topic of automatic serialization and deserialization of objects 
 passed across postMessage() come up already? It seems like boolean, 
 number, string, arrays, and objects should be supported.
 
 I realize that you can just use a json library, but I wonder why we 
 should force every application that wants to use postMessage() to 
 include a json library when the browser can just do the right thing 
 automatically.

This was originally how the DOM Storage API worked, but there was 
significant pushback on this, resulting in the current string-only 
approach. When I came to writing the postMessage API, I considered that 
feedback and decided not to bother even trying.

Passing booleans, numbers and strings is trivial using the current API. 
Passing arrays of booleans and numbers is trivial too.

Passing objects, or arrays of strings, arrays, or objects, is more 
complex, but as you point out, it can be done using JSON libraries. Since 
it is likely that JSON will be supported natively by UAs in due course, it 
seems better to wait for that support rather than adding type support to 
postMessage().

It seems that most messages will consist either of simple strings, or of 
complex data structures (objects). Reconstructing JS objects is not a 
trivial operation; you have to worry about references into other parts of 
the structure, getters and setters that hang or throw or return infinite 
arrays, functions, members that aren't enumerable, etc. I'd rather not go 
down that rat hole with v1.
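
As a rough sketch of the string-only approach, assuming a native JSON object 
is available (JSON.stringify and JSON.parse), a structured message can be 
encoded before posting and decoded on receipt. The helper names encodeMessage 
and decodeMessage are hypothetical. The last part shows one of the hard cases 
listed above: a structure that references itself cannot be encoded.

```javascript
// Hypothetical helpers wrapping a string-only message channel.
function encodeMessage(value) {
  return JSON.stringify(value);
}

function decodeMessage(text) {
  return JSON.parse(text);
}

// Plain data survives the round trip.
const sent = encodeMessage({ type: "move", path: ["a", "b"], ok: true });
const received = decodeMessage(sent);
console.log(received.path[1]);

// Functions are silently dropped from objects.
const withFunction = encodeMessage({ run() {} });

// A cyclic structure makes JSON.stringify throw a TypeError.
const cyclic = { name: "loop" };
cyclic.self = cyclic;
let cyclicError = null;
try {
  encodeMessage(cyclic);
} catch (error) {
  cyclicError = error;
}
```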

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Re: [whatwg] Referer header sent with a ping?

2008-02-12 Thread Kornel Lesinski

On Tue, 12 Feb 2008 21:54:25 -, Philip Taylor [EMAIL PROTECTED] wrote:

It's quite a different situation when the Referer is used as a security  
measure in deciding to trust a user's request, where false negatives can  
have significant consequences (like editing data via cross-site request  
forgery). That is the situation where a ping mustn't introduce new  
risks.


I looked for some examples of code that checks the Referer for security,  
and found:

[...]

That's interesting. In that case the attack outlined on Mozilla's list is even  
less likely to succeed than I thought. So maybe a less abusive approach  
would suffice:


* if ping is cross-domain, always send Referer
* if ping originates from the same domain, don't send any Referer at all
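
The two rules above could be sketched like this. It is only an illustration: 
the function name refererForPing is hypothetical, "domain" is assumed to mean 
the URL's hostname, and the URL class of a modern runtime is assumed.

```javascript
// Hypothetical sketch: decide which Referer (if any) a ping request carries.
// Returns the document URL for cross-domain pings and null for same-domain ones.
function refererForPing(documentUrl, pingUrl) {
  const from = new URL(documentUrl);
  const to = new URL(pingUrl);
  const sameDomain = from.hostname === to.hostname;
  return sameDomain ? null : documentUrl;
}

console.log(refererForPing("http://example.com/page", "http://tracker.example.net/ping"));
console.log(refererForPing("http://example.com/page", "http://example.com/ping"));
```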

--
regards, Kornel Lesiński


Re: [whatwg] Web Forms 2.0 Feedback

2008-02-12 Thread Ian Hickson
On Wed, 8 Dec 2004, Matthew Thomas wrote:
 On 8 Dec, 2004, at 3:19 PM, Ian Hickson wrote:
  On Wed, 10 Nov 2004, Matthew Thomas wrote:

In the current spec, optgroup may be nested, but this doesn't 
imply hierarchical menus like in the HTML4 spec. It would just mean 
indented options under headings, like you see in Windows sometimes
 
  When I said Windows, I meant the OS. I think the Windows XP Printer 
  window has this (but I don't have it at hand to check).
 
 http://www.uwec.edu/help/WinXP/print.htm
 http://www.wellesley.edu/Computing/WinXP/printing.html
 I don't see it ... Maybe I'm looking at examples with not enough printers.

http://junkyard.damowmow.com/304


  (One of the main reasons I haven't yet specified the tree control in 
  the Web Apps draft is that I can't work out how to make it support the 
  basic things a tree control needs to support while still having some 
  sort of backwards-compatibility story, btw.)
 
 <select id="wiblet" initialsort="flavor">
   <shead>
     <sh data="Name">
     <sh data="Size" sortorder="S, M, L, XL, XXL, XXXL">
     <sh id="flavor" data="Flavour">
   </shead>
   <sbody>
     <option value="foo" icon="foo.jpg">Foo
       <sd data="M"><sd data="Vanilla">
     <option value="bar" icon="bar.png">Bar
       <sd class="strange" data="XOS"><sd data="Vanilla">
     <option value="hum" iconcolor="#a06033">Hum
       <sd data="XXL"><sd data="Caramel">
     <optgroup label="Adjectives">
       <option value="squiggly" icon="squiggle.png">Squiggly
         <sd data="S"><sd data="Strawberry">
       <option value="hum" icon="dunce.png">Unfortunate
         <sd data="XL"><sd data="Hokey Pokey">
     </optgroup>
   </sbody>
 </select>
 
 Some things might need to be tweaked (such as the names of the new 
 elements and attributes, and the possible necessity of /option for 
 forward compatibility); but I don't see any backward compatibility 
 problems, other than that authors may mistakenly put essential data in 
 non-primary columns.

The big problem with the above is that it doesn't let you put select 
drop-downs into the treegrid, something which is relatively common. The 
spec now covers this, anyway.


   (For example: You can search for DOCTYPEs all day at w3.org without 
   finding one page that lists them all. And when you do hunt down a 
   DOCTYPE (generally in relation to a particular Recommendation or 
   Working Draft), it's often one that won't work on your site. 
   http://alistapart.com/articles/doctype/)
  
  The answer to that, of course, is for us to drop DOCTYPEs altogether, 
  which I suggest we do in any XML-based version.
 
 Are you suggesting that it is also desirable in SGML-based versions? (A 
 doctype will help UAs distinguish between, for example, HTML 3.2's 
 <menu> and HTML 5's <menu>. That would be just the first example of 
 redefinition in what is a very young language by linguistic standards.)

<menu> in HTML5 is a superset of HTML 3.2's, so it doesn't need to be 
distinguished. Of course, there are no SGML versions of HTML5 defined in 
the HTML5 spec, so the point here may be moot now.


  All the presentational stuff will likely be deprecated, but I don't 
  really see any sensible way in which we can drop semantic markup 
  elements. For example, I'd love to drop <div> and <tfoot>, but I don't 
  see any sensible way to do that. It would likely lose the goodwill of 
  many authors.
 
 Even if goodwill was irrelevant, if you made HTML semantically complete 
 enough to drop <div>, I guarantee you would have added too many block 
 elements for authors to choose the correct one anything like most of the 
 time. <div>, <b>, <i>, <sup>, <sub>, and <span> might be presentational 
 tofu, but they keep HTML from being too complex, and that's important.

<div> is, to my regret, still in the spec, for this very reason. So are 
<b>, <i>, <sup>, and <sub>, though not necessarily in a strictly 
presentational sense.

Cheers,
-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Re: [whatwg] textContent and bidi

2008-02-12 Thread Ian Hickson
On Fri, 1 Dec 2006, fantasai wrote:
 
 BTW, for 'dir', I think the inheritance behavior should be a little more 
 clearly specified:
 
 # If the attribute is omitted or has another value, then the directionality
 # is unchanged.
 
 Unchanged from what?

This changed a while ago to say "same as its parent".


 You should specify that the initial value (i.e. what html gets in the 
 absence of any other declarations) is ltr.

Done.
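
A small sketch of the inheritance rule as described in these two replies: an 
element with dir set to ltr or rtl uses that value, any other element takes 
the directionality of its parent, and the root falls back to ltr. The node 
shape ({ dir, parent }) and the function name directionality are assumptions 
made for the example, not spec text.

```javascript
// Hypothetical model: each node is { dir, parent }, where parent is null
// for the root element.
function directionality(node) {
  for (let current = node; current; current = current.parent) {
    if (current.dir === "ltr" || current.dir === "rtl") {
      return current.dir;
    }
    // Omitted or unrecognised value: keep looking at the parent.
  }
  // Nothing declared anywhere up the tree: the initial value.
  return "ltr";
}

const root = { dir: undefined, parent: null };
const section = { dir: "rtl", parent: root };
const paragraph = { dir: "bogus", parent: section };

console.log(directionality(root));
console.log(directionality(paragraph));
```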


 You should also somehow make a reference (either directly or indirectly) 
 to UAX 9. Your text doesn't anywhere specify what directionality 
 means. The closest we get is an example referencing CSS2.1's sample 
 style sheet. Saying only that the processing layer is responsible for 
 handling the 'dir' attribute not only doesn't give enough information 
 about what is expected, it also doesn't normatively require use of the 
 Unicode bidi algorithm rather than some other bidi algorithm. (Other 
 bidi algorithms do exist.)

It's not clear to me what would need to be defined. Is there some reason 
that a UA couldn't use another bidi algorithm?


 # For example, CSS 2.1 defines a mapping from this attribute to the CSS 
 # 'direction' and 'unicode-bidi' properties
 
 s/defines/suggests/

 The sample style sheet is informative. The only normative text about 
 handling 'dir' in CSS2.1 is a sentence that says bidi behavior in HTML 
 is defined by the HTML 4 specification.

It does define the mapping, whether it's normative or not.

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Re: [whatwg] [WA1] Insignificant white space

2008-02-12 Thread Ian Hickson
On Fri, 8 Jul 2005, Robin Berjon wrote:
 fantasai wrote:
  # The whitespace characters U+0020 SPACE, U+000A LINE FEED, and U+000D
  # CARRIAGE RETURN are always allowed between elements.
  
  What about U+0009 TAB?
 
 And NEL and VERTICAL TAB?

All the above except NEL have now been space characters for a while.

NEL isn't, mostly because in practice nobody uses it, and adding new space 
characters is moderately expensive. UAs with different sets of space 
characters will end up with different behaviour, e.g. in processing the 
class attribute. It's also desirable for us to have the raw syntax be 
a pure subset of ASCII, so that you can safely code HTML parsers and be 
certain that they won't parse documents syntactically differently based 
just on whether the encoding was correctly guessed or not (so long as 
you're within a subset of ASCII).
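
To make the class attribute point concrete, here is a sketch that splits a 
value on the space characters named in this thread (SPACE, TAB, LINE FEED, 
VERTICAL TAB, CARRIAGE RETURN) and leaves NEL (U+0085) alone. The exact 
membership of the set is taken from this message and should be treated as an 
assumption; the names SPACE_CHARACTERS and splitOnSpaces are hypothetical.

```javascript
// Space characters as listed in this thread; NEL (U+0085) is deliberately absent.
const SPACE_CHARACTERS = /[\u0020\u0009\u000A\u000B\u000D]+/;

// Split an attribute value into tokens, dropping empty leading/trailing pieces.
function splitOnSpaces(value) {
  return value.split(SPACE_CHARACTERS).filter((token) => token.length > 0);
}

console.log(splitOnSpaces(" nav\tmain\nfooter  "));
// NEL is not a separator here, so this stays a single token.
console.log(splitOnSpaces("nav\u0085main").length);
```

A UA that also treated NEL as a space would see two class names in the second 
case, which is the kind of divergence described above.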

Cheers,
-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Re: [whatwg] A potential slight security enhancement to postMessage

2008-02-12 Thread Ian Hickson

On Wed, 30 Jan 2008, Collin Jackson wrote:

 Here is a suggestion for a backwards-compatible addition to the 
 postMessage specification:
 
 Currently postMessage is great for sending authenticated messages 
 between frames. The receiver knows exactly where each message came from. 
 However, it doesn't provide any confidentiality guarantees. When you're 
 posting a message to a window, you have no way of knowing who is 
 listening on the other end, because the same-origin policy prevents you 
 from reading the domain and URI of that window. The window may have been 
 showing a page loaded from foo.com the last time you received a message 
 from it, but it might be displaying content from bar.com now; if you 
 send it a message, you don't know whether the message will be received by 
 foo.com or bar.com.
 
 For non-security-sensitive messages, like "change your font color to 
 red", confidentiality might not be needed. However, if the message 
 you're trying to send contains a password, it would be nice to be able 
 to specify which domain you're trying to send it to.
 
 The postMessage API could be extended to provide confidentiality by 
 adding some optional arguments:
 
 void postMessage(in DOMString message, [optional] in DOMString domain, 
 [optional] in DOMString uri);

Done, using just 'origin'.


On Fri, 1 Feb 2008, Collin Jackson wrote:
 
 You can try it out here: 
 http://crypto.stanford.edu/websec/post-message/challenge-response/.
 
 This turned out to be slightly tricky. To send a single message, the 
 sender has to first post a message to the recipient. The recipient then 
 responds. At this point, during the execution of this callback, the 
 domain and uri attributes of the event are accurate and the sender can 
 safely send the message. There are a number of gotchas, which we think 
 we've handled correctly, but it's hard to be sure. In the end, it would 
 be much simpler and less error-prone to write this as a single line of 
 code:
 
 frames[0].postMessage(message, "theory.stanford.edu");

You now have to say:

   frames[0].postMessage(message, "http://theory.stanford.edu");


Note that as defined, this:

   frames[0].postMessage(message, "http://example.com/victim");

...will allow messages to be sent to, e.g. "http://example.com/evil".
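
The reason for that last point can be sketched as follows, assuming the check 
compares only the scheme and host (with port) of the target argument against 
the recipient document, and ignores the path. The function wouldDeliver is a 
hypothetical model of that check, not the spec's algorithm, and it relies on 
the URL class of a modern runtime.

```javascript
// Hypothetical model of an origin-only delivery check: the path component
// of either URL plays no part in the decision.
function wouldDeliver(targetOrigin, recipientUrl) {
  const target = new URL(targetOrigin);
  const recipient = new URL(recipientUrl);
  return target.protocol === recipient.protocol && target.host === recipient.host;
}

// Same scheme and host, different path: still delivered.
console.log(wouldDeliver("http://example.com/victim", "http://example.com/evil"));
// Different host: not delivered.
console.log(wouldDeliver("http://example.com/victim", "http://other.example/victim"));
```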

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Re: [whatwg] A potential slight security enhancement to postMessage

2008-02-12 Thread Ian Hickson
On Wed, 30 Jan 2008, Jeff Walden wrote:

 I briefly wrote up some documentation on postMessage for the Mozilla 
 Developer Center:
 
 http://developer.mozilla.org/en/docs/DOM:window.postMessage
 
 If you pull it up, you'll note two places where I include big, huge, 
 overbearing, somewhat-exaggerating injunctions about first checking the 
 domain/uri/source properties of the received message before trusting the 
 sent data.
 
 Writing those got me thinking: what if we could enforce not touching 
 the data before verifying the sender's identity?  Specifically, what if 
 we required that either .domain or .uri be read prior to allowing .data 
 to be successfully accessed, say, without throwing a security error?  
 (No reason comes to mind for .source to participate in this scheme, 
 either to throw or to allow access to .data, but I haven't given it 
 serious thought.)  This would prevent unknowing misuse of this 
 functionality, and safe uses wouldn't be affected.  I think this would 
 only apply to the event dispatched by postMessage, not to MessageEvent, 
 as the latter is same-origin and there's no harm to a same-origin 
 MessageEvent.
 
 Thoughts?  A no-harm slight increase of the ability to prevent incorrect 
 use of postMessage, or excessive nannying?

I think most uses of this are actually going to be accepting connections 
from any domain. I don't think it makes sense to require people who are 
just exposing an interface to the world to go through hoops like this.


On Wed, 30 Jan 2008, Maciej Stachowiak wrote:
 
 The more convenient version of that would be to require clients to 
 describe allowed senders when registering for the event in some way. 
 That would seem more like a convenience and less like a hoop to jump 
 through.

That's an option, but I fear that people would just end up calling 
allowAnyone() (or whatever we call it) in a cargo-cult fashion, 
undermining any security benefit.


On Thu, 31 Jan 2008, Jeff Walden wrote:
 
 The key, tho, is that this really isn't a hoop to jump through.  
 Excluding toy public message board demos, can you describe a use case 
 for postMessage where it is not necessary to know the identity of the 
 sender?  To know the identity you have to check domain or uri, and 
 there's no reason not to do that before getting the sent data.

Almost all cases I intend to use this API for are open to anyone to embed. 
Game components, widgets, etc, none of them really care who is embedding 
them.


On Thu, 31 Jan 2008, Aaron Boodman wrote:
 
 Not necessarily. You could do something like this:
 
 window.createMessageReceiver("http://www.google.com")
 .addEventListener("post-message", function() {
   ...
 }, false);
 
 Could probably come up with a better method name, and I forget the name 
 of the event to use with PostMessage, but I hope you get the idea.
 
 I like Maciej's suggestion of making it a natural part of the interface. 
 If you tell people they have to read x property before y property, they 
 will just do:
 
 // spec says we have to read this first
 var foo = event.domain;
 alert(event.message);

But then people will just end up doing:

   window.createMessageReceiver("*").addEventListener('message', foo, false);

...without understanding what the createMessageReceiver() part is about. I 
don't think we'd really gain anything, except for slightly slowing things.
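
For receivers that do care who is sending, the check can live in ordinary 
script rather than in a new API. The sketch below assumes the event exposes 
origin and data properties; guardByOrigin and the origins used are 
hypothetical names chosen for the example.

```javascript
// Hypothetical wrapper: only pass event.data to the handler when the
// sender's origin is on an explicit allow list.
function guardByOrigin(allowedOrigins, handler) {
  return function (event) {
    if (!allowedOrigins.includes(event.origin)) {
      return false; // ignore messages from unknown senders
    }
    handler(event.data);
    return true;
  };
}

const seen = [];
const listener = guardByOrigin(["http://www.example.com"], (data) => seen.push(data));

// In a page this would be registered with:
//   window.addEventListener("message", listener, false);
// Here it is exercised with stand-in event objects instead.
listener({ origin: "http://www.example.com", data: "hello" });
listener({ origin: "http://evil.example", data: "ignored" });
console.log(seen);
```

A widget that is open to anyone simply skips the wrapper and registers its 
handler directly.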

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Re: [whatwg] createImageData - new ImageData() ?

2008-02-12 Thread Kornel Lesinski

On Sun, 10 Feb 2008 23:25:51 -, Ian Hickson [EMAIL PROTECTED] wrote:


That would mean that passing ImageData around between two canvas
elements doesn't always work as expected. I think that's highly
undesirable. Is there any implementation where we know this will be the
case?


Not today, but why preclude it? Implementations could get higher quality
renderings on canvases that get resized dynamically, by using a bigger
backing store. What's wrong with what we have now?


It's very easy to write code which assumes that the size of the ImageData is
the same as the size given by the canvas width and height (I know, because I
did it :)


Since the ratio is system/browser-dependent and 200dpi screens aren't
popular yet, such a bug may be easily overlooked/ignored and get widely
deployed.


The difference in sizes isn't intuitive. For example, if a browser doubled
the number of pixels in ImageData only when the user used zoom, I think
authors would rather think that the browser's zoom is buggy.


I think that by default getImageData should return data with the same size as
specified in the width/height attributes of the canvas. There might be another
method (getImageScreenData?) or a method argument that returns all pixels of
the canvas.


ImageData can be made portable between canvases by adding an aspect ratio
field or additional width/height fields given in CSS pixels.
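
As an illustration of code that does not make the risky assumption, the sketch 
below loops over the ImageData's own width and height rather than the canvas 
attributes. The function name invertPixels is hypothetical, and the object 
passed in only needs width, height and an RGBA data array.

```javascript
// Invert the colour channels of every pixel, using the dimensions reported
// by the ImageData itself (which may differ from the canvas width/height).
function invertPixels(imageData) {
  const { width, height, data } = imageData;
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      const i = (y * width + x) * 4; // 4 bytes per pixel: R, G, B, A
      data[i] = 255 - data[i];
      data[i + 1] = 255 - data[i + 1];
      data[i + 2] = 255 - data[i + 2];
      // data[i + 3] (alpha) is left unchanged
    }
  }
  return imageData;
}

// Stand-in for the result of getImageData(): a 2x1 image.
const sample = {
  width: 2,
  height: 1,
  data: new Uint8ClampedArray([0, 10, 20, 255, 255, 245, 235, 128]),
};
invertPixels(sample);
console.log(Array.from(sample.data));
```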


--
regards, Kornel Lesiński


Re: [whatwg] Fixed a security problem with postMessage()

2008-02-12 Thread Ian Hickson
On Tue, 12 Feb 2008, Jeff Walden wrote:
 
 I assume you meant without the trailing slash, given that that's 
 actually part of the path?

Yes.

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Re: [whatwg] SQL storage and onunload

2008-02-12 Thread Geoffrey Garen
It seems to be a natural idea to save Web application state from an  
unload event handler. But is it guaranteed that client-side database  
API is still functional at this point? And if it is - can one queue  
up more statements and/or transactions from statement callbacks?


I see two options here:

1. Delay leaving the page indefinitely, until all outstanding database  
operations have completed.


2. Leave the page immediately, canceling all outstanding database  
operations.


Option #1 seems undesirable because it allows a malicious or poorly  
programmed website to hijack the browser. That's a pretty bad user  
experience -- one that the database API's asynchronous callbacks were  
specifically designed to avoid.


Option #2 is not ideal, but it's workable. A beforeunload event  
handler can detect unsaved changes and prompt the user to cancel the  
navigation and save. Once the save succeeds, the page can update its  
UI to indicate that navigation is OK.


So, I recommend option #2.
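
A sketch of the beforeunload approach from option #2, under the assumption 
that the page tracks its own "unsaved changes" flag. The names 
makeBeforeUnloadHandler and dirty are hypothetical. Browsers show their own 
prompt wording, so the string below only serves to request the prompt.

```javascript
// Build a beforeunload handler that asks the browser to confirm navigation
// only while there are unsaved changes.
function makeBeforeUnloadHandler(hasUnsavedChanges) {
  return function (event) {
    if (!hasUnsavedChanges()) {
      return undefined; // nothing pending: let the navigation proceed
    }
    event.preventDefault();
    event.returnValue = "You have unsaved changes.";
    return event.returnValue;
  };
}

// The page sets this to false once the database save has succeeded.
let dirty = true;
const unloadHandler = makeBeforeUnloadHandler(() => dirty);

// Only register when running in a browser.
if (typeof window !== "undefined") {
  window.addEventListener("beforeunload", unloadHandler);
}
```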

Geoff

On Feb 8, 2008, at 1:44 AM, Alexey Proskuryakov wrote:



 There need to be some limits put on this, as otherwise a script  
could continue to use resources indefinitely after a browser window  
is closed. But I do not see where it is specified, explicitly or  
implicitly.


- WBR, Alexey Proskuryakov





[whatwg] Fixed a security problem with postMessage()

2008-02-12 Thread Ian Hickson

While going through the feedback for postMessage(), I noticed a couple of 
security problems that nobody had raised:

 * message.domain isn't actually enough to verify any security, given that 
   on shared hosts one IP address can map to several hostnames and thus 
   people can end up running servers on different ports that respond to 
   requests from domains they don't own.

 * message.uri can leak information, e.g. if the user's password is in the 
   query component of the URI.

Basically, .domain is too little, and .uri is too much.

I've replaced both with .origin, which is intended to return the 
scheme://hostname/ or scheme://hostname:port/ (when the port is 
non-standard) of the origin of the source document.

It's still vague for data: URIs, etc; I have outstanding feedback on that 
matter and will address that when I respond to that feedback.

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Re: [whatwg] Issues concerning the base element and xml:base

2008-02-12 Thread Ian Hickson
On Mon, 13 Aug 2007, Jonas Sicking wrote:
 Ian Hickson wrote:
   Also, if we're going to be inconsistent in how current browsers and web
   pages handle multiple bases, why not simply use the first base for
   both href= and target=?
  
  Done.
 
 I realized another limitation. It is very hard for implementations to 
 'correctly' deal with dynamic modifications to bases. Ideally all 
 external resources, such as iframes, imgs, css backgrounds, 
 svg:use elements and css @imports should be updated to potentially use 
 new URIs. This can happen when base elements and xml:base attributes 
 are inserted or mutated. So far no UA that I know of does this, and it 
 would be very hard to implement.

I completely agree with you that this is an area that is problematic.


 What I suggest is that we add similar language as the XBL spec does for 
 xmlns attributes and xbl:attr attributes. Say that dynamic modifications 
 are allowed, but that the implementation is not expected to update the 
 resolved URI unless the URI is explicitly touched.

Unfortunately it's unclear when that would be. At least with xbl:attr we 
have a somewhat well-defined set of steps for when things happen. Here 
it's far less clear. For example, clicking a link is likely to reresolve 
the URI relative to the base URI. Maybe even hovering it might. Or maybe 
even just a repaint in general.

I'm not sure what to do here. It seems like UAs should support a 
notification mechanism so that when a base URI is changed, all URIs in the 
document (for base) or in that subtree (for xml:base) get reresolved. 
That actually seems relatively simple and has little (no) overhead in the 
common case of nothing being changed.
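
A toy model of such a notification mechanism, purely as a sketch: every URI is 
remembered in its relative form, and a base change walks the list and 
reresolves each one. The class UriRegistry and its methods are hypothetical; 
no UA is claimed to work this way, and the URL class of a modern runtime is 
assumed for resolution.

```javascript
// Hypothetical registry of URIs that depend on a single base URI.
class UriRegistry {
  constructor(baseUri) {
    this.baseUri = baseUri;
    this.entries = [];
  }

  // Record a relative URI and resolve it against the current base.
  add(relative) {
    const entry = { relative, resolved: new URL(relative, this.baseUri).href };
    this.entries.push(entry);
    return entry;
  }

  // Called when the base changes: reresolve everything that was registered.
  // If the base never changes, this never runs and there is no extra cost.
  setBase(newBase) {
    this.baseUri = newBase;
    for (const entry of this.entries) {
      entry.resolved = new URL(entry.relative, newBase).href;
    }
  }
}

const registry = new UriRegistry("http://example.com/a/");
const image = registry.add("img.png");
console.log(image.resolved);
registry.setBase("http://example.com/b/");
console.log(image.resolved);
```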

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Re: [whatwg] HTML 5 vs. XHTML 2.0

2008-02-12 Thread Brian Smith
James Graham wrote:
 Brian Smith wrote:
  How should advertisements be marked up?

 It's worth considering that an advert element (or banner 
 or whatever you decide to call it) would just cause style 
 rules like advert {display:none;} to become widespread (e.g. 
 by integration into Adblock and equivalent). Therefore I 
 can't see this type of markup being used by most advertisers.

Exactly. Right now it is very difficult to build a user agent that
displays advertisements in a fashion other than what the advertiser
intends. Advertisements are marked up in many different, incompatible
ways. If a simple, easy-to-implement mechanism for marking up
advertisements was standardized and deployed, then building a web
browser with good support for advertisements would be much easier.

If there was an advert element or equivalent, then its use would
quickly become a mandatory accessibility requirement, and its use would
pretty much be required by any site built by anybody with any money to
lose. Similarly, any jurisdiction with truth in advertising laws would
also require the use of such a construct.

However, I don't recommend an advert element. A role='advertisement'
attribute would be better, because then it could be applied to CSS
files, script files, flash content, images, video, etc.; then a smart
web browser wouldn't download that stuff at all if the user didn't want
to see it, or it could prioritize the downloading/display of other
content higher than that of advertisements.

Ian, in your analysis of existing web content, did you find many
instances of advertising?

- Brian