Re: [whatwg] [html5] r3859 - [acgiowt] (2) Parser changes: dc, ds, dialog are now treated differently. [...]

2009-09-16 Thread Ian Hickson
On Tue, 15 Sep 2009, Smylers wrote:

 wha...@whatwg.org writes:
 
  +  <p>The <span class=impl>first</span> <code><a 
  href=#the-dt-element>dt</a></code> element child
  +  of the element, if any, represents the caption of the
  +  <code><a href=#the-figure-element>figure</a></code> element's contents. 
  If there is no child
  +  <code><a href=#the-dt-element>dt</a></code> element, then there is no 
  caption.</p>
   
  +  <p>The <span class=impl>first</span> <code><a 
  href=#the-dd-element>dd</a></code> element child
  +  of the element<span class=impl>, if any,</span> represents the
  +  element's contents. <span class=impl>If there is no child
  +  <code><a href=#the-dd-element>dd</a></code> element, then there is no 
  caption.</span></p>
 
 I think that last caption is supposed to be content.

Fixed.


 Also, the If there is no for dd is class=impl but the equivalent one
 for dt isn't.

Yeah, the dt is optional, so that case is relevant to authors.


 While proofreading this change I also spotted an inconsistency in the
 related example under the dd element:
 
   http://www.whatwg.org/html5#the-dd-element
 
 I think the first class=part-of-speech should be on the i rather than
 the dd (matching the other instances).

That was intentional; the thinking was that when styling you'd want to 
know when the whole dd was used for the part of speech or whether the 
dd had a part of speech and a definition.

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Re: [whatwg] HTML 5 drag and drop feedback

2009-09-16 Thread Francisco Tolmasky


Yes, that is a neat solution. However, it is still the case that at this
time we should not add new features, otherwise we might get too far ahead
of the implementations, and the quality of implementations will go down.


Since I am new to the list I'm not sure how to interpret the context  
of this type of answer: in other words, does this mean wait until  
next month or wait until HTML 6. Similarly, if it was determined  
that a sufficient number of browsers implemented this existing feature  
to a satisfactory degree, would that itself be enough to request this  
addition again? As you stated, both IE and Safari have this thing  
pretty nailed down for quite a while now already. Firefox has done a  
considerable amount of work to implement this as well and at the very  
least advertises it as a complete feature. Is there some way to  
measure the quality of implementations?


Decisions are made based on their technical merits, it doesn't matter how
many people support it. :-)


Also being new to the list I feel compelled to ask whether this is  
some sort of meme or inside joke as I have seen it more than once and  
is clearly self-contradictory.


Thanks,

Francisco


Re: [whatwg] the cite element

2009-09-16 Thread Ian Hickson
On Tue, 15 Sep 2009, Erik Vorhes wrote:
 On Thu, Aug 27, 2009 at 7:08 PM, Ian Hickson i...@hixie.ch wrote:
 
  Earlier, when justifying why you changed the definition of cite 
  from HTML 4.01, you said:
 
   I don't think it makes sense to use the cite element to refer to 
   people, because typographically people aren't generally marked up 
   anyway. I don't really see how you'd use it to refer to untitled 
   works.
 
  This usage is an example of when people are typographically marked 
  up.
 
  It's a minor case. The semantic here wouldn't be name of person, it 
  would be name of person when immediately following a quote in a 
  pullquote, which is far too specific to deserve a whole element.
 
 I don't think anyone is arguing that there should be a new element 
 exclusively for the above use or that cite should be limited only to 
 that definition (name of person when immediately following a quote in a 
 pullquote or the more forgiving person to whom the quote is 
 attributed). Still, it would be nice to be able to use cite to mark 
 up people being cited (along with other citations that don't explicitly 
 involve a work's title).

Why? What problem would such a solution solve?

Names aren't generally styled, certainly not in italics, so that isn't the 
problem solved.

cite as a way to mark the source of a q or blockquote only works if 
we limit cite to only those cases, which you're not proposing, so it 
doesn't solve that problem either.

So what problem are you solving?


  ... more importantly, the element's style is made non-italics, thus 
  completely defeating the entire point of marking up the element in the 
  first place.
 
 I'm not sure this is a reasonable argument against the use of cite. 
 Following this line of reasoning, it is not worthwhile to mark up titles 
 of works if they are *not* to be italicized;

Yes, that is correct.


 moreover, it is even pointless to mark up headings using h1-h6 if 
 you intend to remove the bold styling.

h1-h6 have two related effects other than the styling: they allow the 
document structure (outline) to be exposed, e.g. when editing, and they 
allow significantly easier navigation of the document for users of 
accessibility tools.

Neither of these use cases apply to cite.

If people didn't commonly style headings, and if headings didn't by 
default have a different style, and if knowing what was a heading didn't 
help accessibility, then yeah, h1-h6 would be pointless.


 The counter to this approach is that h1-h6 provide semantic value 
 even when styled differently from the default. But the same can be said 
 for cite, whether it is defined as title of work or as a more 
 general citation.

What value? Just marking up semantics does not have enough value to 
justify it. If it did, we'd be adding thousands of elements. Why not 
color for marking up colours, price for marking up prices, 
mortgagerate for marking up mortgage rates, postaladdress for 
addresses, boardgame for names of board games, vat for the cost of 
sales tax in advertising...

What is special about marking up authors that doesn't apply to all the 
above? Or are you also asking for all the above?


  When examining pages, you have to first pick a random sample, then 
  study those, because otherwise you get sampling bias. With a trillion 
  pages on the Web, it's easy to find thousands of examples of any 
  particular use of HTML elements; the question is what is the most 
  useful definition, not what is used at all.
 
 Because you believe title of work to be the most useful definition, 
 does that mean that you would reject even a majority use of cite for 
 marking up citations that aren't only or exclusively titles?

It's a judgement call. (FWIW, the data shows that cite is rarely used 
for people's names, often used for titles of works, sometimes used for the 
entire citation (usually associated with going through hoops to avoid the 
default styling), and most often used just for its italics effect.)


 There are plenty of examples of authors using cite to mark up the 
 following (among other things):
 
 - titles of works
 - full citations
 - names and other sources of quote attribution (leaving aside
 placement relative to the quote)
 - names of blog post commenters and authors (in the context of their
 comments, posts, etc.)

Sure. There are even more examples of them using it just as synonym for 
the i element.


 Even if titles are by far the most common use case, it doesn't make 
 sense to exclude other semantically justifiable uses of what appear to 
 be valid uses of the cite element, at least according to the English 
 language usages associated with the word cite.

That's not the reasoning that was used though. The reasoning that was used 
is people who aren't using this for italics are mostly using it for 
titles of works. Is that useful? Yes, titles of works are often italics, 
so this would help people. Would it be more helpful if we increased it to 
mark up 

Re: [whatwg] Surrogate pairs and character references

2009-09-16 Thread Ian Hickson
On Tue, 15 Sep 2009, Øistein E. Andersen wrote:
  
  I suppose we could just change the spec and say that surrogate 
  characters (whether literal characters, e.g. in UTF-8, or from 
  character references) all get converted to U+FFFD?
 
 That seems to be the only reasonable option if handling &#xD800;&#xDC00; 
 as U+FFFD U+FFFD is deemed desirable and sufficiently compatible with 
 existing documents.  It would simplify things a bit in non-UTF-16 
 environments (as compared to my interpretation of the current text) 
 without much added complexity in UTF-16 environments.

Ok, done.


  The spec says Bytes or sequences of bytes in the original byte stream 
  that could not be converted to Unicode characters must be converted to 
  U+FFFD REPLACEMENT CHARACTER code points.
 
 I take it you mean that \xD800&#xDC00; should turn into \xFFFD&#xDC00; at this
 point, which is only supported by the quoted text if bytes or sequences of
 bytes representing surrogates [cannot] be converted to Unicode characters
 or, to put it differently, if surrogates are not Unicode characters.

Correct. Surrogates aren't Unicode characters.


 Unfortunately for this reading, the term Unicode character does not 
 seem to be defined in HTML5 or in Unicode,

I've added a definition to HTML5. The proper Unicode term is Unicode 
scalar value, apparently.


 and the following paragraph (which appears shortly after the one you 
 quoted) clearly includes surrogate code points within the concept of 
 Unicode character:
 
 Any occurrences of any characters in the ranges [...] U+D800 to U+DFFF, 
 [...] are parse errors. (These are all control characters or permanently 
 undefined Unicode characters.)
 
 Moreover, this paragraph would be pointless if the characters mentioned 
 therein could never occur at all.

I've changed the text to refer to code points when it talks about 
surrogate code points.


 The use of Unicode character without a definition is fine in other 
 parts of HTML5, but clearly not sufficiently precise in this instance. 
 If you want to exclude (unpaired) surrogate code points only, the 
 appropriate term to use would probably be Unicode scalar value.

Yeah. Fixed.
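
For readers following along, the rule agreed on above (surrogate code points, whether literal or from character references, become U+FFFD) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the spec's algorithm: the function names are made up for this example, and the other parse-error ranges the spec lists are deliberately ignored.

```python
REPLACEMENT = "\ufffd"  # U+FFFD REPLACEMENT CHARACTER


def is_surrogate(code_point: int) -> bool:
    """True for code points in the surrogate range U+D800 to U+DFFF."""
    return 0xD800 <= code_point <= 0xDFFF


def resolve_numeric_reference(code_point: int) -> str:
    """Hypothetical helper: map a numeric character reference to a character.

    Only the surrogate case from this thread is modelled; every other
    code point is passed straight to chr().
    """
    if is_surrogate(code_point):
        return REPLACEMENT
    return chr(code_point)


def scrub_surrogates(text: str) -> str:
    """Replace any literal surrogate code points in decoded text with U+FFFD."""
    return "".join(REPLACEMENT if is_surrogate(ord(ch)) else ch for ch in text)
```

With this sketch, a pair of references such as &#xD800;&#xDC00; resolves to two U+FFFD characters rather than being combined into one supplementary character, which is the behaviour discussed above.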


Re: [whatwg] Editorial: Colloquial contractions

2009-09-16 Thread Ian Hickson
On Wed, 16 Sep 2009, Øistein E. Andersen wrote:
 On 15 Sep 2009, at 02:37, Ian Hickson wrote:
  On Tue, 8 Sep 2009, Øistein E. Andersen wrote:
   
   The spec currently contains a few occurrences of colloquial 
   contractions like can't, won't and there's, which should be 
   changed to cannot, will not, there is etc. for consistency.
  
  I haven't changed this, because it doesn't seem especially important, 
  frankly. If there are specific cases where you think the current text 
  reads poorly due to the use of contractions, please let me know.
 
 'Tis not that specific instances are particularly horrific; the problem 
 is that unmotivated alternation 'tween the two, just like any other 
 typographical error or inconsistency, gives the impression of 
 carelessness, which is always discomforting in a technical 
 specification, whate'er the cause might be.
 
 I still think 'twould be worth fixing this, though I must admit 'tis 
 more pervasive than I first thought.  To make it less open-ended, please 
 find below a list of changes that would correct most instances 
 (capitalised forms not listed separately):
 
 s/doesn't/does not/
 s/isn't/is not/ except: isn't his
 s/don't/do not/ except: don't know,, don't.
 s/it's/it is/ except: it's hot (twice), it's so pedantic, it's unarguably,
 it's about
 s/can't/cannot/
 s/I'm looking/I am looking/
 s/there's/there is/ except: there's a microphone (twice)
 s/won't/will not/ except: won't be that
 s/that's/that is/ except: that's right
 s/wasn't/was not/
 s/aren't/are not/
 s/wouldn't/would not/
 s/we're/we are/ except: team we're, &gt;we're (twice)
 s/they're/they are/ except: they're really
 s/here's/here is/ (except: there's a microphone)
 s/didn't/did not/ except: didn't have, didn't know, didn't &lt;
 s/we'll/we shall/
 s/we'd/we would/
 s/I've/I have/ except: I've liked, I've got, I've only
 s/hasn't/has not/ except: hasn't changed
 s/you're doing/you are doing/
 s/couldn't/could not/ except: couldn't admit
 s/we've/we have/
 s/shouldn't/should not/ except: shouldn't say
 s/let's simulate/let us simulate/
 s/it'll/it will/
 s/haven't/have not/
 s/you'd/you would/
 s/it'd/it would/
 s/I'd probably/I would probably/
 s/I'd realised/I had realised/
 s/hadn't/had not/
 s/they've/they have/
 s/they'll/they will/
 s/there'd/there would/
 s/he's covered/he is covered/

That's far too many things for me to change safely just to make the spec 
sound more formal. I'm as likely to introduce errors in doing this change 
as I am to fix everything I set out to fix.

Also, many of the contractions above are in explicitly non-normative text 
(like examples), and I definitely think it's fine to have them there.

Again, if there are specific cases (e.g. section 5.4.76's second 
paragraph says 'Browsers shouldn't do that', which is an ambiguous use of 
RFC2119 terminology, please use 'should not'), then let me know, but I'm 
not ready to make such a wide series of changes as you describe above.
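
To make the risk concrete, here is a rough sketch of what applying a handful of the substitutions above, with their exceptions, might look like. It is illustrative only: the rule table is a small hand-picked subset of the list, the helper names are invented for this example, capitalised forms are not handled, and nothing here was run against the actual spec source.

```python
import re

# (contraction, expansion, phrases in which the contraction is left alone)
RULES = [
    ("doesn't", "does not", []),
    ("it's", "it is", ["it's hot", "it's so pedantic"]),
    ("we're", "we are", ["team we're"]),
    ("can't", "cannot", []),
]


def protected_spans(text, exceptions):
    """Return (start, end) spans of every exception phrase found in text."""
    spans = []
    for phrase in exceptions:
        for match in re.finditer(re.escape(phrase), text):
            spans.append((match.start(), match.end()))
    return spans


def expand(text, rules=RULES):
    """Expand contractions unless they fall inside an exception phrase."""
    for contraction, expansion, exceptions in rules:
        spans = protected_spans(text, exceptions)

        def substitute(match, spans=spans, expansion=expansion):
            inside = any(s <= match.start() and match.end() <= e for s, e in spans)
            return match.group(0) if inside else expansion

        text = re.sub(r"\b" + re.escape(contraction) + r"\b", substitute, text)
    return text
```

Even this small subset needs per-rule exception handling, which gives some idea of why a blanket change across the whole spec is easy to get wrong.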


Re: [whatwg] article/section/details naming/definition problems

2009-09-16 Thread Ian Hickson
On Tue, 15 Sep 2009, Jonas Sicking wrote:
 
  I'd like to rename article, if someone can come up with a better 
  word that means blog post, blog comment, forum post, or widget. I do 
  think there is an important difference between a subpart of a page 
  that is a potential candidate for syndication, and a subsection of a 
  page that only makes sense with the rest of the page.
 
 How about section type=article or section article=?

A big part of the point here is removing the need for a class attribute. 
There's not much point us changing:

   <div class="post"></div>

...to:

   <section type="article"></section>

...if it's longer.


Thanks to everyone for the various ideas. It seems that in practice, 
people use article and section reasonably well, so I'll leave it as is 
for now.



Re: [whatwg] Setting .value on input type=file

2009-09-16 Thread Ian Hickson
On Tue, 15 Sep 2009, Jonas Sicking wrote:
 
 Currently the spec says to throw an INVALID_ACCESS_ERR exception anytime 
 the 'value' IDL attribute is set. However allowing it to be set to the 
 empty string would be good.
 
 The simplest use case is allowing the page to implement a 'cancel' 
 button. Many UAs (in fact all that I know of) don't have explicit UI for 
 clearing the field. In fact a quick test shows that at least Firefox and 
 Safari have no way to clear the field at all.
 
 While the page can always delete the old input element and create a new 
 one, that is much more complicated. Especially if the element has event 
 listeners or user data associated with it.
 
 Setting it to the empty string works in Firefox, Safari, and Chrome. It 
 does not appear to work in IE or Opera.

Done.
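
As a rough picture of the behaviour requested above, the following toy model treats the file control as a plain object. It is not a DOM API and not the spec text: the class and method names are hypothetical, the exception class merely stands in for INVALID_ACCESS_ERR, and returning the first selected file name from value is an assumption made for the sake of the example.

```python
class InvalidAccessError(Exception):
    """Stand-in for the DOM INVALID_ACCESS_ERR exception (illustrative only)."""


class FileInputModel:
    """Toy model of an input type=file control, as discussed in this thread."""

    def __init__(self):
        self._files = []

    def user_selects(self, *filenames):
        """Simulate the user picking files through the browser's own UI."""
        self._files = list(filenames)

    @property
    def value(self):
        # Assumption: expose the first selected file name, or "" if none.
        return self._files[0] if self._files else ""

    @value.setter
    def value(self, new_value):
        if new_value == "":
            # Setting the empty string clears the selection (the 'cancel' case).
            self._files = []
        else:
            # Script may not choose a file on the user's behalf.
            raise InvalidAccessError("value can only be set to the empty string")
```

A page-level 'cancel' button would then only need to assign the empty string, rather than replacing the element and re-attaching its event listeners.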



Re: [whatwg] article/section/details naming/definition problems

2009-09-16 Thread Keryx Web

2009-09-16 03:08, Ian Hickson wrote:


I'd like to rename article, if someone can come up with a better word
that means blog post, blog comment, forum post, or widget. I do think
there is an important difference between a subpart of a page that is
a potential candidate for syndication, and a subsection of a page that
only makes sense with the rest of the page.

Cheers,


Has entry been discussed? (Shamelessly stolen from Atom.)


--
Keryx Web (Lars Gunther)
http://keryx.se/
http://twitter.com/itpastorn/
http://itpastorn.blogspot.com/


Re: [whatwg] article/section/details naming/definition problems

2009-09-16 Thread David Workman
To throw my views into the mix:
I think 'article' is more suitable than 'post' or 'entry' semantically. A
blog post can reasonably be called an article (although it stretches the
concept a bit for forum posts), whereas in an online newspaper or magazine,
'article' is definitely appropriate whereas 'post' isn't.

A comment element may be useful. Or an optional 'type' attribute on article
that can only take one of several semantically meaningful values to be
valid. Some possibilities would include 'post', 'comment', 'report'. This
would provide semantic meaning, the ability to style them using CSS
selectors, avoids baking a brand new element and doesn't invalidate existing
uses as an article tag without a type is still semantically meaningful.

Cheers,

2009/9/16 Keryx Web webmas...@keryx.se

 2009-09-16 03:08, Ian Hickson wrote:

  I'd like to rename article, if someone can come up with a better word
 that means blog post, blog comment, forum post, or widget. I do think
 there is an important difference between a subpart of a page that is
 a potential candidate for syndication, and a subsection of a page that
 only makes sense with the rest of the page.

 Cheers,


 Has entry been discussed? (Shamelessly stolen from Atom.)


 --
 Keryx Web (Lars Gunther)
 http://keryx.se/
 http://twitter.com/itpastorn/
 http://itpastorn.blogspot.com/



Re: [whatwg] article/section/details naming/definition problems

2009-09-16 Thread James Graham

Keryx Web wrote:

2009-09-16 03:08, Ian Hickson wrote:


I'd like to rename article, if someone can come up with a better word
that means blog post, blog comment, forum post, or widget. I do think
there is an important difference between a subpart of a page that is
a potential candidate for syndication, and a subsection of a page that
only makes sense with the rest of the page.

Cheers,


Has entry been discussed? (Shamelessly stolen from Atom.)


Dunno about discussed, but I had the same idea*. It seems like it might 
help people understand where article is supposed to be used since 
articles are used in cases where content could stand alone, for example 
in syndication.



*No really, check the IRC logs ;)



Re: [whatwg] the cite element

2009-09-16 Thread Nils Dagsson Moskopp
On Wednesday, 16.09.2009, at 09:16 +, Ian Hickson wrote:
 Names aren't generally styled, certainly not in italics, so that isn't the 
 problem solved.

Important names are sometimes styled through use of small-caps, though
it may be that this is an older / rare convention and not applicable
here.

-- 
Nils Dagsson Moskopp
http://dieweltistgarnichtso.net



Re: [whatwg] HTML 5 drag and drop feedback

2009-09-16 Thread Ian Hickson
On Wed, 16 Sep 2009, Francisco Tolmasky wrote:
  
  Yes, that is a neat solution. However, it is still the case that at 
  this time we should not add new features, otherwise we might get too 
  far ahead of the implementations, and the quality of implementations 
  will go down.
 
 Since I am new to the list I'm not sure how to interpret the context of 
 this type of answer: in other words, does this mean wait until next 
 month or wait until HTML 6.

It's hard to tell -- it depends on how fast implementations line up on 
what HTML5 already says.


 Similarly, if it was determined that a sufficient number of browsers 
 implemented this existing feature to a satisfactory degree, would that 
 itself be enough to request this addition again?

It's not just this feature -- for example, canvas is pretty well 
implemented, but we're not adding new features to it at the moment, 
because browser implementors jump at the chance to implement anything I 
add to canvas, instead of fixing other bugs. So each time we add a canvas 
feature, we delay the time until other things are implemented well.


 As you stated, both IE and Safari have this thing pretty nailed down for 
 quite a while now already.

Both IE and Safari are quite buggy when it comes to drag-and-drop 
actually, at least compared to what the spec says (especially IE).


 Firefox has done a considerable amount of work to implement this as well 
 and at the very least advertises it as a complete feature. Is there 
 some way to measure the quality of implementations?

We'll need a test suite.


  Decisions are made based on their technical merits, it doesn't matter 
  how many people support it. :-)
 
 Also being new to the list I feel compelled to ask whether this is some 
 sort of meme or inside joke as I have seen it more than once and is 
 clearly self-contradictory.

Neither.

As an extreme example: if a thousand people want the HTML5 spec to include 
an element that executes arbitrary author-provided inline assembler, and 
one person points out that that would allow for remote code execution 
attacks, then the one person wins.

An example where this actually happened: lots of people think we should 
include the longdesc= attribute in HTML5. However, we did some research, 
and found that it isn't a technically good solution according to the 
collected data. So we don't have longdesc=.



Re: [whatwg] article/section/details naming/definition problems

2009-09-16 Thread Bruce Lawson

On Wed, 16 Sep 2009 12:28:36 +0100, Ian Hickson i...@hixie.ch wrote:


On Wed, 16 Sep 2009, Bruce Lawson wrote:


Seems to me that (current) sections aren't for syndicating (tabs,
chapters etc), while blog posts (currently articles) *are* for potential
syndication (although the cite attribute was recently removed from
article).


I've adjusted the spec's definition more in line with this.


Groovy.



A comment in an article is also marked up as article, but is unlikely to
be a candidate for syndication as it's out of context.

Is this correct?


As James on IRC pointed out:

   http://intertwingly.net/blog/comments.html
   http://firehose.diveintomark.org/
   http://www.zeldman.com/comments/feed/

Also, consider Twitter, Reddit, most forums, etc, where individual
comments are definitely syndicated.


Yup. Makes sense to me.


Re: [whatwg] the cite element

2009-09-16 Thread Ian Hickson
On Wed, 16 Sep 2009, Nils Dagsson Moskopp wrote:
 On Wednesday, 16.09.2009, at 09:16 +, Ian Hickson wrote:
  Names aren't generally styled, certainly not in italics, so that isn't the 
  problem solved.
 
 Important names are sometimes styled through use of small-caps, though 
 it may be that this is an older / rare convention and not applicable 
 here.

The spec suggests the b element for this particular use case, and has an 
example of this. Search for the two occurrences of gossip.



Re: [whatwg] the cite element

2009-09-16 Thread Erik Vorhes
A few points of clarification:

On Wed, Sep 16, 2009 at 4:16 AM, Ian Hickson i...@hixie.ch wrote:
 Unless there is some semantic value to the name being more than just a
 name, yes.

 Is there?

Yes, and with the removal of the dialog element (of which I was
unaware when I sent my last message) makes a compelling case for the
re-expansion of cite for dialog.

On October 31, 2006, Michael Fortin suggested the following pattern:
<p><cite>Me:</cite> <q>Can I say something?</q>

Which Jeremy Keith also recommends. [1]

(For longer text it would make more sense to do something like
<cite></cite><blockquote></blockquote>, but that's beside the point.)

You didn't explicitly object to such a pattern (though implemented a
different one for dialog) as late as May 5, 2008 [2].

Aside from the current definition of cite, I think this would be a
good use of the element, since it makes more sense than b or span
(what do those signify in this context?) and there's nothing wrong
with an italicized name in this context. Moreover, there are examples
of Fortin/Keith's usage in the wild.


 There's nothing wrong with overriding default presentational styles, but
 there _is_ something wrong with a spec's defaults being different than
 what authors want.

Agreed.


 How many sites using cite for people's names (or other reasonable uses
 that deviate from title of work) would it take to convince you that it
 _was_ a common case?

 Benjamin already asked me that, I was turning the tables on him when I
 asked the question above. :-)

Oops! I like to think of myself as a better reader than that. Sorry! :)


 I had answered:

  A random sample of the Web would need to show more uses of this than
  uses of other things.

I'm not sure the lack of majority use should be an impediment, but
that's an issue of conclusions rather than reasoning. (And I
sympathize with needing to draw the line at some point, even if it
makes some of us unhappy or some of us feel it's incorrect.)



 ... I don't understand what your proposal is, at this
 point. How do you define citation? What problem does it solve?

I should have made this clearer, I suppose, sorry. What I propose is
that cite should be allowed for markup in the following instances:

- titles of works
- full citations
- names and other sources of quote attribution (including identifying
speakers in dialog)
- names of blog post commenters and authors (in the context of their
comments, posts, etc.)


 It doesn't matter how many people say something on this mailing list,
 that's not an unbiased sample. (The people who think cite is fine as
 defined in HTML5 don't have motivation to say so, for example.)

I agree that basing decisions exclusively on what is said on the
mailing list is not always the right approach. The length of this
thread (and filtering out your and my messages) suggests that the
representation of voices pro & con (re: cite in HTML5) is pretty
close to equal. In other words, it's not just you and a bunch of
cranky folks objecting to the specification (as much as it may feel
that way sometimes).


Erik


[1] http://adactio.com/journal/1609/
[2] http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2008-May/014684.html


Re: [whatwg] article/section/details naming/definition problems

2009-09-16 Thread Keryx Web

2009-09-16 12:03, David Workman wrote:

think 'article' is more suitable than 'post' or 'entry' semantically.


I am thinking about the mental model, more than pure semantics.

Maybe a *comparison* to entry in Atom and item in RSS in the non 
normative text could be beneficial?





Re: [whatwg] article/section/details naming/definition problems

2009-09-16 Thread Erik Vorhes
On Wed, Sep 16, 2009 at 3:35 AM, Bruce Lawson bru...@opera.com wrote:
 there would also need to be a comment element

I'd be *slightly* concerned that confusion could arise between a
comment element and the <!-- comment syntax -->, at least in
discussion. (I.e., what would HTML comment mean?)

entry (which has already been proposed) might more logically suit
the bill for standalone articles (in a blog or whatever) as well as
blog/forum comments. And since it's part of the Atom spec., there's
some precedent for defining its use.

That said, I don't have a problem with article as a special kind of
section (though having articles nested within articles doesn't agree
with my brain at this point).

Erik


Re: [whatwg] article/section/details naming/definition problems

2009-09-16 Thread Divya Manian
Hi all,

The IRC conversation that Jeremy Keith[1] highlighted was really useful.
From there, this is my understanding of the differences between section
and article, forgive me if I am not rigorous in my usage of English:

1. section to cut or section different parts of the layout of the
webpage

2. article is used for each similar content, each blog post in a set of (1
or more) blog posts, each user member avatar in a set of user member
avatars, each product in a set of products.

3. articles always occur within a section except in rare occasions when
there is nothing else other than the specific article on the page as
content.  

The usecase that leaps to me is:
HTML 4:

<div class="maincolumn">
<div class="item">
News entry 1
</div>
<div class="item">
News entry 2
</div>
</div>

HTML 5:
<section class="maincolumn">
<article>
News entry 1
</article>
<article>
News entry 2
</article>
</section>


I often use div class=item to mark up the smallest large unit of content
that is repeatable and my understanding is that article would be a good
replacement of that.

If that is the case, we can use article to mark up each comment, or each
blog post. But I think article is quite a newsroom lingo which might not
be appropriate for marking up a product or a user member avatar.
 
Would this be a good description of article and section?

Apologies if this seems like noise.

Regards,
Divya


[1] http://krijnhoetmer.nl/irc-logs/whatwg/20090916

-- 
http://nimbu.in




Re: [whatwg] article/section/details naming/definition problems

2009-09-16 Thread Jeremy Keith

Lars asked:

All articles are sections but not all sections are articles


Just to be clear: Does that include the rules for headers when  
articles are nested, or when an article is contained in a section?


Yes, the content model is identical. They are both sectioning content.

E.g. would this structure be treated as an identical flow from a  
headings level perspective if all article tags were replaced with  
section tags? I.e. Would it be as if I'd use h1, h2 and h3 today?


<section>
  <h1 />
  <article>
    <h1 />
    ...
    <article>
      <h1 />
    </article>
  </article>
</section>


Yes. From an outline point of view, this is identical:

<section>
  <h1 />
  <section>
    <h1 />
    ...
    <section>
      <h1 />
    </section>
  </section>
</section>

As is this:

<article>
  <h1 />
  <article>
    <h1 />
    ...
    <article>
      <h1 />
    </article>
  </article>
</article>
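
The equivalence described above can be checked mechanically with a short sketch like the one below. It is a deliberate simplification and not the HTML5 outline algorithm: it only counts how many section or article ancestors each h1 has, and the class and function names are invented for this illustration.

```python
from html.parser import HTMLParser

# Only the two elements discussed in this thread are treated as sectioning content.
SECTIONING = {"section", "article"}


class HeadingDepths(HTMLParser):
    """Record the sectioning depth at which each h1 start tag appears."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.depths = []

    def handle_starttag(self, tag, attrs):
        if tag in SECTIONING:
            self.depth += 1
        elif tag == "h1":
            self.depths.append(self.depth)

    def handle_endtag(self, tag):
        if tag in SECTIONING:
            self.depth -= 1


def heading_depths(markup):
    """Return the list of nesting depths for every h1 in the markup."""
    parser = HeadingDepths()
    parser.feed(markup)
    parser.close()
    return parser.depths
```

Feeding it the three structures above (all section, all article, or a mix) should give the same list of depths, which is the sense in which they are identical from an outline point of view.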

--
Jeremy Keith

a d a c t i o

http://adactio.com/




Re: [whatwg] article/section/details naming/definition problems

2009-09-16 Thread Gordon P. Hemsley
I'd sent this earlier, but it got caught in the message queue that
apparently nobody checks. Let's see if it works this time.

-- Forwarded message --
From: Gordon P. Hemsley gphems...@gmail.com
Date: Tue, Sep 15, 2009 at 11:31 PM
Subject: Re: [whatwg] article/section/details naming/definition problems
To: whatwg List wha...@whatwg.org


On Tue, Sep 15, 2009 at 9:08 PM, Ian Hickson i...@hixie.ch wrote:

 On Tue, 15 Sep 2009, Jeremy Keith wrote:
  In that blog post, I point out that section and article were once more
  divergent but have converged over time (since the @cite and @pubdate
  attributes were dropped from article).
 
  I've also seen a lot of confusion from authors wondering when to use
  section and when to use article. Bruce wrote an article on HTML5 doctor
  recently to address this:
  http://html5doctor.com/the-section-element/
 
  Probably the best tutorial I've seen on this issue is from Ted:
  http://edward.oconnor.cx/2009/09/using-the-html5-sectioning-elements
 
  ...but even so, the confusion remains. The very fact that tutorials are
  required for what should be intuitive structural elements is worrying — I
  don't see the same issues around nav, header or footer (now that the
  content model has been changed) ...although there is continuing confusion
  around aside.

 I'd like to rename article, if someone can come up with a better word
 that means blog post, blog comment, forum post, or widget. I do think
 there is an important difference between a subpart of a page that is
 a potential candidate for syndication, and a subsection of a page that
 only makes sense with the rest of the page.


What about <item>? (Directly, it's a coincidence that RSS happens to have
the same-named element, as I just used a thesaurus. But perhaps [indirectly]
there's a reason RSS uses <item> to begin with. And, after all, it's
supposed to be used as a hint that it could be syndicated content, right?)

-- 
Gordon P. Hemsley
m...@gphemsley.org
http://gphemsley.org/ • http://gphemsley.org/blog/
http://sasha.sourceforge.net/ • http://www.yoursasha.com/





Re: [whatwg] article/section/details naming/definition problems

2009-09-16 Thread Jeremy Keith

Divya wrote:

this is my understanding of the differences between <section>
and <article>, forgive me if I am not rigorous in my usage of English:

1. <section> to cut or section different parts of the layout of the
webpage


No. This is what <div> is for.

<section> is for enclosing related content. <article> is for enclosing
related content *that is also independent*.


2. <article> is used for each similar content, each blog post in a set
of (1 or more) blog posts, each user member avatar in a set of user member
avatars, each product in a set of products.


Not necessarily. If you would use <article> for a page of 10 blog
posts, you should also use <article> for a page containing only one of
those blog posts. The context isn't as important as the content. If
the content *could* stand alone, then you are supposed to use
<article>. Whether or not the content actually *is* standing alone (in
the current document) doesn't matter.


3. <article>s always occur within a <section> except in rare occasions
when there is nothing else other than the specific article on the page as
content.


No. There is no correlation.

* <article>s do not need to be nested within a <section>. They can be
children of the <body> element, for example (the <body> element isn't
sectioning content although it is a sectioning root).
* <article>s can be nested within an <article>. The spec currently
advises doing this for blog comments (even though it's questionable
whether or not those comments stand alone).
* <section>s can be nested within an <article>. Different sections of
a news story or blog post, for example.
* <section>s can be nested within a <section>.


The usecase that leaps to me is:
HTML 4:

<div class="maincolumn">
  <div class="item">
    News entry 1
  </div>
  <div class="item">
    News entry 2
  </div>
</div>

HTML 5:
<section class="maincolumn">
  <article>
    News entry 1
  </article>
  <article>
    News entry 2
  </article>
</section>


This should probably be:

<div class="maincolumn">
  <article>
    News entry 1
  </article>
  <article>
    News entry 2
  </article>
</div>

I often use <div class="item"> to mark up the smallest large unit of
content that is repeatable and my understanding is that <article> would be
a good replacement of that.


Only if the content is independent. Otherwise use <div> (or <section>
if the content is related).
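
The rule of thumb in this message can be written down as a tiny decision function. This is only a sketch of the reasoning above, not anything from the spec; `chooseContainer` and its two flags are hypothetical names, and deciding whether content is "independent" or "related" remains an author's judgment call.

```javascript
// Hypothetical helper: reduces the advice above to two yes/no questions.
function chooseContainer({ independent, related }) {
  if (independent) return "article"; // could stand alone, e.g. be syndicated
  if (related) return "section";     // grouped by theme, but needs the page
  return "div";                      // grouping for layout or styling only
}

// A blog post is independent whether it appears alone or in a list of ten.
const blogPost = chooseContainer({ independent: true, related: true });
// A layout column is neither independent nor a thematic grouping.
const mainColumn = chooseContainer({ independent: false, related: false });
```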


--
Jeremy Keith

a d a c t i o

http://adactio.com/




Re: [whatwg] article/section/details naming/definition problems

2009-09-16 Thread Judson Collier
I really don't see the relevance of <article> as anything other than a blog
post or article - this is the obvious definition. If you are going to keep
<article>, diverting the definition to comments and widgets is way off the
beaten path. It'd be cool to see some type of user generated content tag,
but that's a bit superfluous.

However, one single specialized section tag begs for *more* specialized
section tags. Keep that in mind.

On Wed, Sep 16, 2009 at 11:37 AM, Jeremy Keith jer...@adactio.com wrote:

 Divya wrote:

 this is my understanding of the differences between <section>
 and <article>, forgive me if I am not rigorous in my usage of English:

 1. <section> to cut or section different parts of the layout of the
 webpage


 No. This is what <div> is for.

 <section> is for enclosing related content. <article> is for enclosing
 related content *that is also independent*.

 2. <article> is used for each similar content, each blog post in a set of (1
 or more) blog posts, each user member avatar in a set of user member
 avatars, each product in a set of products.


 Not necessarily. If you would use <article> for a page of 10 blog posts,
 you should also use <article> for a page containing only one of those blog
 posts. The context isn't as important as the content. If the content *could*
 stand alone, then you are supposed to use <article>. Whether or not the
 content actually *is* standing alone (in the current document) doesn't
 matter.

 3. <article>s always occur within a <section> except in rare occasions when
 there is nothing else other than the specific article on the page as
 content.


 No. There is no correlation.

 * <article>s do not need to be nested within a <section>. They can be
 children of the <body> element, for example (the <body> element isn't
 sectioning content although it is a sectioning root).
 * <article>s can be nested within an <article>. The spec currently advises
 doing this for blog comments (even though it's questionable whether or not
 those comments stand alone).
 * <section>s can be nested within an <article>. Different sections of a
 news story or blog post, for example.
 * <section>s can be nested within a <section>.

 The usecase that leaps to me is:
 HTML 4:

 <div class="maincolumn">
   <div class="item">
     News entry 1
   </div>
   <div class="item">
     News entry 2
   </div>
 </div>

 HTML 5:
 <section class="maincolumn">
   <article>
     News entry 1
   </article>
   <article>
     News entry 2
   </article>
 </section>


 This should probably be:

 <div class="maincolumn">
   <article>
     News entry 1
   </article>
   <article>
     News entry 2
   </article>
 </div>

 I often use <div class="item"> to mark up the smallest large unit of
 content that is repeatable and my understanding is that <article> would be
 a good replacement of that.


 Only if the content is independent. Otherwise use <div> (or <section> if
 the content is related).


 --
 Jeremy Keith

 a d a c t i o

 http://adactio.com/





-- 
Judson Collier
http://judsoncollier.com/
http://twitter.com/judsoncollier


Re: [whatwg] article/section/details naming/definition problems

2009-09-16 Thread James Cready
Jeremy Keith said:
 <article>
   <h1 />
   <article>
     <h1 />
     ...
     <article>
       <h1 />
     </article>
   </article>
 </article>

Just curious as to how your above examples would affect SEO. Wouldn't Google
lower your rank (even just slightly) because you're using multiple <h1> tags?
Also in this example which header is the most important (for SEO, not just
semantics). Is it the first <h3> or the first <h1>?

<body>
  <h3 />
  <header>
    <h3 />
  </header>
  <article>
    <h2 />
    ...
    <article>
      <h1 />
    </article>
    <hgroup>
      <h1 />
      <h2 />
    </hgroup>
  </article>
  <footer>
    <h3 />
  </footer>
</body>

-- 
James W Cready



Re: [whatwg] article/section/details naming/definition problems

2009-09-16 Thread Tab Atkins Jr.
On Wed, Sep 16, 2009 at 11:17 AM, James Cready jcre...@rtcrm.com wrote:
 Jeremy Keith said:
 <article>
   <h1 />
   <article>
     <h1 />
     ...
     <article>
       <h1 />
     </article>
   </article>
 </article>

 Just curious as to how your above examples would affect SEO. Wouldn't Google
 lower your rank (even just slightly) because you're using multiple <h1> tags?

I actually got Ian to ask the search team about this a while back.
Their answer was that, currently, there's probably a small negative
effect from using multiple <h1>s as the spec recommends, but that
it'll likely get revised once the practice becomes common (and thus
not indicative of spam), and in any case the effect is small enough
that one shouldn't worry about it on an otherwise-good site.

Obviously this isn't very specific, but Google relies on obscurity as
a major component of its ranking algorithm, so <shrug>.

 Also in this example which header is the most important (for SEO, not just
 semantics). Is it the first <h3> or the first <h1>?
 <body>
   <h3 />
   <header>
     <h3 />
   </header>
   <article>
     <h2 />
     ...
     <article>
       <h1 />
     </article>
     <hgroup>
       <h1 />
       <h2 />
     </hgroup>
   </article>
   <footer>
     <h3 />
   </footer>
 </body>

The first <h3> and the two <h3>s in the <header> and <footer> all
create top-level sections by the outline algorithm, I believe.  If we
pretend that the <header> and <footer> had <h4>s instead, then the
first <h3> is the most important one on the page.

For SEO purposes the <h1> *may* be more important currently, but then
again it might not be since it's buried in the page content.  I'm
pretty sure that some extra weight is given to headers early in the
document.  Search engine juju in these cases is really hard to
determine.

~TJ


Re: [whatwg] article/section/details naming/definition problems

2009-09-16 Thread Jeremy Keith

James Cready wrote:
Just curious as to how your above examples would affect SEO. Wouldn't Google
lower your rank (even just slightly) because you're using multiple <h1> tags?


Given Google's support for HTML5, I don't think the new algorithm is  
going to be a problem for SEO (though I'd welcome clarification on  
that).




Also in this example which header is the most important (for SEO, not just
semantics). Is it the first <h3> or the first <h1>?


An <h1> nested within two sectioning elements has exactly the same
importance as an <h3>. That is, one isn't more or less important than
the other; they have an equivalence in importance.


--
Jeremy Keith

a d a c t i o

http://adactio.com/




Re: [whatwg] article/section/details naming/definition problems

2009-09-16 Thread Jeremy Keith

I wrote:
An <h1> nested within two sectioning elements has exactly the same
importance as an <h3>


Whoops. I was looking at a different example. Ignore what I said and  
listen to Tab Atkins Jr.


--
Jeremy Keith

a d a c t i o

http://adactio.com/




Re: [whatwg] LocalStorage in workers

2009-09-16 Thread Drew Wilson
Jeremy, what's the use case here - do developers want workers to have access
to shared local storage with pages? Or do they just want workers to have
access to their own non-shared local storage?

Because we could just give workers their own separate WorkerLocalStorage and
let them have at it. A worker could block all the other accesses to
WorkerLocalStorage within that domain, but so be it - it wouldn't affect
page access, and we already had that issue with the (now removed?)
synchronous SQL API.

I think a much better case can be made for WorkerLocalStorage than for giving
workers access to page LocalStorage, and the design issues are much
simpler.

-atw

On Tue, Sep 15, 2009 at 8:27 PM, Jonas Sicking jo...@sicking.cc wrote:

 On Tue, Sep 15, 2009 at 6:56 PM, Jeremy Orlow jor...@chromium.org wrote:
  One possible solution is to add an asynchronous callback interface for
  LocalStorage into workers.  For example:
  function myCallback(localStorage) {
    localStorage.accountBalance = localStorage.accountBalance + 100;
  }
  executeLocalStorageCallback(myCallback);  // TODO: Make this name better :-)
  The interface is simple.  You can only access localStorage via a callback.
  Any use outside of the callback is illegal and would raise an exception.
  The callback would acquire the storage mutex during execution, but the
  worker's execution would not block during this time.  Of course, it's still
  possible for a poorly behaving worker to do large amounts of computation in
  the callback, but hopefully the fact they're executing in a callback makes
  the developer more aware of the problem.

 First off, I agree that not having localStorage in workers is a big
 problem that we need to address.

 If I were designing the localStorage interface today I would use the
 above interface that you suggest. Grabbing localStorage can only be
 done asynchronously, and while you're using it, no one else can get a
 reference to it. This way there are no race conditions, but also no
 way for anyone to have to lock.

 So one solution is to do that in parallel to the current localStorage
 interface. Let's say we introduce a 'clientStorage' object. You can
 only get a reference to it using a 'getClientStorage' function. This
 function is available both to workers and windows. The storage is
 separate from localStorage so no need to worry about the 'storage
 mutex'.

 There is of course a risk that a worker grabs on to the clientStorage
 and holds it indefinitely. This would result in the main window (or
 another worker) never getting a reference to it. However it doesn't
 affect responsiveness of that window, it's just that the callback will
 never happen. While that's not ideal, it seems like a smaller problem
 than any other solution that I can think of. And the WebDatabase
 interfaces are suffering from the same problem if I understand things
 correctly.

 There's a couple of other interesting things we could expose on top of
 this:

 First, a synchronous API for workers. We could allow workers to
 synchronously get a reference to clientStorage. If someone is
 currently using clientStorage then the worker blocks until the storage
 becomes available. We could either use a callback as the above, which
 blocks until the clientStorage is acquired and only holds the storage
 until the callback exits. Or we could expose clientStorage as a
 property which holds the storage until control is returned to the
 worker eventloop, or until some explicit release API is called. The
 latter would be how localStorage is now defined, with the important
 difference that localStorage exposes the synchronous API to windows.

 Second, allow several named storage areas. We could add an API like
 getNamedClientStorage(name, callback). This would allow two different
 workers to simultaneously store things in storage areas, as long as
 they don't need to use the *same* storage area. It would also allow a
 worker and the main window to simultaneously use separate storage
 areas.

 However we need to be careful if we add both above features. We can't
 allow a worker to grab multiple storage areas at the same time since
 that could cause deadlocks. However with proper APIs I believe we can
 avoid that.

 / Jonas
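
A minimal sketch of how the callback-only `getClientStorage` idea quoted above might behave, added for illustration and not taken from the thread. It runs in a single JavaScript context, so it only models the queueing discipline, not real cross-thread exclusion. `createClientStorage`, the `Map` stand-in for the store, and the injectable `schedule` parameter are all assumptions of this sketch.

```javascript
// Hypothetical model: callers never hold the store directly. They queue a
// callback, and exactly one callback at a time is handed the store.
function createClientStorage(schedule = queueMicrotask) {
  const data = new Map(); // stand-in for a real persistent store
  const waiting = [];     // callbacks waiting for exclusive access
  let busy = false;

  function drain() {
    if (busy) return;     // someone already holds the store
    busy = true;
    try {
      while (waiting.length > 0) {
        const callback = waiting.shift();
        callback(data);   // the store is held only while this call runs
      }
    } finally {
      busy = false;
    }
  }

  return {
    getClientStorage(callback) {
      waiting.push(callback);
      schedule(drain);    // by default deferred, so the caller never blocks
    },
  };
}
```

A request made from inside a callback is queued and served only after the current holder returns, which is the "no one else can get a reference while you're using it" property described in the proposal.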



[whatwg] the cite element

2009-09-16 Thread Jim Jewett
In 
http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2009-September/023005.html,
Ian quoted Erik Vorhes as writing:

 Put another way, if you had no prior knowledge of the current HTML5
 definition of <cite> (and perhaps any other specification's definition
 of the element), what would seem to be logical and appropriate uses of
 the element?

Ian:
 You mean based on just the element name? I wouldn't use it without reading
 the spec first. Most people seem to think it means italics, though, for
 what that's worth.

I think that gets at the root of the problem with <cite>.  Most people
don't read the spec, or even know where to find it.  <cite> isn't common
enough to just copy by example, and it turns out to be ambiguous as
the name of an element or attribute.


Do you wrap the actual excerpt (the precise thing you're citing), or
the name of the source?  If you wrap the name/title of the source, is
there a way to show the scope of what you're attributing?

The HTML 4 definition ("CITE: Contains a citation or a reference to
other sources.") didn't help much, but I'm not sure it can be fixed by
a spec change.  If you have to look it up, then only careful people
will use it properly.  (On the other hand, if there is any HTML
element whose users are likely to be extra careful, <cite> is a strong
candidate.)

My own interpretation of (a fraction of)
http://philip.html5.org/data/cite.txt did not support narrowing the
definition only to titles.  For example

(1)  Examples of citing a person, arguably the creator.

(1a)  http://www.hiddenmickeys.org/Movies/MaryPoppins.html

The <cite> element is used to give credit to the person who
found/verified each Hidden Mickey:
<CITE>REPORTED: <A HREF="mailto:...">Beverly O'Dell</A> 12 MAR 98</CITE>
<CITE>UPDATE: Greg Bevier 29 JUL 98</CITE>

(1b)  http://www.webporter.com -- they give the author of the article.
 But it looks like they (at least sometimes) include the title as
well, which fits under full citation.

(1c)  http://www.thesentencegame.com/ -- a link to the snipped author.

(1d)  http://drotner.com/squirtboating/  -- the photographer and subject
<cite class="subject">Paddler: Kelly McCauley</cite>
<cite class="attribution">Photo: April McCauley, 2001</cite>

These do seem useful; if you wanted more information, it might well be
How do I contact this photographer or that model to get something
similar?


(2)  Several uses -- and several *non-uses* for titles from
http://www.growndodo.com/wordplay/oulipo/

The page begins with carefully attributed blockquotes.  These are
*not* done with <cite>, presumably because it didn't seem flexible
enough.  Instead, it was marked up as

<p class="quote">...
<p class="citation">
  <span class="citationauthor">Fran&ccedil;ois Le Lionnais</span>,
  <span class="citationsource">Lipo: First Manifesto</span></p>


Within the text, <cite> was used to point to source materials, but
there didn't seem to be anything quoted; in most cases the texts were
used as example objects of study; if they actually need a title
markup, then so does the specific Viking ship in Leif's example.
Sample usage:   <cite>S + 7</cite> (substrata (&quot;novelette&quot; +
7) does appear to be a title.

At the end of the page, there is a further readings section.
<dt>author<cite>title</cite>publisher</dt> is used for printed
reference books
but
<p class="linklist"><a href ... is used for equivalent references
on the web,
and <cite> is also used to name the professor of a course
<cite>4-5 units, <a
href="http://www.centerforbookculture.org/dalkey/bio_gsorrentino.html">Sorrentino</a></cite>


(3)  Example of usage as per HTML5
http://www.pacifier.com/~tpope/

(4)  Example of italics -- though they may be going for the
commendation meaning of cite:
 http://www.patriagrande.net/guatemala/otto.htm

(5)  Clearly just for italics -- http://www.truck-town.com/

(6)  http://www.winthrop.dk/hender.html -- Using it to wrap the
portion of your own text that was cited as opposed to original.

That said, I can't rule out that it was just a way to get italics;
later on the page, there were <cite>s for "shot heard round
the&#10;world" (title of event?) and "revolutionaries" (describing the
original settlers).




-jJ


Re: [whatwg] LocalStorage in workers

2009-09-16 Thread Michael Nordman
On Wed, Sep 16, 2009 at 9:58 AM, Drew Wilson atwil...@google.com wrote:

 Jeremy, what's the use case here - do developers want workers to have
 access to shared local storage with pages? Or do they just want workers to
 have access to their own non-shared local storage?
 Because we could just give workers their own separate WorkerLocalStorage
 and let them have at it. A worker could block all the other accesses to
 WorkerLocalStorage within that domain, but so be it - it wouldn't affect
 page access, and we already had that issue with the (now removed?)
 synchronous SQL API.

 I think a much better case can be made for WorkerLocalStorage than for
 give workers access to page LocalStorage, and the design issues are much
 simpler.


Putting workers in their own storage silo doesn't really make much sense?
Sure it may be simpler for browser vendors, but does that make life simpler
for app developers, or just have them scratching their heads about how to
read/write the same data set from either flavor of context in their
application?

I see no rhyme or reason for the arbitrary barrier except for browser
vendors to work around the awkward implicit locks on LocalStorage (the source
of much grief). Consider this... would it make sense to cordon off the
databases workers vs pages can see? I would think not, and I would hope
others agree.




 -atw

 On Tue, Sep 15, 2009 at 8:27 PM, Jonas Sicking jo...@sicking.cc wrote:

 On Tue, Sep 15, 2009 at 6:56 PM, Jeremy Orlow jor...@chromium.org
 wrote:
  One possible solution is to add an asynchronous callback interface for
  LocalStorage into workers.  For example:
  function myCallback(localStorage) {
    localStorage.accountBalance = localStorage.accountBalance + 100;
  }
  executeLocalStorageCallback(myCallback);  // TODO: Make this name better :-)
  The interface is simple.  You can only access localStorage via a callback.
  Any use outside of the callback is illegal and would raise an exception.
  The callback would acquire the storage mutex during execution, but the
  worker's execution would not block during this time.  Of course, it's still
  possible for a poorly behaving worker to do large amounts of computation in
  the callback, but hopefully the fact they're executing in a callback makes
  the developer more aware of the problem.

 First off, I agree that not having localStorage in workers is a big
 problem that we need to address.

 If I were designing the localStorage interface today I would use the
 above interface that you suggest. Grabbing localStorage can only be
 done asynchronously, and while you're using it, no one else can get a
 reference to it. This way there are no race conditions, but also no
 way for anyone to have to lock.

 So one solution is to do that in parallel to the current localStorage
 interface. Let's say we introduce a 'clientStorage' object. You can
 only get a reference to it using a 'getClientStorage' function. This
 function is available both to workers and windows. The storage is
 separate from localStorage so no need to worry about the 'storage
 mutex'.

 There is of course a risk that a worker grabs on to the clientStorage
 and holds it indefinitely. This would result in the main window (or
 another worker) never getting a reference to it. However it doesn't
 affect responsiveness of that window, it's just that the callback will
 never happen. While that's not ideal, it seems like a smaller problem
 than any other solution that I can think of. And the WebDatabase
 interfaces are suffering from the same problem if I understand things
 correctly.

 There's a couple of other interesting things we could expose on top of
 this:

 First, a synchronous API for workers. We could allow workers to
 synchronously get a reference to clientStorage. If someone is
 currently using clientStorage then the worker blocks until the storage
 becomes available. We could either use a callback as the above, which
 blocks until the clientStorage is acquired and only holds the storage
 until the callback exits. Or we could expose clientStorage as a
 property which holds the storage until control is returned to the
 worker eventloop, or until some explicit release API is called. The
 latter would be how localStorage is now defined, with the important
 difference that localStorage exposes the synchronous API to windows.

 Second, allow several named storage areas. We could add an API like
 getNamedClientStorage(name, callback). This would allow two different
 workers to simultaneously store things in storage areas, as long as
 they don't need to use the *same* storage area. It would also allow a
 worker and the main window to simultaneously use separate storage
 areas.

 However we need to be careful if we add both above features. We can't
 allow a worker to grab multiple storage areas at the same time since
 that could cause deadlocks. However with proper APIs I believe we can
 avoid that.

 / Jonas





Re: [whatwg] LocalStorage in workers

2009-09-16 Thread James Robinson
On Wed, Sep 16, 2009 at 10:53 AM, Michael Nordman micha...@google.com wrote:



 On Wed, Sep 16, 2009 at 9:58 AM, Drew Wilson atwil...@google.com wrote:

 Jeremy, what's the use case here - do developers want workers to have
 access to shared local storage with pages? Or do they just want workers to
 have access to their own non-shared local storage?
 Because we could just give workers their own separate WorkerLocalStorage
 and let them have at it. A worker could block all the other accesses to
 WorkerLocalStorage within that domain, but so be it - it wouldn't affect
 page access, and we already had that issue with the (now removed?)
 synchronous SQL API.

 I think a much better case can be made for WorkerLocalStorage than for
 give workers access to page LocalStorage, and the design issues are much
 simpler.


 Putting workers in their own storage silo doesn't really make much sense?
 Sure it may be simpler for browser vendors, but does that make life simpler
  for app developers, or just have them scratching their heads about how to
 read/write the same data set from either flavor of context in their
 application?

 I see no rhyme or reason for the arbitrary barrier except for browser
 vendors to work around the awkward implicit locks on LocalStorage (the source
 of much grief). Consider this... would it make sense to cordon off the
 databases workers vs pages can see? I would think not, and i would hope
 others agree.


The difference is that the database interface is purely asynchronous whereas
storage is synchronous.

If multiple threads have synchronous access to the same shared resource then
there has to be a consistency model.  ECMAScript does not provide for one so
it has to be done at a higher level.  Since there was not a solution in the
first versions that shipped, the awkward implicit locks you mention were
suggested as a workaround.  However it's far from clear that these solve the
problem and are implementable.  It seems like the only logical continuation
of this path would be to add explicit, blocking synchronization primitives
for developers to deal with - which I think everyone agrees would be a
terrible idea.  If you're worried about developers scratching their heads
about how to pass data between workers just think about happens-before
relationships and multi-threaded memory models.

In a hypothetical world without synchronous access to LocalStorage/cookies
from workers, there is no shared memory between threads except via message
passing.  This can seem a bit tricky for developers but is very easy to
reason about and prove correctness and the absence of deadlocks.
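
To illustrate the message-passing model described here, the following sketch (an addition, not from the thread) gives one context sole ownership of the data and lets everyone else send it messages. The message shapes and the name `createStorageOwner` are hypothetical. In a real page the handler would be wired to a worker's message events rather than called directly.

```javascript
// Hypothetical owner of the data: no other context holds a reference to
// `store`, so there is no shared memory to synchronize.
function createStorageOwner() {
  const store = new Map();
  // Each message is handled to completion before the next one is looked at,
  // which is what an event loop's message queue provides.
  return function onMessage(message) {
    switch (message.type) {
      case "get":
        return { key: message.key, value: store.get(message.key) };
      case "add": {
        const next = (store.get(message.key) ?? 0) + message.amount;
        store.set(message.key, next);
        return { key: message.key, value: next };
      }
      default:
        throw new Error(`unknown message type: ${message.type}`);
    }
  };
}

// Two senders each ask for an increment; the owner applies them in order.
const owner = createStorageOwner();
owner({ type: "add", key: "accountBalance", amount: 100 });
owner({ type: "add", key: "accountBalance", amount: 100 });
const reply = owner({ type: "get", key: "accountBalance" });
```

Because the read-modify-write happens entirely inside the owner, two increments cannot interleave, which is the correctness property the paragraph above is pointing at.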

- James






 -atw

 On Tue, Sep 15, 2009 at 8:27 PM, Jonas Sicking jo...@sicking.cc wrote:

 On Tue, Sep 15, 2009 at 6:56 PM, Jeremy Orlow jor...@chromium.org
 wrote:
  One possible solution is to add an asynchronous callback interface for
  LocalStorage into workers.  For example:
  function myCallback(localStorage) {
    localStorage.accountBalance = localStorage.accountBalance + 100;
  }
  executeLocalStorageCallback(myCallback);  // TODO: Make this name better :-)
  The interface is simple.  You can only access localStorage via a callback.
  Any use outside of the callback is illegal and would raise an exception.
  The callback would acquire the storage mutex during execution, but the
  worker's execution would not block during this time.  Of course, it's still
  possible for a poorly behaving worker to do large amounts of computation in
  the callback, but hopefully the fact they're executing in a callback makes
  the developer more aware of the problem.

 First off, I agree that not having localStorage in workers is a big
 problem that we need to address.

 If I were designing the localStorage interface today I would use the
 above interface that you suggest. Grabbing localStorage can only be
 done asynchronously, and while you're using it, no one else can get a
 reference to it. This way there are no race conditions, but also no
 way for anyone to have to lock.

 So one solution is to do that in parallel to the current localStorage
 interface. Let's say we introduce a 'clientStorage' object. You can
 only get a reference to it using a 'getClientStorage' function. This
 function is available both to workers and windows. The storage is
 separate from localStorage so no need to worry about the 'storage
 mutex'.

 There is of course a risk that a worker grabs on to the clientStorage
 and holds it indefinitely. This would result in the main window (or
 another worker) never getting a reference to it. However it doesn't
 affect responsiveness of that window, it's just that the callback will
 never happen. While that's not ideal, it seems like a smaller problem
 than any other solution that I can think of. And the WebDatabase
 interfaces are suffering from the same problem if I understand things
 correctly.

 There's a couple of other interesting things we could expose on top of
 this:

 First, a synchronous API 

Re: [whatwg] LocalStorage in workers

2009-09-16 Thread Michael Nordman
On Wed, Sep 16, 2009 at 11:24 AM, James Robinson jam...@google.com wrote:

 On Wed, Sep 16, 2009 at 10:53 AM, Michael Nordman micha...@google.com wrote:



 On Wed, Sep 16, 2009 at 9:58 AM, Drew Wilson atwil...@google.com wrote:

 Jeremy, what's the use case here - do developers want workers to have
 access to shared local storage with pages? Or do they just want workers to
 have access to their own non-shared local storage?
 Because we could just give workers their own separate WorkerLocalStorage
 and let them have at it. A worker could block all the other accesses to
 WorkerLocalStorage within that domain, but so be it - it wouldn't affect
 page access, and we already had that issue with the (now removed?)
 synchronous SQL API.

 I think a much better case can be made for WorkerLocalStorage than for
 give workers access to page LocalStorage, and the design issues are much
 simpler.


 Putting workers in their own storage silo doesn't really make much sense?
 Sure it may be simpler for browser vendors, but does that make life simpler
  for app developers, or just have them scratching their heads about how to
 read/write the same data set from either flavor of context in their
 application?

 I see no rhyme or reason for the arbitrary barrier except for browser
 vendors to work around the awkward implicit locks on LocalStorage (the source
 of much grief). Consider this... would it make sense to cordon off the
 databases workers vs pages can see? I would think not, and i would hope
 others agree.


 The difference is that the database interface is purely asynchronous
 whereas storage is synchronous.


Sure... we're talking about adding an async API that allows workers to access
a local storage repository... should such a thing exist, why should it not
provide access to the same repository as seen by pages?


 If multiple threads have synchronous access to the same shared resource
 then there has to be a consistency model.  ECMAScript does not provide for
 one so it has to be done at a higher level.  Since there was not a solution
 in the first versions that shipped, the awkward implicit locks you mention
 were suggested as a workaround.  However it's far from clear that these
 solve the problem and are implementable.  It seems like the only logical
 continuation of this path would be to add explicit, blocking synchronization
 primitives for developers to deal with - which I think everyone agrees would
 be a terrible idea.  If you're worried about developers scratching their
 heads about how to pass data between workers just think about happens-before
 relationships and multi-threaded memory models.

 In a hypothetical world without synchronous access to LocalStorage/cookies
 from workers, there is no shared memory between threads except via message
 passing.  This can seem a bit tricky for developers but is very easy to
 reason about and prove correctness and the absence of deadlocks.

 - James






 -atw

 On Tue, Sep 15, 2009 at 8:27 PM, Jonas Sicking jo...@sicking.cc wrote:

 On Tue, Sep 15, 2009 at 6:56 PM, Jeremy Orlow jor...@chromium.org
 wrote:
  One possible solution is to add an asynchronous callback interface for
  LocalStorage into workers.  For example:
  function myCallback(localStorage) {
    localStorage.accountBalance = localStorage.accountBalance + 100;
  }
  executeLocalStorageCallback(myCallback);  // TODO: Make this name better :-)
  The interface is simple.  You can only access localStorage via a callback.
  Any use outside of the callback is illegal and would raise an exception.
  The callback would acquire the storage mutex during execution, but the
  worker's execution would not block during this time.  Of course, it's still
  possible for a poorly behaving worker to do large amounts of computation in
  the callback, but hopefully the fact they're executing in a callback makes
  the developer more aware of the problem.

 First off, I agree that not having localStorage in workers is a big
 problem that we need to address.

 If I were designing the localStorage interface today I would use the
 above interface that you suggest. Grabbing localStorage can only be
 done asynchronously, and while you're using it, no one else can get a
 reference to it. This way there are no race conditions, but also no
 way for anyone to have to lock.

 So one solution is to do that in parallel to the current localStorage
 interface. Let's say we introduce a 'clientStorage' object. You can
 only get a reference to it using a 'getClientStorage' function. This
 function is available both to workers and windows. The storage is
 separate from localStorage so no need to worry about the 'storage
 mutex'.

 There is of course a risk that a worker grabs on to the clientStorage
 and holds it indefinitely. This would result in the main window (or
 another worker) never getting a reference to it. However it doesn't
 affect responsiveness of that window, it's just that the callback will
 never happen. While 

Re: [whatwg] the cite element

2009-09-16 Thread Erik Vorhes
A use case for a person's name in the context of cite:

In reference to many Classical texts one will often refer to the
author in lieu of the title (or in some cases that author's corpus).
E.g.:

<p>You should read <cite>Herodotus</cite>.</p>



Erik


Re: [whatwg] LocalStorage in workers

2009-09-16 Thread Drew Wilson
I'm saying that an async API is overkill and unwieldy if all you need is
WorkerLocalStorage.
If you're going to route your localStorage access through an async API
anyway, then you might as well proxy it to the parent page - there's very
little advantage to doing it otherwise, other than access to lexically
scoped resources from within your callback.

But, yeah, if you want to provide access to shared worker/page storage, then
an async API would be the way to go - I'm just saying that if you don't
actually need shared storage, then you could maintain a more convenient
synchronous silo'd API.

Since Jeremy didn't really elaborate on the use case, and he's getting
feedback from app developers that I'm not privy to, I figured I'd ask him.

-atw


Re: [whatwg] LocalStorage in workers

2009-09-16 Thread James Robinson
On Wed, Sep 16, 2009 at 11:34 AM, Michael Nordman micha...@google.com wrote:

 On Wed, Sep 16, 2009 at 11:24 AM, James Robinson jam...@google.com wrote:

 On Wed, Sep 16, 2009 at 10:53 AM, Michael Nordman micha...@google.com wrote:

 On Wed, Sep 16, 2009 at 9:58 AM, Drew Wilson atwil...@google.com wrote:

 Jeremy, what's the use case here - do developers want workers to have
 access to shared local storage with pages? Or do they just want workers to
 have access to their own non-shared local storage?
 Because we could just give workers their own separate WorkerLocalStorage
 and let them have at it. A worker could block all the other accesses to
 WorkerLocalStorage within that domain, but so be it - it wouldn't affect
 page access, and we already had that issue with the (now removed?)
 synchronous SQL API.

 I think a much better case can be made for WorkerLocalStorage than for
 giving workers access to page LocalStorage, and the design issues are much
 simpler.


 Putting workers in their own storage silo doesn't really make much sense?
 Sure it may be simpler for browser vendors, but does that make life simpler
  for app developers, or just have them scratching their heads about how to
 read/write the same data set from either flavor of context in their
 application?

 I see no rhyme or reason for the arbitrary barrier except for browser
 vendors to work around the awkward implicit locks on LocalStorage (the source
 of much grief). Consider this... would it make sense to cordon off the
 databases workers vs pages can see? I would think not, and i would hope
 others agree.


 The difference is that the database interface is purely asynchronous
 whereas storage is synchronous.


 Sure... we're talking about adding an async api that allows worker to
 access a local storage repository... should such a thing exist, why should
 it not provide access to the same repository as seen by pages?


Not quite - Jeremy proposed giving workers access to a synchronous API
(localStorage.*) but to only allow it to be called within the context of a
callback that the UA can run when it chooses.  It's another way to approach
the implicit locking since a UA would have to, in effect, hold the storage
mutex for the duration of the callback.  The page's context could still be
blocked for an indefinite amount of time by a worker thread.

Drew suggested isolating the worker's access to a separate storage 'arena'
so that there wouldn't be shared, synchronous access between the page
context and a worker context.  This way the synchronous Storage API can be
used essentially unchanged without having to deal with the more nasty parts
of synchronization.

- James




Re: [whatwg] Spec comments, section 4.11

2009-09-16 Thread Aryeh Gregor
On Mon, Sep 14, 2009 at 7:52 AM, Ian Hickson i...@hixie.ch wrote:
 Tool bar appears to be the historically correct term, toolbar seems to
 be a new spelling.

I can't recall ever seeing tool bar before.  [toolbar] has
177,000,000 hits on Google, [tool bar] has 2,350,000.  And most of
the top hits for the latter are actually toolbar (Google
cleverly corrects my (mis)spelling).

 It means the same as it always means in computer science:

   http://en.wikipedia.org/wiki/Variable_shadowing

Hmm, fair enough.  It's not a term I recognized offhand, but it seems
pretty standard, you're right.


Re: [whatwg] LocalStorage in workers

2009-09-16 Thread Jeremy Orlow
On Wed, Sep 16, 2009 at 1:06 PM, James Robinson jam...@google.com wrote:

 On Wed, Sep 16, 2009 at 11:34 AM, Michael Nordman micha...@google.com wrote:

 On Wed, Sep 16, 2009 at 11:24 AM, James Robinson jam...@google.com wrote:

 On Wed, Sep 16, 2009 at 10:53 AM, Michael Nordman micha...@google.com wrote:

 On Wed, Sep 16, 2009 at 9:58 AM, Drew Wilson atwil...@google.com wrote:

 Jeremy, what's the use case here - do developers want workers to have
 access to shared local storage with pages? Or do they just want workers to
 have access to their own non-shared local storage?
 Because we could just give workers their own separate
 WorkerLocalStorage and let them have at it. A worker could block all the
 other accesses to WorkerLocalStorage within that domain, but so be it - it
 wouldn't affect page access, and we already had that issue with the (now
 removed?) synchronous SQL API.

 I think a much better case can be made for WorkerLocalStorage than for
 giving workers access to page LocalStorage, and the design issues are much
 simpler.


 Putting workers in their own storage silo doesn't really make much
 sense? Sure it may be simpler for browser vendors, but does that make life
 simpler  for app developers, or just have them scratching their heads about
 how to read/write the same data set from either flavor of context in their
 application?

 I see no rhyme or reason for the arbitrary barrier except for browser
 vendors to work around the awkward implicit locks on LocalStorage (the
 source of much grief). Consider this... would it make sense to cordon off the
 databases workers vs pages can see? I would think not, and i would hope
 others agree.


 The difference is that the database interface is purely asynchronous
 whereas storage is synchronous.


 Sure... we're talking about adding an async api that allows worker to
 access a local storage repository... should such a thing exist, why should
 it not provide access to the same repository as seen by pages?


 Not quite - Jeremy proposed giving workers access to a synchronous API
 (localStorage.*) but to only allow it to be called within the context of a
 callback that the UA can run when it chooses.  It's another way to approach
 the implicit locking since a UA would have to, in effect, hold the storage
 mutex for the duration of the callback.  The page's context could still be
 blocked for an indefinite amount of time by a worker thread.


Exactly.  And I acknowledged this in my original email.

Unfortunately, I don't have a good solution to the problem.  The only thing
I can think of is a timeout, but that would be inherently racy.  There
are times when the system is under heavy load and _everything_ goes slower.
I don't see how we could enforce that workers don't keep LocalStorage
locked for long enough that UI threads become affected and still keep
behavior deterministic from the web developer's point of view.

That said, IF (and this is a big if) we decide to make localStorage NOT have
any run-to-completion semantics across event loops and instead add in an
async interface for atomic access to localStorage, then this would work.


 Drew suggested isolating the worker's access to a separate storage 'arena'
 so that there wouldn't be shared, synchronous access between the page
 context and a worker context.  This way the synchronous Storage API can be
 used essentially unchanged without having to deal with the more nasty parts
 of synchronization.


Drew, the most important thing to the developers I talked to is that they
need _some_ storage directly accessible to workers.  Many were thinking in
terms of a shared worker syncing and doing the bulk of the processing and
then one or more pages acting as thin clients.  Some of them wanted to use
it like shared memory (i.e. workers syncing and doing some processing and
then storing information in storage; pages then display based on that data).
This is especially interesting because storage events can be used to
trigger updates to content.

As I think about it, I suppose most of the use cases could actually be
solved by a storage for workers, whether or not pages can also access it.
If the burden were low, it would be nice to make it accessible to pages too.
If we did this, keeping some form of storage events would be nice.

Note that most of the developers I talked to thought LocalStorage was not
powerful enough for their needs, but since it's the only API supported by
all the browsers they figured they were stuck with it, so they'd have to
find _some_ way to make it work.

So maybe it doesn't make sense to spend a lot of effort making
LocalStorage (in some form or another) work in workers.  Maybe we should
instead put our effort into something like WebSimpleDatabase?

J


Re: [whatwg] LocalStorage in workers

2009-09-16 Thread Jonas Sicking
On Wed, Sep 16, 2009 at 12:58 PM, Drew Wilson atwil...@google.com wrote:
 I'm saying that an async API is overkill and unwieldy if all you need is
 WorkerLocalStorage.
 If you're going to route your localStorage access through an async API
 anyway, then you might as well proxy it to the parent page - there's very
 little advantage to doing it otherwise, other than access to lexically
 scoped resources from within your callback.

Actually, there's a pretty big difference. With the current state of
affairs, if a worker wants to make a computation based on values in
the localStorage, and store the result in localStorage, this is
extremely hard.

For example, say that a worker wants to perform the following operation:

localStorage.result = F(localStorage.n);

where F(n) is the n:th value in the Fibonacci sequence.

To do this today the worker would first have to call the main window
to get the localStorage.n value. It could then calculate the result
of F(localStorage.n). It would then send a message to the main window
to store the result in localStorage.result. However in the meantime
localStorage.n might have changed, which causes an inconsistent state.

So instead the worker has to send both the value of localStorage.n as
well as the result to the window. The window can then check if
localStorage.n has changed. If it has changed, the window has to send
the new value back to the worker, and then the worker has to redo its
calculation.

This has several problems. It's bug prone since the developer might
not realize the race condition. It's very hard to do correctly. And
even when done correctly risks wasting a lot of cycles.
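
As an illustration only (not part of the original message), the
check-and-retry round trip described above might look roughly like the
sketch below. Everything here is an assumption made for the example:
`pageStorage` is a plain object standing in for the page's localStorage,
and `pageCommit` and `workerUpdate` are hypothetical names. The
asynchronous messages between worker and window are collapsed into direct
function calls so the sketch runs outside a browser.

```javascript
// Hypothetical stand-in for the page's localStorage (values are strings).
const pageStorage = { n: "10" };

// Iterative Fibonacci with fib(0) = 0 and fib(1) = 1.
function fib(n) {
  let a = 0, b = 1;
  for (let i = 0; i < n; i++) {
    [a, b] = [b, a + b];
  }
  return a;
}

// Page side: store the result only if n is still the value the worker used.
// On a mismatch, hand back the fresh value so the worker can redo the work.
function pageCommit(seenN, result) {
  if (pageStorage.n !== seenN) {
    return { ok: false, n: pageStorage.n };
  }
  pageStorage.result = String(result);
  return { ok: true };
}

// Worker side: compute, try to commit, and recompute on conflict.
// `interfere` simulates another script changing n after the first read.
function workerUpdate(initialN, interfere) {
  let n = initialN;
  let attempts = 0;
  for (;;) {
    attempts++;
    const result = fib(Number(n));
    if (attempts === 1 && interfere) interfere();
    const reply = pageCommit(n, result);
    if (reply.ok) return attempts;
    n = reply.n; // input was stale: the first computation is wasted
  }
}

const attempts = workerUpdate(pageStorage.n, () => { pageStorage.n = "12"; });
console.log(attempts, pageStorage.result);
```

In this run the first computation is thrown away because n changed
underneath it, which is the wasted-cycles cost mentioned above.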

An alternative solution is to do all calculations in the main window,
which has synchronous access to localStorage. But the whole point of a
worker is to avoid having to do heavy work in the window.

However, with the solution Jeremy proposed, the above calculation can be
done in the worker while the worker is inside the callback and thus has
synchronous access to localStorage.
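
For illustration only (not part of the original message), here is a
minimal sketch of how callback-scoped access could behave. Only the name
`executeLocalStorageCallback` comes from the proposal. Everything else is
an assumption made for the example: `sharedStorage` is a plain object
standing in for localStorage, `runQueuedCallbacks` is a hypothetical
stand-in for the UA deciding when to run the callbacks, and a revocable
Proxy is just one way to model "any use outside of the callback raises an
exception".

```javascript
// Hypothetical in-memory stand-in for the shared storage area.
const sharedStorage = { n: "10" };
const pending = [];

// Queue the callback; it is not run synchronously, so the caller never blocks.
function executeLocalStorageCallback(callback) {
  pending.push(callback);
}

// Models the UA running queued callbacks one at a time. Each callback gets a
// handle that is revoked afterwards, so a retained reference stops working.
function runQueuedCallbacks() {
  while (pending.length > 0) {
    const callback = pending.shift();
    const { proxy, revoke } = Proxy.revocable(sharedStorage, {});
    try {
      callback(proxy);
    } finally {
      revoke(); // access ends with the callback, even if it threw
    }
  }
}

// Iterative Fibonacci with fib(0) = 0 and fib(1) = 1.
function fib(n) {
  let a = 0, b = 1;
  for (let i = 0; i < n; i++) {
    [a, b] = [b, a + b];
  }
  return a;
}

let leaked; // deliberately kept to show that later use fails
executeLocalStorageCallback((storage) => {
  leaked = storage;
  // Read and write happen in one uninterrupted step.
  storage.result = String(fib(Number(storage.n)));
});
runQueuedCallbacks();

let threwOutsideCallback = false;
try {
  leaked.n;
} catch (e) {
  threwOutsideCallback = e instanceof TypeError;
}
console.log(sharedStorage.result, threwOutsideCallback);
```

The sketch does not model the contention problem: a page that touches
the same storage while a long callback runs would still have to wait.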

Say that instead of calculating Fibonacci numbers, we were storing a
database of emails in localStorage, and using a worker to synchronize
that database to a server. In this case it seems extremely complex to
have to communicate asynchronously through the window and deal with
race conditions where the user is modifying the email database at the
same time.

/ Jonas


Re: [whatwg] LocalStorage in workers

2009-09-16 Thread Jeremy Orlow
On Wed, Sep 16, 2009 at 2:27 PM, Jonas Sicking jo...@sicking.cc wrote:

 On Wed, Sep 16, 2009 at 12:58 PM, Drew Wilson atwil...@google.com wrote:
  I'm saying that an async API is overkill and unwieldy if all you need is
  WorkerLocalStorage.
  If you're going to route your localStorage access through an async API
  anyway, then you might as well proxy it to the parent page - there's very
  little advantage to doing it otherwise, other than access to lexically
  scoped resources from within your callback.

 Actually, there's a pretty big difference. With the current state of
 affairs, if a worker wants to make a computation based on values in
 the localStorage, and store the result in localStorage, this is
 extremely hard.

 For example, say that a worker wants to perform the following operation:

 localStorage.result = F(localStorage.n);

 where F(n) is the n:th value in the Fibonacci sequence.

 To do this today the worker would first have to call the main window
 to get the localStorage.n value. It could then calculate the result
 of F(localStorage.n). It would then send a message to the main window
 to store the result in localStorage.result. However in the meantime
 localStorage.n might have changed, which causes an inconsistent state.

 So instead the worker has to send both the value of localStorage.n as
 well as the result to the window. The window can then check if
 localStorage.n has changed. If it has changed, the window has to send
 the new value back to the worker, and then the worker has to redo its
 calculation.

 This has several problems. It's bug prone since the developer might
 not realize the race condition. It's very hard to do correctly. And
 even when done correctly risks wasting a lot of cycles.

 An alternative solution is to do all calculations in the main window,
 which has synchronous access to localStorage. But the whole point of a
 worker is to avoid having to do heavy work in the window.

 However, with the solution Jeremy proposed, the above calculation can be
 done in the worker while the worker is inside the callback and thus has
 synchronous access to localStorage.

 Say that instead of calculating Fibonacci numbers, we were storing a
 database of emails in localStorage, and using a worker to synchronize
 that database to a server. In this case it seems extremely complex to
 have to communicate asynchronously through the window and deal with
 race conditions where the user is modifying the email database at the
 same time.


True.

The problem is that some page from the same origin might also try to access
LocalStorage.  If it does, it'll block the entire event loop until the
worker is finished.  I can't think of how to fix this in a way that's not
racy.  My original proposal was written in the hope that developers would
be more cautious since they're doing things inside an async callback, but
the more I think about it, the more I think this isn't realistic.

I think we have 3 options:

1) Create a LocalStorage like API that can only be accessed in an async way
via pages (kind of like WebDatabase).

2) Remove any atomicity/consistency guarantees from synchronous LocalStorage
access within pages (like IE8 currently does) and add an async interface for
when pages do need atomicity/consistency.

3) Come up with a completely different storage API that all the browser
vendors are willing to implement that only allows Async access from within
pages.  WebSimpleDatabase might be a good starting point for this.


1 is probably the simplest to implement, but it seems pretty hacky and it's
likely not powerful enough for many advanced web apps (offline web mail
would be an example).  If we do 2, many (most?) web developers will just use
the sync interface and write racy apps.  3 will take the longest time to do,
but is definitely the best long term solution.


Do others agree with my list?  What's the best option out of these?
Honestly, I'm kind of leaning towards 3 at this point.

J


Re: [whatwg] Spec comments, section 4.11

2009-09-16 Thread Kevin Benson
On Wed, Sep 16, 2009 at 4:30 PM, Aryeh Gregor simetrical+...@gmail.com wrote:
 On Mon, Sep 14, 2009 at 7:52 AM, Ian Hickson i...@hixie.ch wrote:
 Tool bar appears to be the historically correct term, toolbar seems to
 be a new spelling.

 I can't recall ever seeing tool bar before.  [toolbar] has
 177,000,000 hits on Google, [tool bar] has 2,350,000.  And most of
 the top hits for the latter are actually toolbar (Google
 cleverly corrects my (mis)spelling).


Just for the sake of unscientific Google hits comparison:

It would _seem_ that browser terminology has embraced toolbar

[browser toolbar] has 64,900,000
[browser tool bar] has 3,200,000

whereas applications and programs favor references to tool bar

[application tool bar] has 186,000
[application toolbar] has 49,300

[program tool bar] has 109,000
[program toolbar] has 26,000

-- 
-- 
   --
   --
   ô¿ô¬
K e V i N
   /¯\


Re: [whatwg] LocalStorage in workers

2009-09-16 Thread Jonas Sicking
On Wed, Sep 16, 2009 at 2:56 PM, Jeremy Orlow jor...@chromium.org wrote:
 On Wed, Sep 16, 2009 at 2:27 PM, Jonas Sicking jo...@sicking.cc wrote:

 On Wed, Sep 16, 2009 at 12:58 PM, Drew Wilson atwil...@google.com wrote:
  I'm saying that an async API is overkill and unwieldy if all you need is
  WorkerLocalStorage.
  If you're going to route your localStorage access through an async API
  anyway, then you might as well proxy it to the parent page - there's
  very
  little advantage to doing it otherwise, other than access to lexically
  scoped resources from within your callback.

 Actually, there's a pretty big difference. With the current state of
 affairs, if a worker wants to make a computation based on values in
 the localStorage, and store the result in localStorage, this is
 extremely hard.

 For example, say that a worker wants to perform the following operation:

 localStorage.result = F(localStorage.n);

 where F(n) is the n:th value in the Fibonacci sequence.

 To do this today the worker would first have to call the main window
 to get the localStorage.n value. It could then calculate the result
 of F(localStorage.n). It would then send a message to the main window
 to store the result in localStorage.result. However in the meantime
 localStorage.n might have changed, which causes an inconsistent state.

 So instead the worker has to send both the value of localStorage.n as
 well as the result to the window. The window can then check if
 localStorage.n has changed. If it has changed, the window has to send
 the new value back to the worker, and then the worker has to redo its
 calculation.

 This has several problems. It's bug prone since the developer might
 not realize the race condition. It's very hard to do correctly. And
 even when done correctly risks wasting a lot of cycles.

 An alternative solution is to do all calculations in the main window,
 which has synchronous access to localStorage. But the whole point of a
 worker is to avoid having to do heavy work in the window.

 However, with the solution Jeremy proposed, the above calculation can be
 done in the worker while the worker is inside the callback and thus has
 synchronous access to localStorage.

 Say that instead of calculating Fibonacci numbers, we were storing a
 database of emails in localStorage, and using a worker to synchronize
 that database to a server. In this case it seems extremely complex to
 have to communicate asynchronously through the window and deal with
 race conditions where the user is modifying the email database at the
 same time.

 True.
 The problem is that some page from the same origin might also try to access
 LocalStorage.  If it does, it'll block the entire event loop until the
 worker is finished.  I can't think of how to fix this in a way that's not
 racy.  My original proposal was written in the hope that developers would
 be more cautious since they're doing things inside an async callback, but
 the more I think about it, the more I think this isn't realistic.
 I think we have 3 options:
 1) Create a LocalStorage like API that can only be accessed in an async way
 via pages (kind of like WebDatabase).
 2) Remove any atomicity/consistency guarantees from synchronous LocalStorage
 access within pages (like IE8 currently does) and add an async interface for
 when pages do need atomicity/consistency.
 3) Come up with a completely different storage API that all the browser
 vendors are willing to implement that only allows Async access from within
 pages.  WebSimpleDatabase might be a good starting point for this.

 1 is probably the simplest to implement, but it seems pretty hacky and it's
 likely not powerful enough for many advanced web apps (offline web mail
 would be an example).  If we do 2, many (most?) web developers will just use
 the sync interface and write racy apps.  3 will take the longest time to do,
 but is definitely the best long term solution.

I think 2 is right out. 1 is what we should have done in the first
place if we had thought about the multiple processes thing. The only
thing that's bad about 1 is that we're creating two extremely similar
features. So I'd say 1 is unfortunate rather than hacky.

3 is something that I personally think we should do no matter what,
as I'm not a big fan of the current SQL interface. But I wonder if we
might want to do 1 anyway. After all, localStorage and the SQL APIs
were both suggested, presumably to allow localStorage to handle the
simple cases and SQL to handle the more complex ones.

/ Jonas


Re: [whatwg] LocalStorage in workers

2009-09-16 Thread Robert O'Callahan
On Thu, Sep 17, 2009 at 9:56 AM, Jeremy Orlow jor...@chromium.org wrote:

 1) Create a LocalStorage like API that can only be accessed in an async way
 via pages (kind of like WebDatabase).

 2) Remove any atomicity/consistency guarantees from synchronous LocalStorage
 access within pages (like IE8 currently does) and add an async interface for
 when pages do need atomicity/consistency.

 3) Come up with a completely different storage API that all the browser
 vendors are willing to implement that only allows Async access from within
 pages.  WebSimpleDatabase might be a good starting point for this.


4) Create WorkerStorage so that shared workers have exclusive, synchronous
access to their own persistent storage via an API compatible with
LocalStorage.

This sounds like it has a low implementation cost and solves many use cases
in a very simple way, right?
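
For illustration only (not part of the original message), option 4 could
be pictured as a separate storage area that exposes the same method names
as LocalStorage. The class name `WorkerStorageSketch` is hypothetical, and
the sketch keeps its data in memory, whereas the proposal is for
persistent storage. It only shows the API shape and the isolation between
two areas.

```javascript
// Hypothetical in-memory model of a LocalStorage-compatible, siloed store.
class WorkerStorageSketch {
  #items = new Map();

  // Number of stored key/value pairs.
  get length() {
    return this.#items.size;
  }

  // Name of the key at the given position, or null if out of range.
  key(index) {
    const keys = Array.from(this.#items.keys());
    return index >= 0 && index < keys.length ? keys[index] : null;
  }

  // Stored value as a string, or null if the key is absent.
  getItem(key) {
    const k = String(key);
    return this.#items.has(k) ? this.#items.get(k) : null;
  }

  // Keys and values are converted to strings, as in LocalStorage.
  setItem(key, value) {
    this.#items.set(String(key), String(value));
  }

  removeItem(key) {
    this.#items.delete(String(key));
  }

  clear() {
    this.#items.clear();
  }
}

// Two independent areas: one for workers, one standing in for the page's.
const workerStore = new WorkerStorageSketch();
const pageStore = new WorkerStorageSketch();

workerStore.setItem("lastSync", 1253145600000);
console.log(workerStore.getItem("lastSync")); // value comes back as a string
console.log(pageStore.getItem("lastSync"));   // null: the areas share nothing
```

Because the two areas share nothing, synchronous access from a worker
cannot block a page's event loop, which is the main attraction of this
option.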

Rob
-- 
He was pierced for our transgressions, he was crushed for our iniquities;
the punishment that brought us peace was upon him, and by his wounds we are
healed. We all, like sheep, have gone astray, each of us has turned to his
own way; and the LORD has laid on him the iniquity of us all. [Isaiah
53:5-6]


Re: [whatwg] LocalStorage in workers

2009-09-16 Thread Jeremy Orlow
On Wed, Sep 16, 2009 at 3:21 PM, Robert O'Callahan rob...@ocallahan.org wrote:

 On Thu, Sep 17, 2009 at 9:56 AM, Jeremy Orlow jor...@chromium.org wrote:

 1) Create a LocalStorage like API that can only be accessed in an async
 way via pages (kind of like WebDatabase).

 2) Remove any atomicity/consistency guarantees from synchronous LocalStorage
 access within pages (like IE8 currently does) and add an async interface for
 when pages do need atomicity/consistency.

 3) Come up with a completely different storage API that all the browser
 vendors are willing to implement that only allows Async access from within
 pages.  WebSimpleDatabase might be a good starting point for this.


 4) Create WorkerStorage so that shared workers have exclusive, synchronous
 access to their own persistent storage via an API compatible with
 LocalStorage.


Ah yes.  That is also an option.

And, now that I think about it (combined with Jonas' last point) I think it
might be the best option since it has a very low implementation cost, it
keeps the very simple API, and solves the primary problem of not blocking
pages' event loops.


Re: [whatwg] LocalStorage in workers

2009-09-16 Thread Drew Wilson
Thanks, Robert - I didn't want to second my own proposal :)
I think that #4 is probably a reasonable bridge API until we come up with a
consensus API for #3. For myself, I see this API as being very useful for
persistent workers (yes, I'm still banging that drum :).

-atw

On Wed, Sep 16, 2009 at 3:21 PM, Robert O'Callahan rob...@ocallahan.org wrote:

 On Thu, Sep 17, 2009 at 9:56 AM, Jeremy Orlow jor...@chromium.org wrote:

 1) Create a LocalStorage like API that can only be accessed in an async
 way via pages (kind of like WebDatabase).

 2) Remove any
 atomicity/consistency guarantees from synchronous LocalStorage access within
 pages (like IE8 currently does) and add an async interface for when pages do
 need atomicity/consistency.

 3) Come up with a completely different storage API that all the browser
 vendors are willing to implement that only allows Async access from within
 pages.  WebSimpleDatabase might be a good starting point for this.


 4) Create WorkerStorage so that shared workers have exclusive, synchronous
 access to their own persistent storage via an API compatible with
 LocalStorage.

 This sounds like it has a low implementation cost and solves many use cases
 in a very simple way, right?

 Rob
 --
 He was pierced for our transgressions, he was crushed for our iniquities;
 the punishment that brought us peace was upon him, and by his wounds we are
 healed. We all, like sheep, have gone astray, each of us has turned to his
 own way; and the LORD has laid on him the iniquity of us all. [Isaiah
 53:5-6]



Re: [whatwg] LocalStorage in workers

2009-09-16 Thread Jonas Sicking
On Wed, Sep 16, 2009 at 3:21 PM, Robert O'Callahan rob...@ocallahan.org wrote:
 On Thu, Sep 17, 2009 at 9:56 AM, Jeremy Orlow jor...@chromium.org wrote:

 1) Create a LocalStorage like API that can only be accessed in an async
 way via pages (kind of like WebDatabase).
 2) Remove any
 atomicity/consistency guarantees from synchronous LocalStorage access within
 pages (like IE8 currently does) and add an async interface for when pages do
 need atomicity/consistency.
 3) Come up with a completely different storage API that all the browser
 vendors are willing to implement that only allows Async access from within
 pages.  WebSimpleDatabase might be a good starting point for this.

 4) Create WorkerStorage so that shared workers have exclusive, synchronous
 access to their own persistent storage via an API compatible with
 LocalStorage.

I think some of the use cases require that code running in Window
objects can access the same storage area though. Consider for example
an email web app that uses a WorkerStorage area to store email
data locally (for performance and for offline support), and then uses
a worker to synchronize that with the server.

Here the code running in the Window wants to access the storage in
order to render the emails in the page, and the worker wants to access
it to synchronize with the server.

See my email earlier in this thread. If we change the name from
'clientStorage' to 'workerStorage', while still allowing the main
window to asynchronously get a reference to the storage, then I think
that about matches what you're proposing (and what item 1 is
proposing).

/ Jonas


Re: [whatwg] LocalStorage in workers

2009-09-16 Thread Jeremy Orlow
On Wed, Sep 16, 2009 at 3:32 PM, Jonas Sicking jo...@sicking.cc wrote:

 On Wed, Sep 16, 2009 at 3:21 PM, Robert O'Callahan rob...@ocallahan.org wrote:
  On Thu, Sep 17, 2009 at 9:56 AM, Jeremy Orlow jor...@chromium.org wrote:
 
  1) Create a LocalStorage like API that can only be accessed in an async
  way via pages (kind of like WebDatabase).
  2) Remove any
  atomicity/consistency guarantees from synchronous LocalStorage access within
  pages (like IE8 currently does) and add an async interface for when pages do
  need atomicity/consistency.
  3) Come up with a completely different storage API that all the browser
  vendors are willing to implement that only allows Async access from within
  pages.  WebSimpleDatabase might be a good starting point for this.
 
  4) Create WorkerStorage so that shared workers have exclusive, synchronous
  access to their own persistent storage via an API compatible with
  LocalStorage.

 I think some of the use cases require that code running in Window
 objects can access the same storage area though. Consider for example
 an email web app that uses a WorkerStorage area to store email
 data locally (for performance and for offline support), and then uses
 a worker to synchronize that with the server.

 Here the code running in the Window wants to access the storage in
 order to render the emails in the page, and the worker wants to access
 it to synchronize with the server.

 See my email earlier in this thread. If we change the name from
 'clientStorage' to 'workerStorage', while still allowing the main
 window to asynchronously get a reference to the storage, then I think
 that about matches what you're proposing (and what item 1 is
 proposing).


Code-wise, what Robert suggested is MUCH simpler.  Almost for free in
WebKit.  Creating an asynchronous access method and exposing this in the
page is much more complex.  It also defeats the main purpose of LocalStorage
(which is to be a simple, lightweight way to store data).

I certainly agree that having some shared memory format between workers and
pages would be good, and there are some use cases which would
certainly benefit, but most of the developers I've talked to so far were
mostly concerned about having _some_ form of storage and the shared memory
aspects were more of a nice-to-have.

J


Re: [whatwg] LocalStorage in workers

2009-09-16 Thread Aaron Boodman
On Wed, Sep 16, 2009 at 3:36 PM, Jeremy Orlow jor...@chromium.org wrote:

 Code-wise, what Robert suggested is MUCH simpler.  Almost for free in
 WebKit.  Creating an asynchronous access method and exposing this in the
 page is much more complex.  It also defeats the main purpose of LocalStorage
 (which is to be a simple, lightweight way to store data).


I do not buy that "creating an asynchronous access method and exposing this
in the page ... defeats the main purpose of LocalStorage (which is to be a
simple, light weight way to store data)".

Having one async callback doesn't make the API hard to use. Callbacks are
easy to work with in JS. Adding one is not the end of the world by a long
shot.
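
For illustration only, here is a minimal sketch of what a single-callback accessor might look like from the page's side. The names `getWorkerStorage`, `SimpleStorage` and `MemoryStorage` are hypothetical (they are not taken from any spec or proposal text), and the storage area is simulated in memory rather than persisted.

```typescript
// Hypothetical sketch: none of these names come from a spec.
interface SimpleStorage {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// In-memory stand-in for a persistent storage area.
class MemoryStorage implements SimpleStorage {
  private items: { [key: string]: string } = {};

  getItem(key: string): string | null {
    const found = Object.prototype.hasOwnProperty.call(this.items, key);
    return found ? this.items[key] : null;
  }

  setItem(key: string, value: string): void {
    this.items[key] = value;
  }
}

const sharedArea = new MemoryStorage();

// The one asynchronous step: the page asks for the storage area and
// receives it later in a callback instead of touching it synchronously.
function getWorkerStorage(callback: (storage: SimpleStorage) => void): void {
  // Assumption: a real implementation would acquire exclusive access
  // to the area here before invoking the callback.
  setTimeout(() => callback(sharedArea), 0);
}

// Inside the callback the API is the familiar synchronous one.
getWorkerStorage((storage) => {
  storage.setItem("draft", "hello");
  console.log(storage.getItem("draft"));
});
```

Everything inside the callback reads like ordinary LocalStorage code; only the acquisition step is asynchronous.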

That said, I suppose it is probably wise to chase down option 3), if people
are motivated, so that we don't end up with *three* name/value storage APIs.

- a


Re: [whatwg] LocalStorage in workers

2009-09-16 Thread Jonas Sicking
On Wed, Sep 16, 2009 at 3:36 PM, Jeremy Orlow jor...@chromium.org wrote:
 On Wed, Sep 16, 2009 at 3:32 PM, Jonas Sicking jo...@sicking.cc wrote:

 On Wed, Sep 16, 2009 at 3:21 PM, Robert O'Callahan rob...@ocallahan.org wrote:
  On Thu, Sep 17, 2009 at 9:56 AM, Jeremy Orlow jor...@chromium.org wrote:
 
  1) Create a LocalStorage like API that can only be accessed in an async
  way via pages (kind of like WebDatabase).
  2) Remove any
  atomicity/consistency guarantees from synchronous LocalStorage access within
  pages (like IE8 currently does) and add an async interface for when pages do
  need atomicity/consistency.
  3) Come up with a completely different storage API that all the browser
  vendors are willing to implement that only allows Async access from within
  pages.  WebSimpleDatabase might be a good starting point for this.
 
  4) Create WorkerStorage so that shared workers have exclusive, synchronous
  access to their own persistent storage via an API compatible with
  LocalStorage.

 I think some of the use cases require that code running in Window
 objects can access the same storage area though. Consider for example
 an email web app that uses a WorkerStorage area to store email
 data locally (for performance and for offline support), and then uses
 a worker to synchronize that with the server.

 Here the code running in the Window wants to access the storage in
 order to render the emails in the page, and the worker wants to access
 it to synchronize with the server.

 See my email earlier in this thread. If we change the name from
 'clientStorage' to 'workerStorage', while still allowing the main
 window to asynchronously get a reference to the storage, then I think
 that about matches what you're proposing (and what item 1 is
 proposing).

 Code-wise, what Robert suggested is MUCH simpler.  Almost for free in
 WebKit.  Creating an asynchronous access method and exposing this in the
 page is much more complex.  It also defeats the main purpose of LocalStorage
 (which is to be a simple, lightweight way to store data).

The only difference between Robert's suggestion and mine is that I'm also
adding an async accessor in the window. That doesn't seem to make his
MUCH simpler, or am I missing something?

I do agree that the optional support for multiple, differently named
storage areas does add complexity, and maybe we should defer that to
something like the WebSimpleStorage spec.

 I certainly agree that having some shared memory format between workers and
 pages would be good, and there are some use cases which would
 certainly benefit, but most of the developers I've talked to so far were
 mostly concerned about having _some_ form of storage and the shared memory
 aspects were more of a nice-to-have.

What would the specifics of a worker-only storage be? Can multiple
different workers access it? (In which case they need to be protected
by a mutex.) Is there one storage per worker URL? Or do all workers
from a particular domain share the same workerStorage?

I'm also wondering what the use cases for a worker-only storage are.

/ Jonas


Re: [whatwg] LocalStorage in workers

2009-09-16 Thread Michael Nordman
On Wed, Sep 16, 2009 at 3:30 PM, Jeremy Orlow jor...@chromium.org wrote:

 On Wed, Sep 16, 2009 at 3:21 PM, Robert O'Callahan rob...@ocallahan.org wrote:

 On Thu, Sep 17, 2009 at 9:56 AM, Jeremy Orlow jor...@chromium.org wrote:

 1) Create a LocalStorage like API that can only be accessed in an async
 way via pages (kind of like WebDatabase).

 2) Remove any
 atomicity/consistency guarantees from synchronous LocalStorage access within
 pages (like IE8 currently does) and add an async interface for when pages do
 need atomicity/consistency.

 3) Come up with a completely different storage API that all the browser
 vendors are willing to implement that only allows Async access from within
 pages.  WebSimpleDatabase might be a good starting point for this.


 4) Create WorkerStorage so that shared workers have exclusive, synchronous
 access to their own persistent storage via an API compatible with
 LocalStorage.


 Ah yes.  That is also an option.

 And, now that I think about it (combined with Jonas' last point) I think it
 might be the best option since it has a very low implementation cost, it
 keeps the very simple API, and solves the primary problem of not blocking
 pages' event loops.


But it fails to solve the problem of providing a shared storage repository
for the application's use, which at least to me is the real primary goal.


Re: [whatwg] LocalStorage in workers

2009-09-16 Thread Jeremy Orlow
On Wed, Sep 16, 2009 at 4:05 PM, Michael Nordman micha...@google.com wrote:



 On Wed, Sep 16, 2009 at 3:30 PM, Jeremy Orlow jor...@chromium.org wrote:

 On Wed, Sep 16, 2009 at 3:21 PM, Robert O'Callahan rob...@ocallahan.org wrote:

 On Thu, Sep 17, 2009 at 9:56 AM, Jeremy Orlow jor...@chromium.org wrote:

 1) Create a LocalStorage like API that can only be accessed in an async
 way via pages (kind of like WebDatabase).

 2) Remove any
 atomicity/consistency guarantees from synchronous LocalStorage access within
 pages (like IE8 currently does) and add an async interface for when pages do
 need atomicity/consistency.

 3) Come up with a completely different storage API that all the browser
 vendors are willing to implement that only allows Async access from within
 pages.  WebSimpleDatabase might be a good starting point for this.


 4) Create WorkerStorage so that shared workers have exclusive,
 synchronous access to their own persistent storage via an API compatible
 with LocalStorage.


 Ah yes.  That is also an option.

 And, now that I think about it (combined with Jonas' last point) I think
 it might be the best option since it has a very low implementation cost, it
 keeps the very simple API, and solves the primary problem of not blocking
 pages' event loops.


 But it fails to solve the problem of providing a shared storage
 repository for the application's use, which at least to me is the real
 primary goal.


Is it?  Can you provide some use cases?  :-)

As I stated, my conversations with developers led me to believe having
access to storage within workers is most important and that having shared
memory between pages and workers would make things easier on some of them.
In other words, from my talks, it's a secondary goal.


On Wed, Sep 16, 2009 at 3:57 PM, Jonas Sicking jo...@sicking.cc wrote:

 On Wed, Sep 16, 2009 at 3:36 PM, Jeremy Orlow jor...@chromium.org wrote:
  Code-wise, what Robert suggested is MUCH simpler.  Almost for free in
  WebKit.  Creating an asynchronous access method and exposing this in the
  page is much more complex.  It also defeats the main purpose of LocalStorage
  (which is to be a simple, lightweight way to store data).

 The only difference between Robert's suggestion and mine is that I'm also
 adding an async accessor in the window. That doesn't seem to make his
 MUCH simpler, or am I missing something?


Well, doing just a sync interface is MUCH simpler, but I suppose there's
no reason not to add both to the spec.  To be clear, though, adding the sync
interface to workers would be a much higher priority for me than the async
interface.  Enough so that there's a chance we'd ship a version of Chrome
that did not yet implement the async interface.  That seems OK to me,
though.


 I do agree that the optional support for multiple, differently named
 storage areas does add complexity, and maybe we should defer that to
 something like the WebSimpleStorage spec.

  I certainly agree that having some shared memory format between workers and
  pages would be good, and there are some use cases which would
  certainly benefit, but most of the developers I've talked to so far were
  mostly concerned about having _some_ form of storage and the shared memory
  aspects were more of a nice-to-have.

 What would the specifics of a worker-only storage be? Can multiple
 different workers access it? (In which case they need to be protected
 by a mutex.) Is there one storage per worker URL? Or do all workers
 from a particular domain share the same workerStorage?

 I'm also wondering what the use cases for a worker-only storage are.


The use cases all revolve around having a backend in a worker that handles
offline and/or caching.  It could feed its data to the page either via
messages or via shared memory.  The former requires at least worker-only
storage and the latter requires storage shared between the worker and the
page.  The latter is technically an optimization, but I agree that it's a
fairly major one.
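
As a rough illustration of the message-based variant, here is a minimal sketch in which the worker is the sole owner of the storage and the page only ever sends it requests. The message shapes (`StoreRequest`, `StoreReply`) and the `createWorkerBackend` helper are hypothetical, and the message channel is simulated by a direct function call; a real page and worker would exchange these objects with postMessage.

```typescript
// Hypothetical message shapes; not taken from any spec.
type StoreRequest =
  | { type: "get"; key: string }
  | { type: "set"; key: string; value: string };

type StoreReply = { key: string; value: string | null };

// Worker side: the store is private to this closure, so only the worker
// touches it and no locking between page and worker is needed.
function createWorkerBackend(): (req: StoreRequest) => StoreReply {
  const store: { [key: string]: string } = {};
  return (req) => {
    if (req.type === "set") {
      store[req.key] = req.value;
    }
    const found = Object.prototype.hasOwnProperty.call(store, req.key);
    return { key: req.key, value: found ? store[req.key] : null };
  };
}

// Page side: "send" stands in for a postMessage round trip.
const send = createWorkerBackend();
send({ type: "set", key: "inbox/1", value: "Hello" });
console.log(send({ type: "get", key: "inbox/1" }).value);
```

In this arrangement the page never sees the storage directly; it pays for a message round trip on each read, which is the cost the shared-storage variant would avoid.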

J


Re: [whatwg] LocalStorage in workers

2009-09-16 Thread Robert O'Callahan
On Thu, Sep 17, 2009 at 11:17 AM, Jeremy Orlow jor...@chromium.org wrote:

 The use cases all revolve around having a backend in a worker that handles
 offline and/or caching.  It could feed its data to the page either via
 messages or via shared memory.  The former requires at least worker-only
 storage and the latter requires storage shared between the worker and the
 page.  The latter is technically an optimization, but I agree that it's a
 fairly major one.


I don't think copying data from a worker to a page through any kind of
database is going to outperform copying JavaScript objects or even
serializing to strings and then deserializing. You don't even necessarily
need to copy all the JS objects passed from one thread to another if you're
willing to do some copy-on-write (COW) or other tricks.

Maybe I'm wrong, but at least it seems a premature optimization to declare
that shared database storage between page and worker is necessary for
performance.

Rob
-- 
He was pierced for our transgressions, he was crushed for our iniquities;
the punishment that brought us peace was upon him, and by his wounds we are
healed. We all, like sheep, have gone astray, each of us has turned to his
own way; and the LORD has laid on him the iniquity of us all. [Isaiah
53:5-6]


Re: [whatwg] LocalStorage in workers

2009-09-16 Thread Michael Nordman
 Is it?  Can you provide some use cases?  :-)

Um...sure... an app sets up a shared worker whose function is to sync
changes to the data the application manages up and down...

* pageA makes a change; pageB sees it by virtue of an event and
reflects the change in its view of the world... the worker sees the change
too, by virtue of the same event, and pushes it up.

* the worker receives a delta from the server... and makes the change
locally... pageA and pageB see that by virtue of the event.
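
The two flows above can be sketched roughly as follows. This is a simulation under stated assumptions, not a browser API: `SharedArea` is a hypothetical in-memory model of a storage area that notifies every subscribed context except the one that made the change, which is how storage events are usually described.

```typescript
// Hypothetical model of a storage area shared by pages and a worker.
type Listener = (key: string, value: string) => void;

class SharedArea {
  private items: { [key: string]: string } = {};
  private listeners: { owner: string; notify: Listener }[] = [];

  // Register a context (page or worker) that wants change events.
  subscribe(owner: string, notify: Listener): void {
    this.listeners.push({ owner, notify });
  }

  get(key: string): string | null {
    const found = Object.prototype.hasOwnProperty.call(this.items, key);
    return found ? this.items[key] : null;
  }

  // Store a value and notify every context except the one that wrote it.
  set(owner: string, key: string, value: string): void {
    this.items[key] = value;
    for (const l of this.listeners) {
      if (l.owner !== owner) l.notify(key, value);
    }
  }
}

const area = new SharedArea();
const pageAView: string[] = [];
const pageBView: string[] = [];
const uploads: string[] = []; // what the worker would push to the server

area.subscribe("pageA", (k, v) => pageAView.push(k + "=" + v));
area.subscribe("pageB", (k, v) => pageBView.push(k + "=" + v));
area.subscribe("worker", (k, v) => uploads.push(k + "=" + v));

// pageA edits: pageB updates its view and the worker queues an upload.
area.set("pageA", "msg1", "read");
// The worker applies a server delta: both pages see it, nothing is re-uploaded.
area.set("worker", "msg2", "new");

console.log(pageAView, pageBView, uploads);
```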


What is the use case for silo'd worker storage?


Re: [whatwg] LocalStorage in workers

2009-09-16 Thread Jeremy Orlow
On Wed, Sep 16, 2009 at 4:47 PM, Michael Nordman micha...@google.com wrote:

  Is it?  Can you provide some use cases?  :-)
 Um...sure... an app sets up a shared worker whose function is to sync
 changes to the data the application manages up and down...

 * pageA makes a change; pageB sees it by virtue of an event and
 reflects the change in its view of the world... the worker sees the change
 too, by virtue of the same event, and pushes it up.

 * the worker receives a delta from the server... and makes the change
 locally... pageA and pageB see that by virtue of the event.


 What is the use case for silo'd worker storage?


I mentioned this earlier and also explained that a work-around is to do this
via message passing rather than shared memory.  As I explained in a couple
of emails, shared memory is just an optimization.  And, as Robert explained,
it's not even clear whether it's a performance optimization or not...it
might just be a simpler way to program.

When I asked if you had any use cases, I was asking whether there were any
use cases that could not be solved efficiently/reasonably elegantly by
worker-only storage.


Re: [whatwg] Surrogate pairs and character references

2009-09-16 Thread Øistein E . Andersen

It is much clearer now.  Thanks.  Just a few minor issues:

 Bytes or sequences of bytes in the original byte stream that could
 not be converted to Unicode characters must be converted to U+FFFD
 REPLACEMENT CHARACTER code points.

With the new definition of "Unicode characters" as Unicode scalar
values, this excludes surrogate code points, which are also handled
separately (and cause a parse error) in the step quoted below.  You
may want to say "Unicode code points" rather than "Unicode characters".

"U+FFFD REPLACEMENT CHARACTERs" is sufficient, is used elsewhere, and
probably reads better than "U+FFFD REPLACEMENT CHARACTER code points".

 All U+0000 NULL characters and code points in the range U+D800 to
 U+DFFF in the input must be replaced by U+FFFD REPLACEMENT
 CHARACTERs. Any occurrences of such characters and code points are
 parse errors.

The phrase "characters and code points" (in the second sentence) is
awkward given that all characters are in fact code points.
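
To make the quoted replacement step concrete, here is a minimal sketch, assuming the input has already been decoded into a UTF-16 string so that a surrogate code point shows up as an unpaired surrogate code unit. The function name `preprocess` and the error counter are hypothetical; the spec text only says that each occurrence is a parse error.

```typescript
const REPLACEMENT = "\uFFFD";

// Replace U+0000 and unpaired surrogates (U+D800 to U+DFFF) with U+FFFD,
// counting one parse error per replacement.
function preprocess(input: string): { output: string; parseErrors: number } {
  let output = "";
  let parseErrors = 0;
  let i = 0;
  while (i < input.length) {
    const unit = input.charCodeAt(i);
    const next = i + 1 < input.length ? input.charCodeAt(i + 1) : -1;
    const isHigh = unit >= 0xd800 && unit <= 0xdbff;
    const isLow = unit >= 0xdc00 && unit <= 0xdfff;
    if (isHigh && next >= 0xdc00 && next <= 0xdfff) {
      // A well-formed pair encodes a scalar value above U+FFFF: keep it.
      output += input.charAt(i) + input.charAt(i + 1);
      i += 2;
    } else if (unit === 0 || isHigh || isLow) {
      // NULL or a lone surrogate: replace and record a parse error.
      output += REPLACEMENT;
      parseErrors += 1;
      i += 1;
    } else {
      output += input.charAt(i);
      i += 1;
    }
  }
  return { output, parseErrors };
}

const result = preprocess("a\u0000b\uD800c");
console.log(result.output === "a\uFFFDb\uFFFDc", result.parseErrors);
```

Note that a well-formed surrogate pair is left alone, since together the two code units represent a single scalar value rather than two surrogate code points.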


--
Øistein E. Andersen