Re: [whatwg] StringEncoding: Allowed encodings for TextEncoder

2012-08-10 Thread Jonas Sicking
On Thu, Aug 9, 2012 at 10:42 AM, Joshua Bell  wrote:
> On Wed, Aug 8, 2012 at 9:03 AM, Joshua Bell  wrote:
>
>>
>>
>> On Wed, Aug 8, 2012 at 2:48 AM, James Graham  wrote:
>>
>>> On 08/07/2012 07:51 PM, Jonas Sicking wrote:
>>>
>>>> I don't mind supporting *decoding* from basically any encoding that
>>>> Anne's spec enumerates. I don't see a downside with that since I
>>>> suspect most implementations will just call into a generic decoding
>>>> backend anyway, and so supporting the same set of encodings as for
>>>> other parts of the platform should be relatively easy.

>>>
>>> [...]
>>>
>>>
>>>> However I think we should consider restricting support to a smaller
>>>> set of encodings while *encoding*. There should be little reason
>>>> for people today to produce text in non-utf formats. We might even be
>>>> able to get away with only supporting UTF8, though I wouldn't be
>>>> surprised if there are reasonably modern file formats which use utf16.

>>>
>>> FWIW, I agree with the decode-from-all-platform-encodings
>>> encode-to-utf[8|16] position.
>>>
>>
>> Any disagreement on limiting the supported encodings to utf-8, utf-16, and
>> utf-16be, while permitting decoding of all encodings in the Encoding spec?
>>
>> (This eliminates the "what to do on encoding error" issue nicely, still
>> need to resolve the BOM issue though.)
>>
>
> http://wiki.whatwg.org/wiki/StringEncoding has been updated to restrict the
> supported encodings for encoding to UTF-8, UTF-16 and UTF-16BE.
>
> I'm tempted to take it further to just UTF-8 and see if anyone complains.
>
> Jury is still out on the decode-with-BOM issue - I need to reason through
> Glenn's suggestions on the "open issues" thread.
>
> I added a related open issue raised by Glenn, summarized as "... suggest
> that the .encoding attribute simply return the name that was passed to
> the constructor." - taking this further, perhaps the attribute should be
> eliminated as callers could apply it themselves.

The spec now contains the following text:

"NOTE: Because only UTF encodings are supported, and because of the
algorithm used to convert a DOMString to a sequence of Unicode
characters, no input can cause the encoding process to emit an encoder
error."

This is not correct. A DOMString is not a sequence of Unicode
characters; it's a UTF-16-encoded string (this is per ECMAScript). Thus
it can contain unpaired surrogates, and so the encoding process can
result in encoder errors.

As I've suggested earlier, I think we should deal with this by simply
emitting Unicode replacement characters for these encoder errors (i.e.
for unpaired surrogates).
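
For illustration, a minimal sketch of the proposed behavior (the
TextEncoder constructor label follows the StringEncoding draft under
discussion, and the byte values assume the replacement-character
approach):

  // Per ECMAScript, a string is a sequence of UTF-16 code units, so an
  // unpaired surrogate such as \uD800 is a perfectly legal element.
  var s = "a\uD800b";
  var bytes = new TextEncoder("utf-8").encode(s);
  // Under this proposal the lone surrogate is encoded as U+FFFD, so
  // the UTF-8 output would be: 61 EF BF BD 62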

/ Jonas


Re: [whatwg] Wasn't there going to be a strict spec?

2012-08-10 Thread Kang-Hao (Kenny) Lu
(12/08/11 8:41), Erik Reppen wrote:
> Thanks Hugh. I had mistakenly been thinking of XHTML5 as something that
> never happened rather than merely HTML5 served as XML which hadn't really
> occurred to me as being a viable option. I look forward to messing with
> this. This is precisely what I wanted to be able to do.

Yep. I would encourage you to play with XHTML5 (application/xhtml+xml)
more and report bugs to browsers. When I still had interest in
application/xhtml+xml (back in 2007?), I got troubled by all the
differences in the DOM APIs. I think currently most JS frameworks
probably don't support XHTML5.
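
For example, here are two DOM API differences you are likely to hit in
application/xhtml+xml documents (a sketch from memory; worth verifying
in each browser):

  // 1. document.write() is unavailable in XML documents; it throws.
  try {
    document.write("<p>hi</p>");
  } catch (e) {
    console.log("document.write throws in XHTML5: " + e.name);
  }

  // 2. innerHTML is parsed as XML, so malformed markup throws instead
  //    of being error-corrected as it would be in text/html.
  try {
    document.body.innerHTML = "<p>unclosed";
  } catch (e) {
    console.log("malformed innerHTML throws in XHTML5: " + e.name);
  }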

After playing with XHTML5, if you still think browsers should implement yet
another mode, you should probably say why XHTML5 is bad and why you
don't just use it.

If you have proposals for how some of the DOM APIs in XHTML5 should
work, you might want to follow the instructions at the top of the relevant
specs (DOM Parsing and Serialization[1], basically) and send feedback.


[1] http://html5.org/specs/dom-parsing.html



Cheers,
Kenny
-- 
Web Specialist, Oupeng Browser, Beijing
Try Oupeng: http://www.oupeng.com/


Re: [whatwg] Wasn't there going to be a strict spec?

2012-08-10 Thread Erik Reppen
Thanks Hugh. I had mistakenly been thinking of XHTML5 as something that
never happened rather than merely HTML5 served as XML which hadn't really
occurred to me as being a viable option. I look forward to messing with
this. This is precisely what I wanted to be able to do.

On Fri, Aug 10, 2012 at 7:28 PM, Hugh Guiney  wrote:

> On Fri, Aug 10, 2012 at 8:06 PM, Erik Reppen  wrote:
> > Sorry if this double-posted but I think I forgot to CC the list.
> >
> > Browser vendor politics I can understand but if we're going to talk about
> > what "history shows" about people like myself suggesting features we can't
> > actually support I'd like to see some studies that contradict the
> > experiences I've had as a web ui developer for the last five years.
> >
> > Everybody seems on board with providing a JavaScript strict mode. How is
> > this any different? Do people blame the vendors when vars they try to
> > define without a var keyword break their strict-mode code? Do we fret about
> > all the js out there that's not written in strict mode?
> >
> > And HTML5 has found the key to eliminating the political issue, I should
> > think. Don't just worry about the rules for when the authors get it right.
> > Explicitly spell out the rules for how to handle it when they get it wrong.
> > How can you blame the browser for strict mode face plants when every modern
> > browser including IE goes about face-planting in exactly the same way?
> >
> > Sure, I could integrate in-editor validation into my process, but why add
> > bloat to any number of tools I might be using for any number of
> > different stacks when we had something I know worked for a lot of
> > developers who were all as confused as I was when people inexplicably
> > started shouting about XHTML strict's "failure" from the rooftops?
> >
> > Is there some unspoken concern here? If there is, I'll shut up and try to
> > find out what it is through other means but I really don't see the logic in
> > not having some strict provision for authors who want it. How hard is it to
> > plug in an XML validator and rip out the namespace bits if that's not
> > something we want to deal with just yet and propose a set of behaviors for
> > when your HTML5 isn't compliant with a stricter syntax?
> >
> > Because yes, these bugs can be kinda nasty when you don't think to check to
> > make sure your HTML is well-formed and it's the kind of stuff that can
> > easily slide into production as difficult-to-diagnose edge-cases. Believe
> > me. Front-liner here. It's an issue. Markup is where presentation,
> > behavior, content, client-side, and server-side meet. I'm comfortable with
> > letting people embrace their own philosophies but I like my markup to be
> > done right in the first place and visible breakage or at least browser
> > console error messages is the easiest and most obvious way to discover that
> > it isn't. And I developed that philosophy from my experience moving from
> > less strict to strict markup, not just toeing some weird technorati
> > political line or zeitgeist.
>
> There's nothing stopping you from writing XHTML5. This is what I do
> for exactly the reasons you describe. If you write polyglot documents,
> you can use application/xhtml+xml in development, serve text/html in
> production and still sleep at night. Hell, these days you can even
> serve application/xhtml+xml in production if IE < 9 isn't a
> significant market segment for you.
>


Re: [whatwg] Wasn't there going to be a strict spec?

2012-08-10 Thread Hugh Guiney
On Fri, Aug 10, 2012 at 8:06 PM, Erik Reppen  wrote:
> Sorry if this double-posted but I think I forgot to CC the list.
>
> Browser vendor politics I can understand but if we're going to talk about
> what "history shows" about people like myself suggesting features we can't
> actually support I'd like to see some studies that contradict the
> experiences I've had as a web ui developer for the last five years.
>
> Everybody seems on board with providing a JavaScript strict mode. How is
> this any different? Do people blame the vendors when vars they try to
> define without a var keyword break their strict-mode code? Do we fret about
> all the js out there that's not written in strict mode?
>
> And HTML5 has found the key to eliminating the political issue, I should
> think. Don't just worry about the rules for when the authors get it right.
> Explicitly spell out the rules for how to handle it when they get it wrong.
> How can you blame the browser for strict mode face plants when every modern
> browser including IE goes about face-planting in exactly the same way?
>
> Sure, I could integrate in-editor validation into my process, but why add
> bloat to any number of tools I might be using for any number of
> different stacks when we had something I know worked for a lot of
> developers who were all as confused as I was when people inexplicably
> started shouting about XHTML strict's "failure" from the rooftops?
>
> Is there some unspoken concern here? If there is, I'll shut up and try to
> find out what it is through other means but I really don't see the logic in
> not having some strict provision for authors who want it. How hard is it to
> plug in an XML validator and rip out the namespace bits if that's not
> something we want to deal with just yet and propose a set of behaviors for
> when your HTML5 isn't compliant with a stricter syntax?
>
> Because yes, these bugs can be kinda nasty when you don't think to check to
> make sure your HTML is well-formed and it's the kind of stuff that can
> easily slide into production as difficult-to-diagnose edge-cases. Believe
> me. Front-liner here. It's an issue. Markup is where presentation,
> behavior, content, client-side, and server-side meet. I'm comfortable with
> letting people embrace their own philosophies but I like my markup to be
> done right in the first place and visible breakage or at least browser
> console error messages is the easiest and most obvious way to discover that
> it isn't. And I developed that philosophy from my experience moving from
> less strict to strict markup, not just toeing some weird technorati
> political line or zeitgeist.

There's nothing stopping you from writing XHTML5. This is what I do
for exactly the reasons you describe. If you write polyglot documents,
you can use application/xhtml+xml in development, serve text/html in
production and still sleep at night. Hell, these days you can even
serve application/xhtml+xml in production if IE < 9 isn't a
significant market segment for you.
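
For concreteness, a minimal sketch of that setup (Node.js assumed; the
file name and the environment check are placeholders):

  var http = require("http");
  var fs = require("fs");

  var DEV = process.env.NODE_ENV !== "production";

  http.createServer(function (req, res) {
    // Same polyglot markup either way; only the MIME type changes.
    res.setHeader("Content-Type", DEV
      ? "application/xhtml+xml; charset=utf-8"
      : "text/html; charset=utf-8");
    fs.createReadStream("index.html").pipe(res);
  }).listen(8080);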


Re: [whatwg] Wasn't there going to be a strict spec?

2012-08-10 Thread David Bruant

On 10/08/2012 20:06, Erik Reppen wrote:

> Sorry if this double-posted but I think I forgot to CC the list.
>
> Browser vendor politics I can understand but if we're going to talk about
> what "history shows" about people like myself suggesting features we can't
> actually support I'd like to see some studies that contradict the
> experiences I've had as a web ui developer for the last five years.
>
> Everybody seems on board with providing a JavaScript strict mode. How is
> this any different?
JavaScript strict mode makes it possible to write a secure JavaScript
program (there is some additional work to do, but you can do it yourself
as a programmer), while it's almost impossible to write a secure program
in non-strict JavaScript because of scope-violating (indirect) eval.
JavaScript strict mode is almost a different language in that regard.
The ability to write securable JavaScript required an intervention at
the language level.
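
A small sketch of the eval behavior in question (standard ECMAScript
semantics):

  function sloppy() {
    eval("var leaked = 1;"); // sloppy-mode direct eval can introduce
    return typeof leaked;    // bindings into this scope: "number"
  }

  function strict() {
    "use strict";
    eval("var leaked = 1;"); // strict-mode eval gets its own scope,
    return typeof leaked;    // so this is "undefined" (nothing leaks)
  }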


HTML has nothing comparable to gain from a strict mode, as far as I know.
Also, JS strict mode deals with runtime behavior and not syntax (the with
statement aside). That's far different from what could be expected from an
HTML strict mode.


To some extent, CSP ("Content Security Policy", about to reach
Recommendation stage) is your "HTML strict mode" if you care about
security.


David


Re: [whatwg] Wasn't there going to be a strict spec?

2012-08-10 Thread Tab Atkins Jr.
On Fri, Aug 10, 2012 at 5:02 PM, Erik Reppen  wrote:
> Browser vendor politics I can understand but if we're going to talk about
> what "history shows" about people like myself suggesting features we can't
> actually support I'd like to see some studies that contradict the
> experiences I've had as a web ui developer for the last five years.

I don't wish to spend the time to dig up the studies showing that 95%
or so of XML served as text/html is invalid XML, but others probably
have the links easily at hand.

> Everybody seems on board with providing a JavaScript strict mode. How is
> this any different? Do people blame the vendors when vars they try to define
> without a var keyword break their strict-mode code? Do we fret about all the
> js out there that's not written in strict mode?

JS is a substantially different story, because you virtually never
display a page which runs Javascript code from many arbitrary users of
the website.  As well, JS is commonly written only by the more
advanced maintainers of a given website.  Finally, JS errors are
localized to the scripts in which they are located.

This is substantially different from HTML, which is commonly composed
of lots of user comments which are barely, if at all, escaped for
security.  As well, HTML is commonly written by a number of users of
varying skill levels in a given site; in my previous experience as a
web dev, for example, people from the advertising department often
wrote snippets of HTML into pages, though their skills were far below
mine.  Finally, HTML doesn't have the same localizing barriers as JS
does with <script> elements.

Re: [whatwg] Wasn't there going to be a strict spec?

2012-08-10 Thread Erik Reppen
Sorry if this double-posted but I think I forgot to CC the list.

Browser vendor politics I can understand but if we're going to talk about
what "history shows" about people like myself suggesting features we can't
actually support I'd like to see some studies that contradict the
experiences I've had as a web ui developer for the last five years.

Everybody seems on board with providing a JavaScript strict mode. How is
this any different? Do people blame the vendors when vars they try to
define without a var keyword break their strict-mode code? Do we fret about
all the js out there that's not written in strict mode?
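
For reference, the strict-mode behavior being alluded to, in a two-line
sketch:

  "use strict";
  x = 1; // ReferenceError: assignment to an undeclared variable throws
         // here, where sloppy mode would silently create a global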

And HTML5 has found the key to eliminating the political issue, I should
think. Don't just worry about the rules for when the authors get it right.
Explicitly spell out the rules for how to handle it when they get it wrong.
How can you blame the browser for strict mode face plants when every modern
browser including IE goes about face-planting in exactly the same way?

Sure, I could integrate in-editor validation into my process, but why add
bloat to any number of tools I might be using for any number of
different stacks when we had something I know worked for a lot of
developers who were all as confused as I was when people inexplicably
started shouting about XHTML strict's "failure" from the rooftops?

Is there some unspoken concern here? If there is, I'll shut up and try to
find out what it is through other means but I really don't see the logic in
not having some strict provision for authors who want it. How hard is it to
plug in an XML validator and rip out the namespace bits if that's not
something we want to deal with just yet and propose a set of behaviors for
when your HTML5 isn't compliant with a stricter syntax?

Because yes, these bugs can be kinda nasty when you don't think to check to
make sure your HTML is well-formed and it's the kind of stuff that can
easily slide into production as difficult-to-diagnose edge-cases. Believe
me. Front-liner here. It's an issue. Markup is where presentation,
behavior, content, client-side, and server-side meet. I'm comfortable with
letting people embrace their own philosophies but I like my markup to be
done right in the first place and visible breakage or at least browser
console error messages is the easiest and most obvious way to discover that
it isn't. And I developed that philosophy from my experience moving from
less strict to strict markup, not just toeing some weird technorati
political line or zeitgeist.

On Fri, Aug 10, 2012 at 5:44 PM, Tab Atkins Jr. wrote:

> On Fri, Aug 10, 2012 at 3:29 PM, Erik Reppen  wrote:
> > This confuses me. Why does it matter that other documents wouldn't work if
> > you changed the parsing rules they were defined with to stricter versions?
> > As far as backwards compatibility, if a strict-defined set of HTML would
> > also work in a less strict context, what could it possibly matter? It's only
> > the author's problem to maintain (or switch to a more forgiving mode) and
> > backwards compatibility isn't broken if the same client 500 years from now
> > uses the same general HTML mode for both.
> >
> > I think there's a legit need for a version or some kind of mode for HTML5
> > that assumes you're a pro and breaks visibly or throws an error when you've
> > done something wrong. Back in the day nobody ever forced authors who didn't
> > know what they're doing to use doctypes they were too sloppy to handle. I
> > wasn't aware of any plan to discontinue non-XHTML doctypes. How everybody
> > started thinking of it as a battle for one doctype to rule them all makes no
> > sense to me but I'm fine with one doctype. I just want something that works
> > in regular HTML5 but that will break in some kind of a strict mode when
> > XML-formatting rules aren't adhered to. You pick degrees of strictness based
> > on what works for you. I don't really see a dealbreaking issue here. Why
> > can't we all have it the way we want it?
> >
> > As somebody who deals with some pretty complex UI where the HTML and CSS are
> > concerned it's a problem when things in the rendering context give no
> > indication of breakage, while in the DOM they are in fact getting tripped
> > up. Sure, I can validate and swap out doctypes or just keep running stuff in
> > IE8 to see if it breaks until I actually start using HTML5-only tags but
> > this is kind of awkward and suggests something forward-thinking design could
> > address don't you think?
>
> As I said, years of experience have provided strong evidence that a
> large majority of authors cannot guarantee that their pages are valid
> all of the time.  This covers both authoring-time validity and
> validity after including user comments or the like.
>
> If you want a mode that guarantees validity, that already exists -
> it's called "put a validator into your workflow".  Many popular text
> editors offer plugins that validate your markup as you go, as well.

Re: [whatwg] Wasn't there going to be a strict spec?

2012-08-10 Thread Tab Atkins Jr.
On Fri, Aug 10, 2012 at 3:29 PM, Erik Reppen  wrote:
> This confuses me. Why does it matter that other documents wouldn't work if
> you changed the parsing rules they were defined with to stricter versions?
> As far as backwards compatibility, if a strict-defined set of HTML would
> also work in a less strict context, what could it possibly matter? It's only
> the author's problem to maintain (or switch to a more forgiving mode) and
> backwards compatibility isn't broken if the same client 500 years from now
> uses the same general HTML mode for both.
>
> I think there's a legit need for a version or some kind of mode for HTML5
> that assumes you're a pro and breaks visibly or throws an error when you've
> done something wrong. Back in the day nobody ever forced authors who didn't
> know what they're doing to use doctypes they were too sloppy to handle. I
> wasn't aware of any plan to discontinue non-XHTML doctypes. How everybody
> started thinking of it as a battle for one doctype to rule them all makes no
> sense to me but I'm fine with one doctype. I just want something that works
> in regular HTML5 but that will break in some kind of a strict mode when
> XML-formatting rules aren't adhered to. You pick degrees of strictness based
> on what works for you. I don't really see a dealbreaking issue here. Why
> can't we all have it the way we want it?
>
> As somebody who deals with some pretty complex UI where the HTML and CSS are
> concerned it's a problem when things in the rendering context give no
> indication of breakage, while in the DOM they are in fact getting tripped
> up. Sure, I can validate and swap out doctypes or just keep running stuff in
> IE8 to see if it breaks until I actually start using HTML5-only tags but
> this is kind of awkward and suggests something forward-thinking design could
> address don't you think?

As I said, years of experience have provided strong evidence that a
large majority of authors cannot guarantee that their pages are valid
all of the time.  This covers both authoring-time validity and
validity after including user comments or the like.

If you want a mode that guarantees validity, that already exists -
it's called "put a validator into your workflow".  Many popular text
editors offer plugins that validate your markup as you go, as well.
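
As a sketch of that workflow (the validator.nu JSON endpoint and its
message format are assumptions worth checking against its
documentation):

  // Hypothetical build step: POST the page to a checker and fail the
  // build if any messages of type "error" come back.
  var fs = require("fs");

  fetch("https://validator.nu/?out=json", {
    method: "POST",
    headers: { "Content-Type": "text/html; charset=utf-8" },
    body: fs.readFileSync("index.html"),
  })
    .then(function (r) { return r.json(); })
    .then(function (result) {
      var errors = result.messages.filter(function (m) {
        return m.type === "error";
      });
      errors.forEach(function (e) { console.error(e.message); });
      process.exitCode = errors.length ? 1 : 0;
    });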

The problem with breaking visibly is that it doesn't punish authors,
it punishes *users*, who overwhelmingly blame the browser rather than
the site author when the site won't display for whatever reason.
There's no *benefit* to a browser for doing this; it's much more in
their interest to continue doing error-recovery, because, again,
history suggests very strongly that most authors *who theoretically
want strict parsing* can't actually satisfy the constraints they ask
for.  It's simply better for users to always do soft error-recovery,
no matter what the author claims they want.

~TJ


Re: [whatwg] Wasn't there going to be a strict spec?

2012-08-10 Thread Erik Reppen
This confuses me. Why does it matter that other documents wouldn't work if
you changed the parsing rules they were defined with to stricter versions?
As far as backwards compatibility, if a strict-defined set of HTML would
also work in a less strict context, what could it possibly matter? It's
only the author's problem to maintain (or switch to a more forgiving mode)
and backwards compatibility isn't broken if the same client 500 years from
now uses the same general HTML mode for both.

I think there's a legit need for a version or some kind of mode for HTML5
that assumes you're a pro and breaks visibly or throws an error when you've
done something wrong. Back in the day nobody ever forced authors who didn't
know what they're doing to use doctypes they were too sloppy to handle. I
wasn't aware of any plan to discontinue non-XHTML doctypes. How everybody
started thinking of it as a battle for one doctype to rule them all makes
no sense to me but I'm fine with one doctype. I just want something that
works in regular HTML5 but that will break in some kind of a strict mode
when XML-formatting rules aren't adhered to. You pick degrees of strictness
based on what works for you. I don't really see a dealbreaking issue here.
Why can't we all have it the way we want it?

As somebody who deals with some pretty complex UI where the HTML and CSS
are concerned it's a problem when things in the rendering context give no
indication of breakage, while in the DOM they are in fact getting tripped
up. Sure, I can validate and swap out doctypes or just keep running stuff
in IE8 to see if it breaks until I actually start using HTML5-only tags but
this is kind of awkward and suggests something forward-thinking design
could address don't you think?

On Fri, Aug 10, 2012 at 3:05 PM, Tab Atkins Jr. wrote:

> On Fri, Aug 10, 2012 at 12:45 PM, Erik Reppen  wrote:
> > My understanding of the general philosophy of HTML5 on the matter of
> > malformed HTML is that it's better to define specific rules concerning
> > breakage rather than overly strict rules about how to do it right in the
> > first place but this is really starting to create pain-points in
> > development.
> >
> > Modern browsers are so good at hiding breakage in rendering now that I
> > sometimes run into things that are just nuking the DOM-node structure on
> > the JS-side of things while everything looks hunky-dory in rendering and
> > no errors are being thrown.
> >
> > It's like the HTML equivalent of wrapping every function in an empty
> > try/catch statement. For the last year or so I've started using IE8 as my
> > HTML canary when I run into weird problems and I'm not the only dev I've
> > heard of doing this. But what happens when we're no longer supporting IE8
> > and using tags that it doesn't recognize?
> >
> > Why can't we set stricter rules that cause rendering to cease or at least a
> > non-interpreter-halting error to be thrown by browsers when the HTML is
> > broken from a nesting/XML-strict-tag-closing perspective if we want? Until
> > most of the vendors started lumping XHTML Strict 1.0 into a general
> > "standards" mode that basically worked the same for any declared doctype, I
> > thought it was an excellent feature from a development perspective to just
> > let bad XML syntax break the page.
> >
> > And if we were able to set such rules, wouldn't it be less work to parse?
> > How difficult would it be to add some sort of opt-in strict mode for HTML5
> > that didn't require juggling of doctypes (since that seems to be what the
> > vendors want)?
>
> The parsing rules of HTML aren't set to accommodate old browsers,
> they're set to accommodate old content (which was written for those
> old browsers).  There is an *enormous* corpus of content on the web
> which is officially "invalid" according to various strict definitions,
> and would thus not be displayable in your browser.
>
> As well, experience shows that this isn't an accident, or just due to
> "bad authors".  If you analyze XML sent as text/html on the web,
> something like 95% of it is invalid XML, for lots of different
> reasons.  Even when authors *know* they're using something that's
> supposed to be strict, they screw it up.  Luckily, we ignore the fact
> that it's XML and use good parsing rules to usually extract what the
> author meant.
>
> There are several efforts ongoing to extend this kind of non-strict
> parsing to XML itself, such as the XML-ER (error recovery) Community
> Group in the W3C.  XML failed on the web in part because of its
> strictness - it's very non-trivial to ensure that your page is always
> valid when you're lumping in arbitrary user content as well.
>
> Simplifying the parser to be stricter would not have any significant
> impact on performance.  The vast majority of pages usually pass down
> the fast common path anyway, and most of the "fixes" are very simple
> and fast to apply as well.  Additionally, doing something naive like
> saying "just use strict XML parsing" is actually *worse* - XML all by
> itself is relatively simple, but the addition of namespaces actually
> makes it *slower* to parse than HTML.

Re: [whatwg] Real-time thread support for workers

2012-08-10 Thread Glenn Maynard
On Thu, Aug 9, 2012 at 1:20 AM, Jussi Kalliokoski
<jussi.kallioko...@gmail.com> wrote:

> On W3C AudioWG we're currently discussing the possibility of having web
> workers that run in a priority/RT thread. This would be highly useful for
> example to keep audio from glitching even under high CPU stress.
>

Realtime work is hard in a nondeterministically GC'd environment.

Be careful about a flag that says "run this thread at higher priority".
People will simply always set it; it makes their code run faster (at the
expense of other pages' workers, who they don't care about).  Once people
start doing that, everyone has to do it.  Limiting this to audio threads
probably won't help--people will spin up fake audio threads in order to get
higher priority for other work.  Limiting the amount of work that can
actually be done would probably help this.  For example, although you may
be giving the thread a timeslice every 10ms, the thread may only have 3ms
to do its work and return before being preempted.  Audio output threads
need to run regularly, but most don't actually do a whole lot of work.

Also, note that actual realtime threads in many OSs (including, last I
knew, both Windows and Linux) have the capacity to take all CPU and hang
the system.  I suspect implementations would play it safe and go no higher
than "high priority", though you could probably do this safely with careful
CPU quotas.

(These are all implementation details, of course.)

Realtime processing is tricky natively; trying to do this in JS on the web
is probably a hard problem.

-- 
Glenn Maynard


Re: [whatwg] Wasn't there going to be a strict spec?

2012-08-10 Thread Tab Atkins Jr.
On Fri, Aug 10, 2012 at 12:45 PM, Erik Reppen  wrote:
> My understanding of the general philosophy of HTML5 on the matter of
> malformed HTML is that it's better to define specific rules concerning
> breakage rather than overly strict rules about how to do it right in the
> first place but this is really starting to create pain-points in
> development.
>
> Modern browsers are so good at hiding breakage in rendering now that I
> sometimes run into things that are just nuking the DOM-node structure on
> the JS-side of things while everything looks hunky-dory in rendering and
> no errors are being thrown.
>
> It's like the HTML equivalent of wrapping every function in an empty
> try/catch statement. For the last year or so I've started using IE8 as my
> HTML canary when I run into weird problems and I'm not the only dev I've
> heard of doing this. But what happens when we're no longer supporting IE8
> and using tags that it doesn't recognize?
>
> Why can't we set stricter rules that cause rendering to cease or at least a
> non-interpreter-halting error to be thrown by browsers when the HTML is
> broken from a nesting/XML-strict-tag-closing perspective if we want? Until
> most of the vendors started lumping XHTML Strict 1.0 into a general
> "standards" mode that basically worked the same for any declared doctype, I
> thought it was an excellent feature from a development perspective to just
> let bad XML syntax break the page.
>
> And if we were able to set such rules, wouldn't it be less work to parse?
> How difficult would it be to add some sort of opt-in strict mode for HTML5
> that didn't require juggling of doctypes (since that seems to be what the
> vendors want)?

The parsing rules of HTML aren't set to accommodate old browsers,
they're set to accommodate old content (which was written for those
old browsers).  There is an *enormous* corpus of content on the web
which is officially "invalid" according to various strict definitions,
and would thus not be displayable in your browser.

As well, experience shows that this isn't an accident, or just due to
"bad authors".  If you analyze XML sent as text/html on the web,
something like 95% of it is invalid XML, for lots of different
reasons.  Even when authors *know* they're using something that's
supposed to be strict, they screw it up.  Luckily, we ignore the fact
that it's XML and use good parsing rules to usually extract what the
author meant.

There are several efforts ongoing to extend this kind of non-strict
parsing to XML itself, such as the XML-ER (error recovery) Community
Group in the W3C.  XML failed on the web in part because of its
strictness - it's very non-trivial to ensure that your page is always
valid when you're lumping in arbitrary user content as well.
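
That strictness is easy to observe from script (a small sketch):

  // XML parsing doesn't error-correct; DOMParser signals failure by
  // putting a parsererror element into the returned document.
  var doc = new DOMParser()
    .parseFromString("<p>unclosed", "application/xml");
  console.log(doc.getElementsByTagName("parsererror").length > 0); // true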

Simplifying the parser to be stricter would not have any significant
impact on performance.  The vast majority of pages usually pass down
the fast common path anyway, and most of the "fixes" are very simple
and fast to apply as well.  Additionally, doing something naive like
saying "just use strict XML parsing" is actually *worse* - XML all by
itself is relatively simple, but the addition of namespaces actually
makes it *slower* to parse than HTML.

~TJ


[whatwg] Wasn't there going to be a strict spec?

2012-08-10 Thread Erik Reppen
My understanding of the general philosophy of HTML5 on the matter of
malformed HTML is that it's better to define specific rules concerning
breakage rather than overly strict rules about how to do it right in the
first place but this is really starting to create pain-points in
development.

Modern browsers are so good at hiding breakage in rendering now that I
sometimes run into things that are just nuking the DOM-node structure on
the JS-side of things while everything looks hunky-dory in rendering and
no errors are being thrown.

It's like the HTML equivalent of wrapping every function in an empty
try/catch statement. For the last year or so I've started using IE8 as my
HTML canary when I run into weird problems and I'm not the only dev I've
heard of doing this. But what happens when we're no longer supporting IE8
and using tags that it doesn't recognize?

Why can't we set stricter rules that cause rendering to cease or at least a
non-interpreter-halting error to be thrown by browsers when the HTML is
broken from a nesting/XML-strict-tag-closing perspective if we want? Until
most of the vendors started lumping XHTML Strict 1.0 into a general
"standards" mode that basically worked the same for any declared doctype, I
thought it was an excellent feature from a development perspective to just
let bad XML syntax break the page.

And if we were able to set such rules, wouldn't it be less work to parse?
How difficult would it be to add some sort of opt-in strict mode for HTML5
that didn't require juggling of doctypes (since that seems to be what the
vendors want)?


Re: [whatwg] Was it considered to use JSON-LD instead of creating application/microdata+json?

2012-08-10 Thread Ian Hickson
On Fri, 10 Aug 2012, Markus Lanthaler wrote:
> On Thursday, August 09, 2012 4:53 PM, Ian Hickson wrote:
> > > > 
> > > > The only reason there's a MIME type at all (rather than just using 
> > > > JSON's directly) was to enable filtering of copy-and-paste and 
> > > > drag-and-drop payloads; would JSON-LD enable that also?
> > >
> > > Sure, I see no reason why not.
> > 
> > Could you give an example of how? I don't understand how it would work 
> > if we re-use an existing MIME type. If you have any concrete 
> > examples I could look at that would be ideal.
> 
> Maybe I'm missing something, but what would be the difference if we 
> re-used an existing MIME type?

There'd be no way to distinguish a microdata drag-and-drop payload from 
any other JSON-based (or in the case of what you're proposing, 
JSON-LD-based) payload in the dropzone="" filtering.
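
A sketch of the filtering this enables on the script side (the handler
and element names are hypothetical):

  // Accept only microdata payloads; any other JSON-bearing drag is
  // ignored because its type string won't match.
  dropTarget.addEventListener("drop", function (event) {
    var data = event.dataTransfer.getData("application/microdata+json");
    if (!data) return;
    event.preventDefault();
    // The microdata JSON format puts extracted items in a top-level
    // "items" array.
    var items = JSON.parse(data).items;
    // ... consume items ...
  });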


> Looking at the drag and drop API, the only thing that would need to be 
> changed is the "drag data item type string" from 
> "application/microdata+json" to "application/ld+json" in [1].

Then there'd be no way to determine if the payload was generated by the 
microdata extractor or not.


> The advantage in doing so would be that a drop handler could use the 
> JSON-LD API to reframe the data so that it can be used more easily.

What JSON-LD API? I'm not aware of any browsers that have such a thing. 
And why would we want to require that authors use yet another API instead 
of just using straight JavaScript, as you can with JSON?


> > > > That seems like it is strictly more complicated (trivially so, but 
> > > > still). What is the advantage?
> > >
> > > Well, I would say there are several advantages. First of all, 
> > > JSON-LD is more flexible and expressive.
> > 
> > More flexible and expressive than what?
> 
> Than application/microdata+json.

I don't understand what you mean by "flexible and expressive". Could you 
give an example of how JSON-LD is more flexible than JSON? I'm really 
confused as to what you're saying here.


> JSON-LD could also be used to extract RDFa (lossless).

That doesn't seem like a benefit.

(Note that microdata and RDF have different data models and cannot be 
directly mapped from one to the other. It is highly unlikely that any 
other format can actually represent both of them without either some sort 
of data loss or a dramatically more complicated data model than microdata, 
both of which would be bad.)


> > > It has support for string internationalization, data typing, lists 
> > > etc.
> > 
> > How would this manifest itself in this context? Are you suggesting 
> > that we should change the microdata to JSON serialisation rules 
> > somehow?
> 
> Since microdata doesn't support that, it isn't really needed in that 
> context. But it could harmonize the result with a lossless extraction of 
> RDFa for example or come very handy when interacting with Web services 
> exposing JSON-LD.

Could you give a concrete example of a problem this solves? I'm finding it 
difficult to understand what you are proposing.


> > > It also allows to distinguish between IRIs and literals (which isn't 
> > > the case for application/microdata+json), which is important for 
> > > Linked Data applications.
> > 
> > Could you give an example of how this would help an application?
> 
> You could imagine an application that manages books and their authors. 
> If the author is specified in the form of an IRI, the application could 
> render the information in the form of a hyperlink or go even a step 
> further and try to automatically fetch more information about that 
> author.

That sounds like the pie-in-the-sky reasoning that underlies most RDF 
arguments. :-) Could you point to a concrete example of an actual 
application that would benefit from having a single field have multiple 
types?

In the case of the example you give, I think applications would in general 
benefit far more (in terms of ease of implementation and maintenance) from 
just having one field that describes the author in terms of the author's 
name, and one field that gives an identifier that can be used to look up 
the author in the database, rather than having a single field that can do 
one or the other but not both, or that can do both but is sometimes a 
multivalued array and sometimes just one value and you have to introspect 
each value to work out what each entry is.


> > It would help if you described what precise changes you would like to 
> > see to the algorithms, so that I better understood the implications 
> > here.
> 
> The changes are trivial. In the drag and drop API algorithms all that 
> have to be changed is the MIME type. In the microdata API [2] the 
> changes would be something like this: [...]

I see no value in doing these changes. They just make the format more ugly 
with more punctuation without adding any new features, as far as I can tell.


> > > Secondly, there is an API for JSON-LD to reframe [1] a document into 
> > > a shape that might be easier to work with in a web app (I think that's 
> > > the whole point of microdata+json or am I wrong?).

Re: [whatwg] Was it considered to use JSON-LD instead of creating application/microdata+json?

2012-08-10 Thread Markus Lanthaler
On Thursday, August 09, 2012 4:53 PM, Ian Hickson wrote:

> > > The only reason there's a MIME type at all (rather than just using
> > > JSON's directly) was to enable filtering of copy-and-paste and
> > > drag-and-drop payloads; would JSON-LD enable that also?
> >
> > Sure, I see no reason why not.
> 
> Could you give an example of how? I don't understand how it would work
> if we re-use an existing MIME type. If you have any concrete examples
> I could look at that would be ideal.

Maybe I'm missing something, but what would be the difference if we re-used
an existing MIME type?

Looking at the drag and drop API, the only thing that would need to be
changed is the "drag data item type string" from
"application/microdata+json" to "application/ld+json" in [1]. The advantage
in doing so would be that a drop handler could use the JSON-LD API to
reframe the data so that it can be used more easily.


> > > That seems like it is strictly more complicated (trivially so, but
> > > still). What is the advantage?
> >
> > Well, I would say there are several advantages. First of all, JSON-LD is
> > more flexible and expressive.
> 
> More flexible and expressive than what?

Than application/microdata+json. JSON-LD could also be used to extract RDFa
(lossless).


> > It has support for string internationalization, data typing, lists etc.
> 
> How would this manifest itself in this context? Are you suggesting that we
> should change the microdata to JSON serialisation rules somehow?

Since microdata doesn't support that, it isn't really needed in that
context. But it could harmonize the result with a lossless extraction of
RDFa for example or come very handy when interacting with Web services
exposing JSON-LD.


> > It also allows to distinguish between IRIs and literals (which isn't the
> > case for application/microdata+json), which is important for Linked Data
> > applications.
> 
> Could you give an example of how this would help an application?

You could imagine an application that manages books and their authors. If
the author is specified in the form of an IRI, the application could render
the information in the form of a hyperlink or go even a step further and try
to automatically fetch more information about that author.


> It would help if you described what precise changes you would like to see
> to the algorithms, so that I better understood the implications here.

The changes are trivial. In the drag and drop API algorithms all that has
to be changed is the MIME type. In the microdata API [2] the changes would
be something like this:

.. 4. Add an entry to result called "items" ...

++ 5. Add an entry to result called "@context" whose value is the following
object
  {  "@vocab": "" }

.. 6. Return the result of serializing result to JSON ...


If you don't like to use "@id", "@type", and "@graph" instead of "id",
"type", and "items", add a step after step 4 of the current algorithm:

.. 4. Add an entry to result called "items" ...

++ 5. Add an entry to result called "@context" whose value is the following
object
  {
"id": "@id",
"type": "@type",
"items": "@graph"
  }

.. 6. Return the result of serializing result to JSON ...



If the @-keywords are fine, you don't have to add a context; instead, the
following steps have to be changed in the algorithm:

-- 3. If the item has any item types, add an entry to result called "type"
...
++ 3. If the item has any item types, add an entry to result called "@type"
...

-- 4. If the item has a global identifier, add an entry to result called
"id" ...
++ 4. If the item has a global identifier, add an entry to result called
"@id" ...


In both cases you would have to drop step 7

-- 7. Add an entry to result called "properties" whose value is the object
properties.

and change steps 6.3.1 and 6.3.2 to use "result" directly

-- 1. If there is no entry named name in properties, then add an
--entry named name to properties whose value is an empty array.
-- 2. Append value to the entry named name in properties.

++ 1. If there is no entry named name in result, then add an
++entry named name to result whose value is an empty array.
++ 2. Append value to the entry named name in result.
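
Putting those pieces together, a hedged sketch of what the serialization
would then produce for a single item (the book values here are made up):

  {
    "@context": { "id": "@id", "type": "@type", "items": "@graph" },
    "items": [
      {
        "type": ["http://schema.org/Book"],
        "id": "urn:isbn:0-330-34032-8",
        "name": ["The Time Machine"]
      }
    ]
  }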


> > Secondly, there is an API for JSON-LD to reframe [1] a document into
> a
> > shape that might be easier to work with in a web app (I think that's
> the
> > whole point of microdata+json or am I wrong?).
> 
> I don't understand what this means.

Well, for example you could transform a list of books and chapters of those
books to a nested structure with the books at the top level and the chapters
as children. Have a look at the example in the JSON-LD playground [3] (click
on Framing Examples: Library at the top right).


> > Other API calls allow e.g. to convert to and from RDF [2]. If you are
> > interested, there is an online JSON-LD playground [3] where you can play
> > with the various API calls. Last but not least it would also make web
> > developers

Re: [whatwg] Features for responsive Web design

2012-08-10 Thread Andy Davies
On 9 August 2012 17:01, Tab Atkins Jr.  wrote:
> On Thu, Aug 9, 2012 at 1:16 AM, Andy Davies  wrote:
>> Would also like to see if there's a way of using srcset to hint to the UA
>> that it can skip the image under low throughput conditions e.g. GPRS.
>> Same would apply to image-set in CSS
>
> The image-set() function already includes this functionality.  You can
> include a fallback color, and if the UA decides it doesn't want to
> download *any* of the images, it can just create a solid-color image
> and use that instead.
>

Thanks, I see it now - must have missed it when I first scanned the
CSS4-images spec.

Andy


Re: [whatwg] Features for responsive Web design

2012-08-10 Thread Odin Hørthe Omdal
On Thu, 09 Aug 2012 18:54:10 +0200, Kornel Lesiński  wrote:

>>> One stylesheet can be easily reused for pixel-perfect 1x/2x layout,
>>> but pixel-perfect 1.5x requires its own sizes incompatible with 1x/2x.
>>
>> Apart from it possibly being a self-fulfilling prophecy – isn't this
>> too much premature “optimization”?
>
> I think we can safely assume that authors will always want to prepare as
> few assets and stylesheets as they can, and will prefer integer units to
> fractional ones (1px line vs 1.px line).


I don't see the big problem, I think the spec is fine here. Yes it allows
for putting a float there, but authors won't use it, so what's the
problem? The spec already says you should use the number to calculate the
correct intrinsic size, and the implementation will know what to do with a
float number there if someone finds an actual use for it.


This isn't limiting it for the sake of making anything easier, it's not  
like "the x is an integer" is any easier than "the x is a float". And if  
you *do* somehow find a good use for it down the line (and I believe there  
might be, maybe 0.5x) it'll be there and work. No harm. :)


--
Odin Hørthe Omdal (Velmont/odinho) · Core, Opera Software, http://opera.com


Re: [whatwg] Features for responsive Web design

2012-08-10 Thread Stephanie Rieger
On 10 Aug 2012, at 09:54, Florian Rivoal wrote:
> On Thu, 09 Aug 2012 11:29:17 +0200, Kornel Lesiński  wrote:
>> On 8 sie 2012, at 12:57, "Florian Rivoal"  wrote:
>>> Is there a good reason to believe that * will be something other than a
>>> power of two?
> 
> I wasn't debating whether or not shipping a device with a 1.5 pixel
> ratio is the best decision, but answering: "Is there a good reason
> to believe that will be something other than a power of two?"
> 
> The fact that it has happened seems a pretty good reason to believe
> that it may happen.

For reference, we are seeing *all sorts* of viewport and pixel density 
variations on Android devices. Low-cost devices such as the Galaxy Mini have 
240 hardware pixels across, but the default viewport (CSS pixels) has been 
set to a higher value of 320 pixels. 

This value was likely chosen not just to match the iPhone, but because it was 
needed to achieve comfortable legibility given the quality of the display. It 
does however result in an unusual ratio of 0.75.

As the Android platform also includes some incredibly robust user settings (and 
the granularity of these settings only seems to be increasing as the platform 
evolves) a user can easily reset these 320 CSS pixels back down to 240 or as 
high up as 940 (they won't know the exact number...the settings are simply 
labelled small, medium etc). 

Of course at 940 pixels, content on this device would be near illegible but 
this is just one example, of one setting, on one device. A given viewport 
adjustment may make little sense on a phone, but lots more sense on a 
completely different device (such as a seat-back display in a car or airplane).

While I hope viewport sizes will eventually standardize, the display is just 
one variable on a bill of materials. Given that manufacturers (and not just the 
ones that make phones) have the option to tweak the viewport in an effort to 
achieve a more comfortable pairing of hardware with software, unusual viewport 
sizes seem likely to remain a "coping strategy" for some time.

Steph

Re: [whatwg] StringEncoding: Allowed encodings for TextEncoder

2012-08-10 Thread Jonas Sicking
On Thu, Aug 9, 2012 at 10:42 AM, Joshua Bell  wrote:
> On Wed, Aug 8, 2012 at 9:03 AM, Joshua Bell  wrote:
>
>>
>>
>> On Wed, Aug 8, 2012 at 2:48 AM, James Graham  wrote:
>>
>>> On 08/07/2012 07:51 PM, Jonas Sicking wrote:
>>>
>>>> I don't mind supporting *decoding* from basically any encoding that
>>>> Anne's spec enumerates. I don't see a downside with that since I
>>>> suspect most implementations will just call into a generic decoding
>>>> backend anyway, and so supporting the same set of encodings as for
>>>> other parts of the platform should be relatively easy.

>>>
>>> [...]
>>>
>>>
>>>> However I think we should consider restricting support to a smaller
>>>> set of encodings while *encoding*. There should be little reason
>>>> for people today to produce text in non-utf formats. We might even be
>>>> able to get away with only supporting UTF8, though I wouldn't be
>>>> surprised if there are reasonably modern file formats which use utf16.

>>>
>>> FWIW, I agree with the decode-from-all-platform-encodings
>>> encode-to-utf[8|16] position.
>>>
>>
>> Any disagreement on limiting the supported encodings to utf-8, utf-16, and
>> utf-16be, while permitting decoding of all encodings in the Encoding spec?
>>
>> (This eliminates the "what to do on encoding error" issue nicely, still
>> need to resolve the BOM issue though.)
>>
>
> http://wiki.whatwg.org/wiki/StringEncoding has been updated to restrict the
> supported encodings for encoding to UTF-8, UTF-16 and UTF-16BE.
>
> I'm tempted to take it further to just UTF-8 and see if anyone complains.
>
> Jury is still out on the decode-with-BOM issue - I need to reason through
> Glenn's suggestions on the "open issues" thread.
>
> I added a related open issue raised by Glenn, summarized as "... suggest
> that the .encoding attribute simply return the name that was passed to
> the constructor." - taking this further, perhaps the attribute should be
> eliminated as callers could apply it themselves.

I could definitely live with removing the attribute.

/ Jonas


Re: [whatwg] Features for responsive Web design

2012-08-10 Thread Florian Rivoal
On Thu, 09 Aug 2012 11:29:17 +0200, Kornel Lesiński  wrote:

> On 8 sie 2012, at 12:57, "Florian Rivoal"  wrote:
>
>>> Is there a good reason to believe that * will be something other than a
>>> power of two?
>>>
>>> That is, could we just optimize the *x syntax away and specify that the
>>> first option is 1x, the second is 2x, the third is 4x, etc.?
>>
>> If you look at mobile phones, there are a bunch of existing devices with
>> 1.5 device pixels per css pixel, and also some with 2.25, so I don't
>> think we can assume only powers of 2 will be used.
>
> Pixel-perfect design for non-integer scaling ratios is very hard. To
> have evenly thin lines (1 device pixel wide) on such screens you have to
> use fractional CSS pixel sizes, and fractions need to be different for
> different scaling ratios.
>
> I don't think anybody will take advantage of that. IMHO non-integer
> ratios are a mistake that can/will be corrected.

I wasn't debating whether or not shipping a device with a 1.5 pixel
ratio is the best decision, but answering: "Is there a good reason
to believe that will be something other than a power of two?"

The fact that it has happened seems a pretty good reason to believe
that it may happen.

> Fractional ratios have proven to be unnecessary: on desktops 1x CSS
> pixel changed from 72dpi (CRT) to 130dpi on notebook screens, but we
> haven't got fractional scaling ratios along the way. Variability in
> screen sizes and actual DPI has been accepted. The same can happen with
> 1.5x-2.5x screens: pretend they all are 2x, vary CSS pixel width/height,
> accept physical size of CSS pixel will be slightly different.
>
> For example the 2.25 ratio doesn't make sense to me. 12.5% increase in
> screen density is going to be imperceptible. A better solution would be
> to use the crisp 2x ratio and have bigger screen area (in CSS pixels).

A ratio of 2.25 on a 720-physical-pixel device gives a viewport width of
320 css pixels (720 / 2.25 = 320). 320 pixels is the same as the iPhone, and
being identical to that helps with site compatibility.

I am not convinced that using 2.25 was the best decision, but it has
some justifications, and has happened, so I don't think it is reasonable
to bake in some assumptions in the spec (only powers of 2) when we
know that they don't match reality.

 - Florian