Re: [whatwg] Wasn't there going to be a strict spec?

2012-09-19 Thread Henri Sivonen
On Fri, Sep 7, 2012 at 7:25 AM, Ian Hickson  wrote:
> Browsers are certainly allowed to report syntax errors in their consoles.
> Indeed I would encourage it.

Unlikely to happen. I already faced reviewer skepticism for making
Firefox whine to console about encoding errors. I silenced the most
common encoding error to get the patch landed. Fortunately the most
common error is also the least actionable one (encoding not declared
in a different-origin framed document—ad iframe in practice) so
silencing it isn’t crazy from the perspective of the utility of the
messages.

As Mike said, Firefox reports most (not all!) parse errors in View
Source, though.

-- 
Henri Sivonen
hsivo...@iki.fi
http://hsivonen.iki.fi/


Re: [whatwg] Wasn't there going to be a strict spec?

2012-09-06 Thread Michael[tm] Smith
Ian Hickson , 2012-09-07 04:25 +:

> On Fri, 10 Aug 2012, Erik Reppen wrote:
> > Why can't we set stricter rules that cause rendering to cease or at least a
> > non-interpreter-halting error to be thrown by browsers when the HTML is
> > broken from a nesting/XML-strict-tag-closing perspective if we want?
>
> Browsers are certainly allowed to report syntax errors in their consoles. 
> Indeed I would encourage it.

Firefox's "View source" does it. It highlights syntax errors in red (that
is, things the HTML spec defines as syntax errors for text/html documents),
and if you hover over them, shows the text of the error message.

  --Mike

P.S. to Erik, if by "XML-strict-tag-closing" you mean having a browser
report XML well-formedness errors when it's parsing a document served as
text/html, I don't think you're going to get that.  Lack of
"XML-strict-tag-closing" in an HTML document doesn't make it broken. If you
want to catch XML well-formedness errors, I guess you'd need to check by
running stuff through an XML parser separately. Also, I don't know what you
mean by "broken from a nesting perspective"...

-- 
Michael[tm] Smith http://people.w3.org/mike


Re: [whatwg] Wasn't there going to be a strict spec?

2012-09-06 Thread Ian Hickson
On Fri, 10 Aug 2012, Erik Reppen wrote:
>
> My understanding of the general philosophy of HTML5 on the matter of 
> malformed HTML is that it's better to define specific rules concerning 
> breakage rather than overly strict rules about how to do it right in the 
> first place

This is incorrect. The philosophy is to have strict rules about how to 
write content -- e.g. in the form of the strict content model 
descriptions, the "Writing HTML documents" syntax section, the obsoletion 
of many legacy parts of the language, and other authoring conformance 
criteria -- and then to have equally strict rules for browsers and other 
user agents that define exactly what should happen when the first set of 
rules is ignored and broken (usually to ignore the broken content and not 
try to fix the problem, but sometimes, usually for legacy reasons, to make 
an attempt at "do what I mean").


> Modern browsers are so good at hiding breakage in rendering now that I 
> sometimes run into things that are just nuking the DOM-node structure on 
the JS-side of things while everything looks hunky-dory in rendering 
> and no errors are being thrown.

Use a validator. :-) That should help catch syntax errors and content 
model errors, at least.
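
For example (a rough sketch using the Python html5lib package, going from
memory of its API; the file name is just a placeholder), the parser itself
will happily list every syntax error it recovered from:

    import sys
    import html5lib

    # Parse with the HTML5 parsing algorithm; the parser records every
    # parse error it recovered from while building the tree.
    with open(sys.argv[1], "rb") as f:
        parser = html5lib.HTMLParser()
        parser.parse(f)

    for position, errorcode, datavars in parser.errors:
        print(position, errorcode)

That only covers syntax errors, though; content model errors (an
out-of-range attribute value, say, or a <p> inside an <h1>) still need a
real conformance checker.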


> It's like the HTML equivalent of wrapping every function in an empty 
> try/catch statement. For the last year or so I've started using IE8 as 
> my HTML canary when I run into weird problems and I'm not the only dev 
> I've heard of doing this. But what happens when we're no longer 
> supporting IE8 and using tags that it doesn't recognize?

I don't really understand how IE8 is relevant here. Can you elaborate? How 
does it help?


> Why can't we set stricter rules that cause rendering to cease or at least a
> non-interpreter-halting error to be thrown by browsers when the HTML is
> broken from a nesting/XML-strict-tag-closing perspective if we want?

Browsers are certainly allowed to report syntax errors in their consoles. 
Indeed I would encourage it.


> And if we were able to set such rules, wouldn't it be less work to parse?

You have to catch the error either way, whether it's to then abort, or to 
then ignore it. It's the same amount of work. For some errors, e.g. 
out-of-range errors or content model errors, it can actually be 
significantly _more_ work to detect the error than to ignore it.
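
To illustrate with a toy fragment (just a sketch, nothing like how a real
parser is written): the check that notices a stray end tag is identical
either way; only the branch taken afterwards differs.

    # Toy end-tag handling: detection is the same whether we abort
    # (strict) or silently recover (forgiving); only the branch differs.
    def handle_end_tag(name, open_elements, strict=False):
        if not open_elements or open_elements[-1] != name:  # same check either way
            if strict:
                raise SyntaxError("unexpected </%s>" % name)
            return  # forgiving: ignore the stray end tag and move on
        open_elements.pop()

    stack = ["html", "body", "p"]
    handle_end_tag("div", stack)              # recovered silently
    try:
        handle_end_tag("div", stack, strict=True)
    except SyntaxError as err:
        print(err)                            # same check, different action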


> How difficult would it be to add some sort of opt-in strict mode for 
> HTML5 that didn't require juggling of doctypes (since that seems to be 
> what the vendors want)?

It's not at all difficult. The spec allows it today. The question is 
really how hard would it be to convince browsers to implement it. :-)


On Fri, 10 Aug 2012, Erik Reppen wrote:
> 
> I think there's a legit need for a version or some kind of mode for 
> HTML5 that assumes you're a pro and breaks visibly or throws an error 
> when you've done something wrong.

If you're a pro, use a validator.

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Re: [whatwg] Wasn't there going to be a strict spec?

2012-08-16 Thread yuhong


Kang-Hao (Kenny) Lu-4 wrote:
> 
> Yep. I would encourage you to play with XHTML5 (application/xhtml+xml)
> more and report bugs to browsers. When I still had interest in
> application/xhtml+xml (back in 2007?), I got troubled by all the
> differences in the DOM APIs. I think currently most JS frameworks
> probably don't support XHTML5.
AFAIK, my favorite features to mention when I talk about how IE8 is a boat
anchor are XHTML and DOM Level 2, as they are more than 10 years old!
It is unfortunate that MS does not provide any further major upgrades to IE
after the version of Windows enters extended support. IE8 was released just
before XP entered extended support in April 2009.
-- 
View this message in context: 
http://old.nabble.com/Wasn%27t-there-going-to-be-a-strict-spec--tp34283528p34308791.html
Sent from the whatwg.org - whatwg mailing list archive at Nabble.com.



Re: [whatwg] Wasn't there going to be a strict spec?

2012-08-13 Thread Erik Reppen
That spells out a major browser vendor issue much more clearly. I think
just having the option to develop in application/xhtml+xml and switch to
text/html is a good start, though.

On Sat, Aug 11, 2012 at 10:17 AM, Karl Dubost  wrote:

>
> On 10 August 2012 at 20:19, Tab Atkins Jr. wrote:
> > I don't wish to spend the time to dig up the studies showing that 95% or
> > so of XML served as text/html is invalid XML
>
> That doesn't really make sense, but I guess what Tab meant is
>
> People attempting to write documents
> * with XML syntax rules (such as for example XHTML 1.0),
> * and serving it as text/html.
>
> Often, these documents are NOT well-formed, before even considering
> validity, let alone conformance.
>
> On top of that you can add a layer of madness with user-agent sniffing. I
> have documented one case we had at Opera that forced us to recover
> automatically, *unfortunately*. It also makes the task of creating a survey
> very hard because… well, you get different markup, redirections, etc.
> (different results) because of the user-agent sniffing.
>
> See [Wrong To Be Right - application/xhtml+xml][1]
>
> [1]:
> http://my.opera.com/karlcow/blog/2011/03/03/wrong-to-be-right-with-xhtml
>
> For stats, there are two big surveys which have been made in the past
> (maybe it is what Tab refers to)
>
> https://developers.google.com/webmasters/state-of-the-web/
> http://dev.opera.com/articles/view/mama/
>
> PS: Erik, you can also rely on XHTML5, i.e. serving your document as
> application/xhtml+xml, but expect issues with browser market share in some
> countries.
>
>
> --
> Karl Dubost - http://dev.opera.com/
> Developer Relations, Opera Software
>
>


Re: [whatwg] Wasn't there going to be a strict spec?

2012-08-11 Thread Karl Dubost

On 10 August 2012 at 20:19, Tab Atkins Jr. wrote:
> I don't wish to spend the time to dig up the studies showing that 95% or so 
> of XML served as text/html is invalid XML

That doesn't really make sense, but I guess what Tab meant is

People attempting to write documents 
* with XML syntax rules (such as for example XHTML 1.0), 
* and serving it as text/html.

Often, these documents are NOT well-formed, before even considering 
validity, let alone conformance. 

On top of that you can add a layer of madness with user-agent sniffing. I have 
documented one case we had at Opera that forced us to recover automatically, 
*unfortunately*. It also makes the task of creating a survey very hard because… 
well, you get different markup, redirections, etc. (different results) because 
of the user-agent sniffing.

See [Wrong To Be Right - application/xhtml+xml][1]

[1]: http://my.opera.com/karlcow/blog/2011/03/03/wrong-to-be-right-with-xhtml

For stats, there are two big surveys which have been made in the past (maybe it 
is what Tab refers to)

https://developers.google.com/webmasters/state-of-the-web/
http://dev.opera.com/articles/view/mama/

PS: Erik, you can also rely on XHTML5, i.e. serving your document as 
application/xhtml+xml, but expect issues with browser market share in some 
countries.


-- 
Karl Dubost - http://dev.opera.com/
Developer Relations, Opera Software



Re: [whatwg] Wasn't there going to be a strict spec?

2012-08-10 Thread Kang-Hao (Kenny) Lu
(12/08/11 8:41), Erik Reppen wrote:
> Thanks Hugh. I had mistakenly been thinking of XHTML5 as something that
> never happened rather than merely HTML5 served as XML, which hadn't really
> occurred to me as a viable option. I look forward to messing with
> this. This is precisely what I wanted to be able to do.

Yep. I would encourage you to play with XHTML5 (application/xhtml+xml)
more and report bugs to browsers. When I still had interest in
application/xhtml+xml (back in 2007?), I got troubled by all the
differences in the DOM APIs. I think currently most JS frameworks
probably don't support XHTML5.

After playing with XHTML5, if you still think browsers should implement yet
another mode, you should probably say why XHTML5 is bad and why you
don't just use it.

If you have proposals for how some of the DOM APIs in XHTML5 should
work, you might want to follow the instructions at the top of the relevant
specs (DOM Parsing and Serialization[1] basically) and send feedback.


[1] http://html5.org/specs/dom-parsing.html



Cheers,
Kenny
-- 
Web Specialist, Oupeng Browser, Beijing
Try Oupeng: http://www.oupeng.com/


Re: [whatwg] Wasn't there going to be a strict spec?

2012-08-10 Thread Erik Reppen
Thanks Hugh. I had mistakenly been thinking of XHTML5 as something that
never happened rather than merely HTML5 served as XML, which hadn't really
occurred to me as a viable option. I look forward to messing with
this. This is precisely what I wanted to be able to do.

On Fri, Aug 10, 2012 at 7:28 PM, Hugh Guiney  wrote:

> On Fri, Aug 10, 2012 at 8:06 PM, Erik Reppen 
> wrote:
> > Sorry if this double-posted but I think I forgot to CC the list.
> >
> > Browser vendor politics I can understand but if we're going to talk about
> > what "history shows" about people like myself suggesting features we
> can't
> > actually support I'd like to see some studies that contradict the
> > experiences I've had as a web ui developer for the last five years.
> >
> > Everybody seems on board with providing a JavaScript strict mode. How is
> > this any different? Do people blame the vendors when vars they try to
> > define without a var keyword break their strict-mode code? Do we fret
> about
> > all the js out there that's not written in strict mode?
> >
> > And HTML5 has found the key to eliminating the political issue, I should
> > think. Don't just worry about the rules for when the authors get it
> right.
> > Explicitly spell out the rules for how to handle it when they get it
> wrong.
> > How can you blame the browser for strict mode face plants when every
> modern
> > browser including IE goes about face-planting in exactly the same way?
> >
> > Sure, I could integrate in-editor validation into my process, but why add
> > bloat to any number of tools I might be using for any number of
> > different stacks, when we had something I know worked for a lot of
> > developers who were all as confused as I was when people inexplicably
> > started shouting about XHTML Strict's "failure" from the rooftops.
> >
> > Is there some unspoken concern here? If there is, I'll shut up and try to
> > find out what it is through other means but I really don't see the logic
> in
> > not having some strict provision for authors who want it. How hard is it
> to
> > plug in an XML validator and rip out the namespace bits if that's not
> > something we want to deal with just yet and propose a set of behaviors
> for
> > when your HTML5 isn't compliant with a stricter syntax?
> >
> > Because yes, these bugs can be kinda nasty when you don't think to check
> to
> > make sure your HTML is well-formed and it's the kind of stuff that can
> > easily slide into production as difficult-to-diagnose edge-cases. Believe
> > me. Front-liner here. It's an issue. Markup is where presentation,
> > behavior, content, client-side, and server-side meet. I'm comfortable
> with
> > letting people embrace their own philosophies but I like my markup to be
> > done right in the first place and visible breakage or at least browser
> > console error messages is the easiest and most obvious way to discover
> that
> > it isn't. And I developed that philosophy from my experience moving from
> > less strict to strict markup, not just toeing some weird technorati
> > political line or zeitgeist.
>
> There's nothing stopping you from writing XHTML5. This is what I do
> for exactly the reasons you describe. If you write polyglot documents,
> you can use application/xhtml+xml in development, serve text/html in
> production and still sleep at night. Hell, these days you can even
> serve application/xhtml+xml in production if IE < 9 isn't a
> significant market segment for you.
>


Re: [whatwg] Wasn't there going to be a strict spec?

2012-08-10 Thread Hugh Guiney
On Fri, Aug 10, 2012 at 8:06 PM, Erik Reppen  wrote:
> Sorry if this double-posted but I think I forgot to CC the list.
>
> Browser vendor politics I can understand but if we're going to talk about
> what "history shows" about people like myself suggesting features we can't
> actually support I'd like to see some studies that contradict the
> experiences I've had as a web ui developer for the last five years.
>
> Everybody seems on board with providing a JavaScript strict mode. How is
> this any different? Do people blame the vendors when vars they try to
> define without a var keyword break their strict-mode code? Do we fret about
> all the js out there that's not written in strict mode?
>
> And HTML5 has found the key to eliminating the political issue, I should
> think. Don't just worry about the rules for when the authors get it right.
> Explicitly spell out the rules for how to handle it when they get it wrong.
> How can you blame the browser for strict mode face plants when every modern
> browser including IE goes about face-planting in exactly the same way?
>
> Sure, I could integrate in-editor validation into my process, but why add
> bloat to any number of tools I might be using for any number of
> different stacks, when we had something I know worked for a lot of
> developers who were all as confused as I was when people inexplicably
> started shouting about XHTML Strict's "failure" from the rooftops.
>
> Is there some unspoken concern here? If there is, I'll shut up and try to
> find out what it is through other means but I really don't see the logic in
> not having some strict provision for authors who want it. How hard is it to
> plug in an XML validator and rip out the namespace bits if that's not
> something we want to deal with just yet and propose a set of behaviors for
> when your HTML5 isn't compliant with a stricter syntax?
>
> Because yes, these bugs can be kinda nasty when you don't think to check to
> make sure your HTML is well-formed and it's the kind of stuff that can
> easily slide into production as difficult-to-diagnose edge-cases. Believe
> me. Front-liner here. It's an issue. Markup is where presentation,
> behavior, content, client-side, and server-side meet. I'm comfortable with
> letting people embrace their own philosophies but I like my markup to be
> done right in the first place and visible breakage or at least browser
> console error messages is the easiest and most obvious way to discover that
> it isn't. And I developed that philosophy from my experience moving from
> less strict to strict markup, not just toeing some weird technorati
> political line or zeitgeist.

There's nothing stopping you from writing XHTML5. This is what I do
for exactly the reasons you describe. If you write polyglot documents,
you can use application/xhtml+xml in development, serve text/html in
production and still sleep at night. Hell, these days you can even
serve application/xhtml+xml in production if IE < 9 isn't a
significant market segment for you.
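
If anyone wants to try that switch server-side, it is roughly this much
work (a sketch in Python/WSGI; "page.xhtml" is a placeholder for a polyglot
document, and the Accept check is deliberately naive, ignoring q-values):

    from wsgiref.simple_server import make_server

    def app(environ, start_response):
        # Send the same polyglot document as XHTML to UAs that advertise
        # support for it, and as text/html to everything else (IE < 9 etc.).
        accept = environ.get("HTTP_ACCEPT", "")
        if "application/xhtml+xml" in accept:
            ctype = "application/xhtml+xml"
        else:
            ctype = "text/html"
        with open("page.xhtml", "rb") as f:
            body = f.read()
        start_response("200 OK",
                       [("Content-Type", ctype + "; charset=utf-8")])
        return [body]

    if __name__ == "__main__":
        make_server("", 8000, app).serve_forever()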


Re: [whatwg] Wasn't there going to be a strict spec?

2012-08-10 Thread David Bruant

On 10/08/2012 20:06, Erik Reppen wrote:

> Sorry if this double-posted but I think I forgot to CC the list.
>
> Browser vendor politics I can understand but if we're going to talk about
> what "history shows" about people like myself suggesting features we can't
> actually support I'd like to see some studies that contradict the
> experiences I've had as a web ui developer for the last five years.
>
> Everybody seems on board with providing a JavaScript strict mode. How is
> this any different?

JavaScript strict mode makes it possible to write a secure JavaScript
program (there is some additional work to do, but you can do it yourself as
a programmer), while it's almost impossible to write a secure program in
non-strict JavaScript because of scope-violating (indirect) eval.
JavaScript strict mode is almost a different language in that regard.
The ability to write securable JavaScript required an intervention at
the language level.


HTML has nothing comparable to gain from a strict mode, as far as I know.
Also, JS strict mode deals with runtime behavior and not syntax (the "with" 
statement aside). That's far different from what could be expected from an 
HTML strict mode.


To some extent, CSP ("Content Security Policy", about to reach 
Recommendation stage soon) is your "HTML strict mode" if you care about 
security.
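
For what it's worth, opting in is just one response header. A minimal
sketch (the policy string is only an example, and plain http.server is
obviously not a deployment recommendation):

    from http.server import BaseHTTPRequestHandler, HTTPServer

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            self.send_response(200)
            self.send_header("Content-Type", "text/html; charset=utf-8")
            # "default-src 'self'": same-origin resources only, which also
            # blocks inline scripts unless explicitly re-allowed.
            self.send_header("Content-Security-Policy", "default-src 'self'")
            self.end_headers()
            self.wfile.write(b"<!DOCTYPE html><title>CSP demo</title><p>Hello")

    if __name__ == "__main__":
        HTTPServer(("", 8000), Handler).serve_forever()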


David


Re: [whatwg] Wasn't there going to be a strict spec?

2012-08-10 Thread Tab Atkins Jr.
On Fri, Aug 10, 2012 at 5:02 PM, Erik Reppen  wrote:
> Browser vendor politics I can understand but if we're going to talk about
> what "history shows" about people like myself suggesting features we can't
> actually support I'd like to see some studies that contradict the
> experiences I've had as a web ui developer for the last five years.

I don't wish to spend the time to dig up the studies showing that 95%
or so of XML served as text/html is invalid XML, but others probably
have the links easily at hand.

> Everybody seems on board with providing a JavaScript strict mode. How is
> this any different? Do people blame the vendors when vars they try to define
> without a var keyword break their strict-mode code? Do we fret about all the
> js out there that's not written in strict mode?

JS is a substantially different story, because you virtually never
display a page which runs JavaScript code from many arbitrary users of
the website.  As well, JS is commonly written only by the more
advanced maintainers of a given website.  Finally, JS errors are
localized to the scripts in which they are located.

This is substantially different from HTML, which is commonly composed
of lots of user comments which are barely, if at all, escaped for
security.  As well, HTML is commonly written by a number of users of
varying skill levels in a given site; in my previous experience as a
web dev, for example, people from the advertising department often
wrote snippets of HTML into pages, though their skills were far below
mine.  Finally, HTML doesn't have the same localizing barriers as JS
does with <script> elements.

Re: [whatwg] Wasn't there going to be a strict spec?

2012-08-10 Thread Erik Reppen
Sorry if this double-posted but I think I forgot to CC the list.

Browser vendor politics I can understand but if we're going to talk about
what "history shows" about people like myself suggesting features we can't
actually support I'd like to see some studies that contradict the
experiences I've had as a web ui developer for the last five years.

Everybody seems on board with providing a JavaScript strict mode. How is
this any different? Do people blame the vendors when vars they try to
define without a var keyword break their strict-mode code? Do we fret about
all the js out there that's not written in strict mode?

And HTML5 has found the key to eliminating the political issue, I should
think. Don't just worry about the rules for when the authors get it right.
Explicitly spell out the rules for how to handle it when they get it wrong.
How can you blame the browser for strict mode face plants when every modern
browser including IE goes about face-planting in exactly the same way?

Sure, I could integrate in-editor validation into my process, but why add
bloat to any number of tools I might be using for any number of
different stacks, when we had something I know worked for a lot of
developers who were all as confused as I was when people inexplicably
started shouting about XHTML Strict's "failure" from the rooftops.

Is there some unspoken concern here? If there is, I'll shut up and try to
find out what it is through other means but I really don't see the logic in
not having some strict provision for authors who want it. How hard is it to
plug in an XML validator and rip out the namespace bits if that's not
something we want to deal with just yet and propose a set of behaviors for
when your HTML5 isn't compliant with a stricter syntax?

Because yes, these bugs can be kinda nasty when you don't think to check to
make sure your HTML is well-formed and it's the kind of stuff that can
easily slide into production as difficult-to-diagnose edge-cases. Believe
me. Front-liner here. It's an issue. Markup is where presentation,
behavior, content, client-side, and server-side meet. I'm comfortable with
letting people embrace their own philosophies but I like my markup to be
done right in the first place and visible breakage or at least browser
console error messages is the easiest and most obvious way to discover that
it isn't. And I developed that philosophy from my experience moving from
less strict to strict markup, not just toeing some weird technorati
political line or zeitgeist.

On Fri, Aug 10, 2012 at 5:44 PM, Tab Atkins Jr. wrote:

> On Fri, Aug 10, 2012 at 3:29 PM, Erik Reppen 
> wrote:
> > This confuses me. Why does it matter that other documents wouldn't work
> if
> > you changed the parsing rules they were defined with to stricter
> versions?
> > As far as backwards compatibility, if a strict-defined set of HTML would
> > also work in a less strict context, what could it possibly matter? It's
> only
> > the author's problem to maintain (or switch to a more forgiving mode) and
> > backwards compatibility isn't broken if the same client 500 years from
> now
> > uses the same general HTML mode for both.
> >
> > I think there's a legit need for a version or some kind of mode for HTML5
> > that assumes you're a pro and breaks visibly or throws an error when
> you've
> > done something wrong. Back in the day nobody ever forced authors who
> didn't
> > know what they're doing to use doctypes they were too sloppy to handle. I
> > wasn't aware of any plan to discontinue non-XHTML doctypes. How everybody
> > started thinking of it as a battle for one doctype to rule them all
> makes no
> > sense to me but I'm fine with one doctype. I just want something that
> works
> > in regular HTML5 but that will break in some kind of a strict mode when
> > XML-formatting rules aren't adhered to. You pick degrees of strictness
> based
> > on what works for you. I don't really see a dealbreaking issue here. Why
> > can't we all have it the way we want it?
> >
> > As somebody who deals with some pretty complex UI where the HTML and CSS
> are
> > concerned it's a problem when things in the rendering context give no
> > indication of breakage, while in the DOM they are in fact getting tripped
> > up. Sure, I can validate and swap out doctypes or just keep running
> stuff in
> > IE8 to see if it breaks until I actually start using HTML5-only tags but
> > this is kind of awkward and suggests something forward-thinking design
> could
> > address, don't you think?
>
> As I said, years of experience have provided strong evidence that a
> large majority of authors cannot guarantee that their pages are valid
> all of the time.  This covers both authoring-time validity and
> validity after including user comments or the like.
>
> If you want a mode that guarantees validity, that already exists -
> it's called "put a validator into your workflow".  Many popular text
> editors offer plugins that validate your markup as you go, as well.

Re: [whatwg] Wasn't there going to be a strict spec?

2012-08-10 Thread Tab Atkins Jr.
On Fri, Aug 10, 2012 at 3:29 PM, Erik Reppen  wrote:
> This confuses me. Why does it matter that other documents wouldn't work if
> you changed the parsing rules they were defined with to stricter versions?
> As far as backwards compatibility, if a strict-defined set of HTML would
> also work in a less strict context, what could it possibly matter? It's only
> the author's problem to maintain (or switch to a more forgiving mode) and
> backwards compatibility isn't broken if the same client 500 years from now
> uses the same general HTML mode for both.
>
> I think there's a legit need for a version or some kind of mode for HTML5
> that assumes you're a pro and breaks visibly or throws an error when you've
> done something wrong. Back in the day nobody ever forced authors who didn't
> know what they're doing to use doctypes they were too sloppy to handle. I
> wasn't aware of any plan to discontinue non-XHTML doctypes. How everybody
> started thinking of it as a battle for one doctype to rule them all makes no
> sense to me but I'm fine with one doctype. I just want something that works
> in regular HTML5 but that will break in some kind of a strict mode when
> XML-formatting rules aren't adhered to. You pick degrees of strictness based
> on what works for you. I don't really see a dealbreaking issue here. Why
> can't we all have it the way we want it?
>
> As somebody who deals with some pretty complex UI where the HTML and CSS are
> concerned it's a problem when things in the rendering context give no
> indication of breakage, while in the DOM they are in fact getting tripped
> up. Sure, I can validate and swap out doctypes or just keep running stuff in
> IE8 to see if it breaks until I actually start using HTML5-only tags but
> this is kind of awkward and suggests something forward-thinking design could
> address, don't you think?

As I said, years of experience have provided strong evidence that a
large majority of authors cannot guarantee that their pages are valid
all of the time.  This covers both authoring-time validity and
validity after including user comments or the like.

If you want a mode that guarantees validity, that already exists -
it's called "put a validator into your workflow".  Many popular text
editors offer plugins that validate your markup as you go, as well.

The problem with breaking visibly is that it doesn't punish authors,
it punishes *users*, who overwhelmingly blame the browser rather than
the site author when the site won't display for whatever reason.
There's no *benefit* to a browser for doing this; it's much more in
their interest to continue doing error-recovery, because, again,
history suggests very strongly that most authors *who theoretically
want strict parsing* can't actually satisfy the constraints they ask
for.  It's simply better for users to always do soft error-recovery,
no matter what the author claims they want.

~TJ


Re: [whatwg] Wasn't there going to be a strict spec?

2012-08-10 Thread Erik Reppen
This confuses me. Why does it matter that other documents wouldn't work if
you changed the parsing rules they were defined with to stricter versions?
As far as backwards compatibility, if a strict-defined set of HTML would
also work in a less strict context, what could it possibly matter? It's
only the author's problem to maintain (or switch to a more forgiving mode)
and backwards compatibility isn't broken if the same client 500 years from
now uses the same general HTML mode for both.

I think there's a legit need for a version or some kind of mode for HTML5
that assumes you're a pro and breaks visibly or throws an error when you've
done something wrong. Back in the day nobody ever forced authors who didn't
know what they're doing to use doctypes they were too sloppy to handle. I
wasn't aware of any plan to discontinue non-XHTML doctypes. How everybody
started thinking of it as a battle for one doctype to rule them all makes
no sense to me but I'm fine with one doctype. I just want something that
works in regular HTML5 but that will break in some kind of a strict mode
when XML-formatting rules aren't adhered to. You pick degrees of strictness
based on what works for you. I don't really see a dealbreaking issue here.
Why can't we all have it the way we want it?

As somebody who deals with some pretty complex UI where the HTML and CSS
are concerned it's a problem when things in the rendering context give no
indication of breakage, while in the DOM they are in fact getting tripped
up. Sure, I can validate and swap out doctypes or just keep running stuff
in IE8 to see if it breaks until I actually start using HTML5-only tags but
this is kind of awkward and suggests something forward-thinking design
could address, don't you think?

On Fri, Aug 10, 2012 at 3:05 PM, Tab Atkins Jr. wrote:

> On Fri, Aug 10, 2012 at 12:45 PM, Erik Reppen 
> wrote:
> > My understanding of the general philosophy of HTML5 on the matter of
> > malformed HTML is that it's better to define specific rules concerning
> > breakage rather than overly strict rules about how to do it right in the
> > first place but this is really starting to create pain-points in
> > development.
> >
> > Modern browsers are so good at hiding breakage in rendering now that I
> > sometimes run into things that are just nuking the DOM-node structure on
> > the JS-side of things while everything looks hunky-dory in rendering and
> > no errors are being thrown.
> >
> > It's like the HTML equivalent of wrapping every function in an empty
> > try/catch statement. For the last year or so I've started using IE8 as my
> > HTML canary when I run into weird problems and I'm not the only dev I've
> > heard of doing this. But what happens when we're no longer supporting IE8
> > and using tags that it doesn't recognize?
> >
> > Why can't we set stricter rules that cause rendering to cease or at
> least a
> > non-interpreter-halting error to be thrown by browsers when the HTML is
> > broken from a nesting/XML-strict-tag-closing perspective if we want?
> Until
> > most of the vendors started lumping XHTML Strict 1.0 into a general
> > "standards" mode that basically worked the same for any declared
> doctype, I
> > thought it was an excellent feature from a development perspective to
> just
> > let bad XML syntax break the page.
> >
> > And if we were able to set such rules, wouldn't it be less work to parse?
> > How difficult would it be to add some sort of opt-in strict mode for
> HTML5
> > that didn't require juggling of doctypes (since that seems to be what the
> > vendors want)?
>
> The parsing rules of HTML aren't set to accommodate old browsers,
> they're set to accommodate old content (which was written for those
> old browsers).  There is an *enormous* corpus of content on the web
> which is officially "invalid" according to various strict definitions,
> and would thus not be displayable in your browser.
>
> As well, experience shows that this isn't an accident, or just due to
> "bad authors".  If you analyze XML sent as text/html on the web,
> something like 95% of it is invalid XML, for lots of different
> reasons.  Even when authors *know* they're using something that's
> supposed to be strict, they screw it up.  Luckily, we ignore the fact
> that it's XML and use good parsing rules to usually extract what the
> author meant.
>
> There are several efforts ongoing to extend this kind of non-strict
> parsing to XML itself, such as the XML-ER (error recovery) Community
> Group in the W3C.  XML failed on the web in part because of its
> strictness - it's very non-trivial to ensure that your page is always
> valid when you're lumping in arbitrary user content as well.
>
> Simplifying the parser to be stricter would not have any significant
> impact on performance.  The vast majority of pages usually pass down
> the fast common path anyway, and most of the "fixes" are very simple
> and fast to apply as well.  Additionally, doing something naive like
> saying "just use

Re: [whatwg] Wasn't there going to be a strict spec?

2012-08-10 Thread Tab Atkins Jr.
On Fri, Aug 10, 2012 at 12:45 PM, Erik Reppen  wrote:
> My understanding of the general philosophy of HTML5 on the matter of
> malformed HTML is that it's better to define specific rules concerning
> breakage rather than overly strict rules about how to do it right in the
> first place but this is really starting to create pain-points in
> development.
>
> Modern browsers are so good at hiding breakage in rendering now that I
> sometimes run into things that are just nuking the DOM-node structure on
> the JS-side of things while everything looks hunky-dory in rendering and
> no errors are being thrown.
>
> It's like the HTML equivalent of wrapping every function in an empty
> try/catch statement. For the last year or so I've started using IE8 as my
> HTML canary when I run into weird problems and I'm not the only dev I've
> heard of doing this. But what happens when we're no longer supporting IE8
> and using tags that it doesn't recognize?
>
> Why can't we set stricter rules that cause rendering to cease or at least a
> non-interpreter-halting error to be thrown by browsers when the HTML is
> broken from a nesting/XML-strict-tag-closing perspective if we want? Until
> most of the vendors started lumping XHTML Strict 1.0 into a general
> "standards" mode that basically worked the same for any declared doctype, I
> thought it was an excellent feature from a development perspective to just
> let bad XML syntax break the page.
>
> And if we were able to set such rules, wouldn't it be less work to parse?
> How difficult would it be to add some sort of opt-in strict mode for HTML5
> that didn't require juggling of doctypes (since that seems to be what the
> vendors want)?

The parsing rules of HTML aren't set to accommodate old browsers,
they're set to accommodate old content (which was written for those
old browsers).  There is an *enormous* corpus of content on the web
which is officially "invalid" according to various strict definitions,
and would thus not be displayable in your browser.

As well, experience shows that this isn't an accident, or just due to
"bad authors".  If you analyze XML sent as text/html on the web,
something like 95% of it is invalid XML, for lots of different
reasons.  Even when authors *know* they're using something that's
supposed to be strict, they screw it up.  Luckily, we ignore the fact
that it's XML and use good parsing rules to usually extract what the
author meant.

There are several efforts ongoing to extend this kind of non-strict
parsing to XML itself, such as the XML-ER (error recovery) Community
Group in the W3C.  XML failed on the web in part because of its
strictness - it's very non-trivial to ensure that your page is always
valid when you're lumping in arbitrary user content as well.

Simplifying the parser to be stricter would not have any significant
impact on performance.  The vast majority of pages usually pass down
the fast common path anyway, and most of the "fixes" are very simple
and fast to apply as well.  Additionally, doing something naive like
saying "just use strict XML parsing" is actually *worse* - XML all by
itself is relatively simple, but the addition of namespaces actually
makes it *slower* to parse than HTML.

~TJ


[whatwg] Wasn't there going to be a strict spec?

2012-08-10 Thread Erik Reppen
My understanding of the general philosophy of HTML5 on the matter of
malformed HTML is that it's better to define specific rules concerning
breakage rather than overly strict rules about how to do it right in the
first place but this is really starting to create pain-points in
development.

Modern browsers are so good at hiding breakage in rendering now that I
sometimes run into things that are just nuking the DOM-node structure on
the JS-side of things while everything looks hunky-dory in rendering and
no errors are being thrown.

It's like the HTML equivalent of wrapping every function in an empty
try/catch statement. For the last year or so I've started using IE8 as my
HTML canary when I run into weird problems and I'm not the only dev I've
heard of doing this. But what happens when we're no longer supporting IE8
and using tags that it doesn't recognize?

Why can't we set stricter rules that cause rendering to cease or at least a
non-interpreter-halting error to be thrown by browsers when the HTML is
broken from a nesting/XML-strict-tag-closing perspective if we want? Until
most of the vendors started lumping XHTML Strict 1.0 into a general
"standards" mode that basically worked the same for any declared doctype, I
thought it was an excellent feature from a development perspective to just
let bad XML syntax break the page.

And if we were able to set such rules, wouldn't it be less work to parse?
How difficult would it be to add some sort of opt-in strict mode for HTML5
that didn't require juggling of doctypes (since that seems to be what the
vendors want)?