Re: [whatwg] Make quoted attributes a conformance criterion

2009-07-27 Thread Eduard Pascual
On Mon, Jul 27, 2009 at 2:53 AM, Jonas Sicking wrote:
> The more I think about it, the more I'm intrigued by Rob Sayre's idea
> of completely removing the definition of what is "conforming". Let the
> spec define UA (or HTML consumer) behavior, and let lint tools fight
> out best practices for authoring.

Besides the point Maciej already made, there is another aspect in
favor of good conformance definitions: web evolution.

Some of the issues, like attribute quoting, may be stylistic, but
there are many where there is a clear boundary between what's right
and what's wrong. For example,  is clearly wrong; but there are
too many legacy webpages that use it; so browsers need to support it
to render all that content. If we leave "conformance" out of the spec,
and only define what browsers are supposed to do, we'd be bringing
 back to the web, even for new websites, and this would be
clearly wrong (we are not speaking of assistive technologies only, but
many pages that rely on  end up unreadable even in common
browsers).

Someone could argue that this is just a matter of best practice or
style, and hence could be handled by lint tools; but conformance
criteria in the specification have a lot more strength than any lint
tool. While it may be ok to leave more arguable aspects to these
tools, things that are obviously wrong should be clearly defined as
non-conformant by the spec.

Just my two cents.

Regards,
Eduard Pascual


Re: [whatwg] Make quoted attributes a conformance criterion

2009-07-26 Thread Maciej Stachowiak


On Jul 26, 2009, at 6:53 PM, Jonas Sicking wrote:

> On Sun, Jul 26, 2009 at 9:09 AM, Mike Shaver wrote:
>> On Sun, Jul 26, 2009 at 5:15 AM, Keryx Web wrote:
>>> My analogy was simply this: Just like it makes sense for a JavaScript lint
>>> tool to enforce semi-colons, it makes sense for an HTML conformance checker
>>> to enforce quotation marks.
>>
>> A lint tool is not a conformance checker.  Your proposal here is
>> analogous to removing ASI from ECMAScript, such that a program which
>> relied on it would not be conformant.
>>
>> I recommend that you find an HTML guru of the same stature as
>> Crockford in the JS community, and convince her to write a lint tool
>> which forbids unquoted attribute values.  Once you have that, you can
>> (attempt to) popularize that style via evangelism for the lint tool,
>> rather than trying to foist your stylistic preferences -- which, as it
>> happens, I share -- onto the world via spec requirements.
>
> The more I think about it, the more I'm intrigued by Rob Sayre's idea
> of completely removing the definition of what is "conforming". Let the
> spec define UA (or HTML consumer) behavior, and let lint tools fight
> out best practices for authoring.


I was intrigued by this idea as well, but Henri Sivonen raised an  
important point that, to a significant extent, changed my mind. A Web  
content development toolchain will often include markup generators, as  
well as validation as part of QA. With a centrally defined notion of  
markup conformance, markup generators can seek to produce content that  
meets the conformance rules, while validators can make sure to check  
the conformance rules as a baseline. This makes it more practical to  
swap out parts of the toolchain. Otherwise, switching either  
validators or markup generators would be likely to produce a flood of  
errors, which would make the switching costs fairly high. Thus, there  
is an interoperability benefit to defining at least a baseline core of  
conformance rules. It's not for interoperability between content and  
user agents, but for interoperability between content generators and  
markup checkers.


That being said, validators can and should compete on the basis of  
providing additional useful warnings. To build that kind of ecosystem  
doesn't require the removal of markup conformance. JavaScript, C and C++  
are examples of languages where conforming syntax is strictly  
defined, yet tools are available that do additional static analysis  
for both style and correctness. For example, GCC and MSVC have very  
different sets of C++ warnings, but the fact that syntax errors and  
certain mandatory warnings are defined by the C++ spec makes it easier  
to move code from one to the other, while leaving them room to compete  
on quality and usefulness of optional warnings, among other things.


So, in conclusion, having a baseline for correct syntax may actually  
make it easier to develop an ecosystem of style-checking tools.  
However, this makes it important to keep the core set of syntax errors  
relatively minimal. I'm not sure HTML5 as currently drafted entirely  
hits that balance, but mandating optional tags or requiring double  
quotes on attributes would be a move in the wrong direction.


Regards,
Maciej



Re: [whatwg] Make quoted attributes a conformance criterion

2009-07-26 Thread Jonas Sicking
On Sun, Jul 26, 2009 at 9:09 AM, Mike Shaver wrote:
> On Sun, Jul 26, 2009 at 5:15 AM, Keryx Web wrote:
>> My analogy was simply this: Just like it makes sense for a JavaScript lint
>> tool to enforce semi-colons, it makes sense for an HTML conformance checker
>> to enforce quotation marks.
>
> A lint tool is not a conformance checker.  Your proposal here is
> analogous to removing ASI from ECMAScript, such that a program which
> relied on it would not be conformant.
>
> I recommend that you find an HTML guru of the same stature as
> Crockford in the JS community, and convince her to write a lint tool
> which forbids unquoted attribute values.  Once you have that, you can
> (attempt to) popularize that style via evangelism for the lint tool,
> rather than trying to foist your stylistic preferences -- which, as it
> happens, I share -- onto the world via spec requirements.

The more I think about it, the more I'm intrigued by Rob Sayre's idea
of completely removing the definition of what is "conforming". Let the
spec define UA (or HTML consumer) behavior, and let lint tools fight
out best practices for authoring.

/ Jonas



Re: [whatwg] Make quoted attributes a conformance criterion

2009-07-26 Thread And Clover

Keryx Web wrote:

> I think I've stated my case by now. So until I hear from Ian (who writes
> the spec) or Henri, who is authoring the validator, I think we've
> reached the end of this discussion.


I think we reached that point some time ago. :-)

I wouldn't hold your breath for acceptance. HTML5 is, for better or for 
worse*, a tag soup language. It codifies many existing practices: good, 
bad, and arguable. If you want a language with simple, sane markup rules 
that you can plausibly teach, the answer is simple: XHTML[5].


(*: I would argue for worse, but that argument was held and lost a long 
time ago.)


--
And Clover
mailto:a...@doxdesk.com
http://www.doxdesk.com/



Re: [whatwg] Make quoted attributes a conformance criterion

2009-07-26 Thread Aryeh Gregor
On Sun, Jul 26, 2009 at 5:15 AM, Keryx Web wrote:
> Who is talking about substitution? I am not talking about server side
> scripting practices as a whole. I said that escaping is no substitution for
> using quotes, since one can not expect developers to escape space
> characters. That's all.

Since you're escaping anyway, you can just have the escaping function
add the quotes (if needed).  So the issue won't arise.

> And I think adding quotes is better handled in the presentation logic, than
> in the business logic. It is more the responsibility of the front end
> engineer, than of the back end developer.

Why?  If the escaping function doesn't add the quotes, you run into
the possibility of a situation where the front-end developer omits the
quotes, and nothing bad happens until a value with spaces is used --
since regardless of best practices or the advice of conformance
checkers, browsers *will* accept unquoted values without complaint.

If the escaping function does add the quotes, on the other hand, then
the worst the front-end developer can do would be to add extra quotes.
 That would either cause the value to be empty (e.g. id=""foo""), or
be treated as invalid (e.g. style="'color:red'"), or work but have
extra quotes in it (e.g. title="'Hello'"), in any case much more
easily noticeable.  Having the escaping function add the quotes is
thus a better policy.
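
The comparison above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `escape_only` and `escape_and_quote` and the `<input>` templates are hypothetical, not taken from anyone's actual code, and `html.escape` stands in for whatever escaping routine the back end really uses.

```python
from html import escape

def escape_only(value):
    # Hypothetical policy 1: escape special characters, leave quoting to the template.
    return escape(value, quote=True)

def escape_and_quote(value):
    # Hypothetical policy 2: escape and always wrap the result in double quotes.
    return '"' + escape(value, quote=True) + '"'

value = "login name"

# Template author omitted the quotes, escaping function does not add them:
# the markup is accepted silently, but the value is cut at the space.
silently_broken = "<input value={}>".format(escape_only(value))

# Template author omitted the quotes, escaping function adds them: fine.
fine = "<input value={}>".format(escape_and_quote(value))

# Template author added quotes as well: doubled quotes, visibly wrong.
visibly_wrong = '<input value="{}">'.format(escape_and_quote(value))

for markup in (silently_broken, fine, visibly_wrong):
    print(markup)
```

The point of the sketch is that only the first combination fails quietly; the doubled-quote case is the kind of mistake that shows up on the first test.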

> So, you are using python, a language that enforces specific indentation to
> define block statements, to say that JSLint has got it all wrong? Douglas
> Crockford, and every other JavaScript guru I know, have identified using
> semi-colons as best practice - for JavaScript.

Roughly every Python guru out there identifies using spaces instead of
tabs as best practice in Python.  That doesn't mean it has any
intrinsic merit.  It's just a stylistic convention.

> I think I've stated my case by now. So until I hear from Ian (who writes the
> spec) or Henri, who is authoring the validator, I think we've reached the
> end of this discussion.

Agreed.


Re: [whatwg] Make quoted attributes a conformance criterion

2009-07-26 Thread Mike Shaver
On Sun, Jul 26, 2009 at 5:15 AM, Keryx Web wrote:
> My analogy was simply this: Just like it makes sense for a JavaScript lint
> tool to enforce semi-colons, it makes sense for an HTML conformance checker
> to enforce quotation marks.

A lint tool is not a conformance checker.  Your proposal here is
analogous to removing ASI from ECMAScript, such that a program which
relied on it would not be conformant.

I recommend that you find an HTML guru of the same stature as
Crockford in the JS community, and convince her to write a lint tool
which forbids unquoted attribute values.  Once you have that, you can
(attempt to) popularize that style via evangelism for the lint tool,
rather than trying to foist your stylistic preferences -- which, as it
happens, I share -- onto the world via spec requirements.

Mike


Re: [whatwg] Make quoted attributes a conformance criterion

2009-07-26 Thread Mike Shaver
On Sun, Jul 26, 2009 at 5:10 AM, Keryx Web wrote:
> Mike, I know what you are doing at Mozilla, and have a ton of respect for
> you. But I fail to see how you could misunderstand my analogy to JSLint. Or
> do you suggest that Doug Crockford should drop manual semi-colon insertion
> from that tool?

I'm suggesting that a tool which produces an error report for all use
of HTML event handler attributes is enforcing Mr Crockford's style,
and not just accepted "best practices", making its requiring of a
trailing semi in such event handler attributes rather
non-authoritative.

Mike


Re: [whatwg] Make quoted attributes a conformance criterion

2009-07-26 Thread Keryx Web

On 2009-07-26 03:56, Aryeh Gregor wrote:

> There's no substitute for real escaping here.  What if the developer
> decided that a better value is something like:
>
> Please enter your "login" name here


Who is talking about substitution? I am not talking about server side 
scripting practices as a whole. I said that escaping is no substitution 
for using quotes, since one can not expect developers to escape space 
characters. That's all.



> Or whatever.  If you're not sure what the input is, you have to
> programmatically escape it.  Once you're programmatically escaping it,
> your escaping function can add the quotes, and can add them only when
> necessary (or always, or whatever you prefer).


And I think adding quotes is better handled in the presentation logic, 
than in the business logic. It is more the responsibility of the front 
end engineer, than of the back end developer.


But it really does not matter. There should be an easy way to enforce 
it, no matter what code generates the quotation marks. I don't think 
such an enforcement is a panacea to all problems, but it's a small help 
for some problems, quite common for rookies, though.


Please do not argue against it on the failed merits of not being able to 
substitute indata filtering and output escaping. Those factors are not 
part of this equation.



>> I think my suggestion is totally analogous to e.g. semi-colon insertion in
>> ECMAScript. JSLint demands that those should be present, and I've yet to
>> hear anyone say "it's a matter of style".
>
> Well, I'm going to say it's a matter of style there, too.  The
> dominant convention in Python, for instance, is to omit semicolons.


So, you are using python, a language that enforces specific indentation 
to define block statements, to say that JSLint has got it all wrong? 
Douglas Crockford, and every other JavaScript guru I know, have 
identified using semi-colons as best practice - for JavaScript.


My analogy was simply this: Just like it makes sense for a JavaScript 
lint tool to enforce semi-colons, it makes sense for an HTML conformance 
checker to enforce quotation marks.


Always? No, not for boolean attributes and *perhaps* not for attributes 
that by design never can take anything but a simple keyword or integer 
as a value.


I think I've stated my case by now. So until I hear from Ian (who writes 
the spec) or Henri, who is authoring the validator, I think we've 
reached the end of this discussion.



--
Keryx Web (Lars Gunther)
http://keryx.se/
http://twitter.com/itpastorn/
http://itpastorn.blogspot.com/


Re: [whatwg] Make quoted attributes a conformance criterion

2009-07-26 Thread Keryx Web

On 2009-07-26 06:56, Mike Shaver wrote:

> And yet, tons of inline event handler attribute values on the web omit
> their trailing semicolons...as a matter of style.


Yes, one of 1000 perhaps violates JSLint rules on purpose. But I'd wager 
my right arm that the overwhelming majority using inline event handlers 
simply do not know or care about best practices. They are following bad 
and outdated advice. They probably browser sniff too, or still check for 
support for document.layers. Or have Visual Studio generate all the 
ghastly code using default settings, including 40 kB viewstates. And use 
font tags.


Mike, I know what you are doing at Mozilla, and have a ton of respect 
for you. But I fail to see how you could misunderstand my analogy to 
JSLint. Or do you suggest that Doug Crockford should drop manual 
semi-colon insertion from that tool?


Commenting on this thread as a whole now:

Three kinds of attribute values have been identified:
- Those that can have multiple words, e.g. class, alt, title, value...
- Those that can have just one word or an integer, e.g. width, length...
- Boolean attributes, that can be shortened in HTML.

Today teachers like me use (false) XHTML to enforce quotation marks for 
all three cases, because we've seen the pedagogic benefit (and frankly 
grown tired of looking over the shoulders of our students and saying for 
the millionth time "you've forgotten to quote that alt attribute value").


I actually thought that having a tool that could enforce XHTML-ish rules 
for the first (and perhaps second) category above, while still leaving 
boolean attributes alone, would be seen as a benefit, not as a burden.
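
A rough sketch of what such a tool could look like, in Python and using only the standard library's `html.parser`. Everything here is an assumption made for illustration: the class name `UnquotedAttrLint`, the `lint` helper, and the regular expression are hypothetical and are not how any existing validator works. It warns about every unquoted value (the first two categories) and leaves shortened boolean attributes alone.

```python
import re
from html.parser import HTMLParser

# name=value pairs inside a start tag; the value is double-quoted,
# single-quoted, or a bare (unquoted) run of characters.
ATTR = re.compile(r"""([^\s"'<>/=]+)\s*=\s*("[^"]*"|'[^']*'|[^\s"'>]+)""")

class UnquotedAttrLint(HTMLParser):
    """Hypothetical lint pass: collect attributes whose value is unquoted."""

    def __init__(self):
        super().__init__()
        self.warnings = []  # (line, tag, attribute name)

    def handle_starttag(self, tag, attrs):
        line, _ = self.getpos()
        # Inspect the raw tag text, since the parsed attrs no longer
        # record whether a value was quoted.
        for match in ATTR.finditer(self.get_starttag_text()):
            name, value = match.groups()
            if value[0] not in "\"'":
                self.warnings.append((line, tag, name))

def lint(markup):
    checker = UnquotedAttrLint()
    checker.feed(markup)
    checker.close()
    return checker.warnings

for line, tag, name in lint('<img src=a.png alt="A cat" width=10><input disabled>'):
    print(f"line {line}: <{tag}> attribute {name!r} is not quoted")
```

A boolean attribute written without `=` never matches the pattern, so `disabled` above produces no warning. Exempting keyword or integer attributes such as `width` would need an extra allow-list, which this sketch does not attempt.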



--
Keryx Web (Lars Gunther)
http://keryx.se/
http://twitter.com/itpastorn/
http://itpastorn.blogspot.com/


Re: [whatwg] Make quoted attributes a conformance criterion

2009-07-25 Thread Michael Kozakewich

From: "Mike Shaver" 
Sent: Saturday, July 25, 2009 11:56 PM
To: "Keryx Web" 
Cc: 
Subject: Re: [whatwg] Make quoted attributes a conformance criterion


> On Sat, Jul 25, 2009 at 5:47 AM, Keryx Web wrote:
>> I think my suggestion is totally analogous to e.g. semi-colon insertion in
>> ECMAScript. JSLint demands that those should be present, and I've yet to
>> hear anyone say "it's a matter of style". Omitting semi-colons is a known
>> cause of trouble in ECMAScript.
>
> And yet, tons of inline event handler attribute values on the web omit
> their trailing semicolons...as a matter of style.
>
> Mike


As someone with an eye for language, I can say that's not really a matter of 
style. We'll drop the final semicolon in inline JavaScript because we all 
know it's never necessary, no matter the situation.
It's true that these ideas do make themselves at home. What most people 
don't seem to grasp, however, is that it has everything to do with how they 
learned the language. The human mind is a very adaptable and intuitive 
thing, and it tries hard to optimize. If a language doesn't require that 
something exists, most people will skip it. This can lead to a very degraded 
language, such as the type of English you see in chat channels, as well as 
something simpler, like the absence of quotes in HTML.


As well, you'll never find habitual omission of quotes from programmers of 
most other languages, because they're required -- HTML is an odd man out. 
It's made this way to be easier for most people to learn and use, but it 
takes up a greater amount of browser overhead and still leaves some errors.


The root of the problem is this: Requiring quotes, especially after all 
these people have learned about HTML and have learned to code without 
quotes, isn't backwards-compatible. Browsers already use their resources to 
parse bad code, and so it's also too late to try forcing well-formedness on 
those.
At the same time, quotes -- if the writers learn to always quote without 
thought -- decrease errors and also normalize the language.
The only answer, then, is to deprecate not-quoting: Add quotes to the spec 
examples, state that quotes aren't needed but are best-practice, add 
'unquoted' warnings to the validator, and teach new web developers to always 
quote attributes.


In the future, we might be able to resurrect this debate with more 
usefulness. Until then, our options are to either do the above or leave it 
as it is. Adding quotes is more sustainable in the long run, unless it's 
shown that coders really do have a hard time learning it. HTML must stay 
easy, above all (most) else. (I argue that quoting all the time is easier 
than never quoting, but you'd really have to ask the students.)



Re: [whatwg] Make quoted attributes a conformance criterion

2009-07-25 Thread Mike Shaver
On Sat, Jul 25, 2009 at 5:47 AM, Keryx Web wrote:
> I think my suggestion is totally analogous to e.g. semi-colon insertion in
> ECMAScript. JSLint demands that those should be present, and I've yet to
> hear anyone say "it's a matter of style". Omitting semi-colons is a known
> cause of trouble in ECMAScript.

And yet, tons of inline event handler attribute values on the web omit
their trailing semicolons...as a matter of style.

Mike


Re: [whatwg] Make quoted attributes a conformance criterion

2009-07-25 Thread Aryeh Gregor
On Sat, Jul 25, 2009 at 5:47 AM, Keryx Web wrote:
> Consider this PHP template:
>
> 
>
> Value is the suggested text, if no user data is available it says "login".
> Otherwise it's the user's login name (no spaces allowed). All is well.
>
> One day a developer decides that "login name" is a better value, and hard
> codes it into the PHP business logic, producing this HTML:
>
> 

There's no substitute for real escaping here.  What if the developer
decided that a better value is something like:

Please enter your "login" name here

Or whatever.  If you're not sure what the input is, you have to
programmatically escape it.  Once you're programmatically escaping it,
your escaping function can add the quotes, and can add them only when
necessary (or always, or whatever you prefer).
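
As a minimal sketch of such an escaping function, assuming Python and its standard `html.escape`: the name `attr_value` and the `always` flag are hypothetical, and the set of characters that force quoting (whitespace, quotes, `=`, `<`, `>`, backtick, plus the empty string) follows the HTML syntax rules for unquoted attribute values as I understand them.

```python
import re
from html import escape

# Characters that cannot appear in an unquoted attribute value.
_NEEDS_QUOTES = re.compile(r"""[\s"'=<>`]""")

def attr_value(value, always=False):
    """Escape a value for use as an HTML attribute value, adding double
    quotes when they are required (or unconditionally if always=True)."""
    escaped = escape(value, quote=True)
    if always or value == "" or _NEEDS_QUOTES.search(value):
        return '"' + escaped + '"'
    return escaped

for sample in ("login", "login name", 'Please enter your "login" name here'):
    print("<input value=" + attr_value(sample) + ">")
```

With this in place the template never writes quotes itself, so it does not matter whether the value happens to contain a space or a quotation mark.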

> I think my suggestion is totally analogous to e.g. semi-colon insertion in
> ECMAScript. JSLint demands that those should be present, and I've yet to
> hear anyone say "it's a matter of style".

Well, I'm going to say it's a matter of style there, too.  The
dominant convention in Python, for instance, is to omit semicolons.


Re: [whatwg] Make quoted attributes a conformance criterion

2009-07-25 Thread Keryx Web

On 2009-07-25 05:55, Bil Corry wrote:

> it's still a best practice to encode/sanitize the value


Speaking (once again) as someone who has had students in this position a 
lot of times (and myself a few times) this does not cover all use cases.


Consider this PHP template:



Value is the suggested text, if no user data is available it says 
"login". Otherwise it's the user's login name (no spaces allowed). All is 
well.


One day a developer decides that "login name" is a better value, and 
hard codes it into the PHP business logic, producing this HTML:




All of a sudden you *effectively* have produced this:



And it stops working.

Now, which would have been the easier way to avoid this: URL-encoding 
hard-coded variable data, or adding two quotation marks to the template?
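
The failure described above can be reproduced with Python's standard `html.parser`. This is a sketch only: the `<input>` markup below is a hypothetical stand-in, not the template from this message, and the `AttrDump` class and `parse_attrs` helper are names made up for the illustration.

```python
from html.parser import HTMLParser

class AttrDump(HTMLParser):
    """Record the attributes of every start tag the parser sees."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        self.seen.append((tag, attrs))

def parse_attrs(markup):
    parser = AttrDump()
    parser.feed(markup)
    parser.close()
    return parser.seen

# Hypothetical output of an unquoted template once the value contains a space.
unquoted = parse_attrs("<input type=text value=login name>")
# The same value with two quotation marks added in the template.
quoted = parse_attrs('<input type=text value="login name">')

print(unquoted)
print(quoted)
```

In the unquoted case the parser should report a `value` of just `login` followed by a separate, valueless `name` attribute, which is the "effectively produced" markup referred to above; in the quoted case the whole string survives as one value.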


Bottom line:

I think my suggestion is totally analogous to e.g. semi-colon insertion 
in ECMAScript. JSLint demands that those should be present, and I've yet 
to hear anyone say "it's a matter of style". Omitting semi-colons is a 
known cause of trouble in ECMAScript. Omitting quotation marks is a 
known cause of trouble in HTML.


Choosing between robustness and saving a few bytes, one should always 
opt for the former.


--
Keryx Web (Lars Gunther)
http://keryx.se/
http://twitter.com/itpastorn/
http://itpastorn.blogspot.com/


Re: [whatwg] Make quoted attributes a conformance criterion

2009-07-23 Thread Keryx Web

On 2009-07-23 15:32, Kornel wrote:

> On 23 Jul 2009, at 13:35, Keryx Web wrote:
>
>> I'd say it is safe to say that using quotation marks for attribute
>> values, always, except perhaps for collapsed, boolean attributes, has
>> been regarded as best practice for a long time now. Speaking as an
>> instructor for newbies, enforcing quotation marks has proven its value
>> countless times for me and my students.
>
> It's not a clear benefit. Unpaired quotation marks can also be a
> *source* of errors, which wouldn't happen without them:


Having dealt with other people's code a lot (being a teacher) I'd say 
that problem is exceptionally rare and very easy to spot, put in 
contrast with the ones arising from not using quotation marks. The 
proportion is like 100 to 1.


Ergo: In real life the benefit is very clear.

As for conformance criteria only being about unambiguous parsing: If 
that is the case we do not need them at all any more, since HTML 5 
defines how to handle badly written markup.


Just like validation in HTML 4 in reality is more of a benefit to the 
developer than the browser, since 99% of all errors actually are benign, 
so conformance criteria in HTML 5 are supposed to primarily help web 
developers - and authoring tool developers.


And speaking directly to Ian H, a few years ago you said on this list 
that you'd love for the spec to help teachers as much as possible 
(within the limits of being a spec). My suggested example markup changes 
are definitely such a help.


--
Keryx Web (Lars Gunther)
http://keryx.se/
http://twitter.com/itpastorn/
http://itpastorn.blogspot.com/