Re: [whatwg] Make quoted attributes a conformance criterion
On Mon, Jul 27, 2009 at 2:53 AM, Jonas Sicking wrote:
> The more I think about it, the more I'm intrigued by Rob Sayre's idea
> of completely removing the definition of what is "conforming". Let the
> spec define UA (or HTML consumer) behavior, and let lint tools fight
> out best practices for authoring.

Besides the point Maciej already made, there is another aspect in favor of good conformance definitions: web evolution. Some of the issues, like attribute quoting, may be stylistic, but there are many where there is a clear boundary between what's right and what's wrong. For example, is clearly wrong; but there are too many legacy webpages that use it; so browsers need to support it to render all that content. If we leave "conformance" out of the spec, and only define what browsers are supposed to do, we'd be bringing back to the web, even for new websites, and this would be clearly wrong (we are not speaking of assistive technologies only, but many pages that rely on end up unreadable even in common browsers).

Someone could argue that this is just a matter of best practice or style, and hence could be handled by lint tools; but conformance criteria in the specification carry a lot more weight than any lint tool. While it may be OK to leave the more arguable aspects to these tools, things that are obviously wrong should be clearly defined as non-conformant by the spec.

Just my two cents.

Regards,
Eduard Pascual
Re: [whatwg] Make quoted attributes a conformance criterion
On Jul 26, 2009, at 6:53 PM, Jonas Sicking wrote:
> On Sun, Jul 26, 2009 at 9:09 AM, Mike Shaver wrote:
>> On Sun, Jul 26, 2009 at 5:15 AM, Keryx Web wrote:
>>> My analogy was simply this: Just like it makes sense for a JavaScript lint tool to enforce semi-colons, it makes sense for an HTML conformance checker to enforce quotation marks.
>>
>> A lint tool is not a conformance checker. Your proposal here is analogous to removing ASI from ECMAScript, such that a program which relied on it would not be conformant.
>>
>> I recommend that you find an HTML guru of the same stature as Crockford in the JS community, and convince her to write a lint tool which forbids unquoted attribute values. Once you have that, you can (attempt to) popularize that style via evangelism for the lint tool, rather than trying to foist your stylistic preferences -- which, as it happens, I share -- onto the world via spec requirements.
>
> The more I think about it, the more I'm intrigued by Rob Sayre's idea of completely removing the definition of what is "conforming". Let the spec define UA (or HTML consumer) behavior, and let lint tools fight out best practices for authoring.

I was intrigued by this idea as well, but Henri Sivonen raised an important point that, to a significant extent, changed my mind.

A Web content development toolchain will often include markup generators, as well as validation as part of QA. With a centrally defined notion of markup conformance, markup generators can seek to produce content that meets the conformance rules, while validators can make sure to check the conformance rules as a baseline. This makes it more practical to swap out parts of the toolchain. Otherwise, switching either validators or markup generators would be likely to produce a flood of errors, which would make the switching costs fairly high. Thus, there is an interoperability benefit to defining at least a baseline core of conformance rules.
It's not for interoperability between content and user agents, but for interoperability between content generators and markup checkers. That being said, validators can and should compete on the basis of providing additional useful warnings. To build that kind of ecosystem doesn't require the removal of markup conformance. JavaScript, C and C++ are examples of languages where conforming syntax is strictly defined, yet tools are available that do additional static analysis for both style and correctness. For example, GCC and MSVC have very different sets of C++ warnings, but the fact that syntax errors and certain mandatory warnings are defined by the C++ spec makes it easier to move code from one to the other, while leaving them room to compete on quality and usefulness of optional warnings, among other things. So, in conclusion, having a baseline for correct syntax may actually make it easier to develop an ecosystem of style-checking tools. However, this makes it important to keep the core set of syntax errors relatively minimal. I'm not sure HTML5 as currently drafted entirely hits that balance, but mandating optional tags or requiring double quotes on attributes would be a move in the wrong direction. Regards, Maciej
Re: [whatwg] Make quoted attributes a conformance criterion
On Sun, Jul 26, 2009 at 9:09 AM, Mike Shaver wrote:
> On Sun, Jul 26, 2009 at 5:15 AM, Keryx Web wrote:
>> My analogy was simply this: Just like it makes sense for a JavaScript lint
>> tool to enforce semi-colons, it makes sense for an HTML conformance checker
>> to enforce quotation marks.
>
> A lint tool is not a conformance checker. Your proposal here is
> analogous to removing ASI from ECMAScript, such that a program which
> relied on it would not be conformant.
>
> I recommend that you find an HTML guru of the same stature as
> Crockford in the JS community, and convince her to write a lint tool
> which forbids unquoted attribute values. Once you have that, you can
> (attempt to) popularize that style via evangelism for the lint tool,
> rather than trying to foist your stylistic preferences -- which, as it
> happens, I share -- onto the world via spec requirements.

The more I think about it, the more I'm intrigued by Rob Sayre's idea of completely removing the definition of what is "conforming". Let the spec define UA (or HTML consumer) behavior, and let lint tools fight out best practices for authoring.

/ Jonas
Re: [whatwg] Make quoted attributes a conformance criterion
Keryx Web wrote:
> I think I've stated my case by now. So until I hear from Ian (who writes the spec) or Henri, who is authoring the validator, I think we've reached the end of this discussion.

I think we reached that point some time ago. :-)

I wouldn't hold your breath for acceptance. HTML5 is, for better or for worse*, a tag soup language. It codifies many existing practices: good, bad, and arguable. If you want a language with simple, sane markup rules that you can plausibly teach, the answer is simple: XHTML[5].

(*: I would argue for worse, but that argument was held and lost a long time ago.)

--
And Clover
mailto:a...@doxdesk.com http://www.doxdesk.com/
Re: [whatwg] Make quoted attributes a conformance criterion
On Sun, Jul 26, 2009 at 5:15 AM, Keryx Web wrote:
> Who is talking about substitution? I am not talking about server side
> scripting practices as a whole. I said that escaping is no substitution for
> using quotes, since one can not expect developers to escape space
> characters. That's all.

Since you're escaping anyway, you can just have the escaping function add the quotes (if needed). So the issue won't arise.

> And I think adding quotes is better handled in the presentation logic, than
> in the business logic. It is more the responsibility of the front end
> engineer, than of the back end developer.

Why? If the escaping function doesn't add the quotes, you run into the possibility of a situation where the front-end developer omits the quotes, and nothing bad happens until a value with spaces is used -- since regardless of best practices or the advice of conformance checkers, browsers *will* accept unquoted values without complaint.

If the escaping function does add the quotes, on the other hand, then the worst the front-end developer can do would be to add extra quotes. That would either cause the value to be empty (e.g. id=""foo""), or be treated as invalid (e.g. style="'color:red'"), or work but have extra quotes in it (e.g. title="'Hello'"), in any case much more easily noticeable. Having the escaping function add the quotes is thus a better policy.

> So, you are using python, a language that enforces specific indentation to
> define block statements, to say that JSLint has got it all wrong? Douglas
> Crockford, and every other JavaScript guru I know, have identified using
> semi-colons as best practice - for JavaScript.

Roughly every Python guru out there identifies using spaces instead of tabs as best practice in Python. That doesn't mean it has any intrinsic merit. It's just a stylistic convention.

> I think I've stated my case by now. So until I hear from Ian (who writes the
> spec) or Henri, who is authoring the validator, I think we've reached the
> end of this discussion.

Agreed.
Re: [whatwg] Make quoted attributes a conformance criterion
On Sun, Jul 26, 2009 at 5:15 AM, Keryx Web wrote:
> My analogy was simply this: Just like it makes sense for a JavaScript lint
> tool to enforce semi-colons, it makes sense for an HTML conformance checker
> to enforce quotation marks.

A lint tool is not a conformance checker. Your proposal here is analogous to removing ASI from ECMAScript, such that a program which relied on it would not be conformant.

I recommend that you find an HTML guru of the same stature as Crockford in the JS community, and convince her to write a lint tool which forbids unquoted attribute values. Once you have that, you can (attempt to) popularize that style via evangelism for the lint tool, rather than trying to foist your stylistic preferences -- which, as it happens, I share -- onto the world via spec requirements.

Mike
Re: [whatwg] Make quoted attributes a conformance criterion
On Sun, Jul 26, 2009 at 5:10 AM, Keryx Web wrote:
> Mike, I know what you are doing at Mozilla, and have a ton of respect for
> you. But I fail to see how you could misunderstand my analogy to JSLint. Or
> do you suggest that Doug Crockford should drop manual semi-colon insertion
> from that tool?

I'm suggesting that a tool which produces an error report for all use of HTML event handler attributes is enforcing Mr Crockford's style, and not just accepted "best practices", which makes its requirement of a trailing semicolon in such event handler attributes rather non-authoritative.

Mike
Re: [whatwg] Make quoted attributes a conformance criterion
On 2009-07-26 03:56, Aryeh Gregor wrote:
> There's no substitute for real escaping here. What if the developer decided that a better value is something like: Please enter your "login" name here

Who is talking about substitution? I am not talking about server side scripting practices as a whole. I said that escaping is no substitution for using quotes, since one can not expect developers to escape space characters. That's all.

> Or whatever. If you're not sure what the input is, you have to programmatically escape it. Once you're programmatically escaping it, your escaping function can add the quotes, and can add them only when necessary (or always, or whatever you prefer).

And I think adding quotes is better handled in the presentation logic, than in the business logic. It is more the responsibility of the front end engineer, than of the back end developer.

But it really does not matter. There should be an easy way to enforce it, no matter what code generates the quotation marks. I don't think such an enforcement is a panacea to all problems, but it's a small help for some problems, quite common for rookies, though. Please do not argue against it on the failed merits of not being able to substitute input filtering and output escaping. Those factors are not part of this equation.

>> I think my suggestion is totally analogous to e.g. semi-colon insertion in ECMAScript. JSLint demands that those should be present, and I've yet to hear anyone say "it's a matter of style".
>
> Well, I'm going to say it's a matter of style there, too. The dominant convention in Python, for instance, is to omit semicolons.

So, you are using Python, a language that enforces specific indentation to define block statements, to say that JSLint has got it all wrong? Douglas Crockford, and every other JavaScript guru I know, have identified using semi-colons as best practice - for JavaScript.

My analogy was simply this: Just like it makes sense for a JavaScript lint tool to enforce semi-colons, it makes sense for an HTML conformance checker to enforce quotation marks. Always? No, not for boolean attributes and *perhaps* not for attributes that by design never can take anything but a simple keyword or integer as a value.

I think I've stated my case by now. So until I hear from Ian (who writes the spec) or Henri, who is authoring the validator, I think we've reached the end of this discussion.

--
Keryx Web (Lars Gunther)
http://keryx.se/ http://twitter.com/itpastorn/ http://itpastorn.blogspot.com/
Re: [whatwg] Make quoted attributes a conformance criterion
On 2009-07-26 06:56, Mike Shaver wrote:
> And yet, tons of inline event handler attribute values on the web omit their trailing semicolons...as a matter of style.

Yes, one of 1000 perhaps violates JSLint rules on purpose. But I'd wager my right arm that the overwhelming majority using inline event handlers simply do not know or care about best practices. They are following bad and outdated advice. They probably browser sniff too, or still check for support for document.layers. Or have Visual Studio generate all the ghastly code using default settings, including 40 kB viewstates. And use font tags.

Mike, I know what you are doing at Mozilla, and have a ton of respect for you. But I fail to see how you could misunderstand my analogy to JSLint. Or do you suggest that Doug Crockford should drop manual semi-colon insertion from that tool?

Commenting on this thread as a whole now: Three kinds of attribute values have been identified:

- Those that can have multiple words, e.g. class, alt, title, value...
- Those that can have just one word or an integer, e.g. width, length...
- Boolean attributes, that can be shortened in HTML.

Today teachers like me use (false) XHTML to enforce quotation marks for all three cases, because we've seen the pedagogic benefit (and frankly grown tired of looking over the shoulders of our students and saying for the millionth time "you've forgotten to quote that alt attribute value").

I actually thought that having a tool that could enforce XHTML-ish rules for the first (and perhaps second) category above, while still leaving boolean attributes alone, would be seen as a benefit, not as a burden.

--
Keryx Web (Lars Gunther)
http://keryx.se/ http://twitter.com/itpastorn/ http://itpastorn.blogspot.com/
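A rough idea of the kind of tool described in the message above, as a minimal Python sketch. Everything here is an assumption made for illustration: the function name `find_unquoted`, the short `BOOLEAN_ATTRS` list, and the regex-based scanning (which will mis-handle a `>` inside a quoted value) are not taken from any validator or from the HTML5 draft.

```python
import re

# Hypothetical, deliberately incomplete list of boolean attributes to leave alone.
BOOLEAN_ATTRS = {"checked", "disabled", "selected", "readonly", "multiple"}

# A start tag, naively: "<", a letter, then anything up to the next ">".
TAG = re.compile(r"<[a-zA-Z][^>]*>")

# Quoted attribute values, so their contents can be blanked before scanning.
QUOTED = re.compile(r'"[^"]*"|\'[^\']*\'')

# name=value where the value does not begin with a quote character.
UNQUOTED = re.compile(r"""\s([a-zA-Z-]+)\s*=\s*([^\s"'>=][^\s>]*)""")


def find_unquoted(markup):
    """Return (attribute, value) pairs whose values are written without quotes."""
    hits = []
    for tag in TAG.findall(markup):
        # Replace every quoted value with "" so text inside quotes is ignored.
        blanked = QUOTED.sub('""', tag)
        for name, value in UNQUOTED.findall(blanked):
            if name.lower() not in BOOLEAN_ATTRS:
                hits.append((name, value))
    return hits


if __name__ == "__main__":
    sample = '<img src=logo.png alt="Company logo" width=40>'
    for name, value in find_unquoted(sample):
        print("warning: unquoted value for %s: %s" % (name, value))
```

As written, this flags the second category (single keywords and integers such as `width=40`) as well as the first; restricting it to multi-word attributes would only take an allow-list of attribute names.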
Re: [whatwg] Make quoted attributes a conformance criterion
From: "Mike Shaver"
Sent: Saturday, July 25, 2009 11:56 PM
To: "Keryx Web"
Cc:
Subject: Re: [whatwg] Make quoted attributes a conformance criterion

> On Sat, Jul 25, 2009 at 5:47 AM, Keryx Web wrote:
>> I think my suggestion is totally analogous to e.g. semi-colon insertion in ECMAScript. JSLint demands that those should be present, and I've yet to hear anyone say "it's a matter of style". Omitting semi-colons is a known cause of trouble in ECMAScript.
>
> And yet, tons of inline event handler attribute values on the web omit their trailing semicolons...as a matter of style.
>
> Mike

As someone with an eye for language, I can say that's not really a matter of style. We'll drop the final semicolon in inline JavaScript because we all know it's never necessary, no matter the situation. It's true that these ideas do make themselves at home. What most people don't seem to grasp, however, is that it has everything to do with how they learned the language. The human mind is a very adaptable and intuitive thing, and it tries hard to optimize. If a language doesn't require that something exists, most people will skip it. This can lead to a very degraded language, such as the type of English you see in chat channels, as well as something simpler, like the absence of quotes in HTML. As well, you'll never find habitual omission of quotes from programmers of most other languages, because they're required -- HTML is an odd man out. It's made this way to be easier for most people to learn and use, but it takes up a greater amount of browser overhead and still leaves some errors.

The root of the problem is this: Requiring quotes, especially after all these people have learned about HTML and have learned to code without quotes, isn't backwards-compatible. Browsers already use their resources to parse bad code, and so it's also too late to try forcing well-formedness on those.
At the same time, quotes -- if the writers learn to always quote without thought -- decrease errors and also normalize the language. The only answer, then, is to deprecate not-quoting: Add quotes to the spec examples, state that quotes aren't needed but are best-practice, add 'unquoted' warnings to the validator, and teach new web developers to always quote attributes. In the future, we might be able to resurrect this debate with more usefulness. Until then, our options are to either do the above or leave it as it is. Adding quotes is more sustainable in the long run, unless it's shown that coders really do have a hard time learning it. HTML must stay easy, above all (most) else. (I argue that quoting all the time is easier than never quoting, but you'd really have to ask the students.)
Re: [whatwg] Make quoted attributes a conformance criterion
On Sat, Jul 25, 2009 at 5:47 AM, Keryx Web wrote:
> I think my suggestion is totally analogous to e.g. semi-colon insertion in
> ECMAScript. JSLint demands that those should be present, and I've yet to
> hear anyone say "it's a matter of style". Omitting semi-colons is a known
> cause of trouble in ECMAScript.

And yet, tons of inline event handler attribute values on the web omit their trailing semicolons...as a matter of style.

Mike
Re: [whatwg] Make quoted attributes a conformance criterion
On Sat, Jul 25, 2009 at 5:47 AM, Keryx Web wrote:
> Consider this PHP template:
>
> Value is the suggested text, if no user data is available it says "login".
> Otherwise its the users login name (no spaces allowed). All is well.
>
> One day a developer decides that "login name" is a better value, and hard
> codes it into the PHP business logic, producing this HTML:
>

There's no substitute for real escaping here. What if the developer decided that a better value is something like: Please enter your "login" name here

Or whatever. If you're not sure what the input is, you have to programmatically escape it. Once you're programmatically escaping it, your escaping function can add the quotes, and can add them only when necessary (or always, or whatever you prefer).

> I think my suggestion is totally analogous to e.g. semi-colon insertion in
> ECMAScript. JSLint demands that those should be present, and I've yet to
> hear anyone say "it's a matter of style".

Well, I'm going to say it's a matter of style there, too. The dominant convention in Python, for instance, is to omit semicolons.
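An escaping function of the kind suggested above might look roughly like this in Python. It is a sketch under assumptions: the name `attr_value` and its `always_quote` flag are invented for illustration, and the set of characters that force quoting follows my reading of the HTML rules for unquoted attribute values (whitespace, `"`, `'`, `=`, `<`, `>`, backtick, or an empty value).

```python
import re
from html import escape

# Characters that cannot appear in an unquoted attribute value.
_NEEDS_QUOTES = re.compile(r"""[\s"'=<>`]""")


def attr_value(value, always_quote=False):
    """Escape a value for use in an HTML attribute, adding quotes itself.

    With always_quote=False the quotes are added only when the value
    could not safely be written unquoted.
    """
    text = str(value)
    escaped = escape(text, quote=True)  # escapes &, <, >, " and '
    if always_quote or text == "" or _NEEDS_QUOTES.search(text):
        return '"' + escaped + '"'
    return escaped


if __name__ == "__main__":
    # The template author writes value=... with no quotes of their own.
    for suggestion in ("login", "login name", 'Please enter your "login" name here'):
        print("<input type=text value=" + attr_value(suggestion) + ">")
```

Because the function owns the quoting decision, changing the suggested text from `login` to `login name` changes the generated markup from an unquoted to a quoted value without anyone touching the template.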
Re: [whatwg] Make quoted attributes a conformance criterion
On 2009-07-25 05:55, Bil Corry wrote:
> it's still a best practice to encode/sanitize the value

Speaking (once again) as someone who has had students in this position a lot of times (and myself a few times), this does not cover all use cases.

Consider this PHP template:

Value is the suggested text; if no user data is available it says "login". Otherwise it's the user's login name (no spaces allowed). All is well.

One day a developer decides that "login name" is a better value, and hard codes it into the PHP business logic, producing this HTML:

All of a sudden you *effectively* have produced this:

And it stops working.

Now, what would have been easier to avoid this? URL-encoding hard coded variable data, or adding two quotation marks to the template?

Bottom line: I think my suggestion is totally analogous to e.g. semi-colon insertion in ECMAScript. JSLint demands that those should be present, and I've yet to hear anyone say "it's a matter of style". Omitting semi-colons is a known cause of trouble in ECMAScript. Omitting quotation marks is a known cause of trouble in HTML.

Choosing between robustness and saving a few bytes, one should always opt for the former.

--
Keryx Web (Lars Gunther)
http://keryx.se/ http://twitter.com/itpastorn/ http://itpastorn.blogspot.com/
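The failure mode described in the message above can be reproduced with Python's standard `html.parser`. This is only an illustration: the markup strings below are made up for the sketch and are not the template from the message, and `parse_start_tag` is a hypothetical helper.

```python
from html.parser import HTMLParser


class FirstTagAttrs(HTMLParser):
    """Record the attribute list of the first start tag seen."""

    def __init__(self):
        super().__init__()
        self.first_attrs = None

    def handle_starttag(self, tag, attrs):
        if self.first_attrs is None:
            self.first_attrs = attrs


def parse_start_tag(markup):
    """Return the (name, value) attribute pairs of the first start tag."""
    parser = FirstTagAttrs()
    parser.feed(markup)
    parser.close()
    return parser.first_attrs


if __name__ == "__main__":
    # Unquoted: the value ends at the space, and "name" becomes a
    # separate, valueless attribute.
    print(parse_start_tag("<input type=text value=login name>"))
    # Quoted: the whole string stays in the value attribute.
    print(parse_start_tag('<input type=text value="login name">'))
```

In the unquoted case the parser reports `value` as `login` plus a stray `name` attribute with no value, which is the "effectively produced" markup the message is warning about.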
Re: [whatwg] Make quoted attributes a conformance criterion
On 2009-07-23 15:32, Kornel wrote:
> On 23 Jul 2009, at 13:35, Keryx Web wrote:
>> I'd say it is safe to say that using quotation marks for attribute values, always, except perhaps for collapsed, boolean attributes, has been regarded as best practice for a long time now. Speaking as an instructor for newbies, enforcing quotation marks has proven its value countless times for me and my students.
>
> It's not a clear benefit. Unpaired quotation marks can also be a *source* of errors, which wouldn't happen without them:

Having dealt with other people's code a lot (being a teacher) I'd say that problem is exceptionally rare and very easy to spot, put in contrast with the ones arising from not using quotation marks. The proportion is like 100 to 1. Ergo: In real life the benefit is very clear.

As for conformance criteria only being about unambiguous parsing: If that is the case we do not need them at all any more, since HTML 5 defines how to handle badly written markup. Just like validation in HTML 4 in reality is more of a benefit to the developer than the browser, since 99% of all errors actually are benign, so conformance criteria in HTML 5 are supposed to primarily help web developers - and authoring tool developers.

And speaking directly to Ian H, a few years ago you said on this list that you'd love for the spec to help teachers as much as possible (within the limits of being a spec). My suggested example markup changes are definitely such a help.

--
Keryx Web (Lars Gunther)
http://keryx.se/ http://twitter.com/itpastorn/ http://itpastorn.blogspot.com/