Re: [whatwg] [html5] tags, elements and generated DOM

2005-04-21 Thread Ian Hickson
On Sat, 16 Apr 2005, fantasai wrote:
 Jim Ley wrote:
Or at the very least use something that would not confuse people into
thinking that it is an
application of SGML or XML.
   
   Do you want to replace NONSGML with THIS-IS-NOT-SGML?
  
  No, I want to replace !DOCTYPE - with something completely different,
  the whole point that anything that looks like an SGML (or XHTML)
  doctype will confuse users into thinking that it is an application of
  SGML.
 
 The vast majority of people out there have never heard of SGML,
 and the ones who have are probably clever enough to figure out
 that NONSGML means it's not SGML.

Of course (ironically) they'd be wrong... NONSGML is an SGML keyword 
meaning that the DTD in question is not an SGML DTD, it doesn't mean that 
the document isn't an SGML document.

I'm currently leaning towards something simpler, maybe just:

   <!DOCTYPE HTML5>

This would still trigger standards mode (I believe; we'd have to check, of 
course) and would be a lot easier to remember.

But I won't be looking at this in detail for some time (probably not 
until I start working on the Parsing section).

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Re: [whatwg] [html5] tags, elements and generated DOM

2005-04-16 Thread fantasai
Henri Sivonen wrote:
I am very hostile towards the idea of requiring UAs to implement any XML 
parsing features that are in the realm of the XML 1.0 spec but that the 
XML 1.0 spec does not require. This means processing the DTD beyond 
checking the internal subset for well-formedness.
That hostility may be justified as far as browser-type UAs go, but I
would rather you didn't apply it to server-side and authoring tools.
Those who want to use entities for input, should parse and reserialize 
as UTF-8 in their own lair and not expose their entity references (or 
parochial legacy encodings) to the public network.
For those of us writing HTML by hand, this is not a practical solution,
particularly when invisible characters are involved. Invisible characters
aside, I don't want to go digging through a Unicode character map every
time I want &rarr; or &tau;.
~fantasai
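
The "parse and reserialize as UTF-8" step discussed above could look roughly like the following sketch. It is not anyone's actual tool: `resolve_entities` is a hypothetical helper name, it only knows the HTML 4 entity names shipped in Python's `html.entities` table, and it assumes the markup-significant references (`&lt;`, `&gt;`, `&amp;`, `&quot;`) should be left untouched.

```python
import re
from html.entities import name2codepoint

# References that must stay escaped so the markup remains well-formed.
KEEP = {"lt", "gt", "amp", "quot"}
ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")

def resolve_entities(source: str) -> str:
    """Replace known named entity references with the characters they denote."""
    def replace(match: re.Match) -> str:
        name = match.group(1)
        if name in KEEP or name not in name2codepoint:
            return match.group(0)  # leave unknown or protected references alone
        return chr(name2codepoint[name])
    return ENTITY.sub(replace, source)

authored = "f: A &rarr; B &amp; decay constant &tau;"
published = resolve_entities(authored).encode("utf-8")  # bytes for the public network
```

With something like this in the publishing step, an author could keep typing entity names by hand while the served document contains only UTF-8 characters.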


Re: [whatwg] [html5] tags, elements and generated DOM

2005-04-08 Thread Lachlan Hunt
Henri Sivonen wrote:
On Apr 7, 2005, at 09:58, Lachlan Hunt wrote:
There's no reason why a full conformance checker couldn't be based on 
OpenSP.
It would be prudent not to use OpenSP in order to avoid accidentally 
allowing SGMLisms that are alien to real-world tag soup.
If I ever get around to writing any form of conformance checker, true 
SGML validation (most likely using OpenSP) or XML validation (probably 
using Xerces or other XML parser) is at the top of my list.

Personally, I probably wouldn't make use of a full conformance checker 
too often during my normal publishing process, as I understand semantic 
documents and most likely wouldn't end up writing non-conformant 
documents in that regard anyway.  However, I do make mistakes and forget 
to close elements, misspell attributes and tag-names or whatever, in 
which case an SGML validator catches most of those mistakes for me. 
Yes, I know there are some things like conditionally required attributes 
that cannot be expressed by a DTD, but that doesn't make _true SGML or 
XML_ validation any less of a *very useful conformance tool*.

In fact, it would probably be a good idea for them to do so, since then 
they'll also be real validators too, which is part of the conformance 
requirements.
I don't think SGML validation is part of What WG conformance 
requirements.
Considering it seems to be part of the conformance criteria,
| Conformance checkers *must* verify that a document conforms to the
| applicable conformance criteria described in this specification...
|
| The term validation specifically refers to a subset of conformance
| checking...
|
| 1. Criteria that can be expressed in a DTD.
validation is a critical part of conformance checking.
I thought Hixie has specifically said he doesn't bother with DTDs.
Just because his authoring practices may not involve their use, doesn't 
mean many other authors don't make use of them.

As a real use case for DTD validation, consider this.  There are increasing 
calls for CMSs to produce strictly conformant markup.  There have been 
many complaints that such conformance is not enforced, which results in 
many invalid and non-conformant websites.  Users should not be required 
to check all of these conformance criteria manually before submitting 
content through a CMS, as experience shows that simply doesn't happen.

If CMSs are ever going to enforce strictly conformant code, then DTD 
validation will be a core component of that process.  Why re-invent the 
wheel when it comes to that, when a perfectly suitable and proven method 
already exists?  Experience has shown, with all the lints available, 
that validation/conformance checking without a DTD is often incorrect, 
which makes them very useless conformance tools.

This is why HTML must remain an application of SGML, the XHTML version 
*must* be a *valid* application of XML, and why DTDs are so important. 
The only thing we are waiting for in this field is CMSs that actually do 
enforce conformance, which we won't have a chance with if DTDs (or 
Schemas for XML) are not retained.

--
Lachlan Hunt
http://lachy.id.au/
http://GetFirefox.com/ Rediscover the Web
http://GetThunderbird.com/ Reclaim your Inbox


Re: [whatwg] [html5] tags, elements and generated DOM

2005-04-08 Thread Henri Sivonen
On Apr 8, 2005, at 03:21, Petrazickis wrote:
Wouldn't authors need to use an HTML4 or an XHTML doctype specifically  
to trigger the standards mode in IE6?
No. The proposed doctype <!DOCTYPE html PUBLIC "-//WHATWG//NONSGML
HTML5//EN"> activates the standards mode in IE6.

http://www.macsanomat.com/%7Ehsivonen/test-quirks.php?doctype=%3C%21DOCTYPE+html+PUBLIC+%22-%2F%2FWHATWG%2F%2FNONSGML+HTML5%2F%2FEN%22%3E
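
For readers who want to see exactly which doctype string the test URL passes, the query value can be decoded with Python's standard library. This is only a small illustration; the variable names are arbitrary, and the encoded value is copied from the URL in this message.

```python
from urllib.parse import unquote_plus

# The value of the "doctype" query parameter in the test URL.
encoded = (
    "%3C%21DOCTYPE+html+PUBLIC+%22-"
    "%2F%2FWHATWG%2F%2FNONSGML+HTML5%2F%2FEN%22%3E"
)

# unquote_plus turns "+" into spaces and %XX escapes into characters.
doctype = unquote_plus(encoded)
print(doctype)  # <!DOCTYPE html PUBLIC "-//WHATWG//NONSGML HTML5//EN">
```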

--
Henri Sivonen
[EMAIL PROTECTED]
http://hsivonen.iki.fi/


Re: [whatwg] [html5] tags, elements and generated DOM

2005-04-08 Thread Henri Sivonen
On Apr 8, 2005, at 09:23, Lachlan Hunt wrote:
If I ever get around to writing any form of conformance checker, true 
SGML validation (most likely using OpenSP) or XML validation (probably 
using Xerces or other XML parser) is at the top of my list.
If I ever got around to it, DTD validation wouldn't be my approach. I'd 
use Jing with Relax NG and a hand-written SAX filter for checking what 
Jing cannot check. (text/html could be handled by substituting a parser 
that inferred optional tags and appeared to the app as a parser parsing 
XHTML--like TagSoup without error recovery.)

| 1. Criteria that can be expressed in a DTD.
validation is a critical part of conformance checking.
You could check the same criteria either manually or using Relax NG. 
Using DTDs is not required.

If CMSs are ever going to enforce strictly conformant code, then DTD 
validation will be a core component of that process.
Why bother with DTDs now that Relax NG exists?
--
Henri Sivonen
[EMAIL PROTECTED]
http://hsivonen.iki.fi/


Re: [whatwg] [html5] tags, elements and generated DOM

2005-04-08 Thread Jim Ley
On Apr 8, 2005 8:18 AM, Henri Sivonen [EMAIL PROTECTED] wrote:
 No. The proposed doctype <!DOCTYPE html PUBLIC "-//WHATWG//NONSGML
 HTML5//EN"> activates the standards mode in IE6.

The proposed string that MUST appear as the first line of a WHAT-WG
document is... please do not call it a doctype unless it is a doctype,
see even people on the list are confused by using this!

Jim.


Re: [whatwg] [html5] tags, elements and generated DOM

2005-04-07 Thread Ian Hickson
On Thu, 7 Apr 2005, Lachlan Hunt wrote:
  
  A conformance checker that doesn't check for all the machine-checkable 
  things is not compliant, just like a browser that doesn't support 
  everything in the spec is not compliant.
 
 Fair enough, but is the spec going to specify exactly which conformance 
 criteria fit into which of the 3 categories you've now added, or is it 
 expected that implementors will be able to make an educated guess and 
 decide for themselves?

This is something I've been pondering myself, actually. I've been trying 
to think of a way to label the conformance requirements that conformance 
checkers are exempt from checking. In fact I'd quite like to label every 
conformance requirement with flags to indicate who it applies to. That's 
a lot of work though and may get messy quickly, so I haven't done it yet.


 It doesn't need to be altered, it only needs to be pointed to an HTML 5 
 DTD, with the system identifier (the URI) in the DOCTYPE.

At the moment I have no intention of personally writing a DTD, schema or 
similar for WHATWG specs. Fantasai once volunteered to do so, but I don't 
know the status of this.

I am very reluctant to put a particular DTD in the DOCTYPE, though. Given 
that DTDs are highly inadequate for catching errors, it feels very wrong 
to me to be giving a particular DTD any kind of legitimacy at that level.

This doesn't stop conformance checker implementers from writing DTDs of 
their own and then placing them in their SGML catalog so that the HTML5 
DOCTYPE triggers that DTD, though. The point is that different conformance 
checker vendors should be able to write their own DTD for HTML5 to 
complement the rest of the conformance checking process. As the mix 
between DTD-based and other checking will probably be vendor-dependent, I 
don't see why we'd want to elevate any particular DTD to official status.


  This is not a bad thing. One hopes that HTML5's more detailed 
  conformance requirements will encourage the development of truly 
  useful conformance checkers that don't mislead people into thinking 
  they have written correct documents when in fact they have just fixed 
  the small subset of errors that the limited validator catches.
 
 I hope so, cause existing conformance checkers (often called lints 
 [1]) for HTML aren't really useful cause they're often only subjective 
 and issue bogus errors or don't catch all errors.

Exactly. By being more precise about what conformance checkers must check, 
we should sidestep that problem.

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Re: [whatwg] [html5] tags, elements and generated DOM

2005-04-07 Thread Jim Ley
On Apr 7, 2005 11:51 AM, Ian Hickson [EMAIL PROTECTED] wrote:
 On Thu, 7 Apr 2005, Anne van Kesteren wrote:
 
  Entities. Or is that problem going to be solved by: use UTF-8? (Which
  would be something I wouldn't disagree with, although for mathematical
  symbols it might be a pain to enter them.)
 
 In my world that is solved by no longer claiming that HTML is an SGML
 application.

So please state that clearly in the specification.

Can you also explain the point of the <!DOCTYPE ...> gibberish that
the specs require at the top of documents?  What are they doing,
please remove them, they serve no purpose whatsoever.  Or if they do
serve a purpose, document what the purpose is.

Jim.


Re: [whatwg] [html5] tags, elements and generated DOM

2005-04-07 Thread Ian Hickson
On Thu, 7 Apr 2005, Anne van Kesteren wrote:
 
 And how does the XML part of your world feel about [not having a DTD 
 meaning they can't use entities]? (I like the idea for HTML.)

The current draft says that there is no particular DTD for XHTML5. It 
doesn't stop anyone from using one if they want to use entities. I suppose 
we could include one that just had the entities and had a known FPI so 
that UAs could permanently cache it. *shrug*

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Re: [whatwg] [html5] tags, elements and generated DOM

2005-04-07 Thread Anne van Kesteren
Jim Ley wrote:
Entities. Or is that problem going to be solved by: use UTF-8?
(Which would be something I wouldn't disagree with, although for
mathematical symbols it might be a pain to enter them.)
In my world that is solved by no longer claiming that HTML is an
SGML application.
So please state that clearly in the specification.
Can you also explain the point of the <!DOCTYPE ...> gibberish that 
the specs require at the top of documents?  What are they doing, 
please remove them, they serve no purpose whatsoever.  Or if they do 
serve a purpose, document what the purpose is.
You should know the purpose I guess. (Standards mode.) I agree that it
should be documented.
--
 Anne van Kesteren
 http://annevankesteren.nl/


Re: [whatwg] [html5] tags, elements and generated DOM

2005-04-07 Thread Ian Hickson
On Thu, 7 Apr 2005, Jim Ley wrote:
  
  In my world that is solved by no longer claiming that HTML is an SGML 
  application.
 
 So please state that clearly in the specification.

Yes, patience boy. All in due course. Like I said earlier in this thread, 
I haven't gotten that far in the editing yet, which is why I don't have 
detailed well-thought-through answers to all these questions.

When I get around to actually speccing out how this part of things work 
(probably around the same time I work on a section about how to parse the 
non-XML serialisation), I'll take a close look at all the e-mails in this 
thread and reply to them all.


 Can you also explain the point of the <!DOCTYPE ...> gibberish that the 
 specs require at the top of documents?  What are they doing, please 
 remove them, they serve no purpose whatsoever.  Or if they do serve a 
 purpose, document what the purpose is.

They trigger standards mode in modern browsers. The current one for WHATWG 
specs is:

   <!DOCTYPE html PUBLIC "-//WHATWG//NONSGML HTML5//EN">

...as described in 1.8 of the WA1 spec and also somewhere in the WF2 spec.

This will hopefully be explained in more detail in the future. At the 
moment I'm concentrating on defining all the elements and attributes of 
the language.

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Re: [whatwg] [html5] tags, elements and generated DOM

2005-04-07 Thread Ian Hickson
On Thu, 7 Apr 2005, Anne van Kesteren wrote:
  
  Can you also explain the point of the <!DOCTYPE ...> gibberish that 
  the specs require at the top of documents?  What are they doing, 
  please remove them, they serve no purpose whatsoever.  Or if they do 
  serve a purpose, document what the purpose is.
 
 You should know the purpose I guess. (Standards mode.) I agree that it 
 should be documented.

Actually come to think of it there is also a second purpose, namely, 
telling conformance checkers what version of the specification to check 
against. (Which I guess is basically the original purpose of the DOCTYPE.)

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Re: [whatwg] [html5] tags, elements and generated DOM

2005-04-07 Thread Jim Ley
On Apr 7, 2005 12:03 PM, Ian Hickson [EMAIL PROTECTED] wrote:
 They trigger standards mode in modern browsers. The 
 current one for WHATWG specs is:

Will the spec explain this some more, in particular could you document
what standards mode is, and exactly how user agents should use this
doctype to trigger it?

Would it not be better to just require WF2/WA user agents to render it
in this standards mode you talk of?  Or at the very least use
something that would not confuse people into thinking that it is an
application of SGML or XML.

Jim.


Re: [whatwg] [html5] tags, elements and generated DOM

2005-04-07 Thread Olav Junker Kjær
Jim Ley wrote:
However, a
syntax error in the initial value of a date control *will* cause the
page to stop working as intended.
Could you describe how?  My reading of the error handling defined in
the spec for that situation does not lead to the failure you describe.
 However the unclosed B element does exactly that. (in the XHTML
dialect)
The intention is that the control should show the default value. If the 
value contains a syntax error (e.g. a missing colon) the value will be 
ignored and the control will be empty (according to 
http://whatwg.org/specs/web-forms/current-work/#handling). More subtle 
errors will result if min or max attributes contain a syntax error. 
Depending on the type of application, the wrong or missing date might 
have serious consequences. (It won't prevent the page from showing up, 
though; it just won't work as intended, which in some circumstances 
might be worse.)

The problem might be even more subtle if the date is syntactically 
correct but invalid, e.g. 29 February in a year that is not a 
leap year. Schema validation using regular expressions won't catch this.
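
To make the distinction concrete, here is a rough sketch of a syntactic check next to a calendar check. It assumes the date control's value uses the `YYYY-MM-DD` form; the function names and the pattern are made up for this illustration and are not taken from the WF2 draft.

```python
import re
from datetime import date

# A pattern of the kind a schema might use: it constrains the shape only.
DATE_PATTERN = re.compile(r"[0-9]{4}-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])")

def matches_pattern(value: str) -> bool:
    """Syntactic check: four-digit year, month 01-12, day 01-31."""
    return DATE_PATTERN.fullmatch(value) is not None

def is_real_date(value: str) -> bool:
    """Calendar check: the value must also name a day that exists."""
    if not matches_pattern(value):
        return False
    year, month, day = (int(part) for part in value.split("-"))
    try:
        date(year, month, day)  # raises ValueError for impossible dates
    except ValueError:
        return False
    return True

for candidate in ("2004-02-29", "2005-02-29", "2005-0229"):
    print(candidate, matches_pattern(candidate), is_real_date(candidate))
```

Here `2005-02-29` passes the pattern but fails the calendar check, which is the kind of error a conformance checker would need extra code to catch.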

A conformance checker should be able to flag these kinds of errors.
OTOH a missing </b> might be annoying but won't usually have serious 
consequences in HTML (XHTML is different, of course). Still, this is the 
only type of error DTD validation will catch.

regards
Olav Junker Kjær


Re: [whatwg] [html5] tags, elements and generated DOM

2005-04-07 Thread Henri Sivonen
On Apr 7, 2005, at 14:09, Jim Ley wrote:
Will the spec explain this some more, in particular could you document
what standards mode is, and exactly how user agents should use this
doctype to trigger it?
Ideally, UAs would know nothing of that particular doctype and would 
trigger the standards mode because there is a doctype that is not on 
the list of doctypes that triggers the quirks mode or the almost 
standards mode.
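
A minimal sketch of that idea follows. The two sets are short illustrative samples, not any browser's actual list, and treating a missing doctype as quirks mode is an assumption added for completeness. The function name `layout_mode` is hypothetical.

```python
from typing import Optional

# Illustrative, heavily abbreviated samples; real UAs carry much longer lists.
QUIRKS_PUBLIC_IDS = {
    "-//IETF//DTD HTML 2.0//EN",
    "-//W3C//DTD HTML 3.2 Final//EN",
}
ALMOST_STANDARDS_PUBLIC_IDS = {
    "-//W3C//DTD XHTML 1.0 Transitional//EN",
}

def layout_mode(public_id: Optional[str]) -> str:
    """Pick a layout mode from the doctype's public identifier."""
    if public_id is None:
        return "quirks"  # assumption: no doctype at all means quirks mode
    if public_id in QUIRKS_PUBLIC_IDS:
        return "quirks"
    if public_id in ALMOST_STANDARDS_PUBLIC_IDS:
        return "almost standards"
    # Anything unknown, including the WHATWG string, falls through to here.
    return "standards"

print(layout_mode("-//WHATWG//NONSGML HTML5//EN"))  # standards
```

The point of the design is that the UA needs no special knowledge of the new identifier; it only has to not recognise it.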

Would it not be better to just require WF2/WA user agents to render it
in this standards mode you talk of?
Yes, in principle, but you can't trigger the layout mode when you hit 
the first What WG feature. UAs already have code for triggering on 
doctype. It's not pretty, but it's the reality.

Or at the very least use something that would not confuse people into 
thinking that it is an
application of SGML or XML.
Do you want to replace NONSGML with THIS-IS-NOT-SGML?
--
Henri Sivonen
[EMAIL PROTECTED]
http://hsivonen.iki.fi/


Re: [whatwg] [html5] tags, elements and generated DOM

2005-04-07 Thread Jim Ley
  Or at the very least use something that would not confuse people into
  thinking that it is an
  application of SGML or XML.
 
 Do you want to replace NONSGML with THIS-IS-NOT-SGML?

No, I want to replace !DOCTYPE - with something completely different,
the whole point that anything that looks like an SGML (or XHTML)
doctype will confuse users into thinking that it is an application of
SGML.

I see no reason to continue the odd model of rendering-mode
switching, especially when the spec does not define exactly what it
is. Since only new implementations will be written to support WF2,
a simple <html WHATversion="2">-like mechanism can be used, which
will leave it in a much stronger position going forward.

Jim.


Re: [whatwg] [html5] tags, elements and generated DOM

2005-04-07 Thread Jim Ley
On Apr 7, 2005 6:59 PM, Henri Sivonen [EMAIL PROTECTED] wrote:
 On Apr 7, 2005, at 09:58, Lachlan Hunt wrote:
 I don't think SGML validation is part of What WG conformance
 requirements. I thought Hixie has specifically said he doesn't bother
 with DTDs.

Hixie is simply the editor of the spec, this thread has shown clearly
that many people contributing to the WHAT-WG work do use DTD's, indeed
we already have a volunteer for creating a doctype, in fact it's only
at this (supposedly) late stage that we've suddenly been told there's
not one.

Jim.


Re: [whatwg] [html5] tags, elements and generated DOM

2005-04-07 Thread Henri Sivonen
On Apr 7, 2005, at 21:49, Jim Ley wrote:
this thread has shown clearly that many people contributing to the 
WHAT-WG work do use DTD's
To me it seemed that you argued that DTD validation is more useful than 
other conformance checks as long as the other checks are vaporware and 
Lachlan Hunt was theorizing that maybe someone might want to use OpenSP 
and that should be catered for.

--
Henri Sivonen
[EMAIL PROTECTED]
http://hsivonen.iki.fi/


Re: [whatwg] [html5] tags, elements and generated DOM

2005-04-07 Thread Jim Ley
On Apr 7, 2005 8:30 PM, Henri Sivonen [EMAIL PROTECTED] wrote:
 On Apr 7, 2005, at 21:49, Jim Ley wrote:
 
  this thread has shown clearly that many people contributing to the
  WHAT-WG work do use DTD's
 
 To me it seemed that you argued that DTD validation is more useful than
 other conformance checks as long as the other checks are vaporware

From which you can clearly conclude I do use DTD validation as part of
my QA process.  All the people who have said that DTD validation is
absolutely useless haven't bothered to describe their QA processes at
all.

Maybe we could hear about these QA techniques rather than just saying
how crap the existing tools are, rather than the sudden proposal to
seriously reduce the amount of automated QA available to WHAT-WG
adopters.  If there was a different proposal on how WHAT-WG documents
be QA'd then I'd certainly be happy to see DTD validation disappear.

Jim.


Re: [whatwg] [html5] tags, elements and generated DOM

2005-04-07 Thread Ian Hickson
On Thu, 7 Apr 2005, Jim Ley wrote:

 From which you can clearly conclude I do use DTD validation as part of 
 my QA process.  All the people who have said that DTD validation is 
 absolutely useless haven't bothered to describe their QA processes at 
 all.

Nobody is stopping anyone from using DTDs.

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Re: [whatwg] [html5] tags, elements and generated DOM

2005-04-07 Thread Jim Ley
On Apr 7, 2005 9:22 PM, Ian Hickson [EMAIL PROTECTED] wrote:
 On Thu, 7 Apr 2005, Jim Ley wrote:
 
  From which you can clearly conclude I do use DTD validation as part of
  my QA process.  All the people who have said that DTD validation is
  absolutely useless haven't bothered to describe their QA processes at
  all.
 
 Nobody is stopping anyone from using DTDs.

If it's not an SGML application, you most certainly are.

However, the main issue, is How are people going to ensure they're
producing valid WHAT-WG documents?  Your proposal is to throw away all
the existing QA resources and leave a user with none, unless they
happen to have the time and the resources to understand a lot of dense
prose and author a DTD from it.  Something which very few people are
going to be able to do.

So I'll ask once again, how do the WHAT-WG believe authors of WHAT-WG
documents will produce conformant ones?

Jim.


Re: [whatwg] [html5] tags, elements and generated DOM

2005-04-07 Thread Petrazickis
Olav Junker Kjær wrote:
Jim Ley wrote:
Would a version parameter not be more appropriate, simpler, less
confusing to users, easier to parse, easier to understand, doesn't
confuse users into thinking that it's really an application of SGML. 
Doesn't cause problems for legacy user agents like the HTML Validator
etc. etc.

Actually, the HTML element has a (deprecated!) version attribute, 
which could be used for this purpose. I agree it feels cleaner than 
using the doctype syntax.

OTOH authors are going to use doctypes for the foreseeable future 
anyway, since they want to trigger standards compliant mode in 
browsers, so we might as well put the doctype to some use.
Wouldn't authors need to use an HTML4 or an XHTML doctype specifically 
to trigger the standards mode in IE6? In that case, specifying a doctype 
of our own would be counter-productive to the goal of compatibility with 
IE6.

Perhaps we need to specify that any DOCTYPE will be ignored when there 
is an <html version="5.0"> present.
--
Leons Petrazickis


Re: [whatwg] [html5] tags, elements and generated DOM

2005-04-06 Thread Jim Ley
On Apr 6, 2005 11:22 AM, Lachlan Hunt [EMAIL PROTECTED] wrote:
 However, I
 disagree with that statement anyway.  Validators should not be
 non-conformant simply because they only do their job to validate a
 document and nothing else.

Absolutely, if there is a continued use of a doctype, then a validator
is absolutely correct to validate to it, so either the validator
should remain conformant, or the doctype should be dropped.  (or
explicitly marked as this is not an SGML or XML doctype it is simply
some cargo cult you should include as your first line)

  I don't see any reason why such a statement
 needs to be included at all.

Neither do I, it's completely unreasonable to say that an incredibly
useful QA tool is non-conformant, simply because the editor doesn't
consider those benefits in the same way.

  In any case, assuming I'm still the editor when the parsing section gets
  written,
 
 Why wouldn't you be?

Because they might present the work to a standards body who gets a new
editor? or some disgruntled reader may ...  hmm, no, let's not go
there...

  HTML5 will most likely stop the pretense of HTML being an SGML  application.
 
 What the?  I disagree with that.  HTML should remain an application of
 SGML, and browsers should be built to conform properly. 

Fully agree.

Jim.


Re: [whatwg] [html5] tags, elements and generated DOM

2005-04-06 Thread Jim Ley
On Apr 6, 2005 11:41 AM, Anne van Kesteren [EMAIL PROTECTED] wrote:
 Lachlan Hunt wrote:
  and the mostly undefined error handling, what about HTML 5 will
  be so incompatible with SGML to warrant such a decision?
 
 One example:
 
 http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2005-January/002993.html

The specification has not currently adopted this, and there has been
no other support on the mailing list for doing so.  This is clearly an
example of how existing browsers are
non-conformant, and simply making it conformant just blesses browsers
in the future to continue violating specs safe in the knowledge that
the spec will get changed to suit them, rather than the reverse.

Exactly what's happened with CSS, do we really want to do it with HTML too?

Cheers,

Jim.


Re: [whatwg] [html5] tags, elements and generated DOM

2005-04-06 Thread Lachlan Hunt
Anne van Kesteren wrote:
Lachlan Hunt wrote:
HTML5 will most likely stop the pretense of HTML being an SGML  
application.
+1.
-1
and the mostly undefined error handling, what about HTML 5 will be so 
incompatible with SGML to warrant such a decision?
One example:
http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2005-January/002993.html 
http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2005-January/002999.html 
http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2005-January/003001.html 
Documents that contain </ within script and style elements, that are not 
</script> and </style> respectively (or the SHORTTAG version </>) are 
broken.  I see no problem with defining error handling for broken 
documents, but no need to break conformance with SGML in the process. 
HTML is an application of SGML, regardless of all the broken 
implementations and documents we currently have, and I don't want to see 
that changed.
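
As a rough illustration of the check being argued about, the sketch below scans the intended content of a script element for the sequence `</` followed by a letter, which is where an SGML parser would treat the element's content as ended. The letter restriction follows my reading of HTML 4.01 Appendix B.3.2 and is an assumption here; `premature_end_tags` is a hypothetical helper name.

```python
import re
from typing import List

# "</" followed by an ASCII letter: the end-tag-open delimiter in context.
ETAGO = re.compile(r"</(?=[A-Za-z])")

def premature_end_tags(script_text: str) -> List[int]:
    """Return offsets inside the script text where an SGML parser would
    consider the element's content to have ended."""
    return [match.start() for match in ETAGO.finditer(script_text)]

risky = 'document.write("<b>hi</b>");'
escaped = 'document.write("<b>hi<\\/b>");'  # "<\/b>" in the served document

print(premature_end_tags(risky))    # [21]
print(premature_end_tags(escaped))  # []
```

A conformance checker could report each offset as an error, while a browser that wants to keep existing pages working could apply whatever recovery the spec ends up defining.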

--
Lachlan Hunt
http://lachy.id.au/
http://GetFirefox.com/ Rediscover the Web
http://GetThunderbird.com/ Reclaim your Inbox


Re: [whatwg] [html5] tags, elements and generated DOM

2005-04-06 Thread Lachlan Hunt
Olav Junker Kjær wrote:
Lachlan Hunt wrote:
see no problem with defining error handling for broken documents, but 
no need to break conformance with SGML in the process. HTML is an 
application of SGML, regardless of all the broken implementations and 
documents we currently have, and I don't want to see that changed.
An innocent question (no flamewar intended):
Of course not, I try not to flame. :-)
What is the benefit of having HTML defined as an application of SGML ?
So that it may be processed with SGML tools, and validated with an SGML 
based validator, and possibly even generated using XSLT.  (I know XSLT 
can generate HTML4, but I don't know if it would be able to do HTML5 or 
not, even if it did remain an SGML application).

Even if it is decided that HTML 5 is not formally an application of 
SGML, it must at least remain fully compatible with SGML, and thus a 
conformant HTML 5 document must be a conformant SGML document.  XHTML 
variants of HTML 5 must be a conformant XML document instead, though I 
noticed that is not the case with square brackets in ID attributes in 
section 3.7.2 of WF2  (are there no other character(s) that can be used 
instead?).  So, I guess, there's already no hope of HTML 5 conforming to 
anything.

However, I would like to request that any defined error handling 
behaviour designed to cope with malformed documents that directly 
violates SGML, be made optional (but recommended) so that a user agent 
with a conforming SGML parser may still conform to HTML 5.

--
Lachlan Hunt
http://lachy.id.au/
http://GetFirefox.com/ Rediscover the Web
http://GetThunderbird.com/ Reclaim your Inbox


Re: [whatwg] [html5] tags, elements and generated DOM

2005-04-06 Thread Anne van Kesteren
Lachlan Hunt wrote:
Olav Junker Kjær wrote:
Lachlan Hunt wrote:
Validators should not be non-conformant simply because they only do 
their job to validate a document and nothing else.  I don't see any 
reason why such a statement needs to be included at all.
I don't see anything about validators. I only read about Conformance 
checkers.

--
 Anne van Kesteren
 http://annevankesteren.nl/


Re: [whatwg] [html5] tags, elements and generated DOM

2005-04-06 Thread Anne van Kesteren
Lachlan Hunt wrote:
Even if it is decided that HTML 5 is not formally an application of 
SGML, it must at least remain fully compatible with SGML, and thus a 
conformant HTML 5 document must be a conformant SGML document.  XHTML 
variants of HTML 5 must be a conformant XML document instead, though I 
noticed that is not the case with square brackets in ID attributes in 
section 3.7.2 of WF2  (are there no other character(s) that can be used 
instead?).  So, I guess, there's already no hope of HTML 5 conforming to 
anything.
That is conforming to the XML syntax. It is just not valid.
--
 Anne van Kesteren
 http://annevankesteren.nl/


Re: [whatwg] [html5] tags, elements and generated DOM

2005-04-06 Thread Anne van Kesteren
Lachlan Hunt wrote:
This is clearly an example of how existing browsers are
non-conformant,
Doing otherwise would result in a lot of broken pages
Those pages are already broken.  Authors just don't know it because
the browsers are even more broken by being forced to deal with them.
You could also argue that they interoperate pretty well. And that it
would be nonsense to break that. (Especially since no browser does it
the other way around.)

and probably less market share for the browser.
I thought this was about standardisation, not some marketing gimmick
for browser vendors!
Oh, come on. I just meant that nobody would win anything if a browser 
became conformant here. It would be a lot better to fix the 
specification for these instances and make HTML a more logical language.

--
 Anne van Kesteren
 http://annevankesteren.nl/


Re: [whatwg] [html5] tags, elements and generated DOM

2005-04-06 Thread Lachlan Hunt
Anne van Kesteren wrote:
Lachlan Hunt wrote:
Olav Junker Kjær wrote:
Lachlan Hunt wrote:
Validators should not be non-conformant simply because they only do 
their job to validate a document and nothing else.  I don't see any 
reason why such a statement needs to be included at all.
I don't see anything about validators. I only read about Conformance 
checkers.
In the note in that section [1]:
| Conformance checkers that only perform validation are non-conformant,
In fact, now that I've read it again, it seems rather contradictory. 
Just before the note, it states:

| Conformance checkers are exempt from detecting errors that require
| interpretation of the author's intent (for example, while a document
| is non-conformant if the content of a blockquote element is not a
| quote, conformance checkers do not have to check that blockquote
| elements only contain quoted material).
I would argue that conformance requirements that cannot be expressed by 
a DTD *are* constraints that require interpretation by the author. 
Therefore, that section seems to be saying that validators are exempt 
from checking some things, but are non-conformant for not checking them 
anyway.

--
Lachlan Hunt
http://lachy.id.au/
http://GetFirefox.com/ Rediscover the Web
http://GetThunderbird.com/ Reclaim your Inbox


Re: [whatwg] [html5] tags, elements and generated DOM

2005-04-06 Thread Anne van Kesteren
Lachlan Hunt wrote:
Validators should not be non-conformant simply because they 
only do their job to validate a document and nothing else.  I
don't see any reason why such a statement needs to be 
included at all.
I don't see anything about validators. I only read about 
Conformance checkers.
In the note in that section [1]:
| Conformance checkers that only perform validation are 
non-conformant,
So? That doesn't make it a validator. A conformance checker might do
things validators do too, but that doesn't make it one.

In fact, now that I've read it again, it seems rather contradictory.
How?

I would argue that conformance requirements that cannot be expressed 
by a DTD *are* constraints that require interpretation by the author.
Not really. Think about:
 http://annevankesteren.nl/archives/2003/09/invalid-after-validated

Therefore, that section seems to be saying that validators are exempt
from checking some things, but are non-conformant for not checking 
them anyway.
Note that this is about more than just validating and isn't about
validators.
--
 Anne van Kesteren
 http://annevankesteren.nl/


Re: [whatwg] [html5] tags, elements and generated DOM

2005-04-06 Thread Lachlan Hunt
Anne van Kesteren wrote:
Lachlan Hunt wrote:
| Conformance checkers that only perform validation are non-conformant,
So? That doesn't make it a validator.
What is a validator, if it is not a form of conformance checker that 
only performs validation then?  Or, the other way around, what is a 
conformance checker that only performs validation if it is not a 
validator?

A conformance checker might do things validators do too, but that
doesn't make it one.
I believe such conformance checkers are often called lints and they are 
usually not true validators, despite what many claim, so you are correct 
in that a conformance checker may not be a validator.  But, from what I 
understand of the wording in the spec, a validator is a form of 
conformance checker.  Basically, metaphorically speaking, it's like a 
square is a rectangle, but a rectangle is not always a square.

In fact, now that I've read it again, it seems rather contradictory.
How?
Did I not explain it well enough before?  See below.
I would argue that conformance requirements that cannot be expressed 
by a DTD *are* constraints that require interpretation by the author.
Not really.
Yes, really.
 Think about:
 http://annevankesteren.nl/archives/2003/09/invalid-after-validated
Exactly, the conformance constraints violated in those examples cannot 
be expressed in an XML DTD (some can, and are, by the HTML4 DTD though), 
and require interpretation by the author.  This merely illustrates the 
difference between valid and conformant.

Therefore, that section seems to be saying that validators are exempt
from checking some things, but are non-conformant for not checking 
them anyway.
That is how the spec is contradictory, except s/validators/conformance 
checkers/ and with "some things" meaning "errors that require 
interpretation of the author's intent".

Because, if I am understanding correctly and a validator is a form of 
conformance checker, a validator cannot check constraints that are not 
expressed in the DTD and require them to be interpreted by the author. 
Therefore, validators are exempt from checking such constraints, but are 
non-conformant for not checking them anyway, as stated in the note. 
(well done if you are not totally confused by that, I tried to make it 
as clear as possible :-))

Note that this is about more than just validating and isn't about
validators.
Yes, but "Conformance checkers that only perform validation" are, unless 
I am mistaken, validators.  Hixie, can you please clarify what that 
means, if I am mistaken?

--
Lachlan Hunt
http://lachy.id.au/
http://GetFirefox.com/ Rediscover the Web
http://GetThunderbird.com/ Reclaim your Inbox


Re: [whatwg] [html5] tags, elements and generated DOM

2005-04-06 Thread Olav Junker Kjær
Lachlan Hunt wrote:
Because, if I am understanding correctly and a validator is a form of 
conformance checker, a validator cannot check constraints that are not 
expressed in the DTD and require them to be interpreted by the author. 
Therefore, validators are exempt from checking such constraints, but are 
non-conformant for not checking them anyway, as stated in the note. 
(well done if you are not totally confused by that, I tried to make it 
as clear as possible :-))
I don't think that is correct.
There are three types of conformance criteria:
(1) Criteria that can be expressed in a DTD
(2) Criteria that cannot be expressed by a DTD, but can still be checked 
by a machine.
(3) Criteria that can only be checked by a human.

A conformance checker must check (1) and (2). A simple validator which 
only checks (1) is therefore not conformant.

regards


Re: [whatwg] [html5] tags, elements and generated DOM

2005-04-06 Thread Jim Ley
On Apr 6, 2005 3:41 PM, Olav Junker Kjær [EMAIL PROTECTED] wrote:
 Lachlan Hunt wrote:
 There are three types of conformance criteria:
 (1) Criteria that can be expressed in a DTD
 (2) Criteria that cannot be expressed by a DTD, but can still be checked
 by a machine.
 (3) Criteria that can only be checked by a human.
 
 A conformance checker must check (1) and (2). A simple validator which
 only checks (1) is therefore not conformant.

One of the motivations of the WHAT-WG stuff is that existing users
don't have to change their existing tools, processes and
understanding. Now, all of a sudden, we're removing one of the most
valuable QA tools available today, based on some spurious notion that
all these existing users don't understand the QA tool's limitations.

Firstly, I don't think the conclusion that the audience for WHAT-WG
stuff doesn't understand the limitations of the validator is
sustainable - where's the evidence?

And secondly, there won't be any QA tools at all if the validator
isn't one of them, so we'll be getting even more crap published, and
far from cleaning up the correctness, we'll just have a whole new load
of crud to rubber stamp as valid in WF2. Now, I realise it's to the
advantage of existing browser manufacturers to rubber stamp
complicated heuristic behaviour they've already solved into a spec (it
prevents new entrants from coming along), but how is it to the
advantage of the rest of us? Understanding specifications becomes
harder and harder and relies on the fact that we knew what happened
before...

I simply cannot see the point in removing one of the few QA tools that
actually exist for HTML, and would like to hear the actual argument
for doing so. (This is a separate issue from whether it would be an
application of SGML.)

Jim.


Re: [whatwg] [html5] tags, elements and generated DOM

2005-04-06 Thread Anne van Kesteren
Lachlan Hunt wrote:
(2) Criteria that cannot be expressed by a DTD, but can still be 
checked by a machine.
Such as...?
<a><em><a/></em></a>
(Can also be expressed using RelaxNG or XML Schema.) You did read my 
entry, didn't you?
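
A machine check for that particular case could look roughly like the sketch below. It is only an illustration of a category (2) criterion, not anything from the spec: the function name has_nested_anchor is made up here, and it assumes well-formed XML input with no namespaces.

```python
import xml.etree.ElementTree as ET


def has_nested_anchor(markup):
    """Return True if any <a> element has another <a> as a descendant.

    Assumes `markup` is well-formed XML without namespaces (an
    assumption made for this sketch only).
    """
    root = ET.fromstring(markup)
    for anchor in root.iter("a"):
        # Element.iter() yields the element itself first, so skip it.
        if any(inner is not anchor for inner in anchor.iter("a")):
            return True
    return False


if __name__ == "__main__":
    print(has_nested_anchor("<a><em><a/></em></a>"))
    print(has_nested_anchor("<p><a/><em><a/></em></p>"))
```

An XML DTD has no way to forbid a descendant at arbitrary depth, which is why this kind of rule ends up in a separate checking step.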

--
 Anne van Kesteren
 http://annevankesteren.nl/


Re: [whatwg] [html5] tags, elements and generated DOM

2005-04-06 Thread Jim Ley
On Apr 6, 2005 10:05 PM, Henri Sivonen [EMAIL PROTECTED] wrote:
 On Apr 6, 2005, at 15:10, Lachlan Hunt wrote:
  XHTML variants of HTML 5 must be a conformant XML document instead,
  though I noticed that is not the case with square brackets in ID
  attributes in section 3.7.2 of WF2
 
 That's not a problem if you don't claim they are ID attributes but
 attributes that happen to be named id.

Which would mean we also have to start redefining DOM, so
document.getElementById(...) is defined to work against things that
happen to be named id and not just things that are IDs.
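
The distinction is visible in at least one existing DOM implementation. The sketch below uses Python's xml.dom.minidom, where getElementById only finds attributes that are known to be of type ID; the sample markup, the id value and the helper name lookup_before_and_after are assumptions made up for illustration.

```python
from xml.dom.minidom import parseString


def lookup_before_and_after(markup, value):
    """Look up `value` via getElementById before and after typing "id" as an ID.

    Without a DTD, minidom treats id="..." as an ordinary attribute, so the
    first lookup finds nothing. setIdAttribute() marks the attribute as an ID
    on each element that carries it, after which the lookup can succeed.
    """
    doc = parseString(markup)
    before = doc.getElementById(value)
    for element in doc.getElementsByTagName("*"):
        if element.hasAttribute("id"):
            element.setIdAttribute("id")
    after = doc.getElementById(value)
    return before, after


if __name__ == "__main__":
    before, after = lookup_before_and_after(
        '<form><input id="row[0].name"/></form>', "row[0].name"
    )
    print(before, after.tagName if after is not None else None)
```

Note that minidom does not check the value against the XML Name production, so the square brackets go through here; a validating parser with the attribute declared as ID would be expected to complain.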

Is it really worth going down this road?

Jim.


Re: [whatwg] [html5] tags, elements and generated DOM

2005-04-05 Thread Anne van Kesteren
Lachlan Hunt wrote:
No, there is no implied body element in either of those fragments.
I appreciate your comments but I was wondering if you have taken into 
account what existing user agents do. Since that, not some 
out-of-date-not-followed SGML standard, should be standardized in my 
humble opinion.

That also means that:
 data:text/html,<style>body{background:lime}</style>
... generates:
 HTML
  HEAD
   STYLE
    #text
  BODY
... whether you like it or not. Same for the other cases. I do not think 
it makes sense to say all current UAs are non-compliant as many pages 
may rely on the kind of DOM they generate.
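
As a rough model of the behaviour described above (not of any browser's actual parser), the toy below always emits HTML, HEAD and BODY and files top-level title, script and style elements under HEAD. The names OutlineBuilder and outline and the element sets are assumptions for this sketch; text nodes, attributes and error recovery are ignored.

```python
from html.parser import HTMLParser

# Elements this toy treats as head content (an assumption for the sketch).
HEAD_ELEMENTS = {"title", "script", "style"}
# Elements without end tags, so they must not change the nesting depth.
VOID_ELEMENTS = {"br", "hr", "img", "input", "meta", "link", "base"}


class OutlineBuilder(HTMLParser):
    """Collect top-level elements into an always-present HEAD and BODY."""

    def __init__(self):
        super().__init__()
        self.head = []
        self.body = []
        self.depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("html", "head", "body"):
            return  # these are created unconditionally in outline()
        if self.depth == 0:
            in_head = tag in HEAD_ELEMENTS and not self.body
            (self.head if in_head else self.body).append(tag.upper())
        if tag not in VOID_ELEMENTS:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in ("html", "head", "body") or tag in VOID_ELEMENTS:
            return
        if self.depth > 0:
            self.depth -= 1


def outline(markup):
    """Return the element outline as indented lines, one element per line."""
    builder = OutlineBuilder()
    builder.feed(markup)
    builder.close()
    lines = ["HTML", " HEAD"]
    lines += ["  " + name for name in builder.head]
    lines.append(" BODY")
    lines += ["  " + name for name in builder.body]
    return lines


if __name__ == "__main__":
    print("\n".join(outline("<style>body{background:lime}</style>")))
```

The point of the sketch is only that the BODY line is emitted whether or not the input contains anything that would imply it.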

--
 Anne van Kesteren
 http://annevankesteren.nl/


Re: [whatwg] [html5] tags, elements and generated DOM

2005-04-05 Thread Lachlan Hunt
Ian Hickson wrote:
On Tue, 5 Apr 2005, Anne van Kesteren wrote:
<script type="text/javascript" src="bar"></script>
<title>Foo</title>
..?
If I am not mistaken:
   <html><head><script>...</>
   <title>...</></head><body></body></html>
I believe you are mistaken.  A conforming SGML parser will not imply the 
body element without any content to make it do so.

Is there a BODY element in this document (or, is there always a body 
element?):

<style type="text/css">
 body{ background:lime }
</style>
... or this:
<title>Bar</title>
The body will always be implied, though.
Not in a conforming SGML parser, though it seems to be in Mozilla, Opera 
and IE, as I checked using your DOM viewer [1].  Although Opera seems to 
have a bug in standards compliant mode (at least, according to the DOM 
viewer script) because neither the head nor body elements appeared in the 
DOM using this markup:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<title>Foo</title>
<script type="text/javascript" src="bar"></script>
However, if the body element were to be automatically implied 
regardless, then the same would be true of the tbody element since 
both are required elements of html and table, respectively, and both 
have optional start- and end-tags, the rules for both must be the same. 
Neither Mozilla nor Opera implies the missing tbody element within 
<table></table>, although IE does.  However, OpenSP does not imply the 
missing elements in either case.

The only documentation I could find that supports this, given the short 
amount of time I have to look, is this paragraph from section 9.2.3 of 
Martin Bryan's SGML and HTML Explained [2] that was explaining how the 
associated example should be parsed.

| The start-tag can be omitted because the absence of this compulsory
| first embedded subelement could be implied by the parser from the
| content model... As soon as it sees a character other than a
| start-tag delimiter (<) it will recognize that the character should be
| preceded by [the start tag].
(For backwards compatibility with legacy parsers, the head probably won't be.)
The head element seems to be implied by Mozilla and IE.  Opera and 
OpenSP correctly don't imply the missing head element.

[1] http://www.hixie.ch/tests/adhoc/html/parsing/compat/viewer.html
[2] http://www.is-thought.co.uk/book/sgml-9.htm#Omitting
--
Lachlan Hunt
http://lachy.id.au/
http://GetFirefox.com/ Rediscover the Web
http://GetThunderbird.com/ Reclaim your Inbox


Re: [whatwg] [html5] tags, elements and generated DOM

2005-04-05 Thread Ian Hickson
On Wed, 6 Apr 2005, Lachlan Hunt wrote:
  
   <script type="text/javascript" src="bar"></script>
   <title>Foo</title>
   
   ..?
  
  If I am not mistaken:
  
 <html><head><script>...</>
 <title>...</></head><body></body></html>
 
 I believe you are mistaken.  A conforming SGML parser will not imply the 
 body element without any content to make it do so.

I meant in existing UAs, not in the spec. According to the HTML spec, the 
handling of the above is completely undefined since it is invalid. (Note 
that something being invalid or non-conformant does _not_ make the 
rendering undefined in most cases in Web Apps 1 / HTML5. That's one of the 
main things I'm making sure of.)


   Is there a BODY element in this document (or, is there always a body
   element?):
   
   <style type="text/css">
    body{ background:lime }
   </style>
   
   ... or this:
   
   <title>Bar</title>
  
  The body will always be implied, though.
 
 Not in a conforming SGML parser, though it seems to be in Mozilla, Opera and
 IE, as I checked using your DOM viewer [1].

Yeah, I meant in browsers, not per SGML.


 However, if the body element were to be automatically implied 
 regardless, then the same would be true of the tbody element since 
 both are required elements of html and table, respectively, and both 
 have optional start- and end-tags, the rules for both must be the same. 
 Neither Mozilla nor Opera implies the missing tbody element within 
 <table></table>, although IE does. However, OpenSP does not imply the 
 missing elements in either case.

tbody is implied if there is a tr there.
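
A minimal sketch of that rule, assuming a table that has already been parsed into an element tree: the helper name imply_tbody is hypothetical, and wrapping every direct tr child in a single tbody is a simplification rather than a description of what any particular browser does.

```python
import xml.etree.ElementTree as ET


def imply_tbody(table):
    """Wrap direct <tr> children of `table` in one <tbody>, if there are any.

    A table with no direct <tr> children is returned unchanged, so an
    empty <table></table> gets no <tbody>.
    """
    rows = [child for child in table if child.tag == "tr"]
    if not rows:
        return table
    position = list(table).index(rows[0])
    tbody = ET.Element("tbody")
    for row in rows:
        table.remove(row)
        tbody.append(row)
    table.insert(position, tbody)
    return table


if __name__ == "__main__":
    table = ET.fromstring("<table><tr><td>x</td></tr></table>")
    print(ET.tostring(imply_tbody(table), encoding="unicode"))
```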

The history behind all these quirks is long and confused. body in 
particular has had an especially colourful past.


  (For backwards compatibility with legacy parsers, the head probably 
  won't be.)
 
 The head element seems to be implied by Mozilla and IE.

Even when there are no elements that imply a head? I meant, e.g., when 
parsing the empty string as HTML. My understanding was that no head 
element was generated in that case.


 Opera and OpenSP correctly don't imply the missing head element.

I'm not sure what you mean by correctly here since an HTML4 document 
without a title is invalid and thus parsing is undefined in HTML4. If 
there is a title then the head must be implied per SGML.

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Re: [whatwg] [html5] tags, elements and generated DOM

2005-04-05 Thread Ian Hickson
On Tue, 5 Apr 2005, Anne van Kesteren wrote:
 Ian Hickson wrote:
   The head element seems to be implied by Mozilla and IE.
  
  Even when there are no elements that imply a head? I meant, e.g.,
  when parsing the empty string as HTML. My understanding was that no
  head element was generated in that case.
 
  data:text/html,
 
 ... generates both in Firefox 1.0 and in recent nightlies:
 
  HTML
   HEAD
   BODY

I stand corrected.

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'