Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-12-15 Thread Benjamin Hawkes-Lewis

Alexey Feldgendler wrote:
> They do mention validation:  
> http://www.anysurfer.be/fr/obtenir-label/procedures-de-labellisation/la-validation/
>   
> -- though I'm not sure they mean "ensuring valid HTML".

I'm afraid they mean validating to /their/ accessibility standards, not
the (X)HTML specifications. (The page in question has 37 XHTML
validation errors.) Some discussion of the reasoning behind not
requiring validation can be found at:

http://veerle.duoh.com/index.php/blog/comments/a_response_from_an_accessibility_consultant_from_blindsurfer/

(NB AnySurfer used to be called BlindSurfer.) 

And for another example of accessibility-orientated developers
decisively rejecting validation, seemingly with RNIB's tacit acceptance,
consider LightMaker (creators of the flagship "accessible" Flash at J.
K. Rowling's website):

http://www.rnib.org.uk/wacblog/news/just-how-accessible-is-the-web-bbc-1s-click-investigates/

I explain why inserting deliberate errors into their own markup was
counter-productive in comments there, so I won't repeat myself here. :)

--
Benjamin Hawkes-Lewis

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-12-15 Thread Alexey Feldgendler

On Fri, 15 Dec 2006 15:49:24 +0600, Benjamin Hawkes-Lewis  
<[EMAIL PROTECTED]> wrote:



I think basic conformance is part and parcel of creating an accessible,
interoperable site; but it's worth noting that there are plenty of
captains of accessibility who reject that viewpoint, e.g.:

http://www.anysurfer.be/


They do mention validation:  
http://www.anysurfer.be/fr/obtenir-label/procedures-de-labellisation/la-validation/  
-- though I'm not sure they mean "ensuring valid HTML".



--
Alexey Feldgendler <[EMAIL PROTECTED]>
[ICQ: 115226275] http://feldgendler.livejournal.com

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-12-15 Thread Rimantas Liubertas


> "Make sure that your TITLE and ALT tags ...

Tags... Right.


ha ha, good catch, how did I miss this one...

Regards,
Rimantas
--
http://rimantas.com/

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-12-15 Thread Anne van Kesteren

On Fri, 15 Dec 2006 11:03:23 +0100, Rimantas Liubertas  
<[EMAIL PROTECTED]> wrote:

"Make sure that your TITLE and ALT tags ...


Tags... Right.


--
Anne van Kesteren

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-12-15 Thread Rimantas Liubertas


Indeed, and, from the broad indications they do give, there's /nothing/
to suggest that they favour conformant markup over non-conformant
markup: "Currently we take into account several factors, including a
given page's simplicity, how much visual imagery it carries and whether
or not its primary purpose is immediately viable with keyboard
navigation."


There is some relevant advice at
http://www.google.com/support/webmasters/bin/answer.py?answer=35769
too.
It doesn't mention accessibility and is oriented towards better
positions in SERPs,
nonetheless:
"Make sure that your TITLE and ALT tags are descriptive and accurate.
Check for broken links and correct HTML. Use a text browser such as
Lynx to examine your site, because most search engine spiders see your
site much as Lynx would. If fancy features such as JavaScript,
cookies, session IDs, frames, DHTML, or Flash keep you from seeing all
of your site in a text browser, then search engine spiders may have
trouble crawling your site."

These won't hurt accessibility either.


Regards,
Rimantas
--
http://rimantas.com/

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-12-15 Thread Benjamin Hawkes-Lewis

Alexey Feldgendler wrote:
> Interesting. But this lacks one important thing: a clear indication of
>  why some page doesn't qualify as accessible. Google seems reluctant to
>  disclose their criteria, and it's a pity.

Indeed, and, from the broad indications they do give, there's /nothing/
to suggest that they favour conformant markup over non-conformant
markup: "Currently we take into account several factors, including a
given page's simplicity, how much visual imagery it carries and whether
or not its primary purpose is immediately viable with keyboard
navigation."

http://labs.google.com/accessible/faq.html

At least they point people towards WCAG, which makes using conformant,
valid (X)HTML a priority two criterion.

I think basic conformance is part and parcel of creating an accessible,
interoperable site; but it's worth noting that there are plenty of
captains of accessibility who reject that viewpoint, e.g.:

http://www.anysurfer.be/

--
Benjamin Hawkes-Lewis

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-12-14 Thread Alexey Feldgendler

On Fri, 15 Dec 2006 13:01:01 +0600, Matthew Paul Thomas <[EMAIL PROTECTED]> 
wrote:

>> Personally, I would *love* Google to do this sort of thing. I just
>> have no hope for it.

> 

Interesting. But this lacks one important thing: a clear indication of why some 
page doesn't qualify as accessible. Google seems reluctant to disclose their 
criteria, and it's a pity.

-- 
Alexey Feldgendler <[EMAIL PROTECTED]>
[ICQ: 115226275] http://feldgendler.livejournal.com

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-12-14 Thread Matthew Paul Thomas


On Dec 7, 2006, at 7:47 PM, Alexey Feldgendler wrote:


On Thu, 07 Dec 2006 05:09:44 +0600, Mike Schinkel 
<[EMAIL PROTECTED]> wrote:


And if these corporations were using content management systems that 
didn't produce standards-based code, you can bet those CMS vendors 
would soon have a new #1 priority, but fast.  And THAT would clean up 
the web quicker than any academic or grass roots effort ever could.

...
As I don't work for Google, I'm not in the right position to say what 
is appropriate for Google to do and what is not. And I'm almost sure 
Hixie is not in that position either.


Personally, I would *love* Google to do this sort of thing. I just 
have no hope for it.

...




--
Matthew Paul Thomas
http://mpt.net.nz/

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-12-08 Thread Matthew Paul Thomas


On Dec 4, 2006, at 6:56 AM, Shadow2531 wrote:

...
Firefox could do the same with the yellow bar that pops up at the top
of the page that says, "The document appears to be XHTML, but is not
well formed. Firefox has reparsed it as HTML for you in an attempt to
handle the errors.", or something like that.


To get an idea of how this would appear to the average human: "The 
document appears to be XZPQR, but is not fizzlebopped. Firefox has 
rewotsited it as ZPQR in an attempt to handle mysterious errors".



...
Sites could have a "Our pages support 'text/html as XML'  handling.
Add us to your browser's text/html -> XML list.".
...


That would be even worse.

--
Matthew Paul Thomas
http://mpt.net.nz/

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-12-07 Thread Mike Schinkel

Alexey Feldgendler wrote:
> Personally, I would *love* Google to do this sort of thing. I 
> just have no hope for it.

Well then, wouldn't it at least make some sense to find the right person in
hopes they might say yes?  Anyway, I'll add it to my backlog list of planned
blog posts (it's a long list. :)

-Mike Schinkel
http://www.mikeschinkel.com/blogs/
http://www.welldesignedurls.org/

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-12-06 Thread Alexey Feldgendler

On Thu, 07 Dec 2006 05:09:44 +0600, Mike Schinkel <[EMAIL PROTECTED]> wrote:

> And if these corporations were using content management systems that didn't
> produce standards-based code, you can bet those CMS vendors would soon have
> a new #1 priority, but fast.  And THAT would clean up the web quicker than
> any academic or grass roots effort ever could.
>
> Anyway, it's always easy to say something won't work, especially when no
> alternate proposals are presented.

As I don't work for Google, I'm not in the right position to say what is 
appropriate for Google to do and what is not. And I'm almost sure Hixie is not 
in that position either.

Personally, I would *love* Google to do this sort of thing. I just have no hope 
for it.

-- 
Alexey Feldgendler <[EMAIL PROTECTED]>
[ICQ: 115226275] http://feldgendler.livejournal.com

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-12-06 Thread Mike Schinkel

Alexey Feldgendler wrote:
>> An interesting idea, but I don't see how Google would benefit from this. 

1.) If the web gets cleaner, it's easier for search engines to inspect
documents
2.) If Google doesn't benefit from a better web, why would they pay Ian to
edit the HTML5 spec?

>> On the other hand, it requires effort, 

Everything worth doing requires effort.

>> and it would set Google somewhat at war with owners of numerous corporate
websites whose image would be spoiled.

Google is already at war with everyone who is not #1 in a search engine
result and wants to be.  What I proposed would be totally objective -- it
would use a standards validator -- and the corporations with websites would be
totally in control to fix any problems found with their not following
standards.  Google would see no fallout from this that would affect them
negatively; instead they'd become a hero to some people that are very
difficult to please.
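
A rough sketch of how such a label could be attached to a result line,
for illustration only. It uses an XML well-formedness check from Python's
standard library as a stand-in for a real standards validator (a much
weaker test than validation), and the function names and sample markup
are hypothetical.

```python
import xml.etree.ElementTree as ET


def conformance_label(markup: str, declared_type: str = "XHTML 1.0") -> str:
    """Return the declared type, adding a warning if the markup is not well-formed XML.

    Assumption: well-formedness is only a proxy here; a real service would
    run a full validator against the declared DOCTYPE.
    """
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return f"{declared_type} (WARNING)"
    return declared_type


def result_line(url: str, markup: str) -> str:
    """Build the last line of a search result in the style proposed in this thread."""
    return f"{url} - Similar pages - {conformance_label(markup)}"


# Hypothetical sample pages: the second one leaves <br> unclosed.
good_page = ('<html xmlns="http://www.w3.org/1999/xhtml"><head><title>t</title></head>'
             '<body><p>ok<br/></p></body></html>')
bad_page = ('<html xmlns="http://www.w3.org/1999/xhtml">'
            '<body><p>oops<br></p></body></html>')

print(result_line("whatwg.org/specs/web-apps/current-work/", bad_page))
```

The point of the sketch is that the check is mechanical: the site owner
can rerun the same test and make the warning go away.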

And if these corporations were using content management systems that didn't
produce standards-based code, you can bet those CMS vendors would soon have
a new #1 priority, but fast.  And THAT would clean up the web quicker than
any academic or grass roots effort ever could.

Anyway, it's always easy to say something won't work, especially when no
alternate proposals are presented. 

-Mike Schinkel
http://www.mikeschinkel.com/blogs/
http://www.welldesignedurls.org/

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-12-05 Thread Alexey Feldgendler

On Sun, 03 Dec 2006 10:00:06 +0600, Mike Schinkel <[EMAIL PROTECTED]> wrote:

> And as I write this email, it's finally come to me one method that would
> work for even the most clueless and apathetic of web publishers: What if
> Google, Yahoo, and Microsoft Live were to display a human-readable string,
> denoting the content type, hyperlinked to a web page that gives the details
> of that content type.  For example, assume that some future version of the
> Web Apps current-work page was written in XHTML 1.0 yet it failed the
> validator; it could look like this (example from Google):
>
>   Web Applications 1.0
>   The list of active formatting elements; 9.2.4.3.3. Creating and
>   inserting HTML elements; 9.2.4.3.4. Closing elements that have
>   implied end tags; 9.2.4.3.5. ...
>   whatwg.org/specs/web-apps/current-work/ - Similar pages - XHTML 1.0
> (WARNING)

An interesting idea, but I don't see how Google would benefit from this. On the 
other hand, it requires effort, and it would set Google somewhat at war with 
owners of numerous corporate websites whose image would be spoiled.


-- 
Alexey Feldgendler <[EMAIL PROTECTED]>
[ICQ: 115226275] http://feldgendler.livejournal.com

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-12-05 Thread Mike Schinkel

>> http://lachy.id.au/log/2005/12/xhtml-beginners

Thanks. BTW, have you viewed your page in IE7?  Your right side menu bar is
displayed at the bottom below your content.  In Firefox, it's okay.

>> >> This means that you lose any benefits that hinge on you only having 
>> >> to ensure targeting XHTML_all.
>> > 
>> > That benefit is so huge it can't even be easily calculated.
>> What benefits are there and what makes them so huge?

As I stated, "less for people to learn."  It's huge because you multiply it
by the number of people needing to learn it, each having to learn
additional material.  Anyway, we have strayed off course. I think Ian has us
back on course.
 
-Mike Schinkel
http://www.mikeschinkel.com/blogs/
http://www.welldesignedurls.org/

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-12-05 Thread Mike Schinkel

>> Turn that example around. Suppose the web server 
>> says the document is a script that should be executed. 
>> Should the client execute it? 

Ah, your interpretation is incorrect. The server says "this is a script" and
the client is required to treat it "as a script." THAT DOESN'T MEAN that the
client must execute it; nay, the client should decide what to do with it,
and a smart client will initiate safety precautions and NOT execute it
unless the user explicitly overrides the safety.  

But the clients shouldn't, for example try to open a script that the server
said was a script in Word or Excel, that is unless the server served as
application/msword or application/vnd.ms-excel, respectively.
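
To make the distinction concrete, here is a minimal sketch of a client
that respects the server's Content-Type for *identifying* the resource
while keeping the decision about what to *do* with it. The action names
and the mapping table are hypothetical, not taken from any real browser.

```python
from enum import Enum


class Action(Enum):
    RENDER = "render as a document"
    ASK_USER = "treat as a script, but ask before running it"
    HAND_OFF = "offer to open in the matching helper application"
    SAVE = "offer to save to disk"


# Hypothetical policy table: the media type comes from the server,
# the action is chosen by the client.
HANDLERS = {
    "text/html": Action.RENDER,
    "application/xhtml+xml": Action.RENDER,
    "text/javascript": Action.ASK_USER,
    "application/msword": Action.HAND_OFF,
    "application/vnd.ms-excel": Action.HAND_OFF,
}


def choose_action(content_type_header: str) -> Action:
    """Pick an action from the server-declared type without sniffing the body."""
    # Drop parameters such as "; charset=utf-8" and normalise case.
    media_type = content_type_header.split(";", 1)[0].strip().lower()
    # Unknown types are never executed or rendered; the client just offers to save.
    return HANDLERS.get(media_type, Action.SAVE)


print(choose_action("text/javascript").value)
```

Note that a script is never second-guessed into Word or Excel, and it is
never run automatically either.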

See: http://www.w3.org/2001/tag/doc/mime-respect-20030709

1 Summary of Key Points
* MIME headers sent by a server are authoritative. [Design
choice]
* User agent behavior that misrepresents the user or the
server is harmful. [Principle]

The document is short and worth reading if you haven't previously read it.

Ultimately we are saying the same thing, but we got there via different
paths. In many cases the path is very important, as it is in this debate and
as described in the referenced document.

-Mike Schinkel
http://www.mikeschinkel.com/blogs/
http://www.welldesignedurls.org/

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-12-05 Thread Alexey Feldgendler

On Thu, 30 Nov 2006 03:42:38 +0600, Sam Ruby <[EMAIL PROTECTED]> wrote:

> What should be the most damning of all is that I found an example on the most
> prominent page on the mozilla.org site.  No one can say that the authors of
> that page didn't make a conscious choice in the DOCTYPE for that page.  No
> one can say that the authors of that page are ignorant. No one can say that
> mozilla has a(n entirely) cavalier attitude towards standards.

I was surprised to find the W3C validator consider this page valid. A bug in 
the validator?


-- 
Alexey Feldgendler <[EMAIL PROTECTED]>
[ICQ: 115226275] http://feldgendler.livejournal.com

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-12-04 Thread Henri Sivonen


On Dec 4, 2006, at 13:14, Mike Schinkel wrote:


Henri Sivonen wrote:

At this point, it is important to realize that
pro-XHTML advocacy


Who are the pro-XHTML advocates; those who want divergence, or those who
want HTML5 to interoperate with XHTML as much as possible?


The usual kind of pro-XHTML advocacy is the kind that talks about the  
benefits of XHTML without specifying what they are but implicitly the  
reasoning is based on a different set of possible documents than what  
the reasoning is being applied to.


Note that some pro-XHTML-as-text/html opinions on this mailing list  
and on Sam Ruby's blog advocating only XHTML_compatible with the  
reasoning based on the properties of that set in particular are  
different from run-of-the-mill XHTML advocacy. However, those  
opinions should be considered strictly with the merits of  
XHTML_compatible--not broader XHTML--in mind.



This reasoning is then applied to XHTML
served as text/html. This is logical and
intellectually honest if and only if
XHTML_all equals XHTML_compatible.


That is too abstract for me to follow.


Following it is essential.

It was pointed out to me that XHTML_all was a bad label. XHTML_xml  
would be a better label. It doesn't include XHTML_bogus which  
purports to be XHTML, appears to work when processed as HTML but  
wouldn't work when processed as XHTML. Most XHTML on the Web is in  
the XHTML_bogus set.
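
A small illustration of what membership in XHTML_bogus looks like in
practice, sketched with Python's standard library. Note the assumptions:
html.parser is only a lenient tokenizer, not a browser's HTML parser, and
ElementTree checks well-formedness only, not validity. The sample
document and function names are made up for the example.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Record every start tag the lenient HTML tokenizer sees."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def html_start_tags(markup: str) -> list:
    """Tokenize as tag soup; this does not fail on unclosed elements."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags


def is_well_formed_xml(markup: str) -> bool:
    """Return True only if an XML parser accepts the document."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


# Claims to be XHTML (it carries the namespace) but leaves <br> unclosed.
bogus = ('<html xmlns="http://www.w3.org/1999/xhtml">'
         '<body><p>Line one<br>Line two</p></body></html>')

print(html_start_tags(bogus))      # the HTML side sees all four elements
print(is_well_formed_xml(bogus))   # the XML side rejects the document
```

Changing `<br>` to `<br/>` is enough to make this particular document
well-formed, but that alone does not put it in XHTML_compatible.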


http://hsivonen.iki.fi/img/xhtml-venn.png


I'll name the difference of XHTML_all and
XHTML_compatible as XHTML_incompatible.
Lachlan gave examples that indicate that
XHTML_incompatible is not empty.


I'm sorry but may I please ask for a reference? I unfortunately don't know
where to find that needle in the haystack.


I was referring to this very thread:
http://listserver.dreamhost.com/pipermail/whatwg-whatwg.org/2006-December/008272.html



Instead, you have to *make an effort* to
make sure that your documents fall into
XHTML_compatible.


That's fair and reasonable to require.


That may be, but the point is that it undermines the usual XHTML  
argument relating to the "Benefits of XHTML" on the markup  
*production* side. The usual "benefit" is that you can use "XML  
tools". However, having *generic* XML tools does not guarantee that  
they only produce documents that are in XHTML_compatible. Once you  
make sure that the tools *aren't* generic but are instead guaranteed
to produce output that belongs in XHTML_compatible, the work you will have
done is almost the same as tuning an XML toolset to produce HTML5.



If your documents fell into
XHTML_incompatible, things would
*break*, which would be *bad*.


I'm not sure that I agree with the assertion that it would be bad (or that
it would be worse than the alternative currently proposed.)


You don't agree that it is bad if stuff breaks? As in "does not even  
appear to work".



This means that you lose any benefits
that hinge on you only having to
ensure targeting XHTML_all.


That is a sweeping statement that minimally discounts the significant
benefit of having less for people to learn. That benefit is so huge it can't
even be easily calculated.


How is learning to target XHTML_compatible less for people to learn?  
It isn't enough to understand how HTML processing works and it isn't  
enough to understand how XML processing works. Rather, one would need  
to understand both these and how to avoid the differences in the  
associated CSS and scripting behavior.



unless you specifically want to participate in
upholding a political appearance that doesn't
match the technical reality and in doing so
confuse newbies into believing that the political
obfuscation is the truth (which leads them to
waste time on finding out the truth the hard way).


From where I sit the only reason it would be untrue is because of a
contingent trying to make it untrue and not willing to steer HTML5 in a
direction more compatible with XHTML.

My values involve acknowledging legacy realities, wanting ability to use
XML tools with conforming HTML5 documents after a lossless conversion and
eschewing political obfuscation of technical realities.


I'm in 100% agreement with those, which means that there must be further
hidden values where we differ, possibly unconscious values even.


Judging from your messages to this mailing list, it appears that we  
do not agree on the legacy point 100 percent. It seems to me that you  
believe that Hixie has more freedom in defining HTML5 than he  
actually has. The constraints placed by legacy content and browsers  
already out there are not open for the editor of the spec to overrule.


--
Henri Sivonen
[EMAIL PROTECTED]
http://hsivonen.iki.fi/

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-12-04 Thread Shadow2531


On 12/4/06, Mike Schinkel <[EMAIL PROTECTED]> wrote:

Shadow2531:

Sounds like you are in agreement. But can I ask you to summarize what you'd
propose?


Not sure if I can summarize, but I can be more specific by example.

Example browser preferences:
(Default value is first value)

[Markup handling preferences]
html_with_xhtml_xmlns = parse_as_html or parse_as_xml or
parse_as_xml_only_if_in_list
xml_parser_error_fallback = show_link_to_fallback_to_html or
direct_fallback_to_html[1] or no_fallback or
show_link_even_if_no_error

[parse html as xml list]
somesite.com
someothersite.com
server.domain.tld
http://somesite.com/well-formed_xhtml_markup.html

1. A direct fallback to html would not cause a loop back to the xml
parser for an html page that was set to be parsed as XML and wasn't
well formed.
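
As a rough sketch, the preferences above could be resolved along these
lines. The option names are the ones from the example; the function, the
way the list is matched (exact URL or host name), and the sample URLs are
assumptions made purely for illustration.

```python
from urllib.parse import urlsplit


def choose_parser(url: str, prefs: dict, xml_list: set) -> str:
    """Decide how to parse a text/html page that declares the XHTML namespace.

    Assumption: the caller has already sniffed the xmlns; this only applies
    the user's preference. The first listed value, parse_as_html, is the default.
    """
    mode = prefs.get("html_with_xhtml_xmlns", "parse_as_html")
    if mode == "parse_as_xml":
        return "xml"
    if mode == "parse_as_xml_only_if_in_list":
        host = urlsplit(url).hostname or ""
        # A list entry may be a whole site (host name) or a single page (full URL).
        if url in xml_list or host in xml_list:
            return "xml"
    return "html"


prefs = {"html_with_xhtml_xmlns": "parse_as_xml_only_if_in_list"}
xml_list = {
    "somesite.com",
    "server.domain.tld",
    "http://somesite.com/well-formed_xhtml_markup.html",
}

print(choose_parser("http://somesite.com/page.html", prefs, xml_list))
print(choose_parser("http://elsewhere.example/", prefs, xml_list))
```

With no preferences set at all, every page falls through to "html", which
is the "off by default" behaviour.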

Specifically, I was mentioning that settings like the following would
be a use case for XMLisms in text/html.

[Markup handling preferences]
html_with_xhtml_xmlns = parse_as_xml_only_if_in_list
xml_parser_error_fallback = show_link_to_fallback_to_html

[parse html as xml list]
Some site or page that only serves text/html, but can be properly parsed as XML.

As you can see, it wouldn't bother anyone that didn't care (as it'd be
off by default), but for those who care and want XHTML markup treated
as XML even for text/html ( and local .html and .htm), that would be a
use case.

So, my point was that we wouldn't need a text/html5 mime type (for
example) as we could reuse the text/html type (performance issues
aside). I don't think adding a new type would help as it's probably
not compatible, but text/html is.

It's not possible to fully get rid of mime type dependability yet, but
judging from "Who cares what mime type it uses, let's treat it for
what it is, if possible" comments on the list, the above would have
its use. And, because the above would have its use, I can see the
usefulness of some *partial* merging of XHTML5 and HTML5.

So, I agree on the usefulness of treating xhtml markup as XML at
will. I'm just not sure that it'll work well enough, and many, including
Ian, have already strongly suggested that it would not work well
enough.

The question probably is: if the xhtml markup is being sent as
text/html and works fine as text/html, why treat it as XML?
For me personally, I like the strictness of XML and its other rules. I
want xhtml markup to blow up if there's an error so it can be fixed.
Others, who want to use XML tools on XHTML markup (regardless of
mime type), want the errors fixed too.

Don't get me wrong and it may seem contrary to what I've said above
(just being open minded), I'm fine with serving as
application/xhtml+xml and calling it a day. I don't mind serving using
HTML markup as text/html and calling it a day. But, doing both,
especially with the same markup, I am not interested in usually.
However, I'd still like to handle pages made by others that do it.

--
burnout426

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-12-04 Thread Elliotte Harold


Mike Schinkel wrote:


Hmm. I believe the http standard states that clients are not supposed to
override a content-type given by a server. For example, a web page showing a
script virus shouldn't be identified by the client as a script and executed;
the client should instead just display it as a web page like the server told
it to.  Or am I missing your context?



Turn that example around. Suppose the web server says the document is a 
script that should be executed. Should the client execute it?


Of course not. Security demands that the client not execute the script 
in both cases: when the server says it is a script and when the server 
says it isn't.


Security requires that the client be in control of decisions about what 
the client does.


There are also many good nonsecurity reasons for putting the client in 
control.


--
Elliotte Rusty Harold  [EMAIL PROTECTED]
Java I/O 2nd Edition Just Published!
http://www.cafeaulait.org/books/javaio2/
http://www.amazon.com/exec/obidos/ISBN=0596527500/ref=nosim/cafeaulaitA/

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-12-04 Thread Lachlan Hunt


Mike Schinkel wrote:

Henri Sivonen wrote:
I'll name the difference of XHTML_all and XHTML_compatible as 
XHTML_incompatible. Lachlan gave examples that indicate that 
XHTML_incompatible is not empty.


I'm sorry but may I please ask for a reference? I unfortunately don't 
know where to find that needle in the haystack. Or did you mean Ian 
Hickson?: http://hixie.ch/advocacy/xhtml


No, he meant the list of examples that demonstrate the kinds of errors
millions of authors make when attempting to use XHTML as text/html.

http://listserver.dreamhost.com/pipermail/whatwg-whatwg.org/2006-December/008272.html

FWIW, that list was based on this old article of mine which has a lot 
more information and discussion in it.


http://lachy.id.au/log/2005/12/xhtml-beginners

This means that you lose any benefits that hinge on you only having 
to ensure targeting XHTML_all.


That benefit is so huge it can't even be easily calculated.


What benefits are there and what makes them so huge?


Lachlan Hunt wrote:

http://www.w3.org/mid/[EMAIL PROTECTED]


In that email you wrote: 

	"My point is that the whole idea of embedding 
	XML in HTML is nonsense and  should have no 
	part in any transition from HTML to XML.  I'll be 
	explaining this last point more in a future post."


Have you written that post yet, and if so may I have the reference?


http://www.w3.org/mid/[EMAIL PROTECTED]

--
Lachlan Hunt
http://lachy.id.au/

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-12-04 Thread Mike Schinkel

Henri Sivonen wrote:
>> At this point, it is important to realize that 
>> pro-XHTML advocacy 

Who are the pro-XHTML advocates; those who want divergence, or those who
want HTML5 to interoperate with XHTML as much as possible?

>> This reasoning is then applied to XHTML 
>> served as text/html. This is logical and 
>> intellectually honest if and only if 
>> XHTML_all equals XHTML_compatible.

That is too abstract for me to follow.

>> I'll name the difference of XHTML_all and 
>> XHTML_compatible as XHTML_incompatible. 
>> Lachlan gave examples that indicate that 
>> XHTML_incompatible is not empty. 

I'm sorry but may I please ask for a reference? I unfortunately don't know
where to find that needle in the haystack. Or did you mean Ian Hickson?:
http://hixie.ch/advocacy/xhtml

>> Now if you wish to serve your documents 
>> as text/html, it follows that you can't just 
>> happily do things that guarantee that your 
>> documents are members of XHTML_all. 

Which, point of note, wasn't the proposal (at least not mine.)

>> Instead, you have to *make an effort* to 
>> make sure that your documents fall into 
>> XHTML_compatible. 

That's fair and reasonable to require.

>> The equality of XHTML_all and 
>> XHTML_compatible is not true--it is 
>> political obfuscation to hide an 
>> inconvenient truth. 

I'm certainly not trying to obfuscate. 

>> If your documents fell into 
>> XHTML_incompatible, things would 
>> *break*, which would be *bad*. 

I'm not sure that I agree with the assertion that it would be bad (or that
it would be worse than the alternative currently proposed.)

>> This means that you lose any benefits 
>> that hinge on you only having to 
>> ensure targeting XHTML_all.

That is a sweeping statement that minimally discounts the significant
benefit of having less for people to learn. That benefit is so huge it can't
even be easily calculated.

>> unless you specifically want to participate in 
>> upholding a political appearance that doesn't 
>> match the technical reality and in doing so 
>> confuse newbies into believing that the political 
>> obfuscation is the truth (which leads them to 
>> waste time on finding out the truth the hard way).

From where I sit the only reason it would be untrue is because of a
contingent trying to make it untrue and not willing to steer HTML5 in a
direction more compatible with XHTML.

>> My values involve acknowledging legacy realities, wanting ability to use
XML tools with conforming HTML5 documents after a lossless conversion and
eschewing political obfuscation of technical realities.

I'm in 100% agreement with those, which means that there must be further
hidden values where we differ, possibly unconscious values even.  Or maybe
it is because you don't value things I value including minimizing the need
to choose one or the other that doesn't allow later change, minimizing the
need to learn differences, and empowering as many people as possible to
author content.

>> What was sold to you was XHTML_all. Not that 
>> you have to know how to avoid 
>> XHTML_incompatible.

Not exactly. I was sold on having one direction, not two.

-Mike Schinkel
http://www.mikeschinkel.com/blogs/
http://www.welldesignedurls.org/

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-12-03 Thread Mike Schinkel

Elliotte Harold wrote:
> Mike Schinkel wrote:
> > Sounds like we need content-types determined by inspection 
> > on web servers?(which would really slow-down serving pages, 
> > unless they could be cached, but with so much dynamic 
> > generated content that doesn't seem realistic...)
> No, I don't think so. There's nothing wrong with a server specifying 
> the content-type it likes for a document, though that decision 
> should be in the hands of the document author, not the server 
> administrator. That's a design flaw in a lot of web servers and 
> server installations today.

I think you missed my point, which was that authors should be in control. I
strongly believe in the need for web authors to control content types, and
I'm even doing some research in the area at the moment.

>> My point is that the client gets to decide how it will process the 
>> incoming document. The server can suggest but it can't demand. 
>> If the client wants to process the incoming document as XML, 
>> HTML, plain text, or JPEG, that's its choice. Different clients will 
>> have different needs and thus make different choices.

Hmm. I believe the http standard states that clients are not supposed to
override a content-type given by a server. For example, a web page showing a
script virus shouldn't be identified by the client as a script and executed;
the client should instead just display it as a web page like the server told
it to.  Or am I missing your context?

-Mike Schinkel
http://www.mikeschinkel.com/blogs/
http://www.welldesignedurls.org/

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-12-03 Thread Mike Schinkel

Shadow2531:

Sounds like you are in agreement. But can I ask you to summarize what you'd
propose?

-Mike Schinkel
http://www.mikeschinkel.com/blogs/
http://www.welldesignedurls.org/

-Original Message-
From: Shadow2531 [mailto:[EMAIL PROTECTED] 
Sent: Sunday, December 03, 2006 12:57 PM
To: Mike Schinkel
Cc: Elliotte Harold; Lachlan Hunt; WHAT WG List
Subject: Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

On 12/2/06, Mike Schinkel <[EMAIL PROTECTED]> wrote:
> On another note, why not another new content type, one that would mean?:
>
> "Striving to be XHTML, but if not consider me HTML5. And if 
> that doesn't work, try HTML 4.01."

If you're implying, "Treat as XML and if that fails":

One of the cool things about Opera is that if it encounters a broken xml
document, you can reparse it as HTML with a click of a link.

That means, there could be an option for browsers that support
application/xhtml+xml to treat text/html as xml ( by sniffing for a xmlns or
some other way) and then either directly or non-directly fall back to html
if there's an error.

Firefox could do the same with the yellow bar that pops up at the top of the
page that says, "The document appears to be XHTML, but is not well formed.
Firefox has reparsed it as HTML for you in an attempt to handle the
errors.", or something like that.

The problem with that would be that a lot of well-formed XHTML markup would
break if treated as XML ( because of casing rules etc.), so you'd still want
an option to reparse as HTML even if there were no markup errors.

Sites could have a "Our pages support 'text/html as XML'  handling.
Add us to your browser's text/html -> XML list.".

One point is that stuff like that could be done in a slick way with the
text/html type instead of a new type as we already have problems with 2
types. (not that I believe this idea would be well accepted or
practical)

Just mentioning this though, I realize everyone's thinking, "Users would
just turn that feature off and what's the point anyway etc.???", but I do
see some benefit in the idea as a developers tool, to spread XHTML awareness
and to provide XHTML benefits with just using text/html.

To be on topic, the other point is, that describes a (far fetched) use case
for those suggesting  *partial* integration of XML stuff in HTML5.

--
burnout426

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-12-03 Thread Mike Schinkel

Thanks for the link.

Serdar Kilic wrote:
>> Ian outlines why sending XHTML as HTML is harmful in his article at: 
>> http://www.hixie.ch/advocacy/xhtml 

Thanks for the link.  Give me a chance to digest it. :)

-Mike Schinkel
http://www.mikeschinkel.com/blogs/
http://www.welldesignedurls.org/

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-12-03 Thread Shadow2531


On 12/2/06, Mike Schinkel <[EMAIL PROTECTED]> wrote:

On another note, why not another new content type, one that would mean?:

"Striving to be XHTML, but if not consider me HTML5. And if that
doesn't work, try HTML 4.01."


If you're implying, "Treat as XML and if that fails":

One of the cool things about Opera is that if it encounters a broken
xml document, you can reparse it as HTML with a click of a link.

That means, there could be an option for browsers that support
application/xhtml+xml to treat text/html as xml ( by sniffing for a
xmlns or some other way) and then either directly or non-directly fall
back to html if there's an error.
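A rough sketch of that "try XML first, fall back to HTML" decision, in Python. The function name choose_parser and the returned mode strings are made up for illustration, and checking for the XHTML namespace string is only one possible way of "sniffing"; no browser is claimed to work this way.

```python
import xml.etree.ElementTree as ET

# Namespace string used here as the (assumed) sniffing signal.
XHTML_NS = "http://www.w3.org/1999/xhtml"


def choose_parser(body: str) -> str:
    """Decide how a text/html response body would be handled.

    Returns "xml" if the body mentions the XHTML namespace and is
    well-formed XML, "html-fallback" if it mentions the namespace but
    is not well-formed, and "html" if there is no namespace at all.
    """
    if XHTML_NS not in body:
        return "html"
    try:
        ET.fromstring(body)  # raises ParseError on any well-formedness error
    except ET.ParseError:
        return "html-fallback"
    return "xml"


well_formed = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<body><p>ok</p></body></html>"
)
broken = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<body><p>unclosed<br></p></body></html>"
)

for doc in (well_formed, broken, "<p>plain old html"):
    print(choose_parser(doc))
```

The "html-fallback" case is where the yellow bar described below would appear.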

Firefox could do the same with the yellow bar that pops up at the top
of the page that says, "The document appears to be XHTML, but is not
well formed. Firefox has reparsed it as HTML for you in an attempt to
handle the errors.", or something like that.

The problem with that would be that a lot of well-formed XHTML markup
would break if treated as XML ( because of casing rules etc.), so
you'd still want an option to reparse as HTML even if there were no
markup errors.

Sites could have an "Our pages support 'text/html as XML' handling.
Add us to your browser's text/html -> XML list.".

One point is that stuff like that could be done in a slick way with
the text/html type instead of a new type as we already have problems
with 2 types. (not that I believe this idea would be well accepted or
practical)

Just mentioning this though, I realize everyone's thinking, "Users
would just turn that feature off and what's the point anyway etc.???",
but I do see some benefit in the idea as a developer's tool, to spread
XHTML awareness and to provide XHTML benefits with just using
text/html.

To be on topic, the other point is, that describes a (far fetched) use
case for those suggesting  *partial* integration of XML stuff in
HTML5.

--
burnout426

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-12-03 Thread David Håsäther


On 2006-12-02 17:42, Elliotte Harold wrote:


First of all, I believe there was only ever one parser that
implemented all of the SGML specification, SP from James Clark.


Nope, not even SP implemented the whole standard, e.g. not DATATAG and 
CONCUR, see .


Just a little parenthesis.

--
David Håsäther

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-12-03 Thread Henri Sivonen


On Dec 3, 2006, at 14:54, Henri Sivonen wrote:

(which leads them to waste time on finding out the truth the hard  
way).


Which is *bad*.

--
Henri Sivonen
[EMAIL PROTECTED]
http://hsivonen.iki.fi/

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-12-03 Thread Henri Sivonen


On Dec 3, 2006, at 11:00, Mike Schinkel wrote:


All I've heard is that people are saying and doing things that are
incorrect. That means you are assuming that, above all else,  
whatever people

say and do must be correct. In this specific case, I challenge that
assumption. I think the results of taking the medicine you are  
proposing

will be far worse than living with the disease.


First, there's XHTML--all of it the way it works as
application/xhtml+xml. I'll call it XHTML_all. Then there's a subset of XHTML that  
when served as text/html to a browser that handles text/html  
according to requirements imposed by the real-world legacy still  
appears to "work" for the casual observer. I'll call this  
XHTML_compatible.


At this point, it is important to realize that pro-XHTML advocacy is  
based on reasoning derived from the properties of XHTML_all when it  
is processed as application/xhtml+xml. This reasoning is then applied  
to XHTML served as text/html. This is logical and intellectually  
honest if and only if XHTML_all equals XHTML_compatible.


I'll name the difference of XHTML_all and XHTML_compatible as  
XHTML_incompatible. Lachlan gave examples that indicate that  
XHTML_incompatible is not empty. Hence, XHTML_compatible is a proper  
subset of XHTML_all.
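
The same argument in set notation, as a restatement only; the symbols are just the names introduced in this message:

```latex
\begin{align*}
\mathrm{XHTML}_{\text{compatible}} &\subseteq \mathrm{XHTML}_{\text{all}} \\
\mathrm{XHTML}_{\text{incompatible}} &= \mathrm{XHTML}_{\text{all}} \setminus \mathrm{XHTML}_{\text{compatible}} \\
\mathrm{XHTML}_{\text{incompatible}} \neq \emptyset &\implies \mathrm{XHTML}_{\text{compatible}} \subsetneq \mathrm{XHTML}_{\text{all}}
\end{align*}
```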


Now if you wish to serve your documents as text/html, it follows that  
you can't just happily do things that guarantee that your documents  
are members of XHTML_all. Instead, you have to *make an effort* to  
make sure that your documents fall into XHTML_compatible. The  
equality of XHTML_all and XHTML_compatible is not true--it is  
political obfuscation to hide an inconvenient truth. If your  
documents fell into XHTML_incompatible, things would *break*, which  
would be *bad*. This means that you lose any benefits that hinge on  
you only having to ensure targeting XHTML_all.


If you are making the text/html compatibility effort, you might as  
well adjust your effort to producing HTML5 instead of  
XHTML_compatible, unless you specifically want to participate in  
upholding a political appearance that doesn't match the technical  
reality and in doing so confuse newbies into believing that the  
political obfuscation is the truth (which leads them to waste time on  
finding out the truth the hard way).


If I am correct in my assessment then the best thing for all parties would
be to make their *values* clear to each other.


My values involve acknowledging legacy realities, wanting ability to  
use XML tools with conforming HTML5 documents after a lossless  
conversion and eschewing political obfuscation of technical realities.


That's an excellent point. My answer is that I was sold on the benefits of
XHTML, and I still believe in them so I don't want to give up on the hope
that I can eventually get there.


What was sold to you was XHTML_all. Not that you have to know how  
to avoid XHTML_incompatible.


--
Henri Sivonen
[EMAIL PROTECTED]
http://hsivonen.iki.fi/

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-12-03 Thread Elliotte Harold


Lachlan Hunt wrote:

As Henri has outlined in his article on producing XML, using string 
concatenation and print statements to produce XML is a mistake. 
WordPress and MediaWiki fall into this category and it has proven to be 
a fatal architectural flaw in their design.  CMSs built like that would 
find it easier to migrate to HTML5.


For the record, I'm not convinced that's true. I think the flaw in the 
mentioned products is an inadequate test suite, not the use of string 
concatenation and print statements. Unlike XML input, XML output really 
isn't all that hard. It doesn't always have to be delegated to a library.
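
A minimal sketch of what that can look like in Python, assuming the generator escapes every interpolated value and a test re-parses the output. The helper name link is hypothetical; escape and quoteattr are from the standard library's xml.sax.saxutils.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr


def link(href: str, text: str) -> str:
    """Build an <a> element by string concatenation.

    quoteattr() escapes the attribute value and adds the surrounding
    quotes; escape() handles &, < and > in character data.
    """
    return "<a href=" + quoteattr(href) + ">" + escape(text) + "</a>"


# The kind of check a test suite would run: the output must parse as
# XML and round-trip the original values unchanged.
href = '/search?q=a&b="c"'
text = "Tom & Jerry <3"
element = ET.fromstring(link(href, text))
assert element.get("href") == href
assert element.text == text
```

The concatenation itself is not where the safety comes from; the escaping calls and the re-parse check are.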


--
Elliotte Rusty Harold  [EMAIL PROTECTED]
Java I/O 2nd Edition Just Published!
http://www.cafeaulait.org/books/javaio2/
http://www.amazon.com/exec/obidos/ISBN=0596527500/ref=nosim/cafeaulaitA/

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-12-03 Thread Elliotte Harold


Mike Schinkel wrote:


Sounds like we need content-types determined by inspection on web servers?
(which would really slow-down serving pages, unless they could be cached,
but with so much dynamic generated content that doesn't seem realistic...)


No, I don't think so. There's nothing wrong with a server specifying the 
content-type it likes for a document, though that decision should be in 
the hands of the document author, not the server administrator. That's a 
design flaw in a lot of web servers and server installations today.


My point is that the client gets to decide how it will process the 
incoming document. The server can suggest but it can't demand. If the 
client wants to process the incoming document as XML, HTML, plain text, 
or JPEG, that's its choice. Different clients will have different needs 
and thus make different choices.


--
Elliotte Rusty Harold  [EMAIL PROTECTED]
Java I/O 2nd Edition Just Published!
http://www.cafeaulait.org/books/javaio2/
http://www.amazon.com/exec/obidos/ISBN=0596527500/ref=nosim/cafeaulaitA/

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-12-03 Thread Mike Schinkel

Thanks for the detailed reply. 

>> Lachlan Hunt wrote:
>> > 4.) Currently offering a CMS/web app generates HTML(4) 
>> > using string concatenation, with plans to move it to XHTML
>> > 5.) Currently offering a CMS/web app generates HTML(4) and 
>> > XHTML both using string concatenation
>> 
>> As Henri has outlined in his article on producing XML, using string 
>> concatenation and print statements to produce XML is a mistake. 
>> WordPress and MediaWiki fall into this category and it has proven 
>> to be a fatal architectural flaw in their design.  CMSs built like that 
>> would find it easier to migrate to HTML5.

Just because you tell people they should do it a certain way doesn't mean
they will. I had hesitated to use this analogy but here goes: Telling
Africans (or teenagers for that matter) they shouldn't have sex outside of a
marital relationship (before marriage or otherwise) isn't going to stop a
lot of them from doing it and hence isn't going to stop the spread of AIDS
in Africa (and elsewhere.)  Prohibition just doesn't work; better to be
pragmatic rather than adhere to an ideology in the face of evidence to the
contrary.

I guess what I'm seeing is that the position you and some others on this
list are taking will, I believe, create significant interoperability
problems on the Internet. My understanding is that it is in the charter of
the W3C and by extension the WHATWG to guard against such interoperability
problems at all costs.  Being idealistic is great, but not when it could
create huge problems on the Internet that could otherwise have been avoided.
It's much better to be pragmatic and focus on optimizing for what *will*
happen rather than target what you *want* to happen (look at Iraq, it didn't
work for Bush either. ;-)

>> e.g. In XHTML:
>> 
>> A paragraph containing
>>
>>  a list
>>  of items
>>
>> 
>> 
>> Cannot be accurately represented in the HTML serialisation, as it 
>> would result in the following:
>> 
>> A paragraph containing
>>
>>  a list
>>  of items
>>

Why not solve that problem by converting 

A paragraph containing

  a list
  of items

To something like this:

A paragraph containing

  a list
  of items

Which can easily be round-tripped.  My solution is ugly, but then it is
solving an even uglier problem. "__xhtml_paragraph" is unique enough that
it shouldn't clash with anything else.

>> If wanting to author in HTML and reserialise as XHTML, 
>> there is also an issue with using the <noscript> element, 
>> as it is forbidden in XHTML5.

XHTML5? Do you mean XHTML?   Anyway, can't you encode it using comments if
serialized to XHTML for round-tripping?  Certainly it can't have any
behavior in XHTML, but then when there are technical constraints (as opposed
to constraints on principle) limitations are acceptable because they can't
be helped.

-Mike Schinkel
http://www.mikeschinkel.com/blogs/
http://www.welldesignedurls.org/

P.S. Any chance of splitting
http://www.whatwg.org/specs/web-apps/current-work/ into multiple files?  It
overwhelms IE7 so much as to be practically unusable (Yes I have FireFox,
but it's not my default browser.)

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-12-03 Thread Mike Schinkel

Lachlan Hunt wrote:
>> Why do I need to keep repeating myself? 

Because either I'm new to the list and didn't hear it, or because your
justifications aren't making sense to me.

>> Because you're just pretending it's XML when it's not.  
>> That's like asking what's wrong with sending HTML 
>> as text/plain?!
>>
>> The fact that it's no different from what we have 
>> today is exactly what the problem is.  Authors that 
>> think they're using XHTML and making bogus claims 
>> about its benefits even when served as text/html 
>> doesn't help anyone, and is in fact contrary to the 
>> philosophy of XML, particularly in relation to error 
>> handling.

All I've heard is that people are saying and doing things that are
incorrect. That means you are assuming that, above all else, whatever people
say and do must be correct. In this specific case, I challenge that
assumption. I think the results of taking the medicine you are proposing
will be far worse than living with the disease.

>> You're missing the point.  Those that write XHTML 
>> badly will no doubt write HTML badly as well.  But 
>> the point is that by using HTML, at least they won't 
>> be lying to themselves about the benefits they've 
>> gained from using XHTML!

I think it comes down to different values, which is the main cause of
impassioned debates (arguing over scarce resources is usually not as
impassioned.)  It appears you value above all else fidelity with regards to
content/media type.  On the other hand, although I respect your goal and
think it is a goal to approach, I value the outcome more than the absolute
adherence to conformity.  (Have I understood the situation correctly?)

If I am correct in my assessment then the best thing for all parties would
be to make their *values* clear to each other. Then, each respecting that
the other's values are important to them, work toward a solution that
optimizes the values for all parties. And compromises might be required in
order not to completely ignore the values of any given reasonable
contingent.  BTW, if this weren't the global Internet, it wouldn't be so
important to address the different values for everyone. :-)

BTW, I'm curious whether TimBL has recently weighed in on this topic, namely
whether HTML5 is divergent from or convergent with XHTML.

>> However, that's not the question you should be 
>> asking.  Instead, your question should be if XHTML 
>> isn't supported adequately for your needs on today's 
>> internet, but HTML is, why should you bother trying 
>> to output XHTML at all?

That's an excellent point. My answer is that I was sold on the benefits of
XHTML, and I still believe in them so I don't want to give up on the hope
that I can eventually get there. And I'll bet a lot of other web developers
feel the same way.  I'd like to see an HTML5 that runs parallel to the same
track that HTML runs on so that, someday, I might be able to jump tracks.

-Mike Schinkel
http://www.mikeschinkel.com/blogs/
http://www.welldesignedurls.org/

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-12-03 Thread Serdar Kilic


Hi Mike,
Ian outlines why sending XHTML as HTML is harmful in his article at:

http://www.hixie.ch/advocacy/xhtml

On 03/12/2006, at 4:51 PM, Mike Schinkel wrote:


The following are honest questions, not rhetorical baiting.

Lachlan Hunt wrote:

Use XHTML, send it with an HTML MIME type, and be happy.

No!


Why not?  What's wrong with doing that?


Regards,
Serdar Kilic

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-12-02 Thread Lachlan Hunt


Mike Schinkel wrote:

Lachlan Hunt wrote:

Use XHTML, send it with an HTML MIME type, and be happy.

No!


	Why not?  What's wrong with doing that?  


Why do I need to keep repeating myself?


Lachlan Hunt wrote:
In many more cases, an HTML document or even an 
XHTML 1.0 as text/html document is just tag soup.


What's wrong with that?


Because you're just pretending it's XML when it's not.  That's like asking 
what's wrong with sending HTML as text/plain?!



Lachlan Hunt wrote:
There were a few proprietary, incompatible, buggy engines 
locked up in various browsers; and that was about it.

OpenSP, which is free software,


Will a recommendation to use OpenSP be included in the spec?


That applied to authoring HTML 2.0 to 4.01, which are based on SGML. 
HTML5 is not based on SGML, so that doesn't apply.



Lachlan Hunt wrote:
Because the fact is that when authors try to use XHTML as 
text/html, they inevitably fail to do so properly.

...
There is significant evidence to show that millions of authors 
make those mistakes very frequently, despite thinking they're 
using XHTML.


Again, why is this a problem?  It is no different than we have today.


The fact that it's no different from what we have today is exactly what 
the problem is.  Authors that think they're using XHTML and making bogus 
claims about its benefits even when served as text/html doesn't help 
anyone, and is in fact contrary to the philosophy of XML, particularly in 
relation to error handling.



Maybe I should ask a different question. If people write XHTML badly, what
makes you think they will write HTML5 any better?


You're missing the point.  Those that write XHTML badly will no doubt 
write HTML badly as well.  But the point is that by using HTML, at least 
they won't be lying to themselves about the benefits they've gained from 
using XHTML!



As I understand it, serving with the correct mime type for XHTML isn't an
option, assuming you want people to be able to read it with current
browsers, or am I wrong on that?
...
And what MIME type should he be using that will work on today's Internet?


Either application/xml or application/xhtml+xml are recommended.

However, that's not the question you should be asking.  Instead, your 
question should be if XHTML isn't supported adequately for your needs on 
today's internet, but HTML is, why should you bother trying to output 
XHTML at all?


--
Lachlan Hunt
http://lachy.id.au/

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-12-02 Thread Lachlan Hunt


Mike Schinkel wrote:

So what guidance would you publish after HTML5 is released with regards to
people in each of the following situations:


Note: Where I refer to outputting HTML below, I assume the use of 
text/html.  Where I refer to outputting XHTML below, I assume the use of 
an XML MIME type.  XHTML5 is not an option for content served as text/html.



1.) Currently coding HTML(4) but trying to move to XHTML


Those who are more comfortable with HTML4 will find it easier to move 
directly to HTML5.  Such authors can choose to migrate to XHTML if they 
wish, but to do so requires that they begin authoring using XML tools, 
or at least testing in a browser that actually supports XHTML using an 
XML parser.



2.) Currently coding XHTML and cleaning up only HTML(4)
3.) Currently coding only in XHTML


That depends how they're coding XHTML.  If they're making the common 
beginner mistake of coding XHTML under text/html conditions (for which 
there is significant evidence proving it to be a fatal mistake), then 
such authors should migrate to HTML5.


If they're authoring with XML tools under XML conditions, then they may 
choose to migrate to XHTML5, but the final decision of what to use would 
still depend on how they intend to serve the content.



4.) Currently offering a CMS/web app generates HTML(4) using string
concatenation, with plans to move it to XHTML
5.) Currently offering a CMS/web app generates HTML(4) and XHTML both using
string concatenation


As Henri has outlined in his article on producing XML, using string 
concatenation and print statements to produce XML is a mistake. 
WordPress and MediaWiki fall into this category and it has proven to be 
a fatal architectural flaw in their design.  CMSs built like that would 
find it easier to migrate to HTML5.


http://hsivonen.iki.fi/producing-xml/#dontprint


6.) Currently offering a CMS/web app generates HTML(4) with string
concatenation and XHTML with an XML pipeline


CMSs that do use a proper XML pipeline may choose to migrate to XHTML5. 
 However, such CMSs would, in reality, still be required to be able to 
output HTML5 given current browser limitations with XHTML.  But rather 
than using string concatenation to generate the HTML, it would be more 
effective to put an HTML5 serialiser on the end of the XML pipeline to 
output HTML5 from the XHTML source.
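
As an illustration only, here is a small Python sketch of a serialiser stage at the end of an XML pipeline. The function name xhtml_to_html is hypothetical, and the sketch relies on the standard library's ElementTree "html" output method, which is not a full HTML5 serialiser; it merely shows empty elements such as br being written without a trailing slash or end tag.

```python
import xml.etree.ElementTree as ET


def xhtml_to_html(xhtml: str) -> str:
    """Parse well-formed XHTML and re-serialise it as HTML text."""
    root = ET.fromstring(xhtml)
    # Drop the namespace part of each tag name so the output uses plain
    # names such as "p" rather than prefixed names.
    for element in root.iter():
        if isinstance(element.tag, str) and element.tag.startswith("{"):
            element.tag = element.tag.split("}", 1)[1]
    return ET.tostring(root, encoding="unicode", method="html")


source = '<p xmlns="http://www.w3.org/1999/xhtml">Line one<br/>line two</p>'
print(xhtml_to_html(source))
```

The point of the design is that the source stays in one well-formed format and the text/html output is derived mechanically, rather than maintained as a second string-concatenation code path.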


Authors wishing to output both formats depending on the browser's 
support, based on the Accept header, would also need to be aware of the 
issues involved with writing scripts and stylesheets that work correctly 
in either format, and also aware of the conditions under which 
reserialisation as HTML will result in a slightly different document.
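
A simplified sketch of that Accept-header decision in Python. The function name pick_media_type is hypothetical, and the rule used here (serve XHTML only when the client lists application/xhtml+xml explicitly with a non-zero q value, otherwise text/html) is an assumption for illustration; it does not compare q values between types and treats a bare */* as not indicating XHTML support.

```python
def pick_media_type(accept: str) -> str:
    """Choose the response media type from an Accept header value."""
    for item in accept.split(","):
        parts = [part.strip() for part in item.split(";")]
        media_type = parts[0].lower()
        quality = 1.0  # default when no q parameter is given
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat a malformed q as "not acceptable"
        if media_type == "application/xhtml+xml" and quality > 0:
            return "application/xhtml+xml"
    return "text/html"


print(pick_media_type("text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"))
print(pick_media_type("image/gif, image/jpeg, */*"))
```

A server doing this would normally also send a Vary: Accept response header so that caches keep the two variants apart.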


e.g. In XHTML:

A paragraph containing
  
a list
of items
  


Cannot be accurately represented in the HTML serialisation, as it would 
result in the following:


A paragraph containing
  
a list
of items
  

http://www.whatwg.org/specs/web-apps/current-work/#restrictions

If wanting to author in HTML and reserialise as XHTML, there is also an 
issue with using the <noscript> element, as it is forbidden in XHTML5.



7.) Currently offering a CMS/web app generates XHTML with an XML pipeline


CMSs like that that only output XHTML and do not wish to output HTML may 
choose to migrate to XHTML5.


--
Lachlan Hunt
http://lachy.id.au/

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-12-02 Thread Mike Schinkel

The following are honest questions, not rhetorical baiting.

Lachlan Hunt wrote:
>> Use XHTML, send it with an HTML MIME type, and be happy.
> No!

Why not?  What's wrong with doing that?  

Lachlan Hunt wrote:
>> In many more cases, an HTML document or even an 
>> XHTML 1.0 as text/html document is just tag soup.

What's wrong with that?

Lachlan Hunt wrote:
> > There were a few proprietary, incompatible, buggy engines 
> > locked up in various browsers; and that was about it.
> OpenSP, which is free software,

Will a recommendation to use OpenSP be included in the spec?

Lachlan Hunt wrote:
>> Because the fact is that when authors try to use XHTML as 
>> text/html, they inevitably fail to do so properly.  It takes 
>> considerable knowledge and skill to be aware of and handle 
>> all issues ranging from parsing, character encodings to scripts 
>> and stylesheets.
>>
>> ...
>>
>> There is significant evidence to show that millions of authors 
>> make those mistakes very frequently, despite thinking they're 
>> using XHTML.
>> That is why I strongly believe that XHTML 1.0 Appendix C was 
>> a huge mistake and that continuing to allow authors to think 
>> they can use XHTML as text/html is extremely harmful for the 
>> future of XML, not beneficial to it.

Again, why is this a problem?  It is no different than we have today.

Maybe I should ask a different question. If people write XHTML badly, what
makes you think they will write HTML5 any better?

Lachlan Hunt wrote:
>> I really don't understand how you can go on about the benefits 
>> of XML because it requires well-formedness, but then turn around 
>> and say XML can be served as text/html which just makes all your 
>> arguments null and void.

As I understand it, serving with the correct mime type for XHTML isn't an
option, assuming you want people to be able to read it with current
browsers, or am I wrong on that?

Lachlan Hunt wrote:
>> >> The problem is when we don't realize we have a problem in the 
>> >> first place. 
>> The problem is that you're using the wrong MIME type.

And what MIME type should he be using that will work on today's Internet?

I must be missing something...

-Mike Schinkel
http://www.mikeschinkel.com/blogs/
http://www.welldesignedurls.org/

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-12-02 Thread Mike Schinkel

Elliotte Harold wrote:
>> Perhaps because you believe the MIME type is a 
>> magic incantation that somehow changes the 
>> document's nature, and I don't.
>>
>> The document is what it is. A sequence of bytes 
>> is either a well-formed XML document or it isn't. 
>> I can call it XML, but that doesn't mean it is; and 
>> I can say it's not XML, but that doesn't mean it isn't. 
>>

Sounds like we need content-types determined by inspection on web servers?
(which would really slow-down serving pages, unless they could be cached,
but with so much dynamic generated content that doesn't seem realistic...)

On another note, why not another new content type, one that would mean?:

"Striving to be XHTML, but if not consider me HTML5. And if that
doesn't work, try HTML 4.01."  

-Mike Schinkel
http://www.mikeschinkel.com/blogs/
http://www.welldesignedurls.org/

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-12-02 Thread Mike Schinkel

Elliotte Harold wrote:
>> The other half could be addressed by one little box 
>> in the corner of Firefox's status bar that's a smiley 
>> face if the page is valid, and a frown if it isn't.

>> Most hand authors including myself don't always 
>> achieve well-formedness because nothing pricks us 
>> if we don't. Even the tiniest annoyance from a bad 
>> page, would cause us to check the error logs and 
>> fix the problems.

>> Fixing a page to be well-formed and even valid XHTML 
>> is not hard, and well within the abilities of most people 
>> hand authoring HTML. The problem is when we don't 
>> realize we have a problem in the first place. 

>> Once we've noticed the problem, we're 90% of the way 
>> to solving it. 

You absolutely hit the nail on the head!!!  I've been thinking along similar
lines ever since all the fallout from TimBL's recent memo about XHTML &
HTML.

Your suggestion would go a long way towards ensuring people create
well-formed XHTML, although I'd like it to (default to) be(ing) a little
more obvious than a "little" box...

Actually, that would work for conscientious people with a clue, like you, but
not for most people publishing to the web. I've always viewed that the best
way to motivate change is to motivate the person who created the problem in
the first place and who can also get it fixed, and avoidance of pain is a
great motivator.  I've been racking my brain for a way that web publishers
could be *motivated* to fix their XHTML.  

And as I write this email, it's finally come to me one method that would
work for even the most clueless and apathetic of web publishers: What if
Google, Yahoo, and Microsoft Live were to display a human-readable string,
denoting the content type, hyperlinked to a web page that gives the details
of that content type.  For example, assume that some future version of the
Web Apps current-work page was written in XHTML 1.0 yet it failed the
validator; it could look like this (example from Google):

Web Applications 1.0
The list of active formatting elements; 9.2.4.3.3. Creating and 
inserting HTML elements; 9.2.4.3.4. Closing elements that have 
implied end tags; 9.2.4.3.5. ...
whatwg.org/specs/web-apps/current-work/ - Similar pages - XHTML 1.0
(WARNING)

The "XHTML 1.0" would link to a description of XHTML 1.0 and its content
type and how it can be viewed, etc. etc. But the "WARNING" could be in BOLD
RED type linking to a warning page that explained why the "Web Applications
1.0" page failed XHTML 1.0 validation, and it could include a link to a
validator for retesting (The search engines could even use  if they
*really* wanted it to be effective; doh!)  

The search engines could also let people register validators so that
validation didn't become a bottleneck. Validators would be required to
correctly validate a variety of documents to be approved, and registered
validators would get to serve advertising in exchange for their service.

I'll *bet* if the search engines did this, we'd see the public get educated
and documents cleaned up, but fast!  And I can imagine that having more
well-formed documents on the web could only help the search engines be more
accurate, so they should be motivated to do this.

Thoughts?  Or does someone see a hole in my theory?  If not, Ian's from
Google; what about Yahoo and Microsoft... ;-)

-Mike Schinkel
http://www.mikeschinkel.com/blogs/
http://www.welldesignedurls.org/

P.S. I might just have to blog this...

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-12-02 Thread Mike Schinkel

Ian:

(I didn't know what part to quote, so I left your email intact below.)

So what guidance would you publish after HTML5 is released with regards to
people in each of the following situations:

1.) Currently coding HTML(4) but trying to move to XHTML
2.) Currently coding XHTML and cleaning up only HTML(4)
3.) Currently coding only in XHTML
4.) Currently offering a CMS/web app generates HTML(4) using string
concatenation, with plans to move it to XHTML
5.) Currently offering a CMS/web app generates HTML(4) and XHTML both using
string concatenation
6.) Currently offering a CMS/web app generates HTML(4) with string
concatenation and XHTML with an XML pipeline
7.) Currently offering a CMS/web app generates XHTML with an XML pipeline

And, respectfully, "it doesn't matter" should not be considered a valid
answer.

-Mike Schinkel
http://www.mikeschinkel.com/blogs/
http://www.welldesignedurls.org/

-Original Message-
From: Ian Hickson [mailto:[EMAIL PROTECTED] 
Sent: Saturday, December 02, 2006 3:48 AM
To: Mike Schinkel
Cc: [EMAIL PROTECTED]
Subject: RE: [whatwg] Allow trailing slash in always-empty HTML5 elements?

On Sat, 2 Dec 2006, Mike Schinkel wrote:
> 
> >> You don't need to do one or the other. It's just up to you which 
> >> you do. Neither is better or worse than the other. They are 
> >> equivalent, neither is deprecated,  There's no reason to try and do
> >> both.
> 
> If, as you say one is as good as the other, why in the world have both?

The Web Apps 1.0 spec doesn't have both. It has a single format, HTML5, that
is compatible with the overwhelming majority of Web content, and is
compatible with the "tag soup" parsers as supported by all major Web
browsers (including most mobile browsers).

However, since HTML5 is defined in terms of the DOM, and since XML is one
way of serialising a DOM, it is guaranteed that _someone_ will try to create
an XML serialisation of HTML5 DOMs. Therefore, the Web Apps spec addresses
this, gives it a name (XHTML5), and makes sure to clearly state how that
should work, so that when someone does it, they don't have to guess.

We have no choice about having the HTML format -- that's what the Web uses
today and we have to be compatible with that to be successful. We have no
choice about the XML form, XML is used by certain people and there is a
guarantee that people will try to use HTML5 with XML. Therefore we are stuck
with having both.

For most people, however, XHTML5 need never enter their world.

> Maybe I'm wrong about this, but I bet most of the work-a-day web 
> developers and vendors of products that would need to support both 
> would agree. Has there been any attempt to learn their opinions on the 
> direction of HTML5 vs. XHTML?

There really is very little reason for anyone to use XHTML5 today, since it
doesn't work in IE7, the majority browser, whereas HTML5 does.

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-12-02 Thread Mike Schinkel

> Henri Sivonen wrote:
> > Elliotte Harold wrote:
> > What I don't understand is why some members of this working group are 
> > so dead set on actively preventing HTML from being XML. The non- 
> > draconian error handling I understand. But why are you disappointed 
> > that  is well-formed XML? Why the active hostility to 
> > well-formedness?
> 
> To make a conformance checker not accidentally let MIME type mistakes
> silently pass in some cases. 
> 

Can you clarify that please, maybe with some examples.  As is, I don't
understand the concern.

-Mike Schinkel
http://www.mikeschinkel.com/blogs/
http://www.welldesignedurls.org/

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-12-02 Thread Elliotte Harold


Lachlan Hunt wrote:

The Yellow Screen of Death is about as annoying as you can get.  I 
really don't understand how you can go on about the benefits of XML 
because it requires well-formedness, but then turn around and say XML 
can be served as text/html which just makes all your arguments null and 
void.


Perhaps because you believe the MIME type is a magic incantation that 
somehow changes the document's nature, and I don't.


The document is what it is. A sequence of bytes is either a well-formed 
XML document or it isn't. I can call it XML, but that doesn't mean it 
is; and I can say it's not XML, but that doesn't mean it isn't.
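
The point that well-formedness is a property of the bytes rather than of the label can be sketched in a few lines of Python. This is only an illustration: the two sample documents are made up, and the standard library's non-validating parser stands in for "an XML parser".

```python
import xml.etree.ElementTree as ET


def is_well_formed_xml(data: bytes) -> bool:
    """Return True if the bytes parse as a well-formed XML document."""
    try:
        ET.fromstring(data)
    except ET.ParseError:
        return False
    return True


# Hypothetical sample documents. No MIME type is consulted anywhere:
# only the bytes themselves decide the outcome.
xhtml_like = b"<html><head><title>t</title></head><body><p>Hello<br/></p></body></html>"
tag_soup = b"<html><body><p>Hello<br></body></html>"

print(is_well_formed_xml(xhtml_like))  # True
print(is_well_formed_xml(tag_soup))    # False
```

Whether a browser then treats those bytes as XML is a separate question, and that one does depend on the MIME type.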


--
Elliotte Rusty Harold  [EMAIL PROTECTED]
Java I/O 2nd Edition Just Published!
http://www.cafeaulait.org/books/javaio2/
http://www.amazon.com/exec/obidos/ISBN=0596527500/ref=nosim/cafeaulaitA/

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-12-02 Thread Lachlan Hunt


Elliotte Harold wrote:

Lachlan Hunt wrote:

Because the fact is that when authors try to use XHTML as text/html, 
they inevitably fail to do so properly.  It takes considerable 
knowledge and skill to be aware of and handle all issues ranging from 
parsing, character encodings to scripts and stylesheets.


All it really takes is minimal tool support. If systems like WordPress 
and DreamWeaver that hide the HTML start generating well-formed HTML, 
that's half the battle right there.


I think the recent discussion about WordPress proved that isn't going to 
happen any time soon.


The other half could be addressed by one little box in the corner of 
Firefox's status bar that's a smiley face if the page is valid, and a 
frown if it isn't.


Developers already have the option to install extensions for that.

http://users.skynet.be/mgueury/mozilla/
http://relaxed.sourceforge.net/firefox-plugin.html

Most hand authors including myself don't always achieve well-formedness 
because nothing pricks us if we don't.


It does when you use the correct MIME type!

Even the tiniest annoyance from a bad page, would cause us to check the 
error logs and fix the problems.


The Yellow Screen of Death is about as annoying as you can get.  I 
really don't understand how you can go on about the benefits of XML 
because it requires well-formedness, but then turn around and say XML 
can be served as text/html which just makes all your arguments null and 
void.


Fixing a page to be well-formed and even valid XHTML is not hard, and 
well within the abilities of most people hand authoring HTML.


Hmmm.  You obviously haven't seen a lot of the rubbish that many people, 
including those that hand code, actually produce.  Perhaps you keep 
forgetting that people like us who can easily produce well-formed and 
valid markup are in the minority.


The problem is when we don't realize we have a problem in the first place. 


The problem is that you're using the wrong MIME type.

--
Lachlan Hunt
http://lachy.id.au/

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-12-02 Thread Lachlan Hunt


Elliotte Harold wrote:
Secondly, anyone who actually tried to use an SGML parser to handle HTML 
 rapidly hit a wall since most HTML documents were not even close to 
actually conformant to the SGML spec or the HTML DTD.


s/SGML/XML and you might see my point:

Anyone who actually tried to use an XML parser to handle HTML rapidly 
hit a wall since most HTML documents were not even close to actually 
conformant to the XML spec.


--
Lachlan Hunt
http://lachy.id.au/

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-12-02 Thread Elliotte Harold


Lachlan Hunt wrote:

Because the fact is that when authors try to use XHTML as text/html, 
they inevitably fail to do so properly.  It takes considerable knowledge 
and skill to be aware of and handle all issues ranging from parsing, 
character encodings to scripts and stylesheets.


All it really takes is minimal tool support. If systems like WordPress 
and DreamWeaver that hide the HTML start generating well-formed HTML, 
that's half the battle right there.


The other half could be addressed by one little box in the corner of 
Firefox's status bar that's a smiley face if the page is valid, and a 
frown if it isn't.


Most hand authors including myself don't always achieve well-formedness 
because nothing pricks us if we don't. Even the tiniest annoyance from a 
bad page, would cause us to check the error logs and fix the problems.


It used to be that the Cafe au Lait and Cafe con Leche home pages became 
malformed on a regular basis through my carelessness or typos. That 
stopped once I implemented an XML toolchain that e-mailed me when it 
noticed a mistake on those pages. (That was actually a side effect of 
another project, not the specific intent.)


Fixing a page to be well-formed and even valid XHTML is not hard, and 
well within the abilities of most people hand authoring HTML. The 
problem is when we don't realize we have a problem in the first place. 
Once we've noticed the problem, we're 90% of the way to solving it.


--
Elliotte Rusty Harold  [EMAIL PROTECTED]
Java I/O 2nd Edition Just Published!
http://www.cafeaulait.org/books/javaio2/
http://www.amazon.com/exec/obidos/ISBN=0596527500/ref=nosim/cafeaulaitA/

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-12-02 Thread Elliotte Harold


Lachlan Hunt wrote:

HTML 2.0 to 4.01 documents could, in the same way you're insisting on 
using XML tools on the back end, be reliably parsed using SGML tools. 


Surely you jest. First of all, I believe there was only ever one parser 
that implemented all of the SGML specification, SP from James Clark. In 
practice, SGML documents and parsers were not interoperable. Documents 
and document types were designed to fit the limitations of one specific 
parser.


Secondly, anyone who actually tried to use an SGML parser to handle HTML 
 rapidly hit a wall since most HTML documents were not even close to 
actually conformant to the SGML spec or the HTML DTD.


XML, by contrast, learned enough from the experience of SGML to make 
interoperable parsers and documents a reality. It also learned to 
separate well-formedness from validity, which is certainly the single 
most underrated contribution of XML to the field. Well-formedness gives 
80% of the benefit of validity for 20% of the cost. In fact, a lot of 
the time I'd say it gives 120% of the benefit of validity at 20% of the 
cost. Well-formedness sans validity is a large part of extensibility. It 
is what puts the X in XML.


A well-formed XML document is a much lower bar to aim for than a valid 
SGML one.


--
Elliotte Rusty Harold  [EMAIL PROTECTED]
Java I/O 2nd Edition Just Published!
http://www.cafeaulait.org/books/javaio2/
http://www.amazon.com/exec/obidos/ISBN=0596527500/ref=nosim/cafeaulaitA/

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-12-02 Thread Lachlan Hunt


Elliotte Harold wrote:

Lachlan Hunt wrote:
HTML and XML have significantly different parsing requirements and 
they absolutely must be treated as significantly different file 
formats.  Any attempt to treat them as the same format is an extremely 
bad idea.


That's only true to the extent that some people seem to insist on making 
them needlessly different. HTML is tantalizingly close to well-formed 
XML. They both derive from SGML. They both use angle bracketed tags. 
They both define a tree structure. Indeed in many cases an HTML document 
is an XML document.


In many more cases, an HTML document or even an XHTML 1.0 as text/html 
document is just tag soup.


This enables the use of the very powerful XML toolchain for processing 
HTML.


The XHTML serialisation allows for the very powerful XML toolchain for 
processing (X)HTML.  You just need to stick an HTML serialiser on to the 
end of it.


In fact, prior to the widespread adoption of XML there were, near 
as I could tell, no reliable open means of parsing HTML documents.


HTML 2.0 to 4.01 documents could, in the same way you're insisting on 
using XML tools on the back end, be reliably parsed using SGML tools. 
Now that HTML 5 is no longer based on SGML tools, it will require the 
use of an HTML5 parser instead, but the principle is the same.  It seems 
the only thing preventing that from happening right now is the current 
lack of implementations.  But given that HTML5 is a new language still 
under development, and the fact that such tools are being developed 
right now, it won't be a problem for much longer.


There were a few proprietary, incompatible, buggy engines locked up in various 
browsers; and that was about it.


OpenSP, which is free software,

What I don't understand is why some members of this working group are so 
dead set on actively preventing HTML from being XML. The non-draconian 
error handling I understand.


Because the fact is that when authors try to use XHTML as text/html, 
they inevitably fail to do so properly.  It takes considerable knowledge 
and skill to be aware of and handle all issues ranging from parsing, 
character encodings to scripts and stylesheets.


This is a list of very common mistakes inevitably made by the vast 
majority of real-world authors when they try and fail to use XHTML as 
text/html, which would cause significant problems with any attempt to 
serve as XML.


* Fatal well-formedness errors
  - Unencoded & and <
  - Unclosed elements
  - Unquoted attributes
  - etc...

* Incorrect or omitted namespace declaration (xmlns attribute), or use
  of ill-formed MS Office xmlns garbage.

* Named entity references require validating parsers (or a Mozilla-like
  hack to parse a subset of the DTD for recognised DOCTYPEs)
  - (excluding & < > " and ')
  - Lack of DOCTYPE in XHTML5 means that any others would be fatal

* Encoding should be declared within the XML declaration
  - When omitted, UTF-8 or UTF-16 must be used, unless specified at
    the protocol level (usually not done).
  - Many just use ISO-8859-1, Windows-1252, etc. specified using <meta>
  - XML declaration triggers quirks mode in IE6 (text/html only).

* Badly encoded characters
  - e.g. use of Windows-1252 when ISO-8859-1 is declared

* Script and style elements are parsed differently
  - Not a problem for external scripts, but internal scripts
are very common.

  - This *very common* technique doesn't work in XML:


  - On pages that don't use that comment, this would be fatal:

if (a < b & c) {
// do something
}


   - This can be worked around using a CDATA section, but
 //

* document.write() and document.writeln() do not work.

* DOM methods are case sensitive.
  - Although HTML5 is attempting to address many DOM API differences,
several still remain for backwards compatibility.

* XML rules for CSS differ slightly from HTML.
  - e.g. No special treatment for the body element.
  - Case sensitivity of Selectors
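
Several of the fatal cases in the list above can be reproduced with any non-validating XML parser. The following is a minimal sketch using Python's standard library; the sample strings are invented for illustration, and the exact error messages depend on the underlying parser, so only "fatal or not" is meaningful here.

```python
from typing import Optional
import xml.etree.ElementTree as ET


def xml_error(markup: str) -> Optional[str]:
    """Return the parser's error message, or None if markup is well-formed."""
    try:
        ET.fromstring(markup)
    except ET.ParseError as exc:
        return str(exc)
    return None


# Hypothetical fragments, one per mistake named in the list.
samples = {
    "unencoded ampersand": "<p>Fish & chips</p>",
    "unclosed element": "<p>One<br>Two</p>",
    "unquoted attribute": "<p class=note>text</p>",
    "named entity without a DTD": "<p>a&nbsp;b</p>",
    "inline script": "<script>if (a < b & c) { run(); }</script>",
    "script wrapped in CDATA": "<script><![CDATA[if (a < b & c) { run(); }]]></script>",
}

for label, markup in samples.items():
    error = xml_error(markup)
    print(f"{label}: {'well-formed' if error is None else 'fatal: ' + error}")
```

Only the CDATA-wrapped script survives. Every other sample is something a text/html parser accepts without complaint, which is why authors rarely notice these mistakes until the page is served as XML.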

Keep in mind that, although someone like yourself may be able to handle 
every single one of those issues with ease, you are in the minority. 
There is significant evidence to show that millions of authors make 
those mistakes very frequently, despite thinking they're using XHTML.
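
The encoding item in the list can be illustrated the same way. This sketch assumes a page saved as ISO-8859-1 with no encoding information in the document itself; the sample text is made up. Without a declaration an XML parser assumes UTF-8 (or UTF-16 when a byte order mark is present), so the same bytes are fatal until the XML declaration names the encoding.

```python
import xml.etree.ElementTree as ET

# "café" encoded as ISO-8859-1: the é becomes the single byte 0xE9,
# which is not a valid UTF-8 sequence at that position.
latin1_bytes = "<p>café</p>".encode("iso-8859-1")

try:
    ET.fromstring(latin1_bytes)
    undeclared_ok = True
except ET.ParseError:
    undeclared_ok = False

# The same bytes with an XML declaration that names the encoding.
declared = b'<?xml version="1.0" encoding="ISO-8859-1"?>' + latin1_bytes
declared_text = ET.fromstring(declared).text

print(undeclared_ok)   # False
print(declared_text)   # café
```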


That is why I strongly believe that XHTML 1.0 Appendix C was a huge 
mistake and that continuing to allow authors to think they can use XHTML 
as text/html is extremely harmful for the future of XML, not beneficial 
to it.



But why are you disappointed that <!DOCTYPE html> is well-formed XML?
Why the active hostility to well-formedness?


Because it allows people like yourself to continue thinking that it's ok 
to parse HTML with an XML parser, just because they happen to share a 
few similarities in their syntax, and despite the fact that an XML 
serialisation is being provided for exa

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-12-02 Thread Lachlan Hunt


Elliotte Harold wrote:

Ian Hickson wrote:
If you want to use XHTML, then use XHTML, send it with an XML MIME 
type, and be happy.


What's wrong with option 1 is that it doesn't work in the browser with 
the majority of the installed base,


What you fail to realise is that by the time (X)HTML 5 is ready to be 
used significantly, current information suggests that IE8 will have 
added support for XHTML.



something I used to think mattered to this group.


It does matter, but that's why the HTML5 serialisation is provided.  The 
XHTML serialisation is not intended for IE7 and earlier.



Consequently I and many others choose option 3:

Use XHTML, send it with an HTML MIME type, and be happy.


No!

--
Lachlan Hunt
http://lachy.id.au/

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-12-02 Thread Michel Fortin


Le 1 déc. 2006 à 17:45, Ian Hickson a écrit :
You don't need to do one or the other. It's just up to you which you do.
Neither is better or worse than the other. They are equivalent, neither is
deprecated, they are both unambiguous, they are both strict, they will
both have validators and they will both have tools that can be used to
process them. There's no reason to try and do both.


I disagree with this choice-of-tool argument.

If you develop software to be used by other people, or other  
programs, you don't want to lock them in either camp, so you have to  
provide a way to generate both outputs. That's especially important  
when programs and libraries are exchanging documents or snippets of  
documents between each other. The DOM is a poor choice for these  
exchanges, because different DOM implementations are not interoperable  
with each other. The markup on the other hand can move more freely.


Having two markups poses the same problem as having two incompatible  
HD DVD formats. Browsers do (or will) accept both formats, so as long  
as the media type is known it'll work fine for them. But what about  
every other piece of software in the middle that does not talk  
directly to the browser?


That's the real difficulty when dealing with HTML and XHTML: the  
choice isn't really about tools, it's a choice between two  
incompatible exchange formats. That's the reason why I think it's  
compelling to have a common subset between HTML and XHTML. If you can  
output something valid for both HTML and XHTML at the same time, then  
you don't have to worry about what format is supported on the other end.
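
To make the idea of a common subset concrete, here is a rough Python sketch. The sample document is hypothetical and deliberately leaves out the DOCTYPE and the xmlns attribute, both of which complicate the picture. Python's html.parser is only a lenient tokenizer, not a browser's text/html parser, so this shows no more than that the two readings agree on the element sequence for this one input.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class TagCollector(HTMLParser):
    """Record start tags in document order as an HTML tokenizer sees them."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # Also reached for "<br />": the default handle_startendtag
        # forwards to handle_starttag and handle_endtag.
        self.tags.append(tag)


# Hypothetical document written in the shared subset: lowercase names,
# quoted attributes, every element closed, "/>" on empty elements.
document = (
    '<html><head><title>Shared subset</title></head>'
    '<body><p>Line one<br />line two</p>'
    '<img src="a.png" alt="" /></body></html>'
)

collector = TagCollector()
collector.feed(document)
collector.close()
html_tags = collector.tags

# The same string read by an XML parser, elements in document order.
xml_tags = [element.tag for element in ET.fromstring(document).iter()]

print(html_tags == xml_tags)  # True
```

Agreement on element names says nothing about scripts, stylesheets or DOM behaviour, which is where the two processing models part ways.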


That's also why it's probably worth knowing what the common subset  
looks like, how people might be tempted to use it, and what are its  
exact limitations and pitfalls. The common subset is an integral part  
of the HTML/XHTML couple; it may be just a side effect, but it's  
there and should not be ignored. It's pretty clear to me that it'll  
be used whether we want it or not.


Oh, and here is one last remark. There are really *two* important  
common subsets: one between conformant HTML and conformant XHTML, and  
another between unambiguous HTML and well-formed XHTML. The first was  
pretty irrelevant before HTML allowed "/>", but it did not prevent  
people from using the second.



Michel Fortin
[EMAIL PROTECTED]
http://www.michelf.com/

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-12-02 Thread Elliotte Harold


Ian Hickson wrote:

There is an underlying assumption here, namely that there would be 
something wrong with picking one or the other and just using that.


If you want to use XHTML, then use XHTML, send it with an XML MIME type, 
and be happy.


If you want to use HTML, then use HTML, send it with an HTML MIME type, 
and be happy.


What's wrong with option 1 is that it doesn't work in the browser with 
the majority of the installed base, something I used to think mattered 
to this group. Consequently I and many others choose option 3:


Use XHTML, send it with an HTML MIME type, and be happy.

--
Elliotte Rusty Harold  [EMAIL PROTECTED]
Java I/O 2nd Edition Just Published!
http://www.cafeaulait.org/books/javaio2/
http://www.amazon.com/exec/obidos/ISBN=0596527500/ref=nosim/cafeaulaitA/

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-12-02 Thread Henri Sivonen


On Dec 2, 2006, at 14:02, Elliotte Harold wrote:


Lachlan Hunt wrote:

HTML and XML have significantly different parsing requirements and  
they absolutely must be treated as significantly different file  
formats.  Any attempt to treat them as the same format is an  
extremely bad idea.


That's only true to the extent that some people seem to insist on  
making them needlessly different. HTML is tantalizingly close to  
well-formed XML. They both derive from SGML. They both use angle  
bracketed tags. They both define a tree structure. Indeed in many  
cases an HTML document is an XML document.


But the point is that the text/html processing model has to work with  
the real Web where not all documents are well-formed.
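
A small sketch of that difference in Python, under the assumption that the standard library's html.parser is an acceptable stand-in for a forgiving text/html consumer (it is a tokenizer only, not the HTML5 parsing algorithm). The tag soup string is invented.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class EventLog(HTMLParser):
    """Collect tokenizer events so the recovered structure can be inspected."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        if data.strip():
            self.events.append(("text", data.strip()))


# Hypothetical tag soup: unclosed <p>, unquoted attribute, bare ampersand,
# unclosed <br>, and a stray </div>.
soup = "<p>Unclosed paragraph<p class=note>Tom & Jerry<br></div>"

log = EventLog()
log.feed(soup)
log.close()
for event in log.events:
    print(event)

# The XML processor gives up on the same input.
try:
    ET.fromstring(soup)
    xml_accepted = True
except ET.ParseError:
    xml_accepted = False
print(xml_accepted)  # False
```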


This enables the use of the very powerful XML toolchain for  
processing HTML.


You can use the toolchain, except for the XML processor itself, as I  
have explained before.


What I don't understand is why some members of this working group  
are so dead set on actively preventing HTML from being XML. The  
non-draconian error handling I understand. But why are you disappointed  
that <!DOCTYPE html> is well-formed XML? Why the active hostility  
to well-formedness?


To make a conformance checker not accidentally let MIME type mistakes  
silently pass in some cases.


--
Henri Sivonen
[EMAIL PROTECTED]
http://hsivonen.iki.fi/

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-12-02 Thread Elliotte Harold


Lachlan Hunt wrote:

HTML and XML have significantly different parsing requirements and they 
absolutely must be treated as significantly different file formats.  Any 
attempt to treat them as the same format is an extremely bad idea.


That's only true to the extent that some people seem to insist on making 
them needlessly different. HTML is tantalizingly close to well-formed 
XML. They both derive from SGML. They both use angle bracketed tags. 
They both define a tree structure. Indeed in many cases an HTML document 
is an XML document.


This enables the use of the very powerful XML toolchain for processing 
HTML. In fact, prior to the widespread adoption of XML there were, near 
as I could tell, no reliable open means of parsing HTML documents. There 
were a few proprietary, incompatible, buggy engines locked up in various 
browsers; and that was about it.


What I don't understand is why some members of this working group are so 
dead set on actively preventing HTML from being XML. The non-draconian 
error handling I understand. But why are you disappointed that 
<!DOCTYPE html> is well-formed XML? Why the active hostility to 
well-formedness?


--
Elliotte Rusty Harold  [EMAIL PROTECTED]
Java I/O 2nd Edition Just Published!
http://www.cafeaulait.org/books/javaio2/
http://www.amazon.com/exec/obidos/ISBN=0596527500/ref=nosim/cafeaulaitA/

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-12-02 Thread Ian Hickson

On Sat, 2 Dec 2006, Mike Schinkel wrote:
> 
> >> You don't need to do one or the other. It's just up to you which you 
> >> do. Neither is better or worse than the other. They are equivalent, 
> >> neither is deprecated,  There's no reason to try and do both.
> 
> If, as you say one is as good as the other, why in the world have both?

The Web Apps 1.0 spec doesn't have both. It has a single format, HTML5, 
that is compatible with the overwhelming majority of Web content, and is 
compatible with the "tag soup" parsers as supported by all major Web 
browsers (including most mobile browsers).

However, since HTML5 is defined in terms of the DOM, and since XML is one 
way of serialising a DOM, it is guaranteed that _someone_ will try to 
create an XML serialisation of HTML5 DOMs. Therefore, the Web Apps spec 
addresses this, gives it a name (XHTML5), and makes sure to clearly state 
how that should work, so that when someone does it, they don't have to 
guess.

We have no choice about having the HTML format -- that's what the Web uses 
today and we have to be compatible with that to be successful. We have no 
choice about the XML form: XML is used by certain people and there is a 
guarantee that people will try to use HTML5 with XML. Therefore we are 
stuck with having both.

For most people, however, XHTML5 need never enter their world.

> Maybe I'm wrong about this, but I bet most of the work-a-day web 
> developers and vendors of products that would need to support both would 
> agree. Has there been any attempt to learn their opinions on the 
> direction of HTML5 vs. XHTML?

There really is very little reason for anyone to use XHTML5 today, since 
it doesn't work in IE7, the majority browser, whereas HTML5 does.

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-12-01 Thread Mike Schinkel

Lachlan Hunt wrote:
>> HTML and XML have significantly different parsing requirements 
>> and they absolutely must be treated as significantly different 
>> file formats.  Any attempt to treat them as the same format is 
>> an extremely bad idea.
>> ...
>> This is why the spec is defined in terms of the DOM, so that 
>> there can be both HTML and XHTML serialisations of the same 
>> document, rather than defining that both serialisations are the 
>> same syntax.

But please take into consideration that almost nobody writes web pages using
a DOM; they write web pages using text editors and dynamically using string
concatenation. As such there is great value for users in having them be as
similar as possible. If they diverge, it will accelerate chaos on the web.

-Mike Schinkel
http://www.mikeschinkel.com/blogs/
http://www.welldesignedurls.org/

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-12-01 Thread Mike Schinkel

>> There is an underlying assumption here, namely that there 
>> would be something wrong with picking one or the other 
>> and just using that.

You are correct in identifying the assumption.

>> If you want to use XHTML, then use XHTML, send it with 
>> an XML MIME type, and be happy. If you want to use HTML, 
>> then use HTML, send it with an HTML MIME type, and be 
>> happy.

I see this as a very confusing position for the average person, and one that
will consume lots and lots of unnecessary analyst and consultant dollars
helping organizations decide which is the right solution, and lots of
duplicated effort for suppliers of both commercial and open source
technology that must continue to support new specifications for both, ad
infinitum. 

>> You don't need to do one or the other. It's just up to you 
>> which you do. Neither is better or worse than the other. 
>> They are equivalent, neither is deprecated, 
>> There's no reason to try and do both. 

If, as you say, one is as good as the other, why in the world have both?  It
will just cause massive confusion and consternation.  Better to have one
that is a more rigid subset of another that is a more lax superset.

Maybe I'm wrong about this, but I bet most of the work-a-day web developers
and vendors of products that would need to support both would agree. Has
there been any attempt to learn their opinions on the direction of HTML5 vs.
XHTML?

-Mike Schinkel
http://www.mikeschinkel.com/blogs/
http://www.welldesignedurls.org/

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-12-01 Thread Mike Schinkel

Lachlan Hunt wrote:
>> XML parsing for HTML on the web is totally impractical and I 
>> really do not understand the desire to do so.

It is frequently difficult for someone of advanced skills to understand why
the average person struggles with a given technology.  It seems to me that
web technologies should target the average person (not the advanced person)
as that's why it has previously been successful.

-Mike Schinkel
http://www.mikeschinkel.com/blogs/
http://www.welldesignedurls.org/

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-12-01 Thread Ian Hickson

On Fri, 1 Dec 2006, Mike Schinkel wrote:
> 
> Even though they are both serializations, the vast majority of people 
> producing HTML/XHTML are not doing it by serializing, they are doing it 
> by string concatenation and merging templates. Unfortunately, no matter 
> how much it's lamented that this is the wrong way to do it, it's not 
> going to change by a significant amount and hence it would seem to me to 
> be the enlightened thing to acknowledge and strive to converge HTML with 
> XHTML over time, as much as reasonably possible.

There is an underlying assumption here, namely that there would be 
something wrong with picking one or the other and just using that.

If you want to use XHTML, then use XHTML, send it with an XML MIME type, 
and be happy.

If you want to use HTML, then use HTML, send it with an HTML MIME type, 
and be happy.

You don't need to do one or the other. It's just up to you which you do. 
Neither is better or worse than the other. They are equivalent, neither is 
deprecated, they are both unambiguous, they are both strict, they will 
both have validators and they will both have tools that can be used to 
process them. There's no reason to try and do both.

> Another very beneficial thing would be to ensure there are reference 
> implementations of open source or public domain serializers for XHTML 
> and HTML as part of the spec in all major languages and platforms.

There will be tools available in due course. Right now it's still early 
days; the spec is in flux, so implementations would have to do a lot of 
work to keep track.

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-12-01 Thread Mike Schinkel

Ian Hickson wrote:
>> 
>> On Thu, 30 Nov 2006, Mike Schinkel wrote:
>> > 1.) I read the FAQ http://blog.whatwg.org/faq/ and it seemed to imply 
>> > that HTML 5 and XHTML were not at odds with each other?  Did I misread
>> > that, because from comments on this thread I get the impression that 
>> > might not be the case.
>> 
>> They're just differently serialisations. One is for text/html, the other 
>> for XML. You can use one or the other, it basically only depends on 
>> whether you want to send it as text/html or not.
>> 

That is a good explanation, thank you.

Even though they are both serializations, the vast majority of people
producing HTML/XHTML are not doing it by serializing, they are doing it by
string concatenation and merging templates. Unfortunately, no matter how
much it's lamented that this is the wrong way to do it, it's not going to
change by a significant amount and hence it would seem to me to be the
enlightened thing to acknowledge and strive to converge HTML with XHTML over
time, as much as reasonably possible.

Another very beneficial thing would be to ensure there are reference
implementations of open source or public domain serializers for XHTML and
HTML as part of the spec in all major languages and platforms. That way
there would be a fighting chance that the next generation of web apps would
implement proper serialization and pipelines instead of reverting to string
concatenation because the other is just too hard. That way there is a
greater likelihood that the next WordPress will be developed with a proper
architecture.
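
As a small illustration of what a serializer buys over string concatenation, here is a sketch in Python. The title string is a made-up stand-in for user-supplied text, and the standard library's ElementTree is used only as a convenient example of a tree plus serializer, not as the reference implementation being asked for.

```python
import xml.etree.ElementTree as ET

title = "Q&A: is a < b?"  # hypothetical user-supplied text

# String concatenation: the text is pasted into the markup verbatim.
concatenated = "<h1>" + title + "</h1>"

# Tree plus serializer: escaping is the serializer's job, not the author's.
heading = ET.Element("h1")
heading.text = title
serialized = ET.tostring(heading, encoding="unicode")

print(concatenated)  # <h1>Q&A: is a < b?</h1>
print(serialized)    # <h1>Q&amp;A: is a &lt; b?</h1>
```

The concatenated string is not well-formed, while the serialized one round-trips back to the original text.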

JMTCW anyway.

-Mike Schinkel
http://www.mikeschinkel.com/blogs/
http://www.welldesignedurls.org/

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-12-01 Thread Mike Schinkel

Lachlan Hunt wrote:
>> Mike Schinkel wrote:
>> > 1.) I read the FAQ http://blog.whatwg.org/faq/ and it seemed to imply 
>> > that HTML 5 and XHTML were not at odds with each other?  Did I 
>> > misread that, because from comments on this thread I get the 
>> > impression that might not be the case.
>> > 
>> > 2.) A similar question, but is the goal for HTML5 and XHTML to slowly 
>> > converge, or is the goal for them to diverge?
>> 
>> This issue was explained in detail in this recent blog entry.
>> 
>> http://blog.whatwg.org/html-vs-xhtml
>> 

Thanks for the reference. I'll check it out.

-Mike Schinkel
http://www.mikeschinkel.com/blogs/
http://www.welldesignedurls.org/

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-12-01 Thread Ian Hickson

On Fri, 1 Dec 2006, Rimantas Liubertas wrote:
> > 
> > As far as I can tell,  is handled by all browsers the same 
> > way as . How is it not interoperable?
> 
> That's true, however, what happens depends on the browser and presence 
> of  in the code.

Right, the interoperability problems with

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-12-01 Thread Lachlan Hunt


Lachlan Hunt wrote:
XML parsing for HTML on the web is totally impractical and I really do 
understand the desire to do so.


I meant "I really do [not] understand...".

--
Lachlan Hunt
http://lachy.id.au/

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-12-01 Thread Lachlan Hunt


Michel Fortin wrote:
It seems I was mistaken about that. I was pretty sure that it'd be a 
parse error in XML, but I now look at the [DTD construct in the XML 
spec][1] and I cannot see why. Apparently this is a valid DTD for an XML 
document where the root element is :





It's a well-formed DOCTYPE (unfortunately), not a valid DTD.  If only it 
weren't and perhaps this nonsense about treating HTML as XHTML and vice 
versa would stop.



These wouldn't since XML is case-sensitive:




That would only be a validity error because the root element is not 
, not a well-formedness error.






That would be a well-formedness error in XML.
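
The markup examples in this message did not survive in the archive, so the three strings below are assumptions chosen to match the surrounding remarks rather than the original text. Checked with Python's non-validating XML parser, they show the distinction between a well-formedness error and a mere validity error.

```python
import xml.etree.ElementTree as ET


def well_formed(markup: str) -> bool:
    """True if a non-validating XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


# Assumed examples, not the ones from the original message.
print(well_formed("<!DOCTYPE html><html></html>"))  # True
# Root element name does not match the DOCTYPE: a validity error only,
# which a non-validating parser does not report.
print(well_formed("<!DOCTYPE html><HTML></HTML>"))  # True
# The DOCTYPE keyword itself is case-sensitive in XML.
print(well_formed("<!doctype html><html></html>"))  # False
```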


 [1]: http://www.w3.org/TR/REC-xml/#dtd

So it appears after all that if HTML allows "/>", it would be possible 
and practical to have a single document which is valid for both HTML and 
XHTML at the same time.


It would be theoretically possible, but totally impractical in the real 
world. You can do whatever you like in your own authoring environment 
where you have control over exactly what goes into your documents, but 
XML parsing for HTML on the web is totally impractical and I really do 
understand the desire to do so.


HTML and XML have significantly different parsing requirements and they 
absolutely must be treated as significantly different file formats.  Any 
attempt to treat them as the same format is an extremely bad idea.



That doesn't mean the document will behave in the same way in the two
cases however.


Exactly, that's one of the problems!  This is why the spec is defined in 
terms of the DOM, so that there can be both HTML and XHTML 
serialisations of the same document, rather than defining that both 
serialisations are the same syntax.
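
A rough sketch of "one tree, two serialisations", using Python's ElementTree as a stand-in for the DOM. Its "html" output method follows older HTML serialisation conventions, not the HTML5 rules under discussion, so treat the output as illustrative only; the element content is made up.

```python
import xml.etree.ElementTree as ET

# Build one in-memory tree (hypothetical content).
body = ET.Element("body")
para = ET.SubElement(body, "p")
para.text = "Fish & chips"
ET.SubElement(para, "br").tail = "served daily"
script = ET.SubElement(body, "script")
script.text = "if (a < b) { go(); }"

# Serialise the same tree with two different syntaxes.
as_xml = ET.tostring(body, encoding="unicode", method="xml")
as_html = ET.tostring(body, encoding="unicode", method="html")

print(as_xml)
print(as_html)
```

The XML output writes the empty element as `<br />` and escapes the `<` inside the script, while the HTML output writes `<br>` and leaves the script text untouched. The tree is the same in both cases; only the syntax rules differ.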


I wonder if allowing "/>" in HTML couldn't, contrary to some 
other arguments, help authors and developers to grasp the real 
difference between the two markups.


No, I think the evidence of people wishing to blur the distinction 
between HTML and XHTML by having a fully compatible syntax only proves 
that it will serve to confuse the issue more.  They are separate 
syntaxes with separate parsing requirements and it makes no sense 
whatsoever to treat one like the other.



they'll just take note that "/>" doesn't necessarily mean XHTML anymore


That has never been a reliable indication of XHTML.  There are many 
authors that just use that XML syntax regardless of the DOCTYPE, 
namespace declaration or MIME type.  Many authors just don't have a clue 
that it's XML syntax and that it is absolutely meaningless in HTML.


--
Lachlan Hunt
http://lachy.id.au/

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-12-01 Thread Rimantas Liubertas


2006/12/1, Ian Hickson <[EMAIL PROTECTED]>:
<...>

> An example of something that is NOT implemented interoperably is
> .

As far as I can tell,  is handled by all browsers the same way as
. How is it not interoperable?


That's true, however, what happens depends on the browser and presence
of  in the code.

When IE encounters  it swallows everything after as the content of script. If there is
no  in the source - that's it.

Firefox likes consistency:  works OK,

This is OK too:


some text


However

some text


Produces only single SCRIPT in DOM tree swallowing paragraph and the
second

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-12-01 Thread Henri Sivonen


On Dec 1, 2006, at 04:15, Michel Fortin wrote:

that their valid XHTML1 documents served as text/html, when updated  
to XHTML5, are now called valid HTML5 documents by the validator.


Except:
 * xmlns is illegal in HTML5.
 * xml:lang vs. lang.
 *  vs. xml:base.
 * http://hsivonen.iki.fi/

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-11-30 Thread Lachlan Hunt


Mike Schinkel wrote:
1.) I read the FAQ http://blog.whatwg.org/faq/ and it seemed to imply that 
HTML 5 and XHTML were not at odds with each other?  Did I misread that, 
because from comments on this thread I get the impression that might not be 
the case.


2.) A similar question, but is the goal for HTML5 and XHTML to slowly
converge, or is the goal for them to diverge?


This issue was explained in detail in this recent blog entry.

http://blog.whatwg.org/html-vs-xhtml

--
Lachlan Hunt
http://lachy.id.au/

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-11-30 Thread Ian Hickson

On Tue, 28 Nov 2006, Sam Ruby wrote:
> 
> In HTML5, there are a number of elements with a content model of empty: area,
> base, br, col, command, embed, hr, img, link, meta, and param.
> 
> If HTML5 were changed so that these elements -- and these elements alone -- 
> permitted an optional trailing slash character, what percentage of the web
> would be parsed differently?

0%. Allowing or disallowing something is completely orthogonal to how it 
is parsed.

> The basis for my question is the observation that the web browsers that 
> I am familiar with apparently already operate in this fashion, this 
> usage seems to have crept into quite a number of diverse places

Browsers don't do any sort of conformance reporting for HTML parsing, so 
they can't actually be said to be allowing it or disallowing it. As far as 
parsing goes, all browsers, as well as the HTML5 parsing specification, 
handle bogus trailing / characters in tags by ignoring them.
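
A rough way to see this for yourself is sketched below. It uses Python's
standard html.parser module, which is a lenient parser and NOT an
implementation of the HTML5 parsing algorithm, so treat it as an
illustration only. The StartTagRecorder class and start_tags helper are
names made up for this example.

```python
from html.parser import HTMLParser


class StartTagRecorder(HTMLParser):
    """Collects (tag, attributes) for every start tag the parser reports."""

    def __init__(self):
        super().__init__()
        self.start_tags = []

    def handle_starttag(self, tag, attrs):
        # The default handle_startendtag() also calls this method,
        # so tags written with a trailing slash end up here too.
        self.start_tags.append((tag, attrs))


def start_tags(markup):
    recorder = StartTagRecorder()
    recorder.feed(markup)
    recorder.close()
    return recorder.start_tags


plain = start_tags("<p>one<br>two</p>")
slashed = start_tags("<p>one<br/>two</p>")
spaced = start_tags("<p>one<br />two</p>")
print(plain == slashed == spaced)
```

With this parser the three spellings report the same start tags; the
trailing slash changes nothing about what is seen.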

> As a side benefit of this change, I believe that I could modify my weblog to
> be simultaneously both HTML5 and XHTML5 compliant

Since the namespace declaration is required in XML and disallowed in HTML, 
this is not possible. In addition, while you may be right that a tiny 
subset of XML might be equivalent to a tiny subset of HTML, it is not, and 
will never be, generally true that you can take an arbitrary HTML5 
document and treat it as XML. HTML5 has very detailed parsing rules (at 
least as detailed as XML, and arguably more detailed, since the HTML 
parsing rules define the tree you obtain from parsing, whereas XML parsing 
rules only state what a conformant document looks like and how to detect 
conformance errors, not how to turn a conformant document into a tree).

I'm not sure I really understand the value of having a single common 
syntax subset, either. Now that there is an unambiguous way of parsing 
HTML, converting HTML to XML and back again in a lossless manner is easy. 
(Though not trivial -- there are things that can be represented in one 
syntax and not the other, like namespaces in XML, and the  
element in HTML.)

Regarding your original suggestion: based on the arguments presented by 
the various people taking part in this discussion, I've now updated the 
specification to allow "/" characters at the end of void elements.

There were many e-mails on this thread. I have replied to the salient 
points below. Since much of the discussion focused not on specific HTML5 
proposals but on the pros and cons of XML, WordPress, and other 
technologies, I've not replied to all the e-mails. If you feel I have 
failed to reply to an e-mail that I should have replied to, please bring 
it to my attention.

On Wed, 29 Nov 2006, Benjamin Hawkes-Lewis wrote:
>
> I think having /two/ different serializations of Web Forms 2.0/Web 
> Applications 1.0 is bad enough. To try and cater to what's effectively a 
> third serialization compatible with both parsing methods is to reinvent 
> the "XHTML 1.0 as text/html" mess. Serializing to multiple formats from 
> a single source is, I think, a better model. Especially as embedded 
> content may need different treatment too.

I strongly agree with this.

On Wed, 29 Nov 2006, Sam Ruby wrote:
> 
> That was not the intent of my suggestion.  I am suggesting that HTML5 
> standardize on *one* format.  One that comes as close as humanly 
> possible to capturing the web as it is practiced in all of its glorious 
> and often quite messy detail.  Those that wish to serialize the DOM in 
> other formats are certainly free to do so, but those formats aren't 
> HTML5.

This is already what we have -- the Web Apps 1.0 specification defines a 
single format, HTML5, with its syntax rules and parsing rules (including 
error handling). Serialisation to other formats is allowed, but not 
formally described by the Web Apps 1.0 specification. Due to its high 
profile, the serialisation that uses the XML syntax is explicitly 
addressed in the specification, and termed "XHTML5".

> But before they do, this work group certainly can anticipate that 
> question. What is the cost of accepting trailing slashes on elements 
> which are always defined with a content model of empty, except when 
> found in "Attribute value (unquoted) state"?  What sites would be parsed 
> differently based on this change?  Are those differences in line with 
> how existing browsers actually behave, or at odds with this behavior?

Again, these questions seem to betray a misunderstanding as to how the 
specification works. Trailing slashes were always ignored, and this has 
not changed. The only change is in whether such slashes are reported as 
errors in the validator or not. Whether something is an error has no 
effect on how it is parsed.

On Wed, 29 Nov 2006, Robert Sayre wrote:
> On 11/29/06, Lachlan Hunt <[EMAIL PROTECTED]> wrote:
> > 
> > I do not think it's a good idea to make the trailing slash conforming. 
> > Although it is harmless, it provides no add

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-11-30 Thread Michel Fortin


Le 30 nov. 2006 à 16:46, Sam Ruby a écrit :


On 11/30/06, Michel Fortin <[EMAIL PROTECTED]> wrote:


We can't really have a document that is both HTML5 and XHTML5 at the
same time if we keep the  declaration however.


Why not?


It seems I was mistaken about that. I was pretty sure that it'd be a  
parse error in XML, but I now look at the [DTD construct in the XML  
spec][1] and I cannot see why. Apparently this is a valid DTD for an  
XML document where the root element is :




These wouldn't since XML is case-sensitive:




 [1]: http://www.w3.org/TR/REC-xml/#dtd

So it appears after all that if HTML allows "/>", it would be  
possible and practical to have a single document which is valid for  
both HTML and XHTML at the same time. That doesn't mean the document  
will behave in the same way in the two cases however.


I wonder if allowing "/>" in HTML couldn't, contrary to some  
other arguments, help authors and developers to grasp the real  
difference between the two markups. Currently, "/>" is the signature  
of XHTML; people have learned that you add "/>" to HTML documents to  
make them XHTML. If HTML embraces the "/>" syntax, then that  
misleading hint no longer holds and people will have to learn to  
differentiate HTML from XHTML using other means (hint: media type!).  
They wouldn't really need to relearn anything if they don't want to,  
they'll just take note that "/>" doesn't necessarily mean XHTML  
anymore and that their valid XHTML1 documents served as text/html,  
when updated to XHTML5, are now called valid HTML5 documents by the  
validator.


Does this scenario make any sense?


Michel Fortin
[EMAIL PROTECTED]
http://www.michelf.com/

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-11-30 Thread Øistein E . Andersen

Trailing slashes in void elements are clearly unnecessary from a syntactic point
of view, but I think it can be argued that allowing them actually makes HTML
more internally consistent.

Current versions of HTML allow many unnecessary closing tags to be omitted
(e.g., ), and for authors exploiting this feature, adding trailing slashes
to void elements probably does not make much sense. Let's call this syntax I.

But current HTML also allows closing tags to be used for all non-void elements.
Authors doing this consistently will quite reasonably want to indicate closure
of void elements explicitly, which can be done by allowing something like
either  or , of which the former is probably preferable because it makes it
overt that the element is void, i.e., that it must be empty. Let's call this
syntax II.

[Personally, I would like a conformance checker to issue either a warning or an
informational statement if (1) some, but not all optional closing tags are
omitted, if (2) some, but not all void elements contain a trailing slash, and
perhaps even if (3) the author does not adhere to either syntax I or syntax II.
I do realise, though, that others may want to keep conformance and consistency
separate, and that such additions to a conformance checker are unlikely to be
made if important modifications would be necessary in order to do so. This is
only one example of inconsistency that authors might want to avoid, of course.]
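
To make check (2) concrete, here is a minimal sketch of what such a
consistency report could look like. It is built on Python's standard
html.parser, which is lenient and not an HTML5 conformance checker; the
VoidSlashCounter class and the mixed_void_style function are hypothetical
names, and the list of void elements is the one given at the head of this
thread.

```python
from html.parser import HTMLParser

# Elements with an empty content model, as listed at the head of the thread.
VOID_ELEMENTS = {
    "area", "base", "br", "col", "command", "embed",
    "hr", "img", "link", "meta", "param",
}


class VoidSlashCounter(HTMLParser):
    """Counts void elements written with and without a trailing slash."""

    def __init__(self):
        super().__init__()
        self.with_slash = 0
        self.without_slash = 0

    def handle_startendtag(self, tag, attrs):
        # Called for tags that end in "/>".
        if tag in VOID_ELEMENTS:
            self.with_slash += 1

    def handle_starttag(self, tag, attrs):
        # Called for tags that end in a plain ">".
        if tag in VOID_ELEMENTS:
            self.without_slash += 1


def mixed_void_style(markup):
    """Return True if some, but not all, void elements use a trailing slash."""
    counter = VoidSlashCounter()
    counter.feed(markup)
    counter.close()
    return counter.with_slash > 0 and counter.without_slash > 0


print(mixed_void_style("<p>a<br>b<hr/></p>"))
```

A real checker would report this as a warning or an informational message
rather than an error, which keeps consistency separate from conformance.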

Finally, not allowing trailing slashes in HTML does indeed make the format more
different from XHTML, but this probably does not imply that the distinction
between the two will be clearer or easier to grasp, which is what is really
wanted.

-- 
Øistein E. Andersen

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-11-30 Thread Sam Ruby


On 11/30/06, Michel Fortin <[EMAIL PROTECTED]> wrote:



We can't really have a document that is both HTML5 and XHTML5 at the
same time if we keep the  declaration however.



Why not?

- Sam Ruby

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-11-30 Thread Henri Sivonen


On Nov 30, 2006, at 17:57, Benjamin Hawkes-Lewis wrote:


On Thu, 2006-11-30 at 17:16 +0200, Henri Sivonen wrote:


Without labels, I do think that regardless of how the HTML5 spec
turns out, WordPress has an architectural flaw in its methodology of
producing markup. Since the flaw is in the architecture, I am not
optimistic of it getting fixed in WordPress because it would require
a rewrite. I'm hoping that at some point, a better system enters the
market. Meanwhile, asking the WP developers to rewrite theirs seems
unproductive.


Why? WordPress is much more than just the code: it is also a thriving
brand and community. Any /new/ system would have a big battle on its
hands to displace established players.


Of course, it would be better to have a better system get the  
community and brand benefits of WordPress. However, I believe that  
motivating the WordPress developers to write a new better but  
incompatible (with old plugins) system is not something that can be  
induced from the outside by just talking. It is the kind of change  
that is more difficult to carry out in an existing team and product  
community than to create a new product identity for. Even if it were  
to happen under the banner of WordPress, it would need to be driven  
by strong will from inside the team. The best way to induce such will  
from the outside is to create competitive pressure.


--
Henri Sivonen
[EMAIL PROTECTED]
http://hsivonen.iki.fi/

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-11-30 Thread Henri Sivonen


On Nov 30, 2006, at 21:48, Michel Fortin wrote:

The best way someone could fix the resulting tag soup would  
probably be to pass the result through HTML Tidy. And it should be  
pretty straightforward since the tidy library has been part of PHP  
since version 5.


I noticed, but it is not compiled in by default, which is frustrating.

What is really important is that authors understand better that  
HTML must be served as text/html and that XHTML must be served with  
an xml media type. If the validator enforces that, then I think  
it'll be sufficient.


http://hsivonen.iki.fi/validator/html5/ does. http://hsivonen.iki.fi/ 
validator/ does too, but allows the enforcement to be turned off.


--
Henri Sivonen
[EMAIL PROTECTED]
http://hsivonen.iki.fi/

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-11-30 Thread Michel Fortin


Le 30 nov. 2006 à 15:21, Elliotte Harold a écrit :


Michel Fortin wrote:

What is really important is that authors understand better that  
HTML must be served as text/html and that XHTML must be served  
with an xml media type. If the validator enforces that, then I  
think it'll be sufficient.


That's only plausible if

1. All browsers that accept XHTML served as text/html accept XHTML  
served as application/xhtml+xml.


2. Document authors can control the MIME types their documents are  
served with.


Neither is true today. Neither is likely to be true within the next  
couple of years, probably longer. They should be true by all means,  
but they aren't.


Given that fact of the installed base, I cannot accept that it is  
wrong to serve XHTML as text/html, and I'm afraid any effort that  
depends critically on that happening is doomed.


These are valid points, but I think you just misunderstood me. I could  
have said it this way: HTML must be served in a way that it is parsed  
by an HTML parser, and XHTML must be served in a way that it is  
parsed by an XML parser. That's the only sane thing you can do, and I  
don't really care if you do that by other means than the HTTP Content- 
Type header.


Now, when you're talking about "XHTML", you could be talking about  
two things. I'm talking about an XML document, a format where  is  
equivalent to . You just can't send that to an HTML parser.
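
As a small illustration of that equivalence on the XML side, the sketch
below parses both spellings with Python's xml.etree.ElementTree and
compares the resulting trees. The element names were lost from the
archived message, so the use of p and br here is an assumption, and the
shape helper is made up for this example.

```python
import xml.etree.ElementTree as ET


def shape(element):
    """Reduce an element to a comparable (tag, text, tail, children) tuple."""
    return (
        element.tag,
        element.text,
        element.tail,
        [shape(child) for child in element],
    )


# The same paragraph, once with an empty-element tag and once with a
# start-tag/end-tag pair that has nothing between them.
empty_tag = ET.fromstring("<p>one<br/>two</p>")
tag_pair = ET.fromstring("<p>one<br></br>two</p>")

print(shape(empty_tag) == shape(tag_pair))
```

An XML parser sees no difference between the two forms, which is exactly
what an HTML parser does not promise.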


If you're talking about HTML where you added "/>" at the end of  
singleton tags to make it compatible with XHTML, then you've just  
misunderstood me. I wasn't arguing against "/>" in HTML. If HTML  
allows "/>" on singleton tags,  becomes both valid  
HTML and XHTML. If it wasn't for the doctype, you could serve it with  
any of the two media types (assuming the scripts and other things  
work with XHTML too, of course).


So if a document is meant to be parsed by the HTML parser, it's an  
HTML document; if it's meant to be parsed by the XML parser, it's an  
XHTML document; and if it can be parsed by both, then it's both.
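
In HTTP terms, the usual switch is the Content-Type header. A minimal
sketch of that dispatch follows; the parser_for function and its return
labels are hypothetical, and real browsers apply more rules (sniffing,
other XML types) than this shows.

```python
# Media types assumed, for this sketch, to select the XML parser.
XML_MEDIA_TYPES = {"application/xhtml+xml", "application/xml", "text/xml"}


def parser_for(content_type):
    """Pick a parser label from a Content-Type header value."""
    # Drop parameters such as "; charset=utf-8" and normalise case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == "text/html":
        return "html"
    if media_type in XML_MEDIA_TYPES or media_type.endswith("+xml"):
        return "xml"
    return "unknown"


print(parser_for("text/html; charset=utf-8"))
print(parser_for("application/xhtml+xml"))
```

The same bytes can therefore be an HTML document or an XHTML document
depending only on how they are labelled when served.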


We can't really have a document that is both HTML5 and XHTML5 at the  
same time if we keep the  declaration however. I'm not  
sure that's really a problem though, and it may even be a good thing  
since it's pretty easy to change the doctype if you want -- a lot  
easier than changing every singleton tag -- but you have to make the  
change explicitly in the source, so when an error arises it'll be  
easier to link it to a HTML/XHTML issue.



Michel Fortin
[EMAIL PROTECTED]
http://www.michelf.com/

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-11-30 Thread Benjamin Hawkes-Lewis

On Thu, 2006-11-30 at 15:21 -0500, Elliotte Harold wrote:
> That's only plausible if [...] All browsers that accept XHTML served as 
> text/html accept XHTML 
> served as application/xhtml+xml.

This isn't required at all. All we really need is content
transformation. If systems like WordPress start using real DOMs instead
of tag soup, there would be much less of a problem.
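
A minimal sketch of the idea, assuming the content is held as a tree and
only turned into markup at output time. It uses Python's
xml.etree.ElementTree purely as a stand-in for a "real DOM"; the
paragraph content is invented for the example.

```python
import xml.etree.ElementTree as ET

# Build the content once, as a tree rather than as a string of tags.
paragraph = ET.Element("p")
paragraph.text = "first line"
line_break = ET.SubElement(paragraph, "br")
line_break.tail = "second line"

# Serialise the same tree twice, once per output syntax.
as_html = ET.tostring(paragraph, encoding="unicode", method="html")
as_xml = ET.tostring(paragraph, encoding="unicode", method="xml")

print(as_html)
print(as_xml)
```

The HTML output writes the void element as a bare start tag and the XML
output writes it as an empty-element tag, with no string substitution
involved.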

--
Benjamin Hawkes-Lewis

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-11-30 Thread Elliotte Harold


Michel Fortin wrote:

What is really important is that authors understand better that HTML 
must be served as text/html and that XHTML must be served with an xml 
media type. If the validator enforces that, then I think it'll be 
sufficient.



That's only plausible if

1. All browsers that accept XHTML served as text/html accept XHTML 
served as application/xhtml+xml.


2. Document authors can control the MIME types their documents are 
served with.


Neither is true today. Neither is likely to be true within the next 
couple of years, probably longer. They should be true by all means, but 
they aren't.


Given that fact of the installed base, I cannot accept that it is wrong 
to serve XHTML as text/html, and I'm afraid any effort that depends 
critically on that happening is doomed.


--
Elliotte Rusty Harold  [EMAIL PROTECTED]
Java I/O 2nd Edition Just Published!
http://www.cafeaulait.org/books/javaio2/
http://www.amazon.com/exec/obidos/ISBN=0596527500/ref=nosim/cafeaulaitA/

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-11-30 Thread Michel Fortin


Le 30 nov. 2006 à 10:16, Henri Sivonen a écrit :

Without labels, I do think that regardless of how the HTML5 spec  
turns out, WordPress has an architectural flaw in its methodology  
of producing markup. Since the flaw is in the architecture, I am  
not optimistic of it getting fixed in WordPress because it would  
require a rewrite. I'm hoping that at some point, a better system  
enters the market. Meanwhile, asking the WP developers to rewrite  
theirs seems unproductive.


I concur with that. While it may be true that WordPress often gives  
valid XHTML1 markup, it can't be denied that the internal processing  
manipulates pseudo-HTML tag soup at almost every level; that's not a  
good architecture if you ask me. Integrating Markdown correctly into  
the text system of WordPress is a big hack because of that; filters  
have to be inserted all around the place as workarounds for various  
issues. I've written on the subject if someone wants to dig deeper:


 

The best way someone could fix the resulting tag soup would probably  
be to pass the result through HTML Tidy. And it should be pretty  
straightforward since the tidy library has been part of PHP since  
version 5.


 - - -

For me, accepting /> in HTML could be an acceptable solution. It sure  
is a departure from what was accepted as HTML previously, but I see  
no point in trying to convince everyone to change (again) their  
markup for cosmetic reasons.


What is really important is that authors understand better that HTML  
must be served as text/html and that XHTML must be served with an xml  
media type. If the validator enforces that, then I think it'll be  
sufficient.


What really confused people with XHTML1 is that the validator accepts  
XHTML1 as text/html without complaining, without even checking that  
the document will do fine when parsed as HTML. If the validator tells  
someone that his  is perfectly valid for XHTML1 as text/html,  
it's normal for that person to wonder why it doesn't work in the  
browser. That's confusing for authors, and that's exactly what we  
should avoid.



Michel Fortin
[EMAIL PROTECTED]
http://www.michelf.com/

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-11-30 Thread Hallvord R M Steen


> http://example.org/bar/>



 Just require quotes around attribute values like
 HTML should have done 15 years ago.


You can "require" all that you want but we have to specify how to
parse content that is out there with this exact error. Anyway, this
discussion is really about validation.


Sam says:


 for the checked/> example it actually could go
 either way.  I'd personally allow it.


"Allow" as in "not make the validator warn against it" I presume.
Anything goes as long as it is not appended to the attribute name.

--
Hallvord R. M. Steen

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-11-30 Thread Elliotte Harold


Hallvord R M Steen wrote:


It's the core of the debate, namely if  isn't technically a
problem why are validators required to flag it as invalid? The counter
examples are comparisons with  which isn't parsed into the DOM
most would expect when sent as HTML, and corner cases like

http://example.org/bar/>



That one's easy to fix. Just require quotes around attribute values like 
 HTML should have done 15 years ago.


--
Elliotte Rusty Harold  [EMAIL PROTECTED]
Java I/O 2nd Edition Just Published!
http://www.cafeaulait.org/books/javaio2/
http://www.amazon.com/exec/obidos/ISBN=0596527500/ref=nosim/cafeaulaitA/

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-11-30 Thread Sam Ruby

Hallvord R M Steen wrote:

On 30/11/06, Anne van Kesteren <[EMAIL PROTECTED]> wrote:

> Closing slash on void elements
> sounds like a good example of "this is invalid because we're sticking
> to our fixed ideas"[1] rather than "this is invalid for technical
> reasons like causing ambiguities in DOM parsing". So I support Sam's
> approach.

Well, nothing per the parsing section causes "ambiguities in DOM parsing"
(assuming I understand what that means). So I'm not sure what you're
suggesting.

It's the core of the debate, namely if  isn't technically a
problem why are validators required to flag it as invalid? The counter
examples are comparisons with  which isn't parsed into the DOM
most would expect when sent as HTML, and corner cases like

http://example.org/bar/>

- now, how do you resolve relative URLs in this document? This is the
sort of ambiguity the DOM parsing has to take into account - caused by
the usage of forward closing slashes within tags. If the spec can
specify simple non-ambiguous ways of parsing that like the author
expects I think we can relax validation requirements like Sam wants.

From the head of this thread:

  As an additional constraint, I am explicitly suggesting that the
  "Attribute value (unquoted) state" not be changed - slashes in this
  state would continue to be appended to the current attribute's value

What this means is that the above example MUST be interpreted 
identically as:

  http://example.org/bar/">

> That said, HTML5 must see
>
> 
>
> as a checkbox input with a "checked" attribute.

It does.

Included in the discussion to make sure HTML5 continues to do so even
if the change I want (more liberal validation) is taken in.

Yes, verb tenses are problematic in the face of a fluid draft document 
and an active proposal.

The current draft indicates that the above would be a parse error, and 
would result in the exact same parse error and DOM (modulo attribute 
order) as the following:

Clearly a fleshed out version of this proposal would preserve the 
existing specification behavior (including the parse error) for the 
checked/type example, but for the checked/> example it actually could go 
either way.  I'd personally allow it.

- Sam Ruby

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-11-30 Thread Thomas Broyer


2006/11/30, Hallvord R M Steen:

> Well, nothing per the parsing section causes "ambiguities in DOM parsing"
> (assuming I understand what that means). So I'm not sure what you're
> suggesting.

It's the core of the debate, namely if  isn't technically a
problem why are validators required to flag it as invalid? The counter
examples are comparisons with  which isn't parsed into the DOM
most would expect when sent as HTML, and corner cases like

http://example.org/bar/>

- now, how do you resolve relative URLs in this document? This is the
sort of ambiguity the DOM parsing has to take into account - caused by
the usage of forward closing slashes within tags. If the spec can
specify simple non-ambiguous ways of parsing that like the author
expects I think we can relax validation requirements like Sam wants.


How about: a slash is ignored in the start tag of a void element if it
appears just before the closing > and it unambiguously is not part of
an attribute value.
-  => no attribute, ignored
- http://example.org/bar"/> => after the closing quote, ignored
- http://example.org/bar /> => preceded by a space, so it's
not part of the attribute value => ignored
- http://example.org/bar/> => could be part of the
attribute value, so treated as *being* part of it
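
The rule above can be sketched as a small classifier. This is a
simplified illustration, not the HTML5 tokenizer: it looks only at the
text of a single start tag, the classify_trailing_slash function is a
hypothetical name, and the br, base and input tags in the usage lines are
assumptions, since the tag names were stripped from the archived
examples.

```python
def classify_trailing_slash(start_tag):
    """Classify the slash before the closing '>' of one start tag."""
    if not start_tag.endswith("/>"):
        return "no trailing slash"

    before_slash = start_tag[:-2]
    # A space before the slash ends any unquoted attribute value.
    if before_slash[-1].isspace():
        return "ignored"

    last_token = before_slash.split()[-1]
    # A closing quote ends the value; a token without '=' is a tag name
    # or a valueless attribute, so the slash cannot belong to a value.
    if last_token[-1] in "\"'" or "=" not in last_token:
        return "ignored"

    # Otherwise the slash directly follows an unquoted value.
    return "part of attribute value"


for tag in (
    "<br/>",
    '<base href="http://example.org/bar"/>',
    "<base href=http://example.org/bar />",
    "<base href=http://example.org/bar/>",
    "<input type=checkbox checked/>",
):
    print(tag, "=>", classify_trailing_slash(tag))
```

Only the unquoted value that runs straight into the slash is treated as
containing it, which matches leaving the "Attribute value (unquoted)
state" unchanged.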

--
Thomas Broyer

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-11-30 Thread Benjamin Hawkes-Lewis

On Thu, 2006-11-30 at 17:16 +0200, Henri Sivonen wrote:

> Without labels, I do think that regardless of how the HTML5 spec  
> turns out, WordPress has an architectural flaw in its methodology of  
> producing markup. Since the flaw is in the architecture, I am not  
> optimistic of it getting fixed in WordPress because it would require  
> a rewrite. I'm hoping that at some point, a better system enters the  
> market. Meanwhile, asking the WP developers to rewrite theirs seems  
> unproductive.

Why? WordPress is much more than just the code: it is also a thriving
brand and community. Any /new/ system would have a big battle on its
hands to displace established players.

--
Benjamin Hawkes-Lewis

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-11-30 Thread Leons Petrazickis

On 11/30/06, Sam Ruby <[EMAIL PROTECTED]> wrote:

On 11/30/06, Anne van Kesteren <[EMAIL PROTECTED]> wrote:
>
> It has to allow two authoring syntaxes. One HTML and one XML. I thought we
> were past that discussion?

The sense I am gathering is that the proposal is not obviously insane, and
in fact is a bit novel in that such a narrowly scoped adoption of XML syntax
-- i.e., only to the extent that it both reflects the web as widely
practiced and only to the extent that doing such does not introduce
ambiguity into the grammar -- had not been considered before.

In any case, I plan to proceed on the assumption that it is worth my time to
flesh out the proposal a bit more.  The next iteration is likely to also
contain thoughts on extensibility and namespaces.  Like this proposal was,
my intent is that that proposal too will also take great care to only be
minimally invasive.

So far, the proposal is to have two syntaxes:

1) HTML5 - Backwards-compatible text/html syntax that allows trailing
slashes on always-empty elements.
2) XHTML5 - Full XML syntax with the proper mime type.

I am not sure where extensibility and namespaces would fit into that.
Perhaps they should be proposed independently of this.

Again, early adopters of CSS, validation, and semantic mark-up were
told a story. That story said that maintainability=>no formatting in
HTML=>CSS=>XHTML=>trailing slashes on empty elements. That's not
true, but quibbling nets few converts. If we make trailing slashes
invalid, then every bug report we file will say:
- Remove trailing slashes. You can't serve XHTML like this.

That will invariably be rejected.

If we keep them valid on always-empty elements, then it'll be much nicer:
- HTML5 is the new hot thing. You shouldn't be serving XHTML as
text/html anyways. Switch doctypes, revalidate, and iteratively
improve markup

It's much easier to gain acceptance, agreement, buy-in, consensus on
the latter. The validation errors they'll see will actually help them
with browser compatibility.

--
Leons Petrazickis

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-11-30 Thread Hallvord R M Steen

On 30/11/06, Anne van Kesteren <[EMAIL PROTECTED]> wrote:

> Closing slash on void elements
> sounds like a good example of "this is invalid because we're sticking
> to our fixed ideas"[1] rather than "this is invalid for technical
> reasons like causing ambiguities in DOM parsing". So I support Sam's
> approach.

Well, nothing per the parsing section causes "ambiguities in DOM parsing"
(assuming I understand what that means). So I'm not sure what you're
suggesting.

It's the core of the debate, namely if  isn't technically a
problem why are validators required to flag it as invalid? The counter
examples are comparisons with  which isn't parsed into the DOM
most would expect when sent as HTML, and corner cases like

http://example.org/bar/>

- now, how do you resolve relative URLs in this document? This is the
sort of ambiguity the DOM parsing has to take into account - caused by
the usage of forward closing slashes within tags. If the spec can
specify simple non-ambiguous ways of parsing that like the author
expects I think we can relax validation requirements like Sam wants.

> That said, HTML5 must see
>
> 
>
> as a checkbox input with a "checked" attribute.

It does.

Included in the discussion to make sure HTML5 continues to do so even
if the change I want (more liberal validation) is taken in.

--
Hallvord R. M. Steen

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-11-30 Thread Simon Pieters


Hi,

From: Ian Hickson <[EMAIL PROTECTED]>

I think basically the argument is "it would help people" and the counter
argument is "it would confuse people". We need evidence to back up these
arguments so we can make a solid decision. The only relevant data I have
is that 50% of the web uses trailing slashes, and only 17% uses XHTML.
This could be used to back up either argument: "clearly people think that
trailing slashes are allowed, so we should allow them", and "clearly
people are confused about trailing slashes, so we should get rid of them
altogether". I don't know which is best.


Previously I thought trailing slashes should be disallowed in HTML5, mostly 
because it means something different in SGML using the SGML declaration for 
HTML4. But now we don't care about compatibility with SGML, so this point is 
irrelevant.


If we disallow it, authors who use tag soup systems that emit "XHTML" today 
and want to convert to HTML5 will probably just do a search and replace 
either in the files directly or with a script before serving. Such string 
substitution is not safe because it might lead to /> being replaced with > 
in other places than intended, e.g. in comments, attribute values, style 
sheets, scripts, content, XML files, etc. Additionally, replacing /> with > 
even if done carefully or safely does not add any value, because it is 
already handled interoperably. That this has happened on blog.whatwg.org is 
not an isolated example; I've seen XHTML to HTML4 scripts using string 
substitution all over the place -- I've even written one myself in the past 
-- and they are completely useless because they require that the input 
already works as text/html without any conversion.
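
A minimal sketch of why such a substitution is unsafe, using an invented
input document; the naive_xhtml_to_html function is a hypothetical
stand-in for the kind of script described above.

```python
def naive_xhtml_to_html(markup):
    """Blindly strip the slash from every '/>' in the document."""
    return markup.replace("/>", ">")


# The "/>" sequence appears in a script string and in a comment,
# as well as in the one tag the author actually meant to change.
source = '<script>var label = "<hr/>";</script><!-- was <hr/> --><hr/>'
converted = naive_xhtml_to_html(source)

print(converted)
```

The tag is converted as intended, but the string literal inside the
script and the text of the comment are silently rewritten as well.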


So now I'm starting to think that trailing slashes for void elements should 
be allowed in HTML5.


Regards,
Simon Pieters


Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-11-30 Thread Henri Sivonen


On Nov 30, 2006, at 14:15, Sam Ruby wrote:


Henri Sivonen wrote:

I don't think it has any actual technical merit
OTOH, the blog.whatwg.org WordPress lipsticking drill was a total  
waste of time from a technical point of view. It was purely about  
public relations and politics.


As an alternative to being perceived as a "lipsticking drill", I  
would prefer that others felt that an important part of the spec  
authoring process includes what amounts to a feasibility study and  
hands on experimentation with extant authoring tools.


That's my personal perception considering what I knew before, what I  
know after and what the opportunity cost to me personally was. The  
problem that was fixed was not causing any technical interop harm,  
which I knew before, and I also had a pretty good hunch in advance  
that "fixing" it in WP would amount to lipstick or duct tape.


I do think that the discussion that ensued has been good for the spec.

(Still, I am against efforts to make it appear that the text/html and  
application/xhtml+xml syntaxes are one thing.)



I apologize if I've caused any ill will.


No ill will on my part. I apologize if it appeared to be the case.

I am just disappointed with myself that I didn't stick to my policy  
of steering clear of situations where I risk ending up patching  
someone else's PHP code. (Because I've patched PHP code before and I  
know it makes me unhappy.)


I do believe that efforts to keep blog.whatwg.org and other sites  
to be valid relative to the current draft of HTML5 are important in  
order to keep perspective


I agree that dogfooding is important. Dogfooding Web Forms 2.0 at  
http://hsivonen.iki.fi/validator/ has led to one spec improvement.  
But I don't volunteer to dogfood HTML5 on WordPress.



and to provide an example for others to learn from.


I am a bit wary of setting examples at this point before the spec has  
stabilized more. For example, not discussing the HTML5 doctype *yet*  
at http://hsivonen.iki.fi/doctype/ is intentional. (Not discussing  
Opera 9 and IE7 is just a scheduling problem. I intend to include  
them RSN.)


Finally, I will express a bit of disappointment at seeing the  
WordPress folks prematurely being labeled bozos,


Even though I have a document in which I quote Tim Bray's coinage of  
the term in the XML context, I have tried to avoid labeling anyone in  
particular a bozo.


Without labels, I do think that regardless of how the HTML5 spec  
turns out, WordPress has an architectural flaw in its methodology of  
producing markup. Since the flaw is in the architecture, I am not  
optimistic of it getting fixed in WordPress because it would require  
a rewrite. I'm hoping that at some point, a better system enters the  
market. Meanwhile, asking the WP developers to rewrite theirs seems  
unproductive.


and am disappointed to see portions of this discussion framed in  
terms that border on the discussions of epic battles with Zeldman.


I was acknowledging that I agreed that the /> habit has been largely  
popularized by Zeldman et al. and that it is more of a fashion  
statement than a technical necessity. If there were some repressed  
ill feelings there, it is probably because at times it annoys me that  
although Zeldman et al. have been very successful in instilling  
mantras in the minds of Web authors, they haven't been able to  
instill profound understanding of the related issues, and sometimes  
the axiomatic mantras get in the way.  
(http://hsivonen.iki.fi/wannabe/ grew out of an encounter with a person  
repeating one of the mantras at me.)


--
Henri Sivonen
[EMAIL PROTECTED]
http://hsivonen.iki.fi/

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-11-30 Thread Anne van Kesteren

On Thu, 30 Nov 2006 15:14:03 +0100, Hallvord R M Steen  
<[EMAIL PROTECTED]> wrote:

FWIW, it sounds sane to me to align validation as much as possible
with the UA parsing in a way that issues that aren't really problems
for the UA aren't flagged as invalid. Closing slash on void elements
sounds like a good example of "this is invalid because we're sticking
to our fixed ideas"[1] rather than "this is invalid for technical
reasons like causing ambiguities in DOM parsing". So I support Sam's
approach.


Well, nothing per the parsing section causes "ambiguities in DOM parsing"  
(assuming I understand what that means). So I'm not sure what you're  
suggesting.




That said, HTML5 must see



as a checkbox input with a "checked" attribute.


It does.



[...]

[1] disclaimer: not intended as a flame-bait but probably is one.. :-ø



--
Anne van Kesteren

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-11-30 Thread Hallvord R M Steen


The sense I am gathering is that the proposal is not obviously insane, and
in fact is a bit novel in that such a narrowly scoped adoption of XML syntax
-- i.e., only to the extent that it both reflects the web as widely
practiced and only to the extent that doing such does not introduce
ambiguity into the grammar -- had not been considered before.


FWIW, it sounds sane to me to align validation as much as possible
with the UA parsing in a way that issues that aren't really problems
for the UA aren't flagged as invalid. Closing slash on void elements
sounds like a good example of "this is invalid because we're sticking
to our fixed ideas"[1] rather than "this is invalid for technical
reasons like causing ambiguities in DOM parsing". So I support Sam's
approach.

That said, HTML5 must see



as a checkbox input with a "checked" attribute. Finding a "checked/"
attribute and not checking the checkbox is not compatible with the web
(learnt the hard way!).  Perhaps finding a slash in "attribute name"
mode on void elements should be a parse error if the next character is
not > ? (Pretty certain the specs already disallow attribute names
starting with forward slash.)
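
A rough sketch of the behaviour asked for above, in Python. This is not the HTML5 tokenizer algorithm; `naive_attr_names` and `parse_start_tag` are hypothetical helpers, and the sample tag is an assumed stand-in for the markup that was lost from this message in archiving. It contrasts a whitespace split, which produces a "checked/" attribute, with a parse that drops the slash before ">".

```python
import re

# Attribute name: a run of characters that are not whitespace, '/', '=' or '>'.
# The value, if present, is double-quoted, single-quoted or unquoted.
ATTR = re.compile(r"""([^\s/=>]+)(?:\s*=\s*("[^"]*"|'[^']*'|[^\s>]+))?""")


def naive_attr_names(tag: str) -> list:
    """Split on whitespace only, so a trailing slash sticks to the last name."""
    return [part.split("=")[0] for part in tag[1:-1].split()[1:]]


def parse_start_tag(tag: str):
    """Return (name, attributes, had_trailing_slash) for one start tag string."""
    body = tag[1:-1]                      # strip '<' and '>'
    trailing_slash = body.endswith("/")
    if trailing_slash:
        body = body[:-1]                  # ignore the slash right before '>'
    name, _, rest = body.strip().partition(" ")
    attrs = {}
    for match in ATTR.finditer(rest):
        key, value = match.group(1), match.group(2)
        if value is None:
            value = ""                    # boolean attribute such as 'checked'
        elif value[0] in "\"'":
            value = value[1:-1]           # remove the surrounding quotes
        attrs[key.lower()] = value
    return name.lower(), attrs, trailing_slash


print(naive_attr_names('<input type="checkbox" checked/>'))
print(parse_start_tag('<input type="checkbox" checked/>'))
```

The second function reports the same attributes with or without the slash, which is the outcome the message says the web depends on.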

[1] disclaimer: not intended as a flame-bait but probably is one.. :-ø

--
Hallvord R. M. Steen

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-11-30 Thread Mike Schinkel

Hi All:

Being new to this list, I've been following this thread with interest and
have some questions and comments:

As for my questions:

1.) I read the FAQ http://blog.whatwg.org/faq/ and it seemed to imply that
HTML 5 and XHTML were not at odds with each other?  Did I misread that,
because from comments on this thread I get the impression that might not be
the case.

2.) A similar question, but is the goal for HTML5 and XHTML to slowly
converge, or is the goal for them to diverge?  If the former, it would seem
that Sam's proposal makes a heck of a lot of sense.  If the latter, I would
ask why?  Why would you want to create two different standards to choose
between? Why create a fork in the road where whichever branch you take means
you'll never be taking the other one (w/o a lot of backtracking anyway.)  If
they are converging, there's no need to fear which road to take because both
will eventually get you to the same place.

As for confusion vs creating a problem, from my seven years experience as a
programming instructor and course material author I would look at it like
this:

1.) Don't support trailing slashes, and now people have to be taught how to
fix it.  That requires active effort and active understanding which is
likely to cause more than a bit of confusion because it is different than
what was accepted before and because it is different than XHTML. They will
wonder about the difference with 99.9% of them never reading these archives
and probably >99% of them never getting a full explanation. Hence
significant confusion among a large number of people who are attempting to
validate with ongoing confusion between what works for HTML and what works
for XHTML when ironically the goal was to reduce confusion.

2.) Do support trailing slashes. Everything just works for everybody:
Validates fine for people who don't use trailing slashes. Validates fine for
people who do use trailing slashes. Most people never even notice the
inconsistency (lack of consistency is not a problem for most people unless
it stops them from otherwise doing something. That's partly why so many
salespeople so easily paint so many technical people into corners when they
promise a world that actually can't be delivered. But I digress... :)  A few
people would notice the inconsistency, but most of them are of the mind to
understand the explanation. For those who aren't, they can just be told one
of the following:

1.) "Well, the designers chose that to be consistent with both HTML and
XHTML since it doesn't otherwise cause a problem," or 
2.) "The trailing slash is optional for the singleton elements only, as they
are the only ones where it's applicable," 

OR MUCH MORE LIKELY:

3.) "Uh, don't worry about it.  It doesn't really matter. Works either way."

Given answer #3, I can almost guarantee you that 99% of people told #3 will
be happy as a clam with the answer and just go about their business, totally
unconfused.  Most people won't care enough to evaluate the inconsistency any
further, as long as it's not causing them any problems.  (This again from my
seven years' experience training programmers.)  

JMTCW IMHO, anyway. :)

A final reason to support trailing slashes is for people who, like me, plan
to one day fully support XHTML even though that plan may ultimately never
materialize. :)

Oh, one last thing. I recently chose to use WordPress because, after working
with several others and being less than happy with them (dasBlog, Community
Server, and TypePad)  it appeared that WordPress was by far the best when it
came to standards and supporting new and useful features. And I was very
happy with the decision. Now, after reading this thread, I'm thoroughly
depressed with respect to WordPress. FWIW.  :-(  

-Mike Schinkel
http://www.mikeschinkel.com/blogs/
http://www.welldesignedurls.org/

P.S. Hi Sam, you might recognize my name as the founder and former president
of VBxtras/Xtras.Net, here reincarnated in a new form. :)

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-11-30 Thread Anne van Kesteren

On Thu, 30 Nov 2006 12:51:36 +0100, Sam Ruby <[EMAIL PROTECTED]>  
wrote:
It has to allow two authoring syntaxes. One HTML and one XML. I thought  
we were past that discussion?


I fully expected my proposal to either be bounced immediately as sheer
lunacy, or for someone to quickly point to the specific reason why it had
been rejected before.


I think it's still not clear to me what your proposal is and what it would  
entail. I thought you solely argued for allowing "/" at the end of the  
start tag of void elements, but it seems you don't want an XML  
serialization anymore as well or something in that direction.



The sense I am gathering is that the proposal is not obviously insane,  
and in fact is a bit novel in that such a narrowly scoped adoption of  
XML syntax -- i.e., only to the extent that it both reflects the web as  
widely practiced and only to the extent that doing such does not introduce
ambiguity into the grammar -- had not been considered before.


I'm not sure I get this.



[...]



--
Anne van Kesteren

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-11-30 Thread Sam Ruby


Henri Sivonen wrote:



I don't think it has any actual technical merit


OTOH, the blog.whatwg.org WordPress lipsticking drill was a total waste 
of time from a technical point of view. It was purely about public 
relations and politics.


As an alternative to being perceived as a "lipsticking drill", I would 
prefer that others felt that an important part of the spec authoring 
process includes what amounts to a feasibility study and hands on 
experimentation with extant authoring tools.


I apologize if I've caused any ill will.

I do believe that efforts to keep blog.whatwg.org and other sites to be 
valid relative to the current draft of HTML5 are important in order to 
keep perspective and to provide an example for others to learn from.


Finally, I will express a bit of disappointment at seeing the WordPress 
folks prematurely being labeled bozos, and am disappointed to see 
portions of this discussion framed in terms that border on the 
discussions of epic battles with Zeldman.


- Sam Ruby

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-11-30 Thread Sam Ruby


On 11/30/06, Anne van Kesteren <[EMAIL PROTECTED]> wrote:



It has to allow two authoring syntaxes. One HTML and one XML. I thought we
were past that discussion?



I fully expected my proposal to either be bounced immediately as sheer
lunacy, or for someone to quickly point to the specific reason why it had
been rejected before.

The sense I am gathering is that the proposal is not obviously insane, and
in fact is a bit novel in that such a narrowly scoped adoption of XML syntax
-- i.e., only to the extent that it both reflects the web as widely
practiced and only to the extent that doing such does not introduce
ambiguity into the grammar -- had not been considered before.

In any case, I plan to proceed on the assumption that it is worth my time to
flesh out the proposal a bit more.  The next iteration is likely to also
contain thoughts on extensibility and namespaces.  Like this proposal was,
my intent is that that proposal too will also take great care to only be
minimally invasive.

- Sam Ruby

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-11-30 Thread Henri Sivonen


On Nov 30, 2006, at 00:18, James Graham wrote:

I tentatively support the idea that trailing slashes on  
"singleton"[1] elements should not be a parse error.


Me, too, and I'm past the tentative phase.


I don't think it has any actual technical merit


OTOH, the blog.whatwg.org WordPress lipsticking drill was a total  
waste of time from a technical point of view. It was purely about  
public relations and politics.


but I think it will be helpful in getting developer mindshare; a  
lot of people have drunk the "Zeldman Koolaid" and have the ideas  
of XHTML, clean markup, CSS, and conformance to standards in  
general all mushed together in their brain[2]. For these people  
(who I think represent the upper quartile of web developers in  
terms of commitment to good markup) the trailing slash in empty  
elements is the syntax of a new generation - it is a symbol that  
represents everything that has changed in web design since 1996 -  
as intrinsically useless as a fashionable designer label but just  
as seductive.


+1

Propaganda efforts are better directed at other issues than undoing  
the Zeldmanian />.


[1] I find that name quite confusing as it suggests there should  
only be one in the entire document.


They are void elements now.

--
Henri Sivonen
[EMAIL PROTECTED]
http://hsivonen.iki.fi/

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-11-30 Thread Anne van Kesteren

On Thu, 30 Nov 2006 00:46:37 +0100, Sam Ruby <[EMAIL PROTECTED]>  
wrote:

I think that it is fair to assume that the majority of the people who
(rightfully or incorrectly) assume that they are using XHTML use trailing
slashes.


Has it actually been checked how many people use an XHTML doctype and  
forget to use the trailing slashes on one or more elements?




It is fair to conclude that 33% (i.e., 50-17) of those that (in this
case incorrectly) assume that they are producing HTML use trailing  
slashes. In Ian's terminology, these people are confused.


Same here: do those documents use them occasionally or throughout?


The first question I think we can answer fairly conclusively: of those  
33%, how many will become "un-confused" if HTML5 does not permit trailing
slashes?  Hint: the version of HTML they are currently using already  
doesn't permit trailing slashes.


The current version of HTML actually does, because it is in part based on  
SGML (and at the same time isn't according to some people). The meaning of  
a trailing slash there is, however, very different. Of course,  
validator.w3.org (most widely used validator I reckon), doesn't really  
tell you that.




Path 1: HTML5 permits two authoring syntaxes, and the question as to  
whether or not trailing slashes are allowed is forever "it depends".
I continue to maintain that most people don't understand DOCTYPEs, and
will point to the 50% number above as being consistent with that contention.


It has to allow two authoring syntaxes. One HTML and one XML. I thought we  
were past that discussion?




Path 2: HTML5 permits only one authoring syntax, and permits "XML-style"
notation only to the extent that such syntax wouldn't be interpreted in a
different manner by consumers that only understand HTML.  The documentation
for HTML5 would contain examples of such cases, and any conformance checker
would only point out such examples.


That only understand HTML??



[...] For example, technically &apos; would fall on the wrong side of the
argument, but as I can see from the current draft of HTML5, the right
decision was already made in that case.


The sole reason for that is that a couple of user agents support it in HTML.


--
Anne van Kesteren

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-11-29 Thread Sam Ruby


On 11/29/06, Ian Hickson <[EMAIL PROTECTED]> wrote:


I think basically the argument is "it would help people" and the counter
argument is "it would confuse people". We need evidence to back up these
arguments so we can make a solid decision. The only relevant data I have
is that 50% of the web uses trailing slashes, and only 17% uses XHTML.
This could be used to back up either argument: "clearly people think that
trailing slashes are allowed, so we should allow them", and "clearly
people are confused about trailing slashes, so we should get rid of them
altogether". I don't know which is best.



More data would certainly be better, particularly as this data apparently
doesn't distinguish between "singleton slashes" and "non-singleton slashes",
and I have my intuitions on that matter, but limiting myself to only the
data we have:

I think that it is fair to assume that the majority of the people who
(rightfully or incorrectly) assume that they are using XHTML use trailing
slashes.  Besides, this is the most conservative assumption for the next
point:

It is fair to conclude that 33% (i.e., 50-17) of those that (in this
case incorrectly) assume that they are producing HTML use trailing slashes.
In Ian's terminology, these people are confused.

The first question I think we can answer fairly conclusively: of those 33%,
how many will become "un-confused" if HTML5 does not permit trailing
slashes?  Hint: the version of HTML they are currently using already doesn't
permit trailing slashes.

The remaining question is how many of the 67% of the 83% (i.e., 55%) of
people who use HTML and don't use trailing slashes would suddenly become
confused (assuming that they aren't already, but just happen to be lucky or
use a good tool) if HTML5 were to now give them an option that provides them
little, if any, value.  This question I don't believe is answerable only by
scanning the documents that they produce.  But I will say that few people
produce documents in a vacuum, and given that 50% of the documents on the
web contain trailing slashes, I dare say that most if not all of them have
already been exposed to such documents.
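
The arithmetic behind the 33% and 55% figures can be laid out as a short Python sketch. The two input percentages are the ones quoted from Ian Hickson; the variable names are illustrative, and the calculation rests on the stated assumption that every XHTML document uses trailing slashes. Read strictly, 50 - 17 is a share of all documents rather than of HTML documents, so the 55% is best taken as a rough figure.

```python
# Figures quoted from Ian Hickson's message, as whole percentages of documents.
TRAILING_SLASH = 50   # documents that use trailing slashes
XHTML = 17            # documents that use XHTML

# Assumption made in the message: every XHTML document uses trailing slashes.
html = 100 - XHTML                          # documents treated as HTML
html_with_slashes = TRAILING_SLASH - XHTML  # HTML documents that use slashes
estimate_without = 0.67 * html              # "67% of the 83%"

print(html, html_with_slashes, round(estimate_without, 1))
```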

And like most questions, the mere existence of HTML5 is likely to influence
the answer to the question.

More specifically, imagine two paths:

Path 1: HTML5 permits two authoring syntaxes, and the question as to whether
or not trailing slashes are allowed is forever "it depends".  I continue to
maintain that most people don't understand DOCTYPEs, and will point to the
50% number above as being consistent with that contention.

Path 2: HTML5 permits only one authoring syntax, and permits "XML-style"
notation only to the extent that such syntax wouldn't be interpreted in a
different manner by consumers that only understand HTML.  The documentation
for HTML5 would contain examples of such cases, and any conformance checker
would only point out such examples.

Note: the two paths above are mere thumbnail sketches.  The devil's in the
detail.  For example, technically &apos; would fall on the wrong side of the
argument, but as I can see from the current draft of HTML5, the right
decision was already made in that case.

- Sam Ruby

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-11-29 Thread Robert Sayre

On 11/29/06, Robert Sayre <[EMAIL PROTECTED]> wrote:

On 11/29/06, Robert Sayre <[EMAIL PROTECTED]> wrote:
>
> Ok, I have submitted a bug report.
>
> http://trac.wordpress.org/ticket/3406
>
> Let's see what happens.

Well, that didn't seem too effective. :/

Ah, if you visit now, you'll find a WHAT-WG member has written
"fundamental flaw with the way WordPress has been built" in bright red
letters. Not exactly Dale Carnegie material.

It still seems impossible to file a bug on teXtHTML.

Sam Ruby wrote:

Drawing lines in the sand and maintaining that "" is invalid is only going
to make more busy work for a lot of people.  If you try to explain why this
decision was made, most won't understand, and eventually most will decide
that compliance isn't worth the bother.

Agree.

--

Robert Sayre

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-11-29 Thread James Graham


Ian Hickson wrote:

On Wed, 29 Nov 2006, Leons Petrazickis wrote:
This rigmarole is going to repeat on every site that has converted to 
XHTML sent as text/html. People are emotionally invested in the idea of 
trailing slashes. Websites have complex codebases, and going through 
them removing trailing slashes on singleton elements would be very hard.


If people want to make HTML5 syntactically compatible with XHTML1, such 
that XHTML1 documents don't cause syntax errors in HTML5, we'll have to do 
a whole lot more than just allowing trailing /s. I don't really see why 
that would be a goal, though. Going further, if we want to make documents 
in general compliant with HTML5, then we've got our work cut out for us -- 
at least 78% of documents are syntactically incorrect today (not counting 
things like trailing /s in attributes, or missing DOCTYPEs -- if you 
include those, the number is more like 93%).


I tentatively support the idea that trailing slashes on "singleton"[1] 
elements should not be a parse error. I don't think it has any actual 
technical merit but I think it will be helpful in getting developer 
mindshare; a lot of people have drunk the "Zeldman Koolaid" and have the 
ideas of XHTML, clean markup, CSS, and conformance to standards in 
general all mushed together in their brain[2]. For these people (who I 
think represent the upper quartile of web developers in terms of 
commitment to good markup) the trailing slash in empty elements is the 
syntax of a new generation - it is a symbol that represents everything 
that has changed in web design since 1996 - as intrinsically useless as 
a fashionable designer label but just as seductive.


[1] I find that name quite confusing as it suggests there should only be 
one in the entire document.


[2] c.f. the "code is poetry" comment in the Wordpress bug report 
despite the fact that most here would argue HTML 4 as text/html is 
considerably more poetic than XHTML as text/html.


--
"The universe doesn't care what you believe. The wonderful thing about 
science is that it doesn't ask for your faith, it just asks for your 
eyes" --- http://xkcd.com/c154.html

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-11-29 Thread Ian Hickson

On Wed, 29 Nov 2006, Steve Runyon wrote:
>
> Thanks Ian - so is it fair to say that self-closing singletons should be 
> _allowed_ but not _required_ -- that either syntax would be accepted as 
> valid HTML5?  That only makes sense to me -- it's backward-compatible 
> while allowing XHTML compatibility as well.

It's a compelling argument.

I think basically the argument is "it would help people" and the counter 
argument is "it would confuse people". We need evidence to back up these 
arguments so we can make a solid decision. The only relevant data I have 
is that 50% of the web uses trailing slashes, and only 17% uses XHTML. 
This could be used to back up either argument: "clearly people think that 
trailing slashes are allowed, so we should allow them", and "clearly 
people are confused about trailing slashes, so we should get rid of them 
altogether". I don't know which is best.

> Your point about 'test' being the same as 'test' is very 
> interesting.

To clarify, 'test' is the same as 'test' because it's the same 
as 'test' -- the "/" character is completely ignored by browsers.

> That's not something I've ever done (that I'm aware of, anyway), and it 
> surprises me that it works that way.  As a divergent example -- at least 
> in IE6 -- '' is treated as an inline element rather than a 
> block...that's probably non-standard behavior, and in any case it was a 
> surprise when I encountered it.

Could you show an example of this? I couldn't reproduce the behaviour you 
describe. In my tests, in text/html content,  and  acted 
exactly the same, in all browsers that I tested it with.

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-11-29 Thread Julian Reschke


Julian Reschke schrieb:

Lachlan Hunt schrieb:

...
The fact is that authors already try things like ,  and even 
.  I've seen all of those examples in the wild.  See, for 
instance, the source of the XML 1.0 spec (and many others) which claim 
to be XHTML as text/html, littered with plenty of  tags all 
throughout.

...


Huh? The thing at ? Don't see that 
problem there.


If this was the case at an earlier point of time, it was probably caused 
by a bug in their XSLT code, not the authors writing the spec (which 
IMHO uses the W3C's xmlspec XML language).


Best regards, Julian


OK, I take that back.

The main problem here seems to be that  
is served as text/html, but contains XHTML content.


Best regards, Julian

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-11-29 Thread Sam Ruby

On 11/29/06, Ian Hickson <[EMAIL PROTECTED]> wrote:

On Wed, 29 Nov 2006, Leons Petrazickis wrote:
>
> This rigmarole is going to repeat on every site that has converted to
> XHTML sent as text/html. People are emotionally invested in the idea of
> trailing slashes. Websites have complex codebases, and going through
> them removing trailing slashes on singleton elements would be very hard.

Various things are worth noting here:

XHTML is a minority on the Web. Looking at just which elements specify the
XHTML namespace on their  element, XHTML has at most 15%
penetration, for example.

I am of the belief that that particular statistic is meaningless.  Even if
it were 15%, most aren't well formed.  Of those that are well formed, most
don't have the cojones to serve such documents with the appropriate MIME
type as they know that to do so would cause compliant UA to be rather
unforgiving.  And of the few insane enough to do so, it is rare that the
page in question is actually valid.

Nothing is going to stop people from continuing to use XHTML1, HTML4,
HTML3.2, HTML2, or whatever their existing content uses. HTML5 is a new
language, that happens to be backwards-compatible with all of those. There
are probably near zero documents on the Web today that are
HTML5-compliant, simply because the DOCTYPE is new. That's fine. Just
getting new documents to be compliant would be fine. WordPress, for
example, will eventually create new templates, and those could be based on
HTML5 (though of course WordPress would have a harder job there due to its
hardcoding of markup, but that's another story).

... on the other hand, I am not of the belief that version numbers mean what
they are supposed to.  You will see HTTP 1.1 headers in HTTP 1.0 requests,
RSS 2.0 elements in RSS 0.91 feeds, and HTML4 elements in XHTML documents.

We live in a cut and paste world.  The fact that I could find an XHTMLism in
the front page of Microsoft.com will likely surprise few.  Lachlan is free
to call the authors of WordPress bozos if he likes, but frankly the bozos
outnumber you.  What should be the most damning of all is that I found an
example on the most prominent page on the mozilla.org site.  No one can say
that the authors of that page didn't make a conscious choice in the DOCTYPE
for that page.  No one can say that the authors of that page are ignorant.
No one can say that mozilla has a(n entirely) cavalier attitude towards
standards.

My theory is that we live in a cut and paste world, one based on partial
understanding.  Few understand DOCTYPEs and xmlns attributes, mostly people
crib from something that works.

If people want to make HTML5 syntactically compatible with XHTML1, such
that XHTML1 documents don't cause syntax errors in HTML5, we'll have to do
a whole lot more than just allowing trailing /s. I don't really see why
that would be a goal, though. Going further, if we want to make documents
in general compliant with HTML5, then we've got our work cut out for us --
at least 78% of documents are syntactically incorrect today (not counting
things like trailing /s in attributes, or missing DOCTYPEs -- if you
include those, the number is more like 93%).

At the present time being valid is an ideal that is virtually unattainable.
For most people, if your web page is broken, a validator is probably the
last place you want to go as it will require you to fix a number of things
that frankly nobody cares about before you can see the real errors.

The situation is not perfect, but perhaps a bit better for feeds.  For the
overwhelming majority of errors that the feed validator reports, there is
somebody that cares.  Example: try viewing a feed that isn't well formed
using IE7.

In general, people don't migrate to new versions of HTML. They only use
new versions for new documents. Which is fine, since HTML5 UAs are going
to be backwards-compatible (by design).

Now we are getting to the real question:  backwards compatible with what?
Only with compliant  documents (i.e., at most 22% of the web) or with pages
like the one at mozilla.org?

They've already reaped all the benefits of XHTML -- cleaner, more
> readable, more maintainable code.

It's a myth that XHTML gives you those benefits, by the way, especially if
you don't actually use an XML pipeline (which WordPress doesn't).

I have no interest in that discussion.

The very idea of HTML5 is to not demand that the Web be scrapped and
> rewritten. We need the people who have rewritten all their pages so that
> they validate on the W3C validator -- they have the fire and the zeal
> and the will to spread our format. We need to make the migration from
> invalid XHTML to valid HTML5 very, very easy for them. We can't require
> them to dig through PHP spaghetti. And that means that, no matter how
> it's achieved,  needs to be valid HTML5.

I don't really understand this argument. Those who use XHTML1 because it's
"the latest thing", are as likely to use HTML5 because it's "th

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-11-29 Thread Julian Reschke


Anne van Kesteren schrieb:
On Wed, 29 Nov 2006 18:03:33 +0100, Julian Reschke 
<[EMAIL PROTECTED]> wrote:
The fact is that authors already try things like ,  and 
even .  I've seen all of those examples in the wild.  See, for 
instance, the source of the XML 1.0 spec (and many others) which 
claim to be XHTML as text/html, littered with plenty of  tags all 
throughout.

...


Huh? The thing at ? Don't see that 
problem there.


  Names and Tokens

is one example...


If this was the case at an earlier point of time, it was probably 
caused by a bug in their XSLT code, not the authors writing the spec 
(which IMHO uses the W3C's xmlspec XML language).


In your humble opinion or is it just a fact? :-)


Aha. I thought it was about an "" with no attributes.

So yes, that's a bug in the XSLT code (xmlspec.xsl). I'll forward this 
info to Norman Walsh.


Best regards, Julian

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-11-29 Thread Lachlan Hunt


Julian Reschke wrote:

Lachlan Hunt schrieb:

...
The fact is that authors already try things like ,  and even 
.  I've seen all of those examples in the wild.  See, for 
instance, the source of the XML 1.0 spec (and many others) which claim 
to be XHTML as text/html, littered with plenty of  tags all 
throughout.

...


Huh? The thing at ? Don't see that 
problem there.


Yes, did you look at the source code?

Abstract

See that  element?  That's fine in XML, but it's served as 
text/html, so it's treated as an unclosed  element.
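
A simplified Python sketch of why this happens. It is not the real text/html tree builder: the `open_elements_at_end` helper, the partial list of void elements and the sample markup are all assumptions for illustration (the element name in this message was lost in archiving, so `a` is a stand-in). The trailing slash is ignored, so only void elements and explicit end tags close anything.

```python
import re

VOID = {"br", "hr", "img", "input", "meta", "link"}  # partial list, illustration only
TAG = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*>")


def open_elements_at_end(markup: str) -> list:
    """Return the elements still open after a text/html-style pass.

    A trailing slash in a start tag is ignored entirely here.
    """
    stack = []
    for match in TAG.finditer(markup):
        is_end_tag = match.group(1) == "/"
        name = match.group(2).lower()
        if is_end_tag:
            if name in stack:
                # Pop up to and including the matching open element.
                while stack.pop() != name:
                    pass
        elif name not in VOID:
            stack.append(name)
    return stack


print(open_elements_at_end('<a name="sec"/>Abstract'))
print(open_elements_at_end('<a name="sec"></a>Abstract'))
print(open_elements_at_end('<br/>Abstract'))
```

Under these assumptions the first input leaves `a` open, while the explicit end tag and the void element leave nothing open.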


--
Lachlan Hunt
http://lachy.id.au/

Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-11-29 Thread Anne van Kesteren

On Wed, 29 Nov 2006 18:03:33 +0100, Julian Reschke <[EMAIL PROTECTED]>  
wrote:
The fact is that authors already try things like ,  and even  
.  I've seen all of those examples in the wild.  See, for instance,  
the source of the XML 1.0 spec (and many others) which claim to be  
XHTML as text/html, littered with plenty of  tags all throughout.

...


Huh? The thing at ? Don't see that  
problem there.


  Names and Tokens

is one example...


If this was the case at an earlier point of time, it was probably caused  
by a bug in their XSLT code, not the authors writing the spec (which  
IMHO uses the W3C's xmlspec XML language).


In your humble opinion or is it just a fact? :-)


--
Anne van Kesteren
