In response to http://www.autisticcuckoo.net/archive.php?id=2007/05/09/forward-towards-the-past I contacted Tommy Olsson to discuss the issue further, and we agreed to forward the discussion to the list. I've translated it from Swedish, so any grammatical errors are my fault. :-)

-------------8<---------------

It seems like the premises of the working group are something completely different from what I personally would have wished for further development of one of the most important standards on the web. That is yet another reason I don't want to be involved in the game. :)

In the article you write:

   Instead, it looks as if they are going to redefine the semantics of
   existing element types so that old-school documents from the Bad Old
   Days will be conforming to the new specification!

Hmm. It's probably more on a case-by-case basis.

One single such case is bad enough, in my opinion. The W3C has already done this, and we don't quite know what the result will be. They changed the semantics of DL from a definition list to some kind of generic value pair list. For instance, they say that DL can be used for marking up dialogues. How will that affect an application that has relied on a DL being a definition list? One example is the DEFINE: feature in Google.
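
To make the difference concrete, here is a small sketch with made-up content (none of it comes from the original exchange). The first list is a definition list in the original sense; the second is the dialogue usage, with each DT naming a speaker. A tool that treats every DT/DD pair as a term and its definition would presumably read "Alice" as a term being defined.

   <dl>
     <dt>HTML</dt>
     <dd>A markup language for hypertext documents.</dd>
   </dl>

   <dl>
     <dt>Alice</dt>
     <dd>Did you read the spec?</dd>
     <dt>Bob</dt>
     <dd>Not yet.</dd>
   </dl>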

In my opinion it would have been considerably better if they had let DL continue to be a definition list and instead added a new element type for value pair lists. Although I suspect that Microsoft was holding things back in that respect, by not wanting to put any work into development of IE. (This was before Firefox's successes forced them to do so anyway.)

When it comes to which semantics <p> should have, you first need to ask
who will benefit from the definition of <p>. The one who authors
HTML by hand? The one who implements a WYSIWYG editor? The above-mentioned
analysis applications? Browser manufacturers? Several/all of them?

This is where I see the line between my point of view and the working group's. You look at the benefit of each specific definition, which I think means that you miss the forest for the trees.

My standpoint comes from having learned HTML back when the specs were at cern.ch. That was before the W3C was founded and before HTML got any version number. HTML was then a semantic markup language that was very biased towards scientific documents, for natural reasons.

It became natural for me to think of HTML elements from a semantic point of view. The element type has nothing to do with presentation, but shall only mark up what things are. Unfortunately the range of semantic element types is very limited, but at least we can mark up headings, paragraphs, lists and tables.

The web's development during the second half of the '90s went in a totally different direction, when designers and happy amateurs took over. I thought it went off the track, since HTML for me wasn't a presentational language. The W3C agreed, and eventually released CSS, but the damage was already done.

To me the question of who "benefits" from the definition of <p> is irrelevant. The definition already exists and is unambiguous. A <p> tag shall mark up a textual paragraph, and nothing else. Then of course there are gray areas: is a byline a special case of a paragraph, or something else?

We look at the overall picture from completely different perspectives, and I don't think we will reach a common vision. Your outlook is probably shared by a vast majority; 99.999% of those who create web pages have never even read the HTML4 specification, after all, but see HTML from a presentational aspect. For them <p> is probably just a <div> with predefined margins, just like the HTML5 specification seems to suggest.

Browsers can't do very much fun stuff with <p>, regardless of what the
spec says that a <p> represents.

But the web isn't just about browsers. The web is (or should be, anyway) about publishing information. One way to take part of this information is to present the document in a browser, but it's far from the only conceivable manner. If you think ahead a bit and have some imagination, you can probably come up with many interesting areas of use for information on the web, presuming that it is marked up in a sensible way. The semantic web that the W3C is talking about is something I find clearly attractive.

Analysing applications that operate on the entire web without prior
agreement with the producer cannot rely on <p> == paragraph, because
the web doesn't look like that and we can't change it. Regardless of what
the spec says such apps will thus have to implement heuristics in order
to decide what is a paragraph. (If there is a prior agreement with the
producer it still doesn't matter what the spec says.)

It depends. An analysing application that tries to make some sort of sense of today's tag soup has a Sisyphean task in front of it. But an analysis application that expects semantic correctness would, if it became popular, be able to affect things in the right direction. Today's SEO trend has to some degree led to a better understanding of semantics, e.g. by spiders rewarding correctly marked up headings over tag soup with FONT and B elements.
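
As an illustration of that contrast (the heading text and attribute values are made up for the example), the first line below is a correctly marked up heading, while the second only looks like one in a graphical browser and tells an analysing application nothing about structure:

   <h2>Opening hours</h2>

   <font size="5"><b>Opening hours</b></font>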

A WYSIWYG editor probably has a hard time knowing whether what the user
writes is a paragraph or not.

Yes, I have so far not seen one WYSIWYG editor that facilitates semantic correctness. I also can't imagine what such a user interface would look like. But surely there should be wiser minds in this world that can come up with something?

From that point of view it doesn't really matter how <p> is defined in
the spec -- it doesn't change reality.

No, I think it matters a lot. For those who don't read the spec (i.e. 99.999%) it obviously has no significance at all, but there has to be an unambiguous semantic definition for each element type for the little minority who actually want to do things right.

Then the question is what the harm is if <p> is used for more things
than just paragraphs. Who is harmed by markup such as
   <form><p><label>Search: <input name="q"></label></p></form>

The one who has read the earlier HTML specifications and thinks that <p> marks up a textual paragraph. Obviously not the one who looks at the result in a graphical browser, but maybe the one who uses a completely different UA.

Sure, you can drive in screws with a hammer. No SWAT team of murderous carpenters will come and drag you off to prison for that. But those with a bit of pride in their profession still use a chisel or a screwdriver.

...? Why is

   <form><div><label>Search: <input name="q"></label></div></form>

better?

It's only marginally better, by using a semantically neutral container instead of abusing <p>. The correct thing is naturally to use a <fieldset>.
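
As a rough sketch of that last suggestion (the <legend> text is an assumption added for illustration; the HTML 4 DTD expects a <legend> inside a <fieldset>), the markup might look something like this:

   <form>
     <fieldset>
       <legend>Site search</legend>
       <label>Search: <input name="q"></label>
     </fieldset>
   </form>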

This business of semantic meaning and correct markup is hard to get across. I notice that daily both at my job and on forums such as SitePoint. The visual outlook ("the most important thing is that it looks good") completely dominates over the structural ("it shall be correct too").

I don't imagine that the world will get a collective aha experience and that HTML in the future will get used the way it was intended. But it doesn't stop me from at least preaching now and then for those who are interested. I can't save the web from the tag soup march, but I might be able to save a handful of people from getting stuck.

Let me just clarify that this isn't about me being conservative and opposed to change. I don't grumble that "it was better before" or sniff at "the youth of today". Possibly you can draw a comparison to people who write to the letters to the editor column and sign their letters "friend of order". :)

It's simply that I happen to think that the original idea with HTML is a good one. Let HTML mark up structure and semantics, and leave all presentation to CSS. To further develop HTML, add more semantic elements that experience shows we need, such as <nl> (navigation list) and an element type for value pair lists.

/Tommy

------------->8---------------

Regards,
--
Simon Pieters
