[whatwg] Don't change the semantics of elements

Simon Pieters Fri, 11 May 2007 05:55:06 -0700

In response to http://www.autisticcuckoo.net/archive.php?id=2007/05/09/forward-towards-the-past I contacted Tommy Olsson to discuss the issue further, and we agreed to forward the discussion to the list. I've translated it from Swedish, so any grammatical errors are my fault. :-)


-------------8<---------------

It seems like the premises of the working group are something completely different from what I personally would have wished for further development of one of the most important standards on the web. That is yet another reason I don't want to be involved in the game. :)

In the article you write:

   Instead, it looks as if they are going to redefine the semantics of
   existing element types so that old-school documents from the Bad Old
   Days will be conforming to the new specification!

Hmm. It's probably more on a case-by-case basis.

One single such case is bad enough, in my opinion. The W3C has already done this, and we don't quite know what the result will be. They changed the semantics of DL from definition list to some kind of generic value pair list. For instance, they say that DL can be used for marking up dialogues. How will that affect an application that has relied on a DL being a definition list? One example is the DEFINE: feature in Google.

In my opinion it would have been considerably better if they had let DL continue to be a definition list and instead added a new element type for value pair lists. Although I suspect that Microsoft was holding it back in that respect, by not wanting to put any work into development of IE. (This was before Firefox's successes forced them to do so anyway.)

When it comes to which semantics <p> should have, you first need to ask
the question: who will benefit from the definition of <p>? The one who
authors HTML by hand? The one who implements a WYSIWYG editor? The
above-mentioned analysis applications? Browser manufacturers? Several or
all of them?

This is where I see the line between my point of view and the working group's. You look at the benefit of each specific definition, which I think means that you miss the forest for the trees.

My standpoint is based on the fact that I learned HTML during the time the specs were at cern.ch. It was before the W3C was founded and before HTML got any version number. HTML was then a semantic markup language that was very biased towards scientific documents, for natural reasons.

It became natural for me to think of HTML elements from a semantic point of view. The element type has nothing to do with presentation, but shall only mark up what things are. Unfortunately the range of semantic element types is very limited, but at least we can mark up headings, paragraphs, lists and tables.

The web's development during the second half of the 90's went in a totally different direction, when designers and happy amateurs took over. I thought it went off the track, since HTML for me wasn't a presentational language. The W3C agreed, and eventually released CSS, but the damage was already done.

To me the question of who "benefits" from the definition of <p> is irrelevant. The definition already exists and is unambiguous. A <p> tag shall mark up a textual paragraph, and nothing else. Then of course there are gray areas: is a byline a special case of a paragraph, or something else?

We look at the overall picture from completely different perspectives, and I don't think we will reach a common vision. Your outlook is probably in a vast majority; 99.999% of those who create web pages have never even read the HTML4 specification, after all, but see HTML from a presentational aspect. For them <p> is probably just a <div> with predefined margins, just like the HTML5 specification seems to suggest.

Browsers can't do much that is fun with <p>, regardless of what the spec
says that a <p> represents.

But the web isn't just about browsers. The web is (or should be, anyway) about publishing information. One way to take part of this information is to present the document in a browser, but it's far from the only conceivable manner. If you think a bit ahead and have some imagination you can probably come up with many interesting areas of use for information on the web, presuming that it is marked up in a sensible way. The semantic web that the W3C is talking about is something I find clearly attractive.

Analysing applications that operate on the entire web without prior
agreement with the producer cannot rely on <p> == paragraph, because
the web doesn't look like that and we can't change it. Regardless of what
the spec says, such apps will thus have to implement heuristics in order
to decide what is a paragraph. (If there is a prior agreement with the
producer it still doesn't matter what the spec says.)

It depends. An analysing application that tries to make some sort of sense of today's tag soup has a Sisyphean task in front of it. But an analysis application that expects semantic correctness would, if it became popular, be able to affect things in the right direction. Today's SEO trend has to some degree led to a better understanding of semantics, e.g. by spiders rewarding correctly marked up headings over tag soup with FONT and B elements.

A WYSIWYG editor probably has a hard time knowing whether what the user
writes is a paragraph or not.

Yes, I have so far not seen one WYSIWYG editor that facilitates semantic correctness. I also can't imagine what such a user interface would look like. But surely there should be wiser minds in this world that can come up with something?

From that point of view it doesn't really matter how <p> is defined in
the spec -- it doesn't change reality.

No, I think it matters a lot. For those who don't read the spec (i.e. 99.999%) it obviously has no significance at all, but there has to be an unambiguous semantic definition for each element type for the little minority who actually want to do things right.

Then the question is what the harm is if <p> is used for more things
than just paragraphs. Who is harmed by markup such as
   <form><p><label>Search: <input name="q"></label></p></form>

The one who has read the earlier HTML specifications and thinks that <p> marks up a textual paragraph. Obviously not the one who looks at the result in a graphical browser, but maybe the one who uses a completely different UA.

Sure, you can drive in screws with a hammer. No SWAT team of murderous carpenters will come and drag you away to prison for that. But those with a little pride in their profession still use a chisel or a screwdriver.

...? Why is

   <form><div><label>Search: <input name="q"></label></div></form>

better?

It's only marginally better, by using a semantically neutral container instead of abusing <p>. The correct thing is naturally to use a <fieldset>.
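
As a rough sketch of that suggestion, the search form might be marked up as follows. The <legend> and its text are assumptions, not part of the original discussion; HTML 4 requires a <legend> inside a <fieldset>.

```html
<!-- Hypothetical fieldset variant of the search form quoted above -->
<form>
  <fieldset>
    <legend>Search</legend> <!-- assumed caption, not from the original -->
    <label>Search: <input name="q"></label>
  </fieldset>
</form>
```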

This matter of semantic meaning and correct markup is hard to convey. I notice that daily both at my job and on forums such as SitePoint. The visual outlook ("the most important thing is that it looks good") completely dominates over the structural ("it shall be correct too").

I don't imagine that the world will get a collective aha experience and that HTML in the future will get used the way it was intended. But it doesn't stop me from at least preaching now and then for those who are interested. I can't save the web from the tag soup march, but I might be able to save a handful of people from getting stuck.

Let me just clarify that this isn't about me being conservative and opposed to change. I don't grumble that "it was better before" or sniff at "the youth of today". Possibly you can draw similarities to authors of letters to the editor column who sign their works with "friend of order". :)

It's simply that I happen to think that the original idea with HTML is a good one. Let HTML mark up structure and semantics, and leave all presentation to CSS. To further develop HTML, add more semantic elements that experience shows we need, such as <nl> (navigation list) and an element type for value pair lists.


/Tommy

------------->8---------------

Regards,
--
Simon Pieters
