On Dec 28, 2006, at 1:58 PM, James Graham wrote:

Mike Schinkel wrote:

Matthew Paul Thomas wrote:
...
Non-heuristic machine consumption fails when semantic elements are abused, and becomes impractical when elements have multiple popular meanings (examples of the latter include <dl> in HTML 4, and <p> in HTML 5). Heuristic machine consumption fails occasionally by the very nature of heuristics (examples currently include <http://www.google.com/search?q=define:author> and
<http://www.google.com/search?q=define:editor>.)

The origin of this thread was my request for adding attributes to all
elements to support microformat-like semantic markup. Based on the context of your reply, it seems you are agreeing with Matthew Raymond in his assertion that using microformat-like semantic markup is A Bad Thing(tm). Am I understanding your position correctly? (If I'm not, please forgive me.)

Actually, IMHO mpt's point is far broader and consequently more important than the confines of the original thread.

Broader, yes (and I should have changed the Subject). I don't know about more important, because I have no experience in "microformat-like markup", and I have no idea how important it will be. So I wasn't commenting on it at all (though Matthew Raymond's arguments seem cogent).

The point, as I understand it, is that machine analysis of "semantic" markup fails if the markup construct is (ab)used in so many different ways that the interpretation of any particular fragment is no longer unambiguous. This is a sort of "heat[1] death" of the original semantics; as the use of an element becomes increasingly disordered (i.e. higher entropy), it becomes impossible to extract any useful information from the use of that element. This is critical in the proper design of semantic markup languages because one wishes to stave off the heat death as long as possible so that, as far as possible, UAs can perform useful functions based on the information in the markup (e.g. render it to a medium for which the content was not explicitly designed). Obviously I don't know how to achieve this but there are a few things to consider:

* Have enough elements.
...
* Don't have too many elements:
...
* Make the semantics of elements well defined:
...
* Have some "high entropy" elements.
...
* Allow easy extensions.
...

I think this is exactly right. Another point I would add is "implement the semantic benefit early and often". The earlier and more widely software is distributed that takes advantage of the semantics, the more easily people can see whether they are using semantic markup appropriately. I hinted at this earlier when I said that whether <section> becomes a semantic element "will depend on who is faster: UA vendors distributing software that prominently takes advantage of the structure <section> is supposed to provide, or eager tech Weblog authors misguidedly replacing all the occurrences of <div> with <section> in their templates in an attempt to be 'more semantic'."
<http://urlx.org/dreamhost.com/a73e1>

The "Don't have too many elements" guideline bears on Joe Clark's complaint that "'HTML5' replicates HTML's obsession with computer-science and math elements" <http://blog.fawny.org/2006/10/28/tbl-html>. It is true that HTML's few semantic elements are biased toward computer science (but not math), but that's because computer-science people are those most likely to bother with semantic markup at all (Joe being a notable exception). And adding representative elements from other fields of endeavor would likely result in too many elements overall.

This post was brought to you by the society for dodgy physical analogies concocted in the middle of the night.
...

As another analogy, in a recent message to Ian I referred to such presentational elements as "safety valves".

Whenever someone uses <div>, don't say "alas, that's a hole in HTML"; say "hooray, that's someone who isn't misusing <blockquote>".

--
Matthew Paul Thomas
http://mpt.net.nz/
