> When Django makes a move with regards to HTML4/XHTML/HTML5...

Here are my current thoughts on that issue - in a slightly overstated 
tone... :-)

== XHTML as text/html is just fine ==

I happen to agree with Simon Willison's post a while back that said 
that XHTML has 'lost', and HTML4 is a better option for new projects.

However, I contend that:

Django's behaviour (outputting XHTML) combined with the normal usage
(serving as text/html) isn't causing anyone *any* real problems. Loads 
of people are doing it, and browsers handle it just fine.

(I have read and re-read Hickson's article [1], and can't find any 
real problems that affect us.  You only have problems if you try to 
switch to application/xhtml+xml, which is easy to solve: don't do 
that).

Of course, I'm discounting as 'problems' things like personal 
preferences and clients who insist on both HTML4 and valid documents.  
In practice, the choice between HTML4 and 'XHTML as text/html' makes 
no real difference to end users.  It's just a holy war for developers, 
and developers' tools. 

Simon brought up that it makes us look out of date, and I agree, but 
so what?  Switching back to HTML4 is driven by the same kind of
purity-beats-practicality, fashion-conscious silliness that made us 
all switch to XHTML in the first place.  (As a matter of interest, 
www.bbc.co.uk uses XHTML, and after trying to validate a dozen major 
sites that I visit regularly, it happens to be the only one that 
validates.  That probably says something - while the switch to XHTML 
may have been misguided in the first place, those who did so had high 
ideals and actually cared about validation, and so wrote their tools 
and systems accordingly.  You are more likely to be able to generate a 
valid web page if you depend on an XHTML stack).

On the other hand, if we were outputting HTML4, it might cause 
*genuine* problems - e.g.  if you need XHTML in order to mix in MathML 
or SVG on some pages.

In short: XHTML is not entirely pointless for some use cases, but 
there is definitely something much more pointless: switching perfectly 
functioning XHTML code to HTML4 just because the web fashion gurus say 
so.  I've already wasted enough of my life doing things like that.

== Supporting multiple output formats is hard and nasty ==

The fundamental problem here is the need to output in multiple markup
languages.  (The fact that HTML4 and XHTML are so similar doesn't 
actually help us - we have to recognise that they are different 
languages.  The only attempts so far to produce code that outputs both 
have been as ugly as sin, IMHO).  However, Django simply does not work 
at the level of abstraction that would allow multiple output 
languages.  We encourage the use of templates where developers must 
write raw XHTML.  We have gobs of internal code that builds up XHTML 
as strings.  This means that the design of *everything* in our output 
is oriented to a *single* markup language, and that language is XHTML. 
We have never attempted to be markup agnostic.

Changing that assumption in an elegant way would require changing
*everything* to use a markup agnostic output tree, which you would 
then render with different 'writers'.  (I'm thinking something like 
the way Pandoc and docutils work).  We would basically be throwing 
away the entire template system and all templates, and much more 
besides.
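
To make the "output tree plus writers" idea concrete, here is a minimal 
sketch in plain Python.  It is not how Django, Pandoc or docutils are 
actually implemented; the names Node, render and VOID_ELEMENTS are made 
up for illustration, the list of void elements is deliberately 
incomplete, and a single function with an 'xhtml' flag stands in for 
two separate writers.

```python
from dataclasses import dataclass, field
from html import escape

# Elements with no content and no closing tag (illustrative subset only).
VOID_ELEMENTS = {"br", "hr", "img", "input", "link", "meta"}


@dataclass
class Node:
    """One element in a hypothetical markup-agnostic output tree."""
    tag: str
    attrs: dict = field(default_factory=dict)
    children: list = field(default_factory=list)  # Node instances or plain str


def render(node, xhtml):
    """Serialise the tree as XHTML-style or HTML4-style markup."""
    if isinstance(node, str):
        return escape(node)
    attrs = "".join(f' {k}="{escape(str(v))}"' for k, v in node.attrs.items())
    if node.tag in VOID_ELEMENTS:
        # The only place where the two output languages differ in this sketch.
        return f"<{node.tag}{attrs} />" if xhtml else f"<{node.tag}{attrs}>"
    inner = "".join(render(child, xhtml) for child in node.children)
    return f"<{node.tag}{attrs}>{inner}</{node.tag}>"


tree = Node("p", {"class": "intro"}, ["Line one", Node("br"), "Line two"])
xhtml_output = render(tree, xhtml=True)
html4_output = render(tree, xhtml=False)
```

The point of the sketch is the cost, not the code: it only works because 
nobody ever writes markup by hand, which is exactly what our templates 
ask developers to do.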

Every other attempt to fix this is just patching up lots of different
symptoms, and it gets ugly very quickly, both for us and for Django
developers.

In turn, that kind of patching would mean that re-usable apps would end 
up choosing either HTML4 or XHTML, because writing code that supports 
both would be so painful.  (If we have trouble convincing designers to 
write "<br />" instead of "<br>", what chance do we stand with "<br {% 
selfclosing %}>" ?)  Supporting multiple doctypes will produce a Babel 
situation.  We *have* to pick a common language if we want the 
re-usable app market to thrive, and we have already done so - it's XHTML.

As well as templates, there are many other bits of code that hard-code 
XHTML output, such as various plain text to HTML formatters, both 
inside and outside the Django code base.  This even includes Django 
dependencies - docutils outputs XHTML, and there doesn't seem to be a 
way to output HTML4, short of writing our own 'writer'.

I have several bits of code that would require a lot of work to make 
them output HTML as well as XHTML, and would require coupling to the 
template system or something equally nasty.

In fact, one bit of code is impossible to fix - it is related to a 
text editor in django-cms, and outputs XHTML when the data is saved.  
To be able to fix it, I need to know what doctype the HTML will 
eventually end up in, which is impossible.

The only fix for stored, user provided HTML is to run it all through a 
template filter that rewrites content according to the doctype, which 
is obviously a big performance hit.  This is just highlighting another 
reason why you cannot simply switch between these two formats, and we 
shouldn't pretend that you can.
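
For illustration, the rewriting step such a filter would need might look 
roughly like the sketch below.  This is an assumption about the 
approach, not existing Django code: the function name xhtml_to_html4 is 
hypothetical, the tag list is incomplete, and a regex like this ignores 
comments, CDATA sections and other corner cases that a real 
implementation would have to handle.  It would also have to run on every 
render of the stored content, which is where the performance hit comes 
from.

```python
import re

# Matches XHTML-style void tags such as <br /> or <img src="x.png"/>.
# Illustrative subset of void elements only.
_SELF_CLOSING = re.compile(
    r"<(br|hr|img|input|link|meta)\b([^<>]*?)\s*/>",
    re.IGNORECASE,
)


def xhtml_to_html4(fragment):
    """Rewrite '<br />' style tags in a stored fragment to '<br>' style."""
    return _SELF_CLOSING.sub(r"<\1\2>", fragment)


converted = xhtml_to_html4('a<br />b<img src="x.png" />')
```

Wrapping this as a template filter is the easy part; getting every 
template author and third-party app to remember to apply it is not.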

Does anyone know of any framework that works on the same level of 
abstraction as Django (i.e.  with developers writing raw HTML) and has 
a nice solution to this problem?  If not, I'd suggest that we are 
chasing a fantasy. Even if we get it 90% of the way there, we're still 
going to have loads of complaints about the 10% -- which is fair 
enough - if you care about validity, "90% validity" counts for exactly 
nothing.  And we will also inflict the pain of 
fragmentation/complaints on the authors of third-party apps.

== HTML5 will save us anyway ==

Finally, once HTML5 becomes established, the problem goes away.  All 
our output is HTML5 compliant (you can use both XHTML and HTML style 
tags in HTML5).  The HTML5 spec has acknowledged that having two 
subtly different and incompatible formats is a bad idea, and has 
basically said "Let's just redefine both flavours as HTML and be 
done."
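
As a rough illustration of a lenient HTML parser treating both tag 
styles alike, the sketch below uses Python's standard library 
html.parser.  That module is not a conformant HTML5 parser, so this 
shows the general idea rather than the HTML5 parsing algorithm itself; 
TagCollector and start_tags are names invented for the example.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Record the name of every start tag seen, in document order."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # The default handle_startendtag() also calls this for '<br />'.
        self.tags.append(tag)


def start_tags(markup):
    parser = TagCollector()
    parser.feed(markup)
    parser.close()
    return parser.tags


xhtml_style = start_tags("<p>a<br />b</p>")
html_style = start_tags("<p>a<br>b</p>")
```

Both calls yield the same list of tags, which is the behaviour our 
existing XHTML-style output relies on when served as text/html.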

== Conclusion ==

We should just ignore this whole issue for now, and simply switch to 
HTML5 doctypes in our provided templates at some point.  That's 
basically what Ian Hickson suggests in his updated article.

From then on, things get much better, because backwards compatibility 
rules are well-defined.

If people complain, wanting HTML4, tell them:

 - Tough. Django produces XHTML.  What real world problem are you
   trying to solve by using HTML4?  If it's just a case of some
   internal standard, then point out that the de-facto standard for
   Django apps is XHTML, and you gain a lot by fitting in with that
   standard.  All our libraries also output XHTML (e.g. docutils).

 - OR - live with invalid documents.  Almost every single major
   website out there does so, and no-one notices. (Seriously, outside
   of your own websites and w3.org, how many valid HTML pages can you
   find?)

I'm saying this as the guy who wrote a Django middleware/app that 
validates all outgoing HTML - an app I still use.  Some spot checks 
on various sites of mine indicate that I'm doing better than 99.99% of 
the web, in that I rarely have any invalid documents.

But to worry about being able to instantly switch to the 
doctype-du-jour -- or rather *last year's* doctype-du-jour -- *as well* 
as having HTML validity - that's not being a perfectionist, it's called 
OCD, and I'm drawing the line there.

Regards,

Luke


[1] http://hixie.ch/advocacy/xhtml


-- 
"Trouble: Luck can't last a lifetime, unless you die young." 
(despair.com)

Luke Plant || http://lukeplant.me.uk/
