> When Django makes a move with regards to HTML4/XHTML/HTML5...

Here are my current thoughts on that issue - in a slightly overstated tone... :-)

== XHTML as text/html is just fine ==

I happen to agree with Simon Willison's post a while back that said that XHTML has 'lost', and HTML4 is a better option for new projects. However, I contend that: Django's behaviour (outputting XHTML) combined with the normal usage (serving as text/html) isn't causing anyone *any* real problems. Loads of people are doing it, and browsers handle it just fine. (I have read and re-read Hickson's article [1], and can't find any real problems that affect us. You only have problems if you try to switch to application/xhtml+xml, which is easy to solve: don't do that).

Of course, I'm discounting as 'problems' things like personal preferences and clients who insist on both HTML4 and valid documents.

In practice, the choice between HTML4 and 'XHTML as text/html' makes no real difference to end users. It's just a holy war for developers, and developers' tools.

Simon brought up that it makes us look out of date, and I agree, but so what? Switching back to HTML4 is driven by the same kind of purity-beats-practicality, fashion-conscious silliness that made us all switch to XHTML in the first place.

(As a matter of interest, www.bbc.co.uk uses XHTML, and after trying to validate a dozen major sites that I visit regularly, it happens to be the only one that validates. That probably says something - while the switch to XHTML may have been misguided in the first place, those who did so had high ideals and actually cared about validation, and so wrote their tools and systems accordingly. You are more likely to be able to generate a valid web page if you depend on an XHTML stack).

On the other hand, if we were outputting HTML4, it might cause *genuine* problems - e.g. if you need XHTML in order to mix in MathML or SVG on some pages.

In short: XHTML is not entirely pointless for some use cases, but there is definitely something much more pointless: switching perfectly functioning XHTML code to HTML4 just because the web fashion gurus say so.
I've already wasted enough of my life doing things like that.

== Multiple output formats are hard and nasty ==

The fundamental problem here is the need to output in multiple markup languages. (The fact that HTML4 and XHTML are so similar doesn't actually help us - we have to recognise that they are different languages. The only attempts so far to produce code that outputs both have been as ugly as sin, IMHO.)

However, Django simply does not work at the level of abstraction that would allow multiple output languages. We encourage the use of templates where developers must write raw XHTML. We have gobs of internal code that builds up XHTML as strings. This means that the design of *everything* in our output is oriented to a *single* markup language, and that language is XHTML. We have never attempted to be markup agnostic.

Changing that assumption in an elegant way would require changing *everything* to use a markup agnostic output tree, which you would then render with different 'writers'. (I'm thinking of something like the way Pandoc and docutils work.) We would basically be throwing away the entire template system and all templates, and much more besides.

Every other attempt to fix this is just patching up lots of different symptoms, and it gets ugly very quickly, both for us and for Django developers. In turn, the patching approach would mean that re-usable apps would end up choosing either HTML4 or XHTML, because writing code that supports both would be so painful. (If we have trouble convincing designers to write "<br />" instead of "<br>", what chance do we stand with "<br {% selfclosing %}>"?)

Supporting multiple doctypes will produce a Babel situation. We *have* to pick a common language if we want the re-usable app market to thrive, and we have already done so - it's XHTML.

As well as templates, there are many other bits of code that hard-code XHTML, such as various plain text to HTML formatters, both inside and outside the Django code base.

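To make the "markup agnostic output tree plus writers" idea concrete, here is a minimal Python sketch. Every name in it (`Element`, `render_html4`, `render_xhtml`, `VOID_ELEMENTS`) is hypothetical and invented for illustration; none of it is Django, Pandoc or docutils API, and a real implementation would need far more than this.

```python
import html
from dataclasses import dataclass, field
from typing import Dict, List, Union

# Partial list of elements that never have content (assumption: enough for a demo).
VOID_ELEMENTS = {"br", "hr", "img", "input", "link", "meta"}


@dataclass
class Element:
    """A node in a hypothetical markup-agnostic output tree."""
    tag: str
    attrs: Dict[str, str] = field(default_factory=dict)
    children: List[Union["Element", str]] = field(default_factory=list)


def _render(node: Union[Element, str], void_close: str) -> str:
    # Plain strings are text nodes and only need escaping.
    if isinstance(node, str):
        return html.escape(node, quote=False)
    attrs = "".join(f' {k}="{html.escape(v)}"' for k, v in node.attrs.items())
    if node.tag in VOID_ELEMENTS:
        # The only place where the two writers differ in this sketch.
        return f"<{node.tag}{attrs}{void_close}"
    inner = "".join(_render(child, void_close) for child in node.children)
    return f"<{node.tag}{attrs}>{inner}</{node.tag}>"


def render_html4(node: Element) -> str:
    """Writer that emits HTML4-style void elements, e.g. <br>."""
    return _render(node, ">")


def render_xhtml(node: Element) -> str:
    """Writer that emits XHTML-style void elements, e.g. <br />."""
    return _render(node, " />")


tree = Element("p", {"class": "intro"}, ["one", Element("br"), "two"])
print(render_html4(tree))
print(render_xhtml(tree))
```

The point of the sketch is where the markup decision lives: in the writer, not in the code that builds the tree. That is exactly the property Django's string-building code and raw templates do not have.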
The hard-coding even extends to Django dependencies: docutils outputs XHTML, and there doesn't seem to be a way to output HTML4, short of writing our own 'writer'. I have several bits of code that would require a lot of work to make them output HTML as well as XHTML, and would require coupling to the template system or something equally nasty.

In fact, one bit of code is impossible to fix - it is related to a text editor in django-cms, and outputs XHTML when the data is saved. To be able to fix it, I need to know what doctype the HTML will eventually end up in, which is impossible. The only fix for stored, user provided HTML is to run it all through a template filter that rewrites content according to the doctype, which is obviously a big performance hit. This is just highlighting another reason why you cannot simply switch between these two formats, and we shouldn't pretend that you can.

Does anyone know of any framework that works on the same level of abstraction as Django (i.e. with developers writing raw HTML) and has a nice solution to this problem? If not, I'd suggest that we are chasing a fantasy. Even if we get it 90% of the way there, we're still going to have loads of complaints about the 10%, which is fair enough: if you care about validity, "90% validity" counts for exactly nothing. And we will also inflict the pain of fragmentation/complaints on the authors of third-party apps.

== HTML5 will save us anyway ==

Finally, once HTML5 becomes established, the problem goes away. All our output is HTML5 compliant (you can use both XHTML and HTML style tags in HTML5). The HTML5 spec has acknowledged that having two subtly different and incompatible formats is a bad idea, and has basically said "Let's just redefine both flavours as HTML and be done."

== Conclusion ==

We should just ignore this whole issue for now, and simply switch to HTML5 doctypes in our provided templates at some point. That's basically what Ian Hickson suggests in his updated article.

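As a rough illustration of the claim that XHTML-style and HTML-style tags both work under HTML5, the sketch below records parse events for "<br />" and "<br>" and compares them. It uses Python's standard `html.parser`, which is a lenient tokenizer and not a conforming HTML5 parser, so treat it as a demonstration of the idea only. The `EventRecorder` class, the `events` helper and the partial `VOID_ELEMENTS` set are names made up for this example.

```python
from html.parser import HTMLParser

# Partial list of void elements (assumption: enough for a demo).
VOID_ELEMENTS = {"br", "hr", "img", "input", "link", "meta"}


class EventRecorder(HTMLParser):
    """Records start tags, end tags and text as a flat list of events."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag, tuple(attrs)))

    def handle_endtag(self, tag):
        # Void elements have no end tag, so the synthetic end event that
        # html.parser generates for "<br />" is dropped here.
        if tag not in VOID_ELEMENTS:
            self.events.append(("end", tag))

    def handle_data(self, data):
        self.events.append(("data", data))


def events(markup):
    recorder = EventRecorder()
    recorder.feed(markup)
    recorder.close()
    return recorder.events


xhtml_style = events("<p>one<br />two</p>")
html_style = events("<p>one<br>two</p>")
print(xhtml_style == html_style)
```

Once void elements are treated the way the HTML5 syntax treats them, the trailing slash makes no difference to what the consumer sees, which is why existing XHTML-style output can be kept as it is under an HTML5 doctype.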
Once we have made that switch, things get much better, because backwards compatibility rules are well-defined.

If people complain, wanting HTML4, tell them:

- Tough. Django produces XHTML. What real world problem are you trying to solve by using HTML4? If it's just a case of some internal standard, then point out that the de-facto standard for Django apps is XHTML, and you gain a lot by fitting in with that standard. All our libraries also output XHTML (e.g. docutils).

- OR - live with invalid documents. Almost every single major website out there does so, and no-one notices. (Seriously, outside of your own websites and w3.org, how many valid HTML pages can you find?)

I'm saying this as the guy who wrote a Django middleware/app that validates all outgoing HTML - an app I still use. Some spot checks on various sites of mine indicate that I'm doing better than 99.99% of the web, in that I rarely have any invalid documents. But to worry about being able to instantly switch to the doctype-du-jour - or rather *last year's* doctype-du-jour - *as well* as having HTML validity - that's not being a perfectionist, it's called OCD, and I'm drawing the line there.

Regards,

Luke

[1] http://hixie.ch/advocacy/xhtml

--
"Trouble: Luck can't last a lifetime, unless you die young." (despair.com)

Luke Plant || http://lukeplant.me.uk/

--
You received this message because you are subscribed to the Google Groups "Django developers" group.
To post to this group, send email to django-develop...@googlegroups.com.
To unsubscribe from this group, send email to django-developers+unsubscr...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/django-developers?hl=en.