text/html with mode=xml in Atom 0.3

2006-03-23 Thread James Holderness


I've been seeing a number of feeds recently using Atom 0.3 with a content 
type of text/html and no mode attribute (i.e. the equivalent of 
mode=xml). However, the markup in that content is wrapped in a CDATA 
section, for example something like this:


   <content type="text/html">
     <![CDATA[<div xmlns="http://www.w3.org/1999/xhtml"><p>Content goes here.</p></div>]]>
   </content>

If it had been marked as escaped you would obviously unescape the CDATA 
before interpreting the markup. However, since the mode is technically 
xml, I was under the impression that it should be treated as inline XML 
and no unescaping was necessary. But that would result in the literal text 
<div xmlns="http://www.w3.org/1999/xhtml"><p>Content goes here.</p></div> 
being displayed to the user which is obviously not what is intended.
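
To illustrate the problem, here is a minimal sketch of what a conforming XML parser hands back for content like this. It uses Python's standard xml.etree.ElementTree; the SAMPLE string is my own reconstruction of the pattern described above, not a copy of any real feed entry, and the Atom namespace is left out for brevity.

```python
import xml.etree.ElementTree as ET

# Reconstructed sample: text/html content, no mode attribute, markup in CDATA.
SAMPLE = (
    '<content type="text/html">'
    '<![CDATA[<div xmlns="http://www.w3.org/1999/xhtml">'
    '<p>Content goes here.</p></div>]]>'
    '</content>'
)

content = ET.fromstring(SAMPLE)

# The CDATA section turns the markup into plain character data, so the
# content element has no child elements at all.
child_count = len(content)

# Read as inline XML (mode=xml), the "content" is therefore this literal text.
literal_text = content.text

print(child_count)
print(literal_text)
```

The div and p never reach the parser as elements, which is why a strict mode=xml reading ends up showing the tags to the user.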


So is this a bug in the content generator (all the feeds I've seen appear to 
be using TypePad) or are you supposed to ignore the mode attribute when the 
content type is set to text/html and always treat it as escaped? I know 
Atom 0.3 is deprecated and I shouldn't be having to deal with this, but the 
reality of the situation is that there are a whole lot of Atom 0.3 feeds 
still out there (probably more than Atom 1.0) and I need to be able to 
support them.


Some feeds where you can see the problem (not all entries though):

http://feeds.feedburner.com/Flickrblog
http://dilbertblog.typepad.com/the_dilbert_blog/atom.xml
http://blog.cymfony.com/atom.xml

Regards
James



Re: text/html with mode=xml in Atom 0.3

2006-03-23 Thread A. Pagaltzis

* James Holderness [EMAIL PROTECTED] [2006-03-23 17:30]:
> So is this a bug in the content generator (all the feeds I've
> seen appear to be using TypePad)

Yes.

> or are you supposed to ignore the mode attribute when the
> content type is set to text/html and always treat it as
> escaped?

No.

In 0.3, the `mode` attribute was the final arbiter for the form
of the content. In Atom 1.0, its role was subsumed by switching
on the `type` value because consumer developers reported that
this sort of layering was unnecessarily hard to support and
provided no discernible benefit.
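
As a rough sketch of that difference (the function names and the "text"/"escaped"/"xml"/"base64" labels are made up for illustration, and the Atom 1.0 branch is a simplified reading of the media type rules rather than a complete implementation):

```python
def content_form_atom03(attrs):
    """Atom 0.3: mode alone decides the form; an absent mode means "xml"."""
    return attrs.get("mode", "xml")


def content_form_atom10(attrs):
    """Atom 1.0: there is no mode attribute; the form follows from type."""
    ctype = attrs.get("type", "text").lower()
    if ctype == "text":
        return "text"        # plain text, no markup
    if ctype == "html":
        return "escaped"     # escaped HTML markup
    if ctype == "xhtml":
        return "xml"         # inline XHTML div
    if ctype.endswith(("+xml", "/xml")):
        return "xml"         # XML media type, carried inline
    if ctype.startswith("text/"):
        return "text"
    return "base64"          # any other media type


print(content_form_atom03({"type": "text/html"}))
print(content_form_atom10({"type": "html"}))
```

In the 0.3 function the type value never enters the decision, which is the layering consumers found hard to support.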

Regards,
-- 
Aristotle Pagaltzis // http://plasmasturm.org/



Re: text/html with mode=xml in Atom 0.3

2006-03-23 Thread James Holderness


A. Pagaltzis wrote:

>> So is this a bug in the content generator (all the feeds I've
>> seen appear to be using TypePad)
>
> Yes.
>
>> or are you supposed to ignore the mode attribute when the
>> content type is set to text/html and always treat it as
>> escaped?
>
> No.


Thanks for the confirmation. I was beginning to think I was wrong. I tested 
this in 15 different aggregators and all but one ignored the mode and 
unescaped the content anyway. I have a horrible feeling I'm going to have to 
add code to emulate this behaviour.
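
Something along these lines is what I have in mind. It is only a sketch of one possible heuristic, not what any of the aggregators I tested actually do: effective_mode is a hypothetical helper, the "text/plain" default for type is an assumption on my part, and the Atom namespace is omitted.

```python
import xml.etree.ElementTree as ET


def effective_mode(content):
    """Return the mode to use for an Atom 0.3 content element.

    Heuristic: if the declared (or defaulted) mode is "xml" and the type is
    text/html, but the element holds only character data that looks like
    markup, assume the generator meant "escaped".
    """
    mode = content.get("mode", "xml")
    ctype = content.get("type", "text/plain")  # assumed default
    text = content.text or ""
    if mode == "xml" and ctype == "text/html" and len(content) == 0 and "<" in text:
        return "escaped"
    return mode


broken = ET.fromstring(
    '<content type="text/html"><![CDATA[<p>Content goes here.</p>]]></content>'
)
inline = ET.fromstring(
    '<content type="text/html"><p>Content goes here.</p></content>'
)

print(effective_mode(broken))
print(effective_mode(inline))
```

Content that really is inline XML keeps its child elements, so it is left alone; only the CDATA-wrapped case gets reinterpreted.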


Regards
James



Re: text/html with mode=xml in Atom 0.3

2006-03-23 Thread A. Pagaltzis

* James Holderness [EMAIL PROTECTED] [2006-03-23 18:40]:
> I tested this in 15 different aggregators and all but one
> ignored the mode and unescaped the content anyway.

Good thing this rule was changed in Atom 1.0, then…

What I really don’t get is what that `xmlns` attribute is doing
there in the CDATA block of your data sample. Sometimes I wonder
if CDATA should not have been left out of the XML spec; it seems
to create far too much confusion to be worthwhile.

Regards,
-- 
Aristotle Pagaltzis // http://plasmasturm.org/



Re: text/html with mode=xml in Atom 0.3

2006-03-23 Thread James Holderness


A. Pagaltzis wrote:

> What I really don’t get is what that `xmlns` attribute is doing
> there in the CDATA block of your data sample. Sometimes I wonder
> if CDATA should not have been left out of the XML spec; it seems
> to create far too much confusion to be worthwhile.


Well, if you look at some of those feeds I listed, many of the entries are 
type="application/xhtml+xml" with a namespaced div element, as you would 
expect. It looks like they may have taken the exact same code (or template, 
or however it is they do this stuff) and reused it for type="text/html". 
Only with the HTML they decided they should wrap everything in a CDATA block 
just to be safe.


Regards
James