Shouldn't RSS feeds declare the correct content-type?

Yes, they should, but in general they don't: a lot of RSS feeds are served
with a text/xml content-type.
I don't know why. Perhaps because application/rss+xml is not registered with
IANA (http://www.iana.org/assignments/media-types/application/).
In practice, many webmasters aren't aware of this, because the main entry
points for their feeds are either HTML pages that reference them (with the
correct content-type in the HTML link tag) or feed aggregators that simply
try to parse the feed content (without paying any attention to the MIME type
declared by the protocol). Either way, their feeds are viewable and usable
by end users.
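
To illustrate what "parsing the feed content without caring about the
protocol MIME type" can look like, here is a minimal Python sketch. It is not
Nutch code; the function names (root_name, effective_type) and the set of
generic XML types are assumptions made for the example. It only looks at the
root element of the body and overrides a generic XML content-type when that
root is <rss>.

```python
import xml.etree.ElementTree as ET

# Declared types considered "generic XML" in this sketch (an assumption).
GENERIC_XML_TYPES = {"text/xml", "application/xml"}


def root_name(body: bytes):
    """Return the root element name without its namespace, or None if the body is not XML."""
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        return None
    # ElementTree writes namespaced tags as '{uri}name'; keep only 'name'.
    return root.tag.rsplit("}", 1)[-1]


def effective_type(declared: str, body: bytes) -> str:
    """Hypothetical helper: trust the content over a generic declared type.

    If the server declared a generic XML type but the root element is <rss>,
    report application/rss+xml; otherwise return the declared value unchanged.
    """
    media_type = declared.split(";", 1)[0].strip().lower()
    if media_type in GENERIC_XML_TYPES and root_name(body) == "rss":
        return "application/rss+xml"
    return declared
```

With this approach a feed served as text/xml is still recognized as RSS,
while a body that is not XML, or whose root is something else, keeps the
type the server declared.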

Furthermore, I see this "feature" as an extension of the cache mechanism.
The cache provides access to a document that no longer exists or is simply
temporarily unavailable. So why not give access via the cache to a document
with a wrong protocol content-type that was nevertheless correctly
identified, parsed, and indexed by Nutch?

Jérôme

--
http://motrech.free.fr/
http://www.frutch.org/
