Simon makes some good practical points in the message I forwarded just before this one. But I would also like to make some more abstract points for those of you who are more of the Jungian introspective/rational character type (most of us here, I guess, or else we would be out surfing on the beach ;-)

On 22 Dec 2005, at 16:34, James Holderness wrote:
Henry Story wrote:
Does Atom allow there to be multiple parallel renditions of a blog entry in different languages?

So it is not really possible to put atom entries with the same id and updated time stamp in a feed (without a SHOULD level violation) even if they are translations of each other. That means that the Swiss would not be able to publish a law with the same atom id in French and in German, as they are obliged by law to publish these at exactly the same time. (No linguistic preference.)

The Swiss may need to publish laws in multiple languages simultaneously, but most users surely don't need to read those laws in multiple languages simultaneously. Why waste their bandwidth including several translations in one feed when it would be far more convenient if you just had a separate feed for each language?

The atom syntax has a very clear restriction which I believe is of semantic importance [1]. And that is as I mentioned in my original post that:

[[
   If multiple atom:entry elements with the same atom:id value appear in
   an Atom Feed Document, they represent the same entry.  Their
   atom:updated timestamps SHOULD be different.  If an Atom Feed
   Document contains multiple entries with the same atom:id, Atom
   Processors MAY choose to display all of them or some subset of them.
   One typical behavior would be to display only the entry with the
   latest atom:updated timestamp.
]] last para of section 4.1.1 of http://www.ietf.org/rfc/rfc4287
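To make the quoted paragraph concrete, here is a minimal Python sketch of the "typical behavior" it describes, plus a check for the SHOULD-level violation (same atom:id and same atom:updated). The feed content, the example.org ids and the function names are made up for illustration; none of it comes from the spec or from any existing Atom library.

```python
from datetime import datetime
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

# Hypothetical feed: two entries sharing one atom:id, with different timestamps.
FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <id>tag:example.org,2005:law-1</id>
    <updated>2005-12-20T10:00:00Z</updated>
    <title>First version</title>
  </entry>
  <entry>
    <id>tag:example.org,2005:law-1</id>
    <updated>2005-12-22T16:00:00Z</updated>
    <title>Revised version</title>
  </entry>
</feed>"""


def parse_updated(text):
    # Rewrite a trailing "Z" as an explicit offset so that
    # datetime.fromisoformat accepts it on older Python versions too.
    return datetime.fromisoformat(text.strip().replace("Z", "+00:00"))


def latest_entries(feed_xml):
    """Map each atom:id to the entry with the latest atom:updated."""
    latest = {}
    for entry in ET.fromstring(feed_xml).iter(ATOM + "entry"):
        entry_id = entry.findtext(ATOM + "id").strip()
        updated = parse_updated(entry.findtext(ATOM + "updated"))
        if entry_id not in latest or updated > latest[entry_id][0]:
            latest[entry_id] = (updated, entry)
    return {entry_id: pair[1] for entry_id, pair in latest.items()}


def same_id_same_updated(feed_xml):
    """Return the (id, updated) pairs that occur more than once in the feed."""
    seen, repeated = set(), set()
    for entry in ET.fromstring(feed_xml).iter(ATOM + "entry"):
        key = (entry.findtext(ATOM + "id").strip(),
               parse_updated(entry.findtext(ATOM + "updated")))
        if key in seen:
            repeated.add(key)
        seen.add(key)
    return repeated
```

With the sample feed, `latest_entries(FEED)` keeps only the "Revised version" entry, and `same_id_same_updated(FEED)` is empty. Two translations published under one id at the same instant would show up in the second function's result.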

What would be the point of having this restriction if it were not of semantic significance? It seems a little odd to tell people that they cannot put two entries with the same id and the same updated time stamp into one feed, but that they can put them into two different feeds and all is ok. Why have the restriction in that case? My guess is that the restriction was accepted because it rests on the following intuition. The atom id is a string that identifies something we can think of as a document. It therefore does not make sense to have two different incompatible versions of the same document with the same time stamp. The atom id is therefore much more restrictive than a URL [2]. A URL in web architecture can have any number of different representations at the same time. Atom clearly rules this out. So my guess is that we should take this seriously. The id is identifying a document. And a document cannot be wholly in two languages at once.

Once we accept this, then there is no big problem. We just need the translations to have 2 different ids. They are after all two different documents. We can then put the two translations into the same feed, or different feeds without problem.

All we need is to find a way to state that one document is a translation of the other. And James Snell's proposal is a good start at getting us there (it is a possible translation of the N3 proposal in my original e-mail).

If you want to link the various translations together you can add one or more link elements at the top of the feed with rel="alternate" and hreflang set to the language of the alternate feed. If you're feeling really enthusiastic you can include alternate links pointing to the translated html pages for each entry too.
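As a rough illustration of the feed-level links described in the paragraph above, the following Python sketch builds a feed element carrying rel="alternate" links with hreflang. The `rel`, `hreflang`, `href` and `type` attributes are those defined for atom:link in RFC 4287. The function name, the feed title and the example.org URLs are hypothetical, and the required atom:id and atom:updated elements are left out for brevity.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
# Serialise Atom elements without a prefix (default namespace).
ET.register_namespace("", ATOM_NS)


def feed_with_alternates(title, alternates):
    """Build a feed element with one alternate link per translated feed.

    `alternates` maps a language tag to the URL of the feed in that language.
    atom:id and atom:updated are omitted to keep the sketch short.
    """
    feed = ET.Element(f"{{{ATOM_NS}}}feed")
    ET.SubElement(feed, f"{{{ATOM_NS}}}title").text = title
    for hreflang, href in alternates.items():
        ET.SubElement(feed, f"{{{ATOM_NS}}}link", {
            "rel": "alternate",
            "type": "application/atom+xml",
            "hreflang": hreflang,
            "href": href,
        })
    return feed


# Hypothetical French feed pointing at its German counterpart.
french_feed = feed_with_alternates(
    "Lois", {"de": "http://example.org/laws.de.atom"})
print(ET.tostring(french_feed, encoding="unicode"))
```

Note that this only relates whole feeds to each other. It says nothing about which entry in one feed is a translation of which entry in the other.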

If my argument above is sound then we should be able to put the translations in the same feed. What we want to do then is find some way to state that one entry in the feed is a translation of another entry in the same feed. So although the solution of saying that one feed is a translation of the other is ok (apart from breaking what I think are the inherent semantics of atom), it is also too general. We need more precise tools.

No extensions need to be made to the Atom spec. No wasted bandwidth by having multiple translations in a feed that may not be used. But you can still publish multiple languages simultaneously by making sure you update all feeds simultaneously. You could even publish all the feeds at the same URL, serving the correct translation based on the HTTP Accept-Language header.
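The Accept-Language idea in the paragraph above could look roughly like the following Python sketch. It is a simplified reading of the header (quality values and a primary-subtag fallback only), not a full implementation of HTTP language negotiation. The function names and feed file names are hypothetical.

```python
def parse_accept_language(header):
    """Return the language ranges in an Accept-Language header, best first."""
    ranges = []
    for position, part in enumerate(header.split(",")):
        piece = part.strip()
        if not piece:
            continue
        tag, _, params = piece.partition(";")
        quality = 1.0  # a range without an explicit q value counts as q=1
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # treat a malformed q value as "not acceptable"
        if quality > 0:
            ranges.append((quality, position, tag.strip().lower()))
    # Highest quality first; header order breaks ties.
    ranges.sort(key=lambda item: (-item[0], item[1]))
    return [tag for _, _, tag in ranges]


def choose_feed(header, feeds, default):
    """Pick the feed for the best matching language, else the default."""
    for tag in parse_accept_language(header):
        if tag == "*":
            break
        if tag in feeds:
            return feeds[tag]
        primary = tag.split("-")[0]  # e.g. "de-ch" falls back to "de"
        if primary in feeds:
            return feeds[primary]
    return feeds[default]


# Hypothetical per-language feed documents served from one URL.
FEEDS = {"fr": "laws.fr.atom", "de": "laws.de.atom"}
print(choose_feed("de-CH,de;q=0.9,fr;q=0.8", FEEDS, default="fr"))
```

A server doing this would normally also send `Vary: Accept-Language` so that caches keep the language versions apart.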

Those are cool ideas. But again, it would be good to have a way to specify that an entry in one feed is a translation of a particular different entry in the other feed.


I may be missing something here, but it seems to me like a reasonable solution to the problem.


Those were my first thoughts too btw. And also Reto's. :-)

Regards
James

I hope this type of argument also helps show how the semantics we are working on at the atom-owl group can help reveal hidden meanings lurking in the atom syntax. We do welcome people to help us improve it so that it can become an official semantics for atom (as was initially specified in the atom charter).

Thanks for your attention,

Henry Story

[1] for some quick thoughts on syntax and semantics see my recent post
http://blogs.sun.com/roller/page/bblfish?entry=the_relation_between_xml_and
(not quite right, but nearly there)
[2] something we need to fix in the current atom-owl ontology
