[Fwd: Re: todo: add language encoding information]

2005-12-22 Thread James M Snell


FYI...

 Original Message 
Subject: Re: todo: add language encoding information
Date: Thu, 22 Dec 2005 19:03:25 +0100
From: Henry Story <[EMAIL PROTECTED]>
To: James M Snell <[EMAIL PROTECTED]>
References: <[EMAIL PROTECTED]>

Hi James, you forgot to send this to the Atom syntax list, where it
will be most useful.

Henry
On 22 Dec 2005, at 18:53, James M Snell wrote:

(this may not make it to the atom-owl list as I'm not subscribed to  
that one)


One possibility for this in Atom is to provide multiple Atom  
documents, each covering their own language.  Given that the date  
restriction only covers Atom entries within the *same* feed  
document, the following would be perfectly acceptable:



  ...
  
tag:example.org,2005:some_entry
...
  



  ...
  
tag:example.org,2005:some_entry
...
  


To indicate that the feeds were translations of one another, a new  
"translation" link rel could be established on the feed level

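
<feed>
  ...
  <link rel="translation" hreflang="fr" href="http://.../thefeed?lang=fr" />
  <entry>
    <id>tag:example.org,2005:some_entry</id>
    ...
  </entry>
</feed>

<feed>
  ...
  <link rel="translation" hreflang="de" href="http://.../thefeed?lang=de" />
  <entry>
    <id>tag:example.org,2005:some_entry</id>
    ...
  </entry>
</feed>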

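The feed-level link proposed above can be sketched in Python with the standard xml.etree.ElementTree module. This is only an illustration under stated assumptions: the "translation" rel value is the proposal from this mail and not a registered link relation, the xml:lang attribute on each feed is assumed, and the example.org URLs stand in for the elided ones.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_feed(lang, other_lang, other_href, entry_id):
    """Build a minimal feed in `lang` that links to its `other_lang` counterpart."""
    feed = ET.Element(f"{{{ATOM_NS}}}feed")
    feed.set(XML_LANG, lang)  # assumption: each feed declares its own language
    link = ET.SubElement(feed, f"{{{ATOM_NS}}}link")
    link.set("rel", "translation")  # proposed in this mail, not a registered value
    link.set("hreflang", other_lang)
    link.set("href", other_href)
    entry = ET.SubElement(feed, f"{{{ATOM_NS}}}entry")
    # Both feeds may reuse the same atom:id because they are separate documents.
    ET.SubElement(entry, f"{{{ATOM_NS}}}id").text = entry_id
    return feed


ET.register_namespace("", ATOM_NS)
ENTRY_ID = "tag:example.org,2005:some_entry"
german = build_feed("de", "fr", "http://example.org/thefeed?lang=fr", ENTRY_ID)
french = build_feed("fr", "de", "http://example.org/thefeed?lang=de", ENTRY_ID)
print(ET.tostring(german, encoding="unicode"))
```

Each feed document carries one language and points at the other, so the same entry id never appears twice within a single document.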

- James

Henry Story wrote:
Sorry again to take so long to respond. I have been a little too  
busy to respond recently.

On 4 Dec 2005, at 16:42, Simon Phipps wrote:

On Dec 4, 2005, at 14:33, Henry Story wrote:

I have written my first blog entry in French [1], which has made me
aware that it would be very useful to add language information to the
atom file generated by BlogEd. A menu beneath the Entry icon would
probably do the trick.

Henry

[1] http://bblfish.net/blog/page9.html#t2005_12_04.14_17_02_642


Does Atom allow there to be multiple parallel renditions of a  
blog entry in different languages? I've been discussing with Dave  
Johnson how we might provide a facility in Roller where a blog  
entry might be provided in multiple languages and then select the  
appropriate entry based on the contents of the HTTP GET. If this  
were to be implemented, we'd need a way to express the same  
reality in the Atom feed.

This is a complicated and delicate issue. The Atom spec syntactically  
limits Atom entries with the following clause:

[[
   If multiple atom:entry elements with the same atom:id value appear in
   an Atom Feed Document, they represent the same entry.  Their
   atom:updated timestamps SHOULD be different.  If an Atom Feed
   Document contains multiple entries with the same atom:id, Atom
   Processors MAY choose to display all of them or some subset of them.
   One typical behavior would be to display only the entry with the
   latest atom:updated timestamp.
]] last para of section 4.1.1 of http://www.ietf.org/rfc/rfc4287

So it is not really possible to put Atom entries with the same id  
and updated timestamp in a feed (without a SHOULD-level  
violation), even if they are translations of each other. That means  
that the Swiss would not be able to publish a law with the same  
Atom id in French and in German, as they are obliged by law to publish  
these at exactly the same time (no linguistic preference).
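
As a rough illustration of the clause, the following Python sketch flags entries in one feed document that share both atom:id and atom:updated, which is the SHOULD-level case described above. The sample feed and its ids are invented for the example, and timestamps are compared as plain strings, which is a simplification.

```python
import xml.etree.ElementTree as ET
from collections import Counter

ATOM = "{http://www.w3.org/2005/Atom}"


def duplicate_revisions(feed_xml):
    """Return (id, updated) pairs that occur more than once in one feed document."""
    root = ET.fromstring(feed_xml)
    pairs = Counter(
        (entry.findtext(ATOM + "id"), entry.findtext(ATOM + "updated"))
        for entry in root.iter(ATOM + "entry")
    )
    return sorted(pair for pair, count in pairs.items() if count > 1)


# Hypothetical feed: one law published in French and German at the same instant.
FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry xml:lang="fr">
    <id>tag:example.org,2005:law-1</id>
    <updated>2005-12-22T00:00:00Z</updated>
  </entry>
  <entry xml:lang="de">
    <id>tag:example.org,2005:law-1</id>
    <updated>2005-12-22T00:00:00Z</updated>
  </entry>
  <entry xml:lang="de">
    <id>tag:example.org,2005:law-2</id>
    <updated>2005-12-22T00:00:00Z</updated>
  </entry>
</feed>"""

clashes = duplicate_revisions(FEED)
print(clashes)
```
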
If we want to respect this, we have to give different translations  
of a document different ids, and specify that they are translations  
of each other some other way. Using the Atom OWL ontology [1] we  
could do this by adding some translation relation between the  
entries. Once the best way to do this is settled and it satisfies  
our needs, it should be possible to put a proposal to the  
atom-syntax group for an extension to their XML format.

So perhaps something like this is needed
[ a :Feed, :Version;
  :title [ :value "Example Feed";
           :type "text/plain" ];
  :link  [ :href ;
           :rel iana:alternate ];
  :updated "2003-12-13T18:30:02Z"^^xsd:dateTime;
  :author [ :name "John Doe" ];
  :id ;
  :entry [ a :Entry;
           :title [ :value "Atom-Powered Robots Run Amok";
                    :type "text/plain" ];
           :link [ :href ;
                   :rel iana:alternate ];
           :id ;
           :updated "2003-12-13T18:30:02Z"^^xsd:dateTime;
           :summary [ :value "After a course of integration in French philosophy,...";
                      :type "text/plain" ];
           ext:translation 
         ];
  :entry [ a :Entry;
           :title [ :value "Des Robot Atomiques se rebellent";
                    :type "text/plain" ];
           :link [ :href ;
                   :rel iana:alternate ];
           :id ;
           :updated "2003-12-13T18:30:02Z"^^xsd:dateTime;
           :summary [ :value "Apres un cours d'integration en philosophie francaise...";
                      :type "text/plain" ];
           ext:translation 
         ];
] .
The ext:translation relation above relates a :Entry

Re: [Fwd: Re: todo: add language encoding information]

2005-12-22 Thread A. Pagaltzis

* James M Snell <[EMAIL PROTECTED]> [2005-12-22 19:30]:
> To indicate that the feeds were translations of one another, a
> new  "translation" link rel could be established on the feed
> level
> 
> 
>   ...
>hreflang="fr"
> href="http://.../thefeed?lang=fr";
>   
> tag:example.org,2005:some_entry
> ...
>   
> 
> 
> 
>   ...
>hreflang="de"
> href="http://.../thefeed?lang=de";
>   
> tag:example.org,2005:some_entry
> ...
>   
> 

Is that even necessary? Wouldn’t @rel='self' already work here?

Regards,
-- 
Aristotle Pagaltzis // 



Re: [Fwd: Re: todo: add language encoding information]

2005-12-22 Thread James Holderness


A. Pagaltzis wrote:

* James M Snell <[EMAIL PROTECTED]> [2005-12-22 19:30]:

To indicate that the feeds were translations of one another, a
new  "translation" link rel could be established on the feed
level


Is that even necessary? Wouldn’t @rel='self' already work here?


I would have thought @rel='alternate' (an alternate version of the resource 
described by the containing element) slightly more appropriate than 
@rel='self' (a resource equivalent to the containing element). There's also 
the restriction in section 4.1.1 that suggests that the 'self' link is the 
"preferred URI for retrieving Atom Feed Documents representing this Atom 
feed". I think that could be a bit confusing when mixed with hreflang.


Either way though, I agree that a new rel value isn't necessary.
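
A consumer could pick up such links with a few lines of Python. This is a sketch only: the feed below is made up, the example.org URLs are placeholders, and a link without a rel attribute is treated as "alternate", which is how RFC 4287 says a missing rel is to be interpreted.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"


def alternates_by_language(feed_xml):
    """Map hreflang -> href for the feed-level alternate links."""
    root = ET.fromstring(feed_xml)
    return {
        link.get("hreflang"): link.get("href")
        for link in root.findall(ATOM + "link")
        # A missing rel is interpreted as rel="alternate".
        if link.get("rel", "alternate") == "alternate" and link.get("hreflang")
    }


# Hypothetical feed with one self link and two other-language versions.
FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <link rel="self" href="http://example.org/thefeed?lang=en"/>
  <link rel="alternate" type="application/atom+xml" hreflang="fr"
        href="http://example.org/thefeed?lang=fr"/>
  <link type="application/atom+xml" hreflang="de"
        href="http://example.org/thefeed?lang=de"/>
</feed>"""

versions = alternates_by_language(FEED)
print(versions)
```
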

Regards
James



Re: [Fwd: Re: todo: add language encoding information]

2005-12-22 Thread Henry Story


Yes. That is one solution. But what we are looking for is how one can  
state that two entries in the same feed are translations of one another.


Henry

On 22 Dec 2005, at 20:52, James M Snell wrote:



Hmmm.. interesting thought, hadn't considered that.

rel="self" should always point to *this* document, and never to  
some other document, but if the document referenced is the same  
document just in a different language, then it is possible?  Good  
thinking but I'm not sure if it's legal according to the spec.


 
   ...
   http://.../thefeed?lang=fr";
   http://.../thefeed?lang=de";
   
 tag:example.org,2005:some_entry
 ...
   
 

- James


Re: [Fwd: Re: todo: add language encoding information]

2005-12-22 Thread A. Pagaltzis

* James Holderness <[EMAIL PROTECTED]> [2005-12-22 21:00]:
> I would have thought @rel='alternate' (an alternate version of
> the resource described by the containing element) slightly more
> appropriate than @rel='self' (a resource equivalent to the
> containing element). There's also the restriction in section
> 4.1.1 that suggests that the 'self' link is the "preferred URI
> for retrieving Atom Feed Documents representing this Atom
> feed". I think that could be a bit confusing when mixed with
> hreflang.

Yeah, agreed. In retrospect I can’t think of why I picked `self`;
`alternate` is clearly the correct choice. D’oh.

Regards,
-- 
Aristotle Pagaltzis // 



Re: [Fwd: Re: todo: add language encoding information]

2005-12-22 Thread James M Snell


In that case, the only viable solution is an extension element.  This is 
similar to the Thread Extension use case in which replies may appear 
within the same feed as the original.


Here's how I would do it:

1. Introduce a new  element.

  

2. Use rel="alternate", hreflang="{otherlanguage}" to point to other 
documents that may contain translations.  I would prefer a "translation" 
link rel but agree that reuse of alternate is better.


- James

Henry Story wrote:
Yes. That is one solution. But what we are looking for is how one can 
state that two entries in the same feed are translations of one another.


Henry


Re: [Fwd: Re: todo: add language encoding information]

2005-12-22 Thread James M Snell


Hmmm.. interesting thought, hadn't considered that.

rel="self" should always point to *this* document, and never to some 
other document, but if the document referenced is the same document just 
in a different language, then it is possible?  Good thinking but I'm not 
sure if it's legal according to the spec.


 
   ...
   http://.../thefeed?lang=fr";
   http://.../thefeed?lang=de";
   
 tag:example.org,2005:some_entry
 ...
   
 

- James

A. Pagaltzis wrote:

* James M Snell <[EMAIL PROTECTED]> [2005-12-22 19:30]:

To indicate that the feeds were translations of one another, a
new  "translation" link rel could be established on the feed
level


  ...
  http://.../thefeed?lang=fr";
  
tag:example.org,2005:some_entry
...
  



  ...
  http://.../thefeed?lang=de";
  
tag:example.org,2005:some_entry
...
  



Is that even necessary? Wouldn’t @rel='self' already work here?

Regards,




Re: [Fwd: Re: todo: add language encoding information]

2005-12-23 Thread James Holderness


James M Snell wrote:
In that case, the only viable solution is an extension element.  This is 
similar to the Thread Extension use case in which replies may appear 
within the same feed as the original.


The similarity to the Thread Extension also occurred to me, but I didn't have 
time to write more about it earlier. My thought was that we could perhaps 
get by with an extension attribute to the link element that would work for 
both cases. The link element already has an href for pointing to the feed 
itself, so all we need is an id to point to a particular entry in the feed. 
I guess for the Thread Extension you'd also need a new rel value though.


An example thread reply-to link would look something like this:

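<link rel="..."
 type="application/atom+xml"
 href="http://www.example.org/feed1.atom"
 x:id="entry1_id" />

An example translation link would look something like this:

<link rel="..."
 type="application/atom+xml"
 hreflang="fr"
 x:id="french_entry_id" />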

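A hedged sketch of how a client might follow such a link in Python. The namespace URI bound to the x prefix is hypothetical (the mail does not give one), rel="alternate" is an assumption, and a link without href is taken to mean "an entry in this same feed". Note that RFC 4287 itself requires an href on atom:link, so the href-less form is part of what is being proposed here.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"
X_ID = "{http://example.org/ns/x}id"  # hypothetical extension namespace


def resolve_local_links(feed_xml):
    """Map each entry id to the ids of same-feed entries its links point at."""
    root = ET.fromstring(feed_xml)
    entries = list(root.iter(ATOM + "entry"))
    known = {entry.findtext(ATOM + "id") for entry in entries}
    targets = {}
    for entry in entries:
        for link in entry.findall(ATOM + "link"):
            target = link.get(X_ID)
            # Only links without href are resolved against this feed.
            if link.get("href") is None and target in known:
                targets.setdefault(entry.findtext(ATOM + "id"), []).append(target)
    return targets


FEED = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:x="http://example.org/ns/x">
  <entry>
    <id>english_entry_id</id>
    <link rel="alternate" type="application/atom+xml" hreflang="fr"
          x:id="french_entry_id"/>
  </entry>
  <entry>
    <id>french_entry_id</id>
  </entry>
</feed>"""

links = resolve_local_links(FEED)
print(links)
```
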
What do you think?

Regards
James



Re: [Fwd: Re: todo: add language encoding information]

2005-12-23 Thread Elliotte Harold


Henry Story wrote:

The problem, as I explained a little quickly in my mail yesterday, is 
that you are relating an entry and an id. Because there can be any number 
of entries with the same id it won't be clear which entry is the 
translation. 



I'm not sure if this is technically relevant or not, but in the case at 
hand neither entry is the translation, or alternately both are. Legally 
both are equally valid. This is customary in bi/multilingual 
jurisdictions and international treaties. For example, article 133 of 
the Geneva convention states:


The present Convention is established in English and in French. Both 
texts are equally authentic. The Swiss Federal Council shall arrange for 
official translations of the Convention to be made in the Russian and 
Spanish languages.


In that case, I suppose, neither English nor French is a translation but 
 Russian and Spanish are.


--
Elliotte Rusty Harold  [EMAIL PROTECTED]
XML in a Nutshell 3rd Edition Just Published!
http://www.cafeconleche.org/books/xian3/
http://www.amazon.com/exec/obidos/ISBN=0596007647/cafeaulaitA/ref=nosim



Re: [Fwd: Re: todo: add language encoding information]

2005-12-23 Thread David Powell


Friday, December 23, 2005, 10:47:23 AM, Henry Story wrote:

> On 23 Dec 2005, at 10:56, James Holderness wrote:
>>
>> The similarity to the Thread Extension also occurred to me, but I  
>> didn't have time to write more about it earlier. My thought was  
>> that we could perhaps get by with an extension attribute to the  
>> link element that would work for both cases. The link element  
>> already has an href for pointing to the feed itself, so all we need  
>> is an id to point to a particular entry in the feed. I guess for  
>> the Thread Extension you'd also need a new rel value though.
>>
>> An example thread reply-to link would look something like this:
>>
>> <link rel="..."
>>  type="application/atom+xml"
>>  href="http://www.example.org/feed1.atom"
>>  x:id="entry1_id" />
>>
>> An example translation link would look something like this:
>>
>> <link rel="..."
>>  type="application/atom+xml"
>>  hreflang="fr"
>>  x:id="french_entry_id" />
>>
>> What do you think?

I assume that there is an href missing from that?

Isn't it likely that the entry will have been dropped from the end of
the feed by the time anyone dereferences it? Would it be better to
point the link at a static Atom Entry document instead?


I've said it before, but I strongly dislike extending standard
elements by adding namespaced attributes. I think that it is a failing
that the specification defines several extension points, then pretty
much says that you can extend anything pretty much anywhere, and fails
to explain why some options might be better than others. It is
difficult to support attribute extensions in an APP server or Atom API
without explicitly coding support for each one. It is quite likely
that some implementations will fail to preserve the namespaced
attribute and forward on a corrupted link element without it.

The spec's statement that "the role of other foreign markup is
undefined by this specification", would just be an unhelpful
comp.lang.c-ism, except C does at least clearly define the meaning and
implications of the term "undefined".

But anyway...

> The problem, as I explained a little quickly in my mail yesterday, is  
> that you are relating an entry and an id. Because there can be any  
> number of entries with the same id it won't be clear which entry is  
> the translation.

The description of atom:id in the spec flounders slightly, but I
assumed that the intention was that representations of entries only
vary over time. The use of the word "revision" implies that to me. So,
the relation is to any revision of the entry with that id, though the
latest is probably most relevant.
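
Read that way, resolving a link to an id amounts to choosing among revisions by time. A minimal Python sketch, assuming entries have already been reduced to dictionaries with "id" and "updated" keys (the sample ids and dates are illustrative):

```python
from datetime import datetime


def parse_updated(value):
    """Parse an atom:updated value such as 2005-11-15T18:30:02Z."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def latest_revision(entries, entry_id):
    """Return the newest revision with the given id, or None if there is none."""
    revisions = [entry for entry in entries if entry["id"] == entry_id]
    return max(revisions, key=lambda e: parse_updated(e["updated"]), default=None)


# Illustrative data: two revisions of one entry plus an unrelated entry.
entries = [
    {"id": "tag:eg.com,2005:/fr/atom03", "updated": "2005-11-13T18:30:02Z"},
    {"id": "tag:eg.com,2005:/fr/atom03", "updated": "2005-11-15T18:30:02Z"},
    {"id": "tag:eg.com,2005:/en/atom03", "updated": "2005-11-15T18:30:02Z"},
]

newest = latest_revision(entries, "tag:eg.com,2005:/fr/atom03")
print(newest)
```
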

Unlike HTTP, Atom Syntax can't support conneg, so I don't think that
it would be useful for entries to vary over anything other than time.
If you want to support multiple languages, I think it is better to build
that on top of Atom's infrastructure via linking, rather than
underneath it via stretching the scope of the entry id.

-- 
Dave



Re: [Fwd: Re: todo: add language encoding information]

2005-12-23 Thread Walter Underwood

--On December 23, 2005 11:31:22 PM +0100 Henry Story <[EMAIL PROTECTED]> wrote:
>
> So  you can't have a link pointing from an entry to an id, without losing some
> very important information. We need something more  specific. We need a link
> pointing from A to C as shown by the blue line.

Some people will need that in the guts of their publishing system. Why do
we need it in Atom? Is there something essential that subscribers cannot do
because this isn't represented? This sounds like something needed for the
publishing/translation workflow, not for the general readership.

Extended provenance information is sometimes needed, but there is almost
no limit to that. It certainly does not stop at translation, source, and
translator. I'm reading a new translation of Andersen's tales where 
"Thumbelina" is "Inchelina" because the translator knew the right dialect
of Danish. That is significant, but does it need to be in Atom?

The semantics here should be exactly the same as for dates -- the date
means what the publisher thinks it means. Same for language info. Trying
to get more exact means that the model will be wrong for some publishers
that generate completely legal Atom.

wunder
--
Walter Underwood
Principal Software Architect, Verity



Re: [Fwd: Re: todo: add language encoding information]

2005-12-23 Thread James Holderness


Henry Story wrote:

I think you have not quite grasped the point my graph was trying to
make. Perhaps I did not explain myself clearly enough. The graph
represents a feed with three entries A, B and C.
B and C share the same id. C has an updated time stamp that is after
B so C is an update of B.


If you went with my link+hreflang method, this could be handled with an 
additional date attribute in the link, where the date signified the time at 
which the translation was valid. For example:



 tag:eg.com,2005:/en/atom03
 x:updated="2005-11-15T18:30:02Z"/>



 tag:eg.com,2005:/fr/atom03
 2005-11-13T18:30:02Z
 


 tag:eg.com,2005:/fr/atom03
 2005-11-15T18:30:02Z
 


I've left off the obvious elements and attributes for brevity, but you get 
the idea.
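
One possible reading of this, sketched in Python: the link carries both an entry id and a timestamp, and a client matches them against the revisions it has seen. The "x:id" and "x:updated" keys mirror the extension attributes discussed in this thread, the link's x:id value is an assumption, and exact string matching of the timestamp is a simplification.

```python
def resolve_link(entries, link):
    """Return the revision a link pins down via its x:id and x:updated values."""
    for entry in entries:
        if entry["id"] == link["x:id"] and entry["updated"] == link["x:updated"]:
            return entry
    return None


# Revisions B and C of the French entry, using the ids and dates from the example.
entries = [
    {"id": "tag:eg.com,2005:/fr/atom03", "updated": "2005-11-13T18:30:02Z", "rev": "B"},
    {"id": "tag:eg.com,2005:/fr/atom03", "updated": "2005-11-15T18:30:02Z", "rev": "C"},
]

# Link carried by the English entry; the x:id value here is assumed.
link = {"x:id": "tag:eg.com,2005:/fr/atom03", "x:updated": "2005-11-15T18:30:02Z"}

target = resolve_link(entries, link)
print(target)
```
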


Regards
James



Re: [Fwd: Re: todo: add language encoding information]

2005-12-23 Thread James Holderness


David Powell wrote:



What do you think?


I assume that there is an href missing from that?


Actually no. At least not necessarily. That was to indicate the special case 
where the link was pointing to an entry in the same feed.



Isn't it likely that the entry will have been dropped from the end of
the feed by the time anyone dereferences it? Would it be better to
point the link at a static Atom Entry document instead?


I would say that's up to the publisher to decide since it depends on how 
they expect the information to be used. There's nothing wrong with them 
linking to an Atom entry document (and that was one of my suggested 
solutions earlier), but creating a separate document for every entry may not 
be something they're willing to do.


If a client is already subscribed to the feed containing the translation and 
they save all data in the feed then the problem goes away. Alternatively the 
publisher could use the Feed History extension (when it eventually becomes 
available) which would enable a client to search through the archives to 
find the translation if necessary.



to explain why some options might be better than others. It is
difficult to support attribute extensions in an APP server or Atom API
without explicitly coding support for each one. It is quite likely
that some implementations will fail to preserve the namespaced
attribute and forward on a corrupted link element without it.


That had occurred to me and I had another somewhat obscure idea in that 
regard. How wrong would it be to use the anchor part of the URI to refer to 
an entry in a feed the same way you can point to a particular element on an 
html page? That way you wouldn't need an extension attribute for the id 
anymore. Obviously some characters in the id would need to be escaped since 
it's a URI itself, but other than that I don't see any technical problems 
with the concept.
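
A small Python sketch of the escaping involved, using urllib.parse from the standard library. Treating the fragment as "the percent-encoded atom:id" is only the idea floated here and not a defined fragment syntax for Atom documents; the feed URL and entry id are sample values.

```python
from urllib.parse import quote, unquote, urldefrag


def entry_uri(feed_href, entry_id):
    """Append the percent-encoded entry id to the feed URI as a fragment."""
    return feed_href + "#" + quote(entry_id, safe="")


def split_entry_uri(uri):
    """Recover (feed URI, entry id) from a URI built by entry_uri."""
    feed_href, fragment = urldefrag(uri)
    return feed_href, unquote(fragment)


uri = entry_uri("http://www.example.org/feed1.atom", "tag:eg.com,2005:/fr/atom03")
print(uri)
print(split_entry_uri(uri))
```

Escaping every reserved character keeps the ':' and '/' of the id from being confused with the structure of the enclosing URI.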


Regards
James



Re: [Fwd: Re: todo: add language encoding information]

2006-01-30 Thread James Holderness


Henry Story wrote:

Just re-reading your mail I think you make a good point that perhaps
"translation" is the wrong word to use. We would like something more
abstract such as "otherLanguageVersion". This made me think that the
word we want is "alternate". And then looking at the spec again I
found the following:

[[
4.2.7.4.  The "hreflang" Attribute


This was the first thing I suggested...

"If you want to link the various translations together you can add one or 
more link elements at the top of the feed with rel="alternate" and hreflang 
set to the language of the alternate feed. If you're feeling really 
enthusiastic you can include alternate links pointing to the translated html 
pages for each entry too." [1]


Regards
James

[1] http://www.imc.org/atom-syntax/mail-archive/msg17609.html