Re: todo: add language encoding information

2006-02-01 Thread Henry Story


Thanks for the reminder. I had not completely swapped in the earlier  
parts of the thread.


1. Content translation
---

With the link hreflang attribute pointing to the alternate content inside
the entry


   <entry>
     <title>Atom-Powered Robots Run Amok</title>
     <link href="http://example.org/2003/12/13/atom03"/>
     <link href="http://example.org/2003/12/13/atom03/fr"
           hreflang="fr"/>
     <id>urn:uuid:1225c695-cfb8-4ebb--80da344efa6a</id>
     <updated>2003-12-13T18:30:02Z</updated>
     <summary>Some text.</summary>
   </entry>

we get the ability to point to any number of different language  
versions of the same document.
But this does not help the user who would like to find the  
translation of the metadata in his language.
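
A rough sketch of how a reader could make use of such hreflang links, assuming Python's standard xml.etree.ElementTree. The function name alternates_by_language and the entry id are placeholders invented for illustration; the URLs are the ones from the example entry. A link without a rel attribute is treated as rel="alternate", as RFC 4287 requires.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

# Hypothetical entry modelled on the example; the id is a placeholder.
ENTRY_XML = """<entry xmlns="http://www.w3.org/2005/Atom">
  <title>Atom-Powered Robots Run Amok</title>
  <link href="http://example.org/2003/12/13/atom03"/>
  <link href="http://example.org/2003/12/13/atom03/fr" hreflang="fr"/>
  <id>urn:example:atom03</id>
  <updated>2003-12-13T18:30:02Z</updated>
  <summary>Some text.</summary>
</entry>"""


def alternates_by_language(entry):
    """Map hreflang -> href for every alternate link of an entry.

    Links without hreflang are stored under the key None.
    """
    result = {}
    for link in entry.findall(ATOM + "link"):
        # A missing rel attribute means rel="alternate".
        if link.get("rel", "alternate") == "alternate":
            result[link.get("hreflang")] = link.get("href")
    return result


entry = ET.fromstring(ENTRY_XML)
langs = alternates_by_language(entry)
print(langs.get("fr"))
```

This only finds translated content; it says nothing about translated titles or summaries, which is the gap discussed next.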


2. Metadata translations


For translation of metadata (title, subtitle, summary and content are  
the ones that would be affected - any others?), there are a number of  
solutions:


2.1 Feed translations.
-

	As James Holderness points out below, the feed that contains the
entry can point to a feed that contains the translated metadata. There
was some discussion about the best way to do that (self or alternate,
if I recall correctly), but it is clear that there should be no
difficulty there.


	One problem here is how to tell which entries in one feed are  
translations of which entries in the other.


 	a- One could simply find this out by giving the entries in both  
feeds that are translations of each other the same id. In web  
architecture terms the entries could be seen as different language  
representations of the resource named by the id. [[ but see problem  
with SHOULD restriction below ]].
	b- Otherwise one needs to be able to create translations (alternate  
language) link relations from entries to entries across feeds.



2.2. Entry Translations
---

  As Simon Phipps pointed out, for small players, translations are  
expensive. The creator of an entry may for example only have time to  
translate the metadata for a few of the entries for which he can see  
this to be of value. He may not even wish to translate the content,  
leaving that for someone else with a little more time perhaps to do.  
So we could have French metadata for content that is only available
in English (but that may later become available in French).


   So here it seems that there is at least one use case requirement  
for having some way to tell that two entries in the same feed are  
translations of one another. Following the logic of 2.1 there should  
be 2 ways of doing this:


	a- Simply publish the two entries with the same id. Here the two  
entries are seen as two different language versions of the resource  
named by the id.
	b- We need to be able to create a translation (alternate language)
link relation between two entries with different ids in the same feed.


Now solution (a) does not work in this situation because of the  
famous SHOULD level restrictions on when two entries can appear in a  
feed.


[[
  If multiple atom:entry elements with the same atom:id value appear in
   an Atom Feed Document, they represent the same entry.  Their
   atom:updated timestamps SHOULD be different.
]]

And the Swiss case shows that one may be obliged to publish two
different language versions simultaneously. But perhaps this is why we
don't have a MUST level restriction, but only a SHOULD. And perhaps it
really is permissible to have different language representations of the
same entry id.
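
To make the restriction concrete, here is a minimal sketch, assuming Python's xml.etree.ElementTree, that flags entries in one feed document sharing both atom:id and atom:updated. The feed content and the name same_id_same_updated are invented for illustration, and timestamps are compared as plain strings, which is a simplification.

```python
import xml.etree.ElementTree as ET
from collections import Counter

ATOM = "{http://www.w3.org/2005/Atom}"

# Hypothetical feed: two language versions published at the same instant.
FEED_XML = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry xml:lang="en">
    <id>tag:example.org,2005:some_entry</id>
    <updated>2005-12-22T12:00:00Z</updated>
    <title>Hello</title>
  </entry>
  <entry xml:lang="fr">
    <id>tag:example.org,2005:some_entry</id>
    <updated>2005-12-22T12:00:00Z</updated>
    <title>Bonjour</title>
  </entry>
</feed>"""


def same_id_same_updated(feed):
    """Return the (id, updated) pairs that occur more than once in a feed."""
    counts = Counter(
        (e.findtext(ATOM + "id"), e.findtext(ATOM + "updated"))
        for e in feed.findall(ATOM + "entry")
    )
    return [pair for pair, n in counts.items() if n > 1]


clashes = same_id_same_updated(ET.fromstring(FEED_XML))
print(clashes)
```

A feed like this one is still legal Atom, since the rule is only a SHOULD, but a consumer cannot tell from the timestamps which entry supersedes the other.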


Even then we may still want some way to specify that two entries are  
metadata translations of each other (eTranslations in the graph).  
Perhaps someone wants to translate an entry of mine, but does not  
want to use the same entry id. They would then still want a way to  
specify the relation between the two ids. (note that in RDF one could  
solve this by saying that the two ids are same-as each other, and  
fall back to case (a)).



HOWTO
=

So how can we specify that two entries are translations of each other?

0. Entries have same id
---

If the two entries have the same id yet have different language  
settings, then when we come across both we should guess that these  
two represent the same thing.
	+ it would mean that an aggregator that had found one entry would
know it need not notify its user about the new version in a different
language.

	- this does not help to locate the other language version when only
one of the entries has been found
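
A small sketch of what an aggregator could do under this approach, assuming Python's xml.etree.ElementTree and assuming that each entry carries xml:lang directly on the atom:entry element (inherited xml:lang is ignored here). The feed and the name pick_entry are hypothetical.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical feed with two language versions under one id.
FEED_XML = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry xml:lang="en">
    <id>tag:example.org,2005:some_entry</id>
    <title>Atom-Powered Robots Run Amok</title>
  </entry>
  <entry xml:lang="fr">
    <id>tag:example.org,2005:some_entry</id>
    <title>Des Robot Atomiques se rebellent</title>
  </entry>
</feed>"""


def pick_entry(feed, entry_id, preferred_lang):
    """Return the entry with this id in the preferred language.

    Falls back to the first entry with that id, or None if there is none.
    """
    matches = [e for e in feed.findall(ATOM + "entry")
               if e.findtext(ATOM + "id") == entry_id]
    for entry in matches:
        if entry.get(XML_LANG) == preferred_lang:
            return entry
    return matches[0] if matches else None


feed = ET.fromstring(FEED_XML)
chosen = pick_entry(feed, "tag:example.org,2005:some_entry", "fr")
print(chosen.findtext(ATOM + "title"))
```

Note that this only works when both versions are already in hand, which is exactly the limitation noted in the minus point.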

The biggest problem is finding a way to point from one entry to  
another. All entries have ids, but these don't tell us where we can  
get information about them. An atom id is usually not dereferenceable.


Also it is not clear that it would be understood that the new  
representation in a different language is not just meant to be a  
replacement of the old one. (See SHOULD level restriction mentioned  
above). [[ There needs to be a debate as to whether multilingual  
representatio

Re: todo: add language encoding information

2006-02-01 Thread Henry Story



On 1 Feb 2006, at 02:10, Eric Scheid wrote:



> On 31/1/06 1:27 PM, "James Holderness" <[EMAIL PROTECTED]> wrote:
>
>> Actually I was thinking just a regular href and type. For example:
>>
>> <link rel="alternate" type="application/atom+xml"
>>   href="http://mydomain.com/feed"
>>   hreflang="fr"
>>   x:id="french_entry_id"
>>   x:updated="2005-11-15T18:30:02Z" />
>>
>> I'm not sure how valid that is considering a client that didn't understand
>> this extension would consider the full feed to be an alternative for that
>> one particular entry which doesn't seem right.
>
> Any reason that application/atom+xml document couldn't be an Atom Entry
> Document, apart from the current lack of reader/browser support? It could
> also then include the links necessary to subscribe to the specific language
> feed.


I think that was one of James' original suggestions. It seems like  
one good solution. The only problem is that it forces people to have  
Entry Documents.


I am just writing up a document listing all the different ways one  
can solve this problem...




e.




Re: todo: add language encoding information

2006-01-31 Thread Eric Scheid

On 31/1/06 1:27 PM, "James Holderness" <[EMAIL PROTECTED]> wrote:

> Actually I was thinking just a regular href and type. For example:
> 
> <link rel="alternate" type="application/atom+xml"
>   href="http://mydomain.com/feed"
>   hreflang="fr"
>   x:id="french_entry_id"
>   x:updated="2005-11-15T18:30:02Z" />
> 
> I'm not sure how valid that is considering a client that didn't understand
> this extension would consider the full feed to be an alternative for that
> one particular entry which doesn't seem right.

Any reason that application/atom+xml document couldn't be an Atom Entry
Document, apart from the current lack of reader/browser support? It could
also then include the links necessary to subscribe to the specific language
feed.

e.



Re: todo: add language encoding information

2006-01-31 Thread Henry Story


This looks like a good place to look for a solution.

On 31 Jan 2006, at 03:27, James Holderness wrote:

> Henry Story wrote:
>> Presumably one would need to add an x:feed="http://mydomain.com/feed"
>> attribute for translations of entries that appear in other feeds.
>
> Actually I was thinking just a regular href and type. For example:
>
> <link rel="alternate" type="application/atom+xml"
>   href="http://mydomain.com/feed"
>   hreflang="fr"
>   x:id="french_entry_id"
>   x:updated="2005-11-15T18:30:02Z" />
>
> I'm not sure how valid that is considering a client that didn't
> understand this extension would consider the full feed to be an
> alternative for that one particular entry which doesn't seem right.



Your point is well taken that the extension would be confusing for  
tools that did not understand it.  Perhaps one could create a new  
link rel="translation" that would point to a feed containing an  
alternate entry with the given x:id and updated at the particular time.


<link rel="translation" type="application/atom+xml"
  href="http://mydomain.com/feed"
  hreflang="fr"
  x:id="french_entry_id"
  x:updated="2005-11-15T18:30:02Z" />

would that still be confusing to tools that did not understand the  
extension?


Henry





Re: todo: add language encoding information

2006-01-31 Thread Henry Story



On 31 Jan 2006, at 03:27, James Holderness wrote:
> Personally I would have preferred using a fragment identifier. So
> the above example would look something like this:
>
> <link rel="alternate" type="application/atom+xml"
>   href="http://mydomain.com/feed#french_entry_id"
>   hreflang="fr"
>   x:updated="2005-11-15T18:30:02Z" />
>
> And a link to an entry within the same feed could be done with a
> fragment only uri, like this:
>
> <link rel="alternate" type="application/atom+xml"
>   href="#french_entry_id"
>   hreflang="fr" />
>
> But from what I can make out, this sort of thing would only be
> valid if atom:id was defined to be an ID attribute which would
> assumedly require an Atom DTD.


And that won't be quite possible because:
	- a feed can have a number of entries with the same id, whereas I  
think that when something is defined as being an ID attribute there  
can only be one instance in the same document (not absolutely sure  
about this though)
	- an id is not an attribute in atom xml, so I am not even sure this  
would be possible anyway


Also it is quite possible that the entry falls off the end of the feed
after a while (usually just before it gets placed into an archive).
So the link in your first example above would break a little
too easily.


Henry Story
http://bblfish.net/



Re: todo: add language encoding information

2006-01-30 Thread James Holderness


Henry Story wrote:
> Presumably one would need to add an x:feed="http://mydomain.com/feed"
> attribute for translations of entries that appear in other feeds.


Actually I was thinking just a regular href and type. For example:

<link rel="alternate" type="application/atom+xml"
  href="http://mydomain.com/feed"
  hreflang="fr"
  x:id="french_entry_id"
  x:updated="2005-11-15T18:30:02Z" />

I'm not sure how valid that is considering a client that didn't understand 
this extension would consider the full feed to be an alternative for that 
one particular entry which doesn't seem right.


Personally I would have preferred using a fragment identifier. So the above 
example would look something like this:


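<link rel="alternate" type="application/atom+xml"
  href="http://mydomain.com/feed#french_entry_id"
  hreflang="fr"
  x:updated="2005-11-15T18:30:02Z" />

And a link to an entry within the same feed could be done with a fragment
only uri, like this:

<link rel="alternate" type="application/atom+xml"
  href="#french_entry_id"
  hreflang="fr" />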


But from what I can make out, this sort of thing would only be valid if 
atom:id was defined to be an ID attribute which would assumedly require an 
Atom DTD.


Regards
James



Re: [Fwd: Re: todo: add language encoding information]

2006-01-30 Thread James Holderness


Henry Story wrote:

Just re-reading your mail I think you make a good point that perhaps
"translation" is the wrong word to use. We would like something more
abstract such as "otherLanguageVersion". This made me think that the
word we want is "alternate". And then looking at the spec again I
found the following:

[[
4.2.7.4.  The "hreflang" Attribute


This was the first thing I suggested...

"If you want to link the various translations together you can add one or 
more link elements at the top of the feed with rel="alternate" and hreflang 
set to the language of the alternate feed. If you're feeling really 
enthusiastic you can include alternate links pointing to the translated html 
pages for each entry too." [1]


Regards
James

[1] http://www.imc.org/atom-syntax/mail-archive/msg17609.html 



Re: todo: add language encoding information

2006-01-30 Thread Henry Story


Sorry for being away for a while. I am back on this issue. We had  
narrowed in on this quite well. It should be RFC time real soon.



On 24 Dec 2005, at 07:25, James Holderness wrote:

> Henry Story wrote:
>> I think you have not quite grasped the point my graph was trying to
>> make. Perhaps I did not explain myself clearly enough. The graph
>> represents a feed with three entries A, B and C.
>> B and C share the same id. C has an updated time stamp that is after
>> B so C is an update of B.
>
> If you went with my link+hreflang method, this could be handled
> with an additional date attribute in the link, where the date
> signified the time at which the translation was valid. For example:

Yes. Quite right.

> <entry>
>   <id>tag:eg.com,2005:/en/atom03</id>
>   <link ...
>     x:updated="2005-11-15T18:30:02Z"/>
> </entry>
>
> <entry>
>   <id>tag:eg.com,2005:/fr/atom03</id>
>   <updated>2005-11-13T18:30:02Z</updated>
> </entry>
>
> <entry>
>   <id>tag:eg.com,2005:/fr/atom03</id>
>   <updated>2005-11-15T18:30:02Z</updated>
> </entry>

Presumably one would need to add an x:feed="http://mydomain.com/feed"
attribute for translations of entries that appear in other feeds.

> I've left off the obvious elements and attributes for brevity, but
> you get the idea.

yes I do.

Henry




Re: [Fwd: Re: todo: add language encoding information]

2005-12-23 Thread James Holderness


Henry Story wrote:

I think you have not quite grasped the point my graph was trying to
make. Perhaps I did not explain myself clearly enough. The graph
represents a feed with three entries A, B and C.
B and C share the same id. C has an updated time stamp that is after
B so C is an update of B.


If you went with my link+hreflang method, this could be handled with an 
additional date attribute in the link, where the date signified the time at 
which the translation was valid. For example:



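 <entry>
   <id>tag:eg.com,2005:/en/atom03</id>
   <link ...
     x:updated="2005-11-15T18:30:02Z"/>
 </entry>

 <entry>
   <id>tag:eg.com,2005:/fr/atom03</id>
   <updated>2005-11-13T18:30:02Z</updated>
 </entry>

 <entry>
   <id>tag:eg.com,2005:/fr/atom03</id>
   <updated>2005-11-15T18:30:02Z</updated>
 </entry>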


I've left off the obvious elements and attributes for brevity, but you get 
the idea.
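
As a rough illustration of how a client might resolve such a link, here is a sketch assuming Python's xml.etree.ElementTree. It looks up the revision that matches both the id and the date carried by the link. The feed below and the name find_revision are invented for illustration, and the timestamps are compared as plain strings.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

# Hypothetical feed holding two revisions (B and C) of the French entry.
FEED_XML = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <id>tag:eg.com,2005:/fr/atom03</id>
    <updated>2005-11-13T18:30:02Z</updated>
    <title>B</title>
  </entry>
  <entry>
    <id>tag:eg.com,2005:/fr/atom03</id>
    <updated>2005-11-15T18:30:02Z</updated>
    <title>C</title>
  </entry>
</feed>"""


def find_revision(feed, entry_id, updated):
    """Return the entry matching both id and updated, or None."""
    for entry in feed.findall(ATOM + "entry"):
        if (entry.findtext(ATOM + "id") == entry_id
                and entry.findtext(ATOM + "updated") == updated):
            return entry
    return None


feed = ET.fromstring(FEED_XML)
target = find_revision(feed, "tag:eg.com,2005:/fr/atom03",
                       "2005-11-15T18:30:02Z")
print(target.findtext(ATOM + "title"))
```

With the date in the link, the id alone no longer has to single out one revision.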


Regards
James



Re: [Fwd: Re: todo: add language encoding information]

2005-12-23 Thread James Holderness


David Powell wrote:



>> What do you think?
>
> I assume that there is an href missing from that?

Actually no. At least not necessarily. That was to indicate the special case
where the link was pointing to an entry in the same feed.

> Isn't it likely that the entry will have been dropped from the end of
> the feed by the time anyone dereferences it? Would it be better to
> point the link at a static Atom Entry document instead.


I would say that's up to the publisher to decide since it depends on how 
they expect the information to be used. There's nothing wrong with them 
linking to an Atom entry document (and that was one of my suggested 
solutions earlier), but creating a separate document for every entry may not 
be something they're willing to do.


If a client is already subscribed to the feed containing the translation and 
they save all data in the feed then the problem goes away. Alternatively the 
publisher could use the Feed History extension (when it eventually becomes 
available) which would enable a client to search through the archives to 
find the translation if necessary.



> to explain why some options might be better than others. It is
> difficult to support attribute extensions in an APP server or Atom API
> without explicitly coding support for each one. It is quite likely
> that some implementations will fail to preserve the namespaced
> attribute and forward on a corrupted link element without it.


That had occurred to me and I had another somewhat obscure idea in that 
regard. How wrong would it be to use the anchor part of the URI to refer to 
an entry in a feed the same way you can point to a particular element on an 
html page? That way you wouldn't need an extension attribute for the id 
anymore. Obviously some characters in the id would need to be escaped since 
it's a URI itself, but other than that I don't see any technical problems 
with the concept.
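
A minimal sketch of the escaping involved, using Python's standard urllib.parse. The helper names entry_link and split_entry_link are hypothetical, and nothing in the Atom spec defines this use of fragments; it only shows that the round trip is mechanically possible.

```python
from urllib.parse import quote, unquote, urldefrag


def entry_link(feed_url, entry_id):
    """Append an atom:id to a feed URL as a percent-encoded fragment."""
    # safe="" escapes ":", "/" and "," so the id cannot be misread as URI syntax.
    return feed_url + "#" + quote(entry_id, safe="")


def split_entry_link(href):
    """Return (feed_url, entry_id) for a link built by entry_link."""
    url, fragment = urldefrag(href)
    return url, unquote(fragment)


href = entry_link("http://mydomain.com/feed", "tag:eg.com,2005:/fr/atom03")
print(href)
print(split_entry_link(href))
```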


Regards
James



Re: [Fwd: Re: todo: add language encoding information]

2005-12-23 Thread Walter Underwood

--On December 23, 2005 11:31:22 PM +0100 Henry Story <[EMAIL PROTECTED]> wrote:
>
> So  you can't have a link pointing from an entry to an id, without losing some
> very important information. We need something more  specific. We need a link
> pointing from A to C as shown by the blue line.

Some people will need that in the guts of their publishing system. Why do
we need it in Atom? Is there something essential that subscribers cannot do
because this isn't represented? This sounds like something needed for the
publishing/translation workflow, not for the general readership.

Extended provenance information is sometimes needed, but there is almost
no limit to that. It certainly does not stop at translation, source, and
translator. I'm reading a new translation of Andersen's tales where 
"Thumbelina" is "Inchelina" because the translator knew the right dialect
of Danish. That is significant, but does it need to be in Atom?

The semantics here should be exactly the same as for dates -- the date
means what the publisher thinks it means. Same for language info. Trying
to get more exact means that the model will be wrong for some publishers
that generate completely legal Atom.

wunder
--
Walter Underwood
Principal Software Architect, Verity



Re: [Fwd: Re: todo: add language encoding information]

2005-12-23 Thread David Powell


Friday, December 23, 2005, 10:47:23 AM, Henry Story wrote:

> On 23 Dec 2005, at 10:56, James Holderness wrote:
>>
>> The similarity to the Thread Extension also occurred to me, but I
>> didn't have time to write more about it earlier. My thought was
>> that we could perhaps get by with an extension attribute to the
>> link element that would work for both cases. The link element
>> already has an href for pointing to the feed itself, so all we need
>> is an id to point to a particular entry in the feed. I guess for
>> the Thread Extension you'd also need a new rel value though.
>>
>> An example thread reply-to link would look something like this:
>>
>> <link rel="..."
>>  type="application/atom+xml"
>>  href="http://www.example.org/feed1.atom"
>>  x:id="entry1_id" />
>>
>> An example translation link would look something like this:
>>
>> <link rel="..."
>>  type="application/atom+xml"
>>  hreflang="fr"
>>  x:id="french_entry_id" />
>>
>> What do you think?

I assume that there is an href missing from that?

Isn't it likely that the entry will have been dropped from the end of
the feed by the time anyone dereferences it? Would it be better to
point the link at a static Atom Entry document instead.


I've said it before, but I strongly dislike extending standard
elements by adding namespaced attributes. I think that it is a failing
that the specification defines several extension points, then pretty
much says that you can extend anything pretty much anywhere, and fails
to explain why some options might be better than others. It is
difficult to support attribute extensions in an APP server or Atom API
without explicitly coding support for each one. It is quite likely
that some implementations will fail to preserve the namespaced
attribute and forward on a corrupted link element without it.

The spec's statement that "the role of other foreign markup is
undefined by this specification", would just be an unhelpful
comp.lang.c-ism, except C does at least clearly define the meaning and
implications of the term "undefined".

But anyway...

> The problem as I explained a little quickly in my mail yesterday, is
> that you are relating an entry and an id. Because there can be any
> number of entries with the same id it won't be clear which entry is
> the translation.

The description of atom:id in the spec flounders slightly, but I
assumed that the intention was that representations of entries only
vary over time. The use of the word "revision" implies that to me. So,
the relation is to any revision of the entry with that id, though the
latest is probably most relevant.

Unlike HTTP, Atom Syntax can't support conneg, so I don't think that
it would be useful for entries to vary over anything other than time.
If you want to support multi-languages, I think it is better to build
that on top of Atom's infrastructure via linking, rather than
underneath it via stretching the scope of the entry id.

-- 
Dave



Re: [Fwd: Re: todo: add language encoding information]

2005-12-23 Thread Elliotte Harold


Henry Story wrote:

> The problem as I explained a little quickly in my mail yesterday, is
> that you are relating an entry and an id. Because there can be any number
> of entries with the same id it won't be clear which entry is the
> translation.



I'm not sure if this is technically relevant or not, but in the case at 
hand neither entry is the translation, or alternately both are. Legally 
both are equally valid. This is customary in bi/multilingual 
jurisdictions and international treaties. For example, article 133 of 
the Geneva convention states:


The present Convention is established in English and in French. Both 
texts are equally authentic. The Swiss Federal Council shall arrange for 
official translations of the Convention to be made in the Russian and 
Spanish languages.


In that case, I suppose, neither English nor French is a translation but 
 Russian and Spanish are.


--
Elliotte Rusty Harold  [EMAIL PROTECTED]
XML in a Nutshell 3rd Edition Just Published!
http://www.cafeconleche.org/books/xian3/
http://www.amazon.com/exec/obidos/ISBN=0596007647/cafeaulaitA/ref=nosim



Re: [Fwd: Re: todo: add language encoding information]

2005-12-23 Thread James Holderness


James M Snell wrote:
> In that case, the only viable solution is an extension element.  This is
> similar to the Thread Extension use case in which replies may appear
> within the same feed as the original.


The similarity to the Thread Extension also occurred to me, but I didn't have
time to write more about it earlier. My thought was that we could perhaps
get by with an extension attribute to the link element that would work for
both cases. The link element already has an href for pointing to the feed
itself, so all we need is an id to point to a particular entry in the feed.
I guess for the Thread Extension you'd also need a new rel value though.


An example thread reply-to link would look something like this:

<link rel="..."
 type="application/atom+xml"
 href="http://www.example.org/feed1.atom"
 x:id="entry1_id" />

An example translation link would look something like this:

<link rel="..."
 type="application/atom+xml"
 hreflang="fr"
 x:id="french_entry_id" />

What do you think?

Regards
James



Re: todo: add language encoding information

2005-12-22 Thread Henry Story



On 22 Dec 2005, at 21:34, James M Snell wrote:


In that case, the only viable solution is an extension element.


Yep. I have added James Holderness's solution and clarified the  
points in N3.


This is similar to the Thread Extension use case in which replies  
may appear within the same feed as the original.


Here's how I would do it:

1. Introduce a new  element.

  


That was like my first example RDF. And my question there was: would
this mean that the entry containing the translation element was a
translation of an entry with the given id, or of all entries with the
given id, or that all entries with the id of the containing element
were translations of entries with the given id? The problem is that
when one makes a translation of something one really wants to specify
which version of the entry pointed to is the translation. Things may
otherwise get to be quite misleading.


2. Use rel="alternate", hreflang="{otherlanguage}" to point to  
other documents that may contain translations.  I would prefer a  
"translation" link rel but agree that reuse of alternate is better.


This solution is best for pointing to the place where the content of
the entry is translated, I suppose. Here we are saying that the
content is translated at a certain place. In theory the titles and
the summaries could in both entries remain in English.


[ a :Feed, :Version;
  :entry [ a :Entry;
 :title [ :value "Atom-Powered Robots Run Amok";
  :type "text/plain" ];
 :link [  :href ;
  :rel iana:alternate ];
 :id ;
 :updated "2003-12-13T18:30:02Z"^^xsd:dateTime;
 :summary [  :value "After a course of integration in  
French philosophy,...";

 :type "text/plain" ];
 :content ;
 :translation [ = ;

:representation [ :type "text/html";
  :lang "fr" ]
  ]
   ];

:entry [ a :Entry;
 :title [ :value "Des Robot Atomiques se rebellent";
  :type "text/plain" ];
 :link [  :href ;
  :rel iana:alternate ];
 :id ;
 :updated "2003-12-13T18:30:02Z"^^xsd:dateTime;
 :content ;
 :translation [ = ;

   :representation [ :type "text/html";
 :lang "en" ]
  ];

   ];
] .


3. The suggestion by James Holderness

Alternatively you could create a separate Atom document for each
entry and then each entry in the feed would include a @rel='self'
link pointing to their corresponding document with
@type='application/atom+xml' as well as @rel='alternate' links
pointing to the Atom documents of any translations (also with
@type='application/atom+xml').


I suppose this would be a way to identify particular entries with  
urls so that they could be identified directly, and so the entries  
would be specified as translations of each other. In N3 this would be  
something like my second example, but we use urls to identify the  
entries and place the content at the url location.


That seems more appropriate when one wants to point to a translation
that translates both the content and the title and the summary (and
the metadata?).


[ a :Feed, :Version;
:entry ;
:entry 
] .

http://example.org/2003/12/13/en/atom03.atom would return the  
following representation

-8<---
  <> a :Entry;
 :title [ :value "Atom-Powered Robots Run Amok";
  :type "text/plain" ];
 :link [  :href ;
  :rel iana:alternate ];
 :id ;
 :updated "2003-12-13T18:30:02Z"^^xsd:dateTime;
 :summary [  :value "After a course of integration in  
French philosophy,...";

 :type "text/plain" ];
 ext:translation .

-8<---

http://example.org/2003/12/13/fr/atom03.atom would return the  
following representation

-8<---
  <> a :Entry;
 :title [ :value "Des Robot Atomiques se rebellent";
  :type "text/plain" ];
 :link [  :href ;
  :rel iana:alternate ];
 :id ;
 :updated "2003-12-13T18:30:02Z"^^xsd:dateTime;
 :summary [  :value "Apres un cours d'integration

Re: [Fwd: Re: todo: add language encoding information]

2005-12-22 Thread James M Snell


Hmmm.. interesting thought, hadn't considered that.

rel="self" should always point to *this* document, and never to some 
other document, but if the document referenced is the same document just 
in a different language, then it is possible?  Good thinking but I'm not 
sure if it's legal according to the spec.


 
 <feed>
   ...
   <link rel="self" href="http://.../thefeed?lang=fr"/>
   <link rel="self" href="http://.../thefeed?lang=de"/>
   <entry>
     <id>tag:example.org,2005:some_entry</id>
     ...
   </entry>
 </feed>

- James

A. Pagaltzis wrote:

* James M Snell <[EMAIL PROTECTED]> [2005-12-22 19:30]:

To indicate that the feeds were translations of one another, a
new  "translation" link rel could be established on the feed
level


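  <feed>
    ...
    <link rel="translation" href="http://.../thefeed?lang=fr"/>
    <entry>
      <id>tag:example.org,2005:some_entry</id>
      ...
    </entry>
  </feed>

  <feed>
    ...
    <link rel="translation" href="http://.../thefeed?lang=de"/>
    <entry>
      <id>tag:example.org,2005:some_entry</id>
      ...
    </entry>
  </feed>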



Is that even necessary? Wouldn’t @rel='self' already work here?

Regards,




Re: [Fwd: Re: todo: add language encoding information]

2005-12-22 Thread James M Snell


In that case, the only viable solution is an extension element.  This is 
similar to the Thread Extension use case in which replies may appear 
within the same feed as the original.


Here's how I would do it:

1. Introduce a new  element.

  

2. Use rel="alternate", hreflang="{otherlanguage}" to point to other 
documents that may contain translations.  I would prefer a "translation" 
link rel but agree that reuse of alternate is better.


- James

Henry Story wrote:
Yes. That is one solution. But what we are looking for is how one can 
state that two entries in the same feed are translations of one another.


Henry

On 22 Dec 2005, at 20:52, James M Snell wrote:



Hmmm.. interesting thought, hadn't considered that.

rel="self" should always point to *this* document, and never to some 
other document, but if the document referenced is the same document 
just in a different language, then it is possible?  Good thinking but 
I'm not sure if it's legal according to the spec.


 
 <feed>
   ...
   <link rel="self" href="http://.../thefeed?lang=fr"/>
   <link rel="self" href="http://.../thefeed?lang=de"/>
   <entry>
     <id>tag:example.org,2005:some_entry</id>
     ...
   </entry>
 </feed>

- James

A. Pagaltzis wrote:

* James M Snell <[EMAIL PROTECTED]> [2005-12-22 19:30]:

To indicate that the feeds were translations of one another, a
new  "translation" link rel could be established on the feed
level


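  <feed>
    ...
    <link rel="translation" href="http://.../thefeed?lang=fr"/>
    <entry>
      <id>tag:example.org,2005:some_entry</id>
      ...
    </entry>
  </feed>

  <feed>
    ...
    <link rel="translation" href="http://.../thefeed?lang=de"/>
    <entry>
      <id>tag:example.org,2005:some_entry</id>
      ...
    </entry>
  </feed>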


Is that even necessary? Wouldn’t @rel='self' already work here?
Regards,







Re: [Fwd: Re: todo: add language encoding information]

2005-12-22 Thread A. Pagaltzis

* James Holderness <[EMAIL PROTECTED]> [2005-12-22 21:00]:
> I would have thought @rel='alternate' (an alternate version of
> the resource described by the containing element) slightly more
> appropriate than @rel='self' (a resource equivalent to the
> containing element). There's also the restriction in section
> 4.1.1 that suggest that the 'self' link is the "preferred URI
> for retrieving Atom Feed Documents representing this Atom
> feed". I think that could be a bit confusing when mixed with
> hreflang.

Yeah, agreed. In retrospect I can’t think of why I picked `self`;
`alternate` is clearly the correct choice. D’oh.

Regards,
-- 
Aristotle Pagaltzis // 



Re: todo: add language encoding information

2005-12-22 Thread James Holderness


Henry Story wrote:
> I still think it would be good to be able to have two entries in one feed
> and be able to state that they are translations of one another. I don't
> think that putting them in different feeds is going to cover all the
> cases. See below.


Fair enough. Simon certainly seemed to have a valid use case for that.

> I still would suggest it be better to have different ids for the
> translations so that they can be put in the same feed.


I agree. I never said so in my original message but even if the entries are 
in different feeds I think their ids should be different. An identical id to 
me implies an identical entry (which an Atom processor can assume is a 
duplicate or an update).


> It should be possible then somehow to state that one entry is a
> translation of the other. Is there a way to do what I was trying to
> state here in N3? This is really the problem that Simon Phipps was
> looking at.


Why not just have @rel='alternate' links on each atom:entry which point to
their translations? Now I realise it's not possible to point to a specific
entry in a feed, but you can point to its html permalink with
@type='text/html'. Alternatively you could create a separate Atom document
for each entry and then each entry in the feed would include a @rel='self'
link pointing to their corresponding document with
@type='application/atom+xml' as well as @rel='alternate' links pointing to
the Atom documents of any translations (also with
@type='application/atom+xml').
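
A sketch of what generating such an entry might look like, assuming Python's xml.etree.ElementTree. The function name entry_with_translations, the entry id and the document URLs are placeholders for illustration only.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
ATOM_TYPE = "application/atom+xml"

# Serialise Atom elements with a default namespace instead of an ns0: prefix.
ET.register_namespace("", ATOM_NS)


def entry_with_translations(entry_id, self_href, translations):
    """Build an atom:entry with one self link and one alternate link per language.

    translations maps a language tag to the URL of that language's
    Atom Entry Document.
    """
    entry = ET.Element(f"{{{ATOM_NS}}}entry")
    ET.SubElement(entry, f"{{{ATOM_NS}}}id").text = entry_id
    ET.SubElement(entry, f"{{{ATOM_NS}}}link",
                  {"rel": "self", "type": ATOM_TYPE, "href": self_href})
    for lang, href in sorted(translations.items()):
        ET.SubElement(entry, f"{{{ATOM_NS}}}link",
                      {"rel": "alternate", "type": ATOM_TYPE,
                       "hreflang": lang, "href": href})
    return entry


# Hypothetical id and URLs.
entry = entry_with_translations(
    "tag:example.org,2005:some_entry/en",
    "http://example.org/2003/12/13/en/atom03.atom",
    {"fr": "http://example.org/2003/12/13/fr/atom03.atom"},
)
print(ET.tostring(entry, encoding="unicode"))
```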


Regards
James



Re: [Fwd: Re: todo: add language encoding information]

2005-12-22 Thread Henry Story


Yes. That is one solution. But what we are looking for is how one can  
state that two entries in the same feed are translations of one another.


Henry

On 22 Dec 2005, at 20:52, James M Snell wrote:



Hmmm.. interesting thought, hadn't considered that.

rel="self" should always point to *this* document, and never to  
some other document, but if the document referenced is the same  
document just in a different language, then it is possible?  Good  
thinking but I'm not sure if it's legal according to the spec.


 
 <feed>
   ...
   <link rel="self" href="http://.../thefeed?lang=fr"/>
   <link rel="self" href="http://.../thefeed?lang=de"/>
   <entry>
     <id>tag:example.org,2005:some_entry</id>
     ...
   </entry>
 </feed>

- James

A. Pagaltzis wrote:

* James M Snell <[EMAIL PROTECTED]> [2005-12-22 19:30]:

To indicate that the feeds were translations of one another, a
new  "translation" link rel could be established on the feed
level


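  <feed>
    ...
    <link rel="translation" href="http://.../thefeed?lang=fr"/>
    <entry>
      <id>tag:example.org,2005:some_entry</id>
      ...
    </entry>
  </feed>

  <feed>
    ...
    <link rel="translation" href="http://.../thefeed?lang=de"/>
    <entry>
      <id>tag:example.org,2005:some_entry</id>
      ...
    </entry>
  </feed>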


Is that even necessary? Wouldn’t @rel='self' already work here?
Regards,





Re: [Fwd: Re: todo: add language encoding information]

2005-12-22 Thread James Holderness


A. Pagaltzis wrote:

* James M Snell <[EMAIL PROTECTED]> [2005-12-22 19:30]:

To indicate that the feeds were translations of one another, a
new  "translation" link rel could be established on the feed
level


Is that even necessary? Wouldn’t @rel='self' already work here?


I would have thought @rel='alternate' (an alternate version of the resource 
described by the containing element) slightly more appropriate than 
@rel='self' (a resource equivalent to the containing element). There's also 
the restriction in section 4.1.1 that suggests that the 'self' link is the 
"preferred URI for retrieving Atom Feed Documents representing this Atom 
feed". I think that could be a bit confusing when mixed with hreflang.


Either way though, I agree that a new rel value isn't necessary.
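
As a rough illustration of the @rel='alternate' plus hreflang approach at the feed level, here is a Python sketch that reads a feed and collects the links to its translated feeds. The sample feed, its example.org URLs and the function name translated_feeds are all assumptions, not taken from the thread.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

# Assumed sample feed: an English feed pointing at French and German versions.
FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <id>tag:example.org,2005:feed</id>
  <link rel="self" href="http://example.org/thefeed?lang=en"/>
  <link rel="alternate" type="application/atom+xml" hreflang="fr"
        href="http://example.org/thefeed?lang=fr"/>
  <link rel="alternate" type="application/atom+xml" hreflang="de"
        href="http://example.org/thefeed?lang=de"/>
</feed>"""


def translated_feeds(feed_xml):
    """Map hreflang -> href for feed-level alternate links that are Atom feeds."""
    root = ET.fromstring(feed_xml)
    result = {}
    for link in root.findall(f"{ATOM}link"):
        # In RFC 4287 a link without a rel attribute is read as rel="alternate".
        rel = link.get("rel", "alternate")
        if (rel == "alternate"
                and link.get("type") == "application/atom+xml"
                and link.get("hreflang")):
            result[link.get("hreflang")] = link.get("href")
    return result


print(translated_feeds(FEED))
```

The 'self' link is skipped on purpose, so it keeps its meaning as the preferred URI for this feed.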

Regards
James



Re: todo: add language encoding information

2005-12-22 Thread Henry Story


I still think it would be good to be able to have two entries in one  
feed and be able to state that they are translations of one another.  
I don't think that putting them in different feeds is going to cover  
all the cases. See below.


On 22 Dec 2005, at 17:45, James M Snell wrote:
One possibility for this in Atom is to provide multiple Atom  
documents, each covering their own language.  Given that the date  
restriction only covers Atom entries within the *same* feed  
document, the following would be perfectly acceptable:



  ...
  
tag:example.org,2005:some_entry
...
  



  ...
  
tag:example.org,2005:some_entry
...
  




[note I change my mind on this half way through the thought]

Ah. I did not read your post carefully enough when posting my  
previous reply to James Holderness.
I don't in fact think this is ok. If the date restriction covers  
entries inside the same feed (which can be millions of entries long  
btw) it has to do so for some good reason. Or else why put the  
restriction in there at all? Now that atom is final, and we cannot  
change the restriction I think we had better assume it is there for a  
very good reason. That will make us all look more intelligent than  
otherwise, for one.


So my best explanation is that the id at a time identifies one and  
only one entry - universally, semantically if you will. If books had  
isbn numbers that persisted over editorial changes then this would be  
equivalent to our id. By using the isbn number of a book we could  
then refer to all the transformations of that book. But it would not  
make sense perhaps to have two different book versions published at  
the same time, with the same isbn number... (no that makes  
sense?!)... by the same publisher... Oops. That looks like the feed  
has taken the role of the publisher in the analogy...


(ponders)

Ok. So that would make me change my mind. It should be possible to  
have two entries with the same id and the same updated time stamp but  
different content in two different feeds. This would suggest that the  
feed is really part of the identity of the entry (and that the motto  
"it's the entry stupid" is wrong).


To indicate that the feeds were translations of one another, a new  
"translation" link rel could be established on the feed level



  ...
  http://.../thefeed?lang=fr";
  
tag:example.org,2005:some_entry
...
  



  ...
  http://.../thefeed?lang=de";
  
tag:example.org,2005:some_entry
...
  




I still would suggest it be better to have different ids for the  
translations so that they can be put in the same feed. After all, why  
not allow that? If I am a publisher and I am publishing books, why  
could I not simultaneously publish a book and its translation? It  
should be possible then somehow to state that one entry is a  
translation of the other. Is there a way to do what I was  
trying to state here in N3?  This is really the problem that Simon  
Phipps was looking at.




[ a :Feed, :Version;
:title [ :value "Example Feed";
 :type "text/plain" ];
:link  [ :href ;
 :rel iana:alternate ];
:updated "2003-12-13T18:30:02Z"^^xsd:dateTime;
:author [ :name "John Doe" ];
:id ;
:entry [ a :Entry;
 = _:someid1;
 :title [ :value "Atom-Powered Robots Run Amok";
  :type "text/plain" ];
 :link [  :href ;
  :rel iana:alternate ];
 :id ;
 :updated "2003-12-13T18:30:02Z"^^xsd:dateTime;
 :summary [  :value "After a course of integration in French philosophy,...";
  :type "text/plain" ];
 ext:translation _:someid2;
   ];
:entry [ a :Entry;
 = _:someid2;
 :title [ :value "Des Robot Atomiques se rebellent";
  :type "text/plain" ];
 :link [  :href ;
  :rel iana:alternate ];
 :id ;
 :updated "2003-12-13T18:30:02Z"^^xsd:dateTime;
 :summary [  :value "Apres un cours d'integration en philosophie francaise...";
  :type "text/plain" ];
 ext:translation _:someid1;
   ];
] .




Re: todo: add language encoding information

2005-12-22 Thread Henry Story


Simon makes some good practical points in the message I forwarded  
just previous to this one. But I would like to make some more  
abstract points too for those of you who are more of the Jungian  
introspective/rational character types (most of us here I guess or  
else we would be out surfing on the beach ;-)


On 22 Dec 2005, at 16:34, James Holderness wrote:

Henry Story wrote:
Does Atom allow there to be multiple parallel renditions of a  
blog  entry in different languages?


So it is not really possible to put atom entries with the same id  
and updated time stamp in a  feed (without a SHOULD level  
violation) even  if they are translation of each other. That means  
that the Swiss  would not be able to publish a law with the same  
atom id in french  and in german, as they are obliged to publish  
these at the exact same  time by law. (No linguistic preference)


The Swiss may need to publish laws in multiple languages  
simultaneously, but most users surely don't need to read those laws  
in multiple languages simultaneously. Why waste their bandwidth  
including several translations in one feed when it would be far  
more convenient if you just had a separate feed for each language?


The atom syntax has a very clear restriction which I believe is of  
semantic importance [1]. And that is as I mentioned in my original  
post that:


[[
   If multiple atom:entry elements with the same atom:id value appear in
   an Atom Feed Document, they represent the same entry.  Their
   atom:updated timestamps SHOULD be different.  If an Atom Feed
   Document contains multiple entries with the same atom:id, Atom
   Processors MAY choose to display all of them or some subset of them.
   One typical behavior would be to display only the entry with the
   latest atom:updated timestamp.
]] last para of section 4.1.1 of http://www.ietf.org/rfc/rfc4287

What would be the point of having this restriction if it was not of  
semantic significance? It seems a little odd to tell people that they  
cannot put two entries into a feed with the same updated time stamp,  
but that they can put them into two different feeds and all is ok.  
Why have that restriction in that case? My guess is that the  
restriction was accepted because it worked on the following  
intuition. The atom id is a string that identifies something we can  
think of as a document. It therefore does not make sense to have two  
different incompatible versions of the same document with the same  
time stamp. The atom id is therefore much more restrictive than a URL  
[2]. A URL in web architecture can have any number of different  
representations at the same time. Atom clearly rules this out. So my  
guess is that we should take this seriously. The id is identifying a  
document. And a document cannot be wholly in two languages at once.


Once we accept this, then there is no big problem. We just need the  
translations to have 2 different ids. They are after all two  
different documents. We can then put the two translations into the  
same feed, or different feeds without problem.
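
The SHOULD-level restriction quoted above can be checked mechanically. The following Python sketch lists the (atom:id, atom:updated) pairs that occur more than once in a single feed document. The function name and the sample feed are assumptions for illustration, and timestamps are compared as plain strings, which is a simplification (the same instant written with different offsets would not be detected).

```python
import xml.etree.ElementTree as ET
from collections import Counter

ATOM = "{http://www.w3.org/2005/Atom}"

# Assumed sample: a law published in French and German under one id,
# at the same instant, plus an unrelated entry.
FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry xml:lang="fr">
    <id>tag:example.org,2005:some_law</id>
    <updated>2005-12-22T12:00:00Z</updated>
  </entry>
  <entry xml:lang="de">
    <id>tag:example.org,2005:some_law</id>
    <updated>2005-12-22T12:00:00Z</updated>
  </entry>
  <entry>
    <id>tag:example.org,2005:other</id>
    <updated>2005-12-22T12:00:00Z</updated>
  </entry>
</feed>"""


def same_id_same_updated(feed_xml):
    """Return (id, updated) pairs that occur more than once in one feed document."""
    root = ET.fromstring(feed_xml)
    pairs = Counter(
        (e.findtext(f"{ATOM}id"), e.findtext(f"{ATOM}updated"))
        for e in root.findall(f"{ATOM}entry")
    )
    return sorted(pair for pair, count in pairs.items() if count > 1)


print(same_id_same_updated(FEED))
```

Giving the two translations different ids, as argued above, makes the reported list empty for this feed.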


All we need is to find a way to state that one document is a  
translation of the other. And James Snell's proposal is a good start  
at getting us there (it is a possible translation of the N3 proposal  
in my original e-mail).


If you want to link the various translations together you can add  
one or more link elements at the top of the feed with  
rel="alternate" and hreflang set to the language of the alternate  
feed. If you're feeling really enthusiastic you can include  
alternate links pointing to the translated html pages for each  
entry too.


If my argument above is sound then we should be able to put the  
translations in the same feed. What we want to do then, is find some  
way to state that one entry in the feed is a translation of the other  
entry in the same feed. So though the solution saying that one feed  
is a translation of the other is ok (apart from it breaking what I  
think are the inherent semantics in atom) it is also too general. We  
need more precise tools.


No extensions need to be made to the Atom spec. No wasted bandwidth  
by having multiple translations in a feed that may not be used. But  
you can still publish multiple languages simultaneously by making  
sure you update all feeds simultaneously. You could even publish  
all the feeds at the same URL, serving the correct translation  
based on the HTTP Accept-Language header.


Those are cool ideas. But again, it would be good to have a way to  
specify that an entry in one feed is a translation of a particular  
different entry in the other feed.




I may be missing something here, but it seems to me like a  
reasonable solution to the problem.




Those were my first thoughts too btw. And also Retos. :-)


Regards
James


I hope this type of argument also helps show how the semantics we are  
working on at the atom-owl group can help reveal hidden meanings  
lurking in th

Re: todo: add language encoding information

2005-12-22 Thread Henry Story


Simon is not a member of the atom-syntax list. I imagine his response  
will get through atom syntax moderation at some point, though as we  
are in holiday season, I imagine this could take some time.


I have also changed the policy on [EMAIL PROTECTED] to allow  
posts by non members, forcing them through a moderation stage. This  
option was not available on google groups when I started the group.  
So you non members can cc that group happily now. :-)


Henry

On 22 Dec 2005, at 18:44, Simon Phipps wrote:



On Dec 22, 2005, at 15:34, James Holderness wrote:


Henry Story wrote:
Does Atom allow there to be multiple parallel renditions of a  
blog  entry in different languages?


So it is not really possible to put atom entries with the same id  
and updated time stamp in a  feed (without a SHOULD level  
violation) even  if they are translation of each other. That  
means that the Swiss  would not be able to publish a law with the  
same atom id in french  and in german, as they are obliged to  
publish these at the exact same  time by law. (No linguistic  
preference)


The Swiss may need to publish laws in multiple languages  
simultaneously, but most users surely don't need to read those  
laws in multiple languages simultaneously. Why waste their  
bandwidth including several translations in one feed when it would  
be far more convenient if you just had a separate feed for each  
language?


You're making a couple of dodgy assumptions here:

* You're assuming Atom is just used by an end-user here. Atom feeds  
can also be used to pass data between servers, for backup, or for  
content transfer. Structuring multilingual data as multiple feeds  
without including cross-relationships with the data involves loss  
of meta-data.


* You're assuming multiple languages are always present. When I  
blog, I have some entries translated for me because they relate to  
local issues in a country I am visiting. I won't be making a  
permanent pt-br feed when I visit Brazil, but I may well want the  
translated blog entry to be made available in the syndication feed.
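
One way to picture the second case (a mostly English feed carrying an occasional translated entry) is with xml:lang, which Atom allows on its elements. The Python sketch below reads the language of each entry, falling back to the feed's language when an entry does not state its own. The sample feed and the name entry_languages are assumptions for illustration; it says nothing about which entries are translations of which.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Assumed sample: an English feed with one Brazilian Portuguese entry.
FEED = """<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
  <entry>
    <id>tag:example.org,2005:one</id>
  </entry>
  <entry xml:lang="pt-BR">
    <id>tag:example.org,2005:two</id>
  </entry>
</feed>"""


def entry_languages(feed_xml):
    """Return (id, language) per entry; entries inherit the feed's xml:lang."""
    root = ET.fromstring(feed_xml)
    feed_lang = root.get(XML_LANG)
    return [
        (e.findtext(f"{ATOM}id"), e.get(XML_LANG, feed_lang))
        for e in root.findall(f"{ATOM}entry")
    ]


print(entry_languages(FEED))
```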


It may well be appropriate to have multiple feeds in the case Henry  
cites where there is a stable set of translations, but the more  
general case merits serious consideration. James Snell has an  
interesting proposal that addresses the issue directly with minimal  
change.


S.




Re: [Fwd: Re: todo: add language encoding information]

2005-12-22 Thread A. Pagaltzis

* James M Snell <[EMAIL PROTECTED]> [2005-12-22 19:30]:
> To indicate that the feeds were translations of one another, a
> new  "translation" link rel could be established on the feed
> level
> 
> 
>   ...
>hreflang="fr"
> href="http://.../thefeed?lang=fr";
>   
> tag:example.org,2005:some_entry
> ...
>   
> 
> 
> 
>   ...
>hreflang="de"
> href="http://.../thefeed?lang=de";
>   
> tag:example.org,2005:some_entry
> ...
>   
> 

Is that even necessary? Wouldn’t @rel='self' already work here?

Regards,
-- 
Aristotle Pagaltzis // 



[Fwd: Re: todo: add language encoding information]

2005-12-22 Thread James M Snell


FYI...

-------- Original Message --------
Subject: Re: todo: add language encoding information
Date: Thu, 22 Dec 2005 19:03:25 +0100
From: Henry Story <[EMAIL PROTECTED]>
To: James M Snell <[EMAIL PROTECTED]>
References: <[EMAIL PROTECTED]>

Hi James, you forgot to send this to the Atom syntax list, where it
will be most useful.

Henry
On 22 Dec 2005, at 18:53, James M Snell wrote:

(this may not make it to the atom-owl list as I'm not subscribed to  
that one)


One possibility for this in Atom is to provide multiple Atom  
documents, each covering their own language.  Given that the date  
restriction only covers Atom entries within the *same* feed  
document, the following would be perfectly acceptable:



  ...
  
tag:example.org,2005:some_entry
...
  



  ...
  
tag:example.org,2005:some_entry
...
  


To indicate that the feeds were translations of one another, a new  
"translation" link rel could be established on the feed level



  ...
  http://.../thefeed?lang=fr";
  
tag:example.org,2005:some_entry
...
  



  ...
  http://.../thefeed?lang=de";
  
tag:example.org,2005:some_entry
...
  


- James

Henry Story wrote:
Sorry again to take so long to respond. I have been a little too  
busy to respond recently.

On 4 Dec 2005, at 16:42, Simon Phipps wrote:

On Dec 4, 2005, at 14:33, Henry Story wrote:

I have written my first blog entry in French [1] which has made  
me aware that
it would be very useful to add language information to the atom  
file generated by

BlogEd. A menu beneath the Entry icon would probably do the trick.

Henry

[1] http://bblfish.net/blog/page9.html#t2005_12_04.14_17_02_642


Does Atom allow there to be multiple parallel renditions of a  
blog entry in different languages? I've been discussing with Dave  
Johnson how we might provide a facility in Roller where a blog  
entry might be provided in multiple languages and then select the  
appropriate entry based on the contents of the HTTP GET. If this  
were to be implemented, we'd need a way to express the same  
reality in the Atom feed.
This is a complicated and delicate issue. An atom entry has been  
syntactically limited by the atom spec by the following clause

[[
   If multiple atom:entry elements with the same atom:id value appear in
   an Atom Feed Document, they represent the same entry.  Their
   atom:updated timestamps SHOULD be different.  If an Atom Feed
   Document contains multiple entries with the same atom:id, Atom
   Processors MAY choose to display all of them or some subset of them.
   One typical behavior would be to display only the entry with the
   latest atom:updated timestamp.
]] last para of section 4.1.1 of http://www.ietf.org/rfc/rfc4287
So it is not really possible to put atom entries with the same id  
and updated time stamp in a  feed (without a SHOULD level  
violation) even if they are translation of each other. That means  
that the Swiss would not be able to publish a law with the same  
atom id in french and in german, as they are obliged to publish  
these at the exact same time by law. (No linguistic preference)
If we want to respect this we have to give different translations  
of a document different ids. We should be able to specify that  
they are translations of each other some other way. Using the Atom  
OWL ontology [1] we could do this by adding some translation  
relation between the entries. Once the best way to do this is  
settled and it satisfies our needs, it should be possible to  
put a proposal to the atom syntax group for an extension to their  
xml format.

So perhaps something like this is needed
[ a :Feed, :Version;
:title [ :value "Example Feed";
 :type "text/plain" ];
:link  [ :href <http://example.org/>;
 :rel iana:alternate ];
:updated "2003-12-13T18:30:02Z"^^xsd:dateTime;
:author [ :name "John Doe" ];
:id ;
:entry [ a :Entry;
 :title [ :value "Atom-Powered Robots Run Amok";
  :type "text/plain" ];
 :link [  :href <http://example.org/2003/12/13/en/atom03>;
  :rel iana:alternate ];
 :id ;
 :updated "2003-12-13T18:30:02Z"^^xsd:dateTime;
 :summary [  :value "After a course of integration in French philosophy,...";
 :type "text/plain" ];
 ext:translation 
   ];
:entry [ a :Entry;
 :title [ :value "Des Robot Atomiques se rebellent";
  :type "text/plain" ];
 :link [  :href <http://example.org/2003/12/13/fr/atom03>;
  :rel iana:alternate ];
 :id ;
 :updated "2003-12-13T18:30:02Z"^^xsd:dateTime;
 :summary [  :value "Apre

Re: todo: add language encoding information

2005-12-22 Thread James Holderness


Henry Story wrote:
Does Atom allow there to be multiple parallel renditions of a blog  entry 
in different languages?


So it is not really possible to put atom entries with the same id and 
updated time stamp in a  feed (without a SHOULD level violation) even  if 
they are translation of each other. That means that the Swiss  would not 
be able to publish a law with the same atom id in french  and in german, 
as they are obliged to publish these at the exact same  time by law. (No 
linguistic preference)


The Swiss may need to publish laws in multiple languages simultaneously, but 
most users surely don't need to read those laws in multiple languages 
simultaneously. Why waste their bandwidth including several translations in 
one feed when it would be far more convenient if you just had a separate 
feed for each language?


If you want to link the various translations together you can add one or 
more link elements at the top of the feed with rel="alternate" and hreflang 
set to the language of the alternate feed. If you're feeling really 
enthusiastic you can include alternate links pointing to the translated html 
pages for each entry too.


No extensions need to be made to the Atom spec. No wasted bandwidth by 
having multiple translations in a feed that may not be used. But you can 
still publish multiple languages simultaneously by making sure you update all 
feeds simultaneously. You could even publish all the feeds at the same URL, 
serving the correct translation based on the HTTP Accept-Language header.
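
To make the last idea concrete, here is a simplified Python sketch of choosing which translated feed to serve from an Accept-Language header value. The function name pick_language, the default of "en" and the matching rules are assumptions for illustration: it handles q-values and falls back from a tag like "fr-CH" to "fr", but it is not a full implementation of HTTP language negotiation.

```python
def pick_language(accept_language, available, default="en"):
    """Pick the best available language for an Accept-Language header value.

    `available` is a set of lowercase language tags the server can serve.
    Simplified matching: exact tag first, then the primary subtag.
    """
    prefs = []
    for position, item in enumerate(accept_language.split(",")):
        parts = item.strip().split(";")
        tag = parts[0].strip().lower()
        if not tag:
            continue
        q = 1.0  # a range without an explicit q-value has weight 1
        for param in parts[1:]:
            name, _, value = param.strip().partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        # Sort by descending q, then by order of appearance in the header.
        prefs.append((-q, position, tag))
    for neg_q, _, tag in sorted(prefs):
        if neg_q == 0:
            continue  # q=0 means "not acceptable"
        if tag in available:
            return tag
        primary = tag.split("-")[0]
        if primary in available:
            return primary
    return default


print(pick_language("fr-CH, fr;q=0.9, en;q=0.8, de;q=0.7", {"en", "de"}))
```

A server that varies the feed this way would normally also send a Vary: Accept-Language response header, so that caches do not hand one reader another reader's translation.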


I may be missing something here, but it seems to me like a reasonable 
solution to the problem.


Regards
James



Re: todo: add language encoding information

2005-12-22 Thread Henry Story


Sorry again to take so long to respond. I have been a little too busy  
to respond recently.


On 4 Dec 2005, at 16:42, Simon Phipps wrote:

On Dec 4, 2005, at 14:33, Henry Story wrote:

I have written my first blog entry in French [1] which has made me  
aware that
it would be very useful to add language information to the atom  
file generated by

BlogEd. A menu beneath the Entry icon would probably do the trick.

Henry

[1] http://bblfish.net/blog/page9.html#t2005_12_04.14_17_02_642


Does Atom allow there to be multiple parallel renditions of a blog  
entry in different languages? I've been discussing with Dave  
Johnson how we might provide a facility in Roller where a blog  
entry might be provided in multiple languages and then select the  
appropriate entry based on the contents of the HTTP GET. If this  
were to be implemented, we'd need a way to express the same reality  
in the Atom feed.


This is a complicated and delicate issue. An atom entry has been  
syntactically limited by the atom spec by the following clause


[[
   If multiple atom:entry elements with the same atom:id value appear in
   an Atom Feed Document, they represent the same entry.  Their
   atom:updated timestamps SHOULD be different.  If an Atom Feed
   Document contains multiple entries with the same atom:id, Atom
   Processors MAY choose to display all of them or some subset of them.
   One typical behavior would be to display only the entry with the
   latest atom:updated timestamp.
]] last para of section 4.1.1 of http://www.ietf.org/rfc/rfc4287

So it is not really possible to put atom entries with the same id and  
updated time stamp in a  feed (without a SHOULD level violation) even  
if they are translation of each other. That means that the Swiss  
would not be able to publish a law with the same atom id in french  
and in german, as they are obliged to publish these at the exact same  
time by law. (No linguistic preference)


If we want to respect this we have to give different translations of  
a document different ids. We should be able to specify that they are  
translations of each other some other way. Using the Atom OWL  
ontology [1] we could do this by adding some translation relation  
between the entries. Once the best way to do this is settled and  
it satisfies our needs, it should be possible to put a proposal  
to the atom syntax group for an extension to their xml format.


So perhaps something like this is needed

[ a :Feed, :Version;

:title [ :value "Example Feed";
 :type "text/plain" ];
:link  [ :href ;
 :rel iana:alternate ];
:updated "2003-12-13T18:30:02Z"^^xsd:dateTime;
:author [ :name "John Doe" ];
:id ;

:entry [ a :Entry;
 :title [ :value "Atom-Powered Robots Run Amok";
  :type "text/plain" ];
 :link [  :href ;
  :rel iana:alternate ];
 :id ;
 :updated "2003-12-13T18:30:02Z"^^xsd:dateTime;
 :summary [  :value "After a course of integration in French philosophy,...";
 :type "text/plain" ];

 ext:translation 

   ];

:entry [ a :Entry;
 :title [ :value "Des Robot Atomiques se rebellent";
  :type "text/plain" ];
 :link [  :href ;
  :rel iana:alternate ];
 :id ;
 :updated "2003-12-13T18:30:02Z"^^xsd:dateTime;
 :summary [  :value "Apres un cours d'integration en philosophie francaise...";
 :type "text/plain" ];


 ext:translation 

   ];
] .


The ext:translation relation above relates a :Entry to an :EntryId  
which is to say that it cannot distinguish between one version of an  
entry and another. So we would either be saying that each version of  
one entry is always going to be a translation of the other, or that  
there is some version of the other entry that is a translation of  
this one (which is a little too vague for comfort).
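
To make the id-level relation easier to picture, here is a small Python sketch that models entries with distinct ids and a set of translation links, and groups the entries that are connected by those links. The Entry class, the function name and the example ids are hypothetical. Treating the relation as symmetric is also an assumption of this sketch, not something the ontology states.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    entry_id: str
    lang: str
    title: str
    translations: set = field(default_factory=set)  # ids of translated entries


def translation_groups(entries):
    """Group entries connected by translation links, treating links as symmetric."""
    by_id = {e.entry_id: e for e in entries}
    neighbours = {e.entry_id: set() for e in entries}
    for e in entries:
        for other in e.translations:
            if other in by_id:  # ignore links to entries not in this feed
                neighbours[e.entry_id].add(other)
                neighbours[other].add(e.entry_id)
    groups, seen = [], set()
    for start in by_id:
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:  # depth-first walk over the translation links
            current = stack.pop()
            if current in group:
                continue
            group.add(current)
            stack.extend(neighbours[current] - group)
        seen |= group
        groups.append(sorted(group))
    return groups


entries = [
    Entry("tag:example.org,2003:atom03-en", "en", "Atom-Powered Robots Run Amok",
          {"tag:example.org,2003:atom03-fr"}),
    Entry("tag:example.org,2003:atom03-fr", "fr", "Des Robot Atomiques se rebellent"),
    Entry("tag:example.org,2003:other", "en", "Something else"),
]
print(translation_groups(entries))
```

Because the grouping works on ids only, it has the limitation described above: it cannot say which version of one entry is a translation of which version of the other.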


We could be more precise by giving each entry version its unique id,  
in which case we could say that this particular entry is a  
translation of that particular entry, which would be more precise and  
satisfactory. In N3:



[ a :Feed, :Version;

:title [ :value "Example Feed";
 :type "text/plain" ];
:link  [ :href ;
 :rel iana:alternate ];
:updated "2003-12-13T18:30:02Z"^^xsd:dateTime;
:author [ :name "John Doe" ];
:id ;

:entry [ a :Entry;

 = _:someid1;


 :title [ :value "Atom-Powered Robots Run Amok";
  :type "text/plain" ];
 :link [  :href ;
  :rel iana:alternate ];
 :id ;
 :updated "2003-12-