Re: Current and permalink link rel values

2007-02-23 Thread Antone Roundy


On Feb 23, 2007, at 7:16 AM, Elliotte Harold wrote:
I'd like to add multiple links to my feed for both the current  
version of the story and the permalink. E.g.

...

<link rel="current" href="http://www.cafeconleche.org/#February_22_2007_30633"/>
<link rel="permalink" href="http://www.cafeconleche.org/oldnews/news2007February22.html#February_22_2007_30633"/>


Both of those would probably be best described as "alternate" links.   
The second one in particular is what "alternate" was intended to be  
used for.  However, RFC 4287 contains the following:


   o  atom:entry elements MUST NOT contain more than one atom:link
      element with a rel attribute value of "alternate" that has the
      same combination of type and hreflang attribute values.

So you couldn't keep both as "alternate" links.  In my opinion, you  
should use the second one (the longer lasting one) only, and omit the  
first (which is going to become invalid as soon as the entry falls  
off the page anyway -- anyone who used it to get to your page and  
bookmarked it, and anyone who follows it from a cached copy of your  
feed isn't going to be able to find the entry without a lot of  
needless digging through your archives). You should have a link to  
http://www.cafeconleche.org/ at the feed level.  While that won't  
link directly to that entry, it'll get people to it as long as it's  
on that page.
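To make that concrete, here's roughly what I'm suggesting (just a sketch--required elements omitted, and the URIs are the ones from your example):

<feed xmlns="http://www.w3.org/2005/Atom">
  <!-- feed-level link to the page where current entries appear -->
  <link rel="alternate" type="text/html" href="http://www.cafeconleche.org/"/>
  ...
  <entry>
    ...
    <!-- a single "alternate" link per entry, pointing at the long-lived location -->
    <link rel="alternate" type="text/html"
          href="http://www.cafeconleche.org/oldnews/news2007February22.html#February_22_2007_30633"/>
  </entry>
</feed>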




Re: Query re: support of Media RSS extensions inside Atom feeds

2007-02-10 Thread Antone Roundy


On Feb 9, 2007, at 9:23 PM, John Panzer wrote:
Does anyone know of any issues with placing Yahoo! Media RSS  
extensions (which seem to fit the requirements for Atom extensions  
to me) inside an Atom feed?  Secondarily, do feed readers in  
general recognize MRSS inside either Atom or RSS?  Looking for  
field experience/implementor intentions here.


CaRP partially supports Media RSS in RSS (it doesn't directly support  
Atom at all, and Grouper, the companion script that converts Atom to  
RSS for it, doesn't yet have Media RSS support, though I may add it in  
the next update).  It only looks at elements pointing to images  
(@type="image/*") and their types, heights and widths.  I added this  
in response to user requests--primarily, I believe, for use with  
Flickr feeds.
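For reference, the sort of markup I'm talking about--a rough sketch of a Media RSS image reference inside an Atom entry (the URL is invented, the exact element set Flickr emits may differ, and check the Media RSS spec for the current namespace URI):

<entry xmlns="http://www.w3.org/2005/Atom"
       xmlns:media="http://search.yahoo.com/mrss/">
  ...
  <media:content url="http://example.com/photos/1234.jpg"
                 type="image/jpeg" width="1024" height="768"/>
</entry>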


Antone



Re: AD Evaluation of draft-ietf-atompub-protocol-11

2006-12-16 Thread Antone Roundy


I'm not subscribed to the APP mailing list, so hopefully this isn't  
all redundant:


On 12/15/06, Lisa Dusseault <[EMAIL PROTECTED]> wrote:
A model where servers aren't required to keep such information won't, in practice, allow that kind of extension. If clients can't rely on their markup getting stored, then clients can't extend Atom unilaterally using XML markup.


There are two different issues here, which I think have been mentioned, but which might bear being clearly stated:


1) Do servers have to keep all extension data?

2) Can a server accept an entry while discarding some or all  
extension data, or do they have to reject the entry and return an  
error code?


I think the answer to the first question is clearly no--servers  
shouldn't be required to store all arbitrary data that is sent to  
them.  So the questions are:


1) Which hurts more--data loss or rejected entries?

2) Is there any way to reduce that pain?

The pain of data loss is obvious--the data is lost.  The pain of  
rejected entries is having to fix and repost them or decide not to  
try again.


In either case, it might be useful to be able to query the server  
somehow to find out what it will and won't preserve.  If data is  
discarded, you can figure that out after the fact by loading the  
resulting entry and checking whether the data is all there, but one  
might prefer to know ahead of time if something is going to be lost  
in order to be able to decide whether to post it or not.  If the  
entry is just going to be rejected, unless there's a way for the  
server to communicate exactly which data it had issues with, fixing  
the data to make it acceptable could be extremely difficult ("Hmm,  
I'll leave this data out and try again...nope, still rejected. I'll  
put that back in and leave this out...nope. I'll take both  
out...nope. I'll put both back in and take yet another piece of data  
out...").


So, how might a client query a server to see what it will preserve?   
A few possibilities:


1) Have some way to request some sort of description of what will and  
won't be preserved and what might be altered.  I don't know how one  
would go about responding to such an inquiry except to basically send  
back a list of what will be preserved, including some way to say  
"I'll preserve unknown attributes here", "I'll preserve unknown child  
elements (and their children) here", "I'll store up to 32767 bytes  
here", etc.  If there is any known extension markup that a server  
wants to explicitly state that it won't preserve, there may need to  
be a way to do that too.


2) Have a way to do a "test post", where one posts the data one is  
considering posting (or something structurally identical), but says  
"don't store this--just tell me what you WOULD store".  The response  
could include what would be returned if one were to load the data  
after it being stored, or it could be some sort of list of anything  
that would be discarded or altered.


3) (I get the impression this could be done without requiring changes--is this the sort of process that has already been selected?)  Post the data as a draft, reload it to see if it's all still there.  If so, or if what has been preserved is acceptable, change its status to "published" or whatever it's called.  If not, delete it and give up, or take whatever other action is appropriate.
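For what it's worth, a rough sketch of the #3 workflow using the protocol's draft control (the app namespace URI and the myx:rating element are just placeholders for illustration--check the current draft for the exact form):

<entry xmlns="http://www.w3.org/2005/Atom"
       xmlns:app="http://purl.org/atom/app#"
       xmlns:myx="http://example.org/myx">
  <title>Test entry</title>
  ...
  <app:control><app:draft>yes</app:draft></app:control>
  <myx:rating>5</myx:rating>
</entry>

POST that to the collection, GET the member resource back, check whether myx:rating (and anything else you care about) survived, and only then set app:draft to "no"--or DELETE the entry if too much was lost.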



My impression is that data loss would be less painful and more easily  
dealt with than rejection of entries that won't be completely preserved.


...but I haven't followed the discussion, so what do I know.



Re: PaceEntryMediatype

2006-12-06 Thread Antone Roundy


On Dec 6, 2006, at 4:26 PM, Jan Algermissen wrote:
Most feed readers know how to handle feeds, but have no idea how  
to handle entries.


So they should be fixed, should they not?


If the purpose of a feed reader is to subscribe to feeds and bring  
new and updated entries to the user's attention, then if they don't  
also handle the monitoring of single entry documents (interesting to  
some people in some cases, but I doubt interesting to most people),  
that's not necessarily something that needs fixing.



They seem to only have implemented half a media type.


...or they've implemented all of what should be covered by one media  
type.




Re: PaceEntryMediatype

2006-12-06 Thread Antone Roundy


On Dec 6, 2006, at 12:14 PM, Jan Algermissen wrote:
Following a link is not the same thing as subscribing to something.  
The act of subscribing is a local activity performed by the user  
agent. What you do when you follow the link to a feed is a GET.  
Your agent then decides if subscribing to that resource is a good  
idea. To make that decision, the agent has to look at the  
representation, and then it is insignificant overhead to see if the  
thing returns <feed> or <entry>.


...

Maybe I want to monitor a single media resource; an Atom media  
entry would be an ideal way to do so (I'd rather look at the  
metadata than at the media resource upon each poll).


 I'd say: stick with the one media type that is currently there -  
there is no problem, just misconception about what it means to  
subscribe.


A few reasons why a user agent might want to be able to tell the  
difference between a link to a feed and a link to an entry beforehand  
is in order to:


1) be able to ignore the link to the entry (ie. not present it to the  
user) if the user agent doesn't handle entry documents (rather than  
presenting it as a "subscribe" link, only to have to say "sorry, it's  
not a feed" after the user tries to subscribe).


2) be able to say "subscribe" to links to feeds, and "monitor" links  
to entries (the user may not be interested in monitoring a single  
entry for changes--if they can't tell that that's what the link is  
for, they may end up needlessly doing so but think that they've added  
another feed to their subscription list).





Re: PaceEntryMediatype

2006-12-01 Thread Antone Roundy


On 12/1/06, Mark Baker <[EMAIL PROTECTED]> wrote:

On 11/30/06, Thomas Broyer <[EMAIL PROTECTED]> wrote:
All a media type tells you (non-authoritatively too) is the spec you
need to interpret the document at the other end of the link.  That has
very little to do with the reasons that you might want to follow the
link, subscribe to it, etc..  Which is why you need a mechanism
independent from the media type.  Like link types.


Now that this has sunk in, it makes a lot of sense--the @rel value  
says "you can subscribe to that", "that is an alternative  
representation of this", "that is where you'd go to edit this", and  
so on.  The media type helps the user agent figure out whether it has  
the capability to do those things.  For example, a feed reader that  
only handles RSS could ignore subscription links to resources of type  
"application/atom+xml" (ie. not present the subscription option to  
the user).  The "subscribe to hAtom feed" case where @type is "text/html" might be a little difficult to make a decision on, because  
there's no indication of what microformat is being used by the  
"feed" (or even if there's a microformat in use at all--maybe it  
really is just an HTML page, and "subscribing" to it just means  
"watch for changes to the entire document").  But in the case of bare  
syndication formats, things should be clear enough.


So if it really is possible to do option 5 (new media type for entry  
documents, and @rel values to solve the rest of the issues), and do  
it cleanly, then that'd be my first choice.  If that's doomed (due to  
a need to be backwards compatible with existing practice) to be a  
mess of ambiguities and counter-intuitivities (eg. "alternate" means  
"subscribe" when combined with a syndication type, except when it  
might really mean "alternate" because it points to a feed archive  
document, but anything with "feed" in it always means "subscribe"...)  
then oh my.


One problem that I hadn't really thought clearly about till right now  
is that understanding the nature of the thing linked TO may require  
some understanding of the nature of the thing linked FROM.  For  
example, an "alternate" link from a typical blog homepage to its feed  
really does point to the same thing in an alternative format.  Both  
are live documents in which new data gets added to the top, and old  
data drops off the bottom.  But if you don't know that the webpage is  
a live document, you wouldn't know whether the link pointed to a  
static or live document.  "alternate" is perfectly accurate, but it's  
not helpful enough.  "subscribe" would be much more explicit.


Which raises the question of how to point to a static alternative  
representation of the data currently found in the document.   
"alternate" WOULD be a good word to use for that except that it's  
already being used to point to live feeds.  An option that would  
almost surely cause confusion would be to use "alternative" for  
static alternative representations.  The meaning of "static" wouldn't  
exactly be intuitively clear.  Maybe something more long-winded like  
(oh no! hyphenation!) "static-alternate" would do.  Or would "static  
alternate" (and "alternate static" and "static foo alternate", etc.,  
or perhaps "archive alternate", etc.) be better?  For backwards  
compatibility (at least with UAs that don't expect only one value in  
@rel), "subscribe alternate" (and "alternate subscribe", etc.) could  
be used rather than simply "subscribe".


BTW, am I remembering correctly that "feed" is being promoted for use  
the way I'm considering "subscribe" above?  If it's not already in  
use, I'd think "subscribe" would be much better than "feed", because  
"feed" could as easily mean "archive feed" as "subscription feed"-- 
it's just not explicit enough.


But perhaps this discussion all belongs in a different venue anyway...


But before I end, what about the question of a different media type  
for entry documents?  For the APP accept element issue, it sounds  
like maybe they do.  But for autodiscovery, maybe they don't.   
Perhaps neither @type nor @rel is the place to distinguish, for  
example, between the edit links for entries, their parent feeds,  
their per-entry comment feeds, monolithic comment feeds, etc.  (A  
media type for entry documents would only help with one of those).   
Perhaps that is the domain of @title (title="Edit this entry", etc.)   
Do UAs really need to know the difference, or do only the users need  
to know?  Would making that information machine readable be worth the  
pain involved (rel="edit monolithic parent comments"???)



Okay, that's all I can take for now.



Re: PaceEntryMediatype

2006-11-30 Thread Antone Roundy


Summary of thoughts and questions:

*** Problems with the status quo ***

A) Consumers don't have enough information (without retrieving the  
remote resource) to determine whether to treat a link to an Atom  
document as a link to a live feed, a feed archive, or an entry.  (Is  
it appropriate to poll the link repeatedly?  How should information  
about the link be presented to the user?)


B) APP servers can't communicate whether they will accept feed  
documents or only entry documents.



*** Possible solutions ***

1) Add a type parameter to the existing media type:

+ With the exception of a few details, the documents are all exactly  
the same format (does it contain a feed element, or does it start at  
the entry element, is it a "live" feed document or an archive, etc.),  
so a single media type makes the most sense (definitely for live  
feeds vs. archives, less certainly for feeds vs. entries).


- Some existing applications will ignore the parameter and may handle  
links to non-live-feeds inappropriately


- Some existing applications may not recognize application/atom+xml;type=feed as something appropriate to handle the same way they  
handle application/atom+xml now.


? I haven't been following development of the APP, so forgive my  
ignorance, but can parameters be included in the accept element?
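To make option 1 concrete, a sketch of both uses (the ;type parameter is the hypothetical part, and I'm guessing at the accept element's syntax--that's exactly the question above):

<link rel="alternate" type="application/atom+xml;type=feed"  href="http://example.org/index.atom" />
<link rel="alternate" type="application/atom+xml;type=entry" href="http://example.org/entries/1.atom" />

...and, in a service document, a collection that only accepts entry documents:

<collection href="http://example.org/collection">
  <accept>application/atom+xml;type=entry</accept>
</collection>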



2) Create (a) new media type(s) (whether like application/atomentry+xml or application/atom.entry+xml):


+ Applications that currently treat all cases of application/atom+xml  
the same would ignore non-feed links until they were updated to do  
something appropriate with the new media type.


- Differentiating between live feeds and archives by media type seems  
really wrong since their formats are identical.  This isn't as big a  
negative for entry documents, but it still seems suboptimal to me.


- If a media type were created for archive documents, would APP  
accept including application/atom+xml imply acceptance of archive  
documents too?  Neither yes nor no feels like a satisfying answer.



3) Use @rel values to differentiate:

- That territory is already a bit of a mess, what with "feed" vs.  
"alternate" vs. "alternate feed" vs. "feed alternate" -- why make it  
worse?


+ That territory is already a bit of a mess, what with "feed" vs.  
"alternate" vs. "alternate feed" vs. "feed alternate" -- why not work  
on all these messy problems in the same place?


- That wouldn't help with the APP accept issue.


4) Create a new media type for entry documents, and add a parameter  
to application/atom+xml to differentiate between live and archive  
feeds (and for any other documents that have the identical format,  
but should be handled differently in significant cases).


- Doesn't prevent existing apps that ignore the parameter from  
polling archive documents.


+ Does solve the rest of the problems without the negatives of #2 above.


5) Create a new media type for entry documents, and use @rel values  
to solve the issues that the new media type doesn't solve:


+/- Messy territory


If we were starting from scratch, I'd probably vote for #1.  Since  
we're not, I'd vote for #4 first, and perhaps #5 second, but I'd have  
to think about #5 more first.


Antone



Re: PaceEntryMediatype

2006-11-30 Thread Antone Roundy


On Nov 30, 2006, at 2:13 AM, Jan Algermissen wrote:

On Nov 29, 2006, at 7:22 PM, James M Snell wrote:

One such problem occurs in atom:link and atom:content elements.
Specifically:

  <atom:link href="..." type="application/atom+xml" />
  <atom:content src="..." type="application/atom+xml" />

Given no other information I have no way of knowing whether these are
references to Feed or Entry documents.


And what is the problem with that?


Here's one problem: in this and the autodiscovery case, the UA can't  
tell without fetching the remote resource whether it's appropriate to  
display a "subscribe" link.  In fact, even if the remote resource is  
a feed, it may not be appropriate to subscribe to, because it may be  
an archive document rather than the live end of a feed.


Of the options presented, I'd favor adding a type parameter to  
application/atom+xml.  In addition to "feed" and "entry", we may want  
"archive".




Re: atom license extension (Re: [cc-tab] *important* heads up)

2006-09-06 Thread Antone Roundy


On Sep 6, 2006, at 7:51 AM, James M Snell wrote:
The problem with specifying a per-feed default license is that there is currently no way of explicitly indicating that an entry does not have a license or that any particular entry should not inherit the default feed-level license.


With respect to atom:rights (from RFC 4287 section 4.2.10):

   If an atom:entry element does not contain an atom:rights element,
   then the atom:rights element of the containing atom:feed element, if
   present, is considered to apply to the entry.

Thus, at the entry level, an empty <atom:rights/> would (certainly ought to!) detach a feed-level atom:rights element from the entry without replacing it with anything.  With <link rel="license"/>, I'm not sure how you'd do the same thing.  Is it possible to specify a null URI?  An empty href="" points to the in-scope xml:base URI, right?  Perhaps the specification could define a "null license" URI.


With respect to the issue of aggregate feeds, I had thought that the  
existence of an atom:source element at the entry level blocked any  
"inheritance" of the feed metadata, but looking at RFC 4287, I don't  
see that explicitly stated.  Certainly if atom:source contains  
atom:rights, then that element overrides the feed-level atom:rights  
of the aggregate feed, but if neither atom:source nor atom:entry  
contains an atom:rights element, what then?  Perhaps in that case, the aggregator should add a copy of the source feed's feed-level atom:rights as a child of atom:source (I'd think that preferable to adding it as a child of atom:entry).


On Sep 6, 2006, at 4:38 AM, Thomas Roessler wrote:

So, here's the proposal:

- Use <link rel="license"> for entry licenses -- either on the feed
  level, setting a default analogous to what atom:rights does, or on
  the element level.

- Introduce <link rel="collection-license"> (or whatever else you
  find suitable) for licenses about the collection, to be used only
  on the feed level.


If there's a @rel="license" at the feed level, but no rel="collection-license", does the @rel="license" also become a "collection-license"?  (People who don't read the spec would probably think so.)   
If there is no license for the collection, but one wishes to specify  
a default license for the entries, a "null license" would once again  
be useful.
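To illustrate the combination being discussed (rel="collection-license" and the "null license" URI are hypothetical; the CC URIs are just examples):

<feed xmlns="http://www.w3.org/2005/Atom">
  <!-- license for the collection as a whole -->
  <link rel="collection-license" href="http://creativecommons.org/licenses/by/2.5/"/>
  <!-- default license for the entries -->
  <link rel="license" href="http://creativecommons.org/licenses/by-nc/2.5/"/>
  ...
  <entry>
    <!-- overrides the feed-level default -->
    <link rel="license" href="http://creativecommons.org/licenses/by-sa/2.5/"/>
    ...
  </entry>
  <entry>
    <!-- hypothetical "null license" URI: detaches this entry from the default -->
    <link rel="license" href="http://example.org/null-license"/>
    ...
  </entry>
</feed>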


Antone



Re: clarification: "escaped"

2006-07-26 Thread Antone Roundy


On Jul 26, 2006, at 3:19 AM, Bill de hÓra wrote:

A. Pagaltzis wrote:

* Robert Sayre <[EMAIL PROTECTED]> [2006-07-26 01:45]:

On 7/25/06, Bill de hÓra <[EMAIL PROTECTED]> wrote:

And I didn't know whether Atom code could get away with
escaping < and &.

 hmm

that is an XML fatal error, no doubt, as the ampersand before
"nbsp" must be escaped.

But he did say “escaping < and &”, so it would be. I’m not sure
what Bill’s question even is.


What do I escape, so I know what to unescape?


The point is that after your XML parser has unescaped the content of the element, it should be suitable for handling as HTML.  Escape whatever you have to in order to ensure that the consumer gets HTML from their XML parser.  Converting & to &amp; and < to &lt; is sufficient (assuming that you've started with HTML--if you've started with plain text, then you need to double escape, but in that case, you should be using type="text" anyway to save yourself the trouble).  You could also convert > to &gt;, " to &quot;, ' to &apos;, and any other characters to numeric character references.  Or you could put the whole thing in a CDATA block.
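In other words, something like this (a trivial sketch; the HTML fragment is invented):

<!-- the HTML being shipped is:  <p>Black &amp; white</p> -->
<content type="html">&lt;p&gt;Black &amp;amp; white&lt;/p&gt;</content>

<!-- or, equivalently, wrap it in a CDATA section -->
<content type="html"><![CDATA[<p>Black &amp; white</p>]]></content>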




Re: http://www.intertwingly.net/wiki/pie/XhtmlContentDivConformanceTests



On Jun 28, 2006, at 3:10 PM, Robert Sayre wrote:

The content in the entries below should be handled the same way:

<entry xml:base="http://example.com/foo/">
  ...
  <content type="xhtml">
    <xhtml:div xml:base="http://example.com/feu/">axe</xhtml:div>
  </content>
</entry>

<entry xml:base="http://example.com/foo/">
  ...
  <content type="xhtml" xml:base="http://example.com/">
    <xhtml:div xml:base="feu/">axe</xhtml:div>
  </content>
</entry>



Of course the end result of both should be identical.  Is that what  
you mean by "should be handled the same way"?  The question is, if  
the xhtml:div is stripped by the library before handing it off to the  
app, how is the app going to get the attributes that were on the  
div?  Is the library going to push those values down into the content  
or act as if they were on the atom:content element (or something  
similar to that)?


BTW, it just occurred to me that pushing them down into the content  
won't work.  Here's an example where that would fail:



  ...
  
  Oui!
  


Notice that there are no elements inside the xhtml:div for xml:lang  
to be attached to (and even if there were any, any text appearing  
outside of them would not have the correct xml:lang attached to it).


So it looks like the options (both of which a single library could  
support, of course) are:


* Strip the div, but provide a way to get the attributes that were on it
or
* Leave the div



Re: http://www.intertwingly.net/wiki/pie/XhtmlContentDivConformanceTests



On Jun 28, 2006, at 12:06 PM, A. Pagaltzis wrote:

* James M Snell <[EMAIL PROTECTED]> [2006-06-28 20:00]:

A. Pagaltzis wrote:

* James M Snell <[EMAIL PROTECTED]> [2006-06-28 14:35]:

Hiding the div completely from users of Abdera would mean
potentially losing important data (e.g. the div may contain
an xml:lang or xml:base) or forcing me to perform additional
processing (pushing the in-scope xml:lang/xml:base down to
child elements of the div.


How is that any different from having to find ways to pass
any in-scope xml:lang/xml:base down to API clients when the
content is type="html" or type="text"? I hope you didn’t punt
on those?


Our Content interface has methods for getting to that
information.


Then stripping the `div` is not an issue, is it?


Consider this:

<entry xml:base="http://example.com/foo/">
  ...
  <content type="xhtml">
    <xhtml:div xml:base="http://example.com/feu/">axe</xhtml:div>
  </content>
</entry>


Whether there's a problem depends on whether one requests the  
xml:base, xml:lang, or whatever for the atom:content element itself  
or for the CONTENT OF the atom:content element, in which case the  
library could return the values it got from the xhtml:div.  Except in  
unusual cases like this, the result would be identical.


Certainly a distinction could be made between how an XML library  
would handle this vs. how an Atom library would handle it.  An Atom  
processing library might be expected to be able to do things like:


* give me the raw contents of the atom:content element
* give me the contents of the atom:content element converted to well- 
formed XHTML (whether it started as text, escaped tag soup, or inline  
xhtml)


In the former case, keeping the div feels like the right thing to do-- 
the consuming app would have to know to remove it.  In the latter  
case, removing the div from xhtml content feels like the right thing  
to do.  But unless the library gives me the xml:base, for example,  
which applies to the content of the atom:content element (as  
converted to well-formed xhtml or whatever), as opposed to the  
xml:base which applied to the atom:content element itself, there's  
potential for trouble.


...now that I think about it, if the library always returns the  
xml:base which applies to the content of the element, that could  
cause trouble too:


<entry xml:base="http://example.com/">
  ...
  <content type="xhtml">
    <xhtml:div xml:base="feu/"><xhtml:a href="axe.html">axe</xhtml:a></xhtml:div>
  </content>
</entry>


Here, if I get xml:base for the content of content, it will be http://example.com/feu/.  Then, if I get the raw content of the element, strip the div, and apply xml:base myself, I'll erroneously use http://example.com/feu/feu/ as the base URI unless I know to ignore the xml:base attribute on the div.




Re: Feed Thread in Last Call



On May 18, 2006, at 12:31 PM, A. Pagaltzis wrote:

Actually, you don’t really need another format. There’s no reason
why you couldn’t use atom:feed in place of your hypothetical
ct:comment-tracking. :-) Your ct:entry elements could almost be
atom:entry ones instead, too, except that assigning them titles
and IDs feels like overkill.
The point of the whole exercise is to create a lightweight document  
for volatile metadata. If it's an atom:feed, you have to include a  
lot of stuff that's not needed here--atom:title, atom:updated,  
atom:author, and atom:summary or atom:content.  Also, you'd need to  
have an atom:id for each entry in addition to the @ref pointing to  
the entry that it talks about.



The real cost is not the cost of an extra format, but that
implementations then need to understand the FTE in order to know
to poll an extra document to retrieve the out-of-band metadata.
Sure, but if they don't understand FTE, they wouldn't know what to do  
with the extra metadata anyway even if it were in the main feed.   
They MIGHT be able to do some generic processing of the comments  
link, but the reliability of any generic processing algorithm for  
unknown link types is questionable since we left atom:link open to  
all sorts of uses.  And you COULD keep the comments links in the main  
feed but just leave off @count and @when for the benefit of apps that  
don't process the sibling document.


On May 18, 2006, at 11:48 AM, Antone Roundy wrote:
This approach could be generalized to enable offloading of other  
metadata that's more volatile than the entries themselves.
I don't know yet what other metadata might be handled this way, but  
here's slightly revised pseudo-XML that makes it more general and  
adds a few useful things:

<feed xmlns="http://www.w3.org/2005/Atom">
  <id>foobar</id>
  ...
  <entry>
    <id>foo</id>
    ...
  </entry>
  <entry>
    <id>bar</id>
    ...
  </entry>
  ...
</feed>

<metadata-tracking xmlns:atom="http://www.w3.org/2005/Atom" xmlns:thr="...">
  <entry ref="foo">
    ...
    <atom:link rel="replies" href="..." thr:count="5" thr:when="..." />
    <atom:link rel="replies" href="..." thr:count="3" thr:when="..." />
  </entry>
  <entry ref="bar">
    ...
    <atom:link rel="replies" href="..." thr:count="0" thr:when="..." />
    <atom:link rel="replies" href="..." thr:count="1" thr:when="..." />
  </entry>
  ...
</metadata-tracking>




Re: Feed Thread in Last Call



On May 18, 2006, at 8:10 AM, Brendan Taylor wrote:

Do you have any suggestions about how this metadata could be included without changing the content of the feed? AFAICT the only solution is to not use the attributes (which aren't required, of course).


If it's in the feed document and it gets updated other than when the  
entry itself is updated (...and it wouldn't be of much use if it were  
only updated when the entry was updated), it's going to result in  
data getting re-fetched when nothing but the comment count and  
timestamp change.  I don't see any way around that.  So if you really  
want a way to publish comment counts and timestamps without causing  
lots of unchanged data from getting refetched, you're going to have  
to separate that data out of the feed. Here's pseudo-XML for a  
possible approach:

<feed xmlns="http://www.w3.org/2005/Atom">
  ...
  <link rel="comment-tracking" href="..." />
  ...
  <entry>
    <id>foo</id>
    ...
  </entry>
  <entry>
    <id>bar</id>
    ...
  </entry>
  ...
</feed>

and in another document:

<ct:comment-tracking xmlns:ct="...">
  <ct:entry ref="foo">
    <link href="..." ct:count="5" ct:when="..." />
    <link href="..." ct:count="3" ct:when="..." />
  </ct:entry>
  <ct:entry ref="bar">
    <link href="..." ct:count="0" ct:when="..." />
    <link href="..." ct:count="1" ct:when="..." />
  </ct:entry>
  ...
</ct:comment-tracking>

Of course the comment tracking document would only be authoritative for feeds that pointed to it with a comment-tracking link.


This would require an extra subscription to track the comments, as  
well as understanding an additional format (as opposed to just an  
additional extension--either approach requires SOME additional work),  
but it would prevent unnecessary downloads by clients that aren't  
aware of it, and would reduce the bandwidth used by those that are.


This approach could be generalized to enable offloading of other  
metadata that's more volatile than the entries themselves.


Antone



Re: Feed Thread in Last Call



On May 17, 2006, at 9:36 AM, Sylvain Hellegouarch wrote:

Besides, you do not answer the question of HTTP caching I mentioned. Basically it would break most planets out there which rely heavily on the '304 Not Modified' status code to check if a feed has been modified. In this case a server such as Apache would respond: "well yes the file has changed on the disk so here it is" when in fact the content of the feed has only changed for the number of comments of an entry.


I do think the document could use a note pointing out that using  
thr:when and thr:count may increase bandwidth usage by reducing the  
number of instances where you can send '304 Not Modified' responses.   
Publishers should weigh the value of those attributes against the  
cost of providing them when deciding whether to include them in their  
feeds.




Re: xml:base in your Atom feed



On Mar 31, 2006, at 4:12 PM, Sam Ruby wrote:

Antone Roundy wrote:

Sam,

Funny that this should come up today given the recent discussion  
on  the

mailing list--NetNewsWire isn't getting the links in your Atom  feed
right.


There is an off chance that I have been following the list.  ;-)


I certainly didn't mean to imply that you weren't--I just wanted to  
point out what I'm seeing in case you didn't know that this  
particular feed reader is having this particular problem today.  And  
I thought it might be of interest to the WG to know what NNW is doing  
given that it's doing something that has been argued against within  
the last 24 hours.


I don't remember which version of your feed I was subscribed to  
before--perhaps I wasn't subscribed to the Atom feed and NNW updated  
my subscription when you redirected to it. So I don't know whether  
you purposely removed xml:base to see what chaos would ensue, or  
whether it hasn't been there all along and I just haven't seen the  
problem since I was subscribed to a different version.




xml:base in your Atom feed



Sam,

Funny that this should come up today given the recent discussion on  
the mailing list--NetNewsWire isn't getting the links in your Atom  
feed right.  I looked at the source, and it's clearly a NetNewsWire  
bug since it's not even trying to resolve relative to the URI from  
which it retrieves the feed.  In fact it appears to be resolving relative to the alternate link, and not doing such a good job of it--for example, instead of <http://www.intertwingly.net/blog/2006/03/31/Rogers-Switches>, it's pointing to a mis-resolved URI--but I wonder whether it would get it right if you set xml:base explicitly.


Antone



Re: Does xml:base apply to type="html" content?



On Mar 30, 2006, at 10:30 PM, James M Snell wrote:

Antone Roundy wrote:

[snip]
2) If you're consuming Atom and you encounter a relative URI, how should you choose the appropriate base URI with which to resolve it?

I think there are only three remotely possible answers to #2: xml:base (including the URI from which the feed was retrieved if xml:base isn't explicitly defined), the URI of the self link, and the URI of the alternate link.  Given that Atom explicitly supports xml:base, if it's explicitly defined, it's difficult to justify ignoring it in favor of anything else.


There is no basis in any of the specs for using the URI of the self or alternate link as a base URI for resolving relative references in the content.  The process for resolving relative references is very clearly defined.


Right--my point is:

1) If the original publisher made the mistake of using relative  
references without explicitly setting xml:base (figuring that  
consumers could resolve the references relative to the location of  
the feed), and then the feed got moved or mirrored, one would  
certainly fail at finding the things the publisher intended to point  
to if the URI from which the feed was retrieved was used as the base  
URI, but might succeed by using the self link as the base URI.  (I do  
not advocate doing this as default behavior, as stated below).


2) If the original publisher made the mistake of not even thinking  
about relative references in the content and therefore didn't set  
xml:base, the relative references may very well be relative to the  
location pointed to by the alternate link.  For example, the person  
generating the content may have been thinking "my blog entry will  
appear at http://example.org/blog/2006/03/foo.html, so I can use the  
relative URL "../../../img/button.gif" to point to the image at  
http://example.org/img/button.gif".  If the alternate link points to  
http://example.org/blog/2006/03/foo.html, then the consumer that  
wants to find the image will only succeed by using the alternate link  
as the base URI.  (I do not advocate doing this as default behavior,  
as stated below).


Moral of this story: failing to explicitly set xml:base is bad  
because it tempts consumers to ignore the spec in order to get what  
they want.  I do not advocate ignoring the spec as default behavior.   
But honestly, I might give the user of a consuming application the  
option of overriding the default behavior on specific feeds if they  
know that the publisher makes the mistake of publishing links  
relative to the self or alternate link without setting xml:base.  I'd  
LIKE to be able to hold the publisher's feet to the fire and make  
them fix the feed, but sometimes my users hold MY feet to the fire  
and make me give them usable workarounds.
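For the second case above, the publisher-side fix is simple enough--a sketch using inline XHTML, where xml:base unquestionably applies (URIs taken from the example above):

<entry xmlns="http://www.w3.org/2005/Atom">
  ...
  <link rel="alternate" type="text/html"
        href="http://example.org/blog/2006/03/foo.html"/>
  <content type="xhtml" xml:base="http://example.org/blog/2006/03/foo.html">
    <div xmlns="http://www.w3.org/1999/xhtml">
      <img src="../../../img/button.gif" alt="button"/>
    </div>
  </content>
</entry>

With xml:base set explicitly, "../../../img/button.gif" resolves to http://example.org/img/button.gif no matter where the feed is fetched from or mirrored to.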


Antone



Re: Does xml:base apply to type="html" content?



On Mar 31, 2006, at 7:01 AM, A. Pagaltzis wrote:

* M. David Peterson <[EMAIL PROTECTED]> [2006-03-31 07:55]:

I'm speaking in terms of mashups... If a feed comes from one
source, then I would agree... but mashups, from both a
syndication as well as an application standpoint, are becoming the
primary focus of EVERY major vendor. It's in this scenario that
I see the problem of assuming the xml:base in current context
has any value whatsoever.


No. That is only a problem if you just mash markup together
without taking care to preserve base URIs by adding xml:base
at the junction points as necessary.

Copying an atom:entry from one feed to another correctly requires
that you query the base URI which is in effect in the scope of
the atom:entry in the source feed, and add an xml:base attribute
to that effect to the copied atom:entry in the destination feed.
If you do this, any xml:base attributes within the copy of the
atom:entry will continue to resolve correctly.

It’s much easier to get right than copying markup without
violating namespace-wellformedness, even.


Exactly.  When creating a mashup feed, there are any number of things  
that the ... "masher"(?) has to be careful of--for example:


* Getting namespace prefixes right
* Creating an atom:source element and putting the right data into it
* Ensuring that all entries use the same character encoding
* Ensuring that the xml:lang in context is correct
* Ensuring that the xml:base in context is correct
* If any of the source data isn't Atom, ensuring that all the  
required elements exist (...even if the source data IS Atom--you  
never know when you're going to aggregate from an invalid Atom feed-- 
then you have to decide whether to "fix" the entry or drop it to make  
your output correct)


If we start assuming that "mashers" can't do those correctly, then we  
may as well not be using Atom, or even XML.  If we did a proper job  
of specifying Atom, then we should be able to hold publishers' feet  
to the fire and make them get their feeds right.  In Atom, xml:base  
is the mechanism used to determine base URIs.
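A sketch of the xml:base and atom:source items from that list, for one entry copied into a mashup feed (all the details are invented):

<!-- copied from a feed whose in-scope base URI was http://example.org/blog/ -->
<entry xmlns="http://www.w3.org/2005/Atom" xml:base="http://example.org/blog/">
  <id>tag:example.org,2006:blog-entry-42</id>
  ...
  <source>
    <id>tag:example.org,2006:blog</id>
    <title>Example Blog</title>
    <updated>2006-03-31T12:00:00Z</updated>
  </source>
</entry>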




Re: Does xml:base apply to type="html" content?



On Mar 30, 2006, at 10:00 PM, M. David Peterson wrote:
Then it should be a best practice that if they invoke this, the  
xml:base value should be set upon the "element containing the  
text", in this case, the content element.  Obviously you can't  
simply assume that the current xml:base in context has any direct  
relation, and therefore value to the current entry/content in  
context, as, using Aristotle's use case (and a billion others just  
like it -- if not a billion now, it won't be too long before that  
number is quite realistic, and in fact only scratching the Atom  
feed surface of the not too distant future), there is no way that  
one can simply assume that the current @xml:base value is legit.


I disagree.  The best practice should be to set xml:base explicitly  
in any document using relative URIs, and at any point in the document  
where the relative URIs appear, ensure that the xml:base in context  
is the correct base URI by overriding it if necessary.  If this  
practice is followed, and only if this practice is followed, then  
consumers will be able to reliably resolve relative URIs.  I see no  
justification for assuming that the xml:base in context is invalid  
and using some other base URI just because xml:base is set somewhere  
other than the containing element.  It's a pretty sorry world if we  
not only assume, but operate on the assumption that publishers are  
and will continue to be that inept.


Just to amplify one point:
you can't simply assume that the current xml:base in context has  
any direct relation...


What you can't simply assume is that the xml:base in context does  
NOT have any direct relation to the content.  Part of the point of  
XML is that we'll all be better off if consumers rely on publishers  
doing things correctly (in this case, getting xml:base right) and  
hold publishers to it until they get it right.


Antone



Re: Does xml:base apply to type="html" content?



On Mar 30, 2006, at 8:34 PM, M. David Peterson wrote:

...the content element can be basically anything as long as it's either

- non-escaped plain text with a @type value set to text,
- escaped text, with a @type set to a valid 'text' mime-type
- entity escaped with @type set to html,
- xhtml wrapped in a properly xhtml namespaced div with @type set to xhtml,
- base64 encoded with @type set to the proper media type, or
- it's xml with @type set to a proper XML mime-type.

In each of these cases, the only one that should have even a remote chance of the current value of the @xml:base in current context applying to it is inline xml.

...
The escaped HTML content contained within the content element that David was originally concerned with is more than likely a copy of all or part of the elements and content contained inside the body tag of the external document referenced by an associated link element, and therefore there is no guarantee that the xml:base of the atom feed is going to be anywhere even close to accurate.


On what basis are you concluding that Atom publishers are more likely  
to be smart enough to set xml:base correctly when publishing inline  
XML than when publishing escaped HTML?  What if the source material  
is tag soup HTML?  You could clean it up and turn it into XHTML or  
publish it as is as escaped HTML.  Either option is valid, and may be  
preferable in some situations.  I don't see how any assumptions can  
be made about the publisher's ability to set xml:base correctly based  
on the content type.


If you're assuming that xml:base is going to be set only at the top  
of the Atom document, then it may very well fail to be correct for a  
lot of the content.  But xml:base may also be set on the entry or  
content element, and could easily be set correctly based on the  
publisher's knowledge of the appropriate base URI for the content.


Anyway, theoretical arguments aside, there are two questions to answer  
for the real world:


1) If you're publishing Atom, in which content @types can you use  
relative URIs with reasonable confidence that consumers will apply  
the base URI correctly?


2) If you're consuming Atom and you encounter a relative URI, how  
should you choose the appropriate base URI with which to resolve it?


I think there are only three remotely possible answers to #2:  
xml:base (including the URI from which the feed was retrieved if  
xml:base isn't explicitly defined), the URI of the self link, and the  
URI of the alternate link.  Given that Atom explicitly supports  
xml:base, if it's explicitly defined, it's difficult to justify  
ignoring it in favor of anything else.


If xml:base isn't explicitly defined, there may be some justification  
for using the self link rather than the URI from which the feed was  
retrieved.  It's sloppy on the publisher's part, but might be more  
likely to succeed in practice.


The alternate link is only a possible choice if there is at least one alternate link, and either there is only one or all of them point to documents in the same directory.  I'd say it's a fairly weak choice.


Conclusion: you've got to resolve relative URIs with respect to  
SOMETHING, and clearly the best choice is xml:base if it's explicitly  
defined. If not, the self link and the URI from which the feed is  
retrieved each have some merit.


If that's the correct answer for #2, then in a reasonably perfect  
world, the answer to #1 should be that relative URIs should be safe  
anywhere as long as you're explicitly (and correctly!) defining  
xml:base.  In the real world, I'd guess that more consuming  
applications will get it right in inline XML than in escaped HTML.




Re: atom:name ... text or html?



On Mar 23, 2006, at 9:48 AM, James Holderness wrote:
Hahaha! It's RSS all over again. In the words of Mark Pilgrim:  
"Here's something that might be HTML. Or maybe not. I can't tell  
you, and you can't guess." :-)


Seriously though, the atom:name element is described as "a human-readable name", so unless your name really is "Bertrand Caf&eacute;" that can't be right.  If RFC4287 had intended to allow markup in the element it would have used atomTextConstruct.


I agree with James here--if we had intended for the name to be able  
to include markup, we should have used the construct we created to  
allow that.  This from RFC 4287 (section 3.2):


   element atom:name { text }

would have been this:

   element atom:name { atomTextConstruct }

if we had intended for it to be able to contain anything but literal  
text after XML un-escaping, right?


On Mar 23, 2006, at 9:57 AM, Eric Scheid wrote:
It's true that XML has only a half dozen or so entities defined, meaning most interesting entities from html can't exist in XML ... unless maybe they are wrapped in a CDATA block like above?
If they're wrapped in a CDATA block, then they don't trigger an XML  
parsing error, but wrapping something in CDATA isn't a license to  
enter data in a format other than what the RFC allows.


I'm getting the data by scraping an html page, so I'm expecting it to be acceptable html code, including html entities.
You, the producer, are getting the data from an HTML page, so you  
should certainly be prepared to handle HTML entities in it. But you  
the Atom publisher are responsible for making sure that you've made  
any changes to the data that are necessary for it to be proper Atom  
before you publish it. The consumer of the Atom feed doesn't know  
where you got the data, and thus can't be expected to decide how to  
process it based on where you got it.
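In other words (a trivial sketch): decode the HTML entities while scraping, and emit plain text:

<!-- wrong: the HTML entity survives, double-escaped, into the Atom feed -->
<author><name>Bertrand Caf&amp;eacute;</name></author>

<!-- right: decoded to the literal character before publishing -->
<author><name>Bertrand Café</name></author>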




Re: Feed paging and atom:feed/atom:id



On 10 Mar 2006, at 18:44, James M Snell wrote:

If the feeds have the same atom:id, I would submit that they form a single logical feed.  Meaning that all of the feed documents in an incremental feed (using Mark's Feed History terminology) SHOULD use the same atom:id value.  This is the way I have implemented paging in our APP implementation.  If the linked feeds have different atom:id values, they should represent different logical feeds.


Agreed.  From 4.2.6:

   "Put another way, an atom:id element
   pertains to all instantiations of a particular Atom entry or feed;
   revisions retain the same content in their atom:id elements."

All the Atom Feed Documents representing one incremental feed (or  
parts of one incremental feed) are "instantiations of a particular  
Atom ... feed", are they not?  So they should have the same value in  
atom:id.  If they don't, then they can't be considered instantiations  
of the same Atom feed.




Re: IE7 Feed Rendering Issue



On Mar 9, 2006, at 12:07 PM, James M Snell wrote:

As an alternative, Feed Readers can provide publishers with a way of specifying optionally applied styling for feeds and entries, e.g.,

<feed>
  ...
  <link rel="stylesheet" type="text/css" href="..." />
  ...
  <entry>
    ...
    <link rel="stylesheet" type="text/css" href="..." />
    ...
  </entry>
</feed>


Given my opinion on the use of the link element, I suppose I should  
propose an alternative:

<x:style type="text/css">
...
</x:style>

or

<x:style type="text/css" src="http://..." />

Either method permitted, like how we do atom:content.  'type="text/css"' optional, or is it needed?  Warning to those daring to try the  
second that some feed readers won't bother downloading the external  
file.  Warning to publishers that if they specify styles for "body",  
for example, some readers may say "there's no body element in the  
content, so I'll ignore this rule" (so put the content in a container  
with an ID or class and set the style for that instead), and others  
may say "how dare you try to take over the styling of the body when  
the body element isn't allowed in the content, I'll ignore this  
rule", and others may just ignore all or some of it for whatever  
reason they wish.  Can be at feed or entry level and be intended for  
application to its siblings and their children (those with textual  
content only--and of course, some clients may not apply it to all  
siblings and children even if they are textual).  If we really want  
to get fancy (big if), we could add @apply-to="content", but then you  
get into the qnames in attributes problem...  Or we could specify  
that it only applies to atom:content and perhaps atom:summary (and  
any extension element that explicitly specifies that it applies).


Well, that's enough off the top of my head.

Antone



Re: Link rel attribute "stylesheet"



On Feb 27, 2006, at 8:29 AM, M. David Peterson wrote:
When you say "what it was designed for" can you be specific as to  
what that definition is?
Well, we failed to gain consensus on that.  Some of us wanted it to  
be used only for links intended to be traversed by the user (like the <a> element in HTML with an href attribute--the link is there so that  
the user can click it and get to the linked resource).  Others didn't  
want this limitation, but wanted the link to be resolvable (eg., no  
tag: URIs).  Others wanted to be able to stick any URI in it.  So  
there is no tightly defined "what it was designed for".


I'm just saying that if an extra attribute is required to  
disambiguate what's being pointed to in a case like the following  
(without requiring the link target to be loaded and inspected), then  
maybe you're trying to make this one element do too much:


<link rel="stylesheet" href="http://example.org/atom-2-rss-2.0.xsl" />
<link rel="stylesheet" href="http://example.org/atom-2-rss-1.0.xsl" />
<link rel="stylesheet" href="http://example.org/atom-2-fooml.xsl" />
etc.

If one were to encounter such a list of links at the top of an Atom  
document, which should one use?  Should one download all of them and  
then pick one?  Or are you going to add an attribute something like  
this:


<link rel="stylesheet" href="http://example.org/atom-2-rss-2.0.xsl"
      ext:targettype="application/xml+rss" />


Sorry, new to the conversation, but I have particular interest in  
this topic as it is my belief that the URI/IRI can be used to imply  
a lot of information that is otherwise hidden from view, or uses  
more complex mechanisms to achieve the same result.  If there are real concerns as to this approach, it would be great to gain a greater understanding as to what they are such that I can apply this to the work I am doing in this area.


For a particular example of what I mean, please see this post >  
http://www.xsltblog.com/archives/2006/02/what_rest_gets_1.html <
Hmm.  If I'm reading that right, I wouldn't want to organize my  
websites that way.  And unless the specification for the "stylesheet"  
link relation were to mandate that URIs be constructed in a way  
that enables readers to tell from the local path what type the stylesheet  
is going to transform the feed to, you wouldn't have any way to know  
whether you could apply such an interpretation in any given case.  I  
don't really see the benefit of putting the information into the URI  
versus creating an attribute whose sole purpose is to specify the  
type.  The number of bits it would save is trivial, and it would  
require the extra step of parsing the URI's local path to pull  
information out of it that could be taken more easily from a  
dedicated attribute.


Antone



Re: Link rel attribute "stylesheet"



On Feb 26, 2006, at 9:10 PM, James Yenne wrote:
My feeds contain a generic xml-stylesheet, which formats the feed  
for display along with a feed-specific css.  Since xsl processors  
do not have a standard way to pass parameters to xsl stylesheets, I  
provide this feed-specific css to the xsl processor in the feed as  
a link with rel="stylesheet".  Generating xhtml with this xsl/css  
solution works for rendering both in IE6 and FF1.5.  (Why does IE7  
rip out xml-stylesheet directives?)


A link rel="stylesheet" seems to be the most efficient solution,  
however, a fully qualified URI relation does the job too.  I would  
like to request a stylesheet link relation be added to the IANA  
List of Relations and supported in the validators.  Thoughts?


One problem with this is that there's no machine readable way without  
an extension attribute to indicate what format the stylesheet is  
going to transform the data to.  If you're going to add an extension  
attribute, I'd suggest just making the whole thing an extension  
element instead.


Of course, my opinion is partly based on my preference which was  
rejected by the group for limiting the link element to links intended  
for traversal, so maybe that doesn't matter.  But certainly the  
possibility should be considered that this is stretching the use of  
the link element beyond what it was designed for.


Antone



Re: partial xml in atom:content ?



On Jan 17, 2006, at 11:04 AM, James Holderness wrote:
but I think I've shown some pretty compelling reasons why a producer (if they really absolutely have to use "application/xhtml+xml") would be wiser to use an xhtml document fragment than a complete xhtml document.


I'm all for consuming applications that want to be really smart checking whether the content of a <content type="application/xhtml+xml"> element is a fragment or a complete document and handling either, but  
if your content is an xhtml document fragment, is there any reason at  
all to publish type="application/xhtml+xml" rather than  
type="xhtml"?  The only justification that comes to mind is if you  
want to make a political protest statement against the required  
wrapper div.  But unless you prominently warn your users that your  
app is doing this, you're doing them a grave disservice by making  
their feed content less likely to be seen.
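For comparison, the two ways of shipping the same fragment (a sketch only):

<!-- the interoperable form: type="xhtml" with the required wrapper div -->
<content type="xhtml">
  <div xmlns="http://www.w3.org/1999/xhtml"><p>Hello</p></div>
</content>

<!-- the form under discussion: a bare fragment labeled with the media type -->
<content type="application/xhtml+xml">
  <p xmlns="http://www.w3.org/1999/xhtml">Hello</p>
</content>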




Re: partial xml in atom:content ?



On Jan 16, 2006, at 4:21 PM, James Holderness wrote:
For example, below are the results of some tests I've run on 15 aggregators. The tests included the use of a <p> tag as the root element, a <div> tag as the root element, and an <html> tag as the root element (i.e. a complete xhtml document).


The following applications worked with all three tests:
BlogBridge 2.7
Bloglines
BottomFeeder 4.1
Google Reader
Snarfer 0.1.2

The following applications worked with the <p> tag and the <div> tag, but failed to handle a full document (the <html> tag):

FeedDemon 1.5
GreatNews 1.0.0.354
Newz Crawler 1.8.0
RSS Bandit 1.3.0.38
SharpReader 0.9.6.0


Out of curiosity, what constitutes success in the <html> case?  I'm mostly curious about the browser based readers.  If they displayed the content within a webpage, but failed to strip out the <html>, <head>, <title>, and <body> tags and the <head> section (assuming the test feed contained one), would that be a success or failure?  What did the apps that failed do in the <html> case?




Re: partial xml in atom:content ?



On Jan 15, 2006, at 8:09 PM, James Holderness wrote:
Thus, can atom be used to ship around parcels of xml snippets? I suppose it could, but only so long as both ends knew what was going on, and knew naïve atom processors might barf on the incomplete xml, right?


The one time I'd think it might be safe is with XHTML (as I  
mentioned in a previous message) since Atom processors are already  
required to handle XHTML fragments in the content element. Anything  
else would be highly risky unless it was a proprietary feed  
communicating between two known applications.


Processing type="xhtml" and processing type="application/xhtml+xml" are very different beasts.  Say your application converts Atom feeds to HTML to display in webpages.  With type="xhtml", the data could just be dumped into the webpage (after appropriate stripping of nasty tags and CSS and such).  With type="application/xhtml+xml", you'd have to figure out what to do with everything outside of the <body> element.  If there's CSS involved, for example, simply throwing it away could lead to some very messed up display.  But assuming your application is being called from within the webpage, it's not going to have the opportunity to add a <style> element to the page's <head>.

Re: Sponsored Links and other link extensions



On Oct 25, 2005, at 1:16 PM, James M Snell wrote:
Also, assuming the title on the main link is supposed to describe  
the download file itself, there appears to be no way to inform the  
user of the mirror location of the main URI. Without a location  
name of some sort, the user can't make an informed decision about  
which mirror would be best to use. Perhaps something along the  
line of Antone's "label" suggestion might help here.




I could just do this:

<link rel="enclosure" href="http://example.com/softwarepackage.tar.gz"
      type="application/x-gzip" x:group="software-package" nf:follow="no">
  <x:mirror href="http://example.com/softwarepackage.tar.gz" label="Main Server" />
  <x:mirror href="http://example2.com/softwarepackage.tar.gz" label="California Server" />
  <x:mirror href="http://example3.com/softwarepackage.tar.gz" label="European Server" />
</link>



or this:

<link rel="enclosure" href="http://example.com/softwarepackage.tar.gz"
      type="application/x-gzip" x:group="software-package"
      x:label="Main Server" nf:follow="no">
  <x:mirror href="http://example2.com/softwarepackage.tar.gz" x:label="California Server" />
  <x:mirror href="http://example3.com/softwarepackage.tar.gz" x:label="European Server" />
</link>




Re: Sponsored Links and other link extensions



On Oct 25, 2005, at 11:04 AM, James M Snell wrote:

All-in-one example

The x:group attribute links the two alternates into a single  
grouping; the x:mirror specifies the mirrors for each link.   
nf:follow="no" is my Atom Link No Follow extension that tells  
clients not to automatically download the enclosure.  Dumb clients  
will see what amounts to the current status quo, two different  
enclosures of different types.  Smart clients will see the mirrors,  
the grouping and the no-follow instruction.


<link rel="enclosure" href="http://example.com/softwarepackage.zip"
      type="application/zip" x:group="software-package" nf:follow="no">
  <x:mirror href="http://example2.com/softwarepackage.zip" title="California Server" />
  <x:mirror href="http://example3.com/softwarepackage.zip" title="European Server" />
</link>

<link rel="enclosure" href="http://example.com/softwarepackage.tar.gz"
      type="application/x-gzip" x:group="software-package" nf:follow="no">
  <x:mirror href="http://example2.com/softwarepackage.tar.gz" title="California Server" />
  <x:mirror href="http://example3.com/softwarepackage.tar.gz" title="European Server" />
</link>


Thoughts?


The only thing I would change is the name of x:mirror/@title to make  
it clear that it isn't intended(?) to replace the parent link's  
@title.  My current favorite name is "label".




Re: Sponsored Links and other link extensions



On Oct 25, 2005, at 12:59 AM, A. Pagaltzis wrote:

I am asking if is there a generic way for an application to
implement alternate-link processing that gives sensible behaviour
for any type of main link. If an implementor has to support
alternative links explicitly for each type of main link, where’s
the difference to having specific relationships for alternative
links depending on the main link type?


Here are a few examples of generic processing algorithms an  
application might use:


Mirrors:
1) Randomly selecting a mirror to download from, thus helping to  
spread the bandwidth usage among them.
2) Try the main link, and if the DNS lookup fails, or a connection  
can't be made or something, automatically try the next one.
3) Ping each of the servers in the background, and if the user clicks  
the link, use the fastest one.


Alternates:
1) Have a prioritized list of formats, and choose the link that  
points to the highest priority format.
2) Of all the formats the app supports, choose the one with the  
smallest @length, if present.


Either one:
1) Show some sort of UI for selecting which link to follow (perhaps  
have the main link selected by default, but allow the user to select  
an alternate from the popup).


None of those ideas is necessarily tied to any particular link  
relation.  They might be more important for enclosures than any of  
the other relations that have been defined so far, and an application  
may or may not do some for enclosures that it doesn't do for some  
other specific link relations.  But again, it comes back to the yet  
unanswered question, are there any disadvantages to keeping it  
generic?  I haven't heard anyone suggest any downside yet--only that  
some people can't imagine why anyone would want to use alternative  
links for anything but enclosures.
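For concreteness, here's a rough Python sketch (mine, not from the proposal) of mirror strategy #2 above: try the main link first and fall back to each mirror if the connection fails.  The URLs are placeholders.

import urllib.request

def fetch_with_fallback(main_href, mirror_hrefs, timeout=10):
    for href in [main_href] + list(mirror_hrefs):
        try:
            with urllib.request.urlopen(href, timeout=timeout) as resp:
                return resp.read()
        except OSError:
            continue      # DNS failure, refused connection, timeout: try the next one
    raise OSError("no mirror could be reached")

data = fetch_with_fallback(
    "http://example.com/softwarepackage.tar.gz",
    ["http://example2.com/softwarepackage.tar.gz",
     "http://example3.com/softwarepackage.tar.gz"])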




Re: Sponsored Links and other link extensions



On Oct 24, 2005, at 9:59 PM, A. Pagaltzis wrote:

* Antone Roundy <[EMAIL PROTECTED]> [2005-10-25 00:35]:

2) You can break lines between elements, but you can't inside
an attribute, so it's better for display for humans.

That’s not what the XML spec says.


Doh!  Who knows where I got that idea.  I still prefer to have each  
piece of data in its own place.



What if someday somebody does come up with a non-enclosure use
for this (which hardly seems far-fetched to me--enclosures
aren't the only things that get mirrored or exist in multiple
formats)? They'll have to define a new mechanism for it which
is either going to be identical except for element names, or
they're going to invent another way to do the same thing.
Either way, the pain of supporting both is completely
unnecessary unless there's potential for generality causing
problems.


If it isn’t obvious from the start what it means that there’s
an alternative-link for a via link or a previous or next link,
then clients will have to support each of these use case
separately. So on the implementor’s end, there’s no discernible
difference between the pain of supporting either approach.


I'm not sure I understand what you're saying.  Are you saying that  
one might do this if they want an alternate of a "next" link?





If that's what you mean, then sure, the code for that would be the  
same as for:





...but it would sure look odd.  I see no advantage to naming these  
things in terms of enclosures.




Re: Sponsored Links and other link extensions



On Oct 24, 2005, at 2:59 PM, A. Pagaltzis wrote:

* Antone Roundy <[EMAIL PROTECTED]> [2005-10-24 22:35]:

Interesting. Filling an attribute with a list of URIs doesn't
really appeal to me though. How about this:

<link rel="enclosure" href="http://example.com/file.mp3" xml:id="x-file">
  <x:mirror href="http://www2.example.com/file.mp3" />
  <x:mirror href="http://www3.example.com/file.mp3" />
</link>


It’s a lot more verbose and you have to fiddle with nesting.

What do you get in return? “It looks more XMLish”?
1) Easier parsing, as James said, since your XML parsing library is  
going to give you the data with the URIs already split apart.


2) You can break lines between elements, but you can't inside an  
attribute, so it's better for display for humans.


I think XMLishness leans this direction for good reason.


Sounds good, but you may have noticed above that I used a
prefix not specific to enclosures--there's no reason to tie
this all to one particular type of link (nor to make it look
as if it were tied to one specific link type). So the other
link might, for example, be:


I don’t know if striving for generality in this fashion without
a practical need is worthwhile. It smells of architecture
astronautics for a reason I can’t particularly pinpoint. So maybe
my instinct is wrong.
The way I see it, striving for specificity without a practical need  
isn't worthwhile.  Unless generalizing risks leading to some sort of  
problem, why do it?  I see no potential problems.


What if someday somebody does come up with a non-enclosure use for  
this (which hardly seems far-fetched to me--enclosures aren't the  
only things that get mirrored or exist in multiple formats)?  They'll  
have to define a new mechanism for it which is either going to be  
identical except for element names, or they're going to invent  
another way to do the same thing.  Either way, the pain of supporting  
both is completely unnecessary unless there's potential for  
generality causing problems.




Re: Sponsored Links and other link extensions



On Oct 24, 2005, at 1:48 PM, A. Pagaltzis wrote:

I have a completely different proposition.

(e)
<link rel="enclosure"
      href="http://example.com/file.mp3"
      encl:mirrors="http://www2.example.com/file.mp3 http://www3.example.com/file.mp3"
      xml:id="x-file"
/>
<link rel="enclosure"
      href="http://example2.com/file.ogg"
      encl:alternative-to="x-file"
/>

Since bit-for-bit identical files all have the exact same
attributes, there is absolutely no reason to have an entire tag
dedicated to each. In addition, making mirror URLs second-class
citizens in this ways provides an intuitive hint at the
bit-for-bit identity semantics.
Interesting.  Filling an attribute with a list of URIs doesn't really  
appeal to me though.  How about this:


<link rel="enclosure" href="http://example.com/file.mp3" xml:id="x-file">
  <x:mirror href="http://www2.example.com/file.mp3" />
  <x:mirror href="http://www3.example.com/file.mp3" />
</link>


Specifying alternative formats with a distinct link relationship
prevents bandwidth and diskspace drain from oblivious clients.
Sounds good, but you may have noticed above that I used a prefix not  
specific to enclosures--there's no reason to tie this all to one  
particular type of link (nor to make it look as if it were tied to  
one specific link type).  So the other link might, for example, be:

<x:alternative-link primary="x-file" href="http://example2.com/file.ogg" />

Although "alternative-link" doesn't tell you what kind of link this  
is, since you're going to have to tie it back to the primary link to  
decide what to do with it anyway, it really shouldn't matter.  Note  
that I changed "alternative-to" to "primary" just because it's  
shorter and one word.
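Tying the alternative back to its primary is a small lookup once the document is parsed.  A rough Python sketch (mine, not part of the proposal); the x: namespace URI is a made-up placeholder, and "alternative-link"/"primary" are just the names floated above.

import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
X = "http://example.org/ns/link-extensions"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

entry = ET.fromstring(f"""
<entry xmlns="{ATOM}" xmlns:x="{X}">
  <link rel="enclosure" href="http://example.com/file.mp3" xml:id="x-file" />
  <x:alternative-link primary="x-file" href="http://example2.com/file.ogg" />
</entry>""")

# Index every element carrying an xml:id, then resolve each alternative's
# "primary" reference against that index.
by_id = {el.get(XML_ID): el for el in entry.iter() if el.get(XML_ID)}
for alt in entry.findall(f"{{{X}}}alternative-link"):
    primary = by_id[alt.get("primary")]
    print(alt.get("href"), "is an alternative to", primary.get("href"))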




Re: New Link Relations -- Ready to go?



On Oct 24, 2005, at 11:16 AM, James Holderness wrote:
A more sensible approach would be a single feed document containing  
the top N results (where N is manageable in size). You could  
subscribe to that as a non-incremental feed and you would know at  
any point in time which were the top 10 results. There is no real  
need for paging other than as a form of snapshot history (i.e. what  
were the top 10 results last week).
That is certainly a good approach--allowing the number of results to  
be determined dynamically by something in the URL, for example.   
However, it could be useful to limit the chunk size and allow paging  
for people who want more.  For example, you might allow a maximum of  
50 results per chunk, and then support ETags.  That way, if somebody  
wants to monitor the top 250, they can send 5 requests, and if most  
of the time there are no changes, they'll get a lot of 304s, but if  
occasionally something changes in the last chunk of 50 for example,  
they're only downloading 50 results each time something changes.   
There are of course other approaches, like support for just sending  
the diffs.  But that would probably be more difficult for most people to  
implement, and may be less likely to be supported by a wide variety  
of clients.
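A rough sketch of that polling pattern in Python (mine, not from the thread): five chunks of 50, with ETags remembered between polls so unchanged chunks cost only a 304.  The URI scheme is a placeholder.

import urllib.error
import urllib.request

CHUNK_URIS = ["https://feeds.example.com/top250?page=%d" % n for n in range(1, 6)]
etags = {}   # remembered from the previous poll

def poll(uri):
    req = urllib.request.Request(uri)
    if uri in etags:
        req.add_header("If-None-Match", etags[uri])
    try:
        with urllib.request.urlopen(req) as resp:
            etags[uri] = resp.headers.get("ETag")
            return resp.read()           # changed chunk: parse these 50 entries
    except urllib.error.HTTPError as err:
        if err.code == 304:
            return None                  # unchanged chunk: nothing to download
        raise

for uri in CHUNK_URIS:
    body = poll(uri)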


Another reason for wanting to limit the number of results per query  
(and support paging for those who want more) is to avoid bandwidth  
waste if someone accidentally adds an extra digit to the desired  
number of results; or tries to waste your system resources by  
requesting huge result sets (but dropping the connection before using  
up their own bandwidth actually receiving the whole result set); or  
has a client that doesn't support paging or diffs or ETags or  
anything, and wants a huge result set (and you don't want to  
accommodate them since it would use so much bandwidth), etc.


Once again, I have to ask the same question I asked Thomas: do you  
have a problem with Mark's next/prev proposal as it stands, or are  
you just arguing with me because you think I'm wrong? If the  
latter, feel free to just ignore me. We can agree to disagree.  
Unless we're discussing a particular proposal I don't see the point.
I have a problem with not having link relations specific to paging  
through a feed's current state.  I'm fine with having general chain  
navigation link relations, but hope that we'll get something specific  
to paging and that people will use it instead of the general link  
relations.  I've spoken my piece on that and have given up swimming  
against the tide, but am still willing to discuss specific related  
issues.




Re: Sponsored Links and other link extensions



On Oct 24, 2005, at 5:18 AM, James Holderness wrote:

Eric Scheid wrote:
The challenge with using alternate to point to files of different  
types
is that why would someone do (a) when they can already do (b)  
without

the help of a new extension

(a)
<link rel="enclosure" type="audio/mpeg" href="http://example.com/file.mp3">
   <x:alternate type="application/ogg" href="http://example2.com/file.ogg" />
</link>


(b)
<link rel="enclosure" type="audio/mpeg" href="http://example.com/file.mp3" />
<link rel="enclosure" type="application/ogg" href="http://example2.com/file.ogg" />



With (a), we know the .mp3 and the .ogg are simply different  
formats of the

same thing. With (b) we don't know either way.


I like (a) in concept because, as you say, it enables you to tell  
when two links are the same so if you're auto-downloading you don't  
need them both. However, I do think James is right in thinking that  
many people will just use (b) because it's already there.


I don't see the harm in allowing (a) though. If a feed producer  
uses (a) and an end-user has auto-downloading enabled for that  
feed, they both benefit from less wasted bandwidth. The only  
downside would be that aggregators that aren't aware of this  
extension would fail to see the alternate enclosures. Is that so  
bad though? It's a trade-off the feed producer has to make - I'm  
not sure we should be making that decision for them.


Here's the middle path:

(c)
<link rel="enclosure" type="audio/mpeg" href="http://example.com/file.mp3"
      x:link-set="a" />
<link rel="enclosure" type="application/ogg" href="http://example2.com/file.ogg"
      x:link-set="a" />



This won't save you from bandwidth waste by aggregators that don't  
support the extension, but it also won't prevent users of those  
aggregators from getting the data in a format they can use.  That  
said, this is not my preferred method.  I'd rather protect bandwidth  
and the user's hard drive space--all the more important because  
enclosures are often quite large.


Here's a final option--is it legal?  Is it better or worse than (a)  
in any ways?


(d)
<link rel="enclosure" type="audio/mpeg" href="http://example.com/file.mp3">
  <link rel="enclosure" type="application/ogg" href="http://example2.com/file.ogg" />
</link>


Better: Doesn't require processing of a new namespace or element-- 
just a new way of using the data that one gets out of an existing  
element.


I prefer d, a, c and then b.
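A rough Python sketch (mine, not part of any proposal) of how a smart client might handle option (c): keep one enclosure per x:link-set group, picking the most preferred type.  The x: namespace URI and the preference list are made-up placeholders.

import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
X = "http://example.org/ns/link-extensions"
PREFERRED = ["application/ogg", "audio/mpeg"]      # most preferred first

def rank(link):
    t = link.get("type")
    return PREFERRED.index(t) if t in PREFERRED else len(PREFERRED)

entry = ET.fromstring(f"""
<entry xmlns="{ATOM}" xmlns:x="{X}">
  <link rel="enclosure" type="audio/mpeg"      href="http://example.com/file.mp3"  x:link-set="a" />
  <link rel="enclosure" type="application/ogg" href="http://example2.com/file.ogg" x:link-set="a" />
</entry>""")

chosen = {}
for link in entry.findall(f"{{{ATOM}}}link[@rel='enclosure']"):
    key = link.get(f"{{{X}}}link-set") or link.get("href")   # ungrouped links stand alone
    if key not in chosen or rank(link) < rank(chosen[key]):
        chosen[key] = link

for link in chosen.values():
    print("download:", link.get("href"))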



Re: Profile links



On Oct 23, 2005, at 6:45 PM, James Holderness wrote:

James M Snell wrote:
1. Can a profile element appear in an atom:feed/atom:source?  If  
so, what does it mean? I think it should with the caveat that the  
profile attribute should only impact the feed and should not  
reflect on the individual entries within that feed.


I can't see any particular use for atom:source myself, but I would  
definitely want profile support at the feed level. As an aggregator  
I want to be able to display a custom view for a particular feed  
based on what it contains (e.g. slideshow view if it's a flickr  
feed - all images). It would be difficult to do something like that  
with only entry level profiles.


I don't think it's possible to allow something at the feed level, but  
disallow it in atom:source (the Atom format spec could have done  
that, but I don't think an extension can add such restrictions).


What does it mean in atom:source?  That the feed that the entry came  
from conformed to the profile.


What will consuming applications do with profile elements in  
atom:source?  That's entirely up to the application developer.  Maybe  
nothing--maybe they'll ignore profiles that don't apply to the entire  
feed.  Or maybe they'll come up with something useful.




Re: New Link Relations -- Ready to go?



On Oct 24, 2005, at 8:13 AM, James Holderness wrote:
With what we have so far we can do incremental feed archives; we  
can do at
least some form of searching; we can do non-incremental feeds (of  
the "Top

10" variety) with history. I think that's a good start.


But we also want paged non-incremental feeds (OpenSearch result  
feeds),

while "non-incremental feeds with history" have not yet proven to be
needed.


I still don't see why OpenSearch result feeds can't be implemented  
as incremental feeds.
Perhaps they can, but that wouldn't always be desirable. Consider  
this scenario: Somebody writes a program that searches Google,  
scrapes the HTML results, and publishes them as an Atom feed.  My  
purpose in subscribing to the feed is not to be alerted when a new  
webpage is added to page 20 of Google's results, it's to be alerted  
whenever a new webpage makes it onto page 1.  So I don't want new  
pages added to the live end of the feed--I just want whatever is  
currently in the top 10 results, and my feed reader will tell me when  
one of them is one it hasn't seen before.


Either they're being used as a one-off search and you can't  
subscribe to them (in which case there is no difference between  
incremental and non-incremental), or they're being updated with new  
results over time (like a filtered aggregate feed) in which case I  
would think they have to be incremental.

Given the above scenario, why wouldn't you be able to subscribe to them?

I'm proposing previous/next linking from chunk to chunk inside the  
same
snapshot and adding a new link relation (or set of link relations)  
for

linking from snapshot to snapshot.

Do you now see what I'm talking about?


I understand what you're talking about, but I just don't see the  
need. I would have expected a non-incremental feed to be a single  
Atom document.
In the case of something like a top 10 feed, I'd imagine it would  
be.  But a search results feed like what's described above may not be.


My reason for wanting paging is so that a user doesn't need to  
fetch data that he already has - this can never be a problem with a  
non-incremental feed because it doesn't grow.
I'm not sure I understand--it's not as if a non-incremental feed were  
simply a static document.  They're resources whose contents are  
replaced wholesale (with the things that were in the old set possibly  
still being in the new set) rather than having their old contents  
augmented when new things are added.




Re: New Link Relations -- Ready to go?



On Oct 21, 2005, at 7:19 PM, James Holderness wrote:
What's the difference between a search feed and a non-incremental  
feed? Aren't search feeds one facet of non-incremental feeds?


Not necessarily, no. A search feed could quite easily be  
implemented as an incremental feed. This is the most sensible  
approach since it would allow the feed to be viewed in all existing  
aggregators without requiring a special knowledge of non- 
incremental feeds.
If your goal is to work as well as possible with today's client  
software, then bending your data to fit their model is the most  
sensible approach, but that's not always the goal.


The initial feed document consists of all known results at the time  
the search is initiated. As new results are discovered over time,  
the feed can be updated by adding new entries to the top of the  
feed in much the same way that new entries would be added to the  
top of a blogging feed. In fact, if you do a search with something  
like feedster, this is exactly the sort of feed you will get back.
If creation time is relevant to the data being searched, then this  
makes sense.  But what if I want to subscribe to the top 10 Google  
results for some keywords I'm trying to optimize my site for  
(ignoring the fact that Google doesn't return search results in any  
feed format right now)?  Or what about alternative sort orders which  
are available on sites like Feedster, Google News, etc.? (You can  
sort by relevance rather than date--the date still has some weight,  
but the results aren't strictly in date order). How about Amazon.com  
affiliates who want to use an RSS parser to display affiliate links  
to "best sellers" search results?  There are a lot of search use  
cases that don't fit the incremental model.


All that said, search results are often a bit different than "top 10"  
lists and the like.  With search results, you often don't want to  
view the contents of the feed in order all at once--the first time  
you do, but after that, you may just want to see new things as they  
make it up into the top positions.  Today's clients can handle that  
just fine, unless you want to monitor more than just the first page  
of results.




Re: What is this entry about?



On Oct 21, 2005, at 5:47 PM, James M Snell wrote:

Err, are you forgetting atom:category? Doesn’t that satisfy all
your wants *and* more? It has a URI, a term and a human-readable
label.

Regards,


I dunno, that's why I was asking ;-)

atom:category works well for categorizing entries, but does it  
really tell us what the entry is about?  For instance, suppose that  
I want to indicate that an entry is about http://www.ibm.com and  
file that in a category called technology?  The categorization of  
the entry is different than the subject of the entry.. tho both are  
definitely related.


Why don't we define link/@rel="about" for pointing to a specific  
internet resource that an entry is about (a little more specific than  
the general case of rel="related").  I know we discussed this before  
and in the chaos of trying to hammer the spec out, didn't do it, but  
I still think it's a good idea.




Re: Application for addition to Atom Registry of Link Relations



On Oct 21, 2005, at 12:48 PM, James M Snell wrote:

Antone Roundy wrote:

The following two bits seem incompatible with each other:
   o  atom:entry, atom:feed and atom:source elements MUST NOT  
contain
  more than one 'license' link relation with the same   
combination of

  type and hreflang attribute values.


Note the caveat, "with the same combination of type and hreflang  
attribute values".  The idea is to prevent a single license from  
appearing more than once.



   Multiple license link relations MAY be used to indicate that the
   informational content has been published under multiple copyright
   licenses.  In such instances, each of the linked licenses is
   considered to be mutually exclusive of the others.


So if something is going to link to both the Creative Commons License  
and the GPL, only one of the two can be an English text/html  
document?  Maybe it would make more sense to say "... MUST NOT  
contain more than one 'license' link with the same href attribute."   
Isn't that the whole point?
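A rough Python sketch (mine, not from the proposal) of the check that wording would imply: flag an entry carrying two "license" links with the same href, regardless of their type and hreflang values.

import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"

def duplicate_license_hrefs(entry):
    # Return the set of hrefs that appear on more than one license link.
    seen, dupes = set(), set()
    for link in entry.findall(f"{{{ATOM}}}link[@rel='license']"):
        href = link.get("href")
        if href in seen:
            dupes.add(href)
        seen.add(href)
    return dupes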




Re: Application for addition to Atom Registry of Link Relations



The following two bits seem incompatible with each other:


   o  atom:entry, atom:feed and atom:source elements MUST NOT contain
  more than one 'license' link relation with the same  
combination of

  type and hreflang attribute values.



   Multiple license link relations MAY be used to indicate that the
   informational content has been published under multiple copyright
   licenses.  In such instances, each of the linked licenses is
   considered to be mutually exclusive of the others.




Re: General/Specific [was: Feed History / Protocol overlap]



On Oct 19, 2005, at 11:12 AM, Mark Nottingham wrote:


"next"
"next-chunk"
"next-page"
"next-archive"
"next-entries"
are all workable for me.


...


Perhaps people could +1/-1 the following options:

* Reconstructing a feed should use:
   a) a specific relation, e.g., "prev-archive"
   b) a generic relation, e.g., "previous"


I'd prefer "prev-page".  "prev-archive" doesn't sound right for  
paging through search results.  Also, "prev-archive" or "next- 
archive" (whichever ends up going forward in time) doesn't quite work  
if the final step forward points to the subscription feed URI (which  
isn't an "archive").  That's a small matter since it's only that last  
step, but in search results type cases, "archive" would definitely be  
odd.


Just a little follow up on what I wrote last night about generic vs.  
specific link relations: "related" is a generic term that is likely  
to be a bit of a catch-all for links that don't have a specific  
relation defined for them.  "alternate" is a specific relation  
created for one of the major historical use cases for rss/link.  The  
proposed but not accepted "about" would have been the specific  
relation for the other major use case that rss/link was commonly used  
for.


"related" could conceivably handle the hypothetical use case of  
traversing a chain of different feeds--you'd just have to remember  
which "related" link to a feed document you had already traversed to  
know which one to follow next to continue down the chain.  It  
wouldn't be quite as nice for such an application as having a "next"  
and "prev" for that use, but I'd rather see it done that way till  
it's clear that such a thing is even needed than see intrafeed paging  
links used for interfeed navigation.




Re: Feed History / Protocol overlap



Here's what this discussion makes me think of--RSS has a link  
element.  That link was very generic, and has been variously used to  
link to what Atom calls link/@rel="alternate" and link/ 
@rel="related", and perhaps even other things.  Once we'd gained a  
little experience and discovered that the imprecision of the meaning  
of the element was limiting uses we wanted to make of feeds, we  
created more specific types of links.  Hopefully, we were specific  
enough this time that we won't run into significant use cases that  
we've rendered impossible, but who knows.


Now we're defining a method of navigating through a chain of linked  
documents.  We know of two specific use cases that we're sure we want  
to be able to do: paging through things like search results, and  
catching up on incremental feeds (or reconstructing the entire state  
of the feed, which is an extension of catching up).  It would appear  
that the same link relation can be used to do both of those things  
without the fear of conflict, because they operate within feeds that  
have a basic difference in nature, so they're unlikely to both be  
needed within one feed.  Also, from a certain point of view, they are  
really the same thing--a way to navigate through the current state of  
the feed.  The fact that incremental feeds don't have old states that  
have been discarded and replaced the way non-incremental feeds do  
(their former state gets augmented rather than being replaced)  
doesn't make a difference with respect to the issue of navigating  
through their current state.


So why don't we create a mechanism to do those two things (that are  
really one thing), and NOT make it generic enough to encompass other  
things that we might want to do someday, which might lead to the same  
sort of limitation that RSS has by only having one generic link  
element?  Sure, we COULD do all of our interdocument navigation using  
"next" and "prev" until someday when we decide that we need something  
more specific for some of the navigation use cases.  But then we'll  
be doing some of the same things multiple ways--some people sticking  
with "next" and "prev", and some using whatever new methods or link  
relations are invented, and nobody quite sure what "next" and "prev"  
mean in any particular feed.  Why not wait till we've really figured  
out what other ways we might want to navigate between documents, and  
then devise a new method for doing it?


If we're going to create some generic link relations for people to  
experiment with, let's create something that's explicitly for doing  
experimental things with so that the link relations we want to do  
more specific things with aren't rendered less useful by the  
experimentation.  Register "x-next" and "x-prev" or something for  
that, or register "next-page" and "prev-page" for the things we know  
we want to do.  Or don't register any such thing--just don't promote  
use of the link relations we define for (reasonably) well  
understood use cases to do experimental things.


Well, I've spoken my mind plenty on this issue, so unless somebody  
brings up an issue that my opinion on couldn't be understood from  
what I've written already, I think I'll leave it at that.  If we go  
with a highly-generic definition and it causes trouble down the road,  
I'll have some big ASCII art letters ready to say "I told you so".   
If not, then oops, I guess I was wrong.




Re: Feed History / Protocol overlap



On Oct 18, 2005, at 6:10 PM, Robert Sayre wrote:

On 10/18/05, Antone Roundy <[EMAIL PROTECTED]> wrote:

-3 to being that generic.


That's a very large negative number. Can you explain how your version
will let me write software I otherwise couldn't?


Anything larger than -2 is bogomips--the point I was trying to make  
is that I think the idea of using the same link relation for paging  
within a feed and for navigating between feeds is absolutely absurd-- 
completely lacking in foresight--almost looks like an attempt to  
create future problems.  People were complaining that trying to  
avoid problems with the hypothetical "top 100 DVDs" scenario (not  
trying to solve it--just trying to avoid problems if it comes about)  
was wandering too far off into hypotheticals, but now people want to  
make sure they can use the "next" relation for the arguably even more  
hypothetical idea of building a chain of otherwise independent  
feeds?  This boggles my mind.


Here's what my version will let you do that you won't be able to do  
if the definitions of these links allows them to be used for  
interfeed navigation--it will enable you to do paging within a feed  
that is also part of a chain of feeds (because anyone wanting to  
create a chain of feeds will have to come up with a non-conflicting  
link relation to do it).  It will also enable you to know that  
(unless somebody's breaking the spec) you are navigating through a  
single feed when you follow next and prev links around--that you are  
not jumping from feed to feed.  Your software will be able to follow  
those links with a much greater degree of confidence that it won't  
result in your users complaining "what the hell are you doing showing  
me entries from a feed I didn't subscribe to?"  It will enable your  
application to take more actions automatically without having to ask  
for confirmation from the user every time you follow another next or  
prev link to avoid such complaints.




Re: Feed History / Protocol overlap



On Oct 18, 2005, at 5:58 PM, James M Snell wrote:

Antone Roundy wrote:

On Oct 18, 2005, at 5:13 PM, Robert Sayre wrote:

On 10/18/05, Mark Nottingham <[EMAIL PROTECTED]> wrote:
rel: next
definition: A URI that points to the next feed in a series of feeds.
For example, in a reverse-choronological series of feeds, the 'next'
URI would point deeper into the past.


Ohh, nice readability.  Perhaps a few refinements:

A URI that points to the next in a series of Feed documents, each   
representing a segment of the same feed.  For example, in a  
reverse- chronologically ordered series of Feed documents, the  
'next' URI  would point to the document next further in the past.


-1 because each of the feed documents may not represent a segment  
of the same feed.  That's one potential use case, but it's not the  
only one.

+1 to Robert's version.

-3 to being that generic.  Surely, even if we can only imagine one  
method of paging through a single feed and thus aren't concerned  
about conflict for use of the "next" link, "paging", if you can even  
call it that (okay, so maybe you're not calling it that), between  
feeds using the same link relation as paging within each of the feeds  
is not just allowing for the possibility of conflicts, it's actively  
inviting conflicts.




Re: Feed History / Protocol overlap



On Oct 18, 2005, at 5:13 PM, Robert Sayre wrote:

On 10/18/05, Mark Nottingham <[EMAIL PROTECTED]> wrote:
rel: next
definition: A URI that points to the next feed in a series of feeds.
For example, in a reverse-choronological series of feeds, the 'next'
URI would point deeper into the past.


Ohh, nice readability.  Perhaps a few refinements:

A URI that points to the next in a series of Feed documents, each  
representing a segment of the same feed.  For example, in a reverse- 
chronologically ordered series of Feed documents, the 'next' URI  
would point to the document next further in the past.




Re: New Link Relations? [was: Feed History -04]



On Oct 17, 2005, at 10:17 PM, James M Snell wrote:
When I think of next/prev I'm not thinking about any form of  
temporal semantic.  I'm thinking about nothing more than a linked  
list of feed documents.  If you want to add a temporal semantic  
into the picture, use a mechanism such as the Feed History  
incremental=true element.
I don't think I expressed the point I wanted to make quite clearly  
enough, so let me try again.


Chains of Feed documents are going to be ordered in some way, whether  
it's specified or not, whether they explicitly indicate it or not.   
For example, the chain of Feed documents representing an incremental  
feed is going to naturally be in temporal order.  You're not going to  
be tacking on new entries willy nilly to whichever of the documents  
in the chain fits your fancy at the moment.  You're going to create a  
new document when the one you were most recently adding entries to  
gets "full", and then your going to add entries there till that one  
is "full", and so on.  There may be exceptions, but by and large,  
whether the temporal order is explicit or not, that's what's going to  
happen.


Chains of pages of search results feeds are going to naturally be  
ordered with the best matches "on top".


The point I was trying to make was that you're not going to create  
all the documents without links between them and then randomly assign  
links between them in no specific order.  You're going to link  
between them in an order that makes sense within the context of how  
the feed was created.


I don't know how client applications are going to adapt to deal with  
the difference between incremental feeds and, for example, search  
results feeds, but I can't imagine that client software isn't going  
to rely on there being some sort of sense to the order of the Feed  
documents.


What I was trying to say further down with the example spec text I  
wrote was, let's state explicitly that this link relation does not  
have a temporal semantic, and if somebody wants a link relation with  
a temporal semantic, they should create another link/@rel value for it.


In other words...

In other words,

this does not imply a feed history thing...
...let's have this be a link for navigating among the pages of the  
current state of the feed (whether it be incremental or not--noting  
that some non-incremental feeds will only have one page, and won't  
need it).  The entries in the current state of the feed are not in  
any specific order (though we know that naturally they will be in  
some sort of order):

<feed>
  ...
  <link rel="prev" href="..." />
</feed>


How does the following have anything to do with history?  In an  
incremental feed, all of the entries, whether part of the Feed  
document at the subscription end or not, are part of the present  
state of the feed--they don't just exist back in history.  History is  
for non-incremental feeds.  I'm saying let's not work on navigation  
through history right now, but let's recognize that unless we say not  
to, people might try to use the mechanism designed for paging through  
the current state of a feed to navigate through the history of a feed  
too, so let's say not to.  I understand (or at least suppose) that  
you don't think we need to say not to, because you don't see the harm  
in making the link relation more generic.  I disagree.  I think we're  
going to end up with a mess if we don't make it specifically for  
navigating the current state.

this does...

<feed>
  ...
  <fh:incremental>true</fh:incremental>
  <link rel="prev" href="..." />
</feed>




Re: New Link Relations? [was: Feed History -04]



On Oct 17, 2005, at 3:44 PM, Mark Nottingham wrote:

On 17/10/2005, at 12:31 PM, James M Snell wrote:
Debating how the entries are organized is fruitless.  The Atom  
spec already states that the order of elements in the feed has no  
significance; trying to get an extension to retrofit order- 
significance into the feed is going to fail... just as I  
discovered with my Feed Index extension proposal.
Here's what the spec says: "This specification assigns no  
significance to the order of atom:entry elements within the  
feed."  ...but there may be some.  ...but there's no action you can  
take based on it unless something else tells you what the  
significance is.  ...which, yes, is very difficult to specify.


For the purposes of this discussion, it doesn't matter what the order  
of atom:entry elements within a feed document is.  But the order of  
chunks of atom:entry elements within a linked series of feed  
documents may have significance, and in fact, unless you just want to  
reconstruct the complete feed state, working with a series of feed  
documents with no specific order would be fairly unwieldy.  Imagine  
paging though a feed of search results with no idea of whether you'd  
just jumped from the most to the least significant results, or to the  
second most significant results.  Imagine trying to catch up on a  
fast-moving incremental feed without having any idea whether a link  
would take you to the first entries ever added to a feed or the one's  
you just missed.


I do believe that a "last" link relation would be helpful for  
completeness
...and "last" certainly seems to imply SOME sort of ordering of  
chunks, even if we know nothing about the order of the entries in  
each chunk.


To each of the following, perhaps we could add something to indicate  
that these link relations are all used to page through the current  
state of a feed, and not to navigate among various states of a feed.   
The fact that most people wouldn't have a clue what that means  
without some discussion of incremental and non-incremental feeds may  
be an argument for having a spec document to provide more explanation  
(rather than embedding an identical explanation in each  
Description).  Example:


"At any point in time, a feed may be represented by a series of Feed  
documents, each containing some of the entries that exist in the feed  
at that point in time.  In other words, a feed may contain more  
entries than exist in the Feed document that one retrieves when  
dereferencing the subscription URI, and there may be other documents  
containing representations of those additional entries.  The link  
relations defined in this specification are used to navigate between  
Feed documents containing pages or chunks of those entries which  
exist simultaneously within a feed.


"Note that this specification does not address navigation between the  
current and previous states of a type of feed which does not  
simultaneously contain its current and past entries.  For example, a  
"Top 100 Songs" feed might at any point in time only contain entries  
for the top 100 songs for a single week, which entries may or may not  
be divided among a number of Feed documents.  The entries for the top  
100 songs from the previous week are not only no longer part of the  
Feed document or Feed documents representing the current state of the  
feed--they are no longer part of the feed at all.  Another  
specification may describe a method of navigating between the current  
and previous states of such a feed.  The link relations defined in  
this specification are only used to navigate between the various Feed  
documents representing any single state of such a feed."



 -  Attribute Value: prev
 -  Description: A stable URI that, when dereferenced, returns a  
feed document containing entries that sequentially precede those in  
the current document. Note that the exact nature of the ordering  
between the entries and documents containing them is not defined by  
this relation; i.e., this relation is only relative.

 -  Expected display characteristics: Undefined.
 -  Security considerations: Because automated agents may follow  
this link relation to construct a 'virtual' feed, care should be  
taken when it crosses administrative domains (e.g., the URI has a  
different authority than the current document).


 -  Attribute Value: next
 -  Description: A stable URI that, when dereferenced, returns a  
feed document containing entries that sequentially follow those in  
the current document. Note that the exact nature of the ordering  
between the entries and documents containing them is not defined by  
this relation; i.e., this relation is only relative.

 -  Expected display characteristics: Undefined.
 -  Security considerations: Because automated agents may follow  
this link relation to construct a 'virtual' feed, care should be  
taken when it crosses administrative domains (e.g., the URI has a  
different authority than the current document).
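A rough Python sketch (mine, not part of the proposal) of an agent following "next" links to construct such a 'virtual' feed, stopping, per the security consideration, when a link crosses to a different authority.  The start URI is a placeholder and error handling is omitted.

import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"

def collect_entries(start_uri, max_docs=50):
    entries, uri = [], start_uri
    start_authority = urllib.parse.urlsplit(start_uri).netloc
    for _ in range(max_docs):
        with urllib.request.urlopen(uri) as resp:
            feed = ET.parse(resp).getroot()
        entries.extend(feed.findall(f"{{{ATOM}}}entry"))
        nxt = feed.find(f"{{{ATOM}}}link[@rel='next']")
        if nxt is None:
            break
        uri = urllib.parse.urljoin(uri, nxt.get("href"))
        if urllib.parse.urlsplit(uri).netloc != start_authority:
            break        # crossed an administrative domain; stop rather than wander
    return entries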

Re: Are Generic Link Relations Always a Good Idea? [was: Feed History -04]



On Oct 17, 2005, at 5:17 PM, Mark Nottingham wrote:
They seem similar. But, what if you want to have more than one  
paging semantic applied to a single feed, and those uses of paging  
don't align? I.e., there's contention for prev/next?


If no one shares my concern, I'll drop it... as long as I get to  
say "I told you so" if/when this problem pops up :)

I share your concern.


On 17/10/2005, at 3:21 PM, Thomas Broyer wrote:

I don't think there are different concepts of paging.

Paging is navigation through subsets (chunks) of a complete set of  
entries.
Yeah, but what if you need what amounts to a multi-dimensional  
array.  The method of addressing each dimension has to be  
distinguishable from the others.


If the complete set represents all the entries ever published  
through an ever-changing feed document (what a feed currently is,  
you subscribe with an URI and the document you get when  
dereferencing the URI changes as a sliding-window upon a set of  
entries), then paging allows for feed state reconstruction.
In other terms, feed state reconstruction is a facet of paging, an  
application to non-incremental feeds.
Let's say you're doing a feed for the Billboard top 100 songs.  Each  
week, the entire contents of the feed are swapped out and replaced by  
a new top 100 (ie. it is a non-incremental feed).  And let's say you  
don't want to put all 100 in the same document, but you want to break  
it up into 4 documents with 25 entries each.  You now have two  
potential axes that people might want to traverse--from songs 1-25 to  
26-50 to 51-75 to 76-100, or from this weeks 1-25 to last weeks 1-25  
to two weeks ago's 1-25, etc.  You can't link in both directions with  
the same "next".


There are clearly two distinct concepts here--navigating through the  
chunks that make up the current state of the feed resource, and in a  
non-incremental feed, navigating through the historical states of the  
feed resource.




Re: Feed History -04 -- is it history or paging or both?



If we're going to separate the concepts of "history" and "paging",  
then the term "history" doesn't really apply to incremental feeds.   
In an incremental feed, all of the entries are part of the current  
state of the feed.  We don't go back through history to find the  
present--we go to different pages of the present.  In a non- 
incremental feed also, we may have multiple pages of current  
entries (eg. the top 100 DVDs in chunks of 10), or we may have just  
one.  We also may preserve historical data (eg. the top 10 songs last  
week, the week before, etc.)


So what we end up with might looks like this:

Any feed, whether incremental or not, MAY contain something like this  
(names chosen somewhat arbitrarily, with an eye toward avoiding  
excess conceptual baggage):


page-a - the URI of one end of a chain of documents representing one  
state of a feed resource (eg. the current state of an incremental  
feed)--it doesn't really matter which end it is

page-b - the other end of the chain of documents
page++ - the next farther page from page-a
page-- - the next closer page to page-a

Neither "page-a" nor "page-b" is necessarily fixed--the entire  
contents of the chain may shuffle around, be added to, be deleted  
from, etc., in the case of something like search results.


A non-incremental feed MAY also contain something like this (history  
is temporal, so we can use temporally loaded terminology):


history-1 - a document containing a representation of one of the ends  
of or the entire temporally first historical state of the feed resource
history-n - a document containing a representation of one of the ends  
of or the entire temporally last (perhaps current and still changing)  
historical state of the feed resource
history++ - one of the ends or ... of the next more recent  
historical state... (moves toward history-n)
history-- - one of the ends ... of the next less recent historical  
state... (moves toward history-1)


If you want to catch up on an incremental feed to which you're  
subscribed, or want to get the last month of an incremental feed to  
which you are newly subscribed, you look for "page++" or "page--" and  
follow whichever one the subscription document (which can only have  
one, since it's one of the ends) contains till you've got everything  
you want.
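A rough Python sketch (mine, not part of the proposal) of "follow till you've got everything you want": walk the chain out of the subscription document and stop at the first entry id you've already seen.  "page--" is just the placeholder relation name used above, not a registered value.

import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"

def catch_up(subscription_uri, seen_ids):
    new_entries, uri = [], subscription_uri
    while uri:
        with urllib.request.urlopen(uri) as resp:
            feed = ET.parse(resp).getroot()
        for entry in feed.findall(f"{{{ATOM}}}entry"):
            if entry.findtext(f"{{{ATOM}}}id") in seen_ids:
                return new_entries       # caught up; the rest is already known
            new_entries.append(entry)
        link = feed.find(f"{{{ATOM}}}link[@rel='page--']")
        uri = urllib.parse.urljoin(uri, link.get("href")) if link is not None else None
    return new_entries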


If you start in the middle, you don't know which direction you're  
going...but since the ordering of the chain isn't defined, it's like  
the Cheshire cat says--it doesn't matter which direction you go if  
you don't know where you want to end up...or something like that.   
Perhaps convention could dictate that "page-a" be where the publisher  
subjectively thinks that a newcomer to the feed would be most likely  
to want to start reading.  It wouldn't always be correct, but so what?




Re: Feed History -04




On Oct 17, 2005, at 11:25 AM, Eric Scheid wrote:

On 18/10/05 2:04 AM, "Antone Roundy" <[EMAIL PROTECTED]> wrote:
I'd prefer that our use of 'prev' and 'next' be consistent with   
other uses
elsewhere, where 'next' traverses from the current position to  
the  one that

*follows*, whether in time or logical order. Consider the use of
'first/next/prev/last' with chapters or sections rendered in HTML.

...so do you follow forward through time or backward?  Is the   
starting
"current position" "now" or the "the beginning of time"?
Especially if we're
talking about history, following backward makes  as much sense as  
following

forward.


You can "start" wherever you want, but the @rel='first' archive is the
archive which contains the first entry that ever existed. Why would  
the
@rel='first' archive contain the last entry created, that makes no  
sense.


Here's how "first" pointing to the last entry created could make  
sense--it's pointing not to the first entry, but to the first page in  
a chain of pages of entries.  In that case, "first" points to the  
starting point, whether that be the first entry created or the last.


If you want to go backwards in time, then the "next" archive would  
be found

by following the @rel='prev' link .. because you are going backwards!


Consider this: <http://www.taipeitimes.com/News/editorials/archives/ 
2005/02/27/2003224800> -- my brother, who has spent time in Taiwan  
and China tells me that the Chinese are the same--they think of  
themselves as facing the past (which they can see--makes sense)--not  
the future.  The future is still unseen behind them.


But getting back to what I was saying above, "next in the chain" only  
correlates to one particular direction in time if the chain is defined  
in terms of a specific direction in time.  I face the past when I  
look at incremental feeds.




Re: Feed History -04



On Oct 17, 2005, at 11:06 AM, Byrne Reese wrote:

5. Is the issue of whether a feed is incremental or not (the
fh:incremental
element) relevant to this proposal?



non-incremental feeds wouldn't be paged, by definition, would they?



Why not? Why wouldn't I have a "Top 100 DVDs of All Time" broken out
into 10 feeds of 10 entries each?

Although one could say that the presence of 'next' and 'prev' links
indicate that the feed is incremental, and the absence of those links
indicates the feed is complete.


Ugh!  This suggests yet another feed model.  First, what we've  
already got:


1) Incremental feed document: a feed document that contains a subset  
of the entries that are currently in the feed (resource), where new  
entries are added to the feed without replacing old entries.


2) Non-incremental feed document: a feed document which contains  
representations of every entry which is currently part of the feed,  
where new entries replace old entries.


and then...

3) ??? feed document: a feed document which contains representations  
of a subset of the entries that are currently in the feed (resource),  
where new entries replace old entries.


Linking up #3 documents to show a history of the feed will be even  
more complicated than what we've been discussing.  Do we need such a  
beast?  I wish I could unequivocally say "no", but I'm forced to  
equivocate: are not search result feeds #3?  It seems that indeed we  
need two separate mechanisms--a paging mechanism and a history  
mechanism.  Unless both are going to go into the same extension, the  
extensions should be explicit about the fact that they describe  
one and not the other.  And we should choose link/@rel names  
carefully so that one doesn't end up using names better suited for  
the other.




Re: Feed History -04



On Oct 17, 2005, at 10:04 AM, Antone Roundy wrote:

4. Is the order of the entries in a feed relevant to this proposal?

...
1) A chain of temporally ordered chunks in the history of a feed  
where new entries are tacked onto the end.
2) Search results, where the order of everything all along the  
entire chain shifts around all the time.


If you're not going to reconstruct the whole thing, then your  
decision function for when to stop may have to be different  
depending on how things are ordered.


BTW, case 2 destroys the idea of a "fixed" end and a "live" end.

Having a means to indicate what the ordering is might make it  
easier to make the distinction between "next" and "prev" more  
intuitive.  I'm not sure how else we're going to reconcile  
terminology for significance and temporally ordered feeds.


Okay, I've got another idea--switch to totally generic terminology, a  
la:


"end-a": the URI of "most significant", "most current",  
"prerequisite"[1], etc. end of a sequence of documents, or a randomly  
selected end if there is no order.
"end-b": the URI of the "least significant", "least current",  
or ...uh, "postrequisite"? end of a sequence of documents or  
otherwise the opposite end from "end-a".
"a-ward": the URI of the document next closest to "end-a" in the  
sequence.
"b-ward": the URI of the document next closest to "end-b" in the  
sequence.


If you have neither "end-a" nor "end-b", then you should use "b-ward"  
to traverse out of the subscription document (ie. the subscription  
document in that case is assumed to be "end-a").


[1] if the sequence should be read first to last, for example, if  
it's a novel broken down into entries, "end-a" points to the place  
from which one should start.  Which end is "end-a" and which is "end- 
b" is somewhat subjective. For example, in a temporally ordered feed,  
is it most important to read what's most current, or to understand  
the origins of the present first before reading what's most current?



One more thing occurs to me--if this extension is going to be used to  
handle things like paging in search results, then it's not really  
"feed history", it's "paging".




Re: Feed History -04



On Oct 17, 2005, at 2:20 AM, Eric Scheid wrote:

On 17/10/05 5:09 PM, "James Holderness" <[EMAIL PROTECTED]> wrote:
1. Which relationship,  next or prev, is used to specify a link  
backwards in
time to an older archive. Mark Nottingham's Feed History proposal  
used prev.

Mark Pilgrim's XML.com article used next.
I'd prefer that our use of 'prev' and 'next' be consistent with  
other uses
elsewhere, where 'next' traverses from the current position to the  
one that

*follows*, whether in time or logical order. Consider the use of
'first/next/prev/last' with chapters or sections rendered in HTML.
...so do you follow forward through time or backward?  Is the  
starting "current position" "now" or "the beginning of time"?   
Especially if we're talking about history, following backward makes  
as much sense as following forward.


I prefer "next" to go back in time (if temporally ordered--from the  
most current chunk to the next most current chunk) or to less  
significant pages (in things like search results).  But I'll probably  
have to stop and think what "next" means in temporally ordered feeds  
from time to time since it'd be the reverse of temporal order.


2. Are next and prev both needed in the spec if we only require  
one of them

to reconstruct the full history?
Knowing that the most recently published archive won't likely  
remain the
most recently published archive, there will be use cases where it's  
better
to reconstruct the full history by starting at the one end which is  
fixed.

Not much sense starting at the other end which is constantly shifting.
Is this only going to be used to reconstruct full history?  What  
about just reconstructing the last 3 months (in which case you'd want  
a link from closer to the live end to closer to the fixed end), or  
reading from the beginning before deciding whether to continue  
reading what comes later (in which case you'd want a link from closer  
to the fixed end to closer to the live end).



3. Are the first/last relationships needed?
See (2) above for 'first'. Meanwhile 'last' could be followed by a  
user to
jump ahead to the end of the set of archives to see if the butler  
did it.

Who said 'first/next/prev/last' would only be used by machines?
As mentioned above, there may be cases where you'd prefer to start at  
either the fixed or live end, so as long as complete feed  
reconstruction isn't the only goal, I'd say yes.


But what's "first"?  It'd be the top results in a search feed, but  
would it be the start of time or the start from the present (before  
possibly traveling backward through time) in a temporally ordered  
feed?  Making it the start of time would prevent it from matching up  
well with how significance ordered feeds match up (ie. does start  
point to the thing you'd most likely want to see if you subscribed to  
the feed?)  If we're not careful, we'll be traversing out of "first"  
through "prev" and "last" through "next"!



4. Is the order of the entries in a feed relevant to this proposal?

not to this proposal.
If you mean not just the order within each chunk of the feed, but the  
order of the chunks, then not central, but certainly related.  Two  
cases come to mind:


1) A chain of temporally ordered chunks in the history of a feed  
where new entries are tacked onto the end.
2) Search results, where the order of everything all along the entire  
chain shifts around all the time.


If you're not going to reconstruct the whole thing, then your  
decision function for when to stop may have to be different depending  
on how things are ordered.


BTW, case 2 destroys the idea of a "fixed" end and a "live" end.

Having a means to indicate what the ordering is might make it easier  
to make the distinction between "next" and "prev" more intuitive.   
I'm not sure how else we're going to reconcile terminology for  
significance-ordered and temporally ordered feeds.


5. Is the issue of whether a feed is incremental or not (the  
fh:incremental

element) relevant to this proposal?

non-incremental feeds wouldn't be paged, by definition, would they?
This week's top ten on the first page, last week's ten on the second  
page...


Since this proposal is defining a paging mechanism, I think that what  
each page represents is relevant.  Is it an earlier part of the feed  
or an earlier state of the feed?


6. What to name the link relation that points to the active feed  
document?

subscribe, subscription, self, something else?

'subscribe'
I just noticed something about the definition of "self" in the format  
spec.  In one place it says:


   o  atom:feed elements SHOULD contain one atom:link element with a  
rel

  attribute value of "self".  This is the preferred URI for
  retrieving Atom Feed Documents representing this Atom feed.

Does that mean that it's the preferred subscription  
URI, or the preferred place to retrieve this chunk  
of the feed history?  The format spec didn't define paging, so it  
didn't

Re: Feed History -04



On Oct 14, 2005, at 11:28 AM, Thomas Broyer wrote:

Mark Nottingham wrote:

How about:



?

I always thought this was the role of @rel="self" to give the URI  
you should subscribe to, though re-reading the -11 it deals with "a  
resource equivalent to the containing element".
That's what some of us wanted it to be and thought it was intended to  
be.  The language that made it into the spec certainly falls short of  
expressing what was in PaceFeedLink, which is the proposal that added  
@rel="self" [1].


1. Isn't "a resource equivalent to the containing element" the same  
as "an alternate version of the resource described by the  
containing element"?
That's how I would read that language knowing nothing of the history  
of that part of the spec.  I think some people intended "equivalent"  
to mean "it may not be a different copy of the same bits, but  
whatever it is, it contains the same bits" (or at least the same code  
points, if it happens to be transcoded).


2. Is the answer to 1. is no then what does "a resource equivalent  
…" mean? Is it really different than "the URI you should subscribe  
to" (at least if @type="application/atom+xml")?
I think what some people want that to mean is "here's a place you  
could get the feed, but I'm not making any assertions regarding  
whether it's preferable to get it from there or somewhere else."


[1] http://www.imc.org/atom-syntax/mail-archive/msg15062.html



Re: Feed History -04



On Oct 14, 2005, at 11:13 AM, Mark Nottingham wrote:

On 14/10/2005, at 9:22 AM, Lindsley Brett-ABL001 wrote:
I have a suggestion that may work. The issue of defining what is  
"prev" and "next" with respect to a time ordered sequence seems to  
be a problem. How about defining the link relationships in terms  
of time - such as "newer" and "older" or something like that. That  
way, the collection returned should be either "newer" (more recent  
updated time) or "older" (later updated time) with respect to the  
current collection doc.


A feed isn't necessarily a time-ordered sequence. Even a feed  
reconstructed using fh:prev (or a similar mechanism) could have its  
constituent parts generated on the fly, e.g., in response to a  
search query.


The OpenSearch case mentioned by Thomas is what convinced me that  
terms related to temporal ordering aren't appropriate (what a pity,  
since "newer" and "older" are the perfect terms for time ordered  
sequences of feed documents!)


"Previous" and "next" suffer from the fact that they could easily be  
interpreted differently in different use cases. For example, for  
OpenSearch results "pages", clearly "prev" points to the search  
results that come up "on top" and "next" to the lower results. But in  
a conventional syndication feed, "next" could easily be taken to mean  
either "the next batch of entries as you track back towards the  
beginning of time from where you started (which is usually going to  
be the growing end of the feed)", or "a batch of entries containing  
the entries that were published next after the ones in this batch."   
I'd have to look at the document to remind myself of which "next"  
means, because either makes just as much sense to me.


Which brings me back to "top", "bottom", "up" and "down".  In the  
OpenSearch case, it's clear which end the "top" results are going to  
be found.  In the syndication feed case, the convention is to put the  
most recent entries at the "top".  If you think of a feed as a stack,  
new entries are stacked on "top".  The fact that these terms are less  
generic and flexible than "previous" and "next" is both an advantage  
and a disadvantage.  I think the question is whether it's an  
advantage in a significant majority of cases or not.  What orderings  
would those terms not work well for?
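For concreteness, here's a rough sketch of what a middle document in 
the chain might carry if the "top"/"up"/"down" idea were adopted (these 
rel values and URLs are purely illustrative, not from any draft):

<feed xmlns="http://www.w3.org/2005/Atom">
  ...
  <!-- the document at the end where new entries appear -->
  <link rel="top" href="http://example.org/feed.atom" />
  <!-- the adjacent document toward the "top" -->
  <link rel="up" href="http://example.org/archive/3.atom" />
  <!-- the adjacent document toward the "bottom" (older entries, in a
       conventional syndication feed) -->
  <link rel="down" href="http://example.org/archive/1.atom" />
  ...
</feed>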


Antone



Re: Spec wording bug?



On Oct 14, 2005, at 5:43 AM, Danny Ayers wrote:

I believe "the language of the resource" for hreflang makes no sense -
it will be the *representations* that are associated with languages,
and "the" implies a single language - there may be more than one.

If content negotiation might be used to select from among different  
languages (ie. if multiple representations are available from the  
same URI), then perhaps the hreflang attribute should be omitted.   
Were we to have allowed multiple languages to be specified in the  
same hreflang attribute to cover such cases, the wording would be  
incorrect, but since we didn't, I think it's correct as it is.
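To illustrate, a sketch of the case hreflang was meant for -- one 
representation per language at its own URI -- and the content-negotiated 
case where omitting it seems safest (URLs hypothetical):

<link rel="alternate" type="text/html" hreflang="en"
      href="http://example.org/story.en.html" />
<link rel="alternate" type="text/html" hreflang="de"
      href="http://example.org/story.de.html" />
<!-- one URI, language chosen by content negotiation: omit hreflang -->
<link rel="alternate" type="text/html"
      href="http://example.org/story" />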




Re: Feed History -04



On Oct 13, 2005, at 7:58 PM, Eric Scheid wrote:

On 14/10/05 9:18 AM, "James M Snell" <[EMAIL PROTECTED]> wrote:



Excellent.  If this works out, there is an opportunity to merge the
paging behavior of Feed History, OpenSearch and APP collections  
into a

single set of paging link relations (next/previous/first/last).



'first' or 'start'?

Do we need to define what 'first' means though?  I recall a dissenting
opinion on the wiki that the 'first' entry could be at either end  
of the

list, which could surprise some.


Yeah, that's a good question.  Maybe calling them "top" and "bottom"  
would work better.  Considering that the convention is to put the  
newest entry at the top of a feed document, "top" might be more  
intuitively understandable as being the new end.  You might also  
rename "next" and "previous" (or is it "previous" and "next"?) to  
"down" and "up".  There's SOME chance of that getting confused with  
hierarchical levels, but I could live with that.




Re: more than one content element?



On Oct 13, 2005, at 12:06 PM, A. Pagaltzis wrote:

* John Panzer <[EMAIL PROTECTED]> [2005-10-13 19:40]:

Well, you can pass them around by reference with content/@src
I think.

By the letter of the spec, but not by the spirit.


I just ran through the discussion of this very question on the  
mailing list[1], and though it looks like allowing composite types in  
remote content had pretty good support, that doesn't appear to have  
been translated into a Pace, and obviously, no language specifically  
allowing it got into the spec document.  Thus, it looks like the  
prohibition from section 4.1.3.1 stands, and that you're right that  
the only way you could do it without breaking the rules outright  
would be by ignoring the SHOULD (have content/@type when using  
content/@src), which would certainly be contrary to the "spirit" of  
the spec as it stands.


[1] http://www.imc.org/atom-syntax/mail-archive/msg15949.html
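For reference, a sketch of the two forms in question (URLs 
hypothetical): the first follows the SHOULD in 4.1.3.1, the second is 
the omit-@type workaround described above.

<!-- out-of-line content with an advisory (non-composite) media type -->
<content type="application/pdf" src="http://example.org/report.pdf" />

<!-- omitting @type to avoid naming a composite type; legal, but it
     ignores the SHOULD and so runs against the spirit of the spec -->
<content src="http://example.org/multipart-resource" />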



Re: Straw Poll: age:expires vs. dcterms:valid (was Re: Unofficial last call on draft-snell-atompub-feed-expires-04.txt) On Oct 8, 2005, at 8:37 AM, James M Snell wrote: I wanted to indicate that a gi



Oops, sent this from the wrong address on Saturday. No wonder it  
didn't get through.


On Oct 8, 2005, at 8:37 AM, James M Snell wrote:

I wanted to indicate that a given entry must expire at Midnight on  
Dec, 12, 2005 (GMT).

using age:expires:


[snip]

using dcterms:valid (http://web.resource.org/rss/1.0/modules/dcterms/#valid)


 
   end:2005-12-12T00:00:00Z
 

 Advantage:
   * Existing namespace, known element
 Disadvantage:
   * Value can be many different things. I've even seen cases in  
which the content of dcterms:valid is an XML structure.
 My chief problem with dcterms:valid (and with dublin core in  
general) is that the elements are very loosely defined.  The  
content can literally be anything folks want it to be and still be  
considered valid.  Unless we constrain the value space for this  
element when used in Atom, it *could* lead to a bunch of extra work  
for consumers to parse and process those dates. I prefer very  
crisply defined elements.  Then again, reusing an existing  
namespace is Goodness.




I think it would be going too far to say "when using dcterms:valid in  
Atom, you must follow this profile", because we don't own dcterms,  
and doing so might limit people from doing valid things with it that  
don't follow that profile.  But I do think it would be reasonable to  
say "when using dcterms:valid in Atom, it is recommended that you  
follow this profile--otherwise your data may be technically valid,  
but not widely understood", thus giving developers an excuse for not  
supporting data not formatted according to that profile.  If a use  
case that requires a different format becomes common, then developers  
can start supporting more formats at that point.


That said, my "vote" is for doing what I just said--advocate the use  
of dcterms:valid for this purpose, with the date formatted to match  
Atom's date construct profile.


BTW, you might choose language that leaves room for having both start  
and end dates for validity--for example, to enable Atom delivery of a  
coupon that's valid for a particular span of dates.
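As a sketch of one such profile (values illustrative, and whether DCMI 
Period is the right value format is exactly the profile question), 
dcterms:valid could carry W3C-DTF dates for a start, an end, or both:

<entry xmlns="http://www.w3.org/2005/Atom"
       xmlns:dcterms="http://purl.org/dc/terms/">
  ...
  <dcterms:valid>
    start=2005-12-01T00:00:00Z; end=2005-12-12T00:00:00Z; scheme=W3C-DTF;
  </dcterms:valid>
</entry>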




Re: Straw Poll: age:expires vs. ...... plus a gazillion words



Gh!  Sorry about the mile long subject. Gotta be careful with  
that copy and paste!




Re: ACE - Atom Common Extensions Namespace



On Oct 2, 2005, at 11:15 PM, Mark Nottingham wrote:
I think this is a well-intentioned effort, but at the wrong end of  
the process. The market (i.e., users and implementors) should have  
a go at sorting out at what's common/prevalent enough to merit this  
sort of thing; having a co-ordinated namespace will lead to the  
problem of what to lump into it, how to version individual  
extensions within it, etc.


I have to agree with Mark.  Consider this scenario: an extension gets  
added to ACE. Someone makes an extension that does the same thing  
differently. The market prefers the non-ACE method and adopts it more  
widely than the ACE solution. Now not only do you have multiple  
namespaces to declare, but one of them has a bunch of elements that  
don't get used, yet implementors feel compelled to implement them  
because they're part of this special namespace.


Here's another scenario: an extension gets added to ACE, and another  
extension gets created that does the same thing better. Because the  
first has the ACE stamp of approval, the inferior method gets wide  
support, and the superior method dies.


Both scenarios suggest that the market should be given time to choose  
best practices rather than some group deciding which practices are  
going to get special status in advance. If a feed is going to carry  
elements from a bunch of different extensions, it's going to be a  
relatively heavy feed anyway. The overhead of including multiple  
namespace declarations isn't going to be that great.




Re: FYI: Updated Index draft



On Thursday, September 22, 2005, at 10:20  AM, James M Snell wrote:

Antone Roundy wrote:
I was thinking yesterday of suggesting that feed/id be used the way 
you're using i:domain. Which is better is probably a matter of 
whether ranking domains that span multiple feeds will be useful or 
not. In the movie ratings use case presented below, perhaps rather 
than a fivestar scheme and netflix and amazon domains, it might 
make more sense to do this:


Using atom:id as the ranking domain would limit the ranking to a 
single feed which is useful, but does not cover the full range of 
cases.

...

Yes, there are two special cases here:

1. Lack of a i:domain
2. i:domain value that is a same document reference


I think a ranking without a domain is pretty much useless--or at least 
likely to lead to problems downstream--so that case doesn't need to be 
covered.  More on that below.



 
   ...
   
 
   Feed1
   # 
   
 A
 50
 20
   
   
 B
 25
 40
   
 
 
   Feed2
   # 
   
 C
 50
 30
   
   
 D
 25
 10
   
 
   
 


In this example, the domainless rankings were added when the XHTML 
document was created, right?  So the XHTML document is essentially an 
aggregate feed, just not in Atom format.  Would it not make as much or 
more sense to mint an ID for the document (call it the ID of a "virtual 
Atom Feed Document" if you don't actually create an aggregate feed) and 
use it to scope those i:rank elements?  If, somehow, someone were to 
pull the atom:feeds out of the XHTML document (if atom:feed getting 
embedded into xhtml:body is going to happen, then is not atom:feed 
getting extracted from xhtml:body also likely?) and aggregate them with 
other feeds with domainless i:rank elements, the scopes of those 
elements would get mixed.


* Since the urn:(netflix|amazon).com/reviews schemes are feed 
independent, it is not necessary to indicate a feed (or "domain") in 
this case.
* For a feed-specific scheme, like natural order, the feed ID would 
be included like this (so that if these entries were aggregated, it 
would be clear that the i:order elements were relevant to the source 
feed, not the aggregate feed):


The goal of @scheme is to identify the type of ranking to apply while 
the goal of @domain is to identify the scope of the ranking.  I do not 
believe that it is a good idea to conflate the two.


Okay, I've come to agree with that while writing and editing this 
message.  Note however that "fivestar" also indicates multiple things:


1) Higher numbers are "better"
2) The range is 0 to 5 (BTW, if this is limited to integers, how will 
you handle things like 3.5 stars, which are common in that type of 
rating system? Maybe decimal values need to be allowed.)

3) Hint: you might want to display the value as stars

#1 is the only one needed for sorting of entries. #2 would be useful if 
the feed reader wanted to display some sort of graphical element to 
indicate the ranking. #3 might be slightly useful, but except for the 
most popular schemes, would probably be ignored. Perhaps all of these 
should be separated, a la:



...

3

...where @domain is the feed/id of the feed if there's just one feed in 
scope, or a value that won't be duplicated by any feed/id otherwise (if 
one can mint a unique feed id, surely one can also mint a unique id 
that won't be used for a feed).


I'd suggest that i:ranking-scheme/@domain either default to the 
containing feed/id (or the one from atom:source, if it exists) or be 
required, i:rank/@domain be required, @order default to ascending, 
@min-value default to 0, and the rest of the attributes be optional 
with no defaults.
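Pulling that together, a sketch of how the pieces might fit (the i: 
namespace URI and the scheme URN are hypothetical; the defaults follow 
the suggestion above):

<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:i="http://example.org/ns/index">
  <id>urn:example:feed1</id>
  <!-- @domain defaults to the containing feed's id; @order to ascending -->
  <i:ranking-scheme scheme="urn:example:fivestar" order="descending"
                    min-value="0" max-value="5" label="Stars" />
  <entry>
    ...
    <i:rank domain="urn:example:feed1"
            scheme="urn:example:fivestar">3.5</i:rank>
  </entry>
</feed>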




Re: FYI: Updated Index draft



On Wednesday, September 21, 2005, at 11:43  PM, James M Snell wrote:


{domain}
I was thinking yesterday of suggesting that feed/id be used the way 
you're using i:domain. Which is better is probably a matter of whether 
ranking domains that span multiple feeds will be useful or not. In the 
movie ratings use case presented below, perhaps rather than a 
fivestar scheme and netflix and amazon domains, it might make more 
sense to do this:



urn:my_reviews
descending
descending


Movie A
3
4


Movie B
2
1



Notes:
* The i:order element tells the user agent whether higher or lower 
numbers are considered "better", "higher priority", "first", or 
whatever. In these cases, higher numbers are better, so would 
typicially be shown first, so they're considered a "descending" schemes.
* i:order/@label indicates a human readable label for the scheme, and 
could be optional.
* Since the urn:(netflix|amazon).com/reviews schemes are feed 
independent, it is not necessary to indicate a feed (or "domain") in 
this case.
* For a feed-specific scheme, like natural order, the feed ID would be 
included like this (so that if these entries were aggregated, it would 
be clear that the i:order elements were relevant to the source feed, 
not the aggregate feed):



urn:my_feed
ascending

urn:my_feed/a
1


urn:my_feed/b
2



If sticking with i:domain, I'd recommend that you recommend that in 
cases where a ranking domain does not span multiple feeds, the feed/id 
value be used for the value of i:domain, and that in all cases, the 
same care be taken to (attempt to) ensure that i:domain's value is 
unique to what is intended to be a particular domain.




Re: Don't Aggregrate Me



On Monday, August 29, 2005, at 10:39  AM, Antone Roundy wrote:



More robust would be:

...enabling extension elements to be named in @target without requiring 
a list of @target values to be maintained anywhere.




Re: Don't Aggregrate Me



On Monday, August 29, 2005, at 10:12  AM, Mark Pilgrim wrote:

On 8/26/05, Graham <[EMAIL PROTECTED]> wrote:

(And before you say "but my aggregator is nothing but a podcast
client, and the feeds are nothing but links to enclosures, so it's
obvious that the publisher wanted me to download them" -- WRONG!  The
publisher might want that, or they might not ...


So you're saying browsers should check robots.txt before downloading
images?

...

Normal Web browsers are not robots, because they are operated by a
human, and don't automatically retrieve referenced documents (other
than inline images).


As has been suggested, to "inline images", we need to add frame 
documents, stylesheets, Java applets, external JavaScript code, objects 
such as Flash files, etc., etc., etc.  The question is, with respect to 
feed readers, do external feed content (content/@src), 
enclosures, etc. fall into the same exceptions category or not?  If 
not, then what's the best mechanism for telling feed readers whether 
they can download them automatically--robots.txt, another file like 
robots.txt, or something in the XML?  I'd prefer something in the XML.  
A possibility:





...



...
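Purely as a sketch of the something-in-the-XML idea (the element name, 
namespace, and attribute values here are hypothetical, not from any 
draft):

<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:fetch="http://example.org/ns/auto-fetch">
  <!-- feed-level default: don't auto-download enclosures -->
  <fetch:policy target="enclosure" allow="no" />
  <fetch:policy target="content" allow="yes" />
  <entry>
    ...
    <!-- entry-level override -->
    <fetch:policy target="enclosure" allow="yes" />
  </entry>
</feed>

The earlier follow-up in this thread suggests additionally letting 
@target name extension elements by qualified name, so no central list 
of @target values would need to be maintained.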



Re: Don't Aggregrate Me



On Friday, August 26, 2005, at 04:39  AM, Eric Scheid wrote:

On 26/8/05 3:55 PM, "Bob Wyman" <[EMAIL PROTECTED]> wrote:

Remember, PubSub never does
anything that a desktop client doesn't do.


Periodic re-fetching is a robotic behaviour, common to both desktop
aggregators and server based aggregators. Robots.txt was established to
minimise harm caused by automatic behaviour, whether by excluding
non-idempotent URL, avoiding tarpits of endless dynamic links, and such
forth. While true that each of these scenarios involve crawling new 
links,

the base principle at stake is to prevent harm caused by automatic or
robotic behaviour. That can include extremely frequent periodic 
re-fetching,
a scenario which didn't really exist when robots.txt was first put 
together.


I'm with Bob on this.  If a person publishes a feed without limiting 
access to it, they either don't know what they're doing, or they're 
EXPECTING it to be polled on a regular basis.  As long as PubSub 
doesn't poll too fast, the publisher is getting exactly what they 
should be expecting.  Any feed client, whether a desktop aggregator or 
aggregation service, that polls too fast ("extremely frequent 
re-fetching" above) is breaking the rules of feed consuming 
etiquette--we don't need robots.txt to tell feed consumers to slow down.




Re: Don't Aggregrate Me



On Thursday, August 25, 2005, at 03:12  PM, Walter Underwood wrote:

I would call desktop clients "clients" not "robots". The distinction is
how they add feeds to the polling list. Clients add them because of
human decisions. Robots discover them mechanically and add them.

So, clients should act like browsers, and ignore robots.txt.

How could this all be related to aggregators that accept feed URL 
submissions?  I'd imagine the desired behavior is the same as for 
crawlers--should they check for robots.txt at the root of any domain 
where a feed is submitted?  How about cases where the feed is hosted on 
a site other than the website that it's tied to (for example, a service 
like FeedBurner) so some other site's robots.txt controls access to the 
feed (...or at least tries to)?


We've already rejected the idea of trying to build DRM into feeds--is 
there some way to sidestep the legal complexities and problems that 
would arise from trying to do that and at the same time enable machine 
readable statements about what the publisher wants to allow others to 
do with the feed, and things they want to prohibit, into the feed?  If 
we're not qualified to design an extension to do that, is there someone 
else who is qualified, and who cares enough to do it?




Re: Don't Aggregrate Me



I can see reasonable uses for this, like marking a feed of local disk 
errors as not of general interest.


"This is not published data" - 
Security by obscurity^H^H^H^H^H^H^H^H^H saying "please" - < 
http://www-cs-faculty.stanford.edu/~knuth/> (see the second link from 
the bottom)


This certainly wouldn't be useful as a security measure.  But yeah, a 
way to tell the big republishing aggregators that you'd prefer they 
didn't republish the feed could be useful, in case they somehow got 
ahold of the URL of a non-sensitive (and thus non-encrypted and 
authentication-protected), but not-intended-for-public-consumption 
feed.  Ideally though, such feeds should probably be password 
protected, since that wouldn't require aggregator support for an 
extension element.




Re: Don't Aggregrate Me



On Thursday, August 25, 2005, at 08:16  AM, James M Snell wrote:
Good points but it's more than just the handling of human-readable  
content.  That's one use case but there are others.  Consider, for  
example, if I was producing a feed that contained javascript and CSS  
styles that would otherwise be unwise for an online aggregator to try  
to display (e.g. the now famous Platypus prank...  
http://diveintomark.org/archives/2003/06/12/ 
how_to_consume_rss_safely).  Typically aggregators and feed readers  
are (rightfully) recommended to strip scripts and styles from the  
content in order to reliably display the information.  But, it is  
foreseeable that applications could be built that rely on these types  
of mechanism within the feed content.  For example, I may want to  
create a feed that provides the human interaction for a workflow  
process -- each entry contains a form that uses javascript for  
validation and perhaps some CSS styles for formatting.


For that, you'd either need to use a less sophisticated feed reader  
that didn't strip anything out (and only use it to subscribe to fully  
trusted feeds, like internal feeds), or a more sophisticated feed  
reader that allowed you to turn off the stripping of "potentially  
dangerous" stuff, or to configure exactly what was, or better yet,  
wasn't, stripped (perhaps on a feed-by-feed basis).


The stripping-or-not behavior should be controlled from the client  
side, so I don't see any point in providing a mechanism for the  
publisher to provide hints about whether or not to strip things out.   
That would probably only benefit malicious publishers at the expense of  
brain-dead clients:



...

	>TriggerExploitThatErasesDrive('C:');





Re: Don't Aggregrate Me



On Thursday, August 25, 2005, at 12:25  AM, James M Snell wrote:
Up to this point, the vast majority of use cases for Atom feeds is the 
traditional syndicated content case.  A bunch of content updates that 
are designed to be distributed and aggregated within Feed readers or 
online aggregators, etc.  But with Atom providing a much more flexible 
content model that allows for data that may not be suitable for 
display within a feed reader or online aggregator, I'm wondering what 
the best way would be for a publisher to indicate that a feed should 
not be aggregated?


For example, suppose I build an application that depends on an Atom 
feed containing binary content (e.g. a software update feed).  I don't 
really want aggregators pulling and indexing that feed and attempting 
to display it within a traditional feed reader.  What can I do?



In that particular use case, I'd expect entries something like this:


...
Patch for MySoftware
	This patch updated MySoftware version 1.0.1 to version 
1.0.2

k3jafidf8adf...


Looking at this, my thoughts are:
1) Feed readers that can't handle the content type are just going to 
display the summary or title anyway, so it's not going to hurt anything.
2) People whose feed readers can't handle the patches probably aren't 
going to subscribe to this feed anyway.  Instead they'll subscribe to 
your other feed (?) which gives them a link to use to download the 
patch:


...
Patch for MySoftware
		This patch updated MySoftware version 1.0.1 to version 
1.0.2




I don't think we need anything special to tell aggregators to beware 
content that they don't know how to handle in this feed.  That should 
be marked clearly enough by @type.  More in a separate message...
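For reference, roughly what the two entry shapes described above might 
look like (the media type and URL are illustrative):

<!-- the binary-content feed, for the application built around it -->
<entry>
  <title>Patch for MySoftware</title>
  <summary>Updates MySoftware from version 1.0.1 to 1.0.2</summary>
  <content type="application/octet-stream">k3jafidf8adf...</content>
</entry>

<!-- the companion feed for ordinary readers: link to the download -->
<entry>
  <title>Patch for MySoftware</title>
  <summary>Updates MySoftware from version 1.0.1 to 1.0.2</summary>
  <link rel="enclosure" type="application/octet-stream"
        href="http://example.org/patches/mysoftware-1.0.2.bin" />
</entry>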




Re: Feed History: stateful -> incremental?



On Wednesday, August 24, 2005, at 04:07  PM, Mark Nottingham wrote:
Just bouncing an idea around; it seems that there's a fair amount of 
confusion / fuzziness caused by the term 'stateful'. Would people 
prefer the term 'incremental'?


I.e., instead of a "stateful feed", it would be an "incremental feed"; 
fh:stateful would become fh:incremental.


Worth it?

I think it's worth seeing if a term can be found that has a more 
intuitively understandable meaning.  It might be helpful to explore the 
kinds of names that describe non-stateful feeds too--if a better term 
can be found for that, it could be used instead (and just reverse true 
& false).  Brainstorming a little:


Stateful: sliding window, most recent segment, segment, stream, entry 
stream, appendable, appending, augmentable, augmenting


Non-stateful: uh...stateful? ("what you just downloaded represents the 
current state of the entire feed"), current state, current, snapshot, 
fixed entry, set entry, replacable, replacing, entry replacing, 
non-appending, non-augmenting




Re: If you want "Fat Pings" just use Atom!



On Monday, August 22, 2005, at 09:54  PM, A. Pagaltzis wrote:

* Martin Duerst <[EMAIL PROTECTED]> [2005-08-23 05:10]:

Well, modulo character encoding issues, that is. An FF will
look differently in UTF-16 than in ASCII-based encodings.

Depends on whether you specify a single encoding for all entries
at the HTTP level or not. For this application, I would do just
that, in which case, as a bonus, non-UTF-8 streams would get to
avoid resending the XML preamble over and over and over.


Of course, if you do that, you won't be able to keep signatures for 
entries originally published in an encoding other than the one you've 
chosen.


If one were to want to signal an encoding change mid-stream, how might 
that work with what's been proposed thus far?




Re: Proposed changes for format-11



On Monday, August 1, 2005, at 09:55  AM, A. Pagaltzis wrote:

* Robert Sayre <[EMAIL PROTECTED]> [2005-08-01 17:25]:

On 8/1/05, Sam Ruby <[EMAIL PROTECTED]> wrote:

Perhaps the following could be added to section 6.2:

  The Atom namespace is reserved for future
  forwards-compatable revisions of Atom.


s/compatable/compatible/


Sounds OK to me, but I recall squawking about this.


There wasn’t any squawking about the rule as such, I think. A
minor amount of squawking was about what a consumer should do
when it encounters Atom-namespaced elements in locations it
didn’t expect them.

Per spec: it should simply treat them as unknown foreign markup.
Intent: this allows old consumers to continue working with future
revisions of the spec, so long as changes are not so drastic that
a new namespace is warranted to prevent existing consumers from
doing anything with new documents.

It sounds to me like we might benefit from adding language specifying 
that elements in the Atom namespace can appear as children of elements 
from other namespaces, but may not appear as children of elements in 
the Atom namespace except as specified by the spec (or from wording the 
language to be added so that it says that).


...I am correct about our intent to allow Atom elements to be used as 
children of extension elements, right?  For example, that one should be able 
to do this:



My title



...rather than having to do this:


My title



...right?
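In other words (a sketch; the "myext" element and its namespace are 
hypothetical), the question is whether this is permitted:

<myext:section xmlns:myext="http://example.org/myext"
               xmlns:atom="http://www.w3.org/2005/Atom">
  <atom:title>My title</atom:title>
</myext:section>

...as opposed to forcing every extension to mint its own equivalent 
element:

<myext:section xmlns:myext="http://example.org/myext">
  <myext:title>My title</myext:title>
</myext:section>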



Re: Comments Draft



On Sunday, July 31, 2005, at 10:24  AM, A. Pagaltzis wrote:

* Antone Roundy <[EMAIL PROTECTED]> [2005-07-31 01:15]:

I could add more, but instead, here's my suggestion for
replacing that sentence:

If the resource being replied to is an atom:entry, the
value of the href attribute MUST be the atom:id of the
atom:entry. If the value of the type attribute is
"application/atom+xml" then the href attribute MUST be the
(URL/URI/IRI) of an Atom Entry Document containing the
atom:entry being replied to.


This undermines the purpose of the link.


I'd say that not being able to tell whether @href in 
[EMAIL PROTECTED]"in-reply-to"] is dereferencable or not is what undermines 
link.


The primary purpose of atom:link[@rel="in-reply-to"] is to identify the 
resource (which may be an atom:entry) being replied to.  If that 
resource is an atom:entry, then the appropriate identifier for it is 
its atom:id.


If "If the resource being replied to is an atom:entry, the value of the 
href attribute MUST be the atom:id of the atom:entry" doesn't sound 
like a good rule, then I'd argue that using atom:link to identify the 
resource being replied to is a bad idea.


As I've said before, I think that stuffing data that happens to be a 
URI but may not be dereferencable into link/@href is a bad idea.  If we 
ARE going to do it, then I think we need a way to at least hint at 
whether it's a dereferencable link or some other data stuffed into a 
link element.


Here's what the spec says @type is for:

   On the link element, the "type" attribute's value is an advisory
   media type; it is a hint about the type of the representation that is
   expected to be returned when the value of the href attribute is
   dereferenced.

If @href isn't dereferencable, then the existence of @type is deceptive. 
 I suppose it could mean "when I saw it, it was in some kind of Atom 
document", but so what?  What if the feed gets converted to RSS 2.0, 
the atom:id is put into guid, and I find the entry in the RSS feed?



Atom Entry Documents can move around; their IDs are eternal.
True, so you could just omit @type from this link if you're concerned 
that your entry document might move.  Or we could go with something 
like this:







Or we could just stick with what has been proposed, perhaps including 
what I proposed in my last message, and if the entry document moves, 
then oh well, the web has another broken link just as it would in what 
I proposed just above here or in any case where a dereferencable link 
was published, but the atom:id would still be valid.  If after moving 
the entry document, one were to publish the in-reply-to link again, it 
would be appropriate to remove the @type attribute.


...okay, that last sentence suggests that what I propose just above 
here is superior to having possibly-dereferencable atom:links, 
because you could update the found-in-entry-document link if it got out 
of sync with the location of the document.  Otherwise, we'll have to be 
limited to linking to the feed in which the entry is found.
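A sketch of the shape argued for here, keeping the never-changing 
identifier and the updatable, traversable pointer separate (the reply: 
element, its namespace, and the URIs are hypothetical, not the draft's 
syntax):

<entry xmlns="http://www.w3.org/2005/Atom"
       xmlns:reply="http://example.org/ns/reply">
  ...
  <!-- the atom:id of the entry being replied to; not necessarily fetchable -->
  <reply:in-reply-to ref="tag:example.org,2005:entry-123">
    <!-- a dereferencable pointer to an Atom Entry Document, which can
         be republished if the document moves -->
    <link type="application/atom+xml"
          href="http://example.org/entries/123.atom" />
  </reply:in-reply-to>
</entry>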




Re: Comments Draft



On Saturday, July 30, 2005, at 04:37  PM, James M Snell wrote:
One challenge is that for anything besides references to Atom entries, 
there is no guarantee that in-reply-to links will be non-traversable.  
For instance, if someone were to go and define a behavior for using 
in-reply-to with RSS, the href of the link may point to the same URL 
that the RSS item's link element points to (given that there is no way 
to uniquely identify an RSS item).
href="http://www.example.com/entries/1"; />


This is legal in the spec but is left undefined.


The natural choice of values when replying to an RSS 2.x item would be 
the guid, since it's the closest counterpart to atom:id.  But if the 
guid is not a permalink (ie. not dereferencable), then it won't have a 
MIME type, just as non-dereferencable atom:id's don't have a MIME type. 
 Both of these facts suggest that the following sentence should 
probably be removed from section 3:


   If the type attribute is omitted, it's value is assumed to be 
"application/atom+xml".


Instead, I'd suggest stating that if the type attribute is omitted, the 
in-reply-to link cannot be assumed to be dereferencable, and that 
non-dereferencable links MUST NOT have a type attribute.


Editorial notes about this sentence:

   A type attribute
   value of "application/atom+xml" indicates that the resource being
   responded to is an atom:entry and that the href attribute MUST
   specify the value of the parent entries atom:id element.

1) "parent" probably isn't the best word here since in-reply-to isn't 
being defined in terms of parents and children.


2) "entries" -> "entry's"

I could add more, but instead, here's my suggestion for replacing that 
sentence:


If the resource being replied to is an atom:entry, the value of the 
href attribute MUST be the atom:id of the atom:entry.  If the value of 
the type attribute is "application/atom+xml" then the href attribute 
MUST be the (URL/URI/IRI) of an Atom Entry Document containing the 
atom:entry being replied to.


Anything else could lead to inconsistencies.  For example, when 
replying to an atom:entry that can be found in an Atom Entry Document, 
but whose atom:id does NOT point to that document, there would be 
multiple choices available for the reply link's href attribute.




Re: Comments Draft



On Saturday, July 30, 2005, at 03:59  PM, A. Pagaltzis wrote:

I’d prefer to eliminate the one contra you listed by using an
extension element for this purpose (as always, nested into the
link.) Of course, that means need a namespace…


Given that the link to the feed is traversable, and the "link" to the 
ID of the entry may not be, I'd suggest using an extension element for 
the possibly-non-traversable link, if for either--which would leave 
us needing to pick a good @rel value for the traversable one.  That 
would also keep my con point around:






...unless it were done this way:





...but that doesn't quite work, because the logical relationships between 
the elements are entry to foo:non-traversable and foo:non-traversable 
to link, not entry to link and link to foo:non-traversable.


I'm satisfied to live with the negative point of having a link in an 
unexpected place.  And since I don't like non-traversable atom:links, I 
myself prefer , 
though I don't expect it to be adopted.




Re: Comments Draft



On Saturday, July 30, 2005, at 02:38  PM, A. Pagaltzis wrote:

* James M Snell <[EMAIL PROTECTED]> [2005-07-30 18:10]:

Yeah, source is likely the most logical choice, but I didn't
want to confuse folks with a link @rel="source" that has a
different meaning from atom:source.


An argument by way of which I came around to Antone’s suggested
“start-of-thread,” though I was going to suggest “thread-start.”

I took a look at the draft to verify whether I correctly understood 
what this link points to, and I think it isn't what I originally 
thought based on the old name "root".  Does this point to the feed in 
which the immediate parent entry was found, or to the feed in which the 
first entry in a thread of replies was found?  If the former, which the 
draft seems to suggest, and which seems more useful, then 
"start-of-thread" and "thread-start" probably aren't such good names 
after all.  With clarity in mind, "in-reply-to-feed" might be good, 
though it's a bit long.


And a problem comes to mind: if you have multiple "in-reply-to" links, 
how do you relate those to their respective "in-reply-to-feed" links 
(in case they're different)?  Is it possible?  Dare we do something 
like this?  (Wish we to if we dare?)
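That is, something along these lines (a sketch; the rel values are the 
ones floated in this thread, and the URIs are hypothetical):

<link rel="in-reply-to" type="application/atom+xml"
      href="tag:example.org,2005:parent-entry">
  <!-- nested: the feed in which the parent entry was found -->
  <link rel="source-feed" type="application/atom+xml"
        href="http://example.org/feed.atom" />
</link>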






Pro:
* Groups the two links together
* Gives us more options for what to call the inside one without 
creating confusion: "source-feed", for example.  It would be nice to 
choose a name that's not likely to be the perfect name for some other 
use, or to define this @rel value broadly enough to be applicable to 
other purposes.


Con:
* Puts an atom:link in a location not expected by apps that don't 
understand this extension.





Re: Comments Draft



On Friday, July 29, 2005, at 02:41  PM, A. Pagaltzis wrote:

* Antone Roundy <[EMAIL PROTECTED]> [2005-07-29 02:40]:

On Thursday, July 28, 2005, at 05:58  PM, James M Snell wrote:

"root" is now called "replies-source"... which is a horrible
name but I'm not sure what else to call it


How about "start-of-thread".


Or maybe “parent-entries?”


How about "mother-of-all-entries"?  Ha ha.

The problem with "parent-entries" is that this link may not be pointing 
to the immediate parent, right?




Re: Comments Draft



On Thursday, July 28, 2005, at 05:58  PM, James M Snell wrote:
 * "root" is now called "replies-source"... which is a horrible name 
but I'm not sure what else to call it



How about "start-of-thread".



Re: Notes on the latest draft - xml:base



On Wednesday, July 20, 2005, at 10:22  PM, A. Pagaltzis wrote:

* James Cerra <[EMAIL PROTECTED]> [2005-07-21 05:00]:

Sjoerd Visscher,

That's because it is not an attempt at abbreviating strings,
but to preserve the meaning of relative URIs, when content is
used outside of its original context.


Same thing.  You are framing the question in a manner that
hides the problem, but it's still there.


No, it frames the question in a manner that addresses the purpose
of having the mechanism.

Right--it frames it in the context created by RFC 3986.  However, since 
this issue is commonly misunderstood, it's likely that xml:base will 
often be used for string abbreviation in the wild--thus, indeed the 
problem is still there.


If anyone doubts that base URIs as defined by RFC 3986 are not intended 
simply for abbreviation, read section 4.4 ("Same-Document References"). 
 The method outlined there for recognizing same-document references 
would be entirely unreliable if base URIs were used to abbreviate 
arbitrary portions of URIs.  It only works if the base URI is an 
address from which the data containing the relative URI can be 
retrieved.  If base URIs are intended for abbreviation convenience, 
then that section of RFC 3986 is completely broken.  My impression is 
that it isn't broken, but says what was intended.


...but now I've forgotten whether anyone has made a concrete suggestion 
about what can be done at this point, and to solve exactly what 
problem. Do I smell another note in the infamous implementers guide?




Re: I-D ACTION:draft-nottingham-atompub-feed-history-01.txt



On Wednesday, July 20, 2005, at 11:44  AM, Thomas Broyer wrote:

I was actually wondering why non-stateful feeds couldn't have archives:
in the "This month's Top 10 records" feed, why couldn't I link to "Last
month's Top 10 records"?
If this kind of links are not dealt within feed-history, then I suggest
splitting the draft into two (three) parts:
  1. fh:stateful: whether a feed is stateful or not
  2. fh:prev: state reconstruction of a stateful feed
  3. (published later) fh:: link to archives of a non-stateful feed
(no, I actually don't want such a split, I'd rather deal with the "3."
in feed-history, no matter how)

If we want to solve this "issue" using a distinct element (fh:prev if
fh:stateful=true, fh: if fh:stateful=false), is fh:stateful still
needed? The presence of fh:prev would be equivalent to 
fh:stateful=true,

the presence of fh: would be equivalent to fh:stateful=false, the
absence of both fh:prev and fh: would be equivalent to the absence
of fh:stateful, and the presence of both fh:prev and fh: would be 
an

error.
This is off course true only if fh:prev must be accompanied by
fh:stateful=true. The question is: is it useful to have fh:stateful if
you have no link to any kind of archive?

I would think that rather than fh:stateful="true | false", it might be 
more useful to have (with a different element name, and perhaps 
different values) fh:what-kind-of-feed-is-this="sliding-window | 
snapshot | ???".  If it's a sliding-window feed, fh:prev points to the 
previous sliding window.  If it's a snapshot feed, then fh:prev points 
to the previous snapshot.  fh:what-kind-of-feed-is-this might have a 
default value of sliding-window.




Re: Atom 1.0 xml:base/URI funnies



Reluctantly, I must admit that I can't find anything in RFC 3986 or the 
xml:base spec to convince me that the "same document reference" rule 
doesn't cause the problems for Tim's feed that have been asserted.  The 
existence of the text regarding same document references, and the fact 
that that text points explicitly to section 5.1, where the method of 
determining the base URI (included embedded in content) is defined, 
leads me to conclude that it is intended that dereferencing a base URI 
would result in retrieval of the same data as exists within the scope 
of the setting of the base URI, and that base URIs are not intended as 
prefixes for convenience.


On Tuesday, July 19, 2005, at 02:28  AM, A. Pagaltzis wrote:

* David Powell <[EMAIL PROTECTED]> [2005-07-19 08:25]:

Why does xml:base allow for relative base URIs and stacking
then? If xml:base can only describe the actual source URI of
the document, then these features don't make sense.
I think it does...or at least could.  Consider the following pseudo-XML 
(element names have no significance--line numbers are for convenient 
reference below):


1   <page xml:base="http://www.example.com/">
2     <title>Welcome to example.com</title>
3     <section xml:base="new/">
4       <title>Here's what's new on example.com</title>
5       <item>foo</item>
6       <item>bar</item>
7     </section>
8     <section xml:base="popular/">
9       <title>Here's what's popular on example.com</title>
10      <item>qwerty</item>
11      <item>asdf</item>
12      <section xml:base="atom/">
13        <title>Here's the popular Atom stuff</title>
14        <item>link</item>
15      </section>
16    </section>
17  </page>

If you dereference http://www.example.com/, you get this whole 
document, or at least lines 2-16.  If you dereference 
http://www.example.com/new/, you get a document containing lines 3-7, 
or at least 4-6 (the "what's new" page).  If you dereference 
http://www.example.com/popular/, you get lines 8-16, or at least lines 
9-15 (the "what's popular" page).  If you dereference 
http://www.example.com/popular/atom/, you get lines 12-15, or at least 
13-14 (the "what's popular with Atom" page).


The entire document is a composite of the documents at 
http://www.example.com/new/ and http://www.example.com/popular/, which 
in turn is a composite of the document at 
http://www.example.com/popular/atom/ along with some additional data 
that originates at http://www.example.com/popular/.  xml:base enables 
this compositing without requiring adjustment of the relative URIs.  It 
makes it look to the consumer as if it had gotten different parts of 
the document from different places, so that their relative URIs can be 
resolved correctly.



The example in the xml:base spec [1] uses a relative URI in the
<olist xml:base="/hotpicks/"> element, after defining an
absolute URI in <doc xml:base="http://example.org/today/"> at
the top of the document.

[1] http://www.w3.org/TR/xmlbase/#syntax


That example says: the content of the root element can be found
in the resource at <http://example.org/today/>, and the content
of the olist tag can be found in the resource at
<http://example.org/hotpicks/>. xml:base is quite apparently
being used as “a prefix for calculating relative URIs” instead of
“the source URI for the material found inside this tag.”

As you can see above, I reached the opposite conclusion.


Now, xml:base appears to try to address the situation where an
aggregate document may contain fragments from many sources, and
each of which thus has its own base URI. But the devilish detail
is that RFC-specified behaviour means that if a useragent were to
find a link to <http://example.org/today/> somewhere inside the
example document except inside the olist tag, or a link to
<http://example.org/hotpicks/> inside the olist tag, it may not
retrieve that URL – instead it would have to consider the XML
document itself to be the document found at the respective URL.

...

It is the xml:base TR which is at odds with this; applying
same-document reference behaviour to fragments of an aggregate
document is non-sensical.
The problem lies not in applying same-document reference behavior, but 
in copying EXCERPTS from source documents that have links to fragments 
that aren't part of the excerpt.  The same-document reference behavior 
is desirable if both the link and the fragment it links to are copied 
into the destination document. But there is no way to link to 
non-excerpted fragments.  The URI spec would have to say that if the 
fragment isn't found in the current document, you can fetch the base 
URI to see if it exists there (it could even say that you can only do 
this if the current base URI was embedded in the content).  If the 
fragment doesn't exist at the base URI, it's a broken link.


A hackish solution to the "Tim's Feed Conundrum" would be to set 
xml:base not to 'http://www.tbray.org/ongoing/', but to 
'http://www.tbray.org/ongoing/foo', where "foo" doesn't actually exist, 
but is just used to ensure that relative references don't end up being 
identical to the base URI.  Then, instead of  (which 
would be a same-document reference...I think I was wrong in the other 
thread), you could say .


The other solution I can think of would be for the Atom spec to say 

Re: Feed History -02



On Tuesday, July 19, 2005, at 12:29  PM, Antone Roundy wrote:

On Monday, July 18, 2005, at 01:59  AM, Stefan Eissing wrote:
Ch 3. fh:stateful seems to be only needed for a newborn stateful 
feed. As an alternative one could drop fh:stateful and define that an 
empty fh:prev (refering to itself) is the last document in a stateful 
feed. That would eliminate the cases of wrong mixes of fh:stateful 
and fh:prev.


The problem is that an empty @href in fh:prev is subject to xml:base 
processing, and who knows what the current xml:base is going to be 
when you get to it.  Is there a way to explicitly make xml:base 
undefined?  If I'm not mistaken xml:base="" doesn't do it--it just 
adds nothing to the existing xml:base.  If there is a way, you could 
say , but otherwise, using an empty @href is probably 
overloading the wrong attribute.  A different @rel value like 
"fh:noprev" (with an empty link, since it doesn't matter what it 
actually points to) might be a step up, but using any kind of link to 
indicate the lack of a link is a little odd.


Yikes, I should have caught up on the xml:base thread first!  Looks 
like the jury's out, or at least hung, on this issue.




Re: Feed History -02



On Monday, July 18, 2005, at 01:59  AM, Stefan Eissing wrote:
Ch 3. fh:stateful seems to be only needed for a newborn stateful feed. 
As an alternative one could drop fh:stateful and define that an empty 
fh:prev (refering to itself) is the last document in a stateful feed. 
That would eliminate the cases of wrong mixes of fh:stateful and 
fh:prev.


The problem is that an empty @href in fh:prev is subject to xml:base 
processing, and who knows what the current xml:base is going to be when 
you get to it.  Is there a way to explicitly make xml:base undefined?  
If I'm not mistaken xml:base="" doesn't do it--it just adds nothing to 
the existing xml:base.  If there is a way, you could say <link rel="fh:prev" href="" xml:base="[whatever value sets it to 
"undefined"]" />, but otherwise, using an empty @href is probably 
overloading the wrong attribute.  A different @rel value like 
"fh:noprev" (with an empty link, since it doesn't matter what it 
actually points to) might be a step up, but using any kind of link to 
indicate the lack of a link is a little odd.




Re: The Atomic age



On Friday, July 15, 2005, at 09:56  AM, Walter Underwood wrote:
--On July 14, 2005 11:37:05 PM -0700 Tim Bray <[EMAIL PROTECTED]> 
wrote:


So, implementors... to  work.


Do we have a list of who is implementing it? That could be used in
the "Deployment" section of .

I've updated Grouper (http://www.geckotribe.com/rss/grouper/) to support 
conversion of Atom 1.0 to RSS 2.0.  A future version will support going 
the other way...when I get time to complete the major overhaul that 
will be required to do that.


Antone



Re: I-D ACTION:draft-ietf-atompub-format-10.txt



A misspelling...in case the opportunity to fix it arises: "Text 
Contruct" -- missing an "s" in 6.3.  (I found it because I misspelled 
it the same way when searching for it!)




Comment feeds (was Re: More while we're waiting discussion)



On Wednesday, July 13, 2005, at 12:25  AM, A. Pagaltzis wrote:

Another might be that the aggregator asks the user on
subscription whether he/she also wants to poll the comment feed,

There's an implementation detail that should be pointed out, in case it 
might influence the language ultimately chosen to define this feature.  
Consider a real world example: I'm subscribed to Slashdot's feed.  If 
they had a comments feed for each story, I would NOT want to subscribe 
to all of them...in fact, with Slashdot's volume, I'd never want to 
subscribe to comments, even if they had a separate feed for each entry. 
 Okay, bad example.


Imagine a hypothetical feed with low enough volume that one might wish 
to subscribe to the comments on some of the main entries, but enough 
volume that one might not wish to subscribe to all comments.  Deciding 
whether to subscribe to comments at the feed level would not be 
sufficient granularity. The user might want the option to subscribe to 
comments on an entry-by-entry basis.  If all of the comments are in a 
single comments feed, then an application supporting this would need to 
be able to track which threads within that feed the user had expressed 
the wish to see.  I'm not saying that's a bad idea--a feed that 
occasionally gets a comment or two on one of its entries would likely 
want to use a one-comments-feed-for-all-entries approach--just pointing 
it out.  In fact, an application that could do that could go a step 
further and allow the user to stop following specific threads within 
the comments for a particular entry.


So here are the options a user might need:
Application level:
* Don't subscribe to all comments feeds (default)
	* Subscribe to all comments feeds (DANGER! Beware of things that go 
bump in the night!)


Feed level:
* Don't subscribe to all comments in this feed (default)
* Subscribe to all comments in this feed

Entry level:
* Don't subscribe to comments on this entry (default)
* Subscribe to comments on this entry

Comment level:
* Show this thread (default)
* Don't show anything more in this thread

Turning a higher level option on turns lower ones on, but shouldn't 
take away the ability to turn them off.




Re: More while we're waiting discussion



On Tuesday, July 12, 2005, at 07:23  PM, A. Pagaltzis wrote:

I think what we want to say is that “aggregators consuming this
feed should consider automatically subscribing to the referenced
feed as well,”


Automating feed subscriptions isn't something that should be done too  
lightly[1].  The default behavior for an aggregator should be NOT to do  
this, and aggregators should give the user the opportunity to control  
this feature on a feed-by-feed basis.  Some sort of same-source policy  
(like we see with JavaScripts accessing the pages they're loaded into)  
might be wise to help prevent abuse, but with lots of feeds getting run  
through services like FeedBurner, that would be far from sufficient.


[1] http://www.megginson.com/blogs/quoderat/archives/2005/06/13/the-case-against-easier-feed-subscriptions/




Re: More while we're waiting discussion



On Tuesday, July 12, 2005, at 06:21  PM, A. Pagaltzis wrote:

* Thomas Broyer <[EMAIL PROTECTED]> [2005-07-13 00:00]:

As an atom:id is an identifier that might (should?) not be
"dereferenceable", atom:link is not a good choice.

There is nothing in the spec that forbids atom:link

That should be "atom:id", right?

 from being
dereferencable, nor anything that advises against it being so.
See 4.2.6 and 4.2.6.1 in -09.

...

The spec just says that the URI MUST NOT be assumed to be
dereferencable,

...

Whether atom:link is a bad choice for carrying a non-
dereferencable URI around is a better argument. The spec says,
verbatim:

The "atom:link" element defines a reference from an entry or
feed to a Web resource.

That would seem to imply dereferencability, but is open to
interpretation.

...

Personally, I would prefer to interpret the spec liberally, if
that is within the intended spirit,
It's definitely not within the spirit that I, for one, intended.  But 
the spirit that I intended (atom:link being limited to links intended 
to be traversed in response to explicit user interaction) was not 
accepted by the WG, so perhaps that has little bearing.


If atom:link is intended to be dereferencable, then clearly, any 
solution that takes a value from atom:id and puts it into 
atom:link/@href has a strike against it since any feed that uses 
non-dereferencable atom:ids would either have to violate the spirit of 
atom:link to participate in the feature, or would have to invent a 
competing solution.


Also, if a feed that uses dereferencable atom:ids is relocated, clients 
would be much more likely to attempt to dereference the atom:links that 
carried those previously dereferencable values than an extension 
element that was explicitly defined as not necessarily dereferencable.




Re: More while we're waiting discussion



On Tuesday, July 12, 2005, at 12:42  PM, A. Pagaltzis wrote:

* James M Snell <[EMAIL PROTECTED]> [2005-07-12 02:00]:

The second extension is a comments link type that allows an
entry to be associated with a separate feed containing
comments. […]

  
 
href="http://example.com/commentsfeed.xml"; />

 
  


What I don’t like about this idea is that if a thread-aware
aggregator wants to keep up with *all* discussion on a weblog, it
will have to poll any number of comments-for-entry-X feeds per
single main newsfeed in the general case – in the case of a
typical weblog encountered in practice, that would be several
hundred. Clearly, this is untenable.

If you're already creating an extension link type, why not throw in an 
additional attribute too to help with that:


http://example.org/commentfeed";>

   href="http://example.com/commentsfeed.xml"; />




Then you'd only need to poll the main feed unless it indicated an 
update in the comment feed.  Of course, if comments were threaded, you 
have to cascade comments:updated values up through all the feeds in a 
thread, and aggregators would have to follow updates back the other 
way, potentially down multiple branches, to find all the updated leaves.
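A sketch of that extra attribute (only the namespace URI and the 
comments feed URL survive from the original example; the rel value and 
timestamp are illustrative, and the comments:updated name follows the 
prose above):

<entry xmlns="http://www.w3.org/2005/Atom"
       xmlns:comments="http://example.org/commentfeed">
  ...
  <link rel="replies" type="application/atom+xml"
        comments:updated="2005-07-12T18:30:02Z"
        href="http://example.com/commentsfeed.xml" />
</entry>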


...which raises the question of whether an application like this might 
beg a minimal feed for comments that simply pointed to an Entry 
Document for each comment. Entries in such a feed would really only 
require an atom:id, atom:updated, atom:link pointing to the entry 
document, and atom:link pointing to the parent comment or entry. 
atom:title could conceivably be considered undesirable bloat for such a 
feed. Is Atom the right format for this need? An alternative might be 
to define a format for this need that used Atom elements but had 
minimalized cardinality requirements.


Well, enough stream of thought blabbering for now.



Re: Roll-up of proposed changes to atompub-format section 5



On Tuesday, July 5, 2005, at 01:09  PM, A. Pagaltzis wrote:

* Bob Wyman <[EMAIL PROTECTED]> [2005-07-05 19:30]:

Antone Roundy wrote:

"When signing individual entries that do not contain an
atom:source element, be aware that aggregators inserting an
atom:source element will be unable to retain the signature. For this
reason, publishers might consider including an atom:source element in
all individually signed entries."

+1

+1 as well. It is one of those obvious-in-hindsight things that
the spec would do well to point out to implementors in advance.

If putting this into the spec would require a delay, then I
suppose we’ll have to end up living with a spec that could have
been more explicit. This clarification is not worth slowing
things down for.


Agreed.  If we can get it in without delaying things, I'm all for it.  
But if not, then I can live without it.  It doesn't actually change 
anything--just reduces the probability of the issue being overlooked.




Re: Roll-up of proposed changes to atompub-format section 5



On Tuesday, July 5, 2005, at 10:11  AM, Tim Bray wrote:

On Jul 5, 2005, at 8:58 AM, Bob Wyman wrote:

We can debate what it means to have an "interoperability" issue,
however, my personal feeling is that if systems are forced to break 
and
discard signatures in order to perform usual and customary processing 
on
entries that falls very close to the realm of interoperability if not 
within
it. Deferring this issue until the implementer's guide is written is 
likely
to defer it beyond the point at which common practice is established. 
The
result is likely to be that intermediaries and aggregators end up 
discarding

most signatures that appear in source feeds.


Huh?!  Pardon my ignorance, could you please provide an explanation 
for the simple-minded as to how the absence of a source element in a 
signed entry will lead to signatures being discarded?  Also, it would 
be helpful to sketch in some of the surrounding scenario... -Tim


If a signed entry doesn't have a source element and an aggregator 
inserts one, the signature will be broken--thus the aggregator will 
either discard the signature or republish the entry with a broken 
signature.


Perhaps language like this would work without being too much of a 
change at this late date:


"When signing individual entries that do not contain an atom:source 
element, be aware that aggregators inserting an atom:source element 
will be unable to retain the signature. For this reason, publishers 
might consider including an atom:source element in all individually 
signed entries."




Re: More on Atom XML signatures and encryption



On Thursday, June 30, 2005, at 12:58  PM, James M Snell wrote:
6. If an entry contains any "enclosure" links, the digital signature 
SHOULD cover the referenced resources.  Enclosure links that are not 
covered are considered untrusted and pose a potential security risk


Fully disagree. We are signing the bits in the document, not the 
outside. There is "security risk", those items are simply unsigned.


I tend to consider enclosures to be part of the document, even if they 
are included by reference.  As a potential consumer of an enclosure I 
want to know whether or not the referenced enclosure can be trusted.  
Is it accepted to change the SHOULD to a MAY with a caveat outlining 
the security risk?


Perhaps a good approach would be for the signed entry to contain a 
separate signature for the enclosure--so the entry's signature would 
cover the bits in the enclosure's signature, but not the bits in the 
enclosure itself.  That way, the signature for the entry could be 
verified without having to fetch the enclosure.


Where would that signature go?  Did we decide that atom:link doesn't have 
to be empty?  If so, that might be a good place...but then I don't have 
any experience with signed XML, so I don't know whether there would be 
technical difficulties with putting it in any particular place.




Re: I-D ACTION:draft-nottingham-atompub-feed-history-00.txt



On Wednesday, June 29, 2005, at 06:50  PM, A. Pagaltzis wrote:

My first thought upon reading the draft was what I assume is
what Stefan Eissing said: I would rather have a single,
entry-less “archive hub” feed which contains “prev” links to
*all* previous instances
For an active feed, that document could easily grow till it was larger 
than many feed instances.  I prefer the chain of instances method.



, leading to a setup like

http://example.com/latest
└─> http://example.com/archive/feed/
├─> http://example.com/archive/feed/2005/05/
├─> http://example.com/archive/feed/2005/04/
├─> http://example.com/archive/feed/2005/03/
├─> http://example.com/archive/feed/2005/02/
└─> http://example.com/archive/feed/2005/01/
I don't quite get what the "hub feed" would look like.  Could you show 
us some XML?



I don’t see anything in the draft that would preclude this use,
and as far as I can tell, aggregators which support the draft
should have no trouble handling this scenario correctly.
The draft doesn't explicitly say that a feed can only contain one 
"prev" link, but I find it hard to read "a" to mean "one or more" in 
'and optionally a Link element with the relation "prev"'.



Again, I don’t see anything in the draft that would preclude
this use, and as far as I can tell, aggregators which support
the draft should have no trouble handling this scenario
correctly.

...unless they expected only to find one "prev" link per document.


Note how the archive directory feed being static makes this
painlessly possible, while it would be a pain to achieve
something similar using the paginated approach with local
“prev” links (you’d need to go back and change the previously
newest old version every time a new one was added).
I don't see why this would be any more difficult.  The paginated 
approach could easily use static documents that never need to be 
updated, as I described earlier.  I'll re-explain at the end of this 
email.



It would in fact require a “prev” link to what is actually the “next”
page.

Funnily enough, I don’t see anything in the draft that would
preclude this counterintuitive use of the “prev” link to point
to the “next” version

Could you explain what you mean by that?


I’d much rather have a single archive feed containing all
entries, and use RFC3229+feed to return partial versions of it;
That might be good for those who can support it, but many people won't 
be able to.  On the other hand, if that single feed grows to where it's 
hundreds of MB, it could cause real problems if someone requests the 
whole thing or a large portion of it.



Getting back to how to use static documents for a chain of instances, 
that could easily be done as follows. The following assumes that the 
current feed document and the archive documents will each contain 15 
entries.


The first 15 instances of the feed document do not contain a "prev" 
link (assuming one entry is added each time).


When the 16th entry is added, a static document is created containing 
the first 15 entries, and a "prev" link pointing to it is added to the 
current feed document. This link remains unchanged until the 31st entry 
is added.


When the 31st entry is added, another static document is created 
containing the 16th through 30th entries. It has a prev link pointing 
to the first static document. The current feed document's prev link is 
updated to point to the second static document, and it continues to 
point to the second static document until the 46th entry is added.


When the 46th entry is added, a third static document is created 
containing the 31st through 45th entries, etc.
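
To make that concrete, here's roughly what the documents might look 
like just after the 31st entry is added (the file names are made up):

<!-- http://example.com/atom.xml : the document people subscribe to -->
<feed xmlns="http://www.w3.org/2005/Atom">
	[feed metadata]
	<link rel="prev" href="http://example.com/archive-2.xml"/>
	[the 15 most recent entries]
</feed>

<!-- http://example.com/archive-2.xml : static, written once -->
<feed xmlns="http://www.w3.org/2005/Atom">
	[feed metadata]
	<link rel="prev" href="http://example.com/archive-1.xml"/>
	[entries 16 through 30]
</feed>

<!-- http://example.com/archive-1.xml : static, written once, no "prev" link -->
<feed xmlns="http://www.w3.org/2005/Atom">
	[feed metadata]
	[entries 1 through 15]
</feed>

Only atom.xml ever changes; each archive document is written once and 
never touched again.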


If you want to reduce the number of requests required to get the entire 
history (which I don't imagine would happen often enough to necessarily 
be worth bothering with), you could put more entries into each static 
document.  If you didn't correspondingly increase the number of entries 
in the current feed document, you'd have to update the most recent 
static document several times rather than writing it out just once as 
described above--but even then, at any given time only the most recent 
static document would need updating.




Dealing with namespace prefixes when syndicating signed entries



Mulling more...

Let's say an aggregator is putting these two entries into the same 
aggregate feed.  Both bind the same namespace prefix to different 
namespaces (element and namespace names here are only illustrative):

<entry xmlns="http://www.w3.org/2005/Atom"
  xmlns:x="http://example.com/ns/one">
	[signature]
	<x:data>...</x:data>
	...
</entry>

<entry xmlns="http://www.w3.org/2005/Atom"
  xmlns:x="http://example.org/ns/two">
	[signature]
	<x:data>...</x:data>
	...
</entry>


Perhaps a reasonable way to deal with the namespace prefix conflict 
would be for the signature to be applied after a transform that yielded 
this (putting full namespace names in where the prefixes were):


<[atom's namespace]:entry>
	[signature]
	<[http://example.com/ns/one]:data>...</[http://example.com/ns/one]:data>
	...
</[atom's namespace]:entry>


Unprefixed attributes would naturally remain unprefixed, but elements 
in the default namespace would need to have their namespace names 
prepended.




Annotating signed entries (was Re: More on Atom XML signatures and encryption)



On Wednesday, June 29, 2005, at 01:47  PM, James M Snell wrote:
8. Aggregators and Intermediaries MUST NOT alter/augment the content 
of digitally signed entry elements.



Just mulling over things...

Obviously, we don't have any way to annotate signed entries without 
breaking the signature.  I hesitate to introduce new complexity, so I 
don't know whether I LIKE the idea I'm about to write about, but here 
it is.  If you want to annotate a signed entry, or even annotate an 
unsigned one but keep your annotations separate, you might do something 
like this (the element and namespace names of the extension are only 
illustrative):

<feed xmlns="http://www.w3.org/2005/Atom"
  xmlns:ann="http://example.com/annotations">
	[feed metadata]
	<ann:annotation>
		<ann:entry-signature>the entry's signature goes 
here</ann:entry-signature>
		[this annotation could be signed here]
		<ann:annotates>foo</ann:annotates>
		...annotation content...
	</ann:annotation>
	<entry>
		...
		<id>foo</id>
		[entry's signature here if signed]
		...
	</entry>
</feed>


Notes:
1) <ann:entry-signature> is optional, but recommended if the entry is 
signed and the annotation is signed.

2) Multiple annotations could point to the same entry.
3) It could be requested that aggregators forward annotations along 
with their entries...but of course, that's optional, and they could 
certainly be dropped at the request of the end user if they only want 
to see the originals.
4) It might be recommended or required that annotation elements 
appear before the entries they annotate (whether above all entries or 
interspersed with them) to make life easier for processors that 
finalize their processing of entries as soon as they hit </entry> 
rather than doing it after they've parsed the whole document.
5) Aggregators COULD attach annotations from various sources when 
outputting entries, even if those annotations never appeared together 
within a feed before.

6) I don't see any way to choose between conflicting annotations.



Re: I-D ACTION:draft-nottingham-atompub-feed-history-00.txt



On Wednesday, June 29, 2005, at 07:27  AM, Dave Pawson wrote:

I guess the answer is:
http://example.com/latest is your feed, e.g. containing the latest 10 
entries

http://example.com/archive-1 through n are your "archive" feeds.


Which would mean that the instance at /latest keeps changing?
I need to keep swapping old ones out, new ones in, i.e. rebuilding
each time?

  I guess that's another reason it feels like a kludge.

Replace http://example.com/latest with http://example.com/atom.xml.  Of 
course the latest document keeps changing and has to be rebuilt and 
replaced each time.  It's the feed document just like what we see 
today.  At least that's how I read what was written 
above--"http://example.com/latest" was intended as the URI to which 
you'd subscribe.



