date:20050505

Re: PaceAllowDuplicateIDs

2005-05-05 Thread Henry Story

Tonight something incredible happened to me. You won't believe it. I
was walking back from the pubs when I got snapped up by a passing
spaceship full of hyper-advanced aliens. They did various experiments
on me, and cloned me 1000 times. It is terrible. I just don't know
what to do.

I suppose that means I am +1000 on this now.
:-) That's consensus, I am sure.
Henry
http://bblfish.net/blog/
On 5 May 2005, at 06:02, Tim Bray wrote:

http://www.intertwingly.net/wiki/pie/PaceAllowDuplicateIDs
This Pace was motivated by a talk I had with Bob Wyman today about  
the problems the synthofeed-generator community has.

Summary:
1. There are multiple plausible use-cases for feeds with duplicate IDs
2. Pro and Contra
3. Alternate Paces
4. Details about this Pace
1. Use-Cases
Here's a stream of stock-market quotes.
My Portfolio
 
 MSFT
  2005-05-03T10:00:00-05:00
  Bid: 25.20 Ask: 25.50 Last: 25.20
  
 MSFT
  2005-05-03T11:00:00-05:00
  Bid: 25.15 Ask: 25.25 Last: 25.20
  
 MSFT
  2005-05-03T12:00:00-05:00
  Bid: 25.10 Ask: 25.15 Last: 25.10
  

You could also imagine a stream of weather readings.  Bob's actual
here-and-now use-case from PubSub is earthquakes: an entry describes
an earthquake, and they keep re-issuing it as new info about
strength/location comes in.

Some people only care about the most recent version of the entry,  
others might want to see all of them.  Basically, each atom:entry  
element describes the same Entry, only at a different point in time.

You could argue that in some cases, these are representations of  
the Web resources identified by the atom:id URI, but I don't think  
we need to say that explicitly.

Yes, you could think of alternate ways of representing stock quotes  
or any of the other use-cases but this is simple and direct and  
idiomatic.

2. Pro and Contra
Given that I issued the consensus call rejecting the last attempt
to do this, which was PaceRepeatIdInDocument, I felt nervous about
revisiting the issue.  So I went and reviewed the discussion around
that one, which I extracted and placed at
http://www.tbray.org/tmp/RepeatID.txt for the WG's convenience.

Reviewing that discussion, I'm actually not impressed.  There were  
a few -1's but very few actual technical arguments about why this  
shouldn't be done.  The most common was "Software will screw this  
up".  On reflection, I don't believe that.  You have a bunch of  
Entries, some of them have the same ID and are distinguished by  
datestamp.  Some software will show the latest, some will show all  
of them, the good software will allow switching back and forth.   
Doesn't seem like rocket science to me.
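
A minimal sketch of the "show the latest, or show all of them" behaviour described above. It assumes entries have already been parsed into (atom:id, atom:updated, content) tuples; the id value and the function names are made up for illustration and do not come from the Pace.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical, already-parsed entries: (atom:id, atom:updated, content).
# The id is invented for this example.
ENTRIES = [
    ("urn:example:msft", "2005-05-03T10:00:00-05:00", "Bid: 25.20 Ask: 25.50 Last: 25.20"),
    ("urn:example:msft", "2005-05-03T12:00:00-05:00", "Bid: 25.10 Ask: 25.15 Last: 25.10"),
    ("urn:example:msft", "2005-05-03T11:00:00-05:00", "Bid: 25.15 Ask: 25.25 Last: 25.20"),
]


def group_by_id(entries):
    """Group entries by atom:id; each group is sorted oldest to newest."""
    groups = defaultdict(list)
    for entry_id, updated, content in entries:
        groups[entry_id].append((datetime.fromisoformat(updated), content))
    for versions in groups.values():
        versions.sort(key=lambda version: version[0])
    return dict(groups)


def latest_only(entries):
    """Return {atom:id: content} holding only the newest version of each entry."""
    return {eid: versions[-1][1] for eid, versions in group_by_id(entries).items()}


def all_versions(entries):
    """Return {atom:id: [content, ...]} with every version, oldest first."""
    return {eid: [c for _, c in versions] for eid, versions in group_by_id(entries).items()}
```

Switching between the two views is then a matter of calling `latest_only` or `all_versions` on the same parsed entries.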

So here's how I see it: there are plausible use cases for doing  
this, and one of the leading really large-scale implementors in the  
space (PubSub) wants to do this right now.  Bob's been making  
strong claims about not being able to use Atom if this restriction  
remains in place.

I believe strongly that if there's something that implementors want  
to do, standards shouldn't get in the way unless there's real  
interoperability damage.  I'm certainly prepared to believe that  
this could cause interoperability damage, but to date I haven't  
seen any convincing arguments that it will.  I think that if we  
nonetheless forbid it, people who want to do this will (a) use RSS  
instead of Atom, (b) cook up horrible kludges, or (c) ignore us and  
just do it.

So my best estimate is that the cost of allowing dupes is probably  
much lower than the cost of forbidding them.

Finally, our charter does say that we're also supposed to specify  
how you'd go about archiving feeds, and AllowDuplicateIDs makes  
this trivial.  I looked around and failed to find how we claimed we  
were going to do that while still forbidding duplicates, but it's  
possible I missed that.

3. Alternate Paces
I didn't want to just revive PaceRepeatIdInDocument, because it  
used the word "version" in what I thought was kind of a sloppy way,  
and because it wasn't current against format-08.  I don't like  
either PaceDuplicateIDWithSource or ...WithSource2, they are  
complicated and don't really meet PubSub's needs anyhow.  So I'm  
strongly -1 on both of those.  Yes, that means that if this Pace  
fails, we'll allow no duplicates at all.  I prefer either "dupes  
OK" or "no dupes" to "dupes OK in the following circumstances";  
cleaner.

4. Details
Section 4.1.2 of format-08 says that atom:entry "represents an  
individual entry".  The Pace says that if you have dupes, they  
"represent the same entry", which I think is consistent with both  
the letter and spirit of 4.1.2.

The Pace discourages duplicate timestamps without resorting to MUST  
language, because accidents can happen; this allows software to  
throw such entries on the floor while positively encouraging noisy  
complaining.  On the other hand, if the WG wanted either to insist  
on a MUST here or remove the discouragement altogether I could live  
wi

Re: rel profiles [was Autodiscovery

2005-05-05 Thread Henri Sivonen

On May 5, 2005, at 03:28, Kevin Marks wrote:
We have published profiles for both license and nofollow:
http://developers.technorati.com/wiki/RelLicense
http://developers.technorati.com/wiki/RelNoFollow
feel free to use them...
What value do they add compared to profileless usage?
--
Henri Sivonen
[EMAIL PROTECTED]
http://hsivonen.iki.fi/

RE: Atom feed refresh rates

2005-05-05 Thread Andy Henderson


 >>>Actually, as I recall, last time this came up (proposed by Walter 
>>>Underwood), someone pointed out accurately that RSS2 has had this 
>>>functionality for a long time and that nobody ever really implemented 
>>>it; thus there was a strong vote from experience against such a 
>>>feature. -Tim<<<

There is no RSS2 feature I can see that allows feed providers to tell
aggregators the minimum refresh period.  There's the ttl tag.  That was, I
believe, introduced for a different purpose and determines the Maximum time
a feed should be cached in a certain situation.  The tag's usage has morphed
over time and, if more than 60 minutes, I assume it's a Minimum time; but
it's not surprising if feed providers are wary of using the tag in this way.
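
As a rough sketch of how an aggregator might honour the ttl tag as a minimum refresh interval (the reading assumed above, not the only one): RSS 2.0 expresses ttl in minutes. The sample feed, the function names, and the 60-minute fallback are assumptions made for this example.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta, timezone

# Hypothetical minimal RSS 2.0 document for illustration.
SAMPLE_RSS = """<rss version="2.0"><channel>
<title>Example</title><ttl>90</ttl>
</channel></rss>"""


def ttl_minutes(rss_text, default=60):
    """Return the channel's ttl in minutes, or `default` if absent or invalid."""
    channel = ET.fromstring(rss_text).find("channel")
    ttl = channel.findtext("ttl") if channel is not None else None
    try:
        return int(ttl)
    except (TypeError, ValueError):
        # TypeError: no ttl element; ValueError: non-numeric ttl text.
        return default


def next_fetch(last_fetch, rss_text):
    """Earliest time to refetch, treating ttl as a minimum refresh interval."""
    return last_fetch + timedelta(minutes=ttl_minutes(rss_text))
```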

What has yet to be tried is a specific tag in the core feed standard that
promotes and determines good behaviour for aggregators refreshing their
feeds.  Even if it were to prove only a limited benefit, it would still be a
benefit.

Andy

RE: Atom feed refresh rates

2005-05-05 Thread Andy Henderson


>>>You can do that with the "Expires" response header. Every time the
resource is requested, serve it with a value of "now +
minimumrefreshinterval".<<<

Ah.  I see what you mean.  Thank you.

The problem is that when you say "You can do that now with the "Expires"
response header" - I can't.  It's a theoretical capability I have, but not a
practical one.

I am directly responsible for three feeds.  One is a feed associated with my
aggregator.  It's a simple xml file stored on a shared server along with the
rest of my web site.  I have no access to any HTTP headers.  When
aggregators access my feed, no code of mine runs - the transaction is
handled by the server alone.  The other two are feeds generated on the fly
from a back-end database; again they are running on a shared server and,
again, my development tool (IBM's Domino) gives me no access to set the
Expires header.

I just used Microsoft's Fiddler tool to check all the feeds I subscribe to
(not a scientific sample, I admit, but it's a pretty broad mix and includes
blog sites and blogging tools) and just two provide Expires headers.  One is
the BBC, the other is Wired.  Both set Expires to expire immediately.  I'm
guessing they have good reason to do that.  I re-subscribed to Slashdot
(which has implemented draconian bandwidth throttling measures) and it
doesn't use Expires headers.

So, Expires is a measure that I could use in theory but is not available to
me either directly or, apparently, via third party blogging sites/tools.
When I look at best practice, I find Expires is either not used or is used
in a different way.

Both from a provider viewpoint and from an aggregator viewpoint, Expires
does not seem a practical option.

Andy

Re: Autodiscovery, real-world examples

2005-05-05 Thread fantasai

Nikolas 'Atrus' Coukouma wrote:
> fantasai wrote:
>
> I think you've missed how things are working at the moment. Most
> programs implemented what's in the spec before it's written. Mark is
> trying to negotiate a common standard when implementations already
> exist. A lot of experimentation has already occurred.
I'm not suggesting that the spec invalidate such well-entrenched practice,
but that it allows an alternative (not requiring 'alternate') for situations
in which it is not appropriate.
> One of the key points seems to be that autodiscovery is not meant to
> find all feeds linked to on a page, just the ones that serve as
> alternates to the current one. If people wanted this functionality, they
> would have done it by now.
Done what? Linked to non-alternate feeds with rel="alternate" just so
that it would trigger autodiscovery? People do that all the time. Give
them an alternative that doesn't require such a hack.
> I think you have three separate cases of autodiscovery:
> * the feed for *this* page - handled by this autodiscovery proposal
> * other feeds the author reads or recommends - usually done by linking
> to a separate file. Some quick searching reveals one suggestion to use
> rel="blogroll" for this
> * any other feeds linked to for any reason at all - seems to be little
> interest in
>
> I don't think combining these three into one case will do any good. In
> fact, I think it's confusing and unusable.
That makes sense.
I think that you're missing one key use case, though: autodiscovery of
a blog's main feed from sub-parts of it. A lot of websites link to the
main blog feed from individual entries, for example, and they're doing
it with rel="alternate", which is not appropriate. It frustrates me that
there is no way of changing these links to not use rel="alternate".
And for linking to other pages.. Here's a real-world example:
The mozilla.org main page is an example
of where rel="alternate" is a problem. There were three feeds on
it: "Announcements", "mozillaZine News", and "Mozilla Weblogs"
(now only two). Each one is an alternate of a web page, but of
_other_ pages (http://www.mozilla.org/news.html, http://www.mozillazine.org/, 
and http://planet.mozilla.org/ respectively), not the mozilla.org
front page. The last few headlines for each feed are listed on
the front page, and the designer felt it was appropriate for
autodiscovery to work on this page -- but it is not appropriate
for rel="alternate" to be used for those autodiscovery links.
They are not alternate representations of the front page.

Here's another example:
LiveJournal creates a "Friends" page, where it aggregates the
blogs of all the users you've designated as "friends". It could
create an Atom feed representing this aggregation, and mark that
as rel="alternate". What could also be useful, however, would be
linking to each of these blogs' feeds individually as well so
that they're represented individually in my aggregator and it
can aggregate them itself. Unlike the pre-aggregated feed,
however, these are not alternate representations of the Friends
page, and shouldn't be marked as such.
Making it possible for pages to link to non-alternate autodiscoverable
feeds without using rel="alternate" -- and encouraging this practice --
would make it possible for UAs to actually /discriminate/ between
alternate and non-alternate feeds. Right now they can't, because
everything is indiscriminately marked as "alternate".
~fantasai
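
A sketch of how a user agent could discriminate between alternate and non-alternate feed links if pages marked them differently. The message above does not name a replacement rel value, so the `rel="feed"` in the sample page is purely hypothetical, as are the class name, function name, and example URLs; the code simply separates feed links whose rel contains "alternate" from those whose rel does not.

```python
from html.parser import HTMLParser

FEED_TYPES = {"application/atom+xml", "application/rss+xml"}


class FeedLinkCollector(HTMLParser):
    """Collect <link> elements that point at feeds, split by rel value."""

    def __init__(self):
        super().__init__()
        self.alternate = []  # feeds that are alternates of this page
        self.other = []      # autodiscoverable feeds that are not alternates

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        if (attr.get("type") or "").strip().lower() not in FEED_TYPES:
            return
        href = attr.get("href")
        if not href:
            return
        rel_tokens = (attr.get("rel") or "").lower().split()
        target = self.alternate if "alternate" in rel_tokens else self.other
        target.append(href)


def classify_feed_links(html_text):
    """Return (alternate_feeds, other_feeds) found in the page's link elements."""
    collector = FeedLinkCollector()
    collector.feed(html_text)
    return collector.alternate, collector.other


# Hypothetical page: rel="feed" is an invented value for illustration only.
SAMPLE_PAGE = """<html><head>
<link rel="alternate" type="application/atom+xml" href="/index.atom">
<link rel="feed" type="application/atom+xml" href="http://planet.example.org/atom.xml">
<link rel="stylesheet" type="text/css" href="/site.css">
</head></html>"""
```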

Re: PaceAllowDuplicateIDs

2005-05-05 Thread Graham

On 5 May 2005, at 5:02 am, Tim Bray wrote:
My Portfolio
 
 MSFT
  2005-05-03T10:00:00-05:00
  Bid: 25.20 Ask: 25.50 Last: 25.20
  
 MSFT
  2005-05-03T11:00:00-05:00
  Bid: 25.15 Ask: 25.25 Last: 25.20
  
 MSFT
  2005-05-03T12:00:00-05:00
  Bid: 25.10 Ask: 25.15 Last: 25.10
  

Tim, model this as a blog first. Is it:
a) One entry that's being updated?
b) Hourly new postings with the latest price?
See, I think it's b). Which under any sensible circumstance would
count as new entries, and therefore get new ids. You're trying to use
atom:id as a category system here. Let's say I post a new picture of  
my cat every day. Should all my blog entries have the same id?

Technical problems:
The problem with multiple ids is that we don't have a date element that
provides a definitive answer to the question, "What is the current  
version?", which 99% of the time is all an aggregator needs. For  
example, what happens if I retract an update to an entry, and  
presumably roll back atom:updated? The new version stays? If so, the  
spec of atom:updated needs changing.

I see you have the constraint "Their atom:updated timestamps SHOULD  
be different, and processing software SHOULD regard entries with  
duplicate atom:id and atom:updated values as evidence of an error in  
the feed generation". Does this apply temporally as well as  
spatially? For example, if the content changes the second time I load  
something, but the atom:updated doesn't, is that an error?
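
For the spatial case (one feed document), the quoted SHOULD can be sketched as a simple check. This is only an illustration under assumptions: entries are taken to be already parsed into dicts with "id" and "updated" keys, and the function name is invented. It says nothing about the temporal case raised above, where the same pair arrives in two separate fetches.

```python
from collections import Counter
from datetime import datetime


def duplicate_id_and_timestamp(entries):
    """Return (atom:id, atom:updated) pairs that occur more than once.

    Timestamps are parsed so that the same instant written with different
    UTC offsets still counts as a duplicate.
    """
    counts = Counter(
        (entry["id"], datetime.fromisoformat(entry["updated"])) for entry in entries
    )
    return sorted(pair for pair, seen in counts.items() if seen > 1)
```

A non-empty result would be the "evidence of an error in the feed generation" that the Pace text describes; what to do with it (complain, drop entries) is left to the caller.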

Again, atom:updated falls short for this purpose.
Finally, at PubSub, what happens when they download an entry from one
feed, then the user edits it but doesn't modify atom:updated, and then
they download the new entry from a second feed associated with the
site? Different content, identical atom:ids, identical atom:updated
=> Invalid feed. They're not in any better position than they were
before. This doesn't even solve the problem it's meant to.

"If an Atom Feed Document contains multiple entries with the same  
atom:id, software MAY choose to display all of them or some subset of  
them"

What does this even mean, other than "atom:id is meaningless, ignore  
it"?

I looked around and failed to find how we claimed we were going to  
do that while still forbidding duplicates, but it's possible I  
missed that.
Duplicate ids is a constraint of the atom:feed element. Use a  
different top level element, atom:archive, for archives.

Graham

Re: http://www.intertwingly.net/wiki/pie/PaceAllowDuplicateIDs

2005-05-05 Thread Graham

On 5 May 2005, at 2:20 am, Bob Wyman wrote:
Basically, you don’t have to update atom:updated unless you think  
it makes sense OR you are publishing to a feed that already has an  
entry with the same atom:id as the atom:id of the entry you are  
currently publishing.
Or someone downstream is publishing one, of course. Which means you  
must always change atom:updated, just in case. No harm done, Tim.  
None at all.

Graham

Re: PaceAllowDuplicateIDs

2005-05-05 Thread Henry Story

I hope it won't be thought rude for me to answer your questions to Tim.
On 5 May 2005, at 10:52, Graham wrote:
On 5 May 2005, at 5:02 am, Tim Bray wrote:
[I have added the ids Tim mentioned he forgot]
My Portfolio
 
 
 MSFT
 urn:uuid:1335c695-cfb8-4ebb--80da344efa6a
 2005-05-03T10:00:00-05:00
 Bid: 25.20 Ask: 25.50 Last: 25.20
 
 
MSFT
urn:uuid:1335c695-cfb8-4ebb--80da344efa6a
2005-05-03T11:00:00-05:00
Bid: 25.15 Ask: 25.25 Last: 25.20
 
 
MSFT
urn:uuid:1335c695-cfb8-4ebb--80da344efa6a
2005-05-03T12:00:00-05:00
Bid: 25.10 Ask: 25.15 Last: 25.10
 

So with these additions let's look at your question:
Tim, model this as a blog first. Is it:
a) One entry that's being updated?
b) Hourly new postings with the latest price?
Given that the ids are the same it is now clear that we have  
situation a)

See, I think it's b). Which under any sensible circumstance would
count as new entries, and therefore get new ids. You're trying to
use atom:id as a category system here.
No, this is just the way resources on the internet work. If you go to  

you will get a different web page every day. A resource can have
different representations at different times. Google could of course
architect its web site differently and have
 be simply a redirect to a dated resource that  
would be unchanging.
Both work.

Let's say I post a new picture of my cat every day. Should all my  
blog entries have the same id?
If all your cat blog entries had the same id, and you posted a new
picture of your cat every day, then you should expect that some
aggregators or clients might dump all but the latest version of your
entry.

So it depends on what you want.
  If you want a feed of how your cat changed, and you want to make
sure that people will be able to see all the changes to your cat,
then you would have a different entry for each of the cat states,
and each of these would have their own "alternate" link.

  If you want an entry about what you think of your cat today, or
what your cat looks like today, then you would have one entry with
the same id changing on a daily basis. Aggregators could choose to
keep the older versions, or they could choose to discard them.

Technical problems:
The problem with multiple ids is that we don't have a date element that
provides a definitive answer to the question, "What is the current  
version?", which 99% of the time is all an aggregator needs.
You can never know about any resource on the net what the "current
version" is. This is due to limits such as the speed of light, the
speed at which data can travel through the pipes across the world,
etc...

But it does not matter. All you may be interested in at any point in
time is what is the latest version that *you* have available. You
can't do better than that. So if you get a feed with two entries with
the same id, you just look at the time stamp and keep the latest.
If you find the same id in another trusted feed, you make the same
comparison.
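
The "look at the time stamp and keep the latest" rule can be sketched as follows. This assumes a plain dict as the aggregator's local store and an invented function name; on equal timestamps it keeps the copy already held, which is one possible choice and not something the Pace specifies.

```python
from datetime import datetime


def merge_entry(store, entry_id, updated, content):
    """Keep whichever version of entry_id carries the later atom:updated.

    `store` maps atom:id -> (parsed timestamp, content). Returns True if
    the incoming version replaced (or created) the stored one.
    """
    stamp = datetime.fromisoformat(updated)
    current = store.get(entry_id)
    if current is not None and current[0] >= stamp:
        return False  # the stored copy is at least as recent; keep it
    store[entry_id] = (stamp, content)
    return True
```

The same call works whether the incoming entry comes from the same feed document or from another trusted feed, since only the id and timestamp are compared.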

For example, what happens if I retract an update to an entry, and  
presumably roll back atom:updated? The new version stays? If so,  
the spec of atom:updated needs changing.
There are two situations here:
  A. You remove the later entry representation that you placed in
   your feed. You can do this, but it won't stop people who have
   already retrieved your feed from having received the old
   representation of the feed, with the non-retracted entry.
   Remember that the internet has a memory, both long term [1] and
   short term. For example, all the caches in the world that this
   representation may have travelled through may also contain that
   representation. The more intelligent caches (such as the one in
   an aggregator) may, on later retrieving your altered feed and
   comparing the entry therein with the one they have in memory,
   decide to keep the one they have in memory, since that is a later
   version of your entry.
  B. You make clear that your retraction is an event, and so you add
   a new entry with that id (you may, but need not, remove the old
   entry representation (the )) and with a new time stamp, but with
   the old content.

I suggest that B. would be the more honest and less confusing option
to the readers of your feed.

I see you have the constraint "Their atom:updated timestamps SHOULD  
be different, and processing software SHOULD regard entries with  
duplicate atom:id and atom:updated values as evidence of an error  
in the feed generation". Does this apply temporally as well as  
spatially? For example, if the content changes the second time I  
load something, but the atom:updated doesn't, is that an error?
As I see it, Tim's language is currently restricted to what happens
with two entries with the same id in the same feed document. What
happens across feed documents is not defined here.

I myself think that having entries

AtomPubIssuesList for 2005/05/05

2005-05-05 Thread Sam Ruby

*** REMINDER **
** Use more specific subject lines when responding to this note! **
*** REMINDER **
First the meat... here's the new atom pub issues list, conveniently 
sorted into categories:

  EntryId:
http://intertwingly.net/wiki/pie/PaceAllowDuplicateIDs
http://intertwingly.net/wiki/pie/PaceDuplicateIDWithSource2
http://intertwingly.net/wiki/pie/PaceDuplicateIDWithSource
http://intertwingly.net/wiki/pie/PaceExplainDuplicateIds
  FeedId:
http://intertwingly.net/wiki/pie/PaceFeedIdOrAlternate
http://intertwingly.net/wiki/pie/PaceFeedIdOrSelf
http://intertwingly.net/wiki/pie/PaceOptionalFeedLink
  Provenance:
http://intertwingly.net/wiki/pie/PaceOriginalAttribute
http://intertwingly.net/wiki/pie/PaceSourceRecs
  Status:
http://intertwingly.net/wiki/pie/PaceEntryState
http://intertwingly.net/wiki/pie/PacePubControl
http://intertwingly.net/wiki/pie/PacePubStatusResource
  Text:
http://intertwingly.net/wiki/pie/PaceBriefExample
http://intertwingly.net/wiki/pie/PaceCoConstraintsAreBad
http://intertwingly.net/wiki/pie/PaceOptionalSummary
http://intertwingly.net/wiki/pie/PaceRecommendPlainTextContent
http://intertwingly.net/wiki/pie/PaceTextShouldBeProvided
  Recommended for Closure:
http://intertwingly.net/wiki/pie/PaceXhtmlDivSuggestedOnly
http://intertwingly.net/wiki/pie/PaceXmlContentWrapper
Now for some administrivia.  No progress was made on the last published
issues list [1], so I've gone ahead and marked those issues that were
recommended for closure, closed; and those currently under discussion
were moved back to Needing to Revisit.

Next, I'd like to remind everybody that last call for the Atom Format 
went out.  Operationally, what this means is that the secretary and 
co-chairs are going to be increasingly reluctant to revisit things 
simply because somebody wants to bring them up again.  What that means 
is that in order to successfully bring up an issue, you need to do a 
little homework.  Demonstrate that you have revisited the previous 
discussion, and that you either have something new to add, or can point 
out some evidence that the previous consensus call was made in error.

Tim has taken the opportunity to lead by example on this one with 
PaceAllowDuplicateIDs.  The secretary and co-chairs all are in agreement 
that the XhtmlDiv related paces don't meet these criteria.  If anyone
disagrees, what we would like to ask is that you follow Tim's lead.

Because we are in last call, I've scheduled everything related to the 
Format document.  As one of the status paces touches on the format, I've 
scheduled all three.  All we need to resolve now is the extent to which 
this is going to affect the format document.

I believe that PaceBriefExample is truly editorial, meaning that the 
editors can act on this at their discretion.

- Sam Ruby
[1] http://www.imc.org/atom-syntax/mail-archive/msg13691.html

Re: PaceAllowDuplicateIDs

2005-05-05 Thread Graham


On 5 May 2005, at 11:36 am, Henry Story wrote:
Tim, model this as a blog first. Is it:
a) One entry that's being updated?
b) Hourly new postings with the latest price?
Given that the ids are the same it is now clear that we have  
situation a)
I said "first", before we decide what ids we should use. If created  
as a blog, Tim's stock quotes would make most sense (to me) posted as  
hourly new entries. Ergo, they should each have different ids. Again,  
atom:id is not a category system.

Either the change they made is "significant" or it is not. If it is
a significant change then by not changing the atom:updated field the
user will have done something other than what he thought he was
doing. For by not changing the date he is allowing receiving software
to decide for itself whether it wishes to keep or drop the change.
If it is not a significant change, then the receiving software won't
be doing anything problematic by either dropping the later version
received or keeping it.
atom:updated is used by the publisher to show what they consider a  
significant change. The user, on the other hand, wants to see the  
latest version, reliably, even if the publisher disagrees that the  
change was significant. This is the core problem with Tim's proposal.  
There is no way to create an aggregator that works in the way the  
user expects.

I am lying down in the road here, as Tim would say.
Well that seems like a very complicated way of solving a problem  
where allowing entries with duplicate ids in a feed document from  
the start would be much simpler. If you are going to
allow  feeds to keep duplicates then why not just allow   
 and be done with it?
Because feeds are feeds and archives are archives? They have  
different audiences and different uses and different requirements.

Graham

RE: Atom feed refresh rates

2005-05-05 Thread Andy Henderson


>>>we're designing a feed format here. When this feed is served through
HTTP, (re-)using the caching features of HTTP will ensure that any standard
HTTP client will take advantage of it. For instance, if you use an HTTP
client component that maintains its own cache, it will automatically do the
right thing. Also, when you're accessing the feed through an HTTP proxy, you
will get copies from the proxy's cache when available.<<<

Understood.  My issue is that creating the headers is outside the capability
of many (most?) feed providers.

>>>I just checked and Apache allows you to set the "Expires" header through
"mod_expires" (). <<<

I'm sure you're right, but it would mean little to most feed providers
(including me).

>>>Lotus Domino seems to do it through a thing called "Web Site Rule"
(). <<<

DWA is a sub-feature of the mail client so not helpful, I'm afraid.  Thanks
for looking into it, though.

>>>I'm sure you can do it with other packages as well.<<<

If it was available at the blogging package level, I'm sure people that use
those packages would use the feature.  The fact few feeds seem to use the
Expires header, and those that do use it to immediately expire, seems to
indicate an issue (proxy caching?).

>>>On the other hand, of the feeds you checked, how many did actually
implement the corresponding RSS feature
()?<<<

Out of 33 RSS.9/2 feeds, 16 have ttl tags.  Syndic8 reports that 16,840 RSS
feeds use it.  It says that's 7% of the total.  The actual percentage is
better than that because I believe ttl is RSS2 only; it's certainly not
RSS1.

That's pretty high considering that ttl is flawed because it was not
originally designed to communicate minimum refresh intervals.

>>>If you can demonstrate that lots of feeds use this feature, and that
aggregators indeed pay attention to it, you may be able to convince the WG
that Atom needs this to achieve feature parity.<<<

I've no way to demonstrate aggregator use except my own - even though I
support a tiny community I do observe and enforce the ttl tag.  I'm sure
that if there were a clearly-defined tag supported by Mark's implementation
document, usage would significantly improve over RSS2 ttl levels.  As for
convincing the WG, I would simply point out that a mechanism widely
available to, and understood by, feed providers and aggregators cannot do
harm and has the potential to do a great deal of good.  It seems to me to be
a useful opportunity to demonstrate a clear improvement over both RSS1 and
RSS2.

Andy

Re: Atom feed refresh rates

2005-05-05 Thread Julian Reschke

Andy Henderson wrote:
> >>>You can do that with the "Expires" response header. Every time the
> resource is requested, serve it with a value of "now +
> minimumrefreshinterval".<<<
>
> Ah.  I see what you mean.  Thank you.
>
> The problem is that when you say "You can do that now with the "Expires"
> response header" - I can't.  It's a theoretical capability I have, but not a
> practical one.
>
> I am directly responsible for three feeds.  One is a feed associated with my
> aggregator.  It's a simple xml file stored on a shared server along with the
> rest of my web site.  I have no access to any HTTP headers.  When
> aggregators access my feed, no code of mine runs - the transaction is
> handled by the server alone.  The other two are feeds generated on the fly
> from a back-end database; again they are running on a shared server and,
> again, my development tool (IBM's Domino) gives me no access to set the
> Expires header.
>
> I just used Microsoft's Fiddler tool to check all the feeds I subscribe to
> (not a scientific sample, I admit, but it's a pretty broad mix and includes
> blog sites and blogging tools) and just two provide Expires headers.  One is
> the BBC, the other is Wired.  Both set Expires to expire immediately.  I'm
> guessing they have good reason to do that.  I re-subscribed to Slashdot
> (which has implemented draconian bandwidth throttling measures) and it
> doesn't use Expires headers.
>
> So, Expires is a measure that I could use in theory but is not available to
> me either directly or, apparently, via third party blogging sites/tools.
> When I look at best practice, I find Expires is either not used or is used
> in a different way.
>
> Both from a provider viewpoint and from an aggregator viewpoint, Expires
> does not seem a practical option.

Well,
we're designing a feed format here. When this feed is served through 
HTTP, (re-)using the caching features of HTTP will ensure that any 
standard HTTP client will take advantage of it. For instance, if you use 
an HTTP client component that maintains its own cache, it will
automatically do the right thing. Also, when you're accessing the feed 
through an HTTP proxy, you will get copies from the proxy's cache when 
available.

I just checked and Apache allows you to set the "Expires" header through 
"mod_expires" (). 
Lotus Domino seems to do it through a thing called "Web Site Rule"
(). 
I'm sure you can do it with other packages as well.

On the other hand, of the feeds you checked, how many did actually 
implement the corresponding RSS feature 
()?

If you can demonstrate that lots of feeds use this feature, and that
aggregators indeed pay attention to it, you may be able to convince the
WG that Atom needs this to achieve feature parity.

Best regards, Julian

Re: Atom feed refresh rates

2005-05-05 Thread Mark Pilgrim

On 5/5/05, Andy Henderson <[EMAIL PROTECTED]> wrote:
> convincing the WG, I would simply point out that a mechanism widely
> available to, and understood by, feed providers and aggregators cannot do
> harm and has the potential to do a great deal of good.

Not to be flippant, but we have one that's widely available.  It's
called the Expires header.  I spoke with Roy Fielding at Apachecon
2003 and asked him this exact question: "If I set an Expires header on
a feed of now + 3 hours, does that mean that I don't want the client
to fetch the feed again for at least 3 hours?"  And he said yes,
that's exactly what it means.
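
For a publisher who does control response headers, the "now + 3 hours" value can be computed as below. This is a sketch only: the function name is invented, and how the header is attached to the response depends on the server or framework in use.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def expires_header(now, min_refresh):
    """Return an HTTP-date string for `now + min_refresh`, expressed in GMT."""
    expiry = now.astimezone(timezone.utc) + min_refresh
    # usegmt=True emits the "GMT" suffix that HTTP dates use.
    return format_datetime(expiry, usegmt=True)


# Example: a feed served at noon UTC that should not be refetched for 3 hours.
served_at = datetime(2005, 5, 5, 12, 0, 0, tzinfo=timezone.utc)
header_value = expires_header(served_at, timedelta(hours=3))
# header_value == "Thu, 05 May 2005 15:00:00 GMT"
```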

I sympathize with your dilemma that you have no control over your HTTP
headers, but... wait, no I don't sympathize.  At all.

-- 
Cheers,
-Mark

RE: Atom feed refresh rates

2005-05-05 Thread Walter Underwood

--On May 5, 2005 8:15:10 AM +0100 Andy Henderson <[EMAIL PROTECTED]> wrote:
>
> There is no RSS2 feature I can see that allows feed providers to tell
> aggregators the minimum refresh period.  There's the ttl tag.  That was, I
> believe, introduced for a different purpose and determines the Maximum time
> a feed should be cached in a certain situation. 

We need both a ttl (max-age) and expires. One or the other is appropriate
for different publishing needs. We also need to specify what you do with
those values, or you end up with a mess, like the RSS2 ttl meaning reversing
over an undocumented value (Yikes!).

> What has yet to be tried is a specific tag in the core feed standard that
> promotes and determines good behaviour for aggregators refreshing their
> feeds.  Even if it were to prove only a limited benefit, it would still be a
> benefit.

It has been tried several ways, originally in robots.txt extensions and
also in RSS. It doesn't work. The model is not rich enough for publishers
or for spiders/aggregators.

Max-age/expires is already designed and proven. By page count, 20% of the
HTTP 1.1 spec is about caching. If we want to write a new caching/scheduling
approach, we can expect it to be a 20 page spec, plus an additional 10
pages on how to work with the HTTP model.

See the Notes section here for details on when to use max-age or expires,
and on the problems with calendar-based schemes.

wunder
--
Walter Underwood
Principal Architect, Verity

status paces

2005-05-05 Thread Robert Sayre


Throw them all back. No reason we can't do this in an extension
element. It would be a big mistake to put these in right now.

Robert Sayre

Re: Atom feed refresh rates

2005-05-05 Thread Graham

On 5 May 2005, at 1:54 pm, Andy Henderson wrote:
I've no way to demonstrate aggregator use except my own - even  
though I
support a tiny community I do observe and enforce the ttl tag.
You seem to want the ttl element so that you have the publisher's  
permission to check less often. Why not just do so anyway if it  
causes so many problems? If that degrades the user experience too  
much, you're free to check more often. How is the ttl element useful  
to you?

Graham

Re: PaceAllowDuplicateIDs

2005-05-05 Thread Eric Scheid

On 5/5/05 9:32 PM, "Graham" <[EMAIL PROTECTED]> wrote:

> atom:updated is used by the publisher to show what they consider a
> significant change. The user, on the other hand, wants to see the
> latest version, reliably, even if the publisher disagrees that the
> change was significant. This is the core problem with Tim's proposal.
> There is no way to create an aggregator that works in the way the
> user expects.

perhaps we needed atom:modified after all :-(

>> Well that seems like a very complicated way of solving a problem
>> where allowing entries with duplicate ids in a feed document from
>> the start would be much simpler. If you are going to
>> allow  feeds to keep duplicates then why not just allow
>>  and be done with it?
> 
> Because feeds are feeds and archives are archives? They have
> different audiences and different uses and different requirements.

And what about the use case of a wiki's RecentChanges log? Each entry refers
to a specific page, and there may be multiple such entries for each page as
it gets rapidly edited ... and wiki folks have found it important to be able
to monitor all change events.

I sure hope we don't have to define yet another atom document root element
(eg. ). That sounds like such a hack.

e.

Re: Atom feed refresh rates

2005-05-05 Thread Walter Underwood

--On May 5, 2005 8:07:15 AM -0500 Mark Pilgrim <[EMAIL PROTECTED]> wrote:
>
> Not to be flippant, but we have one that's widely available.  It's
> called the Expires header. 

You need the information outside of HTTP. To quote from the RSS spec
for ttl:

  This makes it possible for RSS sources to be managed by a file-sharing 
  network such as Gnutella. 

Caching information is about knowing when your client cache is stale,
regardless of how you got the feed.
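
To make the transport-independent point concrete, here is a rough sketch of a staleness check that only needs the time the copy was obtained and a max-age value carried with the feed. The parameter names are assumptions for illustration; nothing here is defined by RSS, Atom, or PaceCaching.

```python
from datetime import datetime, timedelta, timezone


def is_stale(fetched_at: datetime, max_age_seconds: int, now: datetime) -> bool:
    """True once the cached copy is at least max_age_seconds old.

    It does not matter whether the copy arrived over HTTP, a file-sharing
    network, or a local file: only the age of the copy is consulted.
    """
    return now - fetched_at >= timedelta(seconds=max_age_seconds)


fetched = datetime(2005, 5, 5, 8, 0, 0, tzinfo=timezone.utc)
fresh = is_stale(fetched, 3600, fetched + timedelta(minutes=30))
expired = is_stale(fetched, 3600, fetched + timedelta(minutes=90))
```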

wunder
--
Walter Underwood
Principal Architect, Verity

Re: Autodiscovery, real-world examples

2005-05-05 Thread Antone Roundy

On Thursday, May 5, 2005, at 01:27  AM, fantasai wrote:
And for linking to other pages.. Here's a real-world example:
The mozilla.org main page  is an example
of where rel="alternate" is a problem. There were three feeds on
it: "Announcements", "mozillaZine News", and "Mozilla Weblogs"
(now only two). Each one is an alternate of a web page, but of
_other_ pages (http://www.mozilla.org/news.html, 
http://www.mozillazine.org/, and http://planet.mozilla.org/ 
respectively), not the mozilla.org
front page. The last few headlines for each feed are listed on
the front page, and the designer felt it was appropriate for
autodiscovery to work on this page -- but it is not appropriate
for rel="alternate" to be used for those autodiscovery links.
They are not alternate representations of the front page.

I'm beginning to sway in the direction of this argument, but I'm not 
sure whether I'll sway back or not.  Given that @type will clearly (for 
Atom and RSS 2 anyway, if not for RSS 1) identify the feed as a feed 
(...or maybe an Atom Entry Document...the feed reader can deal with 
that issue when the user tries to subscribe), I don't think there's a 
big need for @rel to say "feed".  But it wouldn't be illogical to use 
"alternate" for only the feed associated with a particular page 
(perhaps including the case of an individual entry archive page), and 
something else like "related" to point to other feeds.  A UA could 
check @rel and @type and present a UI saying something like "subscribe 
to the feed for this page" and "subscribe to a feed related to this 
page".
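
A rough sketch of what such a UA check might look like, using Python's standard HTML parser. The rel value "related" is only the hypothetical alternative floated above, and the two media types are assumed to be how Atom and RSS 2 feeds get labelled; neither is fixed by the autodiscovery draft as far as this sketch knows.

```python
from html.parser import HTMLParser

# Assumed media types for feeds; RSS 1 is left out on purpose because
# its type does not clearly identify a feed.
FEED_TYPES = {"application/atom+xml", "application/rss+xml"}


class FeedLinkFinder(HTMLParser):
    """Collects (label, href) pairs for feed links found in a page."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        a = dict(attrs)
        if (a.get("type") or "").lower() not in FEED_TYPES:
            return
        rels = (a.get("rel") or "").lower().split()
        if "alternate" in rels:
            label = "subscribe to the feed for this page"
        elif "related" in rels:  # hypothetical rel value
            label = "subscribe to a feed related to this page"
        else:
            return
        self.found.append((label, a.get("href")))


finder = FeedLinkFinder()
finder.feed(
    '<link rel="alternate" type="application/atom+xml" href="/front.atom">'
    '<link rel="related" type="application/rss+xml" href="/news.rss">'
    '<link rel="stylesheet" type="text/css" href="/site.css">'
)
```

After feeding the three links, finder.found holds one entry for each of the two feed links and ignores the stylesheet.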

RE: Atom feed refresh rates

2005-05-05 Thread Andy Henderson


>>>You seem to want the ttl element so that you have the publisher's
permission to check less often. Why not just do so anyway if it causes so
many problems? If that degrades the user experience too much, you're free to
check more often. How is the ttl element useful to you?<<<

I allow anyone to specify any refresh interval higher than the greater of
ttl or 60 minutes.

The ttl allows me to extend the minimum refresh interval beyond 60 minutes.
'MSDN just published' at http://msdn.microsoft.com/rss.xml includes a ttl of
1440.  I therefore set the refresh interval to 1 day when the
feed is added and I do not allow people to specify a lower refresh interval.

If the ttl tag simply described the minimum refresh interval, I would also
use it to allow people to specify refresh intervals less than 60 minutes
knowing that was acceptable to the feed provider.  Unfortunately, the
genesis of the ttl tag means that lower ttl values are unreliable.  The BBC,
for example, specifies a ttl of 5 which I'm sure refers to that tag's
original use, not a minimum refresh interval.
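
Spelled out as code, the policy is roughly the following. This is a sketch of the rule as described, not the aggregator's actual source; the function names are invented, and "higher than" is read here as "at least".

```python
from typing import Optional

FLOOR_MINUTES = 60  # the fixed 60 minute lower bound described above


def minimum_interval(ttl_minutes: Optional[int]) -> int:
    """The greater of the feed's ttl (if any) and the 60 minute floor."""
    return max(ttl_minutes or 0, FLOOR_MINUTES)


def effective_interval(requested_minutes: Optional[int],
                       ttl_minutes: Optional[int]) -> int:
    """Interval actually used: the user's choice, raised to the minimum."""
    floor = minimum_interval(ttl_minutes)
    if requested_minutes is None:
        # No user preference yet: start at the minimum when the feed is added.
        return floor
    return max(requested_minutes, floor)


msdn = effective_interval(None, 1440)   # ttl of 1440 minutes, i.e. one day
bbc = effective_interval(15, 5)         # a ttl of 5 cannot lower the floor
```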

Andy

Re: PaceAllowDuplicateIDs

2005-05-05 Thread Robert Sayre

On 5/5/05, Eric Scheid <[EMAIL PROTECTED]> wrote:

> > Because feeds are feeds and archives are archives? They have
> > different audiences and different uses and different requirements.
> 
> And what about the use case of a wiki's RecentChanges log? Each entry refers
> to a specific page, and there may be multiple such entries for each page as
> it gets rapidly edited ... and wiki folks have found it important to be able
> to monitor all change events.

I'm much more sympathetic to the aggregate feed problem of multiple
IDs. People advocating this type of thing seem to think the default
action should be grouping, so they want to use the same ID. I think
that's a bad idea, and there are plenty of other ways to indicate the
fundamental sameness of entries. For example, NewsML URNs have a
NewsItemID and a RevisionID, which would allow smart aggregators to
group the entries without violating Atom's constraint.

Robert Sayre

Re: PaceAllowDuplicateIDs

2005-05-05 Thread Graham


On 5 May 2005, at 2:26 pm, Eric Scheid wrote:
perhaps we needed atom:modified after all :-(
Yes we do, if we want to go down this route. I suggest appending the  
current time (or for old versions, the last time that version was  
current) at the source.

And what about the use case of a wiki's RecentChanges log? Each  
entry refers
to a specific page, and there may be multiple such entries for each  
page as
it gets rapidly edited ... and wiki folks have found it important  
to be able
to monitor all change events.
Each log entry is an entry in itself, with its own id. That seems a  
far better functional parallel to the basic blog feed. As with the  
share price example, the topic of the entry (the company, or the wiki  
page) is far more analogous to a category that the entry belongs to,  
than to its identity. Everyone stop trying to use ids as a  
category system.

Graham

Re: AtomPubIssuesList for 2005/05/05

2005-05-05 Thread Walter Underwood

--On May 5, 2005 7:17:00 AM -0400 Sam Ruby <[EMAIL PROTECTED]> wrote:
>
> Demonstrate that you have revisited the previous discussion, and that you 
> either
> have something new to add, or can point out some evidence that the previous
> consensus call was made in error.

PaceCaching was not discussed and rejected based on false information.
It was rejected because it was HTTP-specific (it is not), and because
it was non-core (similar features are common in other RSS specs).

It does not interact with other features, so it should be a fairly
clean, quick discussion.

wunder
--
Walter Underwood
Principal Architect, Verity

Re: Last Call: 'The Atom Syndication Format' to Proposed Standard

2005-05-05 Thread A. Pagaltzis


* Thomas Broyer <[EMAIL PROTECTED]> [2005-05-03 19:35]:
> This means type="text" content is a single paragraph of text.
> If you need paragraphs, lists or any other "structural
> formatting", you have to use type="html" or type="xhtml" with
> the appropriate content.

Or type="text/plain", I'd assume?

Regards,
-- 
Aristotle

Re: Autodiscovery, real-world examples

2005-05-05 Thread Robert Sayre

On 5/5/05, Antone Roundy <[EMAIL PROTECTED]> wrote:
> 
> On Thursday, May 5, 2005, at 01:27  AM, fantasai wrote:
> > front page. The last few headlines for each feed are listed on
> > the front page, and the designer felt it was appropriate for
> > autodiscovery to work on this page -- but it is not appropriate
> > for rel="alternate" to be used for those autodiscovery links.
> > They are not alternate representations of the front page.

I can see your point, but the autodiscovery spec isn't standardizing
all usages. Just the one we have a good grasp on. Secondly, there's
nothing stopping UAs from "discovering" other feeds. Safari 2.0
already does this.

Robert Sayre

Re: AtomPubIssuesList for 2005/05/05

2005-05-05 Thread Mark Pilgrim


On 5/5/05, Walter Underwood <[EMAIL PROTECTED]> wrote:
> It does not interact with other features, so it should be a fairly
> clean, quick discussion.

You must be new here.

/ducks

-- 
Cheers,
-Mark

Re: Atom feed refresh rates

2005-05-05 Thread Mark Pilgrim

On 5/5/05, Walter Underwood <[EMAIL PROTECTED]> wrote:
> You need the information outside of HTTP. To quote from the RSS spec
> for ttl:
> 
>   This makes it possible for RSS sources to be managed by a file-sharing
>   network such as Gnutella.

Ignoring, for the moment, that this is a horrible idea and no one
supports it, Gnutella has its own caching and time-to-live mechanisms
that the RSS spec is ignoring.

-- 
Cheers,
-Mark

Re: PaceAllowDuplicateIDs

2005-05-05 Thread Eric Scheid

On 5/5/05 11:55 PM, "Graham" <[EMAIL PROTECTED]> wrote:

>> And what about the use case of a wiki's RecentChanges log? Each entry refers
>> to a specific page, and there may be multiple such entries for each  page as
>> it gets rapidly edited ... and wiki folks have found it important to be able
>> to monitor all change events.
>> 
> Each log entry is an entry in itself, with its own id.

Sorry, that makes as much sense as changing the id for a blog entry if that
blog entry is updated.

The functional parallel is wiki-page = blog-entry, and if a blog-entry is
updated then that is reflected in the feed as an updated entry - with the
same id.

e.

Re: PaceCaching

2005-05-05 Thread Graham

On 5 May 2005, at 3:02 pm, Walter Underwood wrote:
PaceCaching was not discussed and rejected based on false information.
It was rejected because it was HTTP-specific (it is not), and because
it was non-core (similar features are common in other RSS specs).
It does not interact with other features, so it should be a fairly
clean, quick discussion.
Unless we can make providing incorrect or misleading information in  
either of these elements lead to the immediate purging of first born,  
they're not at all useful to anyone. The "expires date" can't apply  
to 99% of feeds since they don't work on a fixed or predictable  
schedule. Meanwhile "max-age" doesn't provide any actionable  
information to caches, beyond "I chose this random number off the top  
of my head when I wrote my Atom script 3 years ago. Deal." Do you  
seriously expect it to be interpreted as a promise that the feed  
won't change for the next x minutes?

Graham

Re: PaceCaching

2005-05-05 Thread Mark Pilgrim

On 5/5/05, Graham <[EMAIL PROTECTED]> wrote:
> seriously expect it to be interpreted as a promise that the feed
> won't change for the next x minutes?

No, but I do seriously expect it to be interpreted that the feed
publisher does not wish clients to check it for the next x minutes.

-- 
Cheers,
-Mark

Re: Atom feed refresh rates

2005-05-05 Thread James Aylett

On Thu, May 05, 2005 at 08:07:15AM -0500, Mark Pilgrim wrote:

> Not to be flippant, but we have one that's widely available.  It's
> called the Expires header.  I spoke with Roy Fielding at Apachecon
> 2003 and asked him this exact question: "If I set an Expires header on
> a feed of now + 3 hours, does that mean that I don't want the client
> to fetch the feed again for at least 3 hours?"  And he said yes,
> that's exactly what it means.

I think the problem here may be that the HTTP/1.1 spec gives the
impression that the Expires header is not designed to affect end
clients (user agents).

For instance, from 13.2.1 ("Server-Specified Expiration"), we get the
sentence:

"The expiration mechanism applies only to responses taken from a cache
and not to first-hand responses forwarded immediately to the
requesting client."

Now many clients themselves contain caches, but this distinction may
still be the source of some confusion, especially as the number of
people who know about the distinction (by having written a user agent)
compared to the number who are affected by it (by writing server
components) is pretty small.

James

-- 
/--\
  James Aylett  xapian.org
  [EMAIL PROTECTED]   uncertaintydivision.org

Re: PaceAllowDuplicateIDs

2005-05-05 Thread Henry Story


On 5 May 2005, at 15:55, Graham wrote:
On 5 May 2005, at 2:26 pm, Eric Scheid wrote:

perhaps we needed atom:modified after all :-(
Yes we do, if we want to go down this route. I suggest appending  
the current time (or for old versions, the last time that version  
was current) at the source.
Sorry I don't understand why we need atom:modified.

And what about the use case of a wiki's RecentChanges log? Each  
entry refers
to a specific page, and there may be multiple such entries for  
each page as
it gets rapidly edited ... and wiki folks have found it important  
to be able
to monitor all change events.

Each log entry is an entry in itself, with its own id. That seems a  
far better functional parallel to the basic blog feed.
As I explained in my lengthy reply to your lengthy post, I think one  
should be able to do either.
Each way has its advantages and disadvantages. Let the publisher  
decide which mechanism to use.

As with the share price example, the topic of the entry (the  
company, or the wiki page) is far more analogous to a category that  
the entry belongs to, than to its identity.
Again let the publisher choose what the identity criterion of his  
objects are. Some will stick
some will not. But it is not up to us to decide for our users.

Since it does not cause any interoperability issues, what's the problem?
Everyone stop trying to use ids as a category system.
I don't think that one would be using ids as a category system. If  
you go to
 you get today's front page. Tomorrow you get  
tomorrow's front page.
What's the problem? Is  a hidden category system?

Henry Story
http://bblfish.net/blog/

Re: PaceAllowDuplicateIDs

2005-05-05 Thread Robert Sayre

On 5/5/05, Eric Scheid <[EMAIL PROTECTED]> wrote:
> 
> On 5/5/05 11:55 PM, "Graham" <[EMAIL PROTECTED]> wrote:
> >>
> > Each log entry is an entry in itself, with its own id.
> 
> Sorry, that makes as much sense as changing the id for a blog entry if that
> blog entry is updated.

Graham's got it exactly right. 

> The functional parallel is wiki-page = blog-entry, and if a blog-entry is
> updated then that is reflected in the feed as an updated entry - with the
> same id.

That's right, that is the functional parallel. No software I know of 
shows both revisions of the entry in the feed when it's updated. If
you are syndicating wiki changes, part of each entry is the diff and
revision id--each revision is a unique thing.

Another analogous use case would be a feed watching a certain file in
CVS. Every entry would be about the same file, but each would have its
own atom:id.

Once again, there remains a downstream problem for PubSub, etc. 

Robert Sayre

Re: PaceAllowDuplicateIDs

2005-05-05 Thread Graham


On 5 May 2005, at 3:32 pm, Henry Story wrote:
As I explained in my lengthy reply to your lengthy post, I think  
one should be able to do either.
Each way has its advantages and disadvantages. Let the publisher  
decide which mechanism to use.
Well please flag it so that I can provide a consistent user interface  
to people's whims?

Since it does not cause any interoperability issues, what's the  
problem?
I have to come up with a new way to recognise and interpret such  
feeds where an entry (as defined by its id) isn't an entry but a feed  
of different entries.

I don't think that one would be using ids as a category system. If  
you go to
 you get today's front page. Tomorrow you get  
tomorrow's front page.
What's the problem? Is  a hidden category system?
Charter: "Atom defines a feed format for representing resources such  
as Weblogs, online journals, Wikis,
and similar content"

Atom is not a replacement for HTTP. Google.com is a web page, not  
"similar content". It's not relevant here.

Graham

Re: AtomPubIssuesList for 2005/05/05

2005-05-05 Thread Sam Ruby

Walter Underwood wrote:
--On May 5, 2005 7:17:00 AM -0400 Sam Ruby <[EMAIL PROTECTED]> wrote:
Demonstrate that you have revisited the previous discussion, and that you either
have something new to add, or can point out some evidence that the previous
consensus call was made in error.
PaceCaching was not discussed and rejected based on false information.
It was rejected because it was HTTP-specific (it is not), and because
it was non-core (similar features are common in other RSS specs).
Actually, it never has been rejected.  I had miscategorized it as 
protocol.  I'm fixing that now, and scheduled it for this cycle.

Sorry for the confusion.
- Sam Ruby

Re: status paces

2005-05-05 Thread Antone Roundy

On Thursday, May 5, 2005, at 07:13  AM, Robert Sayre wrote:
Throw them all back. No reason we can't do this in an extension
element.
+1

Re: PaceCaching

2005-05-05 Thread James Aylett


On Thu, May 05, 2005 at 09:26:46AM -0500, Mark Pilgrim wrote:

> > seriously expect it to be interpreted as a promise that the feed
> > won't change for the next x minutes?
> 
> No, but I do seriously expect it to be interpreted that the feed
> publisher does not wish clients to check it for the next x minutes.

Indeed - Expires doesn't say it won't change, it says that you
shouldn't care whether it's changed.

James

-- 
/--\
  James Aylett  xapian.org
  [EMAIL PROTECTED]   uncertaintydivision.org

Re: PaceAllowDuplicateIDs

2005-05-05 Thread Antone Roundy

If we accept this Pace, are we going to do anything to address the DOS 
issue for aggregated feeds?

Re: PaceAllowDuplicateIDs

2005-05-05 Thread Graham


On 5 May 2005, at 3:23 pm, Eric Scheid wrote:
And what about the use case of a wiki's RecentChanges log? Each  
entry refers
to a specific page, and there may be multiple such entries for  
each  page as
it gets rapidly edited ... and wiki folks have found it important  
to be able
to monitor all change events.


Each log entry is an entry in itself, with its own id.
Sorry, that makes as much sense as changing the id for a blog entry  
if that
blog entry is updated.
Have a look here:
http://en.wikipedia.org/w/index.php?title=Main_Page&action=history
There you have a reverse chrono list, each with an author, date, and  
summary. Looks an awful lot like each one is an entry to me.

Graham

PaceTextShouldBeProvided

2005-05-05 Thread Antone Roundy

+1 except for one thing: In section 4.1.2, I'd suggest something along 
these lines:

atom:entry elements which do not contain an atom:content element, or 
whose atom:content element's type attribute indicates a MIME media 
type, SHOULD contain an atom:summary element.
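
For what it is worth, the suggested wording is easy to check mechanically. The sketch below assumes the http://www.w3.org/2005/Atom namespace and assumes that "text", "html" and "xhtml" are the only non-MIME values of the type attribute, with a missing attribute treated as "text"; the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"   # assumed namespace
TEXT_TYPES = {"text", "html", "xhtml"}   # assumed non-MIME type values


def should_have_summary(entry: ET.Element) -> bool:
    """True when the suggested wording says the entry SHOULD carry a summary."""
    content = entry.find(ATOM + "content")
    if content is None:
        return True
    # Anything other than the three text constructs is taken to be a MIME type.
    return content.get("type", "text").lower() not in TEXT_TYPES


def missing_recommended_summary(entry: ET.Element) -> bool:
    """True when a summary is recommended but absent."""
    return should_have_summary(entry) and entry.find(ATOM + "summary") is None


NS = 'xmlns="http://www.w3.org/2005/Atom"'
no_content = ET.fromstring(f"<entry {NS}><title>t</title></entry>")
mime_only = ET.fromstring(
    f'<entry {NS}><content type="image/png" src="http://example.org/x.png"/></entry>'
)
with_text = ET.fromstring(f'<entry {NS}><content type="html">hi</content></entry>')
with_summary = ET.fromstring(
    f'<entry {NS}><content type="image/png" src="http://example.org/x.png"/>'
    f"<summary>A picture</summary></entry>"
)
```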

PaceOptionalSummary

2005-05-05 Thread Antone Roundy

+1.
...oh, and the wording I just suggested for part of 
PaceTextShouldBeProvided would depend on this also being accepted.

PaceRecommendPlainTextContent

2005-05-05 Thread Antone Roundy

-1.
This is entirely up to the publisher.  I think enough publishers are 
going to want things like links and line/paragraph 
breaks in their content that this recommendation would be so routinely 
ignored as to be meaningless.

PaceAlternateLinkWeakening

2005-05-05 Thread Henry Story

I have put PaceAlternateLinkWeakening on the wiki, though it was  
discussed on this
list, as it might not have caught the eye of the editors/secretary.

http://www.intertwingly.net/wiki/pie/PaceAlternateLinkWeakening
I think this is a very uncontroversial clarification.
Henry Story

Re: PaceAllowDuplicateIDs

2005-05-05 Thread Eric Scheid

On 6/5/05 12:32 AM, "Henry Story" <[EMAIL PROTECTED]> wrote:

> Sorry I don't understand why we need atom:modified.

Graham suggested it : a reliable way for an aggregator to discern the latest
version of an entry.

> atom:updated is used by the publisher to show what they consider a
> significant change. The user, on the other hand, wants to see the
> latest version, reliably, even if the publisher disagrees that the
> change was significant. This is the core problem with Tim's proposal.
> There is no way to create an aggregator that works in the way the
> user expects.

... no way, that is, unless we have atom:modified making each same-id entry
distinct, and not just distinct but also time-ordered and time-distanced (an
advantage over just using something similar to the NewsML RevisionID
mechanism).

e.

Re: PaceAllowDuplicateIDs

2005-05-05 Thread Eric Scheid

On 6/5/05 12:45 AM, "Graham" <[EMAIL PROTECTED]> wrote:

> Have a look here:
> http://en.wikipedia.org/w/index.php?title=Main_Page&action=history
> 
> There you have a reverse chrono list, each with an author, date, and
> summary. Looks an awful lot like each one is an entry to me.

and looks to me like a stream of meta-data concerning the one entry.

and not distinct and separable entries like you'd find in your every day
blog.

Henry has the right idea -- the spec should allow both kinds, rather than
trying to shoe-horn everything into the one viewpoint of what is an entry.

e.

PaceOriginalAttribute

2005-05-05 Thread Antone Roundy

-1.
I don't see that this solves any problem.  It may help people avoid 
accidentally generating invalid feeds (if we stick to not allowing 
duplication of atom:id within a feed), but it does it by simply 
shunting the issue off into a different element which doesn't have 
duplication constraints.  It doesn't address the DOS problem--neither 
accidental nor intentional.  And it doesn't make it any easier to 
determine whether or not entries in different feeds with the same 
atom:id are really the same entry or not.  In fact, it just complicates 
the task by requiring the inspection of two elements instead of one.

Re: PaceAllowDuplicateIDs

2005-05-05 Thread Henry Story


On 5 May 2005, at 16:38, Graham wrote:
On 5 May 2005, at 3:32 pm, Henry Story wrote:
As I explained in my lengthy reply to your lengthy post, I think  
one should be able to do either.
Each way has its advantages and disadvantages. Let the publisher  
decide which mechanism to use.
Well please flag it so that I can provide a consistent user  
interface to people's whims?
What is the problem with the user interface that you have exactly? I  
have pointed you to
BlogEd that keeps a history of all the changes to an entry. Try it out:
http://blogs.sun.com/roller/page/bblfish/
It's open source, so you can also copy the code.

If you don't want to keep a history of the entries all you need to do  
is drop all but the
latest entry with the same id. There is nothing more to it. Just show  
the user the last
one you came across.
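
A short sketch of the "drop all but the latest" approach, to show how little is involved. It is not BlogEd's code. It assumes entries arrive as (id, updated, payload) tuples and picks the latest by the updated timestamp rather than by document order; the id value is a made-up example.

```python
from datetime import datetime


def latest_versions(entries):
    """Keep one payload per id: the one with the most recent timestamp.

    entries: iterable of (entry_id, updated_iso8601, payload) tuples.
    """
    latest = {}
    for entry_id, updated, payload in entries:
        stamp = datetime.fromisoformat(updated)
        kept = latest.get(entry_id)
        # >= means that on equal timestamps the entry seen last wins.
        if kept is None or stamp >= kept[0]:
            latest[entry_id] = (stamp, payload)
    return {eid: payload for eid, (_, payload) in latest.items()}


QUOTE_ID = "tag:example.org,2005:quote/MSFT"  # hypothetical atom:id
quotes = [
    (QUOTE_ID, "2005-05-03T10:00:00-05:00", "Bid: 25.20 Ask: 25.50 Last: 25.20"),
    (QUOTE_ID, "2005-05-03T12:00:00-05:00", "Bid: 25.10 Ask: 25.15 Last: 25.10"),
    (QUOTE_ID, "2005-05-03T11:00:00-05:00", "Bid: 25.15 Ask: 25.25 Last: 25.20"),
]
current = latest_versions(quotes)
```

An aggregator that does want the history would keep the whole list instead and use the same timestamps to order it.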

Since it does not cause any interoperability issues, what's the  
problem?
I have to come up with a new way to recognise and interpret such  
feeds where an entry (as defined by its id) isn't an entry but a  
feed of different entries.
No you don't. Just drop the old ones, if you don't care about the  
history. Really simple.
As Tim Bray's text says

 [[
   software MAY choose to display all of them or some subset of them
 ]]
So just drop the older versions.
I don't think that one would be using ids as a category system. If  
you go to
 you get today's front page. Tomorrow you get  
tomorrow's front page.
What's the problem? Is  a hidden category system?
Charter: "Atom defines a feed format for representing resources  
such as Weblogs, online journals, Wikis,
and similar content"
yes, and it must also allow the representation of
[[
   * a complete archive of all entries in a feed
]]
This proposal permits this, and it does not harm anyone else.
Atom is not a replacement for HTTP. Google.com is a web page, not  
"similar content". It's not relevant here.
I don't know where you get the idea that I said atom is a replacement  
for HTTP. Take a breath
perhaps and relax before you answer.

Graham

Re: PaceAllowDuplicateIDs

2005-05-05 Thread David M Johnson

I'm -1 on PaceAllowDuplicateIDs
Reasons:
1) We're supposed to be standardizing current practice not inventing 
new things. Current best practice is to have unique IDs and current 
software (e.g. Javablogs.com) is predicated on this practice. I know, 
this practice is not followed widely enough, but that is another 
matter.

2) I think it is *much* more useful to think of an Atom Entry as an 
event that occurred at a specific time. Typically, an event is the 
publication of an article or blog entry on the web. For example:

   event: CNET published article
   subject: CNET
   object: article
But it could also represent other events.
   event: delivery van delivers package
   subject: delivery van
   object: package
   event: alarm system sends warning
   subject: alarm system
   object: warning
   event: server sends load warning
   subject: server
   object: load warning
If you think of Atom Entries as events, then it makes sense to consider 
the Atom Entry ID to be the ID of the event, not the ID of the subject 
or object of the event. Events are unique (you can't have more than one 
version of an event) and can be assigned GUIDs and therefore you cannot 
have more than one entry with the same ID.

In the case of earthquake data, each new data report is a new event.
   event: agency reports earthquake data
   subject: agency
   object: earthquake data
The ID is the ID of the "data reported" event not the ID of the 
earthquake.

We don't know what subjects and objects people are going to use in the 
future, so we can't specify Atom elements or IDs for subjects and 
objects -- that's what extensions are for. If you want to create a feed 
to syndicate information about earthquakes, then you introduce an 
extension for uniquely identifying earthquakes. The same goes for 
earthquakes.
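
As a sketch of the event view: each report gets its own freshly generated id, while a separate field stands in for an extension element naming the earthquake. The class and field names are invented for illustration and do not correspond to any proposed Atom extension.

```python
import uuid
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ReportEntry:
    quake_id: str      # hypothetical extension value identifying the earthquake
    magnitude: float
    # The entry id identifies the "data reported" event, so it is unique per report.
    entry_id: str = field(default_factory=lambda: uuid.uuid4().urn)


def group_by_subject(entries):
    """Group reports by the earthquake they describe, not by entry id."""
    groups = defaultdict(list)
    for entry in entries:
        groups[entry.quake_id].append(entry)
    return dict(groups)


first = ReportEntry("quake-example-1", 5.9)
revised = ReportEntry("quake-example-1", 6.1)
grouped = group_by_subject([first, revised])
```

Both reports land in the same group even though their entry ids differ, which is the grouping behaviour a smart aggregator would need from such an extension.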

- Dave

On May 5, 2005, at 12:02 AM, Tim Bray wrote:

http://www.intertwingly.net/wiki/pie/PaceAllowDuplicateIDs
This Pace was motivated by a talk I had with Bob Wyman today about the 
problems the synthofeed-generator community has.

Summary:
1. There are multiple plausible use-cases for feeds with duplicate IDs
2. Pro and Contra
3. Alternate Paces
4. Details about this Pace
1. Use-Cases
Here's a stream of stock-market quotes.
My Portfolio
 
 MSFT
  2005-05-03T10:00:00-05:00
  Bid: 25.20 Ask: 25.50 Last: 25.20
  
 MSFT
  2005-05-03T11:00:00-05:00
  Bid: 25.15 Ask: 25.25 Last: 25.20
  
 MSFT
  2005-05-03T12:00:00-05:00
  Bid: 25.10 Ask: 25.15 Last: 25.10
  

You could also imagine a stream of weather readings.  Bob's actual 
here-and-now today use-case from PubSub is earthquakes, an entry 
describes an earthquake and they keep re-issuing it as new info about 
strength/location comes in.

Some people only care about the most recent version of the entry, 
others might want to see all of them.  Basically, each atom:entry 
element describes the same Entry, only at a different point in time.

You could argue that in some cases, these are representations of the 
Web resources identified by the atom:id URI, but I don't think we need 
to say that explicitly.

Yes, you could think of alternate ways of representing stock quotes or 
any of the other use-cases but this is simple and direct and 
idiomatic.

2. Pro and Contra
Given that I issued the consensus call rejecting the last attempt to 
do this, which was  PaceRepeatIdInDocument, I felt nervous about 
revisiting the issue.  So I went and reviewed the discussion around 
that one, which I extracted and placed at 
http://www.tbray.org/tmp/RepeatID.txt for the WG's convenience.

Reviewing that discussion, I'm actually not impressed.  There were a 
few -1's but very few actual technical arguments about why this 
shouldn't be done.  The most common was "Software will screw this up". 
 On reflection, I don't believe that.  You have a bunch of Entries, 
some of them have the same ID and are distinguished by datestamp.  
Some software will show the latest, some will show all of them, the 
good software will allow switching back and forth.  Doesn't seem like 
rocket science to me.

So here's how I see it: there are plausible use cases for doing this, 
and one of the leading really large-scale implementors in the space 
(PubSub) wants to do this right now.  Bob's been making strong claims 
about not being able to use Atom if this restriction remains in place.

I believe strongly that if there's something that implementors want to 
do, standards shouldn't get in the way unless there's real 
interoperability damage.  I'm certainly prepared to believe that this 
could cause interoperability damage, but to date I haven't seen any 
convincing arguments that it will.  I think that if we nonetheless 
forbid it, people who want to do this will (a) use RSS instead of 
Atom, (b) cook up horrible kludges, or (c) ignore us and just do it.

So my best estimate is that the cost of allowing dupes is probably 
much lower than the cost of forbidding them.

Finally, our charter does say that we're al

Re: PaceTextShouldBeProvided

2005-05-05 Thread Sam Ruby

Antone Roundy wrote:
+1 except for one thing: In section 4.1.2, I'd suggest something along 
these lines:

atom:entry elements which do not contain an atom:content element, or 
whose atom:content element's type attribute indicates a MIME media type, 
SHOULD contain an atom:summary element.
Incorporated.  Thanks!
- Sam Ruby

Re: AtomPubIssuesList for 2005/05/05

2005-05-05 Thread Henry Story

Can you add PaceAlternateLinkWeakening. It was discussed but I never  
put it on
the wiki.

Henry
On 5 May 2005, at 13:17, Sam Ruby wrote:

*** REMINDER **
** Use more specific subject lines when responding to this note! **
*** REMINDER **
First the meat... here's the new atom pub issues list, conveniently  
sorted into categories:

  EntryId:
http://intertwingly.net/wiki/pie/PaceAllowDuplicateIDs
http://intertwingly.net/wiki/pie/PaceDuplicateIDWithSource2
http://intertwingly.net/wiki/pie/PaceDuplicateIDWithSource
http://intertwingly.net/wiki/pie/PaceExplainDuplicateIds
  FeedId:
http://intertwingly.net/wiki/pie/PaceFeedIdOrAlternate
http://intertwingly.net/wiki/pie/PaceFeedIdOrSelf
http://intertwingly.net/wiki/pie/PaceOptionalFeedLink
  Provenance:
http://intertwingly.net/wiki/pie/PaceOriginalAttribute
http://intertwingly.net/wiki/pie/PaceSourceRecs
  Status:
http://intertwingly.net/wiki/pie/PaceEntryState
http://intertwingly.net/wiki/pie/PacePubControl
http://intertwingly.net/wiki/pie/PacePubStatusResource
  Text:
http://intertwingly.net/wiki/pie/PaceBriefExample
http://intertwingly.net/wiki/pie/PaceCoConstraintsAreBad
http://intertwingly.net/wiki/pie/PaceOptionalSummary
http://intertwingly.net/wiki/pie/PaceRecommendPlainTextContent
http://intertwingly.net/wiki/pie/PaceTextShouldBeProvided
  Recommended for Closure:
http://intertwingly.net/wiki/pie/PaceXhtmlDivSuggestedOnly
http://intertwingly.net/wiki/pie/PaceXmlContentWrapper
Now for some administrivia.  No progress was made on the last  
published issues list[1], so I've gone ahead and marked those  
issues that were recommended for closure, closed; and those  
currently under discussion were moved back to Needing to Revisit.

Next, I'd like to remind everybody that last call for the Atom  
Format went out.  Operationally, what this means is that the  
secretary and co-chairs are going to be increasingly reluctant to  
revisit things simply because somebody wants to bring them up  
again.  What that means is that in order to successfully bring up  
an issue, you need to do a little homework.  Demonstrate that you  
have revisited the previous discussion, and that you either have  
something new to add, or can point out some evidence that the  
previous consensus call was made in error.

Tim has taken the opportunity to lead by example on this one with  
PaceAllowDuplicateIDs.  The secretary and co-chairs all are in  
agreement that the XhtmlDiv related paces don't meet these  
criteria.  If anyone disagrees, what we would like to ask is that  
you follow Tim's lead.

Because we are in last call, I've scheduled everything related to  
the Format document.  As one of the status paces touches on the  
format, I've scheduled all three.  All we need to resolve now is  
the extent to which this is going to affect the format document.

I believe that PaceBriefExample is truly editorial, meaning that  
the editors can act on this at their discretion.

- Sam Ruby
[1] http://www.imc.org/atom-syntax/mail-archive/msg13691.html

Re: PaceTextShouldBeProvided

2005-05-05 Thread Robert Sayre


On 5/5/05, Antone Roundy <[EMAIL PROTECTED]> wrote:
> 
> +1 except for one thing: In section 4.1.2, I'd suggest something along
> these lines:
> 
> atom:entry elements which do not contain an atom:content element, or
> whose atom:content element's type attribute indicates a MIME media
> type, SHOULD contain an atom:summary element.

This Pace is incompatible with PaceOptionalSummary and incomplete. -1.

Robert Sayre

PaceSourceRecs

2005-05-05 Thread Antone Roundy

Looks good, but perhaps the recommendations could be slightly 
different: Start by calculating the language of the atom:feed and 
the atom:entry.  Second, if the language of atom:entry isn't the same 
as the aggregate feed, set it.  Third, if the language of atom:feed 
isn't the same as the atom:entry, set it.  Same process with the Base 
IRI.  This process would prevent unnecessary duplication.

Re: Atom feed refresh rates

2005-05-05 Thread Henri Sivonen

On May 5, 2005, at 16:24, Walter Underwood wrote:
--On May 5, 2005 8:07:15 AM -0500 Mark Pilgrim <[EMAIL PROTECTED]> 
wrote:
Not to be flippant, but we have one that's widely available.  It's
called the Expires header.
You need the information outside of HTTP. To quote from the RSS spec
for ttl:
  This makes it possible for RSS sources to be managed by a 
file-sharing
  network such as Gnutella.

Caching information is about knowing when your client cache is stale,
regardless of how you got the feed.
Virtually everyone with IP connectivity can do HTTP, and HTTP has the 
Expires header. If this feature is important to you, why would you 
switch to a transfer protocol that doesn't have the feature? (I am not 
claiming anything about the actual Gnutella features.) To me, the "what 
if the feed is not over HTTP" argumentation seems theoretical 
over-generalization.
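
For readers who want to see what relying on the Expires header looks like on the client side, here is a minimal sketch using only the Python standard library. The function name and the choice to treat an unparseable header as already stale are assumptions made for this illustration, not something stated in the thread.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def is_stale(expires_header, now=None):
    """Return True if a cached feed should be refetched.

    expires_header is the raw value of the HTTP Expires header.
    """
    now = now or datetime.now(timezone.utc)
    try:
        expires = parsedate_to_datetime(expires_header)
    except (TypeError, ValueError):
        # Assumption: an invalid date (for example "0") means "already expired".
        return True
    if expires.tzinfo is None:
        # Treat a date without zone information as UTC.
        expires = expires.replace(tzinfo=timezone.utc)
    return now >= expires
```

A cache that received the feed by some route other than HTTP would have no such header to consult, which is the gap Walter's argument points at.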

--
Henri Sivonen
[EMAIL PROTECTED]
http://hsivonen.iki.fi/

Re: PaceAllowDuplicateIDs

2005-05-05 Thread Antone Roundy

On Thursday, May 5, 2005, at 09:15  AM, Eric Scheid wrote:
Have a look here:
http://en.wikipedia.org/w/index.php?title=Main_Page&action=history
There you have a reverse chrono list, each with an author, date, and
summary. Looks an awful lot like each one is an entry to me.
and looks to me like a stream of meta-data concerning the one entry.

and not distinct and separable entries like you'd find in your every 
day
blog.

Henry has the right idea -- the spec should allow both kinds, rather 
than
trying to shoe-horn everything into the one viewpoint of what is an 
entry.

+1 -- allow the publisher to decide which model fits their intent.

Re: PaceAllowDuplicateIDs

2005-05-05 Thread Dave Johnson

Immediately after sending this message, I had a rush of second 
thoughts.

My point #2 is not very well thought out. I think it applies for things 
like earthquake data, but when Atom feeds represent blog entries or 
articles (in an archive or an Atom Protocol feed) the ID represents 
the article not an event in the blog entry's life.  So, you can 
discount my second reason against the pace.

- Dave

On May 5, 2005, at 11:27 AM, David M Johnson wrote:
I'm -1 on PaceAllowDuplicateIDs
Reasons:
1) We're supposed to be standardizing current practice not inventing 
new things. Current best practice is to have unique IDs and current 
software (e.g. Javablogs.com) is predicated on this practice. I know, 
this practice is not followed widely enough, but that is another 
matter.

2) I think it is *much* more useful to think of an Atom Entry as an 
event that occurred at a specific time. Typically, an event is the 
publication of an article or blog entry on the web. For example:

   event: CNET published article
   subject: CNET
   object: article
But an entry could also represent other events.
   event: delivery van delivers package
   subject: delivery van
   object: package
   event: alarm system sends warning
   subject: alarm system
   object: warning
   event: server sends load warning
   subject: server
   object: load warning
If you think of Atom Entries as events, then it makes sense to 
consider the Atom Entry ID to be the ID of the event, not the ID of 
the subject or object of the event. Events are unique (you can't have 
more than one version of an event) and can be assigned GUIDs and 
therefore you cannot have more than one entry with the same ID.

In the case of earthquake data, each new data report is a new event.
   event: agency reports earthquake data
   subject: agency
   object: earthquake data
The ID is the ID of the "data reported" event not the ID of the 
earthquake.

We don't know what subjects and objects people are going to use in the 
future, so we can't specify Atom elements or IDs for subjects and 
objects -- that's what extensions are for. If you want to create a 
feed to syndicate information about earthquakes, then you introduce an 
extension for uniquely identifying earthquakes. The same goes for 
earthquakes.

- Dave

On May 5, 2005, at 12:02 AM, Tim Bray wrote:

http://www.intertwingly.net/wiki/pie/PaceAllowDuplicateIDs
This Pace was motivated by a talk I had with Bob Wyman today about 
the problems the synthofeed-generator community has.

Summary:
1. There are multiple plausible use-cases for feeds with duplicate IDs
2. Pro and Contra
3. Alternate Paces
4. Details about this Pace
1. Use-Cases
Here's a stream of stock-market quotes.
My Portfolio
 
 MSFT
  2005-05-03T10:00:00-05:00
  Bid: 25.20 Ask: 25.50 Last: 25.20
  
 MSFT
  2005-05-03T11:00:00-05:00
  Bid: 25.15 Ask: 25.25 Last: 25.20
  
 MSFT
  2005-05-03T12:00:00-05:00
  Bid: 25.10 Ask: 25.15 Last: 25.10
  

You could also imagine a stream of weather readings.  Bob's actual 
here-and-now today use-case from PubSub is earthquakes, an entry 
describes an earthquake and they keep re-issuing it as new info about 
strength/location comes in.

Some people only care about the most recent version of the entry, 
others might want to see all of them.  Basically, each atom:entry 
element describes the same Entry, only at a different point in time.

You could argue that in some cases, these are representations of the 
Web resources identified by the atom:id URI, but I don't think we 
need to say that explicitly.

Yes, you could think of alternate ways of representing stock quotes 
or any of the other use-cases but this is simple and direct and 
idiomatic.

2. Pro and Contra
Given that I issued the consensus call rejecting the last attempt to 
do this, which was  PaceRepeatIdInDocument, I felt nervous about 
revisiting the issue.  So I went and reviewed the discussion around 
that one, which I extracted and placed at 
http://www.tbray.org/tmp/RepeatID.txt for the WG's convenience.

Reviewing that discussion, I'm actually not impressed.  There were a 
few -1's but very few actual technical arguments about why this 
shouldn't be done.  The most common was "Software will screw this 
up".  On reflection, I don't believe that.  You have a bunch of 
Entries, some of them have the same ID and are distinguished by 
datestamp.  Some software will show the latest, some will show all of 
them, the good software will allow switching back and forth.  Doesn't 
seem like rocket science to me.

So here's how I see it: there are plausible use cases for doing this, 
and one of the leading really large-scale implementors in the space 
(PubSub) wants to do this right now.  Bob's been making strong claims 
about not being able to use Atom if this restriction remains in 
place.

I believe strongly that if there's something that implementors want 
to do, standards shouldn't get in the way unless there's real 
interoperability damage.  I'm certainly pre

Re: PaceRecommendPlainTextContent

2005-05-05 Thread Eric Scheid


-1 also.

> Abstract
>
> Text containers and content blocks specifically may contain rich-text, which
> must be down-stripped by more basic aggregators. Simply removing tags from
> X/HTML streams can however easily truncate meaning as well.

I am subscribed to multiple feeds which already down-strip their X/HTML
content for their feeds, and definitely agree that such down-stripping does
clobber a lot of meaning.

This pace, if the recommendation takes effect, would result in yet more
publishers down-stripping their rich content at the publishing end. Let a
thousand buggy implementations bloom :-(

I say leave the down-stripping in the aggregator - that way if it really is
awful the user can choose a different aggregator.


> The examples are of course construed. Most blogging software doesn't emit such
> exotic HTML features.

A far more common example (ie. less exotic html) is where  or
 get stripped. Reading the stripped-text version of that can be quite
confusing.

e.

Re: PaceAllowDuplicateIDs

2005-05-05 Thread Graham

On 5 May 2005, at 4:22 pm, Henry Story wrote:
If you don't want to keep a history of the entries all you need to  
do is drop all but the
latest entry with the same id. There is nothing more to it. Just  
show the user the last
one you came across.
But, if we follow Eric's model of how a wiki changelog should be  
defined, I'll be missing entries in the log, because several  
different entries have the same id. Ergo, the user interface and data  
model for the new type of feed this proposal permits is very different.

This proposal permits this, and it does not harm anyone else.
It harms everyone, by allowing a second, unrelated data model in Atom  
feeds. They may not be posting today, but I assure you, when other  
aggregator authors get the first user complaints about how Eric's  
wiki log displays incompletely in their program, they'll forgive Dave  
Winer everything.

Graham

Re: Atom feed refresh rates

2005-05-05 Thread Tim Bray


Warning: we are into the end-game.  What really counts is the set of 
outstanding Paces.  When Paul and I are going through the list to 
figure out consensus calls, emails that don't have a Pace in the 
Subject line are apt to get ignored.

So I'm not sure this endless thread entitled "feed refresh rates" is 
doing anyone any good unless it can coalesce around a Pace. -Tim

Re: PaceOriginalAttribute

2005-05-05 Thread Robert Sayre

On 5/5/05, Antone Roundy <[EMAIL PROTECTED]> wrote:
> 
> -1.
> 
> I don't see that this solves any problem.  

I suggest you reread it. Your analysis is deeply flawed.

> It may help people avoid
> accidentally generating invalid feeds (if we stick to not allowing
> duplication of atom:id within a feed), but it does it by simply
> shunting the issue off into a different element which doesn't have
> duplication constraints. 

Incorrect. Think harder about what PubSub services do. They take an
entry, and munge it (people like that). They move the feed data to
atom:source, and probably add their own extension elements to it. I
think they are "forwarding" a message. My proposal preserves the
identity of the original message, while requiring the service to mint
an identifier for its forwarded message.
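
A rough sketch of the forwarding step described above, for illustration only. The dictionary keys ("original_id", "source") and the use of a urn:uuid identifier are hypothetical; the thread does not show the syntax PaceOriginalAttribute actually proposes.

```python
import uuid


def forward_entry(entry, source_feed):
    """Build the entry an intermediary would republish.

    Hypothetical model: the forwarder mints its own id, keeps the
    original id alongside it, and records the originating feed.
    """
    return {
        "id": f"urn:uuid:{uuid.uuid4()}",  # newly minted by the forwarder
        "original_id": entry["id"],        # identity of the original message
        "source": {"id": source_feed["id"]},
        "content": entry["content"],
    }
```

Under this model two services forwarding the same entry produce different "id" values but the same "original_id".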

> It doesn't address the DOS problem--neither
> accidental nor intentional.  

Oh yes it does. Each entry's provenance is documented. The data format
accurately states that the intermediary has munged the original entry.

> And it doesn't make it any easier to
> determine whether or not entries in different feeds with the same
> atom:id are really the same entry or not.  In fact, it just complicates
> the task by requiring the inspection of two elements instead of one.

Incorrect. What it does is explicitly state that two different feeds
think they are forwarding the same entry.

Robert Sayre

Re: PaceAllowDuplicateIDs

2005-05-05 Thread Henry Story

Hi Dave,
 nice to see you participate here. I understand your points, and  
I myself thought the
way you did for a while.

[Oops, I see now that you have retracted your point. Oh well. I had  
already started writing
the following]

On 5 May 2005, at 17:27, David M Johnson wrote:
I'm -1 on PaceAllowDuplicateIDs
Please consider the following points before you vote.
Reasons:
1) We're supposed to be standardizing current practice not  
inventing new things. Current best practice is to have unique IDs  
and current software (e.g. Javablogs.com) is predicated on this  
practice. I know, this practice is not followed widely enough, but  
that is another matter.
Atom is standardizing current practice, but it is also adding some  
features. For example name
spaces and ids. The atom charter also requires us to allow archives

[[
  * a complete archive of all entries in a feed
 ]]
Graham himself thinks that archives are possible, since he supports  
the use of an
 head element.

2) I think it is *much* more useful to think of an Atom Entry as an  
event that occurred at a specific time. Typically, an event is the  
publication of an article or blog entry on the web. For example:

   event: CNET published article
   subject: CNET
   object: article
But an entry could also represent other events.
   event: delivery van delivers package
   subject: delivery van
   object: package
   event: alarm system sends warning
   subject: alarm system
   object: warning
   event: server sends load warning
   subject: server
   object: load warning
If you think of Atom Entries as events, then it makes sense to  
consider the Atom Entry ID to be the ID of the event, not the ID of  
the subject or object of the event.
You are right. There are two types of objects that we need to think  
about:
   A- the event/state of a resource at a particular time
   B- the thing that makes these different states the state of the  
same thing

Clearly we need (B) or else all the talk about an entry changing over  
time (atom:updated)
would not make sense.

So let us start off, as I did a long time ago, by thinking that the  
id of an entry
uniquely identifies the event/state of the entry. For every id there  
can be one and
only one "" representation. That id is that  
representation. It is, if you
wish, the name of a state of something else... and that would be?

I think it is clear that one of the roles of the id is to make it  
possible for an
entry to be moved from one web site to another, so that if your blog  
service provider
lets you down, you can still refer to the entry even when you have  
moved it to a
different "alternate" position. Graham has made such a point quite  
often. Entries, it
has often been said, can change, but the id remains the same. I think  
this is clearly
the consensus on this list. So the id URI is what identifies the  
different
"..." representations as being representations of the  
same thing.


Events are unique (you can't have more than one version of an  
event) and can be assigned GUIDs and therefore you cannot have more  
than one entry with the same ID.
yes. But I don't think that this is the consensus on this group. The  
good thing is that
you can achieve the same identification of a state through the  
combination of the id and the
modification time.
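
The idea of identifying a state by the pair (id, modification time) can be sketched as a composite key. This is an illustration only: entries are modelled as plain dictionaries with "id" and "updated" fields, and the function names are made up for the example.

```python
from datetime import datetime


def version_key(entry):
    """Key naming one state of an entry: its id plus its update time."""
    # Parsing the timestamp makes equal instants compare equal even
    # when they are written with different UTC offsets.
    return (entry["id"], datetime.fromisoformat(entry["updated"]))


def index_versions(entries):
    """Map each (id, updated) pair to the entry carrying it."""
    return {version_key(e): e for e in entries}
```

Two entries sharing an id but differing in update time land under different keys, so both states can be kept side by side.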

[here I noticed that you had changed your mind, anyway. I think I had  
exactly the same
thought as you did when I first started thinking about this. ]


In the case of earthquake data, each new data report is a new event.
   event: agency reports earthquake data
   subject: agency
   object: earthquake data
The ID is the ID of the "data reported" event not the ID of the  
earthquake.

We don't know what subjects and objects people are going to use in  
the future, so we can't specify Atom elements or IDs for subjects  
and objects -- that's what extensions are for. If you want to  
create a feed to syndicate information about earthquakes, then you  
introduce an extension for uniquely identifying earthquakes. The  
same goes for earthquakes.

- Dave

Re: Atom feed refresh rates

2005-05-05 Thread Dan Brickley

* Henri Sivonen <[EMAIL PROTECTED]> [2005-05-05 18:35+0300]
> 
> On May 5, 2005, at 16:24, Walter Underwood wrote:
> 
> >--On May 5, 2005 8:07:15 AM -0500 Mark Pilgrim <[EMAIL PROTECTED]> 
> >wrote:
> >>
> >>Not to be flippant, but we have one that's widely available.  It's
> >>called the Expires header.
> >
> >You need the information outside of HTTP. To quote from the RSS spec
> >for ttl:
> >
> >  This makes it possible for RSS sources to be managed by a 
> >file-sharing
> >  network such as Gnutella.
> >
> >Caching information is about knowing when your client cache is stale,
> >regardless of how you got the feed.
> 
> Virtually everyone with IP connectivity can do HTTP, and HTTP has the 
> Expires header. If this feature is important to you, why would you 
> switch to a transfer protocol that doesn't have the feature? (I am not 
> claiming anything about the actual Gnutella features.) To me, the "what 
> if the feed is not over HTTP" argumentation seems theoretical 
> over-generalization.

+1 

FWIW various P2P/filesharing protocols use HTTP, eg. Kazaa and others make 
use of HTTP's ability to request a byte range, handy if you're
requesting chunks of the same file from different servers. Those who
care to have HTTP header semantics show up in other environments can 
do various things (eg. reflect into an XML namespace). But it doesn't 
seem to me to be core business of the AtomPub WG to do this work...

[googles a bit] OK it looks like Gnutella also uses HTTP for the file 
download part of its protocol, fwiw. (including Range: header)
http://www9.limewire.com/developer/gnutella_protocol_0.4.pdf
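
As an illustration of the byte-range mechanism mentioned above, here is a small sketch that splits a file into chunks and builds one HTTP request per chunk. The function names and the chunking scheme are assumptions for the example; no request is actually sent.

```python
import urllib.request


def split_ranges(total_size, chunk_size):
    """Return inclusive (start, end) byte offsets covering total_size bytes."""
    return [
        (start, min(start + chunk_size, total_size) - 1)
        for start in range(0, total_size, chunk_size)
    ]


def range_request(url, start, end):
    """Build (but do not send) a request for bytes start..end of url."""
    req = urllib.request.Request(url)
    req.add_header("Range", f"bytes={start}-{end}")
    return req
```

Each chunk could then be requested from a different server holding the same file, which is the use Dan describes.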

Dan

Re: Atom feed refresh rates

2005-05-05 Thread Mark Pilgrim


On 5/5/05, Dan Brickley <[EMAIL PROTECTED]> wrote:
> [googles a bit] OK it looks like Gnutella also uses HTTP for the file
> download part of its protocol, fwiw. (including Range: header)
> http://www9.limewire.com/developer/gnutella_protocol_0.4.pdf

You mean RSS's <ttl> element is even more useless than I thought?  I
didn't think that was possible.

-- 
Cheers,
-Mark

Re: PaceAllowDuplicateIDs

2005-05-05 Thread Henry Story


On 5 May 2005, at 17:53, Graham wrote:
On 5 May 2005, at 4:22 pm, Henry Story wrote:
If you don't want to keep a history of the entries all you need to  
do is drop all but the
latest entry with the same id. There is nothing more to it. Just  
show the user the last
one you came across.

But, if we follow Eric's model of how a wiki changelog should be  
defined, I'll be missing entries in the log, because several  
different entries have the same id. Ergo, the user interface and  
data model for the new type of feed this proposal permits is very  
different.
If your tool (Is it Shrook2? [1]) only shows people the latest  
version available to you of
an entry, then by showing them only the latest version, Shrook2 will  
be giving the user what
he is expecting.

When your news reader currently reads feeds on the internet what does  
it do
with changed entries? Either it keeps the older version around, for  
the user to browse, or it
does not. If your users don't mind you throwing away the older  
versions of an entry, then
they won't mind you throwing away the older versions of the above  
entries either. There is no
difference in the behavior between allowing changed entries across  
feed documents and changed
entries inside a feed document. People who place two entries with the  
same id inside a feed
document should be aware that tools like yours will have the behavior  
they do, and that this is
ok.

Other people may be interested in looking at things historically.  
They will get a historical
viewer and be happy with it.

I think the current proposal is good exactly because it allows the  
wiki people to express
what they want to express correctly. Namely how their wiki entry is  
changing over time.

This proposal permits this, and it does not harm anyone else.
It harms everyone, by allowing a second, unrelated data model in  
Atom feeds. They may not be posting today, but I assure you, when  
other aggregator authors get the first user complaints about how  
Eric's wiki log displays incompletely in their program, they'll  
forgive Dave Winer everything.
Again, has anyone yet complained to you that you have not kept a  
historical and browse-able track
record of how the entries Shrook2  is looking at have changed over  
time? Clearly they could,
as you sometimes let them know that an entry that they already have  
read has been updated. They
could ask you what the changes were, no? How it changed, etc.

If your users don't care that much about the history of an entry,  
then you can dump all but the
latest entry. Or you could just keep the last two entries, so that  
you can show them a diff.
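
The behaviour described here, keeping only the latest version of each entry or the last two for a diff, can be sketched as follows. Entries are modelled as dictionaries with "id" and "updated" fields; that model and the function name are assumptions for the illustration.

```python
from collections import defaultdict
from datetime import datetime


def latest_versions(entries, keep=1):
    """Group entries by id and keep the `keep` most recent of each."""
    by_id = defaultdict(list)
    for entry in entries:
        by_id[entry["id"]].append(entry)
    result = {}
    for entry_id, versions in by_id.items():
        # Newest first, ordered by the parsed atom:updated timestamp.
        versions.sort(
            key=lambda e: datetime.fromisoformat(e["updated"]),
            reverse=True,
        )
        result[entry_id] = versions[:keep]
    return result
```

With keep=1 a reader shows only the newest state; with keep=2 it holds just enough to show what changed.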

Graham
HJStory
http://bblfish.net/blog/
[1] http://www.fondantfancies.com/apps/shrook/

Re: PaceAllowDuplicateIDs

2005-05-05 Thread Eric Scheid

On 6/5/05 1:53 AM, "Graham" <[EMAIL PROTECTED]> wrote:

>> This proposal permits this, and it does not harm anyone else.
> 
> It harms everyone, by allowing a second, unrelated data model in Atom
> feeds. They may not be posting today, but I assure you, when other
> aggregator authors get the first user complaints about how Eric's
> wiki log displays incompletely in their program, they'll forgive Dave
> Winer everything.

Many wikis offer options in displaying their change log with either most
recent changes only, or all changes. Both models are commonly supported
because some people want to see notifications of all changes, while others
just want to see the most recent change. That is part of wiki culture, all
the way back to ward's wiki.

It wouldn't be surprising to find the same options made available for wiki
logs in rss. Hey, here's one right now

http://www.intertwingly.net/wiki/pie/RecentChanges?action=rss_rc

Apparently, if you add an "unique=1" URL parameter you get a list of changes
where page names are unique, i.e. where only the latest change of each page
is reflected.

e.

Re: PaceOriginalAttribute

2005-05-05 Thread Antone Roundy

On Thursday, May 5, 2005, at 10:14  AM, Robert Sayre wrote:
On 5/5/05, Antone Roundy <[EMAIL PROTECTED]> wrote:
It may help people avoid
accidentally generating invalid feeds (if we stick to not allowing
duplication of atom:id within a feed), but it does it by simply
shunting the issue off into a different element which doesn't have
duplication constraints.
Incorrect. Think harder about what PubSub services do. They take an
entry, and munge it (people like that). They move the feed data to
atom:source, and probably add their own extension elements to it. I
think they are "forwarding" a message. My proposal preserves the
identity of the original message, while requiring the service to mint
an identifier for its forwarded message.
Okay, the forwarded message has its own all-in-one-element identifier. 
 This sounds useful...except that someone may accidentally or 
intentionally duplicate that identifier too.

It doesn't address the DOS problem--neither
accidental nor intentional.
Oh yes it does. Each entry's provenance is documented. The data format
accurately states that the intermediary has munged the original entry.
Maybe I am missing something, but if so...well, by definition, I'm not 
seeing it.  Let's look at an example.  Say your aggregator sees these 
in different feeds:


foo:foo

foo:bar
I'm the real thing



bar:bar

bar:foo

foo:foo
foo:bar

I may be an imposter


Do you display one or both?  How would your decision making process 
differ from if you were to see the following in the second case?


bar:bar

foo:bar

foo:foo

I may be an imposter


And what if we added a third case?

qwerty:bar

bar:foo

foo:foo
foo:bar

I'm definitely an imposter


And it doesn't make it any easier to
determine whether or not entries in different feeds with the same
atom:id are really the same entry or not.  In fact, it just 
complicates
the task by requiring the inspection of two elements instead of one.
Incorrect. What it does is explicitly state that two different feeds
think they are forwarding the same entry.
Yeah, they think they are, or at least claim to think so.  But isn't 
that the same thing that is stated if you see the following in two 
feeds?


bar:bar

foo:bar

foo:foo

I may be an imposter


This says that this feed is (or at least claims it is) forwarding the 
entry with the id "foo:bar" from the feed "foo:foo".

I am honestly trying to see more in this, but as yet, I don't.

Re: PaceTextShouldBeProvided

2005-05-05 Thread Sam Ruby

Robert Sayre wrote:
On 5/5/05, Antone Roundy <[EMAIL PROTECTED]> wrote:
+1 except for one thing: In section 4.1.2, I'd suggest something along
these lines:
atom:entry elements which do not contain an atom:content element, or
whose atom:content element's type attribute indicates a MIME media
type, SHOULD contain an atom:summary element.
This Pace is incompatible with PaceOptionalSummary and incomplete. -1.
Something a little less curt would be appreciated.
The stated abstract of PaceOptionalSummary (i.e., "removing the 
requirement for ") is met.  In your mind, this equates to 
completely optional.  That has yet to be conclusively established.

What concerns me more, however, are the interoperability issues that 
PaceOptionalSummary not only creates, but that were also uncovered during 
its discussion.

Unless there is some plan for addressing these interoperability issues 
(and by that, I mean something more constructive than "That's fine, but 
we're not here to tailor the format to your app."), then perhaps BOTH 
paces are incomplete.

There are a number of ways to finesse the identification of the issue 
into the spec.  For example, take a look at how Tim worded 
PaceAllowDuplicateIDs.  Producers are effectively put on notice 
that if they include multiple entries with the same ID, that some or all 
of them may be ignored.

How should we convey a similar sentiment about the reality that entries 
without a readily available textual representation may suffer the same fate?

- Sam Ruby

Re: PaceAllowDuplicateIDs

2005-05-05 Thread Antone Roundy

On Thursday, May 5, 2005, at 08:44  AM, Antone Roundy wrote:
If we accept this Pace, are we going to do anything to address the DOS 
issue for aggregated feeds?
Bob, if I may direct a few questions to you, since you have the most 
experience with this issue: if PaceAllowDuplicateIDs is adopted, how 
would you anticipate that PubSub would go about handling entries with 
the same atom:id coming from different feeds?  What if each appears to 
be claiming to be the original feed for the entry?  What if both are 
getting aggregated into the same feed, but your system doesn't think 
they're really the same entry?

I'm in favor of the Pace, as far as it goes, but was surprised to see 
that it doesn't talk about these issues, given that it was motivated by 
a conversation with you.

Re: PaceTextShouldBeProvided

2005-05-05 Thread Robert Sayre

On 5/5/05, Sam Ruby <[EMAIL PROTECTED]> wrote:
>
> > This Pace is incompatible with PaceOptionalSummary and incomplete. -1.
> 
> Something a little less curt would be appreciated.
> 
> The stated abstract of PaceOptionalSummary (i.e., "removing the
> requirement for ") is met.  In your mind, this equates to
> completely optional.  That has yet to be conclusively established.

Well, consider the name of the Pace, and then consider this sentence:

5. MAY   This word, or the adjective "OPTIONAL", mean that an item is
truly optional.

It would be deeply bogus to accept a Pace whose sole action was to
remove a normative requirement, and simultaneously accept a Pace that
puts it back in. Seems obvious to me.

> What concerns me more, however, are the interoperability issues that
> PaceOptionalSummary not only creates, but that were also uncovered during
> its discussion.

We know exactly what issues optional content has, because all of the
other formats have it.

> Unless there is some plan for addressing these interoperability issues
> (and by that, I mean something more constructive than "That's fine, but
> we're not here to tailor the format to your app."), then perhaps BOTH
> paces are incomplete.

Let's enumerate the issues, rather than insist they exist. Frankly, I
seriously doubt that anyone with customers will outright reject a
title-only feed.

> There are a number of ways to finesse the identification of the issue
> into the spec.  For example, take a look at how Tim worded
> PaceAllowDuplicateIDs.  Producers are effectively put on notice
> that if they include multiple entries with the same ID, that some or all
> of them may be ignored.

I don't think PaceAllowDuplicateIDs successfully finessed the issue.

> How should we convey a similar sentiment about the reality that entries
> without a readily available textual representation may suffer the same fate?

So, we're looking for some way to say "provide as much information as
you can." The problem with saying SHOULD is that we purport to know
how much information the publisher can provide. It would be very easy
to explain this issue in the spec, and I have no objection to doing
so. Why do we need the RFC2119 words?

Robert Sayre

Re: PaceAllowDuplicateIDs

2005-05-05 Thread Graham

On 5 May 2005, at 5:38 pm, Eric Scheid wrote:
Many wikis offer options in displaying their change log with  
either most
recent changes only, or all changes. Both models are commonly  
supported
because some people want to see notifications of all changes, while  
others
just want to see the most recent change. That is part of wiki  
culture, all
the way back to ward's wiki.
OK that makes sense. I still think it's the wrong way to model a  
change log as a feed.

My other two criticisms still stand:
"atom:updated is used by the publisher to show what they consider a  
significant change. The user, on the other hand, probably wants to  
see the latest version, reliably, even if the publisher disagrees  
that the change was significant. This is the core problem with Tim's  
proposal. There is no way to create an aggregator that works in the  
way the user expects."

"Finally, at pubsub, what happens when they download an entry from  
one feed, then the user edits it, but doesn't modify atom:updated,  
then they download the new entry from a second feed associated with  
the site? Different content, identical atom:ids, identical  
atom:updated => Invalid feed. They're not in any better position than  
they were before. This doesn't even solve the problem it's meant to."

Basically, atom:updated doesn't properly differentiate versions, and  
the way atom:updated is being used by the proposal doesn't gel with  
the actual spec of the element.

Graham

Re: PaceAllowDuplicateIDs

2005-05-05 Thread Henry Story


On 5 May 2005, at 19:23, Graham wrote:
> On 5 May 2005, at 5:38 pm, Eric Scheid wrote:
>> Many wikis offer options in displaying their change log with either
>> most recent changes only, or all changes. Both models are commonly
>> supported because some people want to see notifications of all changes,
>> while others just want to see the most recent change. That is part of
>> wiki culture, all the way back to Ward's wiki.
>
> OK that makes sense. I still think it's the wrong way to model a
> change log as a feed.

Cool. Rational argument does sometimes work :-)

> My other two criticisms still stand:

What was wrong with the arguments I gave?

> "atom:updated is used by the publisher to show what they consider a
> significant change. The user, on the other hand, probably wants to
> see the latest version, reliably, even if the publisher disagrees
> that the change was significant. This is the core problem with
> Tim's proposal. There is no way to create an aggregator that works
> in the way the user expects."

I think this is relatively simple. If the publisher publishes a feed
document with two entries containing the same id and the same
atom:updated timestamp then he is claiming that there are no
significant changes between them. In fact the same is true if he
publishes those two entries in two different feed documents. So there
really is no problem here that is particular to PaceAllowDuplicateIDs.

If the feed reader wants reliability, he can't get more of it by
making the spec tighter than the feed producer is able to deliver. If
the feed producer is unreliable, then nothing in the spec can change
that.

The feed producer must understand that creating feeds where the
situation described above holds will lead to erratic behavior. If there
is a significant difference between those entries then he had better
change the time stamp. Otherwise his unreliable behavior will create
unreliable effects. Nothing new here.

> "Finally, at pubsub, what happens when they download an entry from
> one feed, then the user edits it, but doesn't modify atom:updated,
> then they download the new entry from a second feed associated with
> the site? Different content, identical atom:ids, identical
> atom:updated => Invalid feed. They're not in any better position
> than they were before. This doesn't even solve the problem it's
> meant to."

I imagine this is simple, but I will leave the full details for Bob
to answer. My guess is that if Bob comes across the above situation,
he will have, according to the publisher, two entries that are not
significantly different. He can therefore choose between them randomly,
and just add one of them. The publisher, after all, thinks that there
are no significant differences between them. If the initial publisher
thought that there were significant differences between these entries,
then he should have given them different time stamps.
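
A minimal sketch of the aggregator behavior described above, assuming entries have already been parsed into a hypothetical `Entry` record (an illustration only, not a type from any Atom library). It keeps one entry per atom:id, prefers the newest atom:updated, and treats entries with the same id and the same timestamp as interchangeable, since that is what the publisher has claimed about them.

```python
from datetime import datetime
from typing import NamedTuple


class Entry(NamedTuple):
    """Hypothetical in-memory form of an atom:entry (illustrative only)."""
    id: str
    updated: datetime
    content: str


def latest_versions(entries):
    """Keep one entry per atom:id, preferring the newest atom:updated.

    When two entries share both id and timestamp, the publisher has
    declared them not significantly different, so the first one seen wins.
    """
    best = {}
    for entry in entries:
        current = best.get(entry.id)
        if current is None or entry.updated > current.updated:
            best[entry.id] = entry
    return best
```

A reader that wants every version instead of only the latest would skip this step and display the entries as received; the choice stays with the consumer.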


> Basically, atom:updated doesn't properly differentiate versions,
> and the way atom:updated is being used by the proposal doesn't gel
> with the actual spec of the element.

I think the above argument shows that there is in fact nothing wrong
with atom:updated. We just have to think in terms of interoperability
and not in terms of Platonic forms.

Henry Story
http://bblfish.net/blog/

Re: Atom feed refresh rates

2005-05-05 Thread John Panzer

I assume an HTTP Expires header for Atom content will work and play well 
with caches such as the Google Accelerator 
(http://webaccelerator.google.com/).  I'd also guess that a syntax-level 
tag won't.  Is this important? 

The HTML solution for people who could not implement Expires: seems to 
be META tags with in theory equivalent information.  Though in practice 
the whole thing is a mess, this seems like a conceptually simple 
workaround.  Is there something obviously wrong with it?

-John

Re: PaceTextShouldBeProvided

2005-05-05 Thread Robert Sayre


On 5/5/05, Graham <[EMAIL PROTECTED]> wrote:
> On 5 May 2005, at 6:23 pm, Robert Sayre wrote:
> 
> > It would be deeply bogus to accept a Pace whose sole action was to
> > remove a normative requirement, and simultaneously accept a Pace that
> > puts it back in. Seems obvious to me.
> 
> Not really. Assuming PaceOptionalSummary is accepted, there are two
> completely valid outcomes:
> 
> 1. PaceTextShouldBeProvided rejected => summaries are not required,
> and textual content is not encouraged
> 2. PaceTextShouldBeProvided accepted => summaries are not required,
> but textual content is encouraged
> 
> I don't see a conflict there. What's wrong with accepting two similar
> paces because one corrects the flaws in the other?

Graham, that's just not true. It wasn't called
PaceSummariesAreNotRequired, was it? It materially changes the only
action PaceOptionalSummary takes. They are not compatible.

In fact, let's get the chairs to clarify this.

> > So, we're looking for some way to say "provide as much information as
> > you can." The problem with saying SHOULD is that we purport to know
> > how much information the publisher can provide. It would be very easy
> > to explain this issue in the spec, and I have no objection to doing
> > so.
> 
> SHOULD here means "must unless you absolutely can't". 

 We've covered lots of perfectly valid reasons not to include a
summary, and we've heard from implementors that actually prefer its
absence.

Robert Sayre

Re: PaceAllowDuplicateIDs

2005-05-05 Thread John Panzer

Graham wrote:
> On 5 May 2005, at 5:38 pm, Eric Scheid wrote:
>> Many wikis offer options in displaying their change log with either
>> most recent changes only, or all changes. Both models are commonly
>> supported because some people want to see notifications of all changes,
>> while others just want to see the most recent change. That is part of
>> wiki culture, all the way back to Ward's wiki.
>
> OK that makes sense. I still think it's the wrong way to model a
> change log as a feed.
>
> My other two criticisms still stand:
> "atom:updated is used by the publisher to show what they consider a
> significant change. The user, on the other hand, probably wants to
> see the latest version, reliably, even if the publisher disagrees
> that the change was significant. This is the core problem with Tim's
> proposal. There is no way to create an aggregator that works in the
> way the user expects."

Just a thought:  On the other hand, perhaps this is an opportunity to 
operationally define "significant change":  A change which results in a 
new version being exposed on one's feed.  If you think your users would 
care about seeing the change, then change the atom:updated field and 
'republish' by adding to the feed.  If not, just change your content and 
don't republish.

Examples of this might include:  Fixing irrelevant typos.  Changing 
character set encodings.  Changing formatting to match a new style guide.

-John

Re: Atom feed refresh rates

2005-05-05 Thread Mark Pilgrim

On 5/5/05, John Panzer <[EMAIL PROTECTED]> wrote:
> I assume an HTTP Expires header for Atom content will work and play well
> with caches such as the Google Accelerator
> (http://webaccelerator.google.com/).  I'd also guess that a syntax-level
> tag won't.  Is this important?

Yes, and yes.  This is exactly the sort of software that we're talking
about when we say that HTTP's native caching mechanism is widely
supported.  All the proxies in the world (which is what Google's Web
Accelerator is, except it runs on your own machine and listens on port
9100) are able to reduce network traffic and therefore make the end
user's experience faster because they understand and respect the HTTP
caching mechanism.  (Google Web Accelerator does other things too,
like proxying requests through Google's servers.  And what are those
servers running?  Another caching HTTP proxy.)  Many ISPs do this at
the ISP level, both to reduce their own upstream bandwidth costs and
to make their end users happier.  Many corporations do this as well (I
would bet good money that IBM does it).  At one time, I even had Squid
installed on my home network to do this. 

HTTP caching works.
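
As a rough illustration of what a cache does with the Expires header, here is a minimal sketch. The function name `is_fresh` and the plain-dict header representation are assumptions made for this example; a real cache would also honor Cache-Control, validators, and the other rules in the HTTP spec.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def is_fresh(headers, now=None):
    """Return True if a cached response may be reused without refetching.

    Only the Expires header is consulted here; this is a simplification.
    """
    expires = headers.get("Expires")
    if expires is None:
        return False
    try:
        expiry = parsedate_to_datetime(expires)
    except (TypeError, ValueError):
        return False  # an unparseable date is treated as already expired
    if expiry.tzinfo is None:
        # Dates without an explicit zone are assumed to be UTC.
        expiry = expiry.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return now < expiry
```

A feed reader sitting behind such a cache gets this behavior for free, which is the argument for putting refresh hints in HTTP rather than in the feed syntax.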

> The HTML solution for people who could not implement Expires: seems to
> be META tags with in theory equivalent information.  Though in practice
> the whole thing is a mess, this seems like a conceptually simple
> workaround.  Is there something obviously wrong with it?

Other than being a God-awful mess?  No, there's nothing wrong with it. ;)

-- 
Cheers,
-Mark

Re: Autodiscovery

2005-05-05 Thread Sjoerd Visscher

Why not support hyperlinks too?
So besides:


also:
Main Atom feed

Most webpages already have a hyperlink to the feed, so they'd only need 
to add two attributes. It would be a waste to have to duplicate the 
information in the document head.

--
Sjoerd Visscher
http://w3future.com/weblog/

Re: Autodiscovery

2005-05-05 Thread Nikolas 'Atrus' Coukouma


Sjoerd Visscher wrote:

>
> Why not support hyperlinks too?
>
> So besides:
>
> 
>
> also:
>
>  href="/xml/index.atom">Main Atom feed
>
> Most webpages already have a hyperlink to the feed, so they'd only
> need to add two attributes. It would be a waste to have to duplicate
> the information in the document head.
>
The intent of the head element is to indicate a feed that serves as a
substitute for the page you're currently viewing.

This other case is locating all feeds linked to from a page. For that,
the type attribute should be sufficient to indicate that you're linking
to a feed.

-Nikolas 'Atrus' Coukouma

RE: Autodiscovery, real-world examples

2005-05-05 Thread Bob Wyman


fantasai wrote:
> Making it possible for pages to link to non-alternate 
> autodiscoverable feeds without using rel="alternate" -- and 
> encouraging this practice -- would make it possible for UAs to 
> actually /discriminate/ between alternate and non-alternate feeds.
> Right now they can't, because everything is indiscriminately marked 
> as "alternate".
+1. Being able to distinguish between "alternates" for the current
page and "just other feeds" that are linked to from the page would be very
useful. Also, in the case where there are multiple real alternates to the
page, it would be useful to be able to mark which feed is "preferred." My
concern here is the transition between Atom V0.3 and Atom V1.0. A page might
link to feeds in both formats (as well as RSS, RDF, etc.) but it would be
good to know which of these feeds is considered the "preferred" feed by the
producer. In this way, people could migrate off the older feeds and one day
we'd actually be able to stop producing multiple feeds on each site.
We should also consider providing such "preferred" links in Atom,
RSS, RDF, etc. feeds. I'd like to be able to publish something in my Atom
0.3 feeds that tells people "Don't keep reading this feed. Read the Atom 1.0
feed instead..."

bob wyman

Re: Autodiscovery

2005-05-05 Thread Sjoerd Visscher

Nikolas 'Atrus' Coukouma wrote:
> Sjoerd Visscher wrote:
>
>> Why not support hyperlinks too?
>> So besides:
>>
>> also:
>> Main Atom feed
>>
>> Most webpages already have a hyperlink to the feed, so they'd only
>> need to add two attributes. It would be a waste to have to duplicate
>> the information in the document head.
>
> The intent of the head element is to indicate a feed that serves as a
> substitute for the page you're currently viewing.
>
> This other case is locating all feeds linked to from a page. For that,
> the type attribute should be sufficient to indicate that you're linking
> to a feed.

No, a hyperlink with a rel attribute means the same as a link element. 
The HTML spec describes the rel attribute on the a element thus:

This attribute describes the relationship from the current document to 
the anchor specified by the href attribute. The value of this attribute 
is a space-separated list of link types.

--
Sjoerd Visscher
http://w3future.com/weblog/

Re: Autodiscovery - different cases should use different rel

2005-05-05 Thread Nikolas 'Atrus' Coukouma

fantasai wrote:

>
> Nikolas 'Atrus' Coukouma wrote:
>
> > I think you have three separate cases of autodiscovery:
> > * the feed for *this* page - handled by this autodiscovery proposal
> > * other feeds the author reads or recommends - usually done by linking
> > to a separate file. Some quick searching reveals one suggestion to use
> > rel="blogroll" for this
> > * any other feeds linked to for any reason at all - seems to be little
> > interest in
> >
> > I don't think combining these three into one case will do any good. In
> > fact, I think it's confusing and unusable.
>
> That makes sense.
>
> I think that you're missing one key use case, though: autodiscovery of
> a blog's main feed from sub-parts of it. A lot of websites link to the
> main blog feed from individual entries, for example, and they're doing
> it with rel="alternate", which is not appropriate. It frustrates me that
> there is no way of changing these links to not use rel="alternate".

An excellent point. Perhaps these should use rel="home" :)

>
> And for linking to other pages.. Here's a real-world example:
> The mozilla.org main page  is an example
> of where rel="alternate" is a problem. There were three feeds on
> it: "Announcements", "mozillaZine News", and "Mozilla Weblogs"
> (now only two). Each one is an alternate of a web page, but of
> _other_ pages (http://www.mozilla.org/news.html,
> http://www.mozillazine.org/, and http://planet.mozilla.org/
> respectively), not the mozilla.org
> front page. The last few headlines for each feed are listed on
> the front page, and the designer felt it was appropriate for
> autodiscovery to work on this page -- but it is not appropriate
> for rel="alternate" to be used for those autodiscovery links.
> They are not alternate representations of the front page.

These other feeds are suggestion/blogroll cases.

>
> Here's another example:
> LiveJournal creates a "Friends" page, where it aggregates the
> blogs of all the users you've designated as "friends". It could
> create an Atom feed representing this aggregation, and mark that
> as rel="alternate". 

Actually, a patch was just committed to do this ;)

> What could also be useful, however, would be
> linking to each of these blogs' feeds individually as well so
> that they're represented individually in my aggregator and it
> can aggregate them itself. Unlike the pre-aggregated feed,
> however, these are not alternate representations of the Friends
> page, and shouldn't be marked as such.

I think this is a suggestion/blogroll case.

>
> Making it possible for pages to link to non-alternate autodiscoverable
> feeds without using rel="alternate" -- and encouraging this practice --
> would make it possible for UAs to actually /discriminate/ between
> alternate and non-alternate feeds. Right now they can't, because
> everything is indiscriminately marked as "alternate".

>
> ~fantasai

I've basically concluded that the keys to autodiscovery of feeds, in the
general sense, should not be three (rel, type, and href), but two (type
and href). Type is plenty of specification that it's a feed. Claiming
its relationship as "feed" doesn't seem correct. There are a few
mime-types used, and the one for atom (application/atom+xml) will be an
official standard as soon as the draft is accepted by the IETF.

The value of rel, if present, will vary based on the relation:
* the feed for *this* page - rel="alternate"
* the main feed for this blog, in general - rel="home"
* other feeds the author reads or recommends - rel="suggested"
* any other feeds linked to for any reason at all - no rel, just the
type and href
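
The scheme in the list above could be sketched roughly as follows. This is only an illustration: `classify_feed_link` is a hypothetical helper, attributes are assumed to arrive as a plain dict, and "home" and "suggested" are the tentative values floated in this message, not link types defined by the HTML spec.

```python
FEED_TYPES = {"application/atom+xml"}


def classify_feed_link(attrs):
    """Classify a link or anchor by its rel value, given its attributes.

    Returns None when the element does not point at a feed at all, since
    type plus href is what identifies a feed in this scheme.
    """
    if attrs.get("type") not in FEED_TYPES or not attrs.get("href"):
        return None
    # rel is a space-separated list of link types, compared case-insensitively.
    rels = set((attrs.get("rel") or "").lower().split())
    if "alternate" in rels:
        return "feed for this page"
    if "home" in rels:
        return "main feed for this blog"
    if "suggested" in rels:
        return "recommended feed"
    return "other feed"
```

Note that rel is never needed to decide whether something is a feed; it only refines how the feed relates to the page.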

Is this acceptable? I'm not completely happy with "home" and "suggested"
because they're not specified as link types in the HTML specs [1].
Sadly, it seems the HTML authors didn't consider these cases. "home"
seems to be an informal standard. Close matches in the HTML list are
"index", "contents", and "start". All of these are inaccurate, but I
think "contents" is the best fit.

"suggested" is just my own idea. I mentioned the rel="blogroll" before,
but that seems overly specific. "bookmark" seems to be the closest match
in the HTML list. Not in the way it's defined in the list, but the way
people usually think of it. I'm not sure what the heck the HTML spec is
indicating with:
"Refers to a bookmark. A bookmark is a link to a key entry point within
an extended document. The title attribute may be used, for example, to
label the bookmark. Note that several bookmarks may be defined in each
document."
That definition makes it a close match to "home," I suppose. Really, the
definition there is so vague that it's useless.

I can think of a couple other cases:
- Comment feeds, which are only generated by a few pieces of software so
far. These are close to, but not quite, alternate. They're usually
missing the entry itself, from what I understand. I think more work
needs to be done with comment feeds in general before we

Re: Autodiscovery

2005-05-05 Thread Nikolas 'Atrus' Coukouma

Sjoerd Visscher wrote:

> Nikolas 'Atrus' Coukouma wrote:
>
>> Sjoerd Visscher wrote:
>>
>>> Why not support hyperlinks too?
>>>
>>> So besides:
>>>
>>> 
>>>
>>> also:
>>>
>>> >> href="/xml/index.atom">Main Atom feed
>>>
>>> Most webpages already have a hyperlink to the feed, so they'd only
>>> need to add two attributes. It would be a waste to have to duplicate
>>> the information in the document head.
>>>
>>
>> The intent of the head element is to indicate a feed that serves as a
>> substitute for the page you're currently viewing.
>>
>> This other case is locating all feeds linked to from a page. For that,
>> the type attribute should be sufficient to indicate that you're linking
>> to a feed.
>
>
> No, a hyperlink with a rel attribute means the same as a link element.
> The HTML spec describes the rel attribute on the a element thus:
>
> This attribute describes the relationship from the current document to
> the anchor specified by the href attribute. The value of this
> attribute is a space-separated list of link types.

What I was getting at is that link elements in the head are usually a
kind of metadata intended for the user agent. Hyperlinks are usually
meant to be displayed. This proposal is aimed at providing metadata for
the user agent, so it makes sense to put it in a link element in the head.

I'm on the fence about whether or not a link element should be
*required*, even when a hyperlink is present in the body.

Supporting general hyperlinks starts making more sense if we have cases
other than alternate (I've written elsewhere about this) because the
amount of duplicated information is much greater. If you're only
supporting feeds that serve as an alternate form of the content, then it
makes sense to repeat one link once just to make the programmer stuck
writing the user agent. I'd hope that whatever library/toolkit they're
using supports XPath queries. Using them makes it easy to pluck out
anything with type="application/atom+xml" and an href property.
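
A minimal sketch of that kind of query, assuming the page is well-formed XHTML so that Python's standard-library ElementTree can parse it (tag-soup HTML would need a more forgiving parser). The function name `find_atom_links` is made up for this example.

```python
import xml.etree.ElementTree as ET

ATOM_TYPE = "application/atom+xml"


def find_atom_links(xhtml_text):
    """Return (rel, href) pairs for every element that points at an Atom feed.

    Any element counts, whether it is a link in the head or an anchor in the
    body, as long as it carries the Atom media type and an href.
    """
    root = ET.fromstring(xhtml_text)
    # ElementTree's limited XPath supports attribute-value predicates.
    matches = root.findall(f".//*[@type='{ATOM_TYPE}']")
    return [(el.get("rel"), el.get("href")) for el in matches if el.get("href")]
```

Because the query ignores the element name, the same code picks up head links and body hyperlinks alike.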

It's worth noting that in recent versions of XHTML, anything with an
href property is a hyperlink. There's no requirement to use an anchor or
an xlink:link element.

-Nikolas 'Atrus' Coukouma

Re: Autodiscovery

2005-05-05 Thread Nikolas 'Atrus' Coukouma


Nikolas 'Atrus' Coukouma wrote:

>I'm on the fence about whether or not a link element should be
>*required*, even when a hyperlink is present in the body.
>
>Supporting general hyperlinks starts making more sense if we have cases
>other than alternate (I've written elsewhere about this) because the
>amount of duplicated information is much greater. If you're only
>supporting feeds that serve as an alternate form of the content, then it
>makes sense to repeat one link once just to make the programmer stuck
>writing the user agent.
>  
>
correction:
"just to make [things easier for] the programmer stuck writing the user
agent."

-Nikolas 'Atrus' Coukouma

Re: status paces

2005-05-05 Thread Tim Bray


On May 5, 2005, at 6:13 AM, Robert Sayre wrote:
Throw them all back. No reason we can't do this in an extension
element. It would be big mistake to put these in right now.
+1.  -Tim

RE: Autodiscovery, real-world examples

2005-05-05 Thread James Tauber


On Thu, 5 May 2005 16:35:21 -0400, "Bob Wyman" <[EMAIL PROTECTED]> said:
> Being able to distinguish between "alternates" for the current
> page and "just other feeds" that are linked to from the page would be
> very useful.

+1

> Also, in the case where there are multiple real alternates to the
> page, it would be useful to be able to mark which feed is "preferred."

+0.5

James

Re: Autodiscovery

2005-05-05 Thread Sjoerd Visscher


> Supporting general hyperlinks starts making more sense if we have cases
> other than alternate (I've written elsewhere about this) because the
> amount of duplicated information is much greater. If you're only
> supporting feeds that serve as an alternate form of the content, then it
> makes sense to repeat one link once just to make the programmer stuck
> writing the user agent. I'd hope that whatever library/toolkit they're
> using supports XPath queries. Using them makes it easy to pluck out
> anything with type="application/atom+xml" and an href property.

Maybe atom needs only one link with a rel attribute, but there are 
others. I have a lot of hyperlinks with rel attributes on my weblog 
homepage, and I refuse to repeat them all as link elements.

--
Sjoerd Visscher
http://w3future.com/weblog/

Re: PaceTextShouldBeProvided vs PaceOptionalSummary

2005-05-05 Thread Tim Bray

On May 5, 2005, at 11:02 AM, Robert Sayre wrote:
>> I don't see a conflict there. What's wrong with accepting two similar
>> paces because one corrects the flaws in the other?
>
> Graham, that's just not true. It wasn't called
> PaceSummariesAreNotRequired, was it? It materially changes the only
> action PaceOptionalSummary takes. They are not compatible.
>
> In fact, let's get the chairs to clarify this.

Speaking not as the chair but as an interested WG member,  I read them 
about eight times and I do not understand why they are in conflict.  
Someone please explain, as simply as possible, what the problem is, 
because I just don't get it.  On the face of it, I am inclined to be +1 
to both PaceOptionalSummary and PaceTextShouldBeProvided.

Note: I totally fail to understand the "Notes" bit at the end of 
PaceTextShouldBeProvided.  It is underspecified to the extent that I 
can't figure out what language change it is actually saying is 
necessary.

Basically, allowing title-only feeds seems OK to me, and encouraging 
people to provide text also seems OK to me, so what's the problem?  In 
fact, these seem pretty orthogonal; it seems quite plausible that one 
could like either of these without liking both.  -Tim

PaceBriefExample

2005-05-05 Thread Tim Bray

+1.  Precision is good. -Tim

PacePubStatusResource

2005-05-05 Thread Tim Bray

Mu
This is interesting but orthogonal to finalizing the format spec. -Tim

PaceOptionalFeedLink

2005-05-05 Thread Tim Bray

+1
There are people who want to publish feeds without rel="alternate" 
links.  I'm against telling people they can't do something they want to 
do without strong reasons, as in loss of interoperability.  I don't see 
the reasons here as strong enough.  -Tim

PaceCoConstraintsAreBad

2005-05-05 Thread Tim Bray

Uh, this one is redundant, right?  It's covered by various combinations 
of other Paces, or am I missing something? -Tim

PacePubControl

2005-05-05 Thread Tim Bray

+1 kind of, there was all sorts of discussion over in Atom-Protocol 
around improvements and extensibility of the various fields; so I don't 
think work is done.  In any case, this is the right place for this 
stuff, so its discussion is orthogonal to consideration of the Atompub 
Format draft. -Tim

PaceDuplicateIDWithSource

2005-05-05 Thread Tim Bray

-1
Either we're OK with duplicates or we're not.  If we're not, we 
shouldn't relax that condition just because they come from different 
feeds.  The definition of atom:id says it's supposed to be 
*universally* unique, not unique per-source.  -Tim

PaceFeedIdOrSelf

2005-05-05 Thread Tim Bray

-1
Too complicated, not enough benefit.  -Tim

PaceOriginalAttribute

2005-05-05 Thread Tim Bray

+1
I'm not 100% convinced it solves the problems Rob says it does, but it 
seems cheap, lightweight, and unlikely to cause any harm. -Tim

PaceDuplicateIDWithSource2

2005-05-05 Thread Tim Bray

-1
Either we're OK with duplicates or we're not.  If we're not, we 
shouldn't relax that condition just because they come from different 
feeds.  The definition of atom:id says it's supposed to be 
*universally* unique, not unique per-source.  -Tim

PaceRecommendPlainTextContent

2005-05-05 Thread Tim Bray

-1, ill-formed.
The sentiment is worthy but this does not suggest specific language for 
the draft. -Tim

PaceEntryState

2005-05-05 Thread Tim Bray

-1
I want to decouple this stuff from the format spec, i.e. I believe in 
PacePubControl.  -Tim

PaceSourceRecs

2005-05-05 Thread Tim Bray

-1
Irrespective of whether I agree with this or not, I think the material 
belongs in an implementor's guide, not the specification.  -Tim

Re: PaceTextShouldBeProvided vs PaceOptionalSummary

2005-05-05 Thread Tim Bray

On May 5, 2005, at 3:52 PM, Robert Sayre wrote:
> Everything in the proposal section is fine with me, as well. It's that
> "Notes" section that's the problem.
>
>> Note: I totally fail to understand the "Notes" bit at the end of
>> PaceTextShouldBeProvided.  It is underspecified to the extent that I
>> can't figure out what language change it is actually saying is
>> necessary.
>
> That section says "If PaceOptionalSummary is 'accepted', this Pace
> changes summary to SHOULD." That's OK to propose, but you can't accept
> both of them. They conflict.

No it doesn't, it says something about inserting the phrase "...is 
either not present or..." which, by the way, I don't understand.  Are 
we looking at the same document?

>> Basically, allowing title-only feeds seems OK to me, and encouraging
>> people to provide text also seems OK to me, so what's the problem?
>
> Current spec: MUST contain a summary
> after PaceOptionalSummary: MAY contain a summary
> after PaceTextShouldBeProvided: SHOULD contain a summary

So what you're actually objecting to is the last part of the Pace 
before the "Impacts" section, that wants 4.1.2 to say that summary 
SHOULD be there if Content is absent.  Or am I missing something? -Tim
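
To make the three requirement levels concrete, here is a rough sketch of how a feed checker might treat a title-only entry under each one. `lint_entry` and the dict representation are hypothetical, and applying the rule only when both summary and content are missing is an assumption based on the reading above, not text from either Pace.

```python
def lint_entry(entry, summary_level):
    """Check one entry (a plain dict) for missing text.

    summary_level is "MUST", "SHOULD", or "MAY" and is applied only when
    the entry has neither a summary nor content.
    """
    if entry.get("summary") or entry.get("content"):
        return None
    if summary_level == "MUST":
        return "error: entry has no summary"
    if summary_level == "SHOULD":
        return "warning: entry has no summary or content"
    return None  # MAY: title-only entries are acceptable
```

Under MUST the title-only entry is invalid, under SHOULD it is valid but flagged, and under MAY it passes silently, which is where the disagreement over accepting both Paces comes from.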

Re: PaceOriginalAttribute

2005-05-05 Thread Robert Sayre


On 5/5/05, Antone Roundy <[EMAIL PROTECTED]> wrote:

> >
> Yeah, they think they are, or at least claim to think so.  But isn't
> that the same thing that is stated if you see the following in two
> feeds?
> 
> 
> bar:bar
> 
> foo:bar
> 
> foo:foo
> 
> I may be an imposter
> 
> 

> This says that this feed is (or at least claims it is) forwarding the
> entry with the id "foo:bar" from the feed "foo:foo".
> 
> I am honestly trying to see more in this, but as yet, I don't.

OK, now let's say you're subscribed to "imposter" in PubSub.


  bar:bar
  
 quux:quux1
 
 foo:foo
 
 I may be an imposter

  
 quux:quux2
 
 baz:baz
 
 I may be an imposter

 

Robert Sayre

Re: PaceTextShouldBeProvided

2005-05-05 Thread Graham

On 5 May 2005, at 6:23 pm, Robert Sayre wrote:
> It would be deeply bogus to accept a Pace whose sole action was to
> remove a normative requirement, and simultaneously accept a Pace that
> puts it back in. Seems obvious to me.

Not really. Assuming PaceOptionalSummary is accepted, there are two
completely valid outcomes:

1. PaceTextShouldBeProvided rejected => summaries are not required,
and textual content is not encouraged
2. PaceTextShouldBeProvided accepted => summaries are not required,
but textual content is encouraged

I don't see a conflict there. What's wrong with accepting two similar
paces because one corrects the flaws in the other?

> We know exactly what issues optional content has, because all of the
> other formats have it.

Yes, and we don't like it. A basic title only feed gives you fuck all
to work with - it displays poorly when mixed with rich-content feeds,
it isn't searchable because there aren't many keywords in the title,
etc etc. They cause all sorts of problems.

> So, we're looking for some way to say "provide as much information as
> you can." The problem with saying SHOULD is that we purport to know
> how much information the publisher can provide. It would be very easy
> to explain this issue in the spec, and I have no objection to doing
> so.

SHOULD here means "must unless you absolutely can't". That seems like
a perfectly dandy explanation of the intention of encouraging high
quality feeds.

Graham
