Tim Bray wrote:

On May 18, 2005, at 9:11 AM, Sam Ruby wrote:

    There seemed to be consensus that feeds needed something to identify
    them.  What there wasn't consensus on is which element should be
    that identifier.  The solution selected was to make none of the
    identifiers required - something I don't think has much support, and
    furthermore creates problems with the ability to cite a source.

Paul and I talked this over, and while we're not sure (the email trail was confusing) Sam may be right; so let's find out. Note that it seems pretty clear that the cost of requiring atom:id is nearly zero, since anyone who's generating Atom has to have an ID generator around for entries. So, let's find out if Sam is right:

Permit me to provide some more background. I *think* there is some desire that, if the SAME entry appears in multiple places (say, TheServerSide, Java.net, Artima, etc.), it not appear multiple times. This specific example was called out here:


  http://www.tbray.org/ongoing/When/200x/2005/04/03/Atom-Now#p-1

I apologize for referencing something outside of the mailing list, and if the item I cited does not represent the consensus of the working group, then the following text should be removed from PaceDuplicateIDs:

  If multiple atom:entry elements with the same atom:id value appear in
  an Atom Feed document, they represent the same entry.

 - - -

If, however, the above is what we desire (and I certainly support it), let's see what Bob's objection was:

  http://www.imc.org/atom-syntax/mail-archive/msg14470.html

In that, he said:

  The problem is, once again, that prohibiting duplicate ids provides an
  easy to use attack vector for those wishing to effectively “erase”
  entries written by another author.

Now, let's take another look at that text from PaceDuplicateIDs:

  If multiple atom:entry elements with the same atom:id value appear in
  an Atom Feed document, they represent the same entry.

It seems to me that this does not solve the problem that Bob described. More specifically, if pubsub were to republish data from TheServerSide, Artima, or other places, then the "erasure" that Bob fears would come to pass.
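
To make the "erasure" concern concrete, here is a minimal sketch, assuming a hypothetical re-aggregator that treats atom:id as the only key and lets a later entry replace an earlier one. The dict fields and the function name are illustrative assumptions; nothing here comes from the draft or from any real aggregator.

```python
# Hypothetical aggregator that keys entries on atom:id alone.
# Entries are modeled as plain dicts; the field names are illustrative only.

def merge_by_id(entries):
    """Keep one entry per atom:id; a later entry replaces an earlier one."""
    merged = {}
    for entry in entries:
        merged[entry["id"]] = entry
    return list(merged.values())


# Two entries from different authors that claim the same atom:id.
original = {"id": "tag:example.org,2005:post-1", "author": "alice", "title": "Real post"}
spoofed = {"id": "tag:example.org,2005:post-1", "author": "mallory", "title": "Junk"}

result = merge_by_id([original, spoofed])
# Only the spoofed entry survives: the original has been "erased".
```

Under that assumption, anyone who can get an entry with a copied id into the aggregated stream displaces the original.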

What's most puzzling is that it appears that PaceDuplicateIDs was specifically written in response to Bob's concerns:

  http://www.imc.org/atom-syntax/mail-archive/msg14731.html

What is missing in all this is the following, again from Bob's original statement of the problem:

  Graham Park has proposed that we loosen the existing language to
  permit duplicate ids in the case where the entries have atom:source
  elements which identify different URI’s in “self” links. I support
  this compromise and believe it should be supported by the WG and
  incorporated into the Atom Draft.

This proposal seemed to enjoy some support, yet it does not seem to have made it into the current draft, despite being crucial to solving the issue that PaceDuplicateIDs was designed to address. However, for it to work, the re-aggregator would need to have access to a "self" link, which is not required by the current draft.
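
One way to picture the compromise quoted above is a rough sketch in which the re-aggregator keys each entry on the pair (atom:id, "self" link of its atom:source). The `source_self` field and the function name are assumptions made for illustration, not anything defined by the draft.

```python
# Hypothetical re-aggregator that keys entries on (atom:id, source "self" link).
# "source_self" stands in for the "self" link inside an entry's atom:source.

def merge_by_id_and_source(entries):
    """Keep one entry per (atom:id, source self link) pair."""
    merged = {}
    for entry in entries:
        # source_self is None when no "self" link is available, which the
        # current draft permits; such entries fall back to id-only matching.
        key = (entry["id"], entry.get("source_self"))
        merged[key] = entry
    return list(merged.values())


shared_id = "tag:example.org,2005:post-1"
from_origin = {"id": shared_id, "source_self": "http://origin.example/feed", "title": "Real post"}
from_other = {"id": shared_id, "source_self": "http://other.example/feed", "title": "Junk"}
no_self_a = {"id": shared_id, "title": "Real post"}
no_self_b = {"id": shared_id, "title": "Junk"}

kept_both = merge_by_id_and_source([from_origin, from_other])
collapsed = merge_by_id_and_source([no_self_a, no_self_b])
```

In this sketch, entries with the same id but different "self" links both survive, while entries lacking a "self" link collapse into one, which is the gap described above.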

What should we do? One way to solve this is to require "id" *and* update Graham's original proposal accordingly, *and* incorporate it into the next (and presumably final) draft.

 - - -

That's what I meant by "There is a danger of looking at changes in isolation.":

  http://www.imc.org/atom-syntax/mail-archive/msg15292.html

Of course, breaking any link in my complicated chain of logic above would cause the whole argument to collapse, which would be fine with me.

Does anybody see something that I am missing?

- Sam Ruby


