On Jun 20, 2005, at 11:17 PM, James M Snell wrote:

The thought here then is that feeds would not be considered atomic units and that <entry /> elements can be pulled as is out of a containing <feed /> element and passed around independently of it.

That's basically the idea, yes.

That really doesn't seem to square with some basic XML signature principles and other Atom conventions, such as the ability to omit the author element from a contained <entry /> if the containing feed has an author...

Which XML-DSig principles does this violate? It seems to me that if you come across a signed node, as long as you don't break the well-formedness of it, you can do what you like with it.

As for Atom conventions, such as the omission of key information from an <entry>---yes, this is a complication. One approach would be to make my Atom feeds "disassemblable", such that each entry found in the feed could equally stand on its own as an Atom Entry Document.
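As a sketch of what "disassemblable" might mean in practice (this is my illustration, not anything from the Atom spec; the feed content and the name "Alice" are invented), a publisher could copy feed-level metadata such as <author> down into any entry that lacks it before signing, so each entry stands alone as an Atom Entry Document:

```python
import copy
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
ET.register_namespace("", ATOM)

feed_xml = f"""<feed xmlns="{ATOM}">
  <title>Example Feed</title>
  <author><name>Alice</name></author>
  <entry><title>First post</title></entry>
</feed>"""

feed = ET.fromstring(feed_xml)
feed_author = feed.find(f"{{{ATOM}}}author")

# Copy the feed-level <author> into every entry that does not carry
# its own, so each entry can stand alone as an Atom Entry Document.
for entry in feed.findall(f"{{{ATOM}}}entry"):
    if feed_author is not None and entry.find(f"{{{ATOM}}}author") is None:
        entry.append(copy.deepcopy(feed_author))

entry = feed.find(f"{{{ATOM}}}entry")
print(ET.tostring(entry, encoding="unicode"))
```

The entry now carries its own <author> and could be signed and republished independently of the feed.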

Then again, this may not be such a big deal. In the case of an aggregator it's an issue, but in the peers-sharing-entries case, it's conceivable that each peer already knows the <feed> level metadata and can refer to it when necessary. (Perhaps the peers periodically---but rarely!---poll the conventional Atom feed, or perhaps they exchange the entire <feed> contents among themselves if it too has been signed by the original publisher.)

So the question with regard to enabling aggregation services is whether or not those services could even exist without performing a level of processing against the feed and entries that would necessarily break the digital signature. In other words, if the entry in the first example above were to be included in an aggregate feed containing entries from multiple authors, the <author /> element from its containing feed would need to be added to the entry element's collection of metadata (<entry>...</entry> becomes <entry> ... <author /> ... </entry>), thereby invalidating any signature calculated over the entry.

I agree that destructively processing the entries is almost certain to cause trouble (and relates directly to your canonicalization argument below). This might not be such a big deal; aggregators might simply pass along what users can verify as authentic, placing the pressure on original Atom content producers (i.e. publishers and their software) to create "self-sufficient" entries.
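To make the breakage concrete (a toy illustration, with invented entry content): any byte-level change to a signed entry, such as an aggregator inserting an <author> element, changes the digest the signature was computed over, so verification fails:

```python
import hashlib

# The entry as originally serialized and signed by the publisher.
original = b"<entry><title>First post</title></entry>"

# The same entry after an aggregator has "helpfully" copied the
# feed-level author into it.
modified = b"<entry><title>First post</title>" \
           b"<author><name>Alice</name></author></entry>"

digest_before = hashlib.sha256(original).hexdigest()
digest_after = hashlib.sha256(modified).hexdigest()

# The digests differ, so a signature over the original bytes no
# longer verifies against the aggregator's modified copy.
print(digest_before == digest_after)  # False
```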

One would also have to contend with the potential problems introduced by namespace declarations within the feed. The bottom line is that an entry with a signature could not simply be copied over to a new containing feed element with the signature intact, making the aggregator scenario unworkable.

Ugh. I don't have an answer for the namespaces problem. (XML isn't doing me any favors here. Or perhaps the reverse is more true?)

I suppose that in my mind, an Atom feed is mostly a vector (pun intended) for carrying Atom entries. After all, the entries are what the user (or end processor, or what have you) is really interested in! They are stable, mostly-unchanging, "atomic" (!) bits of data that can be exchanged and stored and reasoned about. On the other hand, the feed is always changing, even if most of its data remains the same; as a sliding window of Atom "events", it's a moving target, almost entirely uninteresting in itself from the perspective of processing applications. Digital signatures just bring this point to the fore.

Which brings me to the following heresy: Wouldn't it be nice if an Atom feed were just a list of self-contained Atom Entry Documents? Yes, some data will be duplicated between items, but there's a tremendous flexibility benefit, to my mind. Entries are now much more loosely coupled to one another, and can survive on their own in any context, for any purpose.

(The reason I think that this is a *useful* flexibility is that when I look at current applications of newsfeeds, I see a strong focus on entries. Popular newsreaders spend the bulk of their UI energy on listing and presenting entries, because that's where the user's focus is. The "objects" in the system, from the user's perspective, are the entries. The feed is just the box they came in. How long before newsreaders allow users to clip and save their favorite entries, individually, in a *different* box---like saving email messages to a folder? Even this simple operation is made difficult by binding entries tightly to their feeds. But I digress.)

The only potential way around this problem would be to define a standard canonicalization mechanism for Atom entries that would make it possible to reliably sign and verify them across multiple feeds.

I agree that canonicalization would also solve this. (Just to be sure we're on the same page: XML Canonicalization is woefully insufficient for this domain-specific task.) You might view the concept of an Atom Entry Document as a start in this direction.
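To illustrate what such a mechanism would buy us, here is a deliberately toy canonical form (my invention; it is NOT XML-C14N and NOT anything defined by the Atom spec): parse the entry, drop inter-element whitespace, sort attributes, and re-serialize deterministically, so that superficially different serializations hash to the same digest:

```python
import hashlib
import xml.etree.ElementTree as ET

def canonical_bytes(entry_xml: str) -> bytes:
    """Toy canonical form: normalized whitespace, sorted attributes,
    deterministic serialization. Illustrative only."""
    root = ET.fromstring(entry_xml)

    def serialize(el):
        attrs = "".join(f' {k}="{v}"' for k, v in sorted(el.attrib.items()))
        text = (el.text or "").strip()
        children = "".join(serialize(c) for c in el)
        return f"<{el.tag}{attrs}>{text}{children}</{el.tag}>"

    return serialize(root).encode("utf-8")

# Two serializations that differ only in whitespace and attribute
# order canonicalize to the same bytes, so a digest (and hence a
# signature) computed over the canonical form survives re-serialization.
a = '<entry type="text" lang="en"><title>Hi</title></entry>'
b = '<entry lang="en"  type="text" >\n  <title>Hi</title>\n</entry>'
print(hashlib.sha256(canonical_bytes(a)).hexdigest() ==
      hashlib.sha256(canonical_bytes(b)).hexdigest())  # True
```

A real mechanism would of course have to handle namespaces, mixed content, character references, and the rest of XML's sharp edges, which is exactly why a defined standard is needed.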

Unless such a canonicalization mechanism is defined, it would appear as if there would be no way of ensuring that an individual entry within a synthesized aggregate feed is indeed "authentic" unless (a) it contains a self-referential pointer back to a digitally signed version of itself, (b) the synthesized feed in which it is contained is digitally signed by a trusted entity, and (c) the version of the entry contained in the synthesized feed is identical to its digitally signed reference copy.

That seems pretty complicated. Also, it requires the client, having received an entry signed by an aggregator, to go fetch the entry from the original server in order to get the "authentic" version (signed by the originator). In that case, why bother distributing the content through the aggregator at all? The aggregator could save some effort and just publish a list of URLs for the clients to fetch.

[...] The key challenge with this approach is that one would really have to trust the aggregator in order for it to work.

Right, which is the kind of trust I'm trying to avoid relying on.

Another challenge is the fact that the Atom specification only accounts for Signature elements at the document level -- e.g. as a child of the top-level <feed /> or <entry /> element -- and not at the child-entry level.

Yes. This has always struck me as an unnecessary limitation in the spec. I chalked it up to getting 1.0 out the door, and trying to hash out all the ugly details of signatures later (in which case, mission accomplished).

So... given all this... I think I'm going to make an assertion and open that assertion up for debate: The need for an end-to-end trust model for Atom capable of traversing any number of intermediaries is largely a myth. What is really needed is a simple mechanism for protecting feeds against spoofed sources (e.g. a man-in-the-middle serving up a bogus feed) and for indicating that content is trustworthy* on the document level (as opposed to the individual-entry level).

* By which I mean, for example, that binary enclosures are trustworthy, that the feed itself does not contain any malicious content, etc.

If indeed end-to-end authenticity is unnecessary, then I agree: document-level trust is all you need.

However, providing only document-level authenticity locks Atom (as a data format) out of other distribution models. In essence, you're stuck with either "client polls source for entire feed" or "client gets entire feed from somewhere else". There's no opportunity to send deltas, and there's no opportunity to combine feeds. (Unless security is unimportant to Atom applications, in which case there's no reason to sign feeds or entries at all.)

---dan, kicking in way more than his two-cent quota

