Dan Sandler wrote:
On Jun 20, 2005, at 4:15 PM, James M Snell wrote:
Question: should we only allow signing of the entire document or are
there valid use cases for allowing each individual entry in the feed
to be individually signed?
I believe that individually signed entries are essential for a couple
of Atom usage contexts:
(1) aggregation services, e.g. third parties which combine or
filter portions of feeds
(2) redistribution applications, e.g. clients which share portions
of feeds with one another
The common thread here is that these scenarios represent third parties
interposed between publisher and user. [In category (1) the third
party is the aggregator; in category (2) the third party is "some
other client". This is a continuous design space, of course, but
let's stick with these two simple scenarios.] When a third party is
involved, signatures are required to verify that the information
received by the user has not been altered since the publisher released
it.
If Atom data may only be signed at the <feed/> level, the granularity
at which these third parties may slice and dice content is constrained
to that level also. The <entry/> seems to me to be more generally
useful as, well, an "atomic" signature unit for these applications.
Without individually signed entries, it would be impossible, for
example, to implement the most basic category (1) service: an
aggregator which can be *trusted* to combine two Atom feeds into one.
With signed entries, a client could be sure that each entry found in
the aggregation is authentic, despite having received the entries from
a potentially untrustworthy entity.
If this seems a bit far-fetched because there are currently few
examples of these Atom intermediaries, just wait; aggregators are the
new clipping services, and I suspect we'll see more of them in the
future. (Just as with traditional clipping services, people may even
pay money for this valuable service, which is a Good Thing for
technology that enables it.) And as for (2), I'm working on it :)
The thought here, then, is that feeds would not be considered atomic units and
that <entry /> elements could be pulled as-is out of a containing <feed />
element and passed around independently of it. That really doesn't seem to
square with some basic XML signature principles or with other Atom conventions,
such as the ability to omit the author element from a contained <entry /> when
the containing feed has an author... for example:
<feed>
  ...
  <author>...</author>
  <entry>...</entry>
</feed>

as opposed to

<feed>
  ...
  <entry>
    <author>...</author>
  </entry>
</feed>
In the first case, the <entry /> is entirely dependent on the containing feed
for a critical and required piece of metadata, making it impossible to separate
the entry from the feed without first processing it.
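For illustration, here's a rough sketch of what extracting that first entry
would actually entail (the author name is made up and the rest of the content
is elided): the feed-level <author /> has to be copied down into the entry
before the entry can stand on its own.

Entry as it sits in the original feed, with no author of its own:

<feed>
  ...
  <author><name>Jane Doe</name></author> <!-- hypothetical feed-level author -->
  <entry>
    <id>...</id>
    ...
  </entry>
</feed>

The same entry extracted for standalone use:

<entry>
  <id>...</id>
  <author><name>Jane Doe</name></author> <!-- copied down from the feed -->
  ...
</entry>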
So the question, with regard to enabling aggregation services, is whether those
services could even exist without performing a level of processing on the feed
and its entries that would necessarily break the digital signature. In other
words, if the entry in the first example above were included in an aggregate
feed containing entries from multiple authors, the <author /> element from its
containing feed would need to be added to the entry element's own metadata
(<entry>...</entry> becomes <entry>...<author />...</entry>), thereby
invalidating any signature calculated over the entry. One would also have to
contend with the potential problems introduced by namespace declarations on the
containing feed. The bottom line is that an entry carrying a signature could
not simply be copied into a new containing feed element with the signature
intact, making the aggregator scenario unworkable.
e.g.

Feed 1:

<feed xmlns="...">
  ...
  <entry>
    <id>urn:123</id>
    ...
    <Signature xmlns="..." />
  </entry>
</feed>

Feed 2:

<x:feed xmlns:x="...">
  ...
  <x:entry>
    <x:id>urn:abc</x:id>
    ...
    <ds:Signature xmlns:ds="..." />
  </x:entry>
</x:feed>

Perfectly Legal Synthesized Aggregated Feed:

<a:feed xmlns:a="...">
  <a:entry>
    <a:id>urn:123</a:id>
    ...
    <Signature xmlns="..." /> <!-- invalid signature !! -->
  </a:entry>
  <a:entry>
    <a:id>urn:abc</a:id>
    ...
    <Signature xmlns="..." /> <!-- invalid signature !! -->
  </a:entry>
</a:feed>
The only potential way around this problem would be to define a standard
canonicalization mechanism for Atom entries that would make it possible to
reliably sign and verify them across multiple feeds. While plausible, that
approach gets rather complicated and is only justified if a) there is a strong
enough use case and b) there isn't an easier way.
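Just to give a sense of what such a mechanism might have to look like -- this
is purely illustrative, not a proposal -- an entry-level signature could lean
on the standard XML-DSig enveloped-signature transform plus Exclusive
Canonicalization so that namespace declarations inherited from whatever feed
happens to contain the entry stay out of the digested bytes. The xml:id hook
below is invented for the example; the Algorithm URIs are the standard
XML-DSig / Exclusive C14N identifiers.

<entry xmlns="..." xml:id="entry-1"> <!-- hypothetical anchor for the Reference below -->
  <id>...</id>
  ...
  <Signature xmlns="http://www.w3.org/2000/09/xmldsig#">
    <SignedInfo>
      <CanonicalizationMethod Algorithm="http://www.w3.org/2001/10/xml-exc-c14n#"/>
      <SignatureMethod Algorithm="http://www.w3.org/2000/09/xmldsig#rsa-sha1"/>
      <!-- same-document reference to the entry itself, not to the containing document -->
      <Reference URI="#entry-1">
        <Transforms>
          <!-- exclude the Signature element itself from the digest -->
          <Transform Algorithm="http://www.w3.org/2000/09/xmldsig#enveloped-signature"/>
          <!-- exclusive c14n keeps namespace declarations inherited from a
               containing feed out of the signed octets -->
          <Transform Algorithm="http://www.w3.org/2001/10/xml-exc-c14n#"/>
        </Transforms>
        <DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha1"/>
        <DigestValue>...</DigestValue>
      </Reference>
    </SignedInfo>
    <SignatureValue>...</SignatureValue>
    <KeyInfo>...</KeyInfo>
  </Signature>
</entry>

Even then, everyone would have to agree on how the Reference points at the
entry (the xml:id above is only one possibility) and on the exact transform
profile, which is exactly the sort of detail that makes this approach
complicated.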
Unless such a canonicalization mechanism is defined, there would appear to be
no way of ensuring that the individual entries within a synthesized aggregate
feed are indeed "authentic" unless a) each entry contains a self-referential
pointer back to a digitally signed copy of itself, b) the synthesized feed in
which it is contained is digitally signed by a trusted entity, and c) the
version of the entry contained in the synthesized feed is identical to its
digitally signed reference copy. But even this is obviously imperfect.
Digitally signed original @ http://myblog.com/entries/1

<?xml ...?>
<entry>
  <id>urn:123</id>
  <link rel="self" href="http://myblog.com/entries/1" />
  <Signature xmlns="..."/> <!-- signed by original source -->
</entry>
Synthesized Aggregate Feed @ http://someaggregator/feed

<?xml ...?>
<feed>
  ...
  <entry>
    <id>urn:123</id>
    <link rel="self" href="http://myblog.com/entries/1" />
  </entry>
  <Signature xmlns="..." /> <!-- signed by aggregator -->
</feed>
With this approach, the referenced copy of the entry becomes the
"authenticated" version (assuming you trust both the aggregator's and the
original source's signatures), while the version of the entry contained in the
aggregate feed is "unauthenticated". The key challenge with this approach is
that one would really have to trust the aggregator in order for it to work.
Another challenge is the fact that the Atom specification only accounts for
Signature elements at the document level -- i.e. as a child of the top-level
<feed /> or <entry /> element -- and not at the child entry level.
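In other words (content elided as in the examples above), the only placements
accounted for would be a signature over a standalone entry document or over the
feed document as a whole:

<!-- standalone entry document signed as a whole -->
<entry>
  ...
  <Signature xmlns="..." />
</entry>

<!-- feed document signed as a whole -->
<feed>
  ...
  <entry>...</entry>
  <Signature xmlns="..." />
</feed>

but not a <Signature /> on each <entry /> inside a <feed />, which is what the
Feed 1 example above relies on.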
So... given all this... I think I'm going to make an assertion and open that
assertion up for debate: the need for an end-to-end trust model for Atom
capable of traversing any number of intermediaries is largely a myth. What is
really needed is a simple mechanism for protecting feeds against spoofed
sources (e.g. a man-in-the-middle serving up a bogus feed) and for indicating
that content is trustworthy* at the document level (as opposed to the
individual feed entry level).

* by which I mean, for example, that binary enclosures are trustworthy, that
the feed itself does not contain any malicious content, etc.
Thoughts?
---dan
http://www.cs.rice.edu/~dsandler/
- James