Re: rev hash stability

Jens Alfke Fri, 17 Oct 2014 16:16:45 -0700

> On Oct 17, 2014, at 2:22 PM, Brian Mitchell <[email protected]> wrote:
> 
> Giving revs meaning outside of this scope is likely to bring up more meta
> discussion about the CouchDB data model and a long history of
> undocumented choices which only manifest in the particular
> implementation we have today.


That does appear to be a danger. I'm not interested in bike-shedding; if the 
Apache CouchDB community can't make progress on this issue then we can discuss 
it elsewhere to come up with solutions. I can't speak for Chris, but I'm here 
as a courtesy and because I believe interoperability is important. But I 
believe making progress is more important.

Back to the matter at hand: experience from a long line of P2P systems (from 
FreeNet onwards) shows the value of giving pieces of distributed content their 
own unique and unforgeable IDs. CouchDB-style revision IDs partly meet this 
need, except that:
(a) there are interoperability issues because every implementation has its own 
algorithm for generating the IDs;
(b) none of the current ones is truly unforgeable, because they use the broken 
MD5 hash instead of something like SHA-256;
(c) the unforgeability isn't verified because the replicator doesn't check that 
a revision's ID matches its contents.
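
To illustrate point (a), here is a minimal Python sketch. It assumes two hypothetical implementations that hash a JSON serialization of the document body; the serialization choices and variable names are invented for illustration and do not describe any real implementation's algorithm. The same logical document yields two different digests as soon as the byte-level encoding differs, whatever hash function is used.

```python
import hashlib
import json

# The same logical document, as seen by two hypothetical implementations.
doc = {"name": "alice", "age": 30}

# Implementation A (assumed): sorted keys, no whitespace.
encoded_a = json.dumps(doc, sort_keys=True, separators=(",", ":"))
# Implementation B (assumed): insertion order, default separators.
encoded_b = json.dumps(doc)

digest_a = hashlib.sha256(encoded_a.encode("utf-8")).hexdigest()
digest_b = hashlib.sha256(encoded_b.encode("utf-8")).hexdigest()

# Both encodings decode to the same content...
same_content = json.loads(encoded_a) == json.loads(encoded_b)
# ...but the digests, and therefore the rev IDs, would not match.
same_digest = digest_a == digest_b
```

Here `same_content` is true while `same_digest` is false, which is the interoperability gap described in (a).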

At some point — Couchbase would like to build P2P systems in the future — we 
may need to take this more seriously, at which point it becomes necessary to 
have a canonical rev-ID generation algorithm which is enforced by the 
replicator. That algorithm will need to be standardized for interoperability 
purposes, since otherwise two implementations would reject each other's 
revisions as forgeries.
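
As a rough sketch of what such a scheme could look like, the Python below derives a rev ID from a SHA-256 digest over a canonical encoding and has the receiving side recompute it. Everything here is an assumption made for illustration: the function names `canonical_rev_id` and `verify_revision`, the choice of inputs (generation, parent rev ID, body), and the use of sorted-key compact JSON as the canonical form. A real standard would also have to pin down details such as number formatting, Unicode normalization, and attachments.

```python
import hashlib
import json


def canonical_rev_id(generation, parent_rev, body):
    """Hypothetical canonical rev ID of the form "<generation>-<sha256 hex>"."""
    # Assumed canonical form: sorted keys, no whitespace, UTF-8 bytes.
    payload = json.dumps(
        {"parent": parent_rev, "body": body},
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    ).encode("utf-8")
    return f"{generation}-{hashlib.sha256(payload).hexdigest()}"


def verify_revision(rev_id, generation, parent_rev, body):
    """What an enforcing replicator might do: recompute the ID and compare."""
    return rev_id == canonical_rev_id(generation, parent_rev, body)


# Sender computes the ID; key order in the body does not matter.
rev = canonical_rev_id(1, None, {"a": 1, "b": 2})

# Receiver accepts a matching revision and rejects a tampered one.
accepted = verify_revision(rev, 1, None, {"b": 2, "a": 1})
rejected = not verify_revision(rev, 1, None, {"a": 1, "b": 3})
```

If the two sides disagreed on any step of `canonical_rev_id`, `verify_revision` would fail for honest revisions too, which is why the algorithm would have to be standardized before a replicator could enforce it.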

That's why I see this issue as important.

—Jens
