David Roundy wrote:
On Mon, Dec 04, 2006 at 04:34:48PM +0000, Simon Marlow wrote:

David Roundy wrote:


I've been working hard on getting support for the new hashed inventory
format into good shape.  If you aren't familiar with the benefits of the
new format (which I've talked about with at least some of you in person),
suffice to say that I see it as a precursor to working out the new way of
dealing with conflicts.

As an interested bystander, I'd really like to hear a brief description of what a "hashed inventory" is and what benefits it brings. Not a 12-page paper; a quick outline will do fine. I don't want to distract you from the hacking frenzy :)


A hashed inventory is a modification of the darcs repository format, which
essentially replaces the _darcs/inventory file (which is human-readable, if
not human-modifiable, so if you're not familiar with it, you could take a
look) with a _darcs/hashed_inventory file.  The difference is that, in
addition to the patch identifier that is stored today, a hash of the
contents of each patch is stored.  This hash is then used as the filename
in _darcs/patches/.  This has several benefits.
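
To make the idea concrete, here is a minimal Python sketch of storing a
patch under a filename derived from its contents.  It is an illustration
only, not the darcs implementation: the function names, the inventory line
layout, and the use of a bare SHA-1 hex digest as the filename are all
assumptions.

```python
import hashlib
import os


def store_patch(patches_dir, patch_bytes):
    """Write patch_bytes to a file named after its SHA-1 hex digest.

    Hypothetical helper: the real on-disk layout may differ.
    """
    digest = hashlib.sha1(patch_bytes).hexdigest()
    os.makedirs(patches_dir, exist_ok=True)
    path = os.path.join(patches_dir, digest)
    # Identical contents always map to the same name, so an existing
    # file never needs to be rewritten.
    if not os.path.exists(path):
        with open(path, "wb") as f:
            f.write(patch_bytes)
    return digest


def inventory_entry(patch_id, digest):
    """Pair a patch identifier with its content hash (assumed layout)."""
    return f"{patch_id}\thash: {digest}"
```

Calling store_patch("_darcs/patches", data) returns the digest that would
then be recorded next to the patch identifier in the inventory.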

At the most obvious level, we've now got some extra information for
checking the consistency of a repository (helpful if, e.g., an HTTP proxy
modifies files in transit).
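
A consistency check of this kind can be sketched in a few lines of Python.
This is only an assumed shape for such a check (the function name and the
choice of a SHA-1 hex digest are illustrative), not what darcs itself does.

```python
import hashlib


def patch_is_intact(expected_hash, patch_bytes):
    """Return True if patch_bytes hashes to the digest recorded for it."""
    return hashlib.sha1(patch_bytes).hexdigest() == expected_hash
```

A patch file that was altered in transit would hash to a different digest,
so the check would fail and the file could be fetched again or rejected.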

The next advantage is that by cryptographically signing the hashed
inventory, you cryptographically sign the entire contents of the repository
(unless someone cracks SHA-1).  This is potentially valuable to high-profile
projects, or projects that use untrusted mirrors.

Next, because the filename for patches now depends on patch contents, all
darcs commands will be atomic (except with respect to the pristine
cache--but atomic with respect to remote access), including those that
currently aren't, such as amend-record and obliterate.
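
One way content-derived filenames could lead to atomic updates is sketched
below in Python.  The assumption here (mine, not stated above) is that new
patch files are first written under their new hash names, leaving existing
files untouched, and that the inventory is then swapped in with a single
atomic rename.  The function name and file layout are hypothetical.

```python
import os
import tempfile


def replace_inventory(repo_dir, new_lines):
    """Atomically replace repo_dir/hashed_inventory with new_lines."""
    inv_path = os.path.join(repo_dir, "hashed_inventory")
    # Write the new inventory to a temporary file in the same directory,
    # so the final rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=repo_dir, prefix="inventory-")
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write("\n".join(new_lines) + "\n")
            tmp.flush()
            os.fsync(tmp.fileno())
        # os.replace is an atomic rename on POSIX: readers see either
        # the old inventory or the new one, never a partial file.
        os.replace(tmp_path, inv_path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

Under this scheme, a reader that fetched the old inventory still finds
every patch file it names, because nothing was overwritten in place.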

With hashed inventories it will be possible to implement "lazy" partial
repositories, in which darcs downloads patch files as needed to do the
commands you ask, since we'll have the hash with which to verify that the
patch files haven't been commuted (and therefore are still in the proper
context for our use).
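
The lazy-download idea might look roughly like the following Python sketch.
The fetch callable, the cache layout, and the function name are assumptions
made for illustration; the point is only that a downloaded patch is checked
against the hash from the inventory before it is used.

```python
import hashlib
import os


def get_patch(patches_dir, expected_hash, fetch):
    """Return the patch with the given hash, downloading it if needed.

    fetch is a caller-supplied function (hypothetical) that takes a hash
    and returns the patch bytes from a remote repository.
    """
    path = os.path.join(patches_dir, expected_hash)
    if os.path.exists(path):
        # Already present locally: no download required.
        with open(path, "rb") as f:
            return f.read()
    data = fetch(expected_hash)
    # Refuse anything whose contents do not match the recorded hash,
    # e.g. a patch that has been changed since the inventory was written.
    if hashlib.sha1(data).hexdigest() != expected_hash:
        raise ValueError("patch does not match hash " + expected_hash)
    os.makedirs(patches_dir, exist_ok=True)
    with open(path, "wb") as f:
        f.write(data)
    return data
```

A command would call get_patch only for the patches it actually needs, so
the rest of the history never has to be downloaded.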

Finally, as I mentioned above, the refactoring for this change should help
with our plans for new conflict handling, which will probably require that
we break the current picture of one patch file per named patch (which
wouldn't work in the current scheme where the patch filename is determined
by the name of the patch).

Great, thanks David!

Simon


_______________________________________________
darcs-devel mailing list
darcs-devel@darcs.net
http://www.abridgegame.org/cgi-bin/mailman/listinfo/darcs-devel
