My latest thoughts on Fossil-2.0 are in the attachment. (I think I
have successfully modified the mailing list settings to allow text
attachments through - this message will serve as a test case.)
--
D. Richard Hipp
[email protected]
The latest thinking on the Fossil-2.0 upgrade.
(1) Keep the BLOB.UUID column. The value in BLOB.UUID is the display name
for an artifact. Most artifacts have only this one name. For older
artifacts the name will be a SHA1 hash. For newer artifacts the display
name will be the SHA3-256 hash (or some other hash).
(2) Add a new ALIAS table as follows:
CREATE TABLE alias(
hval TEXT, -- Hex-encoded hash value
htype ANY, -- hash type
rid INTEGER REFERENCES blob, -- Blob that this hash names
PRIMARY KEY(hval,htype,rid)
) WITHOUT ROWID;
CREATE INDEX alias_rid ON alias(rid);
This alias table will hold alternative names for artifacts. If the
display name is SHA3-256, there might be a SHA1 alias. Fossil will
work to keep the number of aliases to a minimum. Most artifacts will
have only a display name and no aliases. Many repositories will have
no aliases at all.
Once aliases are registered for an artifact, the artifact can be referred
to using either its display name or any of its aliases.
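A rough sketch of how a name lookup against this schema might work, using
Python's sqlite3 module. The two-column BLOB table here is a simplified
stand-in (the real table has more columns), and resolve_name() is a
hypothetical helper, not an existing Fossil function.

```python
import sqlite3

# Simplified stand-in for the BLOB table plus the proposed ALIAS table.
SCHEMA = """
CREATE TABLE blob(
  rid INTEGER PRIMARY KEY,
  uuid TEXT UNIQUE NOT NULL           -- display name
);
CREATE TABLE alias(
  hval TEXT,                          -- Hex-encoded hash value
  htype ANY,                          -- hash type
  rid INTEGER REFERENCES blob,        -- Blob that this hash names
  PRIMARY KEY(hval,htype,rid)
) WITHOUT ROWID;
CREATE INDEX alias_rid ON alias(rid);
"""

def resolve_name(db, name):
    """Return the rid for a display name or an alias, or None if unknown."""
    row = db.execute("SELECT rid FROM blob WHERE uuid=?", (name,)).fetchone()
    if row is None:
        # Fall back to the alias table only when the display name misses.
        row = db.execute("SELECT rid FROM alias WHERE hval=?", (name,)).fetchone()
    return row[0] if row else None
```

Checking BLOB.UUID first keeps the common case (no aliases at all) to a
single indexed lookup.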
(3) The repository keeps a list of all hash algorithms used. For new
repositories, this list will be a singleton: SHA3-256. For legacy
repositories, the list will be of length two: SHA3-256, SHA1. The
first algorithm on the list is the preferred algorithm and is the hash
used for new artifacts added by a "fossil commit".
(4) As each new artifact is added by "fossil commit", all possible hash
names must be computed, in order to check to see if that artifact is
already in the repository. If the new artifact is already in the
repository, it takes on the display name of the preexisting artifact.
If the artifact has never been seen before, the preferred hash algorithm
is used for the display name and the other hashes are discarded.
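The naming rule in (3) and (4) could be sketched roughly as follows. The
algorithm labels, the known_names mapping, and both function names are
assumptions made for illustration. Hashing the raw content bytes is also a
simplification.

```python
import hashlib

# Assumed labels for the repository's algorithm list.
HASHERS = {"sha3-256": hashlib.sha3_256, "sha1": hashlib.sha1}

def all_names(content, algorithms):
    """Compute every possible hash name for the content."""
    return {alg: HASHERS[alg](content).hexdigest() for alg in algorithms}

def name_for_commit(content, algorithms, known_names):
    """Pick the display name for content being committed.

    algorithms  -- the repository's list, preferred algorithm first
    known_names -- hypothetical mapping from every known name (display
                   name or alias) to the artifact's display name
    """
    names = all_names(content, algorithms)
    for alg in algorithms:
        if names[alg] in known_names:
            # Already in the repository: reuse the preexisting display name.
            return known_names[names[alg]]
    # Never seen before: use the preferred hash, discard the others.
    return names[algorithms[0]]
```

For a legacy repository the call would pass ["sha3-256", "sha1"], so content
already stored under its SHA1 name keeps that name.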
(5) During synchronization, if one side knows the artifact only by its SHA1
name and the other side knows the artifact only by its SHA3-256 name, then
the two sides will not know that they are holding the same artifact.
The artifact content will be sent over the wire unnecessarily. But once
that happens, both sides will register aliases and no further unnecessary
syncing will occur. It is expected that this unnecessary syncing will
be very rare.
(6) Check-in manifests and other structural artifacts are allowed to contain
a mixture of hash types. A check-in that occurs after transitioning a
project from SHA1 to SHA3-256 will identify older files using their SHA1
hashes and will identify files that have changed since the transition by
their SHA3-256 hashes.
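One way a reader of such a mixed manifest might tell the hash types apart
is by length, since a hex-encoded SHA1 is 40 characters and a hex-encoded
SHA3-256 is 64. The sketch below assumes that approach. The function names
and the (filename, hash) pair representation are hypothetical.

```python
def hash_type(name):
    """Guess the hash algorithm from the length of a hex-encoded name."""
    return {40: "sha1", 64: "sha3-256"}.get(len(name), "unknown")

def summarize_manifest(files):
    """Count hash types over (filename, hash) pairs from one check-in."""
    counts = {}
    for _filename, name in files:
        kind = hash_type(name)
        counts[kind] = counts.get(kind, 0) + 1
    return counts
```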
(7) If a Fossil-2.0 repository contains only SHA1 display names, then it will
sync with an older Fossil-1.x peer. However, the Fossil-1.x peer will
complain about protocol errors if artifacts with display names other
than SHA1 are used.
(8) There are no changes to the sync protocol, other than relaxing the
constraint on hash length. For Fossil-1.x, the hash length must be
exactly 40 characters. For Fossil-2.0, the hash length must be 40
characters or more.
(9) There are no changes to the file formats, other than relaxing the size
constraint on artifact hashes - allowing a hash to be 40 characters or
longer rather than requiring it to be exactly 40 characters.
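The relaxed length constraint in (8) and (9) amounts to changing an equality
test into a lower bound. A minimal sketch, assuming hashes are lowercase
hex (the function names are hypothetical):

```python
HEX_DIGITS = set("0123456789abcdef")

def valid_hash_v1(h):
    """Fossil-1.x rule: exactly 40 hex characters."""
    return len(h) == 40 and set(h) <= HEX_DIGITS

def valid_hash_v2(h):
    """Fossil-2.0 rule: 40 or more hex characters."""
    return len(h) >= 40 and set(h) <= HEX_DIGITS
```

Every name accepted by the 1.x rule is also accepted by the 2.0 rule, which
is why a repository holding only SHA1 names remains readable by both.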
(10) Probably: If the display name for an artifact is shorter than an
alias name, then the display name and alias name will swap places.
In this way, if the same artifact is referenced by both its SHA3-256
name and its SHA1 name, then the SHA3-256 name will automatically become
the display name.
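The swap in (10) might look something like this. The function name and the
list representation of aliases are assumptions for illustration only.

```python
def promote_longest(display, aliases):
    """Return (display, aliases) with the longest name as the display name.

    If some alias is longer than the current display name, the two
    swap places; otherwise nothing changes.
    """
    longest = max(aliases, key=len, default=display)
    if len(longest) > len(display):
        aliases = [a for a in aliases if a != longest] + [display]
        display = longest
    return display, aliases
```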
(11) URLs and command-line arguments can use either the display name or any
of the aliases for an artifact.
(12) Web pages that show details about an artifact will be titled by the
display name, but will also show all aliases.
(13) Repositories will have the option to reject newer content that uses
SHA1 hash names.
_______________________________________________
fossil-dev mailing list
[email protected]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/fossil-dev