Jeff King <p...@peff.net> writes:

> A much bigger problem is the other places we reference sha1s. The
> obvious place is trees, which have no room for backup pointers (either
> in headers, or with a NUL trick).

This is a tangent (as I do not have anything particularly worth
adding on top of what has already been said around the exact
SHA-[123] topic), but we probably would want to start thinking about
the tree object format "v2" at some point.

Some random thoughts:

 - It is OK if existing versions of Git barfed when asked to read a
   tree object in the "v2" format.  The repository format version
   may need to be bumped up when writing such an object, and
   transfer protocols need to pay attention to it, to avoid
   transferring history with objects in the newer representation to
   repositories with an older repository format version.

 - We do not need a new "tree v2" object type.  Existing versions of
   Git will barf upon seeing such an object, but that won't be the
   only way to prevent existing versions of Git from misinterpreting
   a tree object recorded in the "v2" format as if it were in the
   current format (e.g. a non-octal in the mode field of the first
   entry causes tree-walk.c::get_mode() to barf).

 - We do not mind two tree objects that encode the same tree in the
   current and the enhanced formats having different object names.
   In fact, we care more about the object names derived purely from
   the content of the object as an uninterpreted bytestream, so it
   is expected that they have different object names.

   This will make path-limited traversal and diff open more trees
   than necessary at the "version bump" boundary in the history,
   but that is normal (think of a project that used to record its
   text files with CRLF and one day decides to convert everything
   to LF; the trees before and after the conversion record
   logically the same contents, and "git show" should give an
   emptiness, but the diff machinery needs to go into contents at
   the flag day boundary).

   As long as we do not let random "extension of the day" into the
   new format willy-nilly, the resulting history will still be
   useful and usable.  From that point of view, no part of the
   additional information we would record in the updated format that
   is not present in the current format should be optional (iow,
   once you decide to use the "v2" format to record a certain tree,
   you will produce an identical and reproducible representation in
   "v2", regardless of your implementation).
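
To illustrate the second point, here is a rough sketch of how an
octal-only mode parser ends up rejecting an entry that starts with
something else.  It is a simplified stand-in for the check in
tree-walk.c::get_mode(), not the actual code; the function name
parse_entry_mode() and the leading "v2 " marker mentioned below are
made up for the illustration, and nothing here proposes a concrete
"v2" layout.

```c
#include <stddef.h>

/*
 * Simplified sketch of parsing the mode field of a tree entry.
 * The current format stores "<octal mode> <name>\0<object name>".
 * Returns a pointer just past the space that ends the mode, or
 * NULL when the field is empty or contains a non-octal character.
 */
static const char *parse_entry_mode(const char *str, unsigned int *modep)
{
	unsigned int mode = 0;
	unsigned char c;

	if (*str == ' ')
		return NULL;		/* empty mode field */

	while ((c = (unsigned char)*str++) != ' ') {
		if (c < '0' || c > '7')
			return NULL;	/* non-octal, including NUL */
		mode = (mode << 3) + (c - '0');
	}
	*modep = mode;
	return str;
}
```

Feeding it "100644 Makefile" yields mode 0100644 and a pointer to
"Makefile", while a hypothetical entry beginning with "v2 " returns
NULL on the very first byte, which is the kind of failure an older
reader would hit without any new object type being involved.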

All of the above are issues for Git 3.0 and beyond, though ;-).