Michael Haggerty <mhag...@alum.mit.edu> writes:

>>>>> CVS stores all of the revisions of a single file in a single filename,v
>>>>> file in rcsfile(5) format.  The revisions are stored as deltas ordered
>>>>> so that a single revision can be reconstructed from a single serial read
>>>>> of the file.
>>>>>
>>>>> cvs2git reads each of these files once, reconstructing *all* of the
>>>>> revisions for a file in a single go.  It then pours them into a
>>>>> git-fast-import stream as blobs and sets a mark on each blob.

This is more or less off-topic but in the bigger picture it is more
interesting and important X-<.

The way you describe how cvs2git handles the blobs is the more
efficient way, given that fast-import does not even attempt to
create good deltas. The only thing it does is check whether
the current data deltifies against the last object.

IIRC, CVS's backend storage is mostly recorded as backward deltas, so
if you are feeding the blob data from new to old, then the resulting
pack would follow Linus's law (the file generally grows over time)
and would generally give you a good deltified chain of objects.
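
To make the ordering concrete, here is a minimal sketch (not what cvs2git
actually does; the function names and the in-memory list of revisions are
made up for illustration) that writes the revisions of one file to a
fast-import stream as marked blobs, newest first. It only uses the "blob",
"mark" and "data" commands of the stream format, and it assumes the caller
has already reconstructed the full text of every revision.

```python
import io


def write_blob(out, mark, data):
    """Write one fast-import 'blob' command, labelled with the given mark."""
    out.write(b"blob\n")
    out.write(b"mark :%d\n" % mark)
    out.write(b"data %d\n" % len(data))  # exact byte count of the payload
    out.write(data)
    out.write(b"\n")  # optional LF after the raw data, kept for readability


def emit_revisions_newest_first(out, revisions, first_mark=1):
    """Emit blobs from the newest revision to the oldest.

    'revisions' is a hypothetical list of (revision_id, content_bytes)
    pairs ordered oldest to newest. Returns {revision_id: mark} so that
    later commit commands could refer to the blobs by mark.
    """
    marks = {}
    mark = first_mark
    for rev_id, data in reversed(revisions):
        write_blob(out, mark, data)
        marks[rev_id] = mark
        mark += 1
    return marks


if __name__ == "__main__":
    # Toy history in which the file grows over time.
    history = [
        ("1.1", b"hello\n"),
        ("1.2", b"hello\nworld\n"),
        ("1.3", b"hello\nworld\nagain\n"),
    ]
    buf = io.BytesIO()
    print(emit_revisions_newest_first(buf, history))
    print(buf.getvalue().decode())
```

With this order, each blob is followed by its immediate predecessor, so
the one delta attempt fast-import makes (against the last object) is
always between two adjacent revisions of the same file.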

