This series looks fine to me, except for some minor issues that I commented on individually. Thanks for improving the documentation!
In general, I think Patch 1 could be made friendlier to newcomers. Patch 2 is good. Patch 3 looks a bit biased on some topics, which is probably fine; I can improve its coverage of topics like linkrevs and map data structures in the future.

Excerpts from Gregory Szorc's message of 2017-02-27 12:54:00 -0800:
> # HG changeset patch
> # User Gregory Szorc <gregory.sz...@gmail.com>
> # Date 1488226671 28800
> #      Mon Feb 27 12:17:51 2017 -0800
> # Node ID ded4aedfaffbabce6c083f660fc5feeeeb287f0c
> # Parent  abb92b3d370e116b29eba4d2e3154e9691c8edbb
> help: clarify revision / chunk behavior
>
> Try to make it easier to understand the differences between the logical
> and physical model of revlog storage.
>
> diff --git a/mercurial/help/internals/revlogs.txt b/mercurial/help/internals/revlogs.txt
> --- a/mercurial/help/internals/revlogs.txt
> +++ b/mercurial/help/internals/revlogs.txt
> @@ -2,17 +2,18 @@ Revision logs - or *revlogs* - are an ap
>  storing discrete entries, or *revisions*. They are the primary storage
>  mechanism of repository data.
>
> +A revlog revision logically consists of 2 parts: metadata and a content
> +blob. Metadata includes the hash of the revision's content, sizes, and
> +links to its *parent* entries. The collective metadata is referred
> +to as the *index* and the revision content is the *data*.
> +
>  Revlogs effectively model a directed acyclic graph (DAG). Each node
>  has edges to 1 or 2 *parent* nodes. Each node contains metadata and
>  the raw value for that node.
>
> -Revlogs consist of entries which have metadata and revision data.
> -Metadata includes the hash of the revision's content, sizes, and
> -links to its *parent* entries. The collective metadata is referred
> -to as the *index* and the revision data is the *data*.
> -
> -Revision data is stored as a series of compressed deltas against previous
> -revisions.
> +The revision data physically stored in a revlog entry is referred to as
> +a *chunk*. A *chunk* is either the raw fulltext of a revision or a delta
> +against a previous fulltext. In both cases, a *chunk* may be compressed.
>
>  Revlogs are written in an append-only fashion. We never need to rewrite
>  a file to insert nor do we need to remove data. Rolling back in-progress
> @@ -87,7 +88,7 @@ 0-3 (4 bytes) (rev 0 only)
>     Revlog header
>
>  0-5 (6 bytes)
> -   Absolute offset of revision data from beginning of revlog.
> +   Absolute offset of revision chunk from beginning of revlog.
>
>  6-7 (2 bytes)
>     Bit flags impacting revision behavior. The following bit offsets define:
> @@ -100,15 +101,15 @@ 6-7 (2 bytes)
>     2: REVIDX_EXTSTORED revision data is stored externally.
>
>  8-11 (4 bytes)
> -   Compressed length of revision data / chunk as stored in revlog.
> +   Compressed length of revision chunk as stored in revlog.
>
>  12-15 (4 bytes)
>     Uncompressed length of revision data. This is the size of the full
> -   revision data, not the size of the chunk post decompression.
> +   revision data (as opposed to the delta/chunk).
>
>  16-19 (4 bytes)
>     Base or previous revision this revision's delta was produced against.
> -   -1 means this revision holds full text (as opposed to a delta).
> +   -1 means this chunk holds full text (as opposed to a delta).
>     For generaldelta repos, this is the previous revision in the delta
>     chain. For non-generaldelta repos, this is the base or first
>     revision in the delta chain.
> @@ -185,16 +186,16 @@ The actual layout of revlog files on dis
>  *store format*. Typically, a ``.i`` file represents the index revlog
>  (possibly containing inline data) and a ``.d`` file holds the revision data.
>
> -Revision Entries
> -================
> +Revision Chunks
> +===============
>
> -Revision entries consist of an optional 1 byte header followed by an
> -encoding of the revision data. The headers are as follows:
> +Chunks in revision entries consist of an optional 1 byte header followed
> +by an encoding of the chunk data. The headers are as follows:
>
>  \0 (0x00)
> -   Revision data is the entirety of the entry, including this header.
> +   Chunk data is the entirety of the entry, including this header.
>  u (0x75)
> -   Raw chunk data follows.
> +   Raw chunk data follows.
>  x (0x78)
>     zlib (RFC 1950) data.
>
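
To check my reading of the chunk terminology in the quoted patch, here is a minimal Python sketch of the layout it describes. It is not Mercurial's actual code. The names parse_entry_prefix and decode_chunk are hypothetical. Big-endian byte order is an assumption, since the quoted hunks do not state it. The sketch only covers bytes 0-19 of an index entry and ignores the revlog header that overlays bytes 0-3 of rev 0.

```python
import struct
import zlib

# Bytes 0-19 of an index entry, per the field list in the quoted patch.
# Assumption: fields are big-endian (not stated in the quoted hunks).
_ENTRY_PREFIX = struct.Struct(">Qiii")


def parse_entry_prefix(entry):
    """Split bytes 0-19 of an index entry into the documented fields.

    Hypothetical helper; does not special-case rev 0, whose first
    4 bytes hold the revlog header.
    """
    offset_flags, chunk_len, data_len, base_rev = _ENTRY_PREFIX.unpack(entry[:20])
    return {
        "offset": offset_flags >> 16,    # bytes 0-5: offset of the chunk
        "flags": offset_flags & 0xFFFF,  # bytes 6-7: bit flags
        "chunk_length": chunk_len,       # bytes 8-11: chunk size as stored
        "data_length": data_len,         # bytes 12-15: size of the full revision
        "base_rev": base_rev,            # bytes 16-19: -1 means a fulltext chunk
    }


def decode_chunk(chunk):
    """Return the chunk payload (a fulltext or a delta) using its header byte."""
    if not chunk:
        return chunk  # assumption: an empty chunk means empty data
    header = chunk[:1]
    if header == b"\0":
        return chunk  # the header byte is itself part of the data
    if header == b"u":
        return chunk[1:]  # raw chunk data follows the header
    if header == b"x":
        # 'x' (0x78) is the first byte of the zlib stream, so pass it along
        return zlib.decompress(chunk)
    raise ValueError("unknown chunk header: %r" % header)
```

Note that decode_chunk only undoes the storage encoding. Whether the result is a fulltext or a delta still depends on the base revision field of the entry, which is the logical/physical distinction the patch is trying to make clearer.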