David Turner <dtur...@twopensource.com> writes:

> From: Nguyễn Thái Ngọc Duy <pclo...@gmail.com>
>
> Instead of reading the index from disk and worrying about disk
> corruption, the index is cached in memory (memory bit-flips happen
> too, but hopefully less often). The result is a faster read: read time
> is reduced by 70%.
>
> The biggest gain is not having to verify the trailing SHA-1, which
> takes lots of time especially on large index files. But this also
> opens doors for further optimizations:
>
>  - we could create an in-memory format that's essentially the memory
>    dump of the index to eliminate most of parsing/allocation
>    overhead. The mmap'd memory can be used straight away. Experiment
>    [1] shows we could reduce read time by 88%.
>
>  - we could cache non-index info such as name hash
>
> The shared memory's name follows the template "git-<object>-<SHA1>"
> where <SHA1> is the trailing SHA-1 of the index file. <object> is
> "index" for cached index files (and may be "name-hash" for name-hash
> cache). If such shared memory exists, it contains the same index
> content as on disk. The content is already validated by the daemon and
> git won't validate it again (except comparing the trailing SHA-1s).
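To make the naming and the "compare trailing SHA-1s" step concrete, here
is a minimal sketch in C. It is not the code from the patch: shm_name(),
trailing_sha1() and cache_matches() are hypothetical helpers, and the
sketch assumes only what the description says, namely that the trailing
SHA-1 is the last 20 bytes of the index data and that it is spelled as
lowercase hex in the name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define SHA1_RAWSZ 20 /* size of a raw SHA-1 in bytes */

/*
 * Hypothetical helper: format "git-<object>-<hex SHA-1>" into buf.
 * Returns 0 on success, -1 if buf is too small.
 */
static int shm_name(char *buf, size_t len, const char *object,
		    const unsigned char *sha1)
{
	static const char hex[] = "0123456789abcdef";
	int i, n = snprintf(buf, len, "git-%s-", object);

	if (n < 0 || (size_t)n + 2 * SHA1_RAWSZ + 1 > len)
		return -1;
	for (i = 0; i < SHA1_RAWSZ; i++) {
		buf[n++] = hex[sha1[i] >> 4];
		buf[n++] = hex[sha1[i] & 0xf];
	}
	buf[n] = '\0';
	return 0;
}

/* The trailing SHA-1 is assumed to be the last 20 bytes of the data. */
static const unsigned char *trailing_sha1(const unsigned char *data,
					  size_t size)
{
	return size < SHA1_RAWSZ ? NULL : data + size - SHA1_RAWSZ;
}

/*
 * The cached copy is trusted only if its trailing SHA-1 equals the
 * one on disk; the checksum itself is not recomputed here.
 */
static int cache_matches(const unsigned char *disk, size_t disk_size,
			 const unsigned char *shm, size_t shm_size)
{
	const unsigned char *a = trailing_sha1(disk, disk_size);
	const unsigned char *b = trailing_sha1(shm, shm_size);

	return a && b && !memcmp(a, b, SHA1_RAWSZ);
}
```

A reader would call shm_name(buf, sizeof(buf), "index", sha1) to find the
segment to attach to, and fall back to the on-disk file whenever the
segment is missing or cache_matches() returns 0.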

This is indeed an interesting approach.  What is not explained, but
must be, is when the on-disk index gets updated to reflect reality
(if I am reading the explanation and the code right, while the
daemon is running its in-core cache becomes the source of truth,
because everybody's read-index-from() is forced to go to the daemon).
The explanation could be "this is only for the read side; updating the
index still happens via the traditional 'write a new file and rename
it to the final place' codepath, at which time the daemon must be told
to re-read it."
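
To illustrate the kind of explanation I have in mind, a rough sketch of
such a write side could look like the following.  This is an assumption
on my part, not what the patch does: update_index_file() is a made-up
name, the "<path>.lock" temporary merely mirrors the usual lockfile
convention, and how the daemon is actually told to re-read is left open,
so it is modelled as a caller-supplied callback (count_refresh() is only
a stand-in for it).

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical hook: the real notification mechanism is not specified. */
typedef void (*refresh_fn)(const char *path);

/* Stand-in notifier that only counts how often a refresh was requested. */
static int refresh_requests;
static void count_refresh(const char *path)
{
	(void)path;
	refresh_requests++;
}

/*
 * Write the new index to "<path>.lock", rename it into place, and only
 * then tell the daemon to re-read.  Returns 0 on success, -1 on error.
 */
static int update_index_file(const char *path, const void *data,
			     size_t len, refresh_fn notify)
{
	char lock[4096];
	const char *p = data;
	int fd, n = snprintf(lock, sizeof(lock), "%s.lock", path);

	if (n < 0 || (size_t)n >= sizeof(lock))
		return -1;
	/* O_EXCL makes a concurrent writer fail instead of clobbering us */
	fd = open(lock, O_WRONLY | O_CREAT | O_EXCL, 0666);
	if (fd < 0)
		return -1;
	while (len) {
		ssize_t w = write(fd, p, len);
		if (w <= 0) {
			close(fd);
			unlink(lock);
			return -1;
		}
		p += w;
		len -= (size_t)w;
	}
	if (close(fd) < 0 || rename(lock, path) < 0) {
		unlink(lock);
		return -1;
	}
	/* readers may now see the new file; the cache must catch up */
	if (notify)
		notify(path);
	return 0;
}
```

The point of the ordering is that the daemon is notified only after the
rename, so it never caches a half-written file; between the rename and
the re-read, the trailing SHA-1 comparison is what keeps readers from
trusting a stale cache.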