On Thu, May 18, 2017 at 10:13 PM, Ben Peart <peart...@gmail.com> wrote:
> This includes the core.fsmonitor setting, the query-fsmonitor hook,
> and the fsmonitor index extension.
>
> Signed-off-by: Ben Peart <benpe...@microsoft.com>
> ---
>  Documentation/config.txt                 |  7 +++++++
>  Documentation/githooks.txt               | 23 +++++++++++++++++++++++
>  Documentation/technical/index-format.txt | 18 ++++++++++++++++++
>  3 files changed, 48 insertions(+)
>
> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index 96e9cf8b73..4ffbf0d4c2 100644
> --- a/Documentation/config.txt
> +++ b/Documentation/config.txt
> @@ -389,6 +389,13 @@ core.protectNTFS::
>         8.3 "short" names.
>         Defaults to `true` on Windows, and `false` elsewhere.
>
> +core.fsmonitor::
> +       If set to true, call the query-fsmonitor hook proc which will
> +       identify all files that may have had changes since the last
> +       request. This information is used to speed up operations like
> +       'git commit' and 'git status' by limiting what git must scan to
> +       detect changes.
> +
>  core.trustctime::
>         If false, the ctime differences between the index and the
>         working tree are ignored; useful when the inode change time
> diff --git a/Documentation/githooks.txt b/Documentation/githooks.txt
> index 706091a569..f7b4b4a844 100644
> --- a/Documentation/githooks.txt
> +++ b/Documentation/githooks.txt
> @@ -448,6 +448,29 @@ The commits are guaranteed to be listed in the order that they were
>  processed by rebase.
>
>
> +[[query-fsmonitor]]
> +query-fsmonitor
> +~~~~~~~~~~~~
> +
> +This hook is invoked when the configuration option core.fsmonitor is
> +set and git needs to identify changed or untracked files.  It takes
> +a single argument which is the time in elapsed seconds since midnight,
> +January 1, 1970.
> +
> +The hook should output to stdout the list of all files in the working
> +directory that may have changed since the requested time.  The logic
> +should be inclusive so that it does not miss any potential changes.
> +The paths should be relative to the root of the working directory
> +and be separated by a single NUL.
> +
> +Git will limit what files it checks for changes as well as which
> +directories are checked for untracked files based on the path names
> +given.
> +
> +The exit status determines whether git will use the data from the
> +hook to limit its search.  On error, it will fall back to verifying
> +all files and folders.
> +
>  GIT
>  ---
>  Part of the linkgit:git[1] suite
> diff --git a/Documentation/technical/index-format.txt b/Documentation/technical/index-format.txt
> index ade0b0c445..b002d23c05 100644
> --- a/Documentation/technical/index-format.txt
> +++ b/Documentation/technical/index-format.txt
> @@ -295,3 +295,21 @@ The remaining data of each directory block is grouped by type:
>      in the previous ewah bitmap.
>
>    - One NUL.
> +
> +== File System Monitor cache
> +
> +  The file system monitor cache tracks files for which the query-fsmonitor
> +  hook has told us about changes.  The signature for this extension is
> +  { 'F', 'S', 'M', 'N' }.
> +
> +  The extension starts with
> +
> +  - 32-bit version number: the current supported version is 1.
> +
> +  - 64-bit time: the extension data reflects all changes through the given
> +       time which is stored as the seconds elapsed since midnight, January 1, 1970.
> +
> +  - 32-bit bitmap size: the size of the CE_FSMONITOR_DIRTY bitmap.
> +
> +  - An ewah bitmap, the n-th bit indicates whether the n-th index entry
> +    is CE_FSMONITOR_DIRTY.
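
To make the layout above concrete, here is a rough sketch of how a reader might decode the fixed-size part of the extension. This is not code from the patch: the struct and function names are made up for illustration, and it assumes the fields are stored in network byte order like the other binary numbers in the index format. It stops before the ewah bitmap and does not interpret the bitmap size beyond reading it.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory form of the header fields described above. */
struct fsmonitor_header {
	uint32_t version;      /* 32-bit version number, currently 1 */
	uint64_t last_update;  /* 64-bit time, seconds since the epoch */
	uint32_t bitmap_size;  /* 32-bit size of the CE_FSMONITOR_DIRTY bitmap */
};

/* Read a big-endian 32-bit value (assumed byte order, see above). */
static uint32_t read_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Read a big-endian 64-bit value. */
static uint64_t read_be64(const unsigned char *p)
{
	return ((uint64_t)read_be32(p) << 32) | read_be32(p + 4);
}

/*
 * Decode the 16 fixed bytes at the start of the extension data.
 * Returns 0 on success, -1 if the data is too short or the version
 * is not the one documented above.
 */
int parse_fsmonitor_header(const unsigned char *data, size_t len,
			   struct fsmonitor_header *out)
{
	if (len < 16)
		return -1;
	out->version = read_be32(data);
	if (out->version != 1)
		return -1;
	out->last_update = read_be64(data + 4);
	out->bitmap_size = read_be32(data + 12);
	return 0;
}
```

The ewah bitmap would start at offset 16 in this sketch.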

We already use a uint64_t for nanosecond-resolution time in one place
in the codebase (getnanotime), and a number of filesystems already
support nanosecond timestamps (ext4, Apple's new APFS, ...).

I don't know whether any of the inotify/fsmonitor APIs support that
yet, but if not, it seems inevitable that it will be added. In some
pathological cases a lot of files can be modified within one second,
so nanosecond accuracy means there would be a lot less data to
consider in those cases.

It does mean this'll only work until the year ~2500, but that seems
like an acceptable trade-off.
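
For what it's worth, the back-of-the-envelope arithmetic behind that estimate can be sketched as below. The function names are made up for illustration, and it assumes an unsigned 64-bit counter of nanoseconds since 1970 and 365.25-day years; a signed counter would cover roughly half the range.

```c
#include <stdint.h>

/*
 * Whole years representable by an unsigned 64-bit nanosecond counter,
 * using 365.25-day years (an approximation, ignoring leap seconds).
 */
uint64_t ns_range_years(void)
{
	const uint64_t ns_per_sec = 1000000000ULL;
	const uint64_t sec_per_year = 31557600ULL; /* 365.25 * 86400 */

	return UINT64_MAX / ns_per_sec / sec_per_year;
}

/* Approximate last year covered when counting from 1970. */
uint64_t ns_last_year(void)
{
	return 1970 + ns_range_years();
}
```

That works out to about 584 years, i.e. somewhere in the 2550s.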
