Re: [PATCH v7 04/12] fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files.

2017-09-21 Thread Junio C Hamano
Ben Peart  writes:

> Since the fsmonitor data can be trusted and is kept in sync with the
> working directory, the only remaining valid uses are those locations
> where we don't want to trigger an unneeded refresh_fsmonitor() call.

Now that is a lot more assuring ;-)  And the exceptions below also
make sense to me.

Thanks for thinking this through.


Re: [PATCH v7 04/12] fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files.

2017-09-21 Thread Ben Peart



On 9/20/2017 10:24 PM, Ben Peart wrote:



On 9/20/2017 10:00 PM, Junio C Hamano wrote:

Ben Peart  writes:


Pretty much the same places you would also use CE_MATCH_IGNORE_VALID
and CE_MATCH_IGNORE_SKIP_WORKTREE which serve the same role for those
features.  That is generally when you are about to overwrite data so
want to be *really* sure you have what you think you have.


Now that makes me worried gravely.

IGNORE_VALID is ignored in these places because we have been burned
by end-users lying to us.  IGNORE_SKIP_WORKTREE must be ignored
because we know that the working tree state does not match the
"reality" the index wants to have.  The fact that the code treats
the status reported and kept up to date by fsmonitor the same way as
these two implies that it is merely advisory and cannot be trusted?
Is that the reason why we tell the codepath with IGNORE_FSMONITOR to
ignore the state fsmonitor reported and check the state ourselves?



Sorry for causing unnecessary worry.  The fsmonitor data can be trusted 
(as much as you can trust that Watchman or your file system monitor is 
not buggy).  I wasn't 100% sure *why* these places passed the various 
IGNORE_VALID and IGNORE_SKIP_WORKTREE flags.  When I looked at them, 
that lack of trust seemed to be the reason.


Adding IGNORE_FSMONITOR in those same places was simply an abundance of 
caution on my part.  The only down side of passing the flag for 
fsmonitor is that we will end up calling lstat() on a file where we 
technically didn't need too.  That seemed safer than potentially missing 
a change if I had misunderstood the code.


I'd much rather return correct results (and fall back to the old 
performance) than potentially be incorrect.  I followed that same 
principal in the entire design of fsmonitor - if anything doesn't look 
right, fall back to the old code path just in case...




I spent some time with git blame/show trying to figure out the *why* for 
all the places CE_MATCH_IGNORE_* are passed without gaining a lot of 
additional understanding.  Based on your description above of why these 
exist, I believe there are very few places we actually need to pass 
CE_MATCH_IGNORE_FSMONITOR and that I was being overly cautious.


Here is a patch that removes the unnecessary CE_MATCH_IGNORE_FSMONITOR 
instances.  While the test suite passes with this change, I'm not 100% 
confident that we actually have test cases that would have detected all 
the places that we needed the CE_MATCH_IGNORE_* flags.


If this seems like a reasonable additional optimization to make, I can 
roll it into the next iteration of the patch series as I have some 
spelling, documentation changes and other tweaks as a result of all the 
feedback.



From 6ff7ed0467fd736dca73efe62391bb3ee9b4e771 Mon Sep 17 00:00:00 2001
From: Ben Peart 
Date: Thu, 21 Sep 2017 09:09:42 -0400
Subject: [PATCH] fsmonitor: remove unnecessary uses of
 CE_MATCH_IGNORE_FSMONITOR

With a better understanding of *why* the CE_MATCH_IGNORE_* flags are
used, it is now more clear they are not required in most cases where
CE_MATCH_IGNORE_FSMONITOR was being passed out of an abundance of
caution.

Since the fsmonitor data can be trusted and is kept in sync with the
working directory, the only remaining valid uses are those locations
where we don't want to trigger an unneeded refresh_fsmonitor() call.

One is where preload_index() is doing a fast precompute of state for
the bulk of the index entries but is not required for correctness as
refresh_cache_ent() will ensure any "missed" by preload_index() are
up-to-date if/when they are needed.

The second is in is_staging_gitmodules_ok() where we don't want to
trigger a complete refresh just to check the .gitignore file.

The net result of this change will be that there are more cases where
we will be able to use the cached index state and avoid unnecessary
lstat() calls.

Signed-off-by: Ben Peart 
---
 apply.c| 2 +-
 entry.c| 2 +-
 read-cache.c   | 4 ++--
 unpack-trees.c | 6 +++---
 4 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/apply.c b/apply.c
index 9061cc5f15..71cbbd141c 100644
--- a/apply.c
+++ b/apply.c
@@ -3399,7 +3399,7 @@ static int verify_index_match(const struct 
cache_entry *ce, struct stat *st)

return -1;
return 0;
}
-	return ce_match_stat(ce, st, 
CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_SKIP_WORKTREE|CE_MATCH_IGNORE_FSMONITOR);
+	return ce_match_stat(ce, st, 
CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_SKIP_WORKTREE);

 }

 #define SUBMODULE_PATCH_WITHOUT_INDEX 1
diff --git a/entry.c b/entry.c
index 5e6794f9fc..3a7b667373 100644
--- a/entry.c
+++ b/entry.c
@@ -404,7 +404,7 @@ int checkout_entry(struct cache_entry *ce,

if (!check_path(path.buf, path.len, , state->base_dir_len)) {
const struct submodule *sub;
-		unsigned changed = ce_match_stat(ce, , 

Re: [PATCH v7 04/12] fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files.

2017-09-20 Thread Ben Peart



On 9/20/2017 10:00 PM, Junio C Hamano wrote:

Ben Peart  writes:


Pretty much the same places you would also use CE_MATCH_IGNORE_VALID
and CE_MATCH_IGNORE_SKIP_WORKTREE which serve the same role for those
features.  That is generally when you are about to overwrite data so
want to be *really* sure you have what you think you have.


Now that makes me worried gravely.

IGNORE_VALID is ignored in these places because we have been burned
by end-users lying to us.  IGNORE_SKIP_WORKTREE must be ignored
because we know that the working tree state does not match the
"reality" the index wants to have.  The fact that the code treats
the status reported and kept up to date by fsmonitor the same way as
these two implies that it is merely advisory and cannot be trusted?
Is that the reason why we tell the codepath with IGNORE_FSMONITOR to
ignore the state fsmonitor reported and check the state ourselves?



Sorry for causing unnecessary worry.  The fsmonitor data can be trusted 
(as much as you can trust that Watchman or your file system monitor is 
not buggy).  I wasn't 100% sure *why* these places passed the various 
IGNORE_VALID and IGNORE_SKIP_WORKTREE flags.  When I looked at them, 
that lack of trust seemed to be the reason.


Adding IGNORE_FSMONITOR in those same places was simply an abundance of 
caution on my part.  The only down side of passing the flag for 
fsmonitor is that we will end up calling lstat() on a file where we 
technically didn't need too.  That seemed safer than potentially missing 
a change if I had misunderstood the code.


I'd much rather return correct results (and fall back to the old 
performance) than potentially be incorrect.  I followed that same 
principal in the entire design of fsmonitor - if anything doesn't look 
right, fall back to the old code path just in case...



Oh, wait...



The other place I used it was in preload_index(). In that case, I
didn't want to trigger the call to refresh_fsmonitor() as
preload_index() is about trying to do a fast precompute of state for
the bulk of the index entries but is not required for correctness as
refresh_cache_ent() will ensure any "missed" by preload_index() are
up-to-date if/when that is needed.


That is a very valid design decision.  So IGNORE_FSMONITOR is,
unlike IGNORE_VALID and IGNORE_SKIP_WORKTREE, to tell us "do not
bother asking fsmonitor to refresh the state of this entry--it is OK
for us to use a slightly stale information"?  That would make sense
as an optimization, but that does not mesh well with the previous
"we need to be really really sure" usecase.  That one wants "we do
not trust fsmonitor, so do not bother asking to refresh; we will do
so ourselves", which would not help the "we can use slightly stale
one and that is OK" usecase.

Puzzled...



Re: [PATCH v7 04/12] fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files.

2017-09-20 Thread Junio C Hamano
Ben Peart  writes:

> Pretty much the same places you would also use CE_MATCH_IGNORE_VALID
> and CE_MATCH_IGNORE_SKIP_WORKTREE which serve the same role for those
> features.  That is generally when you are about to overwrite data so
> want to be *really* sure you have what you think you have.

Now that makes me worried gravely.  

IGNORE_VALID is ignored in these places because we have been burned
by end-users lying to us.  IGNORE_SKIP_WORKTREE must be ignored
because we know that the working tree state does not match the
"reality" the index wants to have.  The fact that the code treats
the status reported and kept up to date by fsmonitor the same way as
these two implies that it is merely advisory and cannot be trusted?
Is that the reason why we tell the codepath with IGNORE_FSMONITOR to
ignore the state fsmonitor reported and check the state ourselves?

Oh, wait...


> The other place I used it was in preload_index(). In that case, I
> didn't want to trigger the call to refresh_fsmonitor() as
> preload_index() is about trying to do a fast precompute of state for
> the bulk of the index entries but is not required for correctness as
> refresh_cache_ent() will ensure any "missed" by preload_index() are
> up-to-date if/when that is needed.

That is a very valid design decision.  So IGNORE_FSMONITOR is,
unlike IGNORE_VALID and IGNORE_SKIP_WORKTREE, to tell us "do not
bother asking fsmonitor to refresh the state of this entry--it is OK
for us to use a slightly stale information"?  That would make sense
as an optimization, but that does not mesh well with the previous
"we need to be really really sure" usecase.  That one wants "we do
not trust fsmonitor, so do not bother asking to refresh; we will do
so ourselves", which would not help the "we can use slightly stale
one and that is OK" usecase.

Puzzled...


Re: [PATCH v7 04/12] fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files.

2017-09-20 Thread Ben Peart



On 9/20/2017 2:23 AM, Junio C Hamano wrote:

Ben Peart  writes:


@@ -344,6 +346,7 @@ struct index_state {
struct hashmap dir_hash;
unsigned char sha1[20];
struct untracked_cache *untracked;
+   uint64_t fsmonitor_last_update;


This field being zero has more significance than just "we haven't
got any update yet", right?  The way I am reading the code is that
setting it 0 is a way to signal that fsmon has been inactivated.  It
also made me wonder if add_fsmonitor() that silently returns without
doing anything when this field is already non-zero is a bug (in
other words, I couldn't tell what the right answer would be to a
question "shouldn't the caller be avoiding duplicate calls?").



Correct again.  For better (and sometimes for worse) I followed the 
pattern set by the untracked cache.  If you compare them, you will 
notice striking similarities. :)



diff --git a/fsmonitor.c b/fsmonitor.c
new file mode 100644
index 00..b8b2d88fe1
--- /dev/null
+++ b/fsmonitor.c
...


This part was a pleasant read.
> Thanks.



Thank you for your careful review.  I appreciate having another set of 
eyes taking a close look especially as I see this as a big first step 
towards making many git operations O(# changed files) instead of O(# 
size of working directory). Seeing status times drop from 1m22s to 1.45s 
is a huge perf win - but only if it is correct!


Re: [PATCH v7 04/12] fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files.

2017-09-20 Thread Ben Peart



On 9/19/2017 10:28 PM, Junio C Hamano wrote:

Ben Peart  writes:


+/* do stat comparison even if CE_FSMONITOR_VALID is true */
+#define CE_MATCH_IGNORE_FSMONITOR 0X20


Hmm, when should a programmer use this flag?



Pretty much the same places you would also use CE_MATCH_IGNORE_VALID and 
CE_MATCH_IGNORE_SKIP_WORKTREE which serve the same role for those 
features.  That is generally when you are about to overwrite data so 
want to be *really* sure you have what you think you have.


The other place I used it was in preload_index(). In that case, I didn't 
want to trigger the call to refresh_fsmonitor() as preload_index() is 
about trying to do a fast precompute of state for the bulk of the index 
entries but is not required for correctness as refresh_cache_ent() will 
ensure any "missed" by preload_index() are up-to-date if/when that is 
needed.



+int git_config_get_fsmonitor(void)
+{
+   if (git_config_get_pathname("core.fsmonitor", _fsmonitor))
+   core_fsmonitor = getenv("GIT_FSMONITOR_TEST");


Will the environment be part of the published API, or is it a
remnant of a useful tool for debugging while developing the feature?

If it is the former (and I'd say why not, even though "git -c
core.fsmontor=..." may be easy enough), we might want to rename it
to replace _TEST with _PROGRAM or something and document it.



This was added this to facilitate testing.  That is why it has the magic 
naming of "GIT_***_TEST" which is the only way I found to ensure that 
env variable gets passed to tests.  Its use is discussed in patch 10 
which contains the tests.



diff --git a/diff-lib.c b/diff-lib.c
index 2a52b07954..23c6d03ca9 100644
--- a/diff-lib.c
+++ b/diff-lib.c
@@ -12,6 +12,7 @@
  #include "refs.h"
  #include "submodule.h"
  #include "dir.h"
+#include "fsmonitor.h"
  
  /*

   * diff-files
@@ -228,6 +229,7 @@ int run_diff_files(struct rev_info *revs, unsigned int 
option)
  
  		if (!changed && !dirty_submodule) {

ce_mark_uptodate(ce);
+   mark_fsmonitor_valid(ce);


I notice all calls to mark_fsmonitor_valid(ce) always follow a call
to ce_mark_uptodate(ce).  Should the call to the former be made as
part of the latter instead?  Are there cases where we want to call
ce_mark_uptodate(ce) to mark the ce up-to-date, yet we do not want
to call mark_fsmonitor_valid(ce)?  How does a programmer tell when
s/he wants to call ce_mark_uptodate(ce) if s/he also should call
mark_fsmonitor_valid(ce)?


mark_fsmonitor_valid(ce) is the way to indicate that cache entries that 
were once fsmonitor dirty are now properly reflected in the index so can 
come off the "dirty" list.  It can't really be combined with 
ce_mark_uptodate(ce) as that would prevent the CE_MATCH_IGNORE_FSMONITOR 
logic:


if (!ignore_skip_worktree && ce_skip_worktree(ce)) {
ce_mark_uptodate(ce);
return ce;
}
if (!ignore_valid && (ce->ce_flags & CE_VALID)) {
ce_mark_uptodate(ce);
return ce;
}
if (!ignore_fsmonitor && (ce->ce_flags & CE_FSMONITOR_VALID)) {
ce_mark_uptodate(ce);
return ce;
}

In addition, fsmonitor is an optional feature and so the 
mark_fsmonitor_valid(ce) call should only happen when the feature is 
turned on. I tried to keep it as simple as possible by making that test 
and set logic part of the function but still be performant by making the 
function inline.




Together with "when to pass IGNORE_FSMONITOR" question, is there a
summary that guides new programmers to answer these questions in the
new documentation?



Only the discussion in this mail thread.  I could add something to the 
function header in fsmonitor.h if that would help.  How about something 
like:


diff --git a/fsmonitor.h b/fsmonitor.h
index c2240b811a..03bf3efe61 100644
--- a/fsmonitor.h
+++ b/fsmonitor.h
@@ -34,9 +34,11 @@ extern void tweak_fsmonitor(struct index_state *istate);
  */
 extern void refresh_fsmonitor(struct index_state *istate);

-/*
- * Set the given cache entries CE_FSMONITOR_VALID bit.
- */
+/*
+ * Set the given cache entries CE_FSMONITOR_VALID bit. This should be
+ * called any time the cache entry has been updated to reflect the
+ * current state of the file on disk.
+ */
 static inline void mark_fsmonitor_valid(struct cache_entry *ce)
 {
if (core_fsmonitor) {
@@ -46,8 +48,11 @@ static inline void mark_fsmonitor_valid(struct 
cache_entry *ce)

 }

 /*
- * Clear the given cache entry's CE_FSMONITOR_VALID bit and invalidate any
- * corresponding untracked cache directory structures.
+ * Clear the given cache entry's CE_FSMONITOR_VALID bit and invalidate
+ * any corresponding untracked cache directory structures. This should
+ * be called any time git creates or modifies a file that should
+ * trigger an lstat() or invalidate the untracked cache for the
+ * corresponding directory
  */
 static inline void 

Re: [PATCH v7 04/12] fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files.

2017-09-20 Thread Junio C Hamano
Ben Peart  writes:

> @@ -344,6 +346,7 @@ struct index_state {
>   struct hashmap dir_hash;
>   unsigned char sha1[20];
>   struct untracked_cache *untracked;
> + uint64_t fsmonitor_last_update;

This field being zero has more significance than just "we haven't
got any update yet", right?  The way I am reading the code is that
setting it 0 is a way to signal that fsmon has been inactivated.  It
also made me wonder if add_fsmonitor() that silently returns without
doing anything when this field is already non-zero is a bug (in
other words, I couldn't tell what the right answer would be to a
question "shouldn't the caller be avoiding duplicate calls?").

> diff --git a/fsmonitor.c b/fsmonitor.c
> new file mode 100644
> index 00..b8b2d88fe1
> --- /dev/null
> +++ b/fsmonitor.c
> ...

This part was a pleasant read.

Thanks.


Re: [PATCH v7 04/12] fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files.

2017-09-19 Thread Junio C Hamano
Ben Peart  writes:

> +/* do stat comparison even if CE_FSMONITOR_VALID is true */
> +#define CE_MATCH_IGNORE_FSMONITOR 0X20

Hmm, when should a programmer use this flag?

> +int git_config_get_fsmonitor(void)
> +{
> + if (git_config_get_pathname("core.fsmonitor", _fsmonitor))
> + core_fsmonitor = getenv("GIT_FSMONITOR_TEST");

Will the environment be part of the published API, or is it a
remnant of a useful tool for debugging while developing the feature?

If it is the former (and I'd say why not, even though "git -c
core.fsmontor=..." may be easy enough), we might want to rename it
to replace _TEST with _PROGRAM or something and document it.

> diff --git a/diff-lib.c b/diff-lib.c
> index 2a52b07954..23c6d03ca9 100644
> --- a/diff-lib.c
> +++ b/diff-lib.c
> @@ -12,6 +12,7 @@
>  #include "refs.h"
>  #include "submodule.h"
>  #include "dir.h"
> +#include "fsmonitor.h"
>  
>  /*
>   * diff-files
> @@ -228,6 +229,7 @@ int run_diff_files(struct rev_info *revs, unsigned int 
> option)
>  
>   if (!changed && !dirty_submodule) {
>   ce_mark_uptodate(ce);
> + mark_fsmonitor_valid(ce);

I notice all calls to mark_fsmonitor_valid(ce) always follow a call
to ce_mark_uptodate(ce).  Should the call to the former be made as
part of the latter instead?  Are there cases where we want to call
ce_mark_uptodate(ce) to mark the ce up-to-date, yet we do not want
to call mark_fsmonitor_valid(ce)?  How does a programmer tell when
s/he wants to call ce_mark_uptodate(ce) if s/he also should call
mark_fsmonitor_valid(ce)?

Together with "when to pass IGNORE_FSMONITOR" question, is there a
summary that guides new programmers to answer these questions in the
new documentation?

> diff --git a/dir.h b/dir.h
> index e3717055d1..fab8fc1561 100644
> --- a/dir.h
> +++ b/dir.h
> @@ -139,6 +139,8 @@ struct untracked_cache {
>   int gitignore_invalidated;
>   int dir_invalidated;
>   int dir_opened;
> + /* fsmonitor invalidation data */
> + unsigned int use_fsmonitor : 1;

This makes it look like we will add a bit more fields in later
patches that are about "invalidation" around fsmonitor, but it
appears that such an addition never happens til the end of the
series.  And use_fsmonitor boolean does not seem to be what the
comment says---it just tells us if fsmonitor is in use in the
operation of the untracked cache, no?

> diff --git a/entry.c b/entry.c
> index cb291aa88b..5e6794f9fc 100644
> --- a/entry.c
> +++ b/entry.c
> @@ -4,6 +4,7 @@
>  #include "streaming.h"
>  #include "submodule.h"
>  #include "progress.h"
> +#include "fsmonitor.h"
>  
>  static void create_directories(const char *path, int path_len,
>  const struct checkout *state)
> @@ -357,6 +358,7 @@ static int write_entry(struct cache_entry *ce,
>   lstat(ce->name, );
>   fill_stat_cache_info(ce, );
>   ce->ce_flags |= CE_UPDATE_IN_BASE;
> + mark_fsmonitor_invalid(state->istate, ce);
>   state->istate->cache_changed |= CE_ENTRY_CHANGED;

Similar to "how does mark_fsmonitor_valid() relate to
ce_mark_uptodate()?" question earlier, mark_fsmonitor_invalid()
seems to often appear in pairs with the update to cache_changed.
Are there cases where we need CE_ENTRY_CHANGED bit added to the
index state yet we do not want to call mark_fsmonitor_invalid()?  I
am wondering if there should be a single helper function that lets
callers say "I changed this ce in this istate this way.  Please
update CE_VALID, CE_UPDATE_IN_BASE and CE_FSMONITOR_VALID
accordingly".

That "changed _this way_" is deliberately vague in my question
above, because it is not yet clear to me when mark-valid and
mark-invalid should and should not be called from the series.

> + /* a fsmonitor process can return '*' to indicate all entries are 
> invalid */
> + if (query_success && query_result.buf[0] != '/') {
> + /* Mark all entries returned by the monitor as dirty */

The comment talks about '*' and code checks if it is not '/'?  The
else clause of this if/else handles the "invalidate all" case, so
shouldn't we be comparing with '*' instead here?

Ah, fsmonitor-watchman section of the doc tells us to write the hook
in a way to return slash, so the comment above the code is stale and
the comparison with '/' is what we want, I guess.


[PATCH v7 04/12] fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files.

2017-09-19 Thread Ben Peart
When the index is read from disk, the fsmonitor index extension is used
to flag the last known potentially dirty index entries. The registered
core.fsmonitor command is called with the time the index was last
updated and returns the list of files changed since that time. This list
is used to flag any additional dirty cache entries and untracked cache
directories.

We can then use this valid state to speed up preload_index(),
ie_match_stat(), and refresh_cache_ent() as they do not need to lstat()
files to detect potential changes for those entries marked
CE_FSMONITOR_VALID.

In addition, if the untracked cache is turned on valid_cached_dir() can
skip checking directories for new or changed files as fsmonitor will
invalidate the cache only for those directories that have been
identified as having potential changes.

To keep the CE_FSMONITOR_VALID state accurate during git operations;
when git updates a cache entry to match the current state on disk,
it will now set the CE_FSMONITOR_VALID bit.

Inversely, anytime git changes a cache entry, the CE_FSMONITOR_VALID bit
is cleared and the corresponding untracked cache directory is marked
invalid.

Signed-off-by: Ben Peart 
---
 Makefile   |   1 +
 apply.c|   2 +-
 builtin/update-index.c |   2 +
 cache.h|  10 +-
 config.c   |  14 +++
 config.h   |   1 +
 diff-lib.c |   2 +
 dir.c  |  27 --
 dir.h  |   2 +
 entry.c|   4 +-
 environment.c  |   1 +
 fsmonitor.c| 253 +
 fsmonitor.h|  61 
 preload-index.c|   6 +-
 read-cache.c   |  49 --
 submodule.c|   2 +-
 unpack-trees.c |   8 +-
 17 files changed, 419 insertions(+), 26 deletions(-)
 create mode 100644 fsmonitor.c
 create mode 100644 fsmonitor.h

diff --git a/Makefile b/Makefile
index f2bb7f2f63..9d6ec9c1e9 100644
--- a/Makefile
+++ b/Makefile
@@ -786,6 +786,7 @@ LIB_OBJS += ewah/ewah_rlw.o
 LIB_OBJS += exec_cmd.o
 LIB_OBJS += fetch-pack.o
 LIB_OBJS += fsck.o
+LIB_OBJS += fsmonitor.o
 LIB_OBJS += gettext.o
 LIB_OBJS += gpg-interface.o
 LIB_OBJS += graph.o
diff --git a/apply.c b/apply.c
index 71cbbd141c..9061cc5f15 100644
--- a/apply.c
+++ b/apply.c
@@ -3399,7 +3399,7 @@ static int verify_index_match(const struct cache_entry 
*ce, struct stat *st)
return -1;
return 0;
}
-   return ce_match_stat(ce, st, 
CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_SKIP_WORKTREE);
+   return ce_match_stat(ce, st, 
CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_SKIP_WORKTREE|CE_MATCH_IGNORE_FSMONITOR);
 }
 
 #define SUBMODULE_PATCH_WITHOUT_INDEX 1
diff --git a/builtin/update-index.c b/builtin/update-index.c
index e1ca0759d5..6f39ee9274 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -16,6 +16,7 @@
 #include "pathspec.h"
 #include "dir.h"
 #include "split-index.h"
+#include "fsmonitor.h"
 
 /*
  * Default to not allowing changes to the list of files. The
@@ -233,6 +234,7 @@ static int mark_ce_flags(const char *path, int flag, int 
mark)
else
active_cache[pos]->ce_flags &= ~flag;
active_cache[pos]->ce_flags |= CE_UPDATE_IN_BASE;
+   mark_fsmonitor_invalid(_index, active_cache[pos]);
cache_tree_invalidate_path(_index, path);
active_cache_changed |= CE_ENTRY_CHANGED;
return 0;
diff --git a/cache.h b/cache.h
index a916bc79e3..eccab968bd 100644
--- a/cache.h
+++ b/cache.h
@@ -203,6 +203,7 @@ struct cache_entry {
 #define CE_ADDED (1 << 19)
 
 #define CE_HASHED(1 << 20)
+#define CE_FSMONITOR_VALID   (1 << 21)
 #define CE_WT_REMOVE (1 << 22) /* remove in work directory */
 #define CE_CONFLICTED(1 << 23)
 
@@ -326,6 +327,7 @@ static inline unsigned int canon_mode(unsigned int mode)
 #define CACHE_TREE_CHANGED (1 << 5)
 #define SPLIT_INDEX_ORDERED(1 << 6)
 #define UNTRACKED_CHANGED  (1 << 7)
+#define FSMONITOR_CHANGED  (1 << 8)
 
 struct split_index;
 struct untracked_cache;
@@ -344,6 +346,7 @@ struct index_state {
struct hashmap dir_hash;
unsigned char sha1[20];
struct untracked_cache *untracked;
+   uint64_t fsmonitor_last_update;
 };
 
 extern struct index_state the_index;
@@ -679,8 +682,10 @@ extern void *read_blob_data_from_index(const struct 
index_state *, const char *,
 #define CE_MATCH_IGNORE_MISSING0x08
 /* enable stat refresh */
 #define CE_MATCH_REFRESH   0x10
-extern int ie_match_stat(const struct index_state *, const struct cache_entry 
*, struct stat *, unsigned int);
-extern int ie_modified(const struct index_state *, const struct cache_entry *, 
struct stat *, unsigned int);
+/* do stat comparison even if CE_FSMONITOR_VALID is true */
+#define