Disclaimer:  I am not a git developer, nor have I ever written
anything with FUSE. So I apologize if the following is idiotic:

I've been looking for a virtual file system (on Linux) that works with
git to make huge working directories fast.  I found that Microsoft
has written GVFS for Windows for this purpose.  I read a forum
discussion where they said a FUSE implementation was too slow and
that they had to write a full file system at the kernel level to be
fast enough.  Their web page also boasts that a checkout takes
30 seconds rather than 3 hours.  My question is: why?

If a FUSE file system were implemented so that a git checkout did
nothing more than copy the snapshot manifest locally, wouldn't that
be nearly instantaneous?  The file system could then fetch files by
the hashes in that manifest whenever one needed to be read.  Files
would only need to be stored locally if they were modified.  And
since the file system would know exactly which files were modified,
it seems that git status and commit would be fast as well.
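
To make that concrete, here is a minimal sketch of what I have in
mind, in Python against the fusepy bindings (pip install fusepy).
The manifest format and the fetch_blob() helper are invented for
illustration; fetch_blob() just stands in for "pull this object from
the remote by its hash":

#!/usr/bin/env python3
# Toy lazy-checkout file system.  Assumes fusepy; fetch_blob() is
# HYPOTHETICAL and stands in for a remote object fetch by hash.
import errno
import stat
import sys
import time

from fuse import FUSE, FuseOSError, Operations

def fetch_blob(sha1):
    # HYPOTHETICAL: fetch the blob's bytes from the remote.  A real
    # version might ask a local daemon or hit the server directly.
    raise NotImplementedError(sha1)

class LazyCheckoutFS(Operations):
    def __init__(self, manifest):
        # manifest: {name: (sha1, size)} -- the "snapshot manifest"
        # that checkout copied; flat (top-level files only) to keep
        # the sketch short.
        self.manifest = manifest
        self.blobs = {}     # sha1 -> bytes, filled on first read
        self.now = time.time()

    def getattr(self, path, fh=None):
        if path == '/':
            return dict(st_mode=stat.S_IFDIR | 0o755, st_nlink=2)
        name = path.lstrip('/')
        if name not in self.manifest:
            raise FuseOSError(errno.ENOENT)
        _, size = self.manifest[name]
        # Sizes come from the manifest, so stat() never needs file
        # contents -- directory walks stay network-free.
        return dict(st_mode=stat.S_IFREG | 0o444, st_nlink=1,
                    st_size=size, st_ctime=self.now,
                    st_mtime=self.now, st_atime=self.now)

    def readdir(self, path, fh):
        return ['.', '..'] + list(self.manifest)

    def read(self, path, size, offset, fh):
        sha1, _ = self.manifest[path.lstrip('/')]
        if sha1 not in self.blobs:      # first access: fetch lazily
            self.blobs[sha1] = fetch_blob(sha1)
        return self.blobs[sha1][offset:offset + size]

if __name__ == '__main__':
    # Toy manifest entry using git's well-known empty-blob hash.
    manifest = {'README': ('e69de29bb2d1d6434b8b29ae775ad8c2e48c5391', 0)}
    FUSE(LazyCheckoutFS(manifest), sys.argv[1], foreground=True, ro=True)

A writable version would additionally divert writes to a local
overlay and remember which paths were dirtied; that dirty set is
exactly what would make git status and commit cheap.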

Perhaps MS implemented GVFS at the kernel level because building a
big tree from scratch would be slow if every file access had to cross
into user space over and over again?  If so, then what if a build
system like Bazel (from Google) were used to always build everything
incrementally?  It too could be modified (maybe via a plugin) to ask
the file system exactly which files changed, without reading
everything.  The file system could also use Bazel's content hashes
and remote caching to provide unmodified binary content, just as it
would use git's SHA-1s to provide unmodified source content.  So when
a user did a checkout, it would appear that all the binaries had been
committed alongside the source code.  Bazel would build new binaries
only from the modified source, and only those new binaries would be
written locally.  To execute those binaries, most would be read from
the cache while the new ones would be read locally.
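
The caching part of that idea is easy to sketch: the cache key for a
build step is a content hash of its command line plus the hashes of
all its input files, so an unchanged step degenerates into a lookup.
This is only my sketch of the principle, not Bazel's actual protocol,
and the remote_cache dict below is a toy stand-in for a real remote
cache service:

#!/usr/bin/env python3
# Toy Bazel-style action cache: the key is a content hash of the
# command plus its inputs; unchanged steps become pure lookups.
import hashlib
import subprocess

remote_cache = {}   # action key -> output bytes (ASSUMED remote service)

def file_digest(path):
    with open(path, 'rb') as f:
        return hashlib.sha256(f.read()).hexdigest()

def action_key(cmd, inputs):
    h = hashlib.sha256(' '.join(cmd).encode())
    for path in sorted(inputs):            # stable, order-independent key
        h.update(file_digest(path).encode())
    return h.hexdigest()

def build(cmd, inputs, output):
    key = action_key(cmd, inputs)
    blob = remote_cache.get(key)
    if blob is None:                       # miss: actually compile
        subprocess.run(cmd, check=True)
        with open(output, 'rb') as f:
            blob = f.read()
        remote_cache[key] = blob           # publish for other clients
    else:                                  # hit: write bytes, no compile
        with open(output, 'wb') as f:
            f.write(blob)

# e.g. build(['cc', '-c', 'main.c', '-o', 'main.o'], ['main.c'], 'main.o')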

Perhaps that runtime part is the issue: executing the resulting code
was too slow due to the slower file access?  I would think that hit
would not be too bad, and would mostly be paid once at startup, but
perhaps I'm wrong.


Microsoft is full of really smart guys.  So clearly I am missing
something.  What is it?


(Sorry if I'm wasting your time)
