https://blogs.msdn.microsoft.com/bharry/2017/02/03/scaling-git-and-some-back-story/ is a good read about Microsoft's history with version control.

tl;dr Microsoft has created Git Virtual File System (GVFS) to allow Git to scale to millions of files and tens of gigabytes in size using the Windows equivalent of a FUSE filesystem.

The underpinnings of GVFS appear to be a File System Filter Driver [1], an HTTP server exposing Git data, and a userland daemon for communicating with the file system filter and the HTTP server. Basically, the .git directory and working directory are "virtualized" by the file system filter driver (think FUSE file system). When a Git client interacts with a "virtualized" directory, the I/O request is initially handled by the file system filter driver, which appears to pass on the request to the userland daemon. Instead of a fully distributed clone, a client lazily downloads data from a server upon initial access. Subsequent accesses are serviced locally.

Microsoft open sourced the "middleware" userland daemon, GVFS [2], which is written mostly in C#. The low-level file system filter driver is still closed source and has a restrictive EULA that says "solely for use with Microsoft Git Virtual File System (GVFS) and otherwise for your internal business purposes. You may not use the software in a live operating environment unless Microsoft permits you to do so under another agreement." [3] However, someone at Microsoft has indicated the file system filter driver may be open sourced [4].

The server-side components are part of TFS Git Server, which is closed source. However, the protocol is documented [5] and building your own server should be possible.

The custom file system filter driver (currently) requires changing low-level OS configs to enable unsigned drivers. That's a bit scary. But you are inserting a driver into the kernel. Presumably Microsoft will distribute a signed version of the driver eventually.

I'm still trying to wrap my head around what Git client modifications are necessary. Microsoft has a GVFS-compatible fork of Git at [6].
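
As an aside, the "download on first access, serve locally afterwards" behavior described above can be sketched in a few lines of Python. This is only a toy model and not GVFS code: the class name LazyObjectStore, the in-memory SERVER_OBJECTS dict standing in for the HTTP server, and the object IDs are all made up for illustration. The real thing sits behind a file system filter driver and speaks the protocol in [5].

```python
from typing import Callable, Dict


class LazyObjectStore:
    """Toy model of fetch-on-first-access storage (hypothetical, not GVFS)."""

    def __init__(self, fetch_remote: Callable[[str], bytes]) -> None:
        self._fetch_remote = fetch_remote   # stands in for the HTTP server
        self._local: Dict[str, bytes] = {}  # stands in for the local store
        self.remote_fetches = 0             # how often we hit the "server"

    def read(self, object_id: str) -> bytes:
        # First access: go to the server and keep a local copy.
        if object_id not in self._local:
            self._local[object_id] = self._fetch_remote(object_id)
            self.remote_fetches += 1
        # Every later access is serviced from the local copy.
        return self._local[object_id]


# Hypothetical server contents, keyed by made-up object IDs.
SERVER_OBJECTS = {"a1b2": b"hello\n", "c3d4": b"world\n"}


def fetch_from_server(object_id: str) -> bytes:
    return SERVER_OBJECTS[object_id]


store = LazyObjectStore(fetch_from_server)
first = store.read("a1b2")   # triggers a remote fetch
second = store.read("a1b2")  # served locally, no second fetch
```

Objects that are never read are never downloaded, which is the whole point when the repository has millions of files and a given developer touches a small fraction of them.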
In theory, you shouldn't need too many modifications of the Git client because .git is "virtualized." But I'm sure there are certain operations that needed tweaking to better handle the drastically different behavior profile of GVFS.

So, basically Microsoft extended the concepts of Git LFS (remote storage) to normal repository storage. Put in Mercurial terms, it is like remotefilelog (or a fully-implemented narrow+shallow clone) plus a virtual file system (like Google's Piper/CitC). The novel work here is Windows support (AFAIK nobody has really done a virtual file system for version control on Windows - only on Linux and Linux-like platforms).

GVFS is an impressive piece of work. While it only supports Windows with a TFS server currently, my guess is someone will eventually hack up a FUSE filesystem for use with the GVFS server protocol. And a non-TFS server implementation seems achievable.

I'll be much more excited about this if/when Microsoft open sources the file system filter driver. I've toyed around with file system filter drivers in the past and I quickly got overwhelmed because of the complexity and lack of a good reference implementation. If Microsoft's "gvflt" is open sourced, one could imagine using it as a base for writing a file system filter driver for Mercurial with an architecture similar to GVFS.

[1] https://msdn.microsoft.com/en-us/windows/hardware/drivers/ifs/introduction-to-file-system-filter-drivers
[2] https://github.com/Microsoft/GVFS
[3] https://www.nuget.org/packages/Microsoft.GVFS.GvFlt/0.17131.2-preview
[4] https://github.com/Microsoft/GVFS/issues/5
[5] https://github.com/Microsoft/GVFS/blob/master/Protocol.md
[6] https://github.com/Microsoft/git
_______________________________________________
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel