Re: [PATCH 0/3] Win32: nanosecond-precision file times
Am 16.02.2015 um 23:10 schrieb Junio C Hamano: > Karsten Blees writes: > >> However, the Makefile has this to say on the subject: >> >> # Define USE_NSEC below if you want git to care about sub-second file mtimes >> # and ctimes. Note that you need recent glibc (at least 2.2.4) for this, and >> # it will BREAK YOUR LOCAL DIFFS! show-diff and anything using it will likely >> # randomly break unless your underlying filesystem supports those sub-second >> # times (my ext3 doesn't). >> >> Am I missing something? > [...] > > If you use NSEC, however, and "refresh" grabbed a subsecond time and > then later "diff-files" learned a truncated/rounded time because the > filesystem later purged the cached inodes and re-read it from the > underlying filesystem with no subsecond time resolution, the times > would not match so you will again see "diff-files" report that "foo" > is now different. > > That is what the comment you cited is about. > OK, so it all boils down to the "inode cache doesn't round to on-disk resolution" issue after all, as explained in racy-git.txt. But then the Makefile comment is quite misleading. Enabling USE_NSEC will not unconditionally "BREAK YOUR LOCAL DIFFS". Show-diff / diff-files will also not "break", but may report false positives instead (which may be worse than failing hard...). It also seems to me that this is a Linux-only issue which is only remotely related to the USE_NSEC setting or file systems' timestamp resolutions. The kernel patch referenced in racy-git.txt only addresses sub-second resolutions. So even if USE_NSEC is *disabled*, the diff-files issue will bite you on e.g. FAT32-formatted flash-drives on Linux, at least on re-mount ("sync && echo 2>/proc/sys/vm/drop_caches" didn't seem to trigger the rounding, though). I also suspect that the sub-second rounding function of that patch (timespec_trunc()) takes some invalid shortcuts - if you configure the kernel for 300 jiffies per second (i.e. 3,333,333 ns per tick), UDF, NTFS, SMBFS and CIFS file times will most likely not be properly rounded in the inode cache. Haven't tested this, though. So the only file systems with reliable file times on Linux seem to be those with exactly 1s or 1ns resolution...? -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/3] Win32: nanosecond-precision file times
Karsten Blees writes: > However, the Makefile has this to say on the subject: > > # Define USE_NSEC below if you want git to care about sub-second file mtimes > # and ctimes. Note that you need recent glibc (at least 2.2.4) for this, and > # it will BREAK YOUR LOCAL DIFFS! show-diff and anything using it will likely > # randomly break unless your underlying filesystem supports those sub-second > # times (my ext3 doesn't). > > Am I missing something? I think "it would break" is about show-diff which wanted to use the cached stat information for freshness. >foo git update-index --add foo sleep 2 >foo git diff-files ;# modern counterpart of show-diff would say that "foo" is *different*, because the plumbing commands like diff-files expect you to refresh the index before you call them. And if you did "git update-index --refresh" after touching "foo" the last time before running "git diff-files" in the above sequence, you should expect that it does not say "foo" is different, no matter how much time passes between the time you run that "refresh" and "diff-files" (or between the time you last touched "foo" and you run "refresh", for that matter), as long as you do not touch "foo" in the meantime. The following should say "foo" is *not* different, that is: >foo git update-index --add foo sleep 2 >foo sleep arbitrary git update-index --refresh sleep arbitrary git diff-files ;# modern counterpart of show-diff If you use NSEC, however, and "refresh" grabbed a subsecond time and then later "diff-files" learned a truncated/rounded time because the filesystem later purged the cached inodes and re-read it from the underlying filesystem with no subsecond time resolution, the times would not match so you will again see "diff-files" report that "foo" is now different. That is what the comment you cited is about. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/3] Win32: nanosecond-precision file times
Am 13.02.2015 um 20:28 schrieb Junio C Hamano: > Karsten Blees writes: > >> Am 13.02.2015 um 00:38 schrieb Junio C Hamano: >>> >>> We do have sec/nsec fields in cache_time structure, so I have >>> nothing against updating the msysGit port to fill that value. > > Having said that, we do not enable the NSEC stuff by default on Unix > for a reason. I'd expect those who know Windows filesystems well to > pick the default there wisely ;-) > Now I'm a bit confused about the discrepancy between racy-git.txt and the Makefile. Racy-git.txt explains that the nsec-part may be dropped when an inode is flushed to disk if the file system doesn't support nsec resolution. This was supposedly an issue with the Linux kernel fixed back in 2005. In my understanding, this means that git would see the file as changed and re-check the content (i.e. it will hurt performance). IOW: Git may be slow if the file system cache has better file time resolution than the on-disk file system representation. However, the Makefile has this to say on the subject: # Define USE_NSEC below if you want git to care about sub-second file mtimes # and ctimes. Note that you need recent glibc (at least 2.2.4) for this, and # it will BREAK YOUR LOCAL DIFFS! show-diff and anything using it will likely # randomly break unless your underlying filesystem supports those sub-second # times (my ext3 doesn't). Am I missing something? Is there anything in Git that will actually "break" with USE_NSEC if the OS / file system doesn't support it (rather than just being slow)? History: * The Makefile comment was added in 2005 (bdd4da59), along with a comment in read-cache.c explaining the issue (i.e. flushing to disk will clear the nsec field). * The comment in read-cache.c was removed in 2008 (7a51ed66), seemingly dropping USE_NSEC support entirely. * USE_NSEC support was re-added (without the read-cache.c comment) in 2009 (fba2f38a). Regarding the Windows situation: I've just verified (on my Win7 x64 box) that file times obtained through a variety of APIs (GetFileTime, GetFileAttributesEx, GetFileInformationByHandle, FindFirstFile) are consistent and properly rounded to the file system's resolution (e.g. 10ms / 2s for FAT). This is even if the file is still open and I try to SetFileTime() to unrounded values. So I think enabling USE_NSEC should be fine on Windows. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/3] Win32: nanosecond-precision file times
Karsten Blees writes: > Am 13.02.2015 um 00:38 schrieb Junio C Hamano: >> >> We do have sec/nsec fields in cache_time structure, so I have >> nothing against updating the msysGit port to fill that value. Having said that, we do not enable the NSEC stuff by default on Unix for a reason. I'd expect those who know Windows filesystems well to pick the default there wisely ;-) -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/3] Win32: nanosecond-precision file times
Am 13.02.2015 um 00:38 schrieb Junio C Hamano: > Karsten Blees writes: > >> This is more about copying 'old' things around, which usually also >> copies mtime on Windows. E.g.: >> >> # create two files with slightly different mtime >> for i in {1..10}; do (echo "v1" >> test); done && >> for i in {1..10}; do (echo "v2" >> test2); done >> # wait a bit so that '.git/index' is always newer than 'test' / 'test2' >> sleep 1 >> git add test >> git commit -m v1 >> # copy test2 over test (similar to 'cp -p', but native 'copy' also >> # copies mtime nanoseconds) >> cmd //c "copy /y test2 test" >> git add test >> git commit -m v2 >> >> Without these patches, git does not detect the change, and the second >> git add / git commit are noops. > > We do have sec/nsec fields in cache_time structure, so I have > nothing against updating the msysGit port to fill that value. > > I was and am just reacting to the fact that this is sold as if it > "fixes" something. Sorry, that must have been a misunderstanding. This series does NOT fix the problem with VSS2Git, nor any other tool that abuses mtime for the author's birthday or whatever. The issue that two files may accidentally have the same size and mtime was just brought up in this discussion. > It doesn't fundamentally change the fact that > mtime that does not follow the semantics Dscho mentioned in his > earlier message does not work well with Git. > > Having said that, even with such a patch, as long as the system is > sufficiently fast, test and test2 will have nonoseconds identical > timestamp and you would have the same issue, no? > Right. Where "sufficiently fast" would mean opening and closing a file ten times in less than 100ns...on Windows... ;-) -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/3] Win32: nanosecond-precision file times
Karsten Blees writes: > This is more about copying 'old' things around, which usually also > copies mtime on Windows. E.g.: > > # create two files with slightly different mtime > for i in {1..10}; do (echo "v1" >> test); done && > for i in {1..10}; do (echo "v2" >> test2); done > # wait a bit so that '.git/index' is always newer than 'test' / 'test2' > sleep 1 > git add test > git commit -m v1 > # copy test2 over test (similar to 'cp -p', but native 'copy' also > # copies mtime nanoseconds) > cmd //c "copy /y test2 test" > git add test > git commit -m v2 > > Without these patches, git does not detect the change, and the second > git add / git commit are noops. We do have sec/nsec fields in cache_time structure, so I have nothing against updating the msysGit port to fill that value. I was and am just reacting to the fact that this is sold as if it "fixes" something. It doesn't fundamentally change the fact that mtime that does not follow the semantics Dscho mentioned in his earlier message does not work well with Git. Having said that, even with such a patch, as long as the system is sufficiently fast, test and test2 will have nonoseconds identical timestamp and you would have the same issue, no? -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/3] Win32: nanosecond-precision file times
Am 12.02.2015 um 20:48 schrieb Junio C Hamano: > Karsten Blees writes: > >> However, some users have expressed concerns that 'same size and >> mtime' [2] may theoretically happen by chance in daily operation. > > Hmph. > > Haven't we already accepted that it is not just "may theoretically > happen" and had counter-measures in racy-git detection machinery > for quite some time? > Racy-git only triggers for files that are modified at the same time as .git/index (i.e. we don't know if the stat cache is up to date). This is more about copying 'old' things around, which usually also copies mtime on Windows. E.g.: # create two files with slightly different mtime for i in {1..10}; do (echo "v1" >> test); done && for i in {1..10}; do (echo "v2" >> test2); done # wait a bit so that '.git/index' is always newer than 'test' / 'test2' sleep 1 git add test git commit -m v1 # copy test2 over test (similar to 'cp -p', but native 'copy' also # copies mtime nanoseconds) cmd //c "copy /y test2 test" git add test git commit -m v2 Without these patches, git does not detect the change, and the second git add / git commit are noops. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [msysGit] Re: [PATCH 0/3] Win32: nanosecond-precision file times
Hi, On 2015-02-12 20:48, Junio C Hamano wrote: > Karsten Blees writes: > >> This patch series was inspired by the problem that Git does not >> detect changed file content if st_size, st_mtime and st_ctime >> are unchanged. This was apparently caused by VSS2Git resetting >> mtime to a value in the past. [1] >> >> I believe (or rather hope) that all involved in the discussion >> agree that Git cannot reasonably be expected to detect changed >> file content if file time(s) are reset on purpose. >> >> However, some users have expressed concerns that 'same size and >> mtime' [2] may theoretically happen by chance in daily operation. > > Hmph. > > Haven't we already accepted that it is not just "may theoretically > happen" and had counter-measures in racy-git detection machinery > for quite some time? I agree that the "racy-git" reference is more of a red herring; it appears that the report Karsten referred to is based on a different understanding of the mtime than mine (it is purported that the mtime of a file should reflect the time when it entered the repository, rather than the time when the file was written to disk). Please let that not distract you. This patch series has merits on the basis of populating st_mtim.tv_nsec already. We can provide this information in the common case (i.e. NTFS, *not* FAT), so we should. Ciao, Dscho -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/3] Win32: nanosecond-precision file times
Karsten Blees writes: > This patch series was inspired by the problem that Git does not > detect changed file content if st_size, st_mtime and st_ctime > are unchanged. This was apparently caused by VSS2Git resetting > mtime to a value in the past. [1] > > I believe (or rather hope) that all involved in the discussion > agree that Git cannot reasonably be expected to detect changed > file content if file time(s) are reset on purpose. > > However, some users have expressed concerns that 'same size and > mtime' [2] may theoretically happen by chance in daily operation. Hmph. Haven't we already accepted that it is not just "may theoretically happen" and had counter-measures in racy-git detection machinery for quite some time? -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 0/3] Win32: nanosecond-precision file times
This patch series was inspired by the problem that Git does not detect changed file content if st_size, st_mtime and st_ctime are unchanged. This was apparently caused by VSS2Git resetting mtime to a value in the past. [1] I believe (or rather hope) that all involved in the discussion agree that Git cannot reasonably be expected to detect changed file content if file time(s) are reset on purpose. However, some users have expressed concerns that 'same size and mtime' [2] may theoretically happen by chance in daily operation. This patch series adopts POSIX 2013 'struct timespec' file times to make this practically impossible, at least on NTFS with 100ns file time resolution. Cheers, Karsten [1] https://github.com/msysgit/git/issues/312 [2] Note that st_ctime of a file never changes on Windows, as it means 'creation time' rather than 'change status time'. Karsten Blees (3): Win32: make FILETIME conversion functions public Win32: replace MSVCRT's fstat() with a Win32-based implementation Win32: implement nanosecond-precision file times compat/mingw.c | 56 +--- compat/mingw.h | 55 +-- config.mak.uname | 4 ++-- 3 files changed, 72 insertions(+), 43 deletions(-) -- 2.3.0.3.ge7778af -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html