Re: [PATCH 0/3] Win32: nanosecond-precision file times

2015-02-17 Thread Karsten Blees
Am 16.02.2015 um 23:10 schrieb Junio C Hamano:
> Karsten Blees  writes:
> 
>> However, the Makefile has this to say on the subject:
>>
>> # Define USE_NSEC below if you want git to care about sub-second file mtimes
>> # and ctimes. Note that you need recent glibc (at least 2.2.4) for this, and
>> # it will BREAK YOUR LOCAL DIFFS! show-diff and anything using it will likely
>> # randomly break unless your underlying filesystem supports those sub-second
>> # times (my ext3 doesn't).
>>
>> Am I missing something?
> 
[...]
> 
> If you use NSEC, however, and "refresh" grabbed a subsecond time and
> then later "diff-files" learned a truncated/rounded time because the
> filesystem later purged the cached inodes and re-read it from the
> underlying filesystem with no subsecond time resolution, the times
> would not match so you will again see "diff-files" report that "foo"
> is now different.
> 
> That is what the comment you cited is about.
> 

OK, so it all boils down to the "inode cache doesn't round to on-disk
resolution" issue after all, as explained in racy-git.txt.

But then the Makefile comment is quite misleading. Enabling USE_NSEC
will not unconditionally "BREAK YOUR LOCAL DIFFS". Show-diff / diff-files
will also not "break", but may report false positives instead (which may
be worse than failing hard...).

It also seems to me that this is a Linux-only issue which is only remotely
related to the USE_NSEC setting or file systems' timestamp resolutions.

The kernel patch referenced in racy-git.txt only addresses sub-second
resolutions. So even if USE_NSEC is *disabled*, the diff-files issue will
bite you on e.g. FAT32-formatted flash-drives on Linux, at least on
re-mount ("sync && echo 2>/proc/sys/vm/drop_caches" didn't seem to trigger
the rounding, though).

I also suspect that the sub-second rounding function of that patch
(timespec_trunc()) takes some invalid shortcuts - if you configure the
kernel for 300 jiffies per second (i.e. 3,333,333 ns per tick), UDF, NTFS,
SMBFS and CIFS file times will most likely not be properly rounded in the
inode cache. Haven't tested this, though.

So the only file systems with reliable file times on Linux seem to be
those with exactly 1s or 1ns resolution...?

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/3] Win32: nanosecond-precision file times

2015-02-16 Thread Junio C Hamano
Karsten Blees  writes:

> However, the Makefile has this to say on the subject:
>
> # Define USE_NSEC below if you want git to care about sub-second file mtimes
> # and ctimes. Note that you need recent glibc (at least 2.2.4) for this, and
> # it will BREAK YOUR LOCAL DIFFS! show-diff and anything using it will likely
> # randomly break unless your underlying filesystem supports those sub-second
> # times (my ext3 doesn't).
>
> Am I missing something?

I think "it would break" is about show-diff which wanted to use the
cached stat information for freshness.

>foo
git update-index --add foo
sleep 2
>foo
git diff-files ;# modern counterpart of show-diff

would say that "foo" is *different*, because the plumbing commands
like diff-files expect you to refresh the index before you call
them.

And if you did "git update-index --refresh" after touching "foo" the
last time before running "git diff-files" in the above sequence, you
should expect that it does not say "foo" is different, no matter how
much time passes between the time you run that "refresh" and
"diff-files" (or between the time you last touched "foo" and you run
"refresh", for that matter), as long as you do not touch "foo" in
the meantime.  The following should say "foo" is *not* different,
that is:

>foo
git update-index --add foo
sleep 2
>foo
sleep arbitrary
git update-index --refresh
sleep arbitrary
git diff-files ;# modern counterpart of show-diff

If you use NSEC, however, and "refresh" grabbed a subsecond time and
then later "diff-files" learned a truncated/rounded time because the
filesystem later purged the cached inodes and re-read it from the
underlying filesystem with no subsecond time resolution, the times
would not match so you will again see "diff-files" report that "foo"
is now different.

That is what the comment you cited is about.

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/3] Win32: nanosecond-precision file times

2015-02-16 Thread Karsten Blees
Am 13.02.2015 um 20:28 schrieb Junio C Hamano:
> Karsten Blees  writes:
> 
>> Am 13.02.2015 um 00:38 schrieb Junio C Hamano:
>>>
>>> We do have sec/nsec fields in cache_time structure, so I have
>>> nothing against updating the msysGit port to fill that value.
> 
> Having said that, we do not enable the NSEC stuff by default on Unix
> for a reason.  I'd expect those who know Windows filesystems well to
> pick the default there wisely ;-)
> 

Now I'm a bit confused about the discrepancy between racy-git.txt and
the Makefile.

Racy-git.txt explains that the nsec-part may be dropped when an inode
is flushed to disk if the file system doesn't support nsec resolution.
This was supposedly an issue with the Linux kernel fixed back in 2005.

In my understanding, this means that git would see the file as
changed and re-check the content (i.e. it will hurt performance).

IOW: Git may be slow if the file system cache has better file time
resolution than the on-disk file system representation.


However, the Makefile has this to say on the subject:

# Define USE_NSEC below if you want git to care about sub-second file mtimes
# and ctimes. Note that you need recent glibc (at least 2.2.4) for this, and
# it will BREAK YOUR LOCAL DIFFS! show-diff and anything using it will likely
# randomly break unless your underlying filesystem supports those sub-second
# times (my ext3 doesn't).

Am I missing something? Is there anything in Git that will actually
"break" with USE_NSEC if the OS / file system doesn't support it
(rather than just being slow)?

History:
* The Makefile comment was added in 2005 (bdd4da59), along with a
  comment in read-cache.c explaining the issue (i.e. flushing to disk
  will clear the nsec field).
* The comment in read-cache.c was removed in 2008 (7a51ed66),
  seemingly dropping USE_NSEC support entirely.
* USE_NSEC support was re-added (without the read-cache.c comment) in
  2009 (fba2f38a).


Regarding the Windows situation: I've just verified (on my Win7 x64
box) that file times obtained through a variety of APIs (GetFileTime,
GetFileAttributesEx, GetFileInformationByHandle, FindFirstFile) are
consistent and properly rounded to the file system's resolution (e.g.
10ms / 2s for FAT). This is even if the file is still open and I try
to SetFileTime() to unrounded values.

So I think enabling USE_NSEC should be fine on Windows.

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/3] Win32: nanosecond-precision file times

2015-02-13 Thread Junio C Hamano
Karsten Blees  writes:

> Am 13.02.2015 um 00:38 schrieb Junio C Hamano:
>> 
>> We do have sec/nsec fields in cache_time structure, so I have
>> nothing against updating the msysGit port to fill that value.

Having said that, we do not enable the NSEC stuff by default on Unix
for a reason.  I'd expect those who know Windows filesystems well to
pick the default there wisely ;-)

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/3] Win32: nanosecond-precision file times

2015-02-12 Thread Karsten Blees
Am 13.02.2015 um 00:38 schrieb Junio C Hamano:
> Karsten Blees  writes:
> 
>> This is more about copying 'old' things around, which usually also
>> copies mtime on Windows. E.g.:
>>
>>   # create two files with slightly different mtime
>>   for i in {1..10}; do (echo "v1" >> test); done &&
>>   for i in {1..10}; do (echo "v2" >> test2); done
>>   # wait a bit so that '.git/index' is always newer than 'test' / 'test2'
>>   sleep 1
>>   git add test
>>   git commit -m v1
>>   # copy test2 over test (similar to 'cp -p', but native 'copy' also
>>   # copies mtime nanoseconds)
>>   cmd //c "copy /y test2 test"
>>   git add test
>>   git commit -m v2
>>
>> Without these patches, git does not detect the change, and the second
>> git add / git commit are noops.
> 
> We do have sec/nsec fields in cache_time structure, so I have
> nothing against updating the msysGit port to fill that value.
> 
> I was and am just reacting to the fact that this is sold as if it
> "fixes" something.

Sorry, that must have been a misunderstanding. This series does
NOT fix the problem with VSS2Git, nor any other tool that abuses
mtime for the author's birthday or whatever.

The issue that two files may accidentally have the same size and
mtime was just brought up in this discussion.

> It doesn't fundamentally change the fact that
> mtime that does not follow the semantics Dscho mentioned in his
> earlier message does not work well with Git.
> 
> Having said that, even with such a patch, as long as the system is
> sufficiently fast, test and test2 will have nonoseconds identical
> timestamp and you would have the same issue, no?
> 

Right. Where "sufficiently fast" would mean opening and closing a
file ten times in less than 100ns...on Windows... ;-)

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/3] Win32: nanosecond-precision file times

2015-02-12 Thread Junio C Hamano
Karsten Blees  writes:

> This is more about copying 'old' things around, which usually also
> copies mtime on Windows. E.g.:
>
>   # create two files with slightly different mtime
>   for i in {1..10}; do (echo "v1" >> test); done &&
>   for i in {1..10}; do (echo "v2" >> test2); done
>   # wait a bit so that '.git/index' is always newer than 'test' / 'test2'
>   sleep 1
>   git add test
>   git commit -m v1
>   # copy test2 over test (similar to 'cp -p', but native 'copy' also
>   # copies mtime nanoseconds)
>   cmd //c "copy /y test2 test"
>   git add test
>   git commit -m v2
>
> Without these patches, git does not detect the change, and the second
> git add / git commit are noops.

We do have sec/nsec fields in cache_time structure, so I have
nothing against updating the msysGit port to fill that value.

I was and am just reacting to the fact that this is sold as if it
"fixes" something.  It doesn't fundamentally change the fact that
mtime that does not follow the semantics Dscho mentioned in his
earlier message does not work well with Git.

Having said that, even with such a patch, as long as the system is
sufficiently fast, test and test2 will have nonoseconds identical
timestamp and you would have the same issue, no?

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/3] Win32: nanosecond-precision file times

2015-02-12 Thread Karsten Blees
Am 12.02.2015 um 20:48 schrieb Junio C Hamano:
> Karsten Blees  writes:
> 
>> However, some users have expressed concerns that 'same size and
>> mtime' [2] may theoretically happen by chance in daily operation.
> 
> Hmph.
> 
> Haven't we already accepted that it is not just "may theoretically
> happen" and had counter-measures in racy-git detection machinery
> for quite some time?
> 

Racy-git only triggers for files that are modified at the same time
as .git/index (i.e. we don't know if the stat cache is up to date).

This is more about copying 'old' things around, which usually also
copies mtime on Windows. E.g.:

  # create two files with slightly different mtime
  for i in {1..10}; do (echo "v1" >> test); done &&
  for i in {1..10}; do (echo "v2" >> test2); done
  # wait a bit so that '.git/index' is always newer than 'test' / 'test2'
  sleep 1
  git add test
  git commit -m v1
  # copy test2 over test (similar to 'cp -p', but native 'copy' also
  # copies mtime nanoseconds)
  cmd //c "copy /y test2 test"
  git add test
  git commit -m v2

Without these patches, git does not detect the change, and the second
git add / git commit are noops.

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [msysGit] Re: [PATCH 0/3] Win32: nanosecond-precision file times

2015-02-12 Thread Johannes Schindelin
Hi,

On 2015-02-12 20:48, Junio C Hamano wrote:
> Karsten Blees  writes:
> 
>> This patch series was inspired by the problem that Git does not
>> detect changed file content if st_size, st_mtime and st_ctime
>> are unchanged. This was apparently caused by VSS2Git resetting
>> mtime to a value in the past. [1]
>>
>> I believe (or rather hope) that all involved in the discussion
>> agree that Git cannot reasonably be expected to detect changed
>> file content if file time(s) are reset on purpose.
>>
>> However, some users have expressed concerns that 'same size and
>> mtime' [2] may theoretically happen by chance in daily operation.
> 
> Hmph.
> 
> Haven't we already accepted that it is not just "may theoretically
> happen" and had counter-measures in racy-git detection machinery
> for quite some time?

I agree that the "racy-git" reference is more of a red herring; it appears that 
the report Karsten referred to is based on a different understanding of the 
mtime than mine (it is purported that the mtime of a file should reflect the 
time when it entered the repository, rather than the time when the file was 
written to disk).

Please let that not distract you.

This patch series has merits on the basis of populating st_mtim.tv_nsec 
already. We can provide this information in the common case (i.e. NTFS, *not* 
FAT), so we should.

Ciao,
Dscho
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/3] Win32: nanosecond-precision file times

2015-02-12 Thread Junio C Hamano
Karsten Blees  writes:

> This patch series was inspired by the problem that Git does not
> detect changed file content if st_size, st_mtime and st_ctime
> are unchanged. This was apparently caused by VSS2Git resetting
> mtime to a value in the past. [1]
>
> I believe (or rather hope) that all involved in the discussion
> agree that Git cannot reasonably be expected to detect changed
> file content if file time(s) are reset on purpose.
>
> However, some users have expressed concerns that 'same size and
> mtime' [2] may theoretically happen by chance in daily operation.

Hmph.

Haven't we already accepted that it is not just "may theoretically
happen" and had counter-measures in racy-git detection machinery
for quite some time?
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 0/3] Win32: nanosecond-precision file times

2015-02-11 Thread Karsten Blees
This patch series was inspired by the problem that Git does not
detect changed file content if st_size, st_mtime and st_ctime
are unchanged. This was apparently caused by VSS2Git resetting
mtime to a value in the past. [1]

I believe (or rather hope) that all involved in the discussion
agree that Git cannot reasonably be expected to detect changed
file content if file time(s) are reset on purpose.

However, some users have expressed concerns that 'same size and
mtime' [2] may theoretically happen by chance in daily operation.

This patch series adopts POSIX 2013 'struct timespec' file times
to make this practically impossible, at least on NTFS with 100ns
file time resolution.

Cheers,
Karsten

[1] https://github.com/msysgit/git/issues/312
[2] Note that st_ctime of a file never changes on Windows, as it
means 'creation time' rather than 'change status time'.

Karsten Blees (3):
  Win32: make FILETIME conversion functions public
  Win32: replace MSVCRT's fstat() with a Win32-based implementation
  Win32: implement nanosecond-precision file times

 compat/mingw.c   | 56 +---
 compat/mingw.h   | 55 +--
 config.mak.uname |  4 ++--
 3 files changed, 72 insertions(+), 43 deletions(-)

-- 
2.3.0.3.ge7778af

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html