On 20/11/2018 16:28, Nguyễn Thái Ngọc Duy wrote:
> Commit b878579ae7 (clone: report duplicate entries on case-insensitive
> filesystems - 2018-08-17) adds a warning to user when cloning a repo
> with case-sensitive file names on a case-insensitive file system. The
> "find duplicate file" check was doing by comparing inode number (and
> only fall back to fspathcmp() when inode is known to be unreliable
> because fspathcmp() can't cover all case folding cases).
> 
> The inode check is very simple, and wrong. It compares between a
> 32-bit number (sd_ino) and potentially a 64-bit number (st_ino). When
> an inode is larger than 2^32 (which seems to be the case for APFS), it
> will be truncated and stored in sd_ino, but comparing with itself will
> fail.
> 
> As a result, instead of showing a pair of files that have the same
> name, we show just one file (marked before the beginning of the
> loop). We fail to find the original one.
> 
> The fix could be just a simple type cast (*)
> 
>     dup->ce_stat_data.sd_ino == (unsigned int)st->st_ino
> 
> but this is no longer a reliable test, there are 4G possible inodes
> that can match sd_ino because we only match the lower 32 bits instead
> of full 64 bits.
> 
> There are two options to go. Either we ignore inode and go with
> fspathcmp() on Apple platform. This means we can't do accurate inode
> check on HFS anymore, or even on APFS when inode numbers are still
> below 2^32.
> 
> Or we just to to reduce the odds of matching a wrong file by checking
> more attributes, counting mostly on st_size because st_xtime is likely
> the same. This patch goes with this direction, hoping that false
> positive chances are too small to be seen in practice.
> 
> While at there, enable the test on Cygwin (verified working by Ramsay
> Jones)

Well, no, I tested the previous version of this patch. However, this
patch also passes the test. (Note _test_ singular - in order to check
that this patch doesn't cause a regression I would need to run the
whole test-suite - that takes 3.5 hours, if I'm not doing anything
else!)

> 
> (*) this is also already done inside match_stat_data()
> 
> Reported-by: Carlo Arenas <care...@gmail.com>
> Helped-by: Ramsay Jones <ram...@ramsayjones.plus.com>
> Signed-off-by: Nguyễn Thái Ngọc Duy <pclo...@gmail.com>
> ---
>  So I'm going with match_stat_data(). But I don't know, perhaps just
>  ignoring inode (like Carlo's original patch) is safer/better?
> 
>  Tested on case-insensitive JFS on Linux. But I don't think it really
>  matters because I'm not even sure if I could push inode above 2^32
>  with this. Hacking JFS for this test sounds fun, but no time for that.
> 
>  entry.c          | 4 ++--
>  t/t5601-clone.sh | 2 +-
>  2 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/entry.c b/entry.c
> index 5d136c5d55..0a3c451f5f 100644
> --- a/entry.c
> +++ b/entry.c
> @@ -404,7 +404,7 @@ static void mark_colliding_entries(const struct checkout 
> *state,
>  {
>       int i, trust_ino = check_stat;
>  
> -#if defined(GIT_WINDOWS_NATIVE)
> +#if defined(GIT_WINDOWS_NATIVE) || defined(__CYGWIN__)

I was a little curious about this (but couldn't be bothered actually
read the code, post-application), so I removed this hunk from the
patch, rebuilt and ran the test again: it _passed_ the test. :-D

So, ...

ATB,
Ramsay Jones

>       trust_ino = 0;
>  #endif
>  
> @@ -419,7 +419,7 @@ static void mark_colliding_entries(const struct checkout 
> *state,
>               if (dup->ce_flags & (CE_MATCHED | CE_VALID | CE_SKIP_WORKTREE))
>                       continue;
>  
> -             if ((trust_ino && dup->ce_stat_data.sd_ino == st->st_ino) ||
> +             if ((trust_ino && !match_stat_data(&dup->ce_stat_data, st)) ||
>                   (!trust_ino && !fspathcmp(ce->name, dup->name))) {
>                       dup->ce_flags |= CE_MATCHED;
>                       break;
> diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh
> index f1a49e94f5..c28d51bd59 100755
> --- a/t/t5601-clone.sh
> +++ b/t/t5601-clone.sh
> @@ -628,7 +628,7 @@ test_expect_success 'clone on case-insensitive fs' '
>       )
>  '
>  
> -test_expect_success !MINGW,!CYGWIN,CASE_INSENSITIVE_FS 'colliding file 
> detection' '
> +test_expect_success !MINGW,CASE_INSENSITIVE_FS 'colliding file detection' '
>       grep X icasefs/warning &&
>       grep x icasefs/warning &&
>       test_i18ngrep "the following paths have collided" icasefs/warning
> 

Reply via email to