Re: Troubleshoot clone issue to NFS.

2015-05-21 Thread Christian Couder
On Thu, May 21, 2015 at 3:13 PM,   wrote:

[...]

> So the questions are:
>
> 1) Should I expect a clone to NFS to be that much slower?
> 2) Is there anything I could do to speed it up (I've tried --bare as that is 
> what the repositories will be when stored on NFS and there wasn't really a 
> difference).
> 3) What else can I use in git to compare the performance on local to NFS to 
> see if it is just clones that are affected?
> 4) I assume I could bisect between 1.8.4.1 and 1.8.4.2 to find exactly where 
> things get worse for me?

Yeah, it would be very nice if you could bisect further.
You can probably do it automatically using a script such as the one
Junio used in this email:

http://thread.gmane.org/gmane.comp.version-control.git/45195/

Thanks,
Christian.


Re: Troubleshoot clone issue to NFS.

2015-05-21 Thread Duy Nguyen
On Thu, May 21, 2015 at 8:13 PM,   wrote:

> So there appears to be a change in 1.8.4.2 that made this issue appear for 
> me.  Looking at the release notes the only thing that I can see that might be 
> related could be:
>
> * When an object is not found after checking the packfiles and then
>   loose object directory, read_sha1_file() re-checks the packfiles to
>   prevent racing with a concurrent repacker; teach the same logic to
>   has_sha1_file().

That would be commit 45e8a74 (has_sha1_file: re-check pack directory
before giving up - 2013-08-30). Maybe you can try the version without
that commit to confirm. In case an object is not found, the pack
directory is re-read, which might cause some increased load on NFS.
has_sha1_file() not finding the object should not happen often. Maybe
you could do an strace (following forks) on the "git clone" process to
see if readdir() is really called a lot?
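
For reference, the post-45e8a74 lookup order (per the release note
quoted above) is roughly the following; this is a simplified sketch,
not the verbatim sha1_file.c code:

/*
 * Simplified sketch of has_sha1_file() after 45e8a74. The final
 * reprepare step re-reads the objects/pack directory (an
 * opendir()/readdir() pass), which is the expensive part on NFS.
 */
int has_sha1_file(const unsigned char *sha1)
{
	struct pack_entry e;

	if (find_pack_entry(sha1, &e))
		return 1;
	if (has_loose_object(sha1))
		return 1;
	reprepare_packed_git();	/* rescan the pack directory */
	return find_pack_entry(sha1, &e);
}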
-- 
Duy


Re: Troubleshoot clone issue to NFS.

2015-05-21 Thread Duy Nguyen
On Thu, May 21, 2015 at 9:31 PM, Duy Nguyen  wrote:
> In case an object is not found, the pack
> directory is re-read, which might cause some increased load on NFS.
> has_sha1_file() not finding the object should not happen often.

That last statement is probably very wrong, but I have no time to test
this now. In index-pack, there is a has_sha1_file() for file collision
test. That call on a fresh clone would fail for _every_ object in the
(new) pack and the cost of reprepare pack files could be sky high...
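
The gate looks roughly like this; a simplified sketch of sha1_object()
in builtin/index-pack.c, from memory rather than verbatim:

static void sha1_object(const void *data, unsigned long size,
			enum object_type type, unsigned char *sha1)
{
	/*
	 * On a fresh clone the object store is empty, so this misses
	 * for every incoming object, and with 45e8a74 every miss ends
	 * in reprepare_packed_git(), i.e. another readdir() of
	 * objects/pack.
	 */
	int collision_test_needed = has_sha1_file(sha1);

	if (collision_test_needed) {
		/* read our existing copy of the object and check that
		 * it matches the incoming data byte for byte */
	}
	/* ... then record the object as usual ... */
}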
-- 
Duy


RE: Troubleshoot clone issue to NFS.

2015-05-21 Thread steve.norman
On Thu, May 21, 2015 at 9:31 PM, Duy Nguyen [mailto:pclo...@gmail.com], did 
scribble:
> > In case an object is not found, the pack directory is re-read, which
> > might cause some increased load on NFS.
> > has_sha1_file() not finding the object should not happen often.
> 
> That last statement is probably very wrong, but I have no time to test this
> now. In index-pack, there is a has_sha1_file() for file collision test. That call
> on a fresh clone would fail for _every_ object in the
> (new) pack and the cost of reprepare pack files could be sky high...

Confirmed with bisect that it is that commit:

~/git $ git bisect bad
45e8a7487339c0f0ea28244ef06851308d07387c is the first bad commit
commit 45e8a7487339c0f0ea28244ef06851308d07387c
Author: Jeff King 
Date:   Fri Aug 30 15:14:13 2013 -0400

I have an strace for that build but it is 153 megabytes so I probably shouldn't 
attach, but the call summary is below:

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 93.71   39.670084         103    386835      2171 futex
  3.16    1.338572           7    190550       181 open
  1.56    0.658786          18     37450         3 read
  0.62    0.262740           2    141390           pread
  0.41    0.171526           5     37313         9 write
  0.18    0.076166           0    188859    188835 access
  0.11    0.047941           0    374712           getdents
  0.11    0.045174          11      4067      3910 lstat
  0.06    0.023630           0    190425           close
  0.04    0.017995           6      2975         1 fstat
  0.02    0.007668        1917         4           wait4
  0.01    0.004150           1      5065           madvise
  0.01    0.003548           0     16090         8 recvfrom
  0.00    0.001872           0      8048           select
  0.00    0.001870          11       173         1 mkdir
  0.00    0.000872           0      8055           poll
  0.00    0.000262          22        12        12 readlink
  0.00    0.000185           0      1217      1146 stat
  0.00    0.000158           0       457           mprotect
  0.00    0.000074           0       298           mmap
  0.00    0.000069           1       109         8 rt_sigreturn
  0.00    0.000047           0       159           brk
  0.00    0.000021           1        17           getcwd
  0.00    0.000000           0        42         3 lseek
  0.00    0.000000           0        92           munmap
  0.00    0.000000           0        35           rt_sigaction
  0.00    0.000000           0         9           rt_sigprocmask
  0.00    0.000000           0         8         3 ioctl
  0.00    0.000000           0        11           pipe
  0.00    0.000000           0         3           dup
  0.00    0.000000           0         8           dup2
  0.00    0.000000           0         6           setitimer
  0.00    0.000000           0        11         1 socket
  0.00    0.000000           0         8         7 connect
  0.00    0.000000           0         8           sendto
  0.00    0.000000           0         2           recvmsg
  0.00    0.000000           0         1           bind
  0.00    0.000000           0         1           getsockname
  0.00    0.000000           0         3         1 getpeername
  0.00    0.000000           0         2           setsockopt
  0.00    0.000000           0         2           getsockopt
  0.00    0.000000           0         8           clone
  0.00    0.000000           0         5           execve
  0.00    0.000000           0         3           uname
  0.00    0.000000           0       100           fcntl
  0.00    0.000000           0         2           fsync
  0.00    0.000000           0        13           chdir
  0.00    0.000000           0        14           rename
  0.00    0.000000           0         2           link
  0.00    0.000000           0         5           unlink
  0.00    0.000000           0         2           symlink
  0.00    0.000000           0         9           chmod
  0.00    0.000000           0         6           getrlimit
  0.00    0.000000           0         2           sysinfo
  0.00    0.000000           0         8           getuid
  0.00    0.000000           0         1           statfs
  0.00    0.000000           0         5           arch_prctl
  0.00    0.000000           0         1           gettid
  0.00    0.000000           0         5           set_tid_address
  0.00    0.000000           0        13           set_robust_list
------ ----------- ----------- --------- --------- ----------------
100.00   42.333410                 1594736    196300 total

Is there anything else I can provide or test?

Thanks for the help,

Steve



Re: Troubleshoot clone issue to NFS.

2015-05-21 Thread Duy Nguyen
On Thu, May 21, 2015 at 10:53 PM,   wrote:
> On Thu, May 21, 2015 at 9:31 PM, Duy Nguyen [mailto:pclo...@gmail.com], did 
> scribble:
>> > In case an object is not found, the pack directory is re-read, which
>> > might cause some increased load on NFS.
>> > has_sha1_file() not finding the object should not happen often.
>>
>> That last statement is probably very wrong, but I have no time to test this
>> now. In index-pack, there is a has_sha1_file() for file collision test. That call
>> on a fresh clone would fail for _every_ object in the
>> (new) pack and the cost of reprepare pack files could be sky high...
>
> Confirmed with bisect that it is that commit:
>
> ~/git $ git bisect bad
> 45e8a7487339c0f0ea28244ef06851308d07387c is the first bad commit
> commit 45e8a7487339c0f0ea28244ef06851308d07387c
> Author: Jeff King 
> Date:   Fri Aug 30 15:14:13 2013 -0400
>
> I have an strace for that build but it is 153 megabytes so I probably 
> shouldn't attach, but the call summary is below:
>
> [...]
>
> Is there anything else I can provide or test?

In builtin/index-pack.c, replace the line "collision_test_needed =
has_sha1_file(sha1);" with "collision_test_needed = 0;". Security is
compromised but for this test it should be ok. Then clone again. I
hope the new number gets down close to v1.8.4.1.
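
That is, in diff form:

-	collision_test_needed = has_sha1_file(sha1);
+	collision_test_needed = 0;
-- 
Duy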

Re: Troubleshoot clone issue to NFS.

2015-05-22 Thread Jeff King
On Fri, May 22, 2015 at 07:16:54AM +0700, Duy Nguyen wrote:

> > Is there anything else I can provide or test?
> 
> In builtin/index-pack.c, replace the line "collision_test_needed =
> has_sha1_file(sha1);" with "collision_test_needed = 0;". Security is
> compromised but for this test it should be ok. Then clone again. I
> hope the new number gets down close to v1.8.4.1.

Yeah, I think that is a good starting point. I timed a local clone
before and after 45e8a74; there is a small difference on my system
(about 5%), but it goes away with your suggestion.

The problem with has_sha1_file() prior to v1.8.4.2 is that it is racy
with respect to simultaneous operations; we might claim we do not have
an object, when in fact we do. As you noted, usually has_sha1_file()
returns true (i.e., we look up objects that we expect to have), and the
performance impact is minimal.

But for code paths where _not_ having the object is normal, the impact
is much greater. So I think there are two possibilities for improving
this:

  1. Find places where we expect the object will not exist (like the
 collision_test check you pointed out) and use a
 "has_sha1_file_fast" that accepts that it may very occasionally
 erroneously return false. In this case it would mean potentially
 skipping a collision check, but I think that is OK. That could have
 security implications, but only if an attacker:

   a. has broken sha1 to generate a colliding object

   b. can manipulate the victim into repacking in a loop

   c. can manipulate the victim into fetching (or receiving a push)
  simultaneously with (b)

 at which point they can try to race the repack procedure to add
 their colliding object to the repository. It seems rather unlikely
 (especially part a).

  2. Make reprepare_packed_git() cheaper in the common case that nothing
 has changed. It would probably be enough to stat("objects/pack").
 We know that packfiles themselves do not change; we may only add or
 delete them. And since the hierarchy of objects/pack is flat, we
 know that the mtime on that directory will change if any packs are
 added or removed.

 Of course, we are still doing an extra stat() for each has_sha1_file
 call. Whether that helps for the NFS case depends on whether stat()
 is significantly cheaper than opendir/readdir/closedir. On my local
 disk, the hacky patch below did seem to give me back the 5% lost by
 45e8a74 (I did it directly on master, since that old commit does
 not have the stat_validity infrastructure).

---
diff --git a/cache.h b/cache.h
index c97d807..17c840c 100644
--- a/cache.h
+++ b/cache.h
@@ -1661,6 +1661,7 @@ int sane_execvp(const char *file, char *const argv[]);
  */
 struct stat_validity {
struct stat_data *sd;
+   unsigned mode;
 };
 
 void stat_validity_clear(struct stat_validity *sv);
diff --git a/read-cache.c b/read-cache.c
index 36ff89f..1d8702f 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -2281,6 +2281,7 @@ void stat_validity_clear(struct stat_validity *sv)
 {
free(sv->sd);
sv->sd = NULL;
+   sv->mode = 0;
 }
 
 int stat_validity_check(struct stat_validity *sv, const char *path)
@@ -2291,18 +2292,19 @@ int stat_validity_check(struct stat_validity *sv, const char *path)
return sv->sd == NULL;
if (!sv->sd)
return 0;
-   return S_ISREG(st.st_mode) && !match_stat_data(sv->sd, &st);
+   return sv->mode == st.st_mode && !match_stat_data(sv->sd, &st);
 }
 
 void stat_validity_update(struct stat_validity *sv, int fd)
 {
struct stat st;
 
-   if (fstat(fd, &st) < 0 || !S_ISREG(st.st_mode))
+   if (fstat(fd, &st) < 0)
stat_validity_clear(sv);
else {
if (!sv->sd)
sv->sd = xcalloc(1, sizeof(struct stat_data));
+   sv->mode = st.st_mode;
fill_stat_data(sv->sd, &st);
}
 }
diff --git a/sha1_file.c b/sha1_file.c
index ccc6dac..abe1be9 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1225,7 +1225,8 @@ static void report_pack_garbage(struct string_list *list)
report_helper(list, seen_bits, first, list->nr);
 }
 
-static void prepare_packed_git_one(char *objdir, int local)
+static void prepare_packed_git_one(char *objdir, int local,
+  struct stat_validity *validity)
 {
struct strbuf path = STRBUF_INIT;
size_t dirnamelen;
@@ -1235,6 +1236,12 @@ static void prepare_packed_git_one(char *objdir, int local)
 
strbuf_addstr(&path, objdir);
strbuf_addstr(&path, "/pack");
+
+   if (stat_validity_check(validity, path.buf)) {
+   strbuf_release(&path);
+   return;
+   }
+
dir = opendir(path.buf);
if (!dir) {
if (errno != ENOENT)
@@ -1243,6 +1250,10 @@ static void prepare_packed_git_one(char *objdir, int local)
strbuf_rele

RE: Troubleshoot clone issue to NFS.

2015-05-22 Thread steve.norman
On Friday, May 22, 2015 @ 8:12 AM Jeff King did scribble:

> > In builtin/index-pack.c, replace the line "collision_test_needed =
> > has_sha1_file(sha1);" with "collision_test_needed = 0;". Security is
> > compromised but for this test it should be ok. Then clone again. I
> > hope the new number gets down close to v1.8.4.1.
> 
> Yeah, I think that is a good starting point. I timed a local clone
> before and after 45e8a74; there is a small difference on my system
> (about 5%), but it goes away with your suggestion.

Tested this change on a couple of versions, first of all on the revision
where things go wrong for me:

~ $ bin/git --version
git version 1.8.4.1.g45e8a74.dirty

~ $ time bin/git clone https://github.com/git/git test


real    0m7.105s
user    0m9.566s
sys     0m0.989s

~ $ time bin/git clone https://github.com/git/git /sami/test


real    0m14.411s
user    0m9.703s
sys     0m1.374s

This is more in line with what I see normally.   Also tested on master:

~ $ bin/git --version
git version 2.4.1.217.g6c1249c.dirty

~ $ time bin/git clone https://github.com/git/git test


real    0m5.946s
user    0m9.111s
sys     0m1.332s


~ $ time bin/git clone https://github.com/git/git /sami/test


real    0m12.344s
user    0m9.187s
sys     0m1.579s

So similar on the latest as well.

> The problem with has_sha1_file() prior to v1.8.4.2 is that it is racy
> with respect to simultaneous operations; we might claim we do not have
> an object, when in fact we do. As you noted, usually has_sha1_file()
> returns true (i.e., we look up objects that we expect to have), and the
> performance impact is minimal.
> 
> But for code paths where _not_ having the object is normal, the impact
> is much greater. So I think there are two possibilities for improving
> this:
> 
>   1. Find places where we expect the object will not exist (like the
>  collision_test check you pointed out) and use a
>  "has_sha1_file_fast" that accepts that it may very occasionally
>  erroneously return false. In this case it would mean potentially
>  skipping a collision check, but I think that is OK. That could have
>  security implications, but only if an attacker:
> 
>a. has broken sha1 to generate a colliding object
> 
>b. can manipulate the victim into repacking in a loop
> 
>c. can manipulate the victim into fetching (or receiving a push)
>   simultaneously with (b)
> 
>  at which point they can try to race the repack procedure to add
>  their colliding object to the repository. It seems rather unlikely
>  (especially part a).
> 
>   2. Make reprepare_packed_git() cheaper in the common case that nothing
>  has changed. It would probably be enough to stat("objects/pack").
>  We know that packfiles themselves do not change; we may only add or
>  delete them. And since the hierarchy of objects/pack is flat, we
>  know that the mtime on that directory will change if any packs are
>  added or removed.
> 
>  Of course, we are still doing an extra stat() for each has_sha1_file
>  call. Whether that helps for the NFS case depends on whether stat()
>  is significantly cheaper than opendir/readdir/closedir. On my local
>  disk, the hacky patch below did seem to give me back the 5% lost by
>  45e8a74 (I did it directly on master, since that old commit does
>  not have the stat_validity infrastructure).

Also tested master with the patch provided:

~ $ bin/git --version
git version 2.4.1.217.g6c1249c.dirty

~ $ time git clone https://github.com/git/git test

real    0m8.263s
user    0m10.550s
sys     0m3.763s

~ $ time git clone https://github.com/git/git /sami/test

real    1m3.286s
user    0m12.149s
sys     0m9.192s

So the patch isn't reducing the time taken when cloning to NAS.

Here are the top calls from strace

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 90.68   19.946171          46    437398     45436 futex
  4.63    1.017637          22     46141         5 read
  2.37    0.521161           4    141429           pread
  0.73    0.161442           3     47130         9 write
  0.42    0.093146           0    188645    188621 access
  0.38    0.083033          26      3209       181 open
  0.32    0.069587           0    188613      1146 stat
  0.23    0.050855          12      4082      3925 lstat
  0.11    0.023317           8      2979         1 fstat
  0.04    0.009134           0     35696         3 recvfrom
  0.03    0.007666        1917         4           wait4
  0.02    0.004478           1      3923           madvise
  0.01    0.002291           0     17858           poll
  0.01    0.002155           0     17851           select

Thanks for looking into this.

Steve

Re: Troubleshoot clone issue to NFS.

2015-05-22 Thread Duy Nguyen
On Fri, May 22, 2015 at 3:35 PM,   wrote:
> Tested this change on a couple of versions, first of all on the revision
> where things go wrong for me:
>
> ...
>
> ~ $ time git clone https://github.com/git/git test
>
> real    0m8.263s
> user    0m10.550s
> sys     0m3.763s
>
> ~ $ time git clone https://github.com/git/git /sami/test
>
> real    1m3.286s
> user    0m12.149s
> sys     0m9.192s
>
> So the patch isn't reducing the time taken when cloning to NAS.

Strange. Maybe there is something else... Anyway some numbers from me.
This is nfs3 hosted by Raspberry Pi, accessed over wireless. I just
run index-pack on git.git pack instead of full clone.

 - v1.8.4.1 34s
 - v1.8.4.2 519s (oh.. wireless..)
 - v1.8.4.2 without has_sha1_file() in index-pack.c 33s
 - v1.8.4.2 + Jeff's mtime patch 36s
-- 
Duy


Re: Troubleshoot clone issue to NFS.

2015-05-22 Thread Junio C Hamano
Duy Nguyen  writes:

> Strange. Maybe there is something else... Anyway some numbers from me.
> This is nfs3 hosted by Raspberry Pi, accessed over wireless. I just
> run index-pack on git.git pack instead of full clone.

So the checkout codepath that touches the working tree is one difference?
Are there other differences?

>
>  - v1.8.4.1 34s
>  - v1.8.4.2 519s (oh.. wireless..)
>  - v1.8.4.2 without has_sha1_file() in index-pack.c 33s
>  - v1.8.4.2 + Jeff's mtime patch 36s


RE: Troubleshoot clone issue to NFS.

2015-05-22 Thread steve.norman
On Friday, May 22, 2015 @ 11:06 AM Duy Nguyen did write:

> Strange. Maybe there is something else... Anyway some numbers from me.
> This is nfs3 hosted by Raspberry Pi, accessed over wireless. I just
> run index-pack on git.git pack instead of full clone.
> 
>  - v1.8.4.1 34s
>  - v1.8.4.2 519s (oh.. wireless..)
>  - v1.8.4.2 without has_sha1_file() in index-pack.c 33s
>  - v1.8.4.2 + Jeff's mtime patch 36s

Just ran the test again and it was 12 seconds. Too many versions of git
available on the machine and I think I might have run the wrong one:

~ $ time bin/git clone https://github.com/git/git test
Cloning into 'test'...
remote: Counting objects: 186015, done.
remote: Compressing objects: 100% (8/8), done.
remote: Total 186015 (delta 8), reused 7 (delta 7), pack-reused 186000
Receiving objects: 100% (186015/186015), 61.33 MiB | 30.35 MiB/s, done.
Resolving deltas: 100% (135512/135512), done.
Checking connectivity... done.

real    0m6.931s
user    0m10.011s
sys     0m2.685s

~ $ time bin/git clone https://github.com/git/git /sami/test
Cloning into '/sami/test'...
remote: Counting objects: 186015, done.
remote: Compressing objects: 100% (8/8), done.
remote: Total 186015 (delta 8), reused 7 (delta 7), pack-reused 186000
Receiving objects: 100% (186015/186015), 61.33 MiB | 29.91 MiB/s, done.
Resolving deltas: 100% (135512/135512), done.
Checking connectivity... done.
Checking out files: 100% (2795/2795), done.

real    0m13.169s
user    0m9.961s
sys     0m3.480s

So I would say that was in line with the timings for the other versions
I have tested.

And looking back at the email from the morning, I did run the wrong
bin/git. Sorry, totally my fault on that.

If I can test anything else then let me know.

Thanks,

Steve
--


Re: Troubleshoot clone issue to NFS.

2015-05-24 Thread Duy Nguyen
On Fri, May 22, 2015 at 2:12 PM, Jeff King  wrote:
> So I think there are two possibilities for improving this:
>
>   1. Find places where we expect the object will not exist (like the
>  collision_test check you pointed out) and use a
>  "has_sha1_file_fast" that accepts that it may very occasionally
>  erroneously return false. In this case it would mean potentially
>  skipping a collision check, but I think that is OK. That could have
>  security implications, but only if an attacker:
>
>a. has broken sha1 to generate a colliding object
>
>b. can manipulate the victim into repacking in a loop
>
>c. can manipulate the victim into fetching (or receiving a push)
>   simultaneously with (b)
>
>  at which point they can try to race the repack procedure to add
>  their colliding object to the repository. It seems rather unlikely
>  (especially part a).

In case you want to back away from option 2 because it starts to leak
raciness, which your old commit tried to fix in the first place. I
think the only other place that tests for lots of non-existent loose
objects is write_sha1_file (e.g. "tar -xf bigtarball.tar.gz; cd
bigtarball; git init; git add ."). But the number of calls should be
much smaller compared to index-pack and it does not use has_sha1_file,
it uses check_and_freshen_file() instead.
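
From memory, the short-circuit in write_sha1_file() looks something
like the sketch below (simplified, not verbatim). The freshen helpers
sit on top of check_and_freshen_file() and probe the candidate object
paths directly, so a miss does not trigger a re-scan of the pack
directory:

	/*
	 * If we already have the object, update its mtime (so prune
	 * knows it is still in use) and skip the write. Existence is
	 * tested by path, not via has_sha1_file().
	 */
	if (freshen_packed_object(sha1) || freshen_loose_object(sha1))
		return 0;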

There are other places where has_sha1_file() may return 0, but I think
the number of calls is even smaller to bother (shallow.c,
fetch-pack.c, apply.c, bulk-checkin.c)
-- 
Duy


RE: Troubleshoot clone issue to NFS.

2015-06-05 Thread steve.norman
On Sunday, May 24, 2015 @ 10:01 AM Duy Nguyen [mailto:pclo...@gmail.com] did 
scribble:

> In case you want to back away from option 2 because it starts to leak
> raciness, which your old commit tried to fix in the first place. I
> think the only other place that tests for lots of non-existent loose
> objects is write_sha1_file (e.g. "tar -xf bigtarball.tar.gz; cd
> bigtarball; git init; git add ."). But the number of calls should be
> much smaller compared to index-pack and it does not use has_sha1_file,
> it uses check_and_freshen_file() instead.
> 
> There are other places where has_sha1_file() may return 0, but I think
> the number of calls is even smaller to bother (shallow.c,
> fetch-pack.c, apply.c, bulk-checkin.c)

Any updates / further thoughts on this?

Thanks,

Steve

Re: Troubleshoot clone issue to NFS.

2015-06-05 Thread Jeff King
On Fri, Jun 05, 2015 at 12:01:16PM +, steve.nor...@thomsonreuters.com wrote:

> On Sunday, May 24, 2015 @ 10:01 AM Duy Nguyen [mailto:pclo...@gmail.com] did 
> scribble:
> 
> > In case you want to back away from option 2 because it starts to leak
> > raciness, which your old commit tried to fix in the first place. I
> > think the only other place that tests for lots of non-existent loose
> > objects is write_sha1_file (e.g. "tar -xf bigtarball.tar.gz; cd
> > bigtarball; git init; git add ."). But the number of calls should be
> > much smaller compared to index-pack and it does not use has_sha1_file,
> > it uses check_and_freshen_file() instead.
> > 
> > There are other places where has_sha1_file() may return 0, but I think
> > the number of calls is even smaller to bother (shallow.c,
> > fetch-pack.c, apply.c, bulk-checkin.c)
> 
> Any updates / further thoughts on this?

Sorry, I haven't had a chance to look at it further. It's still on my todo
list. My plan is:

  1. Devise some torture tests to see whether my patch series is in
 fact racy on Linux.

  2. Assuming it is, scrap it and make a has_sha1_file_quick() which
 might sometimes return a false negative (if somebody else is
 repacking). Use that in index-pack (and possibly other places, but
 we can start with index-pack).

If we skip step 1 out of pessimism (which I think is a reasonable thing
to do), then step 2 should not be all that much work. I'm going to be
offline for a few days, though, so I won't get to it until next week at
the earliest. If you (or someone else) wants to take a stab at it,
please feel free.
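
For concreteness, I'd expect has_sha1_file_quick() to be more or less
the pre-45e8a74 lookup under a more honest name; something like this
(untested, and the name is provisional):

/*
 * Like has_sha1_file(), but skip the reprepare_packed_git() re-check
 * on a miss. May return a false negative when racing with a
 * simultaneous repack, so callers must be able to tolerate that.
 */
int has_sha1_file_quick(const unsigned char *sha1)
{
	struct pack_entry e;

	if (find_pack_entry(sha1, &e))
		return 1;
	return has_loose_object(sha1);
}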

-Peff


RE: Troubleshoot clone issue to NFS.

2015-06-05 Thread steve.norman
On Friday, June 05, 2015 @ 1:18 PM Jeff King [mailto:p...@peff.net] did 
scribble:

> Sorry, I haven't had a chance to look at it further. It's still on my todo
> list. My plan is:
> 
>   1. Devise some torture tests to see whether my patch series is in
>  fact racy on Linux.
> 
>   2. Assuming it is, scrap it and make a has_sha1_file_quick() which
>  might sometimes return a false negative (if somebody else is
>  repacking). Use that in index-pack (and possibly other places, but
>  we can start with index-pack).
> 
> If we skip step 1 out of pessimism (which I think is a reasonable thing
> to do), then step 2 should not be all that much work. I'm going to be
> offline for a few days, though, so I won't get to it until next week at
> the earliest. If you (or someone else) wants to take a stab at it,
> please feel free.

I've been off myself and wanted to make sure I hadn't missed anything in 
the email threads while I was away, as there were rather a lot of them.

Not feeling confident enough to make the changes myself at the moment. 
I think what you are saying is that has_sha1_file_quick() would be the version
from before your change in 45e8a74, and then to use that, but I could be
barking up the wrong tree completely.

Thanks,

Steve


Re: Troubleshoot clone issue to NFS.

2015-06-16 Thread Jeff King
On Fri, Jun 05, 2015 at 08:18:17AM -0400, Jeff King wrote:

> On Fri, Jun 05, 2015 at 12:01:16PM +, steve.nor...@thomsonreuters.com 
> wrote:
> 
> > On Sunday, May 24, 2015 @ 10:01 AM Duy Nguyen [mailto:pclo...@gmail.com] 
> > did scribble:
> > 
> > > In case you want to back away from option 2 because it starts to leak
> > > raciness, which your old commit tried to fix in the first place. I
> > > think the only other place that tests for lots of non-existent loose
> > > objects is write_sha1_file (e.g. "tar -xf bigtarball.tar.gz; cd
> > > bigtarball; git init; git add ."). But the number of calls should be
> > > much smaller compared to index-pack and it does not use has_sha1_file,
> > > it uses check_and_freshen_file() instead.
> > > 
> > > There are other places where has_sha1_file() may return 0, but I think
> > > the number of calls is even smaller to bother (shallow.c,
> > > fetch-pack.c, apply.c, bulk-checkin.c)
> > 
> > Any updates / further thoughts on this?
> 
> Sorry, I haven't had a chance to look at it further. It's still on my todo
> list. My plan is:
> 
>   1. Devise some torture tests to see whether my patch series is in
>  fact racy on Linux.

I do not know that we needed further convincing that the patch series to
stat() the objects/pack directory was flaky, but just as an extra data
point in case somebody tries this approach later: it is indeed flaky.

I've been running with it for a week or so, and I noticed that a simple
clone of a small repository fails racily:

  $ git clone --bare --no-local . foo.git
  Cloning into bare repository 'child.git'...
  remote: Counting objects: 210, done.
  remote: Compressing objects: 100% (98/98), done.
  remote: Total 210 (delta 99), reused 210 (delta 99)
  Receiving objects: 100% (210/210), 68.00 KiB | 0 bytes/s, done.
  Resolving deltas: 100% (99/99), done.
  Checking connectivity... done.
  error: internal error: refs/heads/master is not a valid packed reference!

Sometimes it works, and sometimes not.  It looks like we're failing to
re-scan the pack directory when we should (presumably because the open()
and readdir() operations are not atomic).

-Peff