Re: [PATCH] index-pack: avoid excessive re-reading of pack directory

2015-06-10 Thread Jeff King
On Tue, Jun 09, 2015 at 08:46:24PM -0700, Shawn Pearce wrote:

  This patch introduces a quick flag to has_sha1_file which
  callers can use when they would prefer high performance at
  the cost of false negatives during repacks. There may be
  other code paths that can use this, but the index-pack one
  is the most obviously critical, so we'll start with
  switching that one.
 
 Hilarious. We did this in JGit back in ... uhm before 2009. :)
 
 But its Java. So of course we had to do optimizations.

This is actually how Git did it up until v1.8.4.2, in 2013. I changed it
then because the old way was racy (and git could flakily report refs as
broken and skip them during repacks!).

If you are doing it the quick way everywhere in JGit, you may want to
reexamine the possibility for races. :)

  @@ -3169,6 +3169,8 @@ int has_sha1_file(const unsigned char *sha1)
  return 1;
  if (has_loose_object(sha1))
  return 1;
  +   if (flags  HAS_SHA1_QUICK)
  +   return 0;
  reprepare_packed_git();
  return find_pack_entry(sha1, e);
 
 Something else we do is readdir() over the loose objects and store
 them in a map in memory. That way we avoid stat() calls during that
 has_loose_object() path. This is apparently a win enough of the time
 that we always do that when receiving a pack over the wire (client or
 server).

Yeah, I thought about that while writing this. It would be a win as long
as you have a small number of loose objects and were going to make a
large number of requests (otherwise you are traversing even though
nobody is going to look it up). According to perf, though, loose object
lookups are not a large expenditure[1].

I'm also hesitant to go that route because it's basically caching, which
introduces new opportunities for race conditions when the cache is stale
(we do the same thing with loose refs, and we have indeed run into races
there).

-Peff

[1] As measured mostly by __d_lookup_rcu calls. Of course, my patch
gives a 5% improvement over the original, and we were not spending
5% of the time there originally. I suspect part of the problem is
that we do the lookup under a lock, so the longer we spend there,
the more contention we have between threads, and the less
parallelism. Indeed, I just did a quick repeat of my tests with
pack.threads=1, and the size of the improvement shrinks.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] index-pack: avoid excessive re-reading of pack directory

2015-06-10 Thread Duy Nguyen
On Wed, Jun 10, 2015 at 9:00 PM, Jeff King p...@peff.net wrote:
 On Tue, Jun 09, 2015 at 08:46:24PM -0700, Shawn Pearce wrote:

  This patch introduces a quick flag to has_sha1_file which
  callers can use when they would prefer high performance at
  the cost of false negatives during repacks. There may be
  other code paths that can use this, but the index-pack one
  is the most obviously critical, so we'll start with
  switching that one.

 Hilarious. We did this in JGit back in ... uhm before 2009. :)

 But its Java. So of course we had to do optimizations.

 This is actually how Git did it up until v1.8.4.2, in 2013. I changed it
 then because the old way was racy (and git could flakily report refs as
 broken and skip them during repacks!).

 If you are doing it the quick way everywhere in JGit, you may want to
 reexamine the possibility for races. :)

I was expecting this :D

  @@ -3169,6 +3169,8 @@ int has_sha1_file(const unsigned char *sha1)
  return 1;
  if (has_loose_object(sha1))
  return 1;
  +   if (flags  HAS_SHA1_QUICK)
  +   return 0;
  reprepare_packed_git();
  return find_pack_entry(sha1, e);

 Something else we do is readdir() over the loose objects and store
 them in a map in memory. That way we avoid stat() calls during that
 has_loose_object() path. This is apparently a win enough of the time
 that we always do that when receiving a pack over the wire (client or
 server).

 Yeah, I thought about that while writing this. It would be a win as long
 as you have a small number of loose objects and were going to make a
 large number of requests (otherwise you are traversing even though
 nobody is going to look it up). According to perf, though, loose object
 lookups are not a large expenditure[1].

 I'm also hesitant to go that route because it's basically caching, which
 introduces new opportunities for race conditions when the cache is stale
 (we do the same thing with loose refs, and we have indeed run into races
 there).

Watchman may help avoid races _with_ caching. But we need to support
both ways in that case, falling back to normal file poking when
watchman gives up, or when we're on Windows. Extra work needs big
enough performance gain to justify. I haven't seen that gain yet.

 [1] As measured mostly by __d_lookup_rcu calls. Of course, my patch
 gives a 5% improvement over the original, and we were not spending
 5% of the time there originally. I suspect part of the problem is
 that we do the lookup under a lock, so the longer we spend there,
 the more contention we have between threads, and the less
 parallelism. Indeed, I just did a quick repeat of my tests with
 pack.threads=1, and the size of the improvement shrinks.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] index-pack: avoid excessive re-reading of pack directory

2015-06-10 Thread Shawn Pearce
On Wed, Jun 10, 2015 at 7:00 AM, Jeff King p...@peff.net wrote:
 On Tue, Jun 09, 2015 at 08:46:24PM -0700, Shawn Pearce wrote:

  This patch introduces a quick flag to has_sha1_file which
  callers can use when they would prefer high performance at
  the cost of false negatives during repacks. There may be
  other code paths that can use this, but the index-pack one
  is the most obviously critical, so we'll start with
  switching that one.

 Hilarious. We did this in JGit back in ... uhm before 2009. :)

 But its Java. So of course we had to do optimizations.

 This is actually how Git did it up until v1.8.4.2, in 2013. I changed it
 then because the old way was racy (and git could flakily report refs as
 broken and skip them during repacks!).

 If you are doing it the quick way everywhere in JGit, you may want to
 reexamine the possibility for races. :)

Correct, fortunately we are not this naive.

JGit always does the reprepare_packed_git() + retry search on a miss.

But we have a code path to bypass that inside critical loops like our
equivalent of index-pack pulling off the wire. We snapshot the object
tree at the start of the operation before we read in the pack header,
and then require that the incoming pack be completed with that
snapshot. Since the snapshot was taking after ref
negotiation/advertisement, we should be at least as current as the
refs that were exchanged on the wire.

  @@ -3169,6 +3169,8 @@ int has_sha1_file(const unsigned char *sha1)
  return 1;
  if (has_loose_object(sha1))
  return 1;
  +   if (flags  HAS_SHA1_QUICK)
  +   return 0;
  reprepare_packed_git();
  return find_pack_entry(sha1, e);

 Something else we do is readdir() over the loose objects and store
 them in a map in memory. That way we avoid stat() calls during that
 has_loose_object() path. This is apparently a win enough of the time
 that we always do that when receiving a pack over the wire (client or
 server).

 Yeah, I thought about that while writing this. It would be a win as long
 as you have a small number of loose objects and were going to make a
 large number of requests (otherwise you are traversing even though
 nobody is going to look it up). According to perf, though, loose object
 lookups are not a large expenditure[1].

Interesting. We were getting hurt by this in JGit. For most
repositories it was cheaper to issue 256 readdir() and build a set of
SHA-1s we found. I think we even just hit every directory 00..ff, we
don't even bother with a readdir() on the $GIT_DIR/objects itself.

 I'm also hesitant to go that route because it's basically caching, which
 introduces new opportunities for race conditions when the cache is stale
 (we do the same thing with loose refs, and we have indeed run into races
 there).

Yes. But see above, we do this only after we snapshot the packs, and
only after the ref negotiation, and only for the duration of parsing
the pack off the wire. So we should never have a data race.

Since JGit is multi-threaded, this cache is also effectively a
thread-local. Its never shared across threads.

 [1] As measured mostly by __d_lookup_rcu calls. Of course, my patch
 gives a 5% improvement over the original, and we were not spending
 5% of the time there originally. I suspect part of the problem is
 that we do the lookup under a lock, so the longer we spend there,
 the more contention we have between threads, and the less
 parallelism. Indeed, I just did a quick repeat of my tests with
 pack.threads=1, and the size of the improvement shrinks.

Interesting. Yea, fine-grained locking can hurt parallel execution... :(
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] index-pack: avoid excessive re-reading of pack directory

2015-06-09 Thread Jeff King
On Tue, Jun 09, 2015 at 01:24:36PM -0400, Jeff King wrote:

 I tested this on my system, and confirmed that for a git clone
 --no-local --bare git.git:
 
   1. It cuts the number of openat/getdents/close syscalls by several
  orders of magnitude.
 
   2. The overall time drops from ~11.4s to ~10.5s. I suppose if I timed
  only the `index-pack` process, it would be even higher (as a
  percentage improvement).

Just for fun, I did a git pack-objects --all --stdout from linux.git,
and then timed git index-pack --stdin on it in an empty repo. With a
configured alternate pointing to another empty repo, just to make it
more unfair. And then I stored it all on a ramdisk to emphasize the cost
of the syscalls versus hitting the disk. The numbers I got were:

  [before]
  real2m13.093s
  user3m31.884s
  sys 0m55.208s

  [after]
  real1m40.389s
  user3m10.776s
  sys 0m26.012s

That's sort of a ridiculous test, but it does show that this was having
some impact even on normal systems without insane syscall latencies.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] index-pack: avoid excessive re-reading of pack directory

2015-06-09 Thread Jeff King
On Fri, Jun 05, 2015 at 08:29:21AM -0400, Jeff King wrote:

 On Fri, Jun 05, 2015 at 08:18:17AM -0400, Jeff King wrote:
 
1. Devise some torture to tests to see whether my patch series is in
   fact racy on Linux.
 
2. Assuming it is, scrap it and make a has_sha1_file_quick() which
   might sometimes return a false negative (if somebody else is
   repacking). Use that in index-pack (and possibly other places, but
   we can start with index-pack).
  
  If we skip step 1 out of pessimism (which I think is a reasonable thing
  to do), then step 2 should not be all that much work. I'm going to be
  offline for a few days, though, so I won't get to it until next week at
  the earliest. If you (or someone else) wants to take a stab at it,
  please feel free.
 
 Actually, it really is a tiny bit of work. I knocked this out in less
 than 10 minutes. It passes the test suite, but it's entirely possible
 that I did something boneheaded that does not fix your problem. ;)
 
 Please test and confirm that it makes things better on your system.

I tested this on my system, and confirmed that for a git clone
--no-local --bare git.git:

  1. It cuts the number of openat/getdents/close syscalls by several
 orders of magnitude.

  2. The overall time drops from ~11.4s to ~10.5s. I suppose if I timed
 only the `index-pack` process, it would be even higher (as a
 percentage improvement).

So I expect it should fix the problem on your NFS system.

Here's the patch again. Same as before, but I'm cc-ing Junio, as I think
it is good to apply.

-- 8 --
Subject: index-pack: avoid excessive re-reading of pack directory

Since 45e8a74 (has_sha1_file: re-check pack directory before
giving up, 2013-08-30), we spend extra effort for
has_sha1_file to give the right answer when somebody else is
repacking. Usually this effort does not matter, because
after finding that the object does not exist, the next step
is usually to die().

However, some code paths make a large number of
has_sha1_file checks which are _not_ expected to return 1.
The collision test in index-pack.c is such a case. On a
local system, this can cause a performance slowdown of
around 5%. But on a system with high-latency system calls
(like NFS), it can be much worse.

This patch introduces a quick flag to has_sha1_file which
callers can use when they would prefer high performance at
the cost of false negatives during repacks. There may be
other code paths that can use this, but the index-pack one
is the most obviously critical, so we'll start with
switching that one.

Signed-off-by: Jeff King p...@peff.net
---
 builtin/index-pack.c |  2 +-
 cache.h  | 11 ++-
 sha1_file.c  |  4 +++-
 3 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 7ea2020..0761dfd 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -784,7 +784,7 @@ static void sha1_object(const void *data, struct 
object_entry *obj_entry,
assert(data || obj_entry);
 
read_lock();
-   collision_test_needed = has_sha1_file(sha1);
+   collision_test_needed = has_sha1_file_with_flags(sha1, HAS_SHA1_QUICK);
read_unlock();
 
if (collision_test_needed  !data) {
diff --git a/cache.h b/cache.h
index 571c98f..98edc2c 100644
--- a/cache.h
+++ b/cache.h
@@ -943,8 +943,17 @@ extern int has_sha1_pack(const unsigned char *sha1);
  * Return true iff we have an object named sha1, whether local or in
  * an alternate object database, and whether packed or loose.  This
  * function does not respect replace references.
+ *
+ * If the QUICK flag is set, do not re-check the pack directory
+ * when we cannot find the object (this means we may give a false
+ * negative answer if another process is simultaneously repacking).
  */
-extern int has_sha1_file(const unsigned char *sha1);
+#define HAS_SHA1_QUICK 0x1
+extern int has_sha1_file_with_flags(const unsigned char *sha1, int flags);
+static inline int has_sha1_file(const unsigned char *sha1)
+{
+   return has_sha1_file_with_flags(sha1, 0);
+}
 
 /*
  * Return true iff an alternate object database has a loose object
diff --git a/sha1_file.c b/sha1_file.c
index 7e38148..dc8746b 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -3161,7 +3161,7 @@ int has_sha1_pack(const unsigned char *sha1)
return find_pack_entry(sha1, e);
 }
 
-int has_sha1_file(const unsigned char *sha1)
+int has_sha1_file_with_flags(const unsigned char *sha1, int flags)
 {
struct pack_entry e;
 
@@ -3169,6 +3169,8 @@ int has_sha1_file(const unsigned char *sha1)
return 1;
if (has_loose_object(sha1))
return 1;
+   if (flags  HAS_SHA1_QUICK)
+   return 0;
reprepare_packed_git();
return find_pack_entry(sha1, e);
 }
-- 
2.4.2.752.geeb594a

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More 

Re: [PATCH] index-pack: avoid excessive re-reading of pack directory

2015-06-09 Thread Shawn Pearce
On Fri, Jun 5, 2015 at 5:29 AM, Jeff King p...@peff.net wrote:

 However, some code paths make a large number of
 has_sha1_file checks which are _not_ expected to return 1.
 The collision test in index-pack.c is such a case. On a
 local system, this can cause a performance slowdown of
 around 5%. But on a system with high-latency system calls
 (like NFS), it can be much worse.

 This patch introduces a quick flag to has_sha1_file which
 callers can use when they would prefer high performance at
 the cost of false negatives during repacks. There may be
 other code paths that can use this, but the index-pack one
 is the most obviously critical, so we'll start with
 switching that one.

Hilarious. We did this in JGit back in ... uhm before 2009. :)

But its Java. So of course we had to do optimizations.


 @@ -3169,6 +3169,8 @@ int has_sha1_file(const unsigned char *sha1)
 return 1;
 if (has_loose_object(sha1))
 return 1;
 +   if (flags  HAS_SHA1_QUICK)
 +   return 0;
 reprepare_packed_git();
 return find_pack_entry(sha1, e);

Something else we do is readdir() over the loose objects and store
them in a map in memory. That way we avoid stat() calls during that
has_loose_object() path. This is apparently a win enough of the time
that we always do that when receiving a pack over the wire (client or
server).
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] index-pack: avoid excessive re-reading of pack directory

2015-06-05 Thread Jeff King
On Fri, Jun 05, 2015 at 08:18:17AM -0400, Jeff King wrote:

   1. Devise some torture to tests to see whether my patch series is in
  fact racy on Linux.

   2. Assuming it is, scrap it and make a has_sha1_file_quick() which
  might sometimes return a false negative (if somebody else is
  repacking). Use that in index-pack (and possibly other places, but
  we can start with index-pack).
 
 If we skip step 1 out of pessimism (which I think is a reasonable thing
 to do), then step 2 should not be all that much work. I'm going to be
 offline for a few days, though, so I won't get to it until next week at
 the earliest. If you (or someone else) wants to take a stab at it,
 please feel free.

Actually, it really is a tiny bit of work. I knocked this out in less
than 10 minutes. It passes the test suite, but it's entirely possible
that I did something boneheaded that does not fix your problem. ;)

Please test and confirm that it makes things better on your system.

-- 8 --
Subject: [PATCH] index-pack: avoid excessive re-reading of pack directory

Since 45e8a74 (has_sha1_file: re-check pack directory before
giving up, 2013-08-30), we spend extra effort for
has_sha1_file to give the right answer when somebody else is
repacking. Usually this effort does not matter, because
after finding that the object does not exist, the next step
is usually to die().

However, some code paths make a large number of
has_sha1_file checks which are _not_ expected to return 1.
The collision test in index-pack.c is such a case. On a
local system, this can cause a performance slowdown of
around 5%. But on a system with high-latency system calls
(like NFS), it can be much worse.

This patch introduces a quick flag to has_sha1_file which
callers can use when they would prefer high performance at
the cost of false negatives during repacks. There may be
other code paths that can use this, but the index-pack one
is the most obviously critical, so we'll start with
switching that one.

Signed-off-by: Jeff King p...@peff.net
---
 builtin/index-pack.c |  2 +-
 cache.h  | 11 ++-
 sha1_file.c  |  4 +++-
 3 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 7ea2020..0761dfd 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -784,7 +784,7 @@ static void sha1_object(const void *data, struct 
object_entry *obj_entry,
assert(data || obj_entry);
 
read_lock();
-   collision_test_needed = has_sha1_file(sha1);
+   collision_test_needed = has_sha1_file_with_flags(sha1, HAS_SHA1_QUICK);
read_unlock();
 
if (collision_test_needed  !data) {
diff --git a/cache.h b/cache.h
index 571c98f..98edc2c 100644
--- a/cache.h
+++ b/cache.h
@@ -943,8 +943,17 @@ extern int has_sha1_pack(const unsigned char *sha1);
  * Return true iff we have an object named sha1, whether local or in
  * an alternate object database, and whether packed or loose.  This
  * function does not respect replace references.
+ *
+ * If the QUICK flag is set, do not re-check the pack directory
+ * when we cannot find the object (this means we may give a false
+ * negative answer if another process is simultaneously repacking).
  */
-extern int has_sha1_file(const unsigned char *sha1);
+#define HAS_SHA1_QUICK 0x1
+extern int has_sha1_file_with_flags(const unsigned char *sha1, int flags);
+static inline int has_sha1_file(const unsigned char *sha1)
+{
+   return has_sha1_file_with_flags(sha1, 0);
+}
 
 /*
  * Return true iff an alternate object database has a loose object
diff --git a/sha1_file.c b/sha1_file.c
index 7e38148..dc8746b 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -3161,7 +3161,7 @@ int has_sha1_pack(const unsigned char *sha1)
return find_pack_entry(sha1, e);
 }
 
-int has_sha1_file(const unsigned char *sha1)
+int has_sha1_file_with_flags(const unsigned char *sha1, int flags)
 {
struct pack_entry e;
 
@@ -3169,6 +3169,8 @@ int has_sha1_file(const unsigned char *sha1)
return 1;
if (has_loose_object(sha1))
return 1;
+   if (flags  HAS_SHA1_QUICK)
+   return 0;
reprepare_packed_git();
return find_pack_entry(sha1, e);
 }
-- 
2.4.2.752.geeb594a

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html