Re: [PATCH 2/2] sha1_file: only freshen packs once per run

2015-04-20 Thread Stefan Saasen
I can confirm that this patch is equivalent to the previous one.

https://bitbucket.org/snippets/ssaasen/9AXg shows both the timing and
the NFS stats showing the effect of applying this patch.

Thanks for the fix Jeff!

Cheers,
Stefan

On 21 April 2015 at 05:55, Jeff King  wrote:
> Since 33d4221 (write_sha1_file: freshen existing objects,
> 2014-10-15), we update the mtime of existing objects that we
> would have written out (had they not existed). For the
> common case in which many objects are packed, we may update
> the mtime on a single packfile repeatedly. This can result
> in a noticeable performance problem if calling utime() is
> expensive (e.g., because your storage is on NFS).
>
> We can fix this by keeping a per-pack flag that lets us
> freshen only once per program invocation.
>
> An alternative would be to keep the packed_git.mtime flag up
> to date as we freshen, and freshen only once every N
> seconds. In practice, it's not worth the complexity. We are
> racing against prune expiration times here, which inherently
> must be set to accomodate reasonable program running times
> (because they really care about the time between an object
> being written and it becoming referenced, and the latter is
> typically the last step a program takes).
>
> Signed-off-by: Jeff King 
> ---
> Hopefully I didn't botch the flag logic again. :) I tested with "strace
> -c" myself this time, so I think it is good.
>
>  cache.h | 1 +
>  sha1_file.c | 9 -
>  2 files changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/cache.h b/cache.h
> index 3d3244b..72c6888 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -1174,6 +1174,7 @@ extern struct packed_git {
> int pack_fd;
> unsigned pack_local:1,
>  pack_keep:1,
> +freshened:1,
>  do_not_close:1;
> unsigned char sha1[20];
> /* something like ".git/objects/pack/x.pack" */
> diff --git a/sha1_file.c b/sha1_file.c
> index 822aaef..26b9b2b 100644
> --- a/sha1_file.c
> +++ b/sha1_file.c
> @@ -2999,7 +2999,14 @@ static int freshen_loose_object(const unsigned char 
> *sha1)
>  static int freshen_packed_object(const unsigned char *sha1)
>  {
> struct pack_entry e;
> -   return find_pack_entry(sha1, &e) && freshen_file(e.p->pack_name);
> +   if (!find_pack_entry(sha1, &e))
> +   return 0;
> +   if (e.p->freshened)
> +   return 1;
> +   if (!freshen_file(e.p->pack_name))
> +   return 0;
> +   e.p->freshened = 1;
> +   return 1;
>  }
>
>  int write_sha1_file(const void *buf, unsigned long len, const char *type, 
> unsigned char *returnsha1)
> --
> 2.4.0.rc2.384.g7297a4a
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/2] sha1_file: only freshen packs once per run

2015-04-20 Thread Jeff King
Since 33d4221 (write_sha1_file: freshen existing objects,
2014-10-15), we update the mtime of existing objects that we
would have written out (had they not existed). For the
common case in which many objects are packed, we may update
the mtime on a single packfile repeatedly. This can result
in a noticeable performance problem if calling utime() is
expensive (e.g., because your storage is on NFS).

We can fix this by keeping a per-pack flag that lets us
freshen only once per program invocation.

An alternative would be to keep the packed_git.mtime flag up
to date as we freshen, and freshen only once every N
seconds. In practice, it's not worth the complexity. We are
racing against prune expiration times here, which inherently
must be set to accomodate reasonable program running times
(because they really care about the time between an object
being written and it becoming referenced, and the latter is
typically the last step a program takes).

Signed-off-by: Jeff King 
---
Hopefully I didn't botch the flag logic again. :) I tested with "strace
-c" myself this time, so I think it is good.

 cache.h | 1 +
 sha1_file.c | 9 -
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/cache.h b/cache.h
index 3d3244b..72c6888 100644
--- a/cache.h
+++ b/cache.h
@@ -1174,6 +1174,7 @@ extern struct packed_git {
int pack_fd;
unsigned pack_local:1,
 pack_keep:1,
+freshened:1,
 do_not_close:1;
unsigned char sha1[20];
/* something like ".git/objects/pack/x.pack" */
diff --git a/sha1_file.c b/sha1_file.c
index 822aaef..26b9b2b 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -2999,7 +2999,14 @@ static int freshen_loose_object(const unsigned char 
*sha1)
 static int freshen_packed_object(const unsigned char *sha1)
 {
struct pack_entry e;
-   return find_pack_entry(sha1, &e) && freshen_file(e.p->pack_name);
+   if (!find_pack_entry(sha1, &e))
+   return 0;
+   if (e.p->freshened)
+   return 1;
+   if (!freshen_file(e.p->pack_name))
+   return 0;
+   e.p->freshened = 1;
+   return 1;
 }
 
 int write_sha1_file(const void *buf, unsigned long len, const char *type, 
unsigned char *returnsha1)
-- 
2.4.0.rc2.384.g7297a4a
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html