Re: Slow git pushes: sitting 1 minute in pack-objects

2015-03-19 Thread Stephen Morton
On Mon, Mar 16, 2015 at 6:15 PM, Jeff King p...@peff.net wrote:
 On Mon, Mar 09, 2015 at 09:37:25PM -0400, Stephen Morton wrote:

 3. Not sure how long this part takes. It takes 1/3 - 1/2 of the time
 when straced, but I think it's much less, as little as 10s when not
 straced.
 It then reads a bunch of what look like objects from filehandle 0
 (presumably stdin, read from the remote end?)
 It then tries to lstat() these filenames in .git/sha1
 ./git/refs/sha1, .git/heads/sha, etc. It always fails ENOENT.
 It fails some 120,000 times. This could be a problem. Though I haven't
 checked to see if this happens on a fast push on another machine.

 Hmm. The push process must feed the set of object boundaries to
 pack-objects so it knows what to pack (i.e., what we want to send, and
 what the other side has).

 120,000 is an awfully large number of objects to be pass there, though.
 Does the repo you are pushing to by any chance have an extremely large
 number of refs (e.g., on the order of 120K tags)?


No. There are on the order of 9,500 refs (mostly tags) but nowhere near 120k.




 Can you try building git with this patch which would tell us more about
 where those objects are coming from?


patch snipped

 Those are all rather blunt debugging methods, but hopefully it can get
 us a sense of where the time is going.

 -Peff


Thanks Peff,


I haven't tried your patch, but I tried to backtrack a bit and
double-check that the problem always happened in a fresh clone with no
other modifications.

It did _not_ happen in a new clone --a push took just 5s -- and I
think the culprit could have been repack.writebitmaps=true. Although
I had thought writebitmaps was not originally enabled, I now suspect
that it was. Let me follow up on that first, before I recompile git
with your changes.


Thanks again, I really appreciate the help.

Steve
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Slow git pushes: sitting 1 minute in pack-objects

2015-03-19 Thread Jeff King
On Thu, Mar 19, 2015 at 04:31:36PM -0400, Stephen Morton wrote:

  Hmm. The push process must feed the set of object boundaries to
  pack-objects so it knows what to pack (i.e., what we want to send, and
  what the other side has).
 
  120,000 is an awfully large number of objects to be pass there, though.
  Does the repo you are pushing to by any chance have an extremely large
  number of refs (e.g., on the order of 120K tags)?
 
 No. There are on the order of 9,500 refs (mostly tags) but nowhere near 120k.

I think you mentioned that it uses alternates to share objects with
other repos. Does the repository (or repositories) pointed to by the
alternates file have a large number of refs (especially distinct refs,
as I think modern git will squelch duplicate sha1s).

 It did _not_ happen in a new clone --a push took just 5s -- and I
 think the culprit could have been repack.writebitmaps=true. Although
 I had thought writebitmaps was not originally enabled, I now suspect
 that it was. Let me follow up on that first, before I recompile git
 with your changes.

It's certainly possible that bitmaps have an impact here. They should
not contribute to the 120K objects being passed to pack-objects, but
it's possible that size is a red herring (or possibly the number of
objects is choking something in the bitmap code path that does not have
problems otherwise).

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Slow git pushes: sitting 1 minute in pack-objects

2015-03-16 Thread Jeff King
On Mon, Mar 09, 2015 at 09:37:25PM -0400, Stephen Morton wrote:

 3. Not sure how long this part takes. It takes 1/3 - 1/2 of the time
 when straced, but I think it's much less, as little as 10s when not
 straced.
 It then reads a bunch of what look like objects from filehandle 0
 (presumably stdin, read from the remote end?)
 It then tries to lstat() these filenames in .git/sha1
 ./git/refs/sha1, .git/heads/sha, etc. It always fails ENOENT.
 It fails some 120,000 times. This could be a problem. Though I haven't
 checked to see if this happens on a fast push on another machine.

Hmm. The push process must feed the set of object boundaries to
pack-objects so it knows what to pack (i.e., what we want to send, and
what the other side has).

120,000 is an awfully large number of objects to be pass there, though.
Does the repo you are pushing to by any chance have an extremely large
number of refs (e.g., on the order of 120K tags)?

Can you try building git with this patch which would tell us more about
where those objects are coming from?

diff --git a/send-pack.c b/send-pack.c
index 9d2b0c5..17ace1f 100644
--- a/send-pack.c
+++ b/send-pack.c
@@ -74,10 +74,19 @@ static int pack_objects(int fd, struct ref *refs, struct 
sha1_array *extra, stru
 * We feed the pack-objects we just spawned with revision
 * parameters by writing to the pipe.
 */
+   warning(feeding %d .have objects, extra-nr);
for (i = 0; i  extra-nr; i++)
if (!feed_object(extra-sha1[i], po.in, 1))
break;
 
+   {
+   struct ref *p;
+   int count = 0;
+   for (p = refs; p; p = p-next)
+   count++;
+   warning(feeding %d refs, count);
+   }
+
while (refs) {
if (!is_null_sha1(refs-old_sha1) 
!feed_object(refs-old_sha1, po.in, 1))

 4. Then it just sits there for almost 1 minute. It uses up almost 100%
 of a single core during this period. It spends a lot of time running
 lots of brk() (memory allocation) commands and then getting (polled
 by?) a SIGALRM every 1s.
 About 5.5+ GB (about the repo size) of VIRT memory is allocated. Only
 about 400M is ever RES.

The SIGALRM is normal. That's the progress code checking in every 2
seconds to see if there's anything to print. My guess is that the
allocation is probably coming as part of the preferred base code. This
is a fancy term for stuff we are not going to send, but which we could
use as delta bases for objects we are sending.

Does the patch below speed things up (it is not a good thing to do in
general, but would let us know if that is the problematic area):

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index d816587..c90a352 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -2307,7 +2307,6 @@ static void show_object(struct object *obj,
 
 static void show_edge(struct commit *commit)
 {
-   add_preferred_base(commit-object.sha1);
 }
 
 struct in_pack_object {

If not, then the slowdown may come before we even get there, and
possibly this patch would help:

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index d816587..473c0a3 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -2531,6 +2531,8 @@ static void get_object_list(int ac, const char **av)
}
die(not a rev '%s', line);
}
+   if (*line == '^')
+   continue;
if (handle_revision_arg(line, revs, flags, 
REVARG_CANNOT_BE_FILENAME))
die(bad revision '%s', line);
}

Those are all rather blunt debugging methods, but hopefully it can get
us a sense of where the time is going.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Slow git pushes: sitting 1 minute in pack-objects

2015-03-09 Thread Stephen Morton
Thanks Peff,

I've done an strace and here's what I see. I'll try to put relevant
information in as legible a form as possible. The operation is
cpu-bound on a single core (note that yes, delta compression is using
8 threads. so that's obviously not the bottleneck) for the duration of
the pack-objects.

Timing is tough, because it's slower when straced.

I don't know if it matters, but this push is going to a fork on the
remote end most of whose objects are in the objects/info/alternatives
location. (Which is still on the local drive of the remote, not a
network drive or anything like that.)

1.
  GIT_TRACE=1 time time git push
12:21:28.980410 git.c:349   trace: built-in: git 'push'
12:21:29.099547 run-command.c:341   trace: run_command: 'ssh' '-p'
'7999' 'git@server.privacy' 'git-receive-pack
'\''~smorton/panos.git'\'''
12:21:42.863968 run-command.c:341   trace: run_command:
'pack-objects' '--all-progress-implied' '--revs' '--stdout' '--thin'
'--delta-base-offset' '--progress'
12:21:42.871170 exec_cmd.c:134  trace: exec: 'git'
'pack-objects' '--all-progress-implied' '--revs' '--stdout' '--thin'
'--delta-base-offset' '--progress'
12:21:42.907783 git.c:349   trace: built-in: git
'pack-objects' '--all-progress-implied' '--revs' '--stdout' '--thin'
'--delta-base-offset' '--progress'

2. At this point, git pack-objects is forked

3. Not sure how long this part takes. It takes 1/3 - 1/2 of the time
when straced, but I think it's much less, as little as 10s when not
straced.
It then reads a bunch of what look like objects from filehandle 0
(presumably stdin, read from the remote end?)
It then tries to lstat() these filenames in .git/sha1
./git/refs/sha1, .git/heads/sha, etc. It always fails ENOENT.
It fails some 120,000 times. This could be a problem. Though I haven't
checked to see if this happens on a fast push on another machine.

This takes a lot of time when straced, but I think less time when not straced.

4. Then it just sits there for almost 1 minute. It uses up almost 100%
of a single core during this period. It spends a lot of time running
lots of brk() (memory allocation) commands and then getting (polled
by?) a SIGALRM every 1s.
About 5.5+ GB (about the repo size) of VIRT memory is allocated. Only
about 400M is ever RES.


5. Then we get to the Counting objects phase.
Note the number of threads specified: so it

Counting objects: 4, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (4/4), done.
...


6. Time. Note the 121,000 pagefaults

46.08user 0.67system 0:49.47elapsed 94%CPU (0avgtext+0avgdata
1720240maxresident)k
0inputs+16outputs (0major+121199minor)pagefaults 0swaps

Steve



On Mon, Mar 9, 2015 at 3:53 AM, Jeff King p...@peff.net wrote:
 On Thu, Mar 05, 2015 at 04:03:07PM -0500, Stephen Morton wrote:

 I'm experiencing very slow git pushes. On the order of 1 minute to push a
 trivial one-line change. When I set GIT_TRACE=1, I see that it seems to be
 taking a lot of time in the pack-objects phase.

 Can you tell what pack-objects is doing? Perhaps strace -f -tt might
 help. Or even just time git push, which will tell us whether we're
 spinning on CPU or something else.

 If you watch the progress meter, where does it spend its time? In
 counting objects, compressing objects, or writing objects? Or
 after that?

 You said the repo is big. Is it reasonably packed (try running `git
 gc`). Pack-objects may look at objects that you know the other side has
 in order to create deltas against them. If you have some crazy situation
 (e.g., you have several thousand packfiles) that can slow down object
 access quite a bit. A minute is more than I would expect, but if VFS
 operations in your VM are slow, that could be it.

 -Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Slow git pushes: sitting 1 minute in pack-objects

2015-03-09 Thread Jeff King
On Thu, Mar 05, 2015 at 04:03:07PM -0500, Stephen Morton wrote:

 I'm experiencing very slow git pushes. On the order of 1 minute to push a
 trivial one-line change. When I set GIT_TRACE=1, I see that it seems to be
 taking a lot of time in the pack-objects phase.

Can you tell what pack-objects is doing? Perhaps strace -f -tt might
help. Or even just time git push, which will tell us whether we're
spinning on CPU or something else.

If you watch the progress meter, where does it spend its time? In
counting objects, compressing objects, or writing objects? Or
after that?

You said the repo is big. Is it reasonably packed (try running `git
gc`). Pack-objects may look at objects that you know the other side has
in order to create deltas against them. If you have some crazy situation
(e.g., you have several thousand packfiles) that can slow down object
access quite a bit. A minute is more than I would expect, but if VFS
operations in your VM are slow, that could be it.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Slow git pushes: sitting 1 minute in pack-objects

2015-03-05 Thread Stephen Morton
(Apologies, after a day I'm cross-posting from git.users. I think the question
is maybe too technical for that group.)

I'm experiencing very slow git pushes. On the order of 1 minute to push a
trivial one-line change. When I set GIT_TRACE=1, I see that it seems to be
taking a lot of time in the pack-objects phase.

Others are not seeing this with the same repo, but I'm the only one working
in a VM.

```
~/ws/git/repo.1/repo  date; git push mortons; date
Wed Mar  4 15:03:11 EST 2015
15:03:11.086758 git.c:349   trace: built-in: git 'push' 'mortons'
15:03:11.126665 run-command.c:341   trace: run_command: 'ssh' '-p'
'7999' 'git@privacy.privacy' 'git-receive-pack
'\''~mortons/repo.git'\'''
15:03:20.383341 run-command.c:341   trace: run_command:
'pack-objects' '--all-progress-implied' '--revs' '--stdout' '--thin'
'--delta-base-offset' '--progress'
15:03:20.383945 exec_cmd.c:134  trace: exec: 'git'
'pack-objects' '--all-progress-implied' '--revs' '--stdout' '--thin'
'--delta-base-offset' '--progress'
15:03:20.385168 git.c:349   trace: built-in: git
'pack-objects' '--all-progress-implied' '--revs' '--stdout' '--thin'
'--delta-base-offset' '--progress'
Counting objects: 4, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (4/4), done.
Writing objects: 100% (4/4), 20.86 KiB | 0 bytes/s, done.
Total 4 (delta 0), reused 0 (delta 0)
To ssh://git@privacy.privacy:7999/~mortons/repo.git
   5fe662f..a137bda  my_branch - my_branch
Wed Mar  4 15:04:22 EST 2015

```

After it was slow at first, I tried setting these which did not help

repack.writebitmaps=true
pack.windowmemory=100m



Details:
git version 2.1.4
OS: CentOS 6.6 64-bit in a VM.
repo size: huge. 6 GB .git directory, around 800 MB working tree.
VM has 8 MB RAM and 8 cores.
CPU: i7, 8 core (4 cores hyperthreaded)

It is an ext4 filesystem on the guest linux drive.
On the host side, it is a .vmdk file and the virtualization software used is
virtualbox. While the drive is dynamically allocated, after I ran into
this issue,
I used fallocate to create a 50GB dummy file and then delete it to
ensure that
there was headroom in the drive and that dynamic allocation slowness was not the
issue, and subsequent pushes were still slow.
I have not experienced any filesystem slowness issues in the
months I've been using this vm.


Any ideas? I'm evaluating a move to git and this is the kind of thing
that could derail it.
Thanks,

Stephen
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Slow git pushes: sitting 1 minute in pack-objects

2015-03-05 Thread Duy Nguyen
On Fri, Mar 6, 2015 at 4:03 AM, Stephen Morton
stephen.c.mor...@gmail.com wrote:
 I'm experiencing very slow git pushes. On the order of 1 minute to push a
 trivial one-line change. When I set GIT_TRACE=1, I see that it seems to be
 taking a lot of time in the pack-objects phase.

 Others are not seeing this with the same repo, but I'm the only one working
 in a VM.

 ...

 Details:
 git version 2.1.4
 OS: CentOS 6.6 64-bit in a VM.
 repo size: huge. 6 GB .git directory, around 800 MB working tree.
 VM has 8 MB RAM and 8 cores.

Is it 8 GB or MB RAM?

 CPU: i7, 8 core (4 cores hyperthreaded)
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Slow git pushes: sitting 1 minute in pack-objects

2015-03-05 Thread Stephen Morton
8GB of RAM.

Sorry, typo.

Steve


On Thu, Mar 5, 2015 at 7:25 PM, Duy Nguyen pclo...@gmail.com wrote:
 On Fri, Mar 6, 2015 at 4:03 AM, Stephen Morton
 stephen.c.mor...@gmail.com wrote:
 I'm experiencing very slow git pushes. On the order of 1 minute to push a
 trivial one-line change. When I set GIT_TRACE=1, I see that it seems to be
 taking a lot of time in the pack-objects phase.

 Others are not seeing this with the same repo, but I'm the only one working
 in a VM.

 ...

 Details:
 git version 2.1.4
 OS: CentOS 6.6 64-bit in a VM.
 repo size: huge. 6 GB .git directory, around 800 MB working tree.
 VM has 8 MB RAM and 8 cores.

 Is it 8 GB or MB RAM?

 CPU: i7, 8 core (4 cores hyperthreaded)
 --
 Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html