Re: Slow git pushes: sitting 1 minute in pack-objects
On Mon, Mar 16, 2015 at 6:15 PM, Jeff King p...@peff.net wrote: On Mon, Mar 09, 2015 at 09:37:25PM -0400, Stephen Morton wrote: 3. Not sure how long this part takes. It takes 1/3 - 1/2 of the time when straced, but I think it's much less, as little as 10s when not straced. It then reads a bunch of what look like objects from filehandle 0 (presumably stdin, read from the remote end?) It then tries to lstat() these filenames in .git/sha1 ./git/refs/sha1, .git/heads/sha, etc. It always fails ENOENT. It fails some 120,000 times. This could be a problem. Though I haven't checked to see if this happens on a fast push on another machine. Hmm. The push process must feed the set of object boundaries to pack-objects so it knows what to pack (i.e., what we want to send, and what the other side has). 120,000 is an awfully large number of objects to be pass there, though. Does the repo you are pushing to by any chance have an extremely large number of refs (e.g., on the order of 120K tags)? No. There are on the order of 9,500 refs (mostly tags) but nowhere near 120k. Can you try building git with this patch which would tell us more about where those objects are coming from? patch snipped Those are all rather blunt debugging methods, but hopefully it can get us a sense of where the time is going. -Peff Thanks Peff, I haven't tried your patch, but I tried to backtrack a bit and double-check that the problem always happened in a fresh clone with no other modifications. It did _not_ happen in a new clone --a push took just 5s -- and I think the culprit could have been repack.writebitmaps=true. Although I had thought writebitmaps was not originally enabled, I now suspect that it was. Let me follow up on that first, before I recompile git with your changes. Thanks again, I really appreciate the help. Steve -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Slow git pushes: sitting 1 minute in pack-objects
On Thu, Mar 19, 2015 at 04:31:36PM -0400, Stephen Morton wrote: Hmm. The push process must feed the set of object boundaries to pack-objects so it knows what to pack (i.e., what we want to send, and what the other side has). 120,000 is an awfully large number of objects to be pass there, though. Does the repo you are pushing to by any chance have an extremely large number of refs (e.g., on the order of 120K tags)? No. There are on the order of 9,500 refs (mostly tags) but nowhere near 120k. I think you mentioned that it uses alternates to share objects with other repos. Does the repository (or repositories) pointed to by the alternates file have a large number of refs (especially distinct refs, as I think modern git will squelch duplicate sha1s). It did _not_ happen in a new clone --a push took just 5s -- and I think the culprit could have been repack.writebitmaps=true. Although I had thought writebitmaps was not originally enabled, I now suspect that it was. Let me follow up on that first, before I recompile git with your changes. It's certainly possible that bitmaps have an impact here. They should not contribute to the 120K objects being passed to pack-objects, but it's possible that size is a red herring (or possibly the number of objects is choking something in the bitmap code path that does not have problems otherwise). -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Slow git pushes: sitting 1 minute in pack-objects
On Mon, Mar 09, 2015 at 09:37:25PM -0400, Stephen Morton wrote: 3. Not sure how long this part takes. It takes 1/3 - 1/2 of the time when straced, but I think it's much less, as little as 10s when not straced. It then reads a bunch of what look like objects from filehandle 0 (presumably stdin, read from the remote end?) It then tries to lstat() these filenames in .git/sha1 ./git/refs/sha1, .git/heads/sha, etc. It always fails ENOENT. It fails some 120,000 times. This could be a problem. Though I haven't checked to see if this happens on a fast push on another machine. Hmm. The push process must feed the set of object boundaries to pack-objects so it knows what to pack (i.e., what we want to send, and what the other side has). 120,000 is an awfully large number of objects to be pass there, though. Does the repo you are pushing to by any chance have an extremely large number of refs (e.g., on the order of 120K tags)? Can you try building git with this patch which would tell us more about where those objects are coming from? diff --git a/send-pack.c b/send-pack.c index 9d2b0c5..17ace1f 100644 --- a/send-pack.c +++ b/send-pack.c @@ -74,10 +74,19 @@ static int pack_objects(int fd, struct ref *refs, struct sha1_array *extra, stru * We feed the pack-objects we just spawned with revision * parameters by writing to the pipe. */ + warning(feeding %d .have objects, extra-nr); for (i = 0; i extra-nr; i++) if (!feed_object(extra-sha1[i], po.in, 1)) break; + { + struct ref *p; + int count = 0; + for (p = refs; p; p = p-next) + count++; + warning(feeding %d refs, count); + } + while (refs) { if (!is_null_sha1(refs-old_sha1) !feed_object(refs-old_sha1, po.in, 1)) 4. Then it just sits there for almost 1 minute. It uses up almost 100% of a single core during this period. It spends a lot of time running lots of brk() (memory allocation) commands and then getting (polled by?) a SIGALRM every 1s. About 5.5+ GB (about the repo size) of VIRT memory is allocated. Only about 400M is ever RES. The SIGALRM is normal. That's the progress code checking in every 2 seconds to see if there's anything to print. My guess is that the allocation is probably coming as part of the preferred base code. This is a fancy term for stuff we are not going to send, but which we could use as delta bases for objects we are sending. Does the patch below speed things up (it is not a good thing to do in general, but would let us know if that is the problematic area): diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index d816587..c90a352 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -2307,7 +2307,6 @@ static void show_object(struct object *obj, static void show_edge(struct commit *commit) { - add_preferred_base(commit-object.sha1); } struct in_pack_object { If not, then the slowdown may come before we even get there, and possibly this patch would help: diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index d816587..473c0a3 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -2531,6 +2531,8 @@ static void get_object_list(int ac, const char **av) } die(not a rev '%s', line); } + if (*line == '^') + continue; if (handle_revision_arg(line, revs, flags, REVARG_CANNOT_BE_FILENAME)) die(bad revision '%s', line); } Those are all rather blunt debugging methods, but hopefully it can get us a sense of where the time is going. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Slow git pushes: sitting 1 minute in pack-objects
Thanks Peff, I've done an strace and here's what I see. I'll try to put relevant information in as legible a form as possible. The operation is cpu-bound on a single core (note that yes, delta compression is using 8 threads. so that's obviously not the bottleneck) for the duration of the pack-objects. Timing is tough, because it's slower when straced. I don't know if it matters, but this push is going to a fork on the remote end most of whose objects are in the objects/info/alternatives location. (Which is still on the local drive of the remote, not a network drive or anything like that.) 1. GIT_TRACE=1 time time git push 12:21:28.980410 git.c:349 trace: built-in: git 'push' 12:21:29.099547 run-command.c:341 trace: run_command: 'ssh' '-p' '7999' 'git@server.privacy' 'git-receive-pack '\''~smorton/panos.git'\''' 12:21:42.863968 run-command.c:341 trace: run_command: 'pack-objects' '--all-progress-implied' '--revs' '--stdout' '--thin' '--delta-base-offset' '--progress' 12:21:42.871170 exec_cmd.c:134 trace: exec: 'git' 'pack-objects' '--all-progress-implied' '--revs' '--stdout' '--thin' '--delta-base-offset' '--progress' 12:21:42.907783 git.c:349 trace: built-in: git 'pack-objects' '--all-progress-implied' '--revs' '--stdout' '--thin' '--delta-base-offset' '--progress' 2. At this point, git pack-objects is forked 3. Not sure how long this part takes. It takes 1/3 - 1/2 of the time when straced, but I think it's much less, as little as 10s when not straced. It then reads a bunch of what look like objects from filehandle 0 (presumably stdin, read from the remote end?) It then tries to lstat() these filenames in .git/sha1 ./git/refs/sha1, .git/heads/sha, etc. It always fails ENOENT. It fails some 120,000 times. This could be a problem. Though I haven't checked to see if this happens on a fast push on another machine. This takes a lot of time when straced, but I think less time when not straced. 4. Then it just sits there for almost 1 minute. It uses up almost 100% of a single core during this period. It spends a lot of time running lots of brk() (memory allocation) commands and then getting (polled by?) a SIGALRM every 1s. About 5.5+ GB (about the repo size) of VIRT memory is allocated. Only about 400M is ever RES. 5. Then we get to the Counting objects phase. Note the number of threads specified: so it Counting objects: 4, done. Delta compression using up to 8 threads. Compressing objects: 100% (4/4), done. ... 6. Time. Note the 121,000 pagefaults 46.08user 0.67system 0:49.47elapsed 94%CPU (0avgtext+0avgdata 1720240maxresident)k 0inputs+16outputs (0major+121199minor)pagefaults 0swaps Steve On Mon, Mar 9, 2015 at 3:53 AM, Jeff King p...@peff.net wrote: On Thu, Mar 05, 2015 at 04:03:07PM -0500, Stephen Morton wrote: I'm experiencing very slow git pushes. On the order of 1 minute to push a trivial one-line change. When I set GIT_TRACE=1, I see that it seems to be taking a lot of time in the pack-objects phase. Can you tell what pack-objects is doing? Perhaps strace -f -tt might help. Or even just time git push, which will tell us whether we're spinning on CPU or something else. If you watch the progress meter, where does it spend its time? In counting objects, compressing objects, or writing objects? Or after that? You said the repo is big. Is it reasonably packed (try running `git gc`). Pack-objects may look at objects that you know the other side has in order to create deltas against them. If you have some crazy situation (e.g., you have several thousand packfiles) that can slow down object access quite a bit. A minute is more than I would expect, but if VFS operations in your VM are slow, that could be it. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Slow git pushes: sitting 1 minute in pack-objects
On Thu, Mar 05, 2015 at 04:03:07PM -0500, Stephen Morton wrote: I'm experiencing very slow git pushes. On the order of 1 minute to push a trivial one-line change. When I set GIT_TRACE=1, I see that it seems to be taking a lot of time in the pack-objects phase. Can you tell what pack-objects is doing? Perhaps strace -f -tt might help. Or even just time git push, which will tell us whether we're spinning on CPU or something else. If you watch the progress meter, where does it spend its time? In counting objects, compressing objects, or writing objects? Or after that? You said the repo is big. Is it reasonably packed (try running `git gc`). Pack-objects may look at objects that you know the other side has in order to create deltas against them. If you have some crazy situation (e.g., you have several thousand packfiles) that can slow down object access quite a bit. A minute is more than I would expect, but if VFS operations in your VM are slow, that could be it. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Slow git pushes: sitting 1 minute in pack-objects
(Apologies, after a day I'm cross-posting from git.users. I think the question is maybe too technical for that group.) I'm experiencing very slow git pushes. On the order of 1 minute to push a trivial one-line change. When I set GIT_TRACE=1, I see that it seems to be taking a lot of time in the pack-objects phase. Others are not seeing this with the same repo, but I'm the only one working in a VM. ``` ~/ws/git/repo.1/repo date; git push mortons; date Wed Mar 4 15:03:11 EST 2015 15:03:11.086758 git.c:349 trace: built-in: git 'push' 'mortons' 15:03:11.126665 run-command.c:341 trace: run_command: 'ssh' '-p' '7999' 'git@privacy.privacy' 'git-receive-pack '\''~mortons/repo.git'\''' 15:03:20.383341 run-command.c:341 trace: run_command: 'pack-objects' '--all-progress-implied' '--revs' '--stdout' '--thin' '--delta-base-offset' '--progress' 15:03:20.383945 exec_cmd.c:134 trace: exec: 'git' 'pack-objects' '--all-progress-implied' '--revs' '--stdout' '--thin' '--delta-base-offset' '--progress' 15:03:20.385168 git.c:349 trace: built-in: git 'pack-objects' '--all-progress-implied' '--revs' '--stdout' '--thin' '--delta-base-offset' '--progress' Counting objects: 4, done. Delta compression using up to 8 threads. Compressing objects: 100% (4/4), done. Writing objects: 100% (4/4), 20.86 KiB | 0 bytes/s, done. Total 4 (delta 0), reused 0 (delta 0) To ssh://git@privacy.privacy:7999/~mortons/repo.git 5fe662f..a137bda my_branch - my_branch Wed Mar 4 15:04:22 EST 2015 ``` After it was slow at first, I tried setting these which did not help repack.writebitmaps=true pack.windowmemory=100m Details: git version 2.1.4 OS: CentOS 6.6 64-bit in a VM. repo size: huge. 6 GB .git directory, around 800 MB working tree. VM has 8 MB RAM and 8 cores. CPU: i7, 8 core (4 cores hyperthreaded) It is an ext4 filesystem on the guest linux drive. On the host side, it is a .vmdk file and the virtualization software used is virtualbox. While the drive is dynamically allocated, after I ran into this issue, I used fallocate to create a 50GB dummy file and then delete it to ensure that there was headroom in the drive and that dynamic allocation slowness was not the issue, and subsequent pushes were still slow. I have not experienced any filesystem slowness issues in the months I've been using this vm. Any ideas? I'm evaluating a move to git and this is the kind of thing that could derail it. Thanks, Stephen -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Slow git pushes: sitting 1 minute in pack-objects
On Fri, Mar 6, 2015 at 4:03 AM, Stephen Morton stephen.c.mor...@gmail.com wrote: I'm experiencing very slow git pushes. On the order of 1 minute to push a trivial one-line change. When I set GIT_TRACE=1, I see that it seems to be taking a lot of time in the pack-objects phase. Others are not seeing this with the same repo, but I'm the only one working in a VM. ... Details: git version 2.1.4 OS: CentOS 6.6 64-bit in a VM. repo size: huge. 6 GB .git directory, around 800 MB working tree. VM has 8 MB RAM and 8 cores. Is it 8 GB or MB RAM? CPU: i7, 8 core (4 cores hyperthreaded) -- Duy -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Slow git pushes: sitting 1 minute in pack-objects
8GB of RAM. Sorry, typo. Steve On Thu, Mar 5, 2015 at 7:25 PM, Duy Nguyen pclo...@gmail.com wrote: On Fri, Mar 6, 2015 at 4:03 AM, Stephen Morton stephen.c.mor...@gmail.com wrote: I'm experiencing very slow git pushes. On the order of 1 minute to push a trivial one-line change. When I set GIT_TRACE=1, I see that it seems to be taking a lot of time in the pack-objects phase. Others are not seeing this with the same repo, but I'm the only one working in a VM. ... Details: git version 2.1.4 OS: CentOS 6.6 64-bit in a VM. repo size: huge. 6 GB .git directory, around 800 MB working tree. VM has 8 MB RAM and 8 cores. Is it 8 GB or MB RAM? CPU: i7, 8 core (4 cores hyperthreaded) -- Duy -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html