Re: Seems to be pushing more than necessary

2015-03-23 Thread Graham Hay
If I push straight to the other repo, it only pushes the 3 objects I'd
expect (instead of 10,000+). So it looks like that is the problem, but
I don't really understand why.

From my point of view, there should be no difference, but I clearly
don't understand how it actually works. How does git decide what refs
and/or objects are the same?

For a bit of background, the reason I have 2 remotes is to try and
avoid pushing to master. We work in a highly regulated industry, and
our code needs to be reviewed before hitting the mainline. So I push
to my fork and create a PR to the blessed repo, that way if I
accidentally commit to master (I have form!) then I have an extra
chance to catch it and don't have to back it out.

The two repos started out the same though, the only differences should
be the new work I have done. Is there any way I can continue to work
like this, or do I have to choose between slow pushes and safety?

On 23 March 2015 at 10:41, Duy Nguyen pclo...@gmail.com wrote:
 On Mon, Mar 23, 2015 at 5:35 PM, Graham Hay grahamr...@gmail.com wrote:
 Hmm. I'm using a private fork of a repo, I pull from one and push to
 the other, e.g.

 git fetch foo
 git rebase foo/master
 git push --set-upstream origin bar

 It's quite possible my workflow is causing the problem, but I'm not
 sure what I could do differently. What do you mean by a no-share
 remote?

 I mean the refs (and associated objects) that are available on foo
 may not be available on origin, so when you push bar to origin you
 just need to send more. That rebase could generate lots of new objects
 to push out too, I think.
 --
 Duy
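
As a rough illustration of the point above, the number of objects a push has to send can be estimated by listing everything reachable from the branch but not from the remote-tracking refs of the target remote. The sketch below is not the poster's actual setup. It assumes two throwaway bare repositories in a temporary directory (foo.git standing in for the blessed repo, origin.git for an empty fork), and the identity passed to git is a placeholder.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git init -q --bare foo.git       # hypothetical stand-in for the blessed repo
git init -q --bare origin.git    # hypothetical stand-in for the private fork
git init -q local
cd local

# Placeholder identity so the sketch does not depend on global config.
g() { git -c user.name=Example -c user.email=example@example.com -c commit.gpgsign=false "$@"; }

echo one > file.txt
g add file.txt
g commit -q -m "base"
git remote add foo ../foo.git
git remote add origin ../origin.git
git push -q foo HEAD:refs/heads/master
git fetch -q foo

echo two > file.txt
g commit -q -am "change"

# Objects reachable from HEAD but not from any remote-tracking ref of each remote.
vs_foo=$(git rev-list --objects HEAD --not --remotes=foo | wc -l | tr -d ' ')
vs_origin=$(git rev-list --objects HEAD --not --remotes=origin | wc -l | tr -d ' ')
echo "new objects relative to foo:    $vs_foo"
echo "new objects relative to origin: $vs_origin"
```

In this toy setup the one new commit costs 3 objects relative to foo (commit, tree, blob) but 6 relative to origin, because nothing says origin already holds the base commit. This only approximates what push works out from the refs the receiving side advertises, but it shows how a remote with no shared refs leads to a much larger pack.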
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Seems to be pushing more than necessary

2015-03-23 Thread Graham Hay
Hmm. I'm using a private fork of a repo, I pull from one and push to
the other, e.g.

git fetch foo
git rebase foo/master
git push --set-upstream origin bar

It's quite possible my workflow is causing the problem, but I'm not
sure what I could do differently. What do you mean by a no-share
remote?

On 23 March 2015 at 10:05, Duy Nguyen pclo...@gmail.com wrote:
 On Thu, Mar 19, 2015 at 6:11 PM, Graham Hay grahamr...@gmail.com wrote:
 Try fast-export --anonymize as that would help us understand this.

 Attached.

 The bad news is it seems to be working for me (I recreated the remote
 repo from this dump). I notice that you have two remotes. One (the
 remote ref39) shares many refs with your local branches. The other,
 ref2, does not share any SHA-1 with the refs in .git/refs/heads/. Any
 chance you are pushing to the no-share remote, which would result in
 a lot of objects being sent?
 --
 Duy


Re: Fwd: Seems to be pushing more than necessary

2015-03-20 Thread Graham Hay
That all seems quite reasonable, and is what I would expect to happen.

However, at the moment, if I create a branch from master and edit one
line in one file, with no other changes on the remote, it takes me over
an hour to push the new branch.

On 19 March 2015 at 18:36, Junio C Hamano gits...@pobox.com wrote:
 Graham Hay grahamr...@gmail.com writes:

 We have a fairly large repo (~2.4GB), mainly due to binary resources
 (for an iOS app). I know this can generally be a problem, but I have a
 specific question.

 If I cut a branch, and edit a few (non-binary) files, and push, what
 should be uploaded? I assumed it was just the diff (I know whole
 compressed files are used, I mean the differences between my branch
 and where I cut it from). Is that correct?

 If you start from this state:

     (the 'origin')            (you)
     ---Z---A     --clone-->   ---Z---A

 and edit a few files, say, a/b, a/c and d/e/f, and committed to make
 the history look like this:

     (the 'origin')            (you)
     ---Z---A                  ---Z---A---B

 i.e. git diff --name-only A B would show these three files, then
 the next push from you to the origin, i.e.

     (the 'origin')            (you)
     ---Z---A---B  <--push--   ---Z---A---B

 would involve transferring the following from you to the origin:

  * The commit object that holds the message, authorship, etc. for B
  * The top-level tree object of commit B (as that is different from
    that of A)
  * The tree objects for 'a', 'd' and 'd/e', and the blob objects for
    'a/b', 'a/c' and 'd/e/f'.

 However, that assumes that nothing is happening on the 'origin'
 side.

 If the 'origin', for example, rewound its head to Z before you
 attempt to push your B, then you may end up sending objects that do
 not exist in Z that are reachable from B.  Just like the above
 bullet points enumerated what is different between A and B, you
 can enumerate what is different between Z and A and add that to the
 above set.  That would be what will be sent.

 If the 'origin' updated its tip to a commit you do not even know
 about, normally you will be prevented from pushing B because we
 would not want you to lose somebody else's work.  If you forced such
 push, then you may end up sending a lot more.
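
The enumeration above can be reproduced in a scratch repository. This is only a sketch: the file contents, the tags A and B, and the placeholder identity are made up for illustration, and the extra file named unchanged is there to show that untouched content is not listed.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Placeholder identity so the sketch does not depend on global config.
g() { git -c user.name=Example -c user.email=example@example.com -c commit.gpgsign=false "$@"; }

# Commit A: the state both sides share.
mkdir -p a d/e
echo b1 > a/b
echo c1 > a/c
echo f1 > d/e/f
echo keep > unchanged
g add .
g commit -q -m "A"
git tag A

# Commit B: edit the three files from the explanation.
echo b2 > a/b
echo c2 > a/c
echo f2 > d/e/f
g commit -q -am "B"
git tag B

git diff --name-only A B --

# Everything reachable from B that is not reachable from A.
objects=$(git rev-list --objects A..B)
printf '%s\n' "$objects"
count=$(printf '%s\n' "$objects" | wc -l | tr -d ' ')
echo "objects to send: $count"
```

The list should come to 8 entries: the commit, the top-level tree, the trees for a, d and d/e, and the three blobs. A push that counts thousands of objects for a change like this is therefore computing its set against something other than the parent commit.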


Re: Seems to be pushing more than necessary

2015-03-18 Thread Graham Hay
 It would help if you pasted the push output. For example, does it stop
 at 20% on the compressing objects line or the writing objects line?
 How many total objects does it say?

It rattles through compressing objects, and the first 20% of
writing objects, then slows to a crawl.

Writing objects:  33% (3647/10804), 80.00 MiB | 112.00 KiB/s


 Another question is how big these binary files are on average. Git
 considers a file big if its size is 512MB or more (see
 core.bigFileThreshold). If your binary files are mostly under this
 limit, but still big enough, then git may still try to compare new
 objects with these to find the smallest diff to send. If that's the
 case, you could lower core.bigFileThreshold to cover these binary files.

None of the files are very big (KB rather than MB), but there's a lot
of them. I'll try setting the threshold to something lower, thanks.
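
For reference, lowering the threshold is a one-line config change. The sketch below uses a scratch repository and an arbitrary value of 1m, which is an assumption for illustration only; the right value depends on the sizes of the files involved.

```shell
set -e
repo=$(mktemp -d)
git init -q "$repo"

# Files larger than this are stored without attempting delta compression.
# 1m is an arbitrary example value, not a recommendation.
git -C "$repo" config core.bigFileThreshold 1m

threshold=$(git -C "$repo" config --get core.bigFileThreshold)
echo "core.bigFileThreshold is now $threshold"
```

In a real repository the same `git config` line would be run from inside the working tree. Since the files here are described as kilobytes rather than megabytes, the threshold would need to be set quite low to affect them.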


Fwd: Seems to be pushing more than necessary

2015-03-18 Thread Graham Hay
We have a fairly large repo (~2.4GB), mainly due to binary resources
(for an iOS app). I know this can generally be a problem, but I have a
specific question.

If I cut a branch, and edit a few (non-binary) files, and push, what
should be uploaded? I assumed it was just the diff (I know whole
compressed files are used, I mean the differences between my branch
and where I cut it from). Is that correct?

Because when I push, it grinds to a halt at the 20% mark, and feels
like it's trying to push the entire repo. If I run git diff --stat
--cached origin/foo I see the files I would expect (i.e. just those
that have changed). If I run git format-patch origin/foo..foo the
patch files total 1.7MB, which should upload in just a few seconds,
but I've had pushes take over an hour. I'm using git 2.2.2 on Mac OS X
(Mavericks), and ssh (g...@github.com).

Am I doing it wrong? Is this the expected behaviour? If not, is
there anything I can do to debug it?

Any help gratefully received.

Thanks,

Graham


Re: Seems to be pushing more than necessary

2015-03-18 Thread Graham Hay
Are there any commands that I can use to show exactly what it is trying to push?

I'll see if I can create a (public) repo that has the same problem.
Thanks for your help.


 This 10804 looks wrong (i.e. sending that many compressed objects).
 Also 80 MiB sent at that point. If you modify just a couple files,
 something is really wrong because the number of new objects may be
 hundreds at most, not thousands.

 v2.2.2 supports git fast-export --anonymize [1] to create an
 anonymized clone of your repo that you can share, which might help
 us understand the problem.

 There's also the environment variable GIT_TRACE_PACKET that can help
 see what's going on at the protocol level, but I think you're on your
 own because without access to this repo, SHA-1s from that trace may
 not make much sense.

 [1] https://github.com/git/git/commit/a8722750985a53cc502a66ae3d68a9e42c7fdb98
 --
 Duy
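
As a small illustration of the GIT_TRACE_PACKET suggestion, the sketch below pushes one commit to a throwaway local bare repository and saves the protocol trace to a file. The repository names, the branch name demo and the placeholder identity are assumptions made for the example; against a real remote only the push line would be needed.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git init -q --bare remote.git    # hypothetical stand-in for the real remote
git init -q local
cd local
echo hello > file.txt
git add file.txt
git -c user.name=Example -c user.email=example@example.com -c commit.gpgsign=false \
    commit -q -m "first"

# GIT_TRACE_PACKET=1 writes the packet-level trace to stderr; capture it in a file.
GIT_TRACE_PACKET=1 git push -q ../remote.git HEAD:refs/heads/demo 2>"$work/trace.log"

packet_lines=$(grep -c 'packet:' "$work/trace.log")
echo "$packet_lines packet trace lines captured in $work/trace.log"
```

The lines in the trace show the refs the receiving side advertises and the ref updates the client sends back, which is where a mismatch between what each side believes it has would show up.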


Re: Seems to be pushing more than necessary

2015-03-18 Thread Graham Hay
I created a repo with over 1GB of images, but it works as expected
(only pushed 3 objects).

Sorry, I must have done something wrong. I put that script in
~/Applications, and checked it worked. Then I ran this:

$ GIT_TRACE=2 PATH=~/Applications:$PATH git push --set-upstream origin git-wtf
12:48:28.839026 git.c:349   trace: built-in: git 'push' '--set-upstream' 'origin' 'git-wtf'
12:48:28.907605 run-command.c:351   trace: run_command: 'ssh' 'g...@github.com' 'git-receive-pack '\''grahamrhay/bornlucky-ios.git'\'''
12:48:30.137410 run-command.c:351   trace: run_command: 'pack-objects' '--all-progress-implied' '--revs' '--stdout' '--thin' '--delta-base-offset' '--progress'
12:48:30.138246 exec_cmd.c:130  trace: exec: 'git' 'pack-objects' '--all-progress-implied' '--revs' '--stdout' '--thin' '--delta-base-offset' '--progress'
12:48:30.144783 git.c:349   trace: built-in: git 'pack-objects' '--all-progress-implied' '--revs' '--stdout' '--thin' '--delta-base-offset' '--progress'
Counting objects: 10837, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (9301/9301), done.
Writing objects:  21% (2276/10837)

but there was nothing in /tmp/stdin. Have I missed a step? I tried
changing the tee to point to ~ in case it was permissions related.

I fear this is some Mac nonsense. I added an echo in the script, but
it only gets called for the first git incantation.


On 18 March 2015 at 12:34, Duy Nguyen pclo...@gmail.com wrote:
 On Wed, Mar 18, 2015 at 7:26 PM, Duy Nguyen pclo...@gmail.com wrote:
 It's quite a lot of work :) I created this script named git and put
 it in $PATH to capture input for pack-objects. You'll need to update
 /path/to/real/git to point to the real binary then you'll get
 /tmp/stdin

 Forgot one important sentence: You need to push again using this fake
 git program to save data in /tmp/stdin. Also, you can stop the push
 once it reaches the compressing objects phase.
 --
 Duy


Re: Seems to be pushing more than necessary

2015-03-18 Thread Graham Hay
Got there eventually!

$ git verify-pack --verbose bar.pack
e13e21a1f49704ed35ddc3b15b6111a5f9b34702 commit 220 152 12
03691863451ef9db6c69493da1fa556f9338a01d commit 334 227 164
... snip ...
chain length = 50: 2 objects
bar.pack: ok

Now what do I do with it :)
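
One possible next step is to summarise the verify-pack listing by object type, to see whether the pack is dominated by blobs, trees or commits. The sketch below is an assumption about how one might do that, not something from the thread. It relies on the column layout of `git verify-pack --verbose` (object name, type, size, size in pack, offset) and, to stay self-contained, runs on the two sample lines quoted above rather than on the full output.

```shell
# Two sample lines copied from the verify-pack output above.
sample='e13e21a1f49704ed35ddc3b15b6111a5f9b34702 commit 220 152 12
03691863451ef9db6c69493da1fa556f9338a01d commit 334 227 164'

# Per object type: number of objects and total bytes occupied in the pack
# (column 4). Lines that are not object entries are skipped.
summary=$(printf '%s\n' "$sample" | awk '
  $2 ~ /^(commit|tree|blob|tag)$/ { count[$2]++; bytes[$2] += $4 }
  END { for (t in count) print t, count[t], bytes[t] }')
echo "$summary"
```

On the real pack, the same awk program would be fed from `git verify-pack --verbose bar.pack` instead of the sample variable. A large blob count would point at the binary resources being resent.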

On 18 March 2015 at 13:33, Duy Nguyen pclo...@gmail.com wrote:
 On Wed, Mar 18, 2015 at 8:16 PM, Graham Hay grahamr...@gmail.com wrote:
 I created a repo with over 1GB of images, but it works as expected
 (only pushed 3 objects).

 Sorry, I must have done something wrong. I put that script in
 ~/Applications, and checked it worked. Then I ran this:

 $ GIT_TRACE=2 PATH=~/Applications:$PATH git push --set-upstream origin git-wtf

 I think I encountered the same problem. Inserting
 --exec-path=$HOME/Applications between git and push was probably
 what made it work for me. Haven't investigated the reason yet. We
 really should have an easier way to get this info without jumping
 through hoops like this.
 --
 Duy