Re: git-p4 out of memory for very large repository

2013-09-07 Thread Pete Wyckoff
cmt...@gmail.com wrote on Fri, 06 Sep 2013 15:03 -0400:
 Finally, I claim success!  Unfortunately I did not try either of the OOM
 score or strace suggestions - sorry!  After spending so much time on
 this, I've gotten to the point that I'm more interested in getting it to
 work than in figuring out why the direct approach isn't working; it
 sounds like you're both pretty confident that git is working as it
 should, and since I don't maintain the system I'm doing this on, I
 wouldn't be surprised if there's some artificial limit or other quirk
 here that we just aren't seeing.
 
 Anyway, what I found is that Pete's incremental method does work; I just
 had to learn how to do it properly!  This is what I WAS doing to
 generate the error message I pasted several posts ago:
 
 git p4 clone //path/to/branch@begin,stage1
 cd branch
 git p4 sync //path/to/branch@stage2
 # ERROR!
 # (I also tried //path/to/branch@stage1+1,stage2, same error)
 
 Eventually what happened is that I downloaded the free 20-user p4d, set
 up a very small repository with only 4 changes, and started some old
 fashioned trial-and-error.  Here's what I should have been doing all
 along:
 
 git p4 clone //path/to/branch@begin,stage1
 cd branch
 git p4 sync //path/to/branch@begin,stage2
 git p4 sync //path/to/branch@begin,stage3
 # and so on...
 
 After syncing a few thousand changes every day over the past week, my
 git repo is finally up to the Perforce HEAD.  So I suppose ultimately
 this was my own misunderstanding, partly because when you begin your
 range at the original first change number, the output looks
 suspiciously as if it's re-importing changes it has already imported.
 Maybe this is all documented somewhere; if it is, I just failed to
 find it.
 
 Thanks to both of you for all your help!

That you got it to work is the most important thing.  Amazing all
the effort you put into it; a lesser hacker would have walked
away much earlier.

The changes aren't imported twice.  If you give it a range that includes
changes already synced, git-p4 makes sure to start only at the
lowest change it has not yet seen.  I'll see if I can update the
docs somewhere.
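
A quick way to convince yourself of that, using the same placeholder
path and change names as above: compare the commit count on the p4
import branch before and after a deliberately overlapping sync, and
you'll see it grows only by the genuinely new changes.

  git rev-list --count refs/remotes/p4/master   # note the count
  git p4 sync //path/to/branch@begin,stage3     # overlaps what's already imported
  git rev-list --count refs/remotes/p4/master   # grows only by the new changes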

-- Pete


Re: git-p4 out of memory for very large repository

2013-09-06 Thread Corey Thompson
On Mon, Sep 02, 2013 at 08:42:36PM +0100, Luke Diamand wrote:
 I guess you could try changing the OOM score for git-fast-import.
 
 change /proc/<pid>/oom_adj (or oom_score_adj on newer kernels).
 
 A strongly negative value there makes it very unlikely to be killed.
 
 On 29/08/13 23:46, Pete Wyckoff wrote:
 I usually just do git p4 sync @505859.  The error message below
 crops up when things get confused.  Usually after a previous
 error.  I tend to destroy the repo and try again.  Sorry I can't
 explain better what's happening here.  It's not a memory
 issue; it reports only 24 MB used.
 
 Bizarre.  There is no good explanation why memory usage would go
 up to 32 GB (?) within one top interval (3 sec ?).  My theory
 about one gigantic object is debunked:  you have only the 118 MB
 one.  Perhaps there's some container or process memory limit, as
 Luke guessed, but it's not obvious here.
 
 The other big hammer is strace.  If you're still interested in
 playing with this, you could do:
 
  strace -vf -tt -s 200 -o /tmp/strace.out git p4 clone 
 
 and hours later, see if something suggests itself toward the
 end of that output file.
 
  -- Pete
 

Finally, I claim success!  Unfortunately I did not try either of the OOM
score or strace suggestions - sorry!  After spending so much time on
this, I've gotten to the point that I'm more interested in getting it to
work than in figuring out why the direct approach isn't working; it
sounds like you're both pretty confident that git is working as it
should, and since I don't maintain the system I'm doing this on, I
wouldn't be surprised if there's some artificial limit or other quirk
here that we just aren't seeing.

Anyway, what I found is that Pete's incremental method does work; I just
had to learn how to do it properly!  This is what I WAS doing to
generate the error message I pasted several posts ago:

git p4 clone //path/to/branch@begin,stage1
cd branch
git p4 sync //path/to/branch@stage2
# ERROR!
# (I also tried //path/to/branch@stage1+1,stage2, same error)

Eventually what happened is that I downloaded the free 20-user p4d, set
up a very small repository with only 4 changes, and started some old
fashioned trial-and-error.  Here's what I should have been doing all
along:

git p4 clone //path/to/branch@begin,stage1
cd branch
git p4 sync //path/to/branch@begin,stage2
git p4 sync //path/to/branch@begin,stage3
# and so on...
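
If there are a lot of stages left, that loop is easy to script; here's
a rough sketch using the same placeholder change names as above:

  # walk the history forward one chunk at a time; stop at the first failure
  for change in stage2 stage3 stage4; do
      git p4 sync //path/to/branch@begin,$change || break
  done
  git p4 rebase   # then bring the local branch up to p4/master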

After syncing a few thousand changes every day over the past week, my
git repo is finally up to the Perforce HEAD.  So I suppose ultimately
this was my own misunderstanding, partly because when you begin your
range at the original first change number, the output looks
suspiciously as if it's re-importing changes it has already imported.
Maybe this is all documented somewhere; if it is, I just failed to
find it.

Thanks to both of you for all your help!
Corey


Re: git-p4 out of memory for very large repository

2013-09-02 Thread Luke Diamand

I guess you could try changing the OOM score for git-fast-import.

change /proc/<pid>/oom_adj (or oom_score_adj on newer kernels).

A strongly negative value there makes it very unlikely to be killed.
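
For example, something along these lines should do it on a running
import (the pgrep pattern is just a guess at how the process shows up
in the process table):

  # make the OOM killer much less likely to pick git-fast-import
  pid=$(pgrep -f fast-import)
  echo -1000 | sudo tee /proc/$pid/oom_score_adj   # newer kernels
  # echo -17 | sudo tee /proc/$pid/oom_adj         # older interface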

On 29/08/13 23:46, Pete Wyckoff wrote:

cmt...@gmail.com wrote on Wed, 28 Aug 2013 11:41 -0400:

On Mon, Aug 26, 2013 at 09:47:56AM -0400, Corey Thompson wrote:

You are correct that git-fast-import is killed by the OOM killer, but I
was unclear about which process was malloc()ing so much memory that the
OOM killer got invoked (as other completely unrelated processes usually
also get killed when this happens).

Unless there's one gigantic file in one change that gets removed by
another change, I don't think that's the problem; as I mentioned in
another email, the machine has 32GB physical memory and the largest
single file in the current head is only 118MB.  Even if there is a very
large transient file somewhere in the history, I seriously doubt it's
tens of gigabytes in size.

I have tried watching it with top before, but it takes several hours
before it dies.  I haven't been able to see any explosion of memory
usage, even within the final hour, but I've never caught it just before
it dies, either.  I suspect that whatever the issue is here, it happens
very quickly.

If I'm unable to get through this today using the incremental p4 sync
method you described, I'll try running a full-blown clone overnight with
top in batch mode writing to a log file to see whether it catches
anything.

Thanks again,
Corey


Unfortunately I have not made much progress.  The incremental sync method
fails with the output pasted below.  The change I specified is only one
change number above where that repo was cloned...


I usually just do git p4 sync @505859.  The error message below
crops up when things get confused.  Usually after a previous
error.  I tend to destroy the repo and try again.  Sorry I can't
explain better what's happening here.  It's not a memory
issue; it reports only 24 MB used.


So I tried a 'git p4 rebase' overnight with top running, and as I feared
I did not see anything out of the ordinary.  git, git-fast-import, and
git-p4 all hovered under 1.5% MEM the entire time, right up until
death.  The last entry in my log shows git-fast-import at 0.8%, with git
and git-p4 at 0.0% and 0.1%, respectively.  I could try again with a
more granular period, but I feel like this method is ultimately a goose
chase.


Bizarre.  There is no good explanation why memory usage would go
up to 32 GB (?) within one top interval (3 sec ?).  My theory
about one gigantic object is debunked:  you have only the 118 MB
one.  Perhaps there's some container or process memory limit, as
Luke guessed, but it's not obvious here.

The other big hammer is strace.  If you're still interested in
playing with this, you could do:

 strace -vf -tt -s 200 -o /tmp/strace.out git p4 clone 

and hours later, see if something suggests itself toward the
end of that output file.

-- Pete




Re: git-p4 out of memory for very large repository

2013-08-29 Thread Pete Wyckoff
cmt...@gmail.com wrote on Wed, 28 Aug 2013 11:41 -0400:
 On Mon, Aug 26, 2013 at 09:47:56AM -0400, Corey Thompson wrote:
  You are correct that git-fast-import is killed by the OOM killer, but I
  was unclear about which process was malloc()ing so much memory that the
  OOM killer got invoked (as other completely unrelated processes usually
  also get killed when this happens).
  
  Unless there's one gigantic file in one change that gets removed by
  another change, I don't think that's the problem; as I mentioned in
  another email, the machine has 32GB physical memory and the largest
  single file in the current head is only 118MB.  Even if there is a very
  large transient file somewhere in the history, I seriously doubt it's
  tens of gigabytes in size.
  
  I have tried watching it with top before, but it takes several hours
  before it dies.  I haven't been able to see any explosion of memory
  usage, even within the final hour, but I've never caught it just before
  it dies, either.  I suspect that whatever the issue is here, it happens
  very quickly.
  
  If I'm unable to get through this today using the incremental p4 sync
  method you described, I'll try running a full-blown clone overnight with
  top in batch mode writing to a log file to see whether it catches
  anything.
  
  Thanks again,
  Corey
 
 Unfortunately I have not made much progress.  The incremental sync method
 fails with the output pasted below.  The change I specified is only one
 change number above where that repo was cloned...

I usually just do git p4 sync @505859.  The error message below
crops up when things get confused.  Usually after a previous
error.  I tend to destroy the repo and try again.  Sorry I can't
explain better what's happening here.  It's not a memory
issue; it reports only 24 MB used.

 So I tried a 'git p4 rebase' overnight with top running, and as I feared
 I did not see anything out of the ordinary.  git, git-fast-import, and
 git-p4 all hovered under 1.5% MEM the entire time, right up until
 death.  The last entry in my log shows git-fast-import at 0.8%, with git
 and git-p4 at 0.0% and 0.1%, respectively.  I could try again with a
 more granular period, but I feel like this method is ultimately a goose
 chase.

Bizarre.  There is no good explanation why memory usage would go
up to 32 GB (?) within one top interval (3 sec ?).  My theory
about one gigantic object is debunked:  you have only the 118 MB
one.  Perhaps there's some container or process memory limit, as
Luke guessed, but it's not obvious here.

The other big hammer is strace.  If you're still interested in
playing with this, you could do:

strace -vf -tt -s 200 -o /tmp/strace.out git p4 clone 

and hours later, see if something suggests itself toward the
end of that output file.
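
If it dies again, the interesting part is usually right at the end of
that file; a couple of generic ways to poke at it:

  tail -n 200 /tmp/strace.out
  # failed allocations or limits tend to show up as ENOMEM on brk/mmap
  grep -nE 'ENOMEM|mmap|brk' /tmp/strace.out | tail -n 50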

-- Pete


Re: git-p4 out of memory for very large repository

2013-08-28 Thread Corey Thompson
On Mon, Aug 26, 2013 at 09:47:56AM -0400, Corey Thompson wrote:
 You are correct that git-fast-import is killed by the OOM killer, but I
 was unclear about which process was malloc()ing so much memory that the
 OOM killer got invoked (as other completely unrelated processes usually
 also get killed when this happens).
 
 Unless there's one gigantic file in one change that gets removed by
 another change, I don't think that's the problem; as I mentioned in
 another email, the machine has 32GB physical memory and the largest
 single file in the current head is only 118MB.  Even if there is a very
 large transient file somewhere in the history, I seriously doubt it's
 tens of gigabytes in size.
 
 I have tried watching it with top before, but it takes several hours
 before it dies.  I haven't been able to see any explosion of memory
 usage, even within the final hour, but I've never caught it just before
 it dies, either.  I suspect that whatever the issue is here, it happens
 very quickly.
 
 If I'm unable to get through this today using the incremental p4 sync
 method you described, I'll try running a full-blown clone overnight with
 top in batch mode writing to a log file to see whether it catches
 anything.
 
 Thanks again,
 Corey

Unfortunately I have not made much progress.  The incremental sync method
fails with the output pasted below.  The change I specified is only one
change number above where that repo was cloned...

So I tried a 'git p4 rebase' overnight with top running, and as I feared
I did not see anything out of the ordinary.  git, git-fast-import, and
git-p4 all hovered under 1.5% MEM the entire time, right up until
death.  The last entry in my log shows git-fast-import at 0.8%, with git
and git-p4 at 0.0% and 0.1%, respectively.  I could try again with a
more granular period, but I feel like this method is ultimately a goose
chase.

Corey


$ git p4 sync //path/to/some/branch@505859
Doing initial import of //path/to/some/branch/ from revision @505859 into refs/remotes/p4/master
fast-import failed: warning: Not updating refs/remotes/p4/master (new tip 29ef6ff25f1448fa2f907d22fd704594dc8769bd does not contain d477672be5ac6a00cc9175ba2713d5395660e840)
git-fast-import statistics:
---------------------------------------------------------------------
Alloc'd objects:     165000
Total objects:           69 (    232434 duplicates                  )
      blobs  :           45 (    209904 duplicates     40 deltas of     42 attempts)
      trees  :           23 (     22530 duplicates      0 deltas of     23 attempts)
      commits:            1 (         0 duplicates      0 deltas of      0 attempts)
      tags   :            0 (         0 duplicates      0 deltas of      0 attempts)
Total branches:           1 (         1 loads     )
      marks:           1024 (         0 unique    )
      atoms:         105170
Memory total:         24421 KiB
       pools:         17976 KiB
     objects:          6445 KiB
---------------------------------------------------------------------
pack_report: getpagesize()            =       4096
pack_report: core.packedGitWindowSize =   33554432
pack_report: core.packedGitLimit      =  268435456
pack_report: pack_used_ctr            =       4371
pack_report: pack_mmap_calls          =        124
pack_report: pack_open_windows        =          8 /          9
pack_report: pack_mapped              =  268435456 /  268435456
---------------------------------------------------------------------


Re: git-p4 out of memory for very large repository

2013-08-26 Thread Corey Thompson
On Sun, Aug 25, 2013 at 11:50:01AM -0400, Pete Wyckoff wrote:
 Modern git, including your version, does streaming reads from p4,
 so the git-p4 python process never even holds a whole file's
 worth of data.  You're seeing git-fast-import die, it seems.  It
 will hold onto the entire file contents.  But just one, not the
 entire repo.  How big is the single largest file?
 
 You can import in pieces.  See the change numbers like this:
 
 p4 changes -m 1000 //depot/big/...
 p4 changes -m 1000 //depot/big/...@some-old-change
 
 Import something far enough back in history so that it seems
 to work:
 
 git p4 clone --destination=big //depot/big@60602
 cd big
 
 Sync up a bit at a time:
 
 git p4 sync @60700
 git p4 sync @60800
 ...
 
 I don't expect this to get around the problem you describe,
 however.  Sounds like there is one gigantic file that is causing
 git-fast-import to fill all of memory.  You will at least isolate
 the change.
 
 There are options to git-fast-import to limit max pack size
 and to cause it to skip importing files that are too big, if
 that would help.
 
 You can also use a client spec to hide the offending files
 from git.
 
 Can you watch with top?  Hit M to sort by memory usage, and
 see how big the processes get before falling over.
 
   -- Pete

You are correct that git-fast-import is killed by the OOM killer, but I
was unclear about which process was malloc()ing so much memory that the
OOM killer got invoked (as other completely unrelated processes usually
also get killed when this happens).

Unless there's one gigantic file in one change that gets removed by
another change, I don't think that's the problem; as I mentioned in
another email, the machine has 32GB physical memory and the largest
single file in the current head is only 118MB.  Even if there is a very
large transient file somewhere in the history, I seriously doubt it's
tens of gigabytes in size.

I have tried watching it with top before, but it takes several hours
before it dies.  I haven't been able to see any explosion of memory
usage, even within the final hour, but I've never caught it just before
it dies, either.  I suspect that whatever the issue is here, it happens
very quickly.

If I'm unable to get through this today using the incremental p4 sync
method you described, I'll try running a full-blown clone overnight with
top in batch mode writing to a log file to see whether it catches
anything.
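
Something like this is roughly what I have in mind; the 5-second
interval is arbitrary, and the clone command is the same placeholder
path as before (@all just asks for the full history):

  # log memory usage overnight while the clone runs
  top -b -d 5 > /tmp/top-clone.log 2>&1 &
  git p4 clone --destination=branch //path/to/branch@all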

Thanks again,
Corey


Re: git-p4 out of memory for very large repository

2013-08-25 Thread Pete Wyckoff
cmt...@gmail.com wrote on Fri, 23 Aug 2013 07:48 -0400:
 On Fri, Aug 23, 2013 at 08:16:58AM +0100, Luke Diamand wrote:
  On 23/08/13 02:12, Corey Thompson wrote:
  Hello,
  
  Has anyone actually gotten git-p4 to clone a large Perforce repository?
  
  Yes. I've cloned repos with a couple of Gig of files.
  
  I have one codebase in particular that gets to about 67%, then
  consistently gets git-fast-import (and oftentimes a few other
  processes) killed by the OOM killer.
[..]
 Sorry, I guess I could have included more details in my original post.
 Since then, I have also made an attempt to clone another (slightly more
 recent) branch, and at last had success.  So I see this does indeed
 work, it just seems to be very unhappy with one particular branch.
 
 So, here are a few statistics I collected on the two branches.
 
 branch-that-fails:
 total workspace disk usage (current head): 12GB
 68 files over 20MB
 largest three being about 118MB
 
 branch-that-clones:
 total workspace disk usage (current head): 11GB
 22 files over 20MB
 largest three being about 80MB
 
 I suspect that part of the problem here might be that my company likes
 to submit very large binaries into our repo (.tar.gzs, pre-compiled
 third party binaries, etc.).
 
 Is there any way I can clone this in pieces?  The best I've come up with
 is to clone only up to a change number just before it tends to fail, and
 then rebase to the latest.  My clone succeeded, but the rebase still
 runs out of memory.  It would be great if I could specify a change
 number to rebase up to, so that I can just take this thing a few hundred
 changes at a time.

Modern git, including your version, does streaming reads from p4,
so the git-p4 python process never even holds a whole file's
worth of data.  You're seeing git-fast-import die, it seems.  It
will hold onto the entire file contents.  But just one, not the
entire repo.  How big is the single largest file?

You can import in pieces.  See the change numbers like this:

p4 changes -m 1000 //depot/big/...
p4 changes -m 1000 //depot/big/...@some-old-change

Import something far enough back in history so that it seems
to work:

git p4 clone --destination=big //depot/big@60602
cd big

Sync up a bit at a time:

git p4 sync @60700
git p4 sync @60800
...

I don't expect this to get around the problem you describe,
however.  Sounds like there is one gigantic file that is causing
git-fast-import to fill all of memory.  You will at least isolate
the change.

There are options to git-fast-import to limit max pack size
and to cause it to skip importing files that are too big, if
that would help.

You can also use a client spec to hide the offending files
from git.
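
If you go the client-spec route, it looks roughly like this; the
third-party path is only an example of something you might exclude,
and the client name is a placeholder:

  # in 'p4 client', add a minus mapping for the offending files, e.g.
  #   //depot/big/...              //your-client/...
  #   -//depot/big/third-party/... //your-client/third-party/...
  # then tell git-p4 to honor the client view:
  git config git-p4.useClientSpec true
  git p4 clone --use-client-spec --destination=big //depot/big@60602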

Can you watch with top?  Hit M to sort by memory usage, and
see how big the processes get before falling over.

-- Pete


Re: git-p4 out of memory for very large repository

2013-08-23 Thread Luke Diamand

On 23/08/13 02:12, Corey Thompson wrote:

Hello,

Has anyone actually gotten git-p4 to clone a large Perforce repository?


Yes. I've cloned repos with a couple of Gig of files.


I have one codebase in particular that gets to about 67%, then
consistently gets git-fast-import (and oftentimes a few other
processes) killed by the OOM killer.


What size is this codebase? Which version and platform of git are you using?

Maybe it's a regression, or perhaps you've hit some new, previously 
unknown size limit?


Thanks
Luke




I've found some patches out there that claim to resolve this, but
they're all for versions of git-p4.py from several years ago.  Not only
will they not apply cleanly, but as far as I can tell the issues that
these patches are meant to address aren't in the current version,
anyway.

Any suggestions would be greatly appreciated.

Thanks,
Corey




Re: git-p4 out of memory for very large repository

2013-08-23 Thread Corey Thompson
On Fri, Aug 23, 2013 at 08:16:58AM +0100, Luke Diamand wrote:
 On 23/08/13 02:12, Corey Thompson wrote:
 Hello,
 
 Has anyone actually gotten git-p4 to clone a large Perforce repository?
 
 Yes. I've cloned repos with a couple of Gig of files.
 
 I have one codebase in particular that gets to about 67%, then
 consistently gets git-fast-import (and oftentimes a few other
 processes) killed by the OOM killer.
 
 What size is this codebase? Which version and platform of git are you using?
 
 Maybe it's a regression, or perhaps you've hit some new, previously
 unknown size limit?
 
 Thanks
 Luke
 
 
 
 I've found some patches out there that claim to resolve this, but
 they're all for versions of git-p4.py from several years ago.  Not only
 will they not apply cleanly, but as far as I can tell the issues that
 these patches are meant to address aren't in the current version,
 anyway.
 
 Any suggestions would be greatly appreciated.
 
 Thanks,
 Corey
 

Sorry, I guess I could have included more details in my original post.
Since then, I have also made an attempt to clone another (slightly more
recent) branch, and at last had success.  So I see this does indeed
work, it just seems to be very unhappy with one particular branch.

So, here are a few statistics I collected on the two branches.

branch-that-fails:
total workspace disk usage (current head): 12GB
68 files over 20MB
largest three being about 118MB

branch-that-clones:
total workspace disk usage (current head): 11GB
22 files over 20MB
largest three being about 80MB

I suspect that part of the problem here might be that my company likes
to submit very large binaries into our repo (.tar.gzs, pre-compiled
third party binaries, etc.).
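
(p4 sizes is one quick way to see just how bad those binaries are;
this assumes its usual 'path#rev size bytes' output format:

  # largest file revisions anywhere in the branch's history
  p4 sizes -a //path/to/branch/... | sort -k2 -n | tail -20 )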

Is there any way I can clone this in pieces?  The best I've come up with
is to clone only up to a change number just before it tends to fail, and
then rebase to the latest.  My clone succeeded, but the rebase still
runs out of memory.  It would be great if I could specify a change
number to rebase up to, so that I can just take this thing a few hundred
changes at a time.

Thanks,
Corey


Re: git-p4 out of memory for very large repository

2013-08-23 Thread Corey Thompson
On Fri, Aug 23, 2013 at 07:48:56AM -0400, Corey Thompson wrote:
 Sorry, I guess I could have included more details in my original post.
 Since then, I have also made an attempt to clone another (slightly more
 recent) branch, and at last had success.  So I see this does indeed
 work, it just seems to be very unhappy with one particular branch.
 
 So, here are a few statistics I collected on the two branches.
 
 branch-that-fails:
 total workspace disk usage (current head): 12GB
 68 files over 20MB
 largest three being about 118MB
 
 branch-that-clones:
 total workspace disk usage (current head): 11GB
 22 files over 20MB
 largest three being about 80MB
 
 I suspect that part of the problem here might be that my company likes
 to submit very large binaries into our repo (.tar.gzs, pre-compiled
 third party binaries, etc.).
 
 Is there any way I can clone this in pieces?  The best I've come up with
 is to clone only up to a change number just before it tends to fail, and
 then rebase to the latest.  My clone succeeded, but the rebase still
 runs out of memory.  It would be great if I could specify a change
 number to rebase up to, so that I can just take this thing a few hundred
 changes at a time.
 
 Thanks,
 Corey

And I still haven't told you anything about my platform or git
version...

This is on Fedora Core 11, with git 1.8.3.4 built from the github repo
(117eea7e).


Re: git-p4 out of memory for very large repository

2013-08-23 Thread Luke Diamand

I think I've cloned files as large as that or larger. If you just want to
clone this and move on, perhaps you just need a bit more memory? What's the
size of your physical memory and swap partition? Per process memory limit?
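
On a Linux box the quick way to answer all three is something like:

  free -m      # physical memory and swap, in MB
  swapon -s    # swap partitions/files in use
  ulimit -a    # per-process limits for the current shell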


On 23 Aug 2013 12:59, Corey Thompson cmt...@gmail.com wrote:
 On Fri, Aug 23, 2013 at 07:48:56AM -0400, Corey Thompson wrote:
 Sorry, I guess I could have included more details in my original post.
 Since then, I have also made an attempt to clone another (slightly more
 recent) branch, and at last had success.  So I see this does indeed
 work, it just seems to be very unhappy with one particular branch.

 So, here are a few statistics I collected on the two branches.

 branch-that-fails:
 total workspace disk usage (current head): 12GB
 68 files over 20MB
 largest three being about 118MB

 branch-that-clones:
 total workspace disk usage (current head): 11GB
 22 files over 20MB
 largest three being about 80MB

 I suspect that part of the problem here might be that my company likes
 to submit very large binaries into our repo (.tar.gzs, pre-compiled
 third party binaries, etc.).

 Is there any way I can clone this in pieces?  The best I've come up with
 is to clone only up to a change number just before it tends to fail, and
 then rebase to the latest.  My clone succeeded, but the rebase still
 runs out of memory.  It would be great if I could specify a change
 number to rebase up to, so that I can just take this thing a few hundred
 changes at a time.

 Thanks,
 Corey
 
 And I still haven't told you anything about my platform or git
 version...
 
 This is on Fedora Core 11, with git 1.8.3.4 built from the github repo
 (117eea7e).



Re: git-p4 out of memory for very large repository

2013-08-23 Thread Corey Thompson
On Fri, Aug 23, 2013 at 08:42:44PM +0100, Luke Diamand wrote:
 
 I think I've cloned files as large as that or larger. If you just want to
 clone this and move on, perhaps you just need a bit more memory? What's the
 size of your physical memory and swap partition? Per process memory limit?
 

The machine has 32GB of memory, so I hope that should be more than
sufficient!

$ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 268288
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 10240
cpu time               (seconds, -t) unlimited
max user processes              (-u) 1024
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

Admittedly I don't typically look at ulimit, so please excuse me if I
interpret this wrong, but I feel like this is indicating that the only
artificial limit in place is a maximum of 64kB mlock()'d memory.
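
Of course those are the limits for my shell; in principle the import
could be running under different ones, so a sanity check while it's
actually running would be something like:

  cat /proc/$(pgrep -f fast-import)/limits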

Thanks,
Corey