Re: Long clone time after "done."

2012-11-26 Thread Uri Moszkowicz
Hi guys,
Any further interest in this scalability problem, or should I move on?

Thanks,
Uri

On Thu, Nov 8, 2012 at 5:35 PM, Uri Moszkowicz u...@4refs.com wrote:
 I tried on the local disk as well and it didn't help. I managed to
 find a SUSE11 machine and tried it there but no luck so I think we can
 eliminate NFS and OS as significant factors now.

 I ran with perf and here's the report:

 69.07%  git  /lib64/libc-2.11.1.so                [.] memcpy
 12.33%  git  prefix/git-1.8.0.rc2.suse11/bin/git  [.] blk_SHA1_Block
  5.11%  git  prefix/zlib/local/lib/libz.so.1.2.5  [.] inflate_fast
  2.61%  git  prefix/zlib/local/lib/libz.so.1.2.5  [.] adler32
  1.98%  git  /lib64/libc-2.11.1.so                [.] _int_malloc
  0.86%  git  [kernel]                             [k] clear_page_c

 Does this help? Machine has 396GB of RAM if it matters.

 On Thu, Nov 8, 2012 at 4:33 PM, Jeff King p...@peff.net wrote:
 On Thu, Nov 08, 2012 at 04:16:59PM -0600, Uri Moszkowicz wrote:

 I ran "git cat-file commit some-tag" for every tag. They seem to be
 roughly uniformly distributed between 0s and 2s and about 2/3 of the
 time seems to be system. My disk is mounted over NFS so I tried on the
 local disk and it didn't make a difference.

 I have only one 1.97GB pack. I ran "git gc --aggressive" before.

 Ah. NFS. That is almost certainly the source of the problem. Git will
 aggressively mmap. I would not be surprised to find that RHEL4's NFS
 implementation is not particularly fast at mmap-ing 2G files, and is
 spending a bunch of time in the kernel servicing the requests.

 Aside from upgrading your OS or getting off of NFS, I don't have a lot
 of advice.  The performance characteristics you are seeing are so
 grossly off of what is normal that using git is probably going to be
 painful. Your 2s cat-files should be more like .002s. I don't think
 there's anything for git to fix here.

 You could try building with NO_MMAP, which will emulate it with pread.
 That might fare better under your NFS implementation. Or it might be
 just as bad.

 -Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Long clone time after "done."

2012-11-08 Thread Uri Moszkowicz
I tried the patch but it doesn't appear to have helped :( Clone time
with it was ~32m.

Do you all by any chance have a tool to obfuscate a repository?
Probably I still wouldn't be permitted to distribute it but might make
the option slightly more palatable. Anything else that I can do to
help debug this problem?

On Thu, Nov 8, 2012 at 9:56 AM, Jeff King p...@peff.net wrote:
 On Wed, Nov 07, 2012 at 11:32:37AM -0600, Uri Moszkowicz wrote:

   #4  parse_object (sha1=0xb0ee98
 "\017C\205Wj\001`\254\356\307Z\332\367\353\233.\375P}D") at
 object.c:212
   #5  0x004ae9ec in handle_one_ref (path=0xb0eec0
 "refs/tags/removed", sha1=0xb0ee98
 "\017C\205Wj\001`\254\356\307Z\332\367\353\233.\375P}D", flags=2,
 cb_data=<optimized out>) at pack-refs.

 [...]

 It looks like handle_one_ref() is called for each ref and most result
 in a call to read_sha1_file().

 Right. When generating the packed-refs file, we include the peeled
 reference for a tag (i.e., the commit that a tag object points to). So
 we have to actually read any tag objects to get the value.

 The upload-pack program generates a similar list, and I recently added
 some optimizations. This code path could benefit from some of them by
 using peel_ref instead of hand-rolling the tag dereferencing. The main
 optimization, though, is reusing peeled values that are already in
 packed-refs; we would probably need some additional magic to reuse the
 values from the source repository.

 However:

 It only takes a second or so for each call but when you have thousands
 of them (one for each ref) it adds up.

 I am more concerned that it takes a second to read each tag. Even in my
 pathological tests for optimizing upload-pack, peeling 50,000 refs took
 only half a second.

 Adding "--single-branch --branch <branch>" doesn't appear to help, as
 it is implemented afterwards. I would like to debug this problem
 further but am not familiar enough with the implementation to know
 what the next step is. Can anyone offer some suggestions? I don't see
 why a clone should depend on O(#refs) operations.

 Does this patch help? In a sample repo with 5000 annotated tags, it
 drops my local clone time from 0.20s to 0.11s. Which is a big percentage
 speedup, but this code isn't taking a long time in the first place for
 me.

 ---
 diff --git a/pack-refs.c b/pack-refs.c
 index f09a054..3344749 100644
 --- a/pack-refs.c
 +++ b/pack-refs.c
 @@ -40,13 +40,9 @@ static int handle_one_ref(const char *path, const unsigned char *sha1,
 
  	fprintf(cb->refs_file, "%s %s\n", sha1_to_hex(sha1), path);
  	if (is_tag_ref) {
 -		struct object *o = parse_object(sha1);
 -		if (o->type == OBJ_TAG) {
 -			o = deref_tag(o, path, 0);
 -			if (o)
 -				fprintf(cb->refs_file, "^%s\n",
 -					sha1_to_hex(o->sha1));
 -		}
 +		unsigned char peeled[20];
 +		if (!peel_ref(path, peeled))
 +			fprintf(cb->refs_file, "^%s\n", sha1_to_hex(peeled));
  	}
 
  	if ((cb->flags & PACK_REFS_PRUNE) && !do_not_prune(flags)) {


Re: Long clone time after "done."

2012-11-08 Thread Uri Moszkowicz
I'm using RHEL4. Looks like perf is only available with RHEL6.

heads: 308
tags: 9614

Looking up the tags that way took a very long time by the way. "git
tag | wc -l" was much quicker. I've already pruned a lot of tags to
get to this number as well. The original repository had ~37k tags
since we used to tag every commit with CVS.

All my tags are packed so cat-file doesn't work:
fatal: git cat-file refs/tags/some-tag: bad file

On Thu, Nov 8, 2012 at 2:33 PM, Jeff King p...@peff.net wrote:
 On Thu, Nov 08, 2012 at 11:20:29AM -0600, Uri Moszkowicz wrote:

 I tried the patch but it doesn't appear to have helped :( Clone time
 with it was ~32m.

 That sounds ridiculously long.

 Do you all by any chance have a tool to obfuscate a repository?
 Probably I still wouldn't be permitted to distribute it but might make
 the option slightly more palatable. Anything else that I can do to
 help debug this problem?

 I don't have anything already written. What platform are you on? If it's
 Linux, can you try using perf to record where the time is going?

 How many refs do you have? What does:

   echo "heads: $(git for-each-ref refs/heads | wc -l)"
   echo " tags: $(git for-each-ref refs/tags | wc -l)"

 report? How long does it take to look up a tag, like:

   time git cat-file tag refs/tags/some-tag

 ?

 -Peff


Re: Long clone time after "done."

2012-11-08 Thread Uri Moszkowicz
I ran "git cat-file commit some-tag" for every tag. They seem to be
roughly uniformly distributed between 0s and 2s and about 2/3 of the
time seems to be system. My disk is mounted over NFS so I tried on the
local disk and it didn't make a difference.

I have only one 1.97GB pack. I ran "git gc --aggressive" before.

On Thu, Nov 8, 2012 at 4:11 PM, Jeff King p...@peff.net wrote:
 On Thu, Nov 08, 2012 at 03:49:32PM -0600, Uri Moszkowicz wrote:

 I'm using RHEL4. Looks like perf is only available with RHEL6.

 Yeah, RHEL4 is pretty ancient; I think it predates the invention of
 perf.

 heads: 308
 tags: 9614

 Looking up the tags that way took a very long time by the way. "git
 tag | wc -l" was much quicker. I've already pruned a lot of tags to
 get to this number as well. The original repository had ~37k tags
 since we used to tag every commit with CVS.

 Hmm. I think "for-each-ref" will actually open the tag objects, but
 "git tag" will not. That would imply that reading the refs is fast,
 but opening objects is slow. I wonder why.

 How many packs do you have in .git/objects/pack of the repository?

 All my tags are packed so cat-file doesn't work:
 fatal: git cat-file refs/tags/some-tag: bad file

 The packing shouldn't matter. The point of the command is to look up the
 refs/tags/some-tag ref (in packed-refs or in the filesystem), and then
 open and write the pointed-to object to stdout. If that is not working,
 then there is something seriously wrong going on.

 -Peff


Re: Long clone time after "done."

2012-11-08 Thread Uri Moszkowicz
I tried on the local disk as well and it didn't help. I managed to
find a SUSE11 machine and tried it there but no luck so I think we can
eliminate NFS and OS as significant factors now.

I ran with perf and here's the report:

69.07%  git  /lib64/libc-2.11.1.so                [.] memcpy
12.33%  git  prefix/git-1.8.0.rc2.suse11/bin/git  [.] blk_SHA1_Block
 5.11%  git  prefix/zlib/local/lib/libz.so.1.2.5  [.] inflate_fast
 2.61%  git  prefix/zlib/local/lib/libz.so.1.2.5  [.] adler32
 1.98%  git  /lib64/libc-2.11.1.so                [.] _int_malloc
 0.86%  git  [kernel]                             [k] clear_page_c

Does this help? Machine has 396GB of RAM if it matters.

On Thu, Nov 8, 2012 at 4:33 PM, Jeff King p...@peff.net wrote:
 On Thu, Nov 08, 2012 at 04:16:59PM -0600, Uri Moszkowicz wrote:

 I ran "git cat-file commit some-tag" for every tag. They seem to be
 roughly uniformly distributed between 0s and 2s and about 2/3 of the
 time seems to be system. My disk is mounted over NFS so I tried on the
 local disk and it didn't make a difference.

 I have only one 1.97GB pack. I ran "git gc --aggressive" before.

 Ah. NFS. That is almost certainly the source of the problem. Git will
 aggressively mmap. I would not be surprised to find that RHEL4's NFS
 implementation is not particularly fast at mmap-ing 2G files, and is
 spending a bunch of time in the kernel servicing the requests.

 Aside from upgrading your OS or getting off of NFS, I don't have a lot
 of advice.  The performance characteristics you are seeing are so
 grossly off of what is normal that using git is probably going to be
 painful. Your 2s cat-files should be more like .002s. I don't think
 there's anything for git to fix here.

 You could try building with NO_MMAP, which will emulate it with pread.
 That might fare better under your NFS implementation. Or it might be
 just as bad.

 -Peff
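For reference, NO_MMAP is a compile-time switch in git's Makefile, so trying Peff's suggestion would look roughly like the sketch below (the install prefix is purely illustrative):

```shell
# Rebuild git with mmap emulated via pread(); NO_MMAP is a knob in
# git's Makefile, set with the usual YesPlease convention.
# The prefix below is only an example path.
make NO_MMAP=YesPlease prefix=$HOME/git-nommap install
```

Whether this actually helps depends on how the NFS client handles pread versus page faults on mapped pack files.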


Re: Long clone time after "done."

2012-10-24 Thread Uri Moszkowicz
It all goes to pack_refs() in write_remote_refs(), called from
update_remote_refs().

On Tue, Oct 23, 2012 at 11:29 PM, Nguyen Thai Ngoc Duy
pclo...@gmail.com wrote:
 On Wed, Oct 24, 2012 at 1:30 AM, Uri Moszkowicz u...@4refs.com wrote:
 I have a large repository which I ran "git gc --aggressive" on that
 I'm trying to clone on a local file system. I would expect it to
 complete very quickly with hard links but it's taking about 6min to
 complete with no checkout ("git clone -n"). I see the message "Cloning
 into 'repos'... done." appear after a few seconds but then Git just
 hangs there for another 6min. Any idea what it's doing at this point
 and how I can speed it up?

 "done." is printed by clone_local(), which is called in cmd_clone().
 After that there are just a few more calls. Maybe you could add a few
 printf()s in between these calls and see which one takes the most time?
 --
 Duy


Re: tag storage format

2012-10-23 Thread Uri Moszkowicz
That did the trick - thanks!

On Mon, Oct 22, 2012 at 5:46 PM, Andreas Schwab sch...@linux-m68k.org wrote:

 Uri Moszkowicz u...@4refs.com writes:

  Perhaps Git should switch to a single-file block text or binary format
  once a large number of tags becomes present in a repository.

 This is what git pack-refs (called by git gc) does (by putting the refs
 in .git/packed-refs).

 Andreas.

 --
 Andreas Schwab, sch...@linux-m68k.org
 GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
 And now for something completely different.


Long clone time after "done."

2012-10-23 Thread Uri Moszkowicz
I have a large repository which I ran "git gc --aggressive" on that
I'm trying to clone on a local file system. I would expect it to
complete very quickly with hard links but it's taking about 6min to
complete with no checkout ("git clone -n"). I see the message "Cloning
into 'repos'... done." appear after a few seconds but then Git just
hangs there for another 6min. Any idea what it's doing at this point
and how I can speed it up?


Large number of object files

2012-10-23 Thread Uri Moszkowicz
Continuing to work on improving clone times, using "git gc
--aggressive" has resulted in a large number of tags combining into a
single file but now I have a large number of files in the objects
directory - 131k for a ~2.7GB repository. Any way to reduce the number
of these files to speed up clones?


tag storage format

2012-10-22 Thread Uri Moszkowicz
I'm doing some testing on a large Git repository and am finding local
clones to take a very long time. After some investigation I've
determined that the problem is due to a very large number of tags
(~38k). Even with hard links, it just takes a really long time to
visit that many inodes. As it happens, I don't care for most of these
tags and will prune many of them anyway but I expect that over time it
will creep back up again. Have others reported this problem before and
is there a workaround? Perhaps Git should switch to a single-file
block text or binary format once a large number of tags becomes
present in a repository.


Re: Unexpected directories from read-tree

2012-10-19 Thread Uri Moszkowicz
I am using 1.8.0-rc2 but also tried 1.7.8.4. Thanks for the suggestion
to use "ls-files -t" - that's exactly what I was looking for. With
that I was easily able to tell what the problem was: a missing "/" in
the sparse-checkout file.
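For anyone hitting the same symptom, the fix boils down to a trailing slash on each directory pattern; a sparse-checkout file along these lines (directory names are illustrative):

```
# .git/info/sparse-checkout (illustrative entries; note the trailing "/")
dir1/
dir2/
dir3/
```

After editing the file, "git read-tree -mu HEAD" re-applies it to the working tree, as in the original report.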

On Thu, Oct 18, 2012 at 10:34 PM, Nguyen Thai Ngoc Duy
pclo...@gmail.com wrote:
 On Fri, Oct 19, 2012 at 6:10 AM, Uri Moszkowicz u...@4refs.com wrote:
 I'm testing out the sparse checkout feature of Git on my large (14GB)
 repository and am running into a problem. When I add "dir1/" to
 sparse-checkout and then run "git read-tree -mu HEAD" I see dir1 as
 expected. But when I add "dir2/" to sparse-checkout and read-tree
 again I see dir2 and dir3 appear and they're not nested. If I replace
 "dir2/" with "dir3/" in the sparse-checkout file, then I see dir1 and
 dir3 but not dir2 as expected again. How can I debug this problem?

 Posting here is step 1. What version are you using? You can look at
 unpack-trees.c; the function that does the check is excluded_from_list().
 You should check "ls-files -t" and see if CE_SKIP_WORKTREE is set
 correctly for all of dir1/*, dir2/* and dir3/*. Can you create a
 minimal test case for the problem?
 --
 Duy


Unexpected directories from read-tree

2012-10-18 Thread Uri Moszkowicz
I'm testing out the sparse checkout feature of Git on my large (14GB)
repository and am running into a problem. When I add "dir1/" to
sparse-checkout and then run "git read-tree -mu HEAD" I see dir1 as
expected. But when I add "dir2/" to sparse-checkout and read-tree
again I see dir2 and dir3 appear and they're not nested. If I replace
"dir2/" with "dir3/" in the sparse-checkout file, then I see dir1 and
dir3 but not dir2 as expected again. How can I debug this problem?


Re: error: git-fast-import died of signal 11

2012-10-17 Thread Uri Moszkowicz
Hi Michael,
Looks like the limit changes solved the problem. I didn't verify
whether it was the stacksize or the descriptors, but it was one of
those. Final repository size was 14GB from a 328GB dump file.

Thanks,
Uri

On Tue, Oct 16, 2012 at 2:18 AM, Michael Haggerty mhag...@alum.mit.edu wrote:
 On 10/15/2012 05:53 PM, Uri Moszkowicz wrote:
 I'm trying to convert a CVS repository to Git using cvs2git. I was able to
 generate the dump file without problem but am unable to get Git to
 fast-import it. The dump file is 328GB and I ran git fast-import on a
 machine with 512GB of RAM.

 fatal: Out of memory? mmap failed: Cannot allocate memory
 fast-import: dumping crash report to fast_import_crash_18192
 error: git-fast-import died of signal 11

 How can I import the repository?

 What versions of git and of cvs2git are you using?  If not the current
 versions, please try with the current versions.

 What is the nature of your repository (i.e., why is it so big)?  Does it
 consist of extremely large files?  A very deep history?  Extremely many
 branches/tags?  Extremely many files?

 Did you check whether the RAM usage of the git-fast-import process was
 growing gradually to fill RAM while it was running vs. whether the usage
 seemed reasonable until it suddenly crashed?

 There are a few obvious possibilities:

 0. There is some reason that too little of your computer's RAM is
 available to git-fast-import (e.g., ulimit, other processes running at
 the same time, much RAM being used as a ramdisk, etc).

 1. Your import is simply too big for git-fast-import to hold in memory
 the accumulated things that it has to remember.  I'm not familiar with
 the internals of git-fast-import, but I believe that the main thing that
 it has to keep in RAM is the list of marks (references to git objects
 that can be referred to later in the import).  From your crash file, it
 looks like there were about 350k marks loaded at the time of the crash.
  Supposing each mark is about 100 bytes, this would only amount to 35
 Mb, which should not be a problem (*if* my assumptions are correct).

 2. Your import contains a gigantic object which individually is so big
 that it overflows some component of the import.  (I don't know whether
 large objects are handled streamily; they might be read into memory at
 some point.)  But since your computer had so much RAM this is hardly
 imaginable.

 3. git-fast-import has a memory leak and the accumulated memory leakage
 is exhausting your RAM.

 4. git-fast-import has some other kind of a bug.

 5. The contents of the dumpfile are corrupt in a way that is triggering
 the problem.  This could either be invalid input (e.g., an object that
 is reported to be quaggabytes large), or some invalid input that
 triggers a bug in git-fast-import.

 If (1), then you either need a bigger machine or git-fast-import needs
 architectural changes.

 If (2), then you either need a bigger machine or git-fast-import and/or
 git needs architectural changes.

 If (3), then it would be good to get more information about the problem
 so that the leak can be fixed.  If this is the case, it might be
 possible to work around the problem by splitting the dumpfile into
 several parts and loading them one after the other (outputting the marks
 from one run and loading them into the next).

 If (4) or (5), then it would be helpful to narrow down the problem.  It
 might be possible to do so by following the instructions in the cvs2svn
 FAQ [1] for systematically shrinking a test case to smaller size using
 destroy_repository.py and shrink_test_case.py.  If you can create a
 small repository that triggers the same problem, then there is a good
 chance that it is easy to fix.

 Michael
 (the cvs2git maintainer)

 [1] http://cvs2svn.tigris.org/faq.html#testcase

 --
 Michael Haggerty
 mhag...@alum.mit.edu
 http://softwareswirl.blogspot.com/


Re: error: git-fast-import died of signal 11

2012-10-16 Thread Uri Moszkowicz
I'm using Git 1.8.0-rc2 and cvs2git version 2.5.0-dev (trunk). The
repository is almost 20 years old and should consist of mostly
smallish plain text files. We've been tagging every commit, in
addition to tagging releases and development branches, so there are a
lot of tags and branches. I didn't see the memory usage of the process
before exiting, but after ~3.5 hours in a subsequent run it seems to
be using about 8.5GB of virtual memory with a resident size of only
.5GB, which should have easily fit on the 512GB machine that I was
using. I'm trying on a 1TB machine now but it doesn't look like it'll
make a difference. There is no RAM disk, and I have exclusive access
to the machine, so the only other memory use is from the OS, which is
trivial. The only significant limit from my environment would be on
the stack:

[umoszkow@mawhp5 ~] limit
cputime  unlimited
filesize unlimited
datasize unlimited
stacksize8000 kbytes
coredumpsize 0 kbytes
memoryuseunlimited
vmemoryuse   unlimited
descriptors  1024
memorylocked 32 kbytes
maxproc  8388608

Would that result in an mmap error though? I'll try with unlimited
stacksize and descriptors anyway.

I don't think modifying the original repository or a clone of it is
possible at this point but breaking up the import into a few steps may
be - will try that next if this fails.
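Michael's splitting idea maps onto git fast-import's --export-marks / --import-marks options; a sketch of what that might look like, with hypothetical file names:

```shell
# Sketch: load the dump in pieces, carrying marks between runs so that
# later parts can refer to objects created by earlier parts.
# dump.part1, dump.part2 and marks.dat are hypothetical names.
git fast-import --export-marks=marks.dat < dump.part1
git fast-import --import-marks=marks.dat --export-marks=marks.dat < dump.part2
```

This assumes the dump can be cut at a point where every later mark reference points into an earlier part.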

On Tue, Oct 16, 2012 at 2:18 AM, Michael Haggerty mhag...@alum.mit.edu wrote:
 On 10/15/2012 05:53 PM, Uri Moszkowicz wrote:
 I'm trying to convert a CVS repository to Git using cvs2git. I was able to
 generate the dump file without problem but am unable to get Git to
 fast-import it. The dump file is 328GB and I ran git fast-import on a
 machine with 512GB of RAM.

 fatal: Out of memory? mmap failed: Cannot allocate memory
 fast-import: dumping crash report to fast_import_crash_18192
 error: git-fast-import died of signal 11

 How can I import the repository?

 What versions of git and of cvs2git are you using?  If not the current
 versions, please try with the current versions.

 What is the nature of your repository (i.e., why is it so big)?  Does it
 consist of extremely large files?  A very deep history?  Extremely many
 branches/tags?  Extremely many files?

 Did you check whether the RAM usage of the git-fast-import process was
 growing gradually to fill RAM while it was running vs. whether the usage
 seemed reasonable until it suddenly crashed?

 There are a few obvious possibilities:

 0. There is some reason that too little of your computer's RAM is
 available to git-fast-import (e.g., ulimit, other processes running at
 the same time, much RAM being used as a ramdisk, etc).

 1. Your import is simply too big for git-fast-import to hold in memory
 the accumulated things that it has to remember.  I'm not familiar with
 the internals of git-fast-import, but I believe that the main thing that
 it has to keep in RAM is the list of marks (references to git objects
 that can be referred to later in the import).  From your crash file, it
 looks like there were about 350k marks loaded at the time of the crash.
  Supposing each mark is about 100 bytes, this would only amount to 35
 Mb, which should not be a problem (*if* my assumptions are correct).

 2. Your import contains a gigantic object which individually is so big
 that it overflows some component of the import.  (I don't know whether
 large objects are handled streamily; they might be read into memory at
 some point.)  But since your computer had so much RAM this is hardly
 imaginable.

 3. git-fast-import has a memory leak and the accumulated memory leakage
 is exhausting your RAM.

 4. git-fast-import has some other kind of a bug.

 5. The contents of the dumpfile are corrupt in a way that is triggering
 the problem.  This could either be invalid input (e.g., an object that
 is reported to be quaggabytes large), or some invalid input that
 triggers a bug in git-fast-import.

 If (1), then you either need a bigger machine or git-fast-import needs
 architectural changes.

 If (2), then you either need a bigger machine or git-fast-import and/or
 git needs architectural changes.

 If (3), then it would be good to get more information about the problem
 so that the leak can be fixed.  If this is the case, it might be
 possible to work around the problem by splitting the dumpfile into
 several parts and loading them one after the other (outputting the marks
 from one run and loading them into the next).

 If (4) or (5), then it would be helpful to narrow down the problem.  It
 might be possible to do so by following the instructions in the cvs2svn
 FAQ [1] for systematically shrinking a test case to smaller size using
 destroy_repository.py and shrink_test_case.py.  If you can create a
 small repository that triggers the same problem, then there is a good
 chance that it is easy to fix.

 Michael
 (the cvs2git maintainer)

 [1] http://cvs2svn.tigris.org/faq.html#testcase

 --
 Michael Haggerty
 mhag...@alum.mit.edu
 http://softwareswirl.blogspot.com

error: git-fast-import died of signal 11

2012-10-15 Thread Uri Moszkowicz
Hi,
I'm trying to convert a CVS repository to Git using cvs2git. I was able to
generate the dump file without problem but am unable to get Git to
fast-import it. The dump file is 328GB and I ran git fast-import on a
machine with 512GB of RAM.

fatal: Out of memory? mmap failed: Cannot allocate memory
fast-import: dumping crash report to fast_import_crash_18192
error: git-fast-import died of signal 11

How can I import the repository?

Thanks,
Uri