Re: Experiences with git-clone-pack and rsync

2005-08-05 Thread Junio C Hamano
Johannes Schindelin [EMAIL PROTECTED] writes:

> But maybe I just cried wolf...

I do not think you are crying wolf.  I shared the same concern
from the beginning and that was partly why I was pushing for
the dumb server approach.


-
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Experiences with git-clone-pack and rsync

2005-08-04 Thread Johannes Schindelin

Hi,

I just tried to clone a relatively big repository from a slow machine to a
slow machine. I'm talking about a 1.2 gigabyte repository, packed down to
120 megabytes, containing more than 21000 commits. When git-clone-script
did not show anything for over 15 minutes, I decided to find out what was
happening. The server sat there, partly swapping, partly git-rev-list'ing,
with git-pack-objects dozing.


Now, I know some internals of git, so I went in and did an rsync, which is
perfectly reasonable, given that I do not need the server to unpack the
objects, pack them again, and then - after a while - send them as they
were: packed. The rsync took 60 seconds with -- evidently -- almost no load
on the server.


BTW, I am not quite sure why the machine started swapping. Maybe it was
some other process on that server, but it could also have been
git-rev-list, keeping those 21000 commits in memory in order to sort them.
Or it could have been something worse: git holding lots and lots of
objects in memory.


So I wonder whether git-daemon, which basically does the same thing as
git-clone-pack on the server side, might turn out to be a pretty good way
to bring git.kernel.org (once it exists) to a halt.


Maybe there should be some kind of heuristic in git-daemon, i.e.
git-count-objects in reverse, to decide whether it is better to (at least
optionally) just send the pack as is, even if it contains more objects
than the user actually asked for. Or, for big projects like the kernel,
just send the pack if at least one needed object is contained in it. Hey,
git-http-pull already does that :-)
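
To make the idea concrete, here is a rough Python sketch of such a heuristic. It is not how git-daemon works: the function name, the threshold parameter, and the use of plain sets of object ids are all assumptions made for illustration.

```python
def should_send_whole_pack(wanted_ids, pack_ids, min_useful_fraction=0.5):
    """Guess whether sending an existing pack unchanged beats repacking.

    wanted_ids: set of object ids the client still needs (hypothetical input).
    pack_ids:   set of object ids stored in one existing pack.
    min_useful_fraction: assumed tuning knob; 0.0 means "send the pack as
    soon as it contains at least one needed object".
    """
    if not pack_ids:
        return False
    useful = len(wanted_ids & pack_ids)
    if useful == 0:
        # Nothing in this pack helps the client, so never send it.
        return False
    # Send the pack as is when enough of it is wanted anyway.
    return useful / len(pack_ids) >= min_useful_fraction
```

With the default threshold, a pack is sent as is only when at least half of its objects are wanted; passing `min_useful_fraction=0.0` gives the "at least one needed object" variant suggested for big projects.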


But maybe I just cried wolf...

Ciao,
Dscho

P.S.: There is a serious flaw in git-fetch-pack, though, which probably
persists when using git-daemon as server: If interrupted, it does not kill
the remote git-rev-list and git-pack-objects. I can bring down my poor
server pretty easily by running git pull, interrupting it, and repeating
that several times. Not sure how to fix that.
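
One possible direction, shown only as a sketch and not as what git does: on a POSIX system, the server side could start its worker pipeline in a separate process group and signal the whole group once it notices the client has gone away. The helper names `start_worker` and `stop_worker` are hypothetical, and how the disconnect is detected is left out.

```python
import os
import signal
import subprocess
import sys


def start_worker(argv):
    # start_new_session=True gives the child its own session and process
    # group, so anything it spawns can be signalled together with it.
    return subprocess.Popen(argv, start_new_session=True)


def stop_worker(proc, timeout=5.0):
    """Terminate the worker's whole process group and return its exit status."""
    if proc.poll() is None:
        try:
            # The child is a group leader, so its pid is also the group id.
            os.killpg(proc.pid, signal.SIGTERM)
        except ProcessLookupError:
            pass  # the group exited between poll() and killpg()
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # Escalate if the group ignores SIGTERM.
        os.killpg(proc.pid, signal.SIGKILL)
        return proc.wait()
```

A server would call `stop_worker` from whatever code path handles a closed connection, so that an interrupted pull does not leave the expensive processes running.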
