On Fri, Jan 15, 2010 at 2:01 AM, Daniel Bradshaw <dan...@the-cell.co.uk> wrote:
> On 01/14/2010 12:49 PM, Nirbheek Chauhan wrote:
>>
>> In theory, yes. In practice, git is too slow to handle 30,000 files.
>> Even simple operations like git add become painful even if you put the
>> whole of portage on tmpfs since git does a stat() on every single file
>> in the repository with every operation.
>>
>
> My understanding is that git was developed as the SCM for the kernel
> project.
> A quick check in an arbitrary untouched kernel in /usr/src/ suggests a file
> [1] count of 25300.
>
> Assuming that my figure isn't out by an order of magnitude, how does the
> kernel team get along with git and 25k files while it is deathly slow for
> our 30k?
> Or, to phrase the question better... what are they doing that allows them to
> manage?
>

My bad. I did those tests a while back, and the "30,000" figure is
actually the number of ebuilds in portage. The number of files is
actually ~113,000 (the difference comes from every package carrying a
Manifest, a ChangeLog, metadata.xml, and patches). OTOH, the number of
directories is "just" ~20,000, so if git only did a stat() on
directories, it would be within the "usable" range.
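
(If anyone wants to check the numbers on their own tree, a quick walk
like the one below gives the file/directory split. It's just a sketch;
it assumes the tree lives at /usr/portage, so adjust the path to taste.)

#!/usr/bin/env python
# Count files vs. directories under the portage tree.
# Assumes /usr/portage; change the path if your tree lives elsewhere.
import os

tree = "/usr/portage"
files = dirs = 0
for root, dirnames, filenames in os.walk(tree):
    dirs += len(dirnames)
    files += len(filenames)

print("directories: %d" % dirs)
print("files:       %d" % files)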

Also, since git stats directories as well as files, every command ends
up doing ~133,000 stat() calls, which is damn slow even when everything
is cached.
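
(To get a rough feel for that cost, here's a sketch that does one
lstat() per entry, which approximates what git's index refresh does. It
isn't git's actual code path, and it again assumes /usr/portage.)

#!/usr/bin/env python
# Time one lstat() call per entry in the portage tree (~133,000 total).
import os
import time

tree = "/usr/portage"
paths = []
for root, dirnames, filenames in os.walk(tree):
    paths.extend(os.path.join(root, name) for name in dirnames + filenames)

start = time.time()
for p in paths:
    os.lstat(p)
print("%d lstat() calls in %.2f seconds" % (len(paths), time.time() - start))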

-- 
~Nirbheek Chauhan

Gentoo GNOME+Mozilla Team
