On Fri, Jan 15, 2010 at 2:01 AM, Daniel Bradshaw <dan...@the-cell.co.uk> wrote:
> On 01/14/2010 12:49 PM, Nirbheek Chauhan wrote:
>>
>> In theory, yes. In practice, git is too slow to handle 30,000 files.
>> Even simple operations like git add become painful even if you put the
>> whole of portage on tmpfs, since git does a stat() on every single file
>> in the repository with every operation.
>>
>
> My understanding is that git was developed as the SCM for the kernel
> project.
> A quick check in an arbitrary untouched kernel in /usr/src/ suggests a file
> [1] count of 25300.
>
> Assuming that my figure isn't out by an order of magnitude, how does the
> kernel team get along with git and 25k files while it is deathly slow for our
> 30k?
> Or, to phrase the question better... what are they doing that allows them to
> manage?
>
My bad. I did the tests a while back, and the number "30,000" is actually the
number of ebuilds in portage. The number of files is actually ~113,000 (the
difference comes from every package having a manifest + changelog +
metadata.xml + patches).

OTOH, the number of directories is "just" ~20,000, so if git only had to
stat() directories, it would get into the "usable" range. But since git stats
directories as well as files, every command has to do ~133,000 stats, which
is damn slow even when cached.

--
~Nirbheek Chauhan
Gentoo GNOME+Mozilla Team
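For anyone who wants to reproduce the claim on their own checkout, here is a
minimal sketch (not git's actual code) that walks a tree and stat()s every
directory and file, roughly the way git's index refresh touches the whole
working tree. The `count_stats` function name and the default root of `.` are
my own choices for illustration; point it at a portage tree to see the
~133,000 figure.

```python
import os
import sys
import time


def count_stats(root):
    """Walk `root`, stat() every directory and file, return (count, seconds)."""
    n = 0
    start = time.monotonic()
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            os.stat(os.path.join(dirpath, name))
            n += 1
    return n, time.monotonic() - start


if __name__ == "__main__":
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    n, secs = count_stats(root)
    print("%d stats in %.2fs" % (n, secs))
```

Running it twice shows the cached case the mail refers to: the second run is
faster because the kernel's dentry/inode caches are warm, but it still pays
one syscall per entry, which is the per-command floor git hits.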