On Fri, 2006-03-24 at 14:40, Casper.Dik at Sun.COM wrote:
> The package database flatfile is a serious obstacle to performance.
>
> The file is big and is written to disk one or more times for each
> package installation. (And flushed to stable storage and then renamed)

Yes, absolutely. Another way to speed this up would be not to commit changes with every package, but to roll up all the changes and write them once at the end. This batching approach would help patches as well, since those are just pkgadds of bits of packages, sometimes involving lots of packages.

It's not just performance: minimising the time between the start and end of an installation or patch reduces the risk of corruption due to a system failure in the middle of the process.

If I (at the risk of sounding like a broken record) go back to clusters and metaclusters, then imagine clusteradd operating on clusters as first class objects, so that it installs the whole lot in a single operation (rather like a single pkgadd of all the combined packages) and only updates the contents file once.

> Other low hanging fruits are:
>
> - use of gzip vs bzip2
>
> bzip2 is *so* slow that it actually is a performance bottleneck even on
> fast machines; gzip compresses about 10% less but allows you to read data
> at DVD speeds; bzip2 does not.

Yes. I meant to mention that. Just to illustrate, my W2100z takes 60s to bzcat the SUNWsom archive; if I convert it to gzip then gzcat takes 6s, a factor of 10 improvement.

-- 
-Peter Tribble
L.I.S., University of Hertfordshire - http://www.herts.ac.uk/
http://www.petertribble.co.uk/ - http://ptribble.blogspot.com/
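
To make the batching idea above concrete, here is a minimal Python sketch. It is not how pkgadd is implemented; the function names, the one-path-per-line file layout, and the package names are all assumptions made for illustration. It only contrasts rewriting the whole file (write, flush to stable storage, rename) once per package with doing so once at the end.

```python
import os
import tempfile


def commit(path, entries):
    """Atomically replace `path` with the sorted entries: write a temp file,
    flush it to stable storage, then rename it over the original.
    (Hypothetical helper; the one-path-per-line format is an assumption.)"""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        for line in sorted(entries):
            f.write(line + "\n")
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)


def install_per_package(path, packages):
    """Rewrite the whole database after every package; returns the commit count."""
    entries, commits = set(), 0
    for files in packages.values():
        entries.update(files)
        commit(path, entries)
        commits += 1
    return commits


def install_batched(path, packages):
    """Accumulate all changes in memory and rewrite the database once."""
    entries = set()
    for files in packages.values():
        entries.update(files)
    commit(path, entries)
    return 1
```

Both functions leave the same file behind; the batched one pays for the full write, fsync and rename once instead of once per package, which is where the saving would come from when a cluster or patch touches many packages.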
