I've done some work to try and pin down exactly where the install is
spending its time.

These numbers come from fully hands-off jumpstart installs. So there's
no waiting for user interaction, and they're off fileservers that are
sufficiently fast that they shouldn't be a problem.

Installs off DVD are considerably slower, because time is lost either
seeking around the DVD or waiting for it to spin back up. So they
would waste additional time on top of these numbers.

System 1: SunBlade 200, twin 1.015GHz processors, single 72G disk.

Hardware boot:    70s
Boot + config:    3m15s
newfs disk:       30s
package inst:     22m15s
postinstall:      15s
shutdown/reboot:  65s
Boot:             25s
Devices:          30s
smf import:       5m50s
boot to prompt:   30s
graphical login:  25s

Total 36 minutes.

Note that the package install is just under 2/3 of the total;
the SMF manifest import is an obvious target. There are also
pauses of a minute or two in the configuration phase that I
don't understand.
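
As a quick check on the arithmetic, the phase timings above can be
summed and turned into percentages. This is only a sketch: the numbers
are copied from the list above, and the dictionary keys are just labels
chosen for illustration.

```python
# Phase timings in seconds, transcribed from the list above.
phases = {
    "Hardware boot": 70,
    "Boot + config": 3 * 60 + 15,
    "newfs disk": 30,
    "package inst": 22 * 60 + 15,
    "postinstall": 15,
    "shutdown/reboot": 65,
    "Boot": 25,
    "Devices": 30,
    "smf import": 5 * 60 + 50,
    "boot to prompt": 30,
    "graphical login": 25,
}

total = sum(phases.values())


def share(name):
    """Fraction of the total elapsed time spent in one phase."""
    return phases[name] / total


print(f"total: {total // 60}m{total % 60:02d}s")
# List the phases, largest first, with their share of the total.
for name, secs in sorted(phases.items(), key=lambda kv: -kv[1]):
    print(f"{name:16s} {secs:5d}s {100 * share(name):5.1f}%")
```

The package install comes out at a little over 61% of the total and
the SMF import at about 16%.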

What is it spending its time on in those 22 minutes that are
installing packages?

Looking from the fileserver, the data transfer rate is fairly slow -
we're talking 2, maybe 3, megabytes per second - and it's
reasonably steady throughout the installation. This is the first
indication that the contents file isn't the dominant factor: if it
were, I would expect the install to start off like an express train
and slow down markedly towards the end as the contents file grew
in size.

In terms of packages (there are 940 altogether in this test) it
takes 14 minutes to get halfway through the packages by number;
about 10 minutes to get halfway through by data size and number
of files.

(This is interesting. The packages that really bloat the contents
file tend to get installed early. It would be better to install
the heavy hitters - such as SUNWman and SUNWwebminu - right at the
end, so that the average size of the contents file is smaller.)
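
To illustrate why ordering matters, here is a toy model. It assumes
that every package install rewrites the whole contents file and that
the cost of a rewrite is proportional to the number of entries in the
file at that point. The entry counts are made up for illustration and
are not measurements of any real package; total_rewrite_work is a
hypothetical helper, not part of the packaging tools.

```python
def total_rewrite_work(entry_counts):
    """Sum of the contents file size after each package is added.

    Assumed model: one full rewrite per package, with cost proportional
    to the number of entries in the file after that package is added.
    """
    size = 0
    work = 0
    for n in entry_counts:
        size += n     # the file grows by this package's entries
        work += size  # and is then rewritten in full
    return work


# Hypothetical entry counts: two heavy hitters and four small packages.
packages = [5000, 3000, 200, 100, 50, 20]

big_first = sorted(packages, reverse=True)
small_first = sorted(packages)

print("heavy hitters first:", total_rewrite_work(big_first))
print("heavy hitters last: ", total_rewrite_work(small_first))
```

Under this model the final file is the same size either way, but
installing the heavy hitters last means most rewrites operate on a
much smaller file.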

The trend here is that package installation gets quicker as the
install progresses, as the packages get smaller, despite the contents
file growing in size.

Given the observed rates for packages and data, and making some
assumptions, I estimate that the total cost of the contents file
rewrites is around 4 minutes. Maybe only 3. (You install the same
amount of data in the first 10 minutes as the last 12, so the
extra 2 minutes is the extra cost of rewriting the contents file.)

So, the contents file rewriting cost is ~20% or less of the total
package installation time. (On this machine, for this test...)
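
One way to get from the 2-minute difference to a total of about 4
minutes is sketched below. The model is an assumption made for
illustration, not necessarily the reasoning behind the estimate: the
contents file grows linearly with the fraction x of data installed,
and rewrite cost accrues in proportion to the file size, so the cost
up to x is k*x^2/2 for some constant k.

```python
# Observed: the first half of the data takes 10 minutes, the second 12.
first_half_min = 10.0
second_half_min = 12.0
extra_min = second_half_min - first_half_min


def rewrite_cost(k, x):
    """Cumulative rewrite cost (minutes) after a fraction x of the data.

    Assumed model: file size grows linearly with x and cost accrues in
    proportion to file size, giving k * x^2 / 2.
    """
    return k * x * x / 2


# Second half minus first half is (k/2 - k/8) - k/8 = k/4,
# so the observed extra time fixes k.
k = extra_min * 4

first_half_cost = rewrite_cost(k, 0.5)
total_cost = rewrite_cost(k, 1.0)

pkg_install_min = 22.25  # 22m15s from the timings
print(f"rewrite cost, first half:  {first_half_cost:.1f} min")
print(f"rewrite cost, second half: {total_cost - first_half_cost:.1f} min")
print(f"rewrite cost, total:       {total_cost:.1f} min")
print(f"share of package install:  {100 * total_cost / pkg_install_min:.0f}%")
```

Under these assumptions the rewrites cost 1 minute in the first half
and 3 in the second, for 4 minutes in total, or about 18% of the
package installation time.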

Does this fit in with anything else? You can analyse the contents
file handling more closely by observing zone creation. Create a
sparse root zone, and you don't install any data - but you do have
to rebuild the contents file. This takes just over 6 minutes (so
that gives an upper bound to the contents file cost straight away),
with the disk about 50% busy (and giving good response - these are
large streamed writes). That is consistent with the estimate of 3-4
minutes or so for the rewrite itself, since pkgadd has other work to
do in addition to the rewrite.

I can compare this with a similar system - a V240, that has cpus
clocked 50% quicker but a similar disk. That installs its packages
in 17 minutes. Looks like a similar fixed cost for rewriting the
contents file, but the regular install is quicker as the machine
is quicker.

The limiting factor in current installs appears to be the cpu
required to bzcat the archives. It took the test machine 2 minutes
to unpack the staroffice archive - going at 2M/s which is the
network data rate I observed during install. That's about 1/6 of
the total data - so 12 minutes of the 22 is bzcat. That's the
lowest hanging fruit. Even when you've solved that there's still
10 minutes left - and only 4 of those appear to be down to the
contents file. There is of course the actual time it takes to write
the data to disk - and, being scattered around in small files,
those writes won't be as efficient as the contents file.
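
The breakdown above can be written out as a short calculation. It is
only a sketch: it takes the figures from the text at face value (2
minutes for the sample archive at 2M/s, the sample being 1/6 of the
data, 22 minutes of package install, 4 minutes for the contents file)
and assumes the other archives unpack at the same rate.

```python
# Figures taken from the text above.
sample_unpack_min = 2.0   # time to bzcat the staroffice archive
rate_mb_per_s = 2.0       # observed rate while unpacking
total_over_sample = 6     # the sample is about 1/6 of the total data
pkg_install_min = 22.0    # package installation time, rounded
contents_file_min = 4.0   # estimated contents file rewrite cost

# Data volumes implied by the rate (compressed size, roughly).
sample_mb = sample_unpack_min * 60 * rate_mb_per_s
total_mb = sample_mb * total_over_sample

# Scale the sample up, assuming everything unpacks at the same rate.
bzcat_min = sample_unpack_min * total_over_sample
remaining_min = pkg_install_min - bzcat_min
other_min = remaining_min - contents_file_min

print(f"sample archive:  {sample_mb:.0f} MB, total about {total_mb:.0f} MB")
print(f"bzcat:           {bzcat_min:.0f} min")
print(f"left after that: {remaining_min:.0f} min")
print(f"  contents file: {contents_file_min:.0f} min")
print(f"  everything else (disk writes, scripts): {other_min:.0f} min")
```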

So, for this test, of the total 36 minutes the contents file rewrite
is about 20% of the package installation time and 10% of the total
elapsed time. The impact on a DVD install is even less - there it's
even more important to stream the data adequately fast.

Even so, the contents file is costing us 4 minutes, and once you
solve the uncompress problem and the SMF import it starts to look
like the next target. But even there it's only going to be 1/3 of
the time at most, and there are a couple of ways to speed things
up:
 - re-order package installation to install small packages first
 - install clusters rather than individual packages

-- 
-Peter Tribble
L.I.S., University of Hertfordshire - http://www.herts.ac.uk/
http://www.petertribble.co.uk/ - http://ptribble.blogspot.com/


