Hi! Package install times are quite nasty. For example, a full GUI install can take[1] 1.5 hours on spinning rust, and several minutes even on raid0 of Optane disks -- you can't get much faster without operating entirely in memory[2].

There are two massive recent improvements:
 * eatmydata helps a lot, and losing a half-done install to a power cut is no real loss
 * mmdebstrap can speed up the debootstrap stage by a factor of 3-6

But d-i uses neither, debootstrap is small compared to installing the actual tasks/b-deps afterwards, and there's at least a couple of orders of magnitude of speedup still possible.

You might say: I have a fast NVMe disk, a slow network, and don't install more often than once every few years. Then yeah, you don't need a d-i speedup. I care about two use cases:
 * boxes with HDDs or SD cards
 * datacenter VMs, buildds

It seems that the hard minimum is around 1 second on modern hardware. You can't unpack the full set of .xz debs faster than 0.75s assuming full parallelism -- which is impossible, as firefox.deb alone takes 3s (with other CPU cores loaded). We can cheat here by splitting such big .debs or repacking them with zstd.

But that's not all -- the .debs have to be fetched from the network (750MB in a second calls for a 10Gb network at minimum). Then configured.

Writeout is not a problem: the full test install is 2.5GB unpacked, and you don't need to persist that immediately. A desktop install can write out while the user answers installer questions, a buildd doesn't care about durability, and a VM can do useful work while the disk is still writing.

You don't need that big a CPU: while I benchmarked the preliminary code on a -j64 2990WX, the actual utilization was less than 1/4 despite the task being highly parallel. This CPU chokes on its limited memory bandwidth (64MB of L3 cache hardly counts for big xz decompression); you want something with fewer, faster cores. And it's the slowest machines, like 4-core boards, that actually benefit the most in absolute terms.
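
To make the two limits above concrete (the network figure and the "one big .deb" floor), here is a small back-of-the-envelope sketch. The function names and the list of link speeds are mine, purely for illustration; the per-package timings in the usage note are made up except for the 3s firefox.deb figure.

```python
from typing import Sequence


def required_gbps(size_mb: float, seconds: float) -> float:
    """Link speed in Gbit/s needed to move size_mb (decimal megabytes) in `seconds`."""
    return size_mb * 8 / 1000 / seconds


def smallest_link(needed_gbps: float,
                  tiers: Sequence[float] = (1, 10, 25, 40, 100)) -> float:
    """Smallest link speed from `tiers` (an assumed list, in Gbit/s) that is fast enough."""
    for tier in sorted(tiers):
        if tier >= needed_gbps:
            return tier
    raise ValueError("no listed link is fast enough")


def makespan_lower_bound(durations: Sequence[float], workers: int) -> float:
    """Lower bound on wall time for independent jobs: no schedule can beat
    either the evenly spread total or the single longest job."""
    return max(sum(durations) / workers, max(durations))
```

For 750MB in one second, `required_gbps(750, 1.0)` gives 6 Gbit/s, which a 1Gb link can't do, hence 10Gb. And `makespan_lower_bound([3.0, 0.5, 0.5], 64)` stays at 3 seconds no matter how many cores you throw at it, which is why splitting or repacking the biggest .debs matters.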

No, there's no such thing as a 1-way machine that can install a modern distro anymore[3]: the oldest machine I own, a non-NX Pentium4, is already -j2; when 3 years ago I needed the cheapest possible box with • USB, • local storage, • ethernet; it had 4 cores and 512MB RAM. Non-SMP is dead and buried, forget about ever optimizing for that.

Test set: buster's task-xfce-desktop. That's 750MB of .debs, 2.5GB result.

So let's see if we can approach this theoretical limit. So far I came up with the following:
 * let's not care about power loss during install. So no fsyncs, and no writing a single byte that's going to be overwritten later. Do a global sync() only when entering grub-install.
 * almost all Pre-Depends and preinsts care only about upgrades; on a clean-slate install you can ignore them and at most fix up later
 * dpkg-diverts can be a problem but going yolo seems to work for me so far (not sure if all cases can be fixed up after the fact -- dash can)
 * being able to unpack in parallel also means you don't need to care about order: install can start before apt-download has finished. This is awesome when your mirror has a slower link than that 10Gb... We can install package X the moment apt has fetched it even though it's still downloading packages Y and Z. (Nb: what's a good way to know apt is done? I screen-scrape -oDebug::pkgAcquire looking for "Dequeuing", which is a nasty hack.)

The above is all nice and dandy, but I don't know how to do configure right. It seems that at least some triggers can be parallelized. man-db is by a large margin the biggest offender -- it seems to have no dependencies, so it's a great low-hanging fruit. Somehow it worked for me even before ldconfig -- that's probably insane though, so ldconfig should go first. Both ldconfig and man-db are ordered after all unpacks of unrelated packages have finished -- is it possible to do them piecewise?

I have hardly looked at other postinsts yet; I wonder how they can be elided or fast-tracked.
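
The "unpack X the moment it's fetched" idea from the list above can be sketched roughly like this. It is not my actual code, just a minimal illustration: `unpack_as_fetched` and `extract_deb` are hypothetical names, the /target path is an assumption, and the source of finished paths (whatever tells you a download is complete) is left as a plain iterator. `dpkg-deb --extract` only drops the file tree; it runs no maintainer scripts and writes no dpkg status, so those still have to be dealt with afterwards.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List


def extract_deb(deb_path: str, target: str = "/target") -> None:
    """Unpack the file tree of one .deb into `target` (assumed install root)."""
    subprocess.run(["dpkg-deb", "--extract", deb_path, target], check=True)


def unpack_as_fetched(
    finished: Iterable[str],
    unpack: Callable[[str], None] = extract_deb,
    workers: int = 4,
) -> List[str]:
    """Hand each .deb to the pool as soon as `finished` yields it.

    `finished` may block between items (e.g. while apt is still downloading);
    earlier packages keep unpacking in the worker threads meanwhile.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Iterating lazily means submission order follows download order.
        futures = [(path, pool.submit(unpack, path)) for path in finished]
        for _path, future in futures:
            future.result()  # re-raises if any unpack failed
        return [path for path, _ in futures]
```

Threads are enough here because the real work happens in the child processes; the pool only bounds how many run at once.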

My dependency graph so far:

apt-update
|
+>------------------
|                   \
apt -s install       apt-cache dumpavail
|                   /
+<------------------
|
stat(if .debs are here)
|\
| +----------+----...
| |          |
| unpack 1   unpack 2   (.debs that were already on disk)
| \----------+
|             \---------\
apt download            |
Finished 3 -> unpack 3 -+
Finished 4 -> unpack 4 -+
Finished 5 -> unpack 5 -+
                        |
                (unpack complete)
                        |
                    ldconfig
                   /        \
              man-db      write dpkg's status
                 |           |
                 |        dpkg --configure -a (fully serial...)
                 |       /
                 +-------/
                 |
               Done!

One other issue is that the whole plan needs to be known before starting. So no running in-target tasksel, asking whether you want popcon, etc. But that'd actually fix another of my gripes about d-i.

So... any comments so far? Any hints how to cheat the configure step?


Meow!

[1]. Both times on btrfs, which interacts especially badly with fsync spam dpkg does.
[2]. There's a hidden meaning here.
[3]. Counting only stuff you can buy new; heavily embedded doesn't run Debian but specially crafted distros.
-- 
⢀⣴⠾⠻⢶⣦⠀
⣾⠁⢠⠒⠀⣿⡁ Ivan was a worldly man: born in St. Petersburg, raised in
⢿⡄⠘⠷⠚⠋⠀ Petrograd, lived most of his life in Leningrad, then returned
⠈⠳⣄⠀⠀⠀⠀ to the city of his birth to die.