Re: build.sh sets with xz (was Re: vfs cache timing changes)
On Fri, Sep 13, 2019 at 06:59:42AM -0400, Greg Troxel wrote:
> (I have not really been building current so am unclear on the xz
> details.)
>
> I'd like us to keep somewhat separate the notions of:
>
>   someone is doing build.sh release
>
>   someone wants min-size sets at the expense of a lot of cpu time

I agree, modulo s/release/sets/, especially when we're doing things
like "reproducible builds" or "make images suitable for giving others".
Very different use cases. When I'm developing (and/or trying to verify
if the build works), I don't care that the tgz sets are 1.9x the size
of the xz ones [mostly due to comp and base, if you're curious]; if
I'm trying to send them across the net, I do.

The big issue I run into is that `build.sh sets` - which you pretty
much need for most kinds of consistent installation - takes quite a
bit longer and much more CPU with xz as compared to pigz. The results
from doing `build.sh -x sets` in my VM, with the only change being
USE_PIGZGZIP=1 in /etc/mk.conf, are

  USE_PIGZGZIP=1     216s user   38s system    80s wallclock
  (commented out)    536s user   77s system   279s wallclock

and this is on a relatively modern machine's VM with a fast SSD; the
difference gets greatly amplified on my anemic real-hardware server
boxes.
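To put the two timing rows side by side, here is a minimal sketch that turns them into slowdown factors. The `ratio` helper is a hypothetical name used only for illustration; the numbers are the ones quoted in the message.

```shell
#!/bin/sh
# ratio SLOW FAST: print SLOW/FAST rounded to one decimal place.
# Hypothetical helper, not part of build.sh.
ratio() {
    awk -v slow="$1" -v fast="$2" 'BEGIN { printf "%.1f\n", slow / fast }'
}

# Timings quoted above: xz (USE_PIGZGZIP commented out) vs. pigz.
echo "user CPU:  $(ratio 536 216)x"
echo "system:    $(ratio 77 38)x"
echo "wallclock: $(ratio 279 80)x"
```

On these figures xz costs roughly 2.5x the user CPU time and 3.5x the wallclock time of pigz for the same `build.sh -x sets` run.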
Re: build.sh sets with xz (was Re: vfs cache timing changes)
Martin Husemann writes:
> On Fri, Sep 13, 2019 at 06:59:42AM -0400, Greg Troxel wrote:
>> I'd like us to keep somewhat separate the notions of:
>>
>>   someone is doing build.sh release
>>
>>   someone wants min-size sets at the expense of a lot of cpu time
>>
>>
>> I regularly do build.sh release, and rsync the releasedir bits to other
>> machines, and use them to install. Now perhaps I should be doing
>> "distribution", but sometimes I want the ISOs.
>
> The default is MKDEBUG=no so you probably will not notice the compression
> difference that much.

I don't follow what DEBUG has to do with this, but that's not
important.

> If you set MKDEBUG=yes you can just as easily set USE_XZ_SETS=no
> (or USE_PIGZGZIP=yes if you have pigz installed).

Sure, I realize I could do this. The question is about defaults.

> The other side of the coin is that we have reproducible builds, and we
> should not make it harder than needed to reproduce our official builds.

It should not be difficult or hard to understand, which is perhaps
different than defaults.

> But ... it already needs some settings (which we still need to document
> on a wiki page properly), so we could also default to something else
> and force maximal compression via the build.sh command line on the
> build cluster.

I could see MKREPRODUCILE=yes causing defaults of various things to be
a particular way, and perhaps letting XZ default to no otherwise. I
would hope that what MKREPRODUCILE=yes has to set is not very many
things, but I haven't kept up.
Re: build.sh sets with xz (was Re: vfs cache timing changes)
On Fri, Sep 13, 2019 at 07:06:59AM +0200, Martin Husemann wrote:
> I am not sure whether the xz compiled in tools supports the "-T threads"
> option, but if it does, we can add "-T 0" to the default args and see how
> much that improves things. Jörg, do you know this?

It doesn't currently, since it is somewhat of a PITA to deal with
pthread support portably.

Joerg
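For anyone wanting to check a particular xz binary, a minimal sketch of a probe for the "-T" option follows. `xz_supports_threads` is a hypothetical helper name, and the probe only tells you whether the option is accepted, not whether threading actually helps.

```shell
#!/bin/sh
# xz_supports_threads XZ: succeed if the given xz binary accepts "-T 0".
# Hypothetical helper; it compresses empty input and discards the result.
xz_supports_threads() {
    "$1" -T 0 -c </dev/null >/dev/null 2>&1
}

# Assumption: "xz" here is whatever is first in PATH, which is not
# necessarily the xz that build.sh compiles into its tools directory.
if xz_supports_threads xz; then
    echo "xz accepts -T 0"
else
    echo "xz lacks -T support (or is not installed)"
fi
```

To test the tools copy instead, pass its full path as the argument.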
Re: build.sh sets with xz (was Re: vfs cache timing changes)
On Fri, Sep 13, 2019 at 01:15:07PM +0200, Martin Husemann wrote:
> The default is MKDEBUG=no so you probably will not notice the compression
> difference that much.
>
> If you set MKDEBUG=yes you can just as easily set USE_XZ_SETS=no
> (or USE_PIGZGZIP=yes if you have pigz installed).

Or (evil hack:) we could use a different compression option for the
*debug sets, as in the end those will not end up on the ISO anyway.

(Sadly, I had hoped for them to fit and ship them by default on the
ISOs as well).

Martin
Re: build.sh sets with xz (was Re: vfs cache timing changes)
On Fri, Sep 13, 2019 at 06:59:42AM -0400, Greg Troxel wrote:
> I'd like us to keep somewhat separate the notions of:
>
>   someone is doing build.sh release
>
>   someone wants min-size sets at the expense of a lot of cpu time
>
>
> I regularly do build.sh release, and rsync the releasedir bits to other
> machines, and use them to install. Now perhaps I should be doing
> "distribution", but sometimes I want the ISOs.

The default is MKDEBUG=no so you probably will not notice the
compression difference that much.

If you set MKDEBUG=yes you can just as easily set USE_XZ_SETS=no
(or USE_PIGZGZIP=yes if you have pigz installed).

The other side of the coin is that we have reproducible builds, and we
should not make it harder than needed to reproduce our official builds.

But ... it already needs some settings (which we still need to document
on a wiki page properly), so we could also default to something else
and force maximal compression via the build.sh command line on the
build cluster.

Martin
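As an illustration of keeping the two use cases apart on the command line, here is a minimal sketch. It assumes that build.sh's "-V var=value" option has the same effect as setting the variable in mk.conf; `sets_args` and the "dev"/"dist" labels are hypothetical and exist only for this example.

```shell
#!/bin/sh
# sets_args PURPOSE: print the build.sh arguments for a given purpose.
# "dev" and "dist" are hypothetical labels, not build.sh concepts.
sets_args() {
    case "$1" in
    dev)  echo "-V USE_XZ_SETS=no sets" ;;  # gzip sets for quick local testing
    dist) echo "sets" ;;                    # keep the tree's default compression
    *)    echo "usage: sets_args dev|dist" >&2; return 1 ;;
    esac
}

# Print the command line rather than running it.
echo "./build.sh -x $(sets_args dev)"
```

The sketch only prints the command; whether the default should be the fast or the small variant is exactly the open question in this thread.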
Re: build.sh sets with xz (was Re: vfs cache timing changes)
"Tom Spindler (moof)" writes:
>> PS: The xz compression for the debug set takes 36 minutes on my machine.
>> We should do something about it. Matt to use -T for more parallelism?
>
> On older machines, xz's default settings are pretty much unusable,
> and USE_XZ_SETS=no (or USE_PIGZGZIP=yes) is almost a requirement.
> On my not-exactly-slow i7 6700K, build.sh -j4 parallel is just fine
> until it hits the xz stage; gzip is many orders of magnitude faster.
> Maybe if xz were cranked down to -2 or -3 it'd be better at not
> that much of a compression loss, or it defaulted to the higher
> compression level only when doing a `build.sh release`.

(I have not really been building current so am unclear on the xz
details.)

I'd like us to keep somewhat separate the notions of:

  someone is doing build.sh release

  someone wants min-size sets at the expense of a lot of cpu time

I regularly do build.sh release, and rsync the releasedir bits to other
machines, and use them to install. Now perhaps I should be doing
"distribution", but sometimes I want the ISOs.

Sometimes I do builds just to see if they work, e.g. if being diligent
about testing changes.

(Overall the notion of staying with gzip in most cases, with a tunable
for extreme savings, sounds sensible, but I am too unclear to really
weigh in on it.)
Re: build.sh sets with xz (was Re: vfs cache timing changes)
On Thu, Sep 12, 2019 at 09:49:53PM -0700, Tom Spindler (moof) wrote:
> > PS: The xz compression for the debug set takes 36 minutes on my machine.
> > We should do something about it. Matt to use -T for more parallelism?
>
> On older machines, xz's default settings are pretty much unusable,
> and USE_XZ_SETS=no (or USE_PIGZGZIP=yes) is almost a requirement.

Side note: interestingly this does not appear to be a limiting factor
on my older machines for x86 builds. The LLVM build/link time in the
tools part for gallium explodes the overall build time by several hours
and set compression is not as bad.

Likely depends on available RAM and speed of disk vs. CPU speed.

Martin
Re: build.sh sets with xz (was Re: vfs cache timing changes)
On Thu, Sep 12, 2019 at 09:49:53PM -0700, Tom Spindler (moof) wrote:
> Maybe if xz were cranked down to -2 or -3 it'd be better at not
> that much of a compression loss, or it defaulted to the higher
> compression level only when doing a `build.sh release`.

If I remember the original size tests correctly, that would not give
us any (or much) gain (sizewise) over gzip. There was nearly 30% size
difference between the suggested default compression and -9 (what we
use now).

I see two easy options:

 - move back to gzip for all architectures except those that really
   need a CD-ROM sized ISO and for which gzip does not compress well
   enough (that would make sparc64 the only xz user for now)

 - provide an easy tunable for local builds where you care more about
   packing time than result size

I am not sure whether the xz compiled in tools supports the "-T threads"
option, but if it does, we can add "-T 0" to the default args and see how
much that improves things. Jörg, do you know this?

Martin