Re: build.sh sets with xz (was Re: vfs cache timing changes)

2019-09-13 Thread Tom Spindler (moof)
On Fri, Sep 13, 2019 at 06:59:42AM -0400, Greg Troxel wrote:
> (I have not really been building current so am unclear on the xz
> details.)
> 
> I'd like us to keep somewhat separate the notions of:
> 
>   someone is doing build.sh release
> 
>   someone wants min-size sets at the expense of a lot of cpu time

I agree, modulo s/release/sets/, especially when we're doing things
like "reproducible builds" or "make images suitable for giving
others".  Very different use cases. When I'm developing (and/or
trying to verify if the build works), I don't care that the tgz
sets are 1.9x the size of the xz ones [mostly due to comp and base,
if you're curious]; if I'm trying to send them across the net, I do.

The big issue I run into is that `build.sh sets` - which you pretty much
need for most kinds of consistent installation - takes quite a bit
longer and much more CPU with xz as compared to pigz. The results
from doing `build.sh -x sets` in my VM, with the only change being
USE_PIGZGZIP=1 in /etc/mk.conf, are

                  user   system   wallclock
USE_PIGZGZIP=1    216s   38s      80s
(commented out)   536s   77s      279s

and this is on a relatively modern machine's VM with a fast SSD;
the difference gets greatly amplified on my anemic real-hardware
server boxes.
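
For scale, the ratios implied by those figures can be worked out with a
small shell sketch. The `ratio` helper and the variable names are made up
for illustration; only the six timings come from the measurements above.

```shell
#!/bin/sh
# Print a / b with two decimals (hypothetical helper, not part of build.sh).
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

# Timings in seconds, copied from the measurements above.
pigz_user=216
pigz_sys=38
pigz_wall=80
xz_user=536
xz_sys=77
xz_wall=279

# CPU time is taken as user + system.
pigz_cpu=$((pigz_user + pigz_sys))
xz_cpu=$((xz_user + xz_sys))

echo "wallclock: xz / pigz = $(ratio "$xz_wall" "$pigz_wall")"
echo "cpu:       xz / pigz = $(ratio "$xz_cpu" "$pigz_cpu")"
```

In this VM that works out to roughly 3.5x the wallclock time and 2.4x the
CPU time for the xz run.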



Re: build.sh sets with xz (was Re: vfs cache timing changes)

2019-09-13 Thread Greg Troxel
Martin Husemann  writes:

> On Fri, Sep 13, 2019 at 06:59:42AM -0400, Greg Troxel wrote:
>> I'd like us to keep somewhat separate the notions of:
>> 
>>   someone is doing build.sh release
>> 
>>   someone wants min-size sets at the expense of a lot of cpu time
>> 
>> 
>> I regularly do build.sh release, and rsync the releasedir bits to other
>> machines, and use them to install.  Now perhaps I should be doing
>> "distribution", but sometimes I want the ISOs.
>
> The default is MKDEBUG=no so you probably will not notice the compression
> difference that much.

I don't follow what DEBUG has to do with this, but that's not important.

> If you set MKDEBUG=yes you can just as easily set USE_XZ_SETS=no
> (or USE_PIGZGZIP=yes if you have pigz installed).

Sure, I realize I could do this.  The question is about defaults.

> The other side of the coin is that we have reproducible builds, and we
> should not make it harder than needed to reproduce our official builds.

It should not be difficult or hard to understand, which is perhaps
different than defaults.

> But ... it already needs some settings (which we still need to document
> on a wiki page properly), so we could also default to something else
> and force maximal compression via the build.sh command line on the
> build cluster.

I could see

MKREPRODUCIBLE=yes

causing defaults of various things to be a particular way, and perhaps
letting XZ default to no otherwise.  I would hope that what
MKREPRODUCIBLE=yes has to set is not very many things, but I haven't kept
up.
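
The defaulting idea can be sketched in shell as follows. This only
illustrates the proposal and is not how the NetBSD makefiles currently
decide anything; the function name and its argument order are
hypothetical.

```shell
#!/bin/sh
# Hypothetical sketch of the proposed defaulting rule.
#   $1 = value of MKREPRODUCIBLE ("yes"/"no", may be empty)
#   $2 = explicit USE_XZ_SETS setting (may be empty)
# Prints the effective USE_XZ_SETS value.
effective_use_xz_sets() {
    reproducible=${1:-no}
    explicit=${2:-}

    if [ -n "$explicit" ]; then
        # An explicit user setting always wins over any default.
        echo "$explicit"
    elif [ "$reproducible" = "yes" ]; then
        # Reproducing official builds: use the official compression.
        echo yes
    else
        # Ordinary local build: default to the cheaper compression.
        echo no
    fi
}

effective_use_xz_sets yes ""
effective_use_xz_sets no ""
```

The point of the sketch is that a single switch fixes the defaults while
an explicit setting can still override them.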



Re: build.sh sets with xz (was Re: vfs cache timing changes)

2019-09-13 Thread Joerg Sonnenberger
On Fri, Sep 13, 2019 at 07:06:59AM +0200, Martin Husemann wrote:
> I am not sure whether the xz compiled in tools supports the "-T threads"
> option, but if it does, we can add "-T 0" to the default args and see how
> much that improves things. Jörg, do you know this?

It doesn't currently, since it is somewhat of a PITA to deal with
pthread support portably.

Joerg


Re: build.sh sets with xz (was Re: vfs cache timing changes)

2019-09-13 Thread Martin Husemann
On Fri, Sep 13, 2019 at 01:15:07PM +0200, Martin Husemann wrote:
> The default is MKDEBUG=no so you probably will not notice the compression
> difference that much.
> 
> If you set MKDEBUG=yes you can just as easily set USE_XZ_SETS=no
> (or USE_PIGZGZIP=yes if you have pigz installed).

Or (evil hack:) we could use a different compression option for the *debug
sets, as in the end those will not end up on the ISO anyway.
(Sadly, I had hoped for them to fit and ship them by default on the ISOs
as well).

Martin


Re: build.sh sets with xz (was Re: vfs cache timing changes)

2019-09-13 Thread Martin Husemann
On Fri, Sep 13, 2019 at 06:59:42AM -0400, Greg Troxel wrote:
> I'd like us to keep somewhat separate the notions of:
> 
>   someone is doing build.sh release
> 
>   someone wants min-size sets at the expense of a lot of cpu time
> 
> 
> I regularly do build.sh release, and rsync the releasedir bits to other
> machines, and use them to install.  Now perhaps I should be doing
> "distribution", but sometimes I want the ISOs.

The default is MKDEBUG=no so you probably will not notice the compression
difference that much.

If you set MKDEBUG=yes you can just as easily set USE_XZ_SETS=no
(or USE_PIGZGZIP=yes if you have pigz installed).
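
As a rough sketch, the choice between those two settings could be
scripted like this. The helper name is made up, and passing the result
through build.sh's -V option instead of /etc/mk.conf is an assumption
about how one might wire it up, not a tested recipe.

```shell
#!/bin/sh
# Hypothetical helper: print the override to use for a fast local build.
#   $1 = name of the parallel gzip tool to look for (defaults to pigz)
fast_sets_override() {
    tool=${1:-pigz}
    if command -v "$tool" >/dev/null 2>&1; then
        # pigz is installed, so let it do the gzip work in parallel.
        echo "USE_PIGZGZIP=yes"
    else
        # No pigz: just turn off xz-compressed sets.
        echo "USE_XZ_SETS=no"
    fi
}

# Possible use from the top of the source tree (assumption, see above):
#   ./build.sh -V "$(fast_sets_override)" sets
fast_sets_override
```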

The other side of the coin is that we have reproducible builds, and we
should not make it harder than needed to reproduce our official builds.

But ... it already needs some settings (which we still need to document
on a wiki page properly), so we could also default to something else
and force maximal compression via the build.sh command line on the
build cluster.

Martin


Re: build.sh sets with xz (was Re: vfs cache timing changes)

2019-09-13 Thread Greg Troxel
"Tom Spindler (moof)"  writes:

>> PS: The xz compression for the debug set takes 36 minutes on my machine.
>> We should do something about it. Matt to use -T for more parallelism?
>
> On older machines, xz's default settings are pretty much unusable,
> and USE_XZ_SETS=no (or USE_PIGZGZIP=yes) is almost a requirement.
> On my not-exactly-slow i7 6700K, build.sh -j4 parallel is just fine
> until it hits the xz stage; gzip is many orders of magnitude faster.
> Maybe if xz were cranked down to -2 or -3 it'd be better at not
> that much of a compression loss, or it defaulted to the higher
> compression level only when doing a `build.sh release`.

(I have not really been building current so am unclear on the xz
details.)

I'd like us to keep somewhat separate the notions of:

  someone is doing build.sh release

  someone wants min-size sets at the expense of a lot of cpu time


I regularly do build.sh release, and rsync the releasedir bits to other
machines, and use them to install.  Now perhaps I should be doing
"distribution", but sometimes I want the ISOs.

Sometimes I do builds just to see if they work, e.g. if being diligent
about testing changes.

(Overall the notion of staying with gzip in most cases, with a tunable
for extreme savings, sounds sensible, but I am too unclear to really weigh
in on it.)


Re: build.sh sets with xz (was Re: vfs cache timing changes)

2019-09-12 Thread Martin Husemann
On Thu, Sep 12, 2019 at 09:49:53PM -0700, Tom Spindler (moof) wrote:
> > PS: The xz compression for the debug set takes 36 minutes on my machine.
> > We should do something about it. Matt to use -T for more parallelism?
> 
> On older machines, xz's default settings are pretty much unusable,
> and USE_XZ_SETS=no (or USE_PIGZGZIP=yes) is almost a requirement.

Side note: interestingly this does not appear to be a limiting factor
on my older machines for x86 builds. The LLVM build/link time in the
tools part for gallium explodes the overall build time by several hours
and set compression is not as bad. Likely depends on available ram and
speed of disk vs. CPU speed.

Martin


Re: build.sh sets with xz (was Re: vfs cache timing changes)

2019-09-12 Thread Martin Husemann
On Thu, Sep 12, 2019 at 09:49:53PM -0700, Tom Spindler (moof) wrote:
> Maybe if xz were cranked down to -2 or -3 it'd be better at not
> that much of a compression loss, or it defaulted to the higher
> compression level only when doing a `build.sh release`.

If I remember the original size tests correctly, that would not give us
any (or much) gain (sizewise) over gzip.

There was nearly 30% size difference between the suggested default compression
and -9 (what we use now).

I see two easy options:

 - move back to gzip for all architectures except those that really need
   a CD-ROM sized ISO and for which gzip does not compress well enough
   (that would make sparc64 the only xz user for now)

 - provide an easy tunable for local builds where you care more about
   packing time than result size

I am not sure whether the xz compiled in tools supports the "-T threads"
option, but if it does, we can add "-T 0" to the default args and see how
much that improves things. Jörg, do you know this?
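
One way to probe for this is to run the compressor on empty input with
the extra arguments and look at the exit status. The helper below is a
hypothetical sketch; note that a binary accepting "-T 0" does not prove
that multithreading was actually compiled in.

```shell
#!/bin/sh
# Hypothetical probe: return 0 if command $1 accepts the remaining
# arguments when compressing empty input to stdout.
accepts_args() {
    cmd=$1
    shift
    "$cmd" "$@" -c </dev/null >/dev/null 2>&1
}

# Probe whichever xz is first in PATH; the tools xz built by build.sh
# would need to be named explicitly instead.
if accepts_args xz -T 0; then
    echo "xz accepts -T 0"
else
    echo "xz is missing or rejects -T 0"
fi
```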

Martin