On 2021-01-20 Sebastian Andrzej Siewior wrote:
> On 2021-01-18 23:52:50 [+0200], Lasse Collin wrote:
> > I have understood that *in practice* the problem with the xz command
> > line tool is limited to "xz -T0" usage so fixing this use case is
> > enough for most people. Please correct me if I missed something.  
> 
> Correct.

There is some code for special behavior with -T0 now for both
compression and decompression. I haven't updated the man page yet but
the commit messages should be helpful. I hope it can be documented so
that it sounds simple enough. :-)

> In the parallel decompress I added code on Linux to query the
> available memory. I would prefer that as an upper limit on 64bit if no
> limit is given. The reason is that *this* amount of memory is safe to
> use without over-committing / involving swap.

This may be the way to go on Linux but I didn't add it yet. The
committed code uses total_ram / 4. Since MemAvailable is Linux-specific,
something more broadly available needs to exist for better portability,
and total_ram / 4 could perhaps be it. It can be tweaked if needed,
it's just a starting point.
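
To make the two options concrete, here is a rough sketch of how they
could be combined. It is not the code in xz.git: the function names are
made up for illustration, preferring MemAvailable over total_ram / 4 is
only an assumption about how the two might be ordered, and
_SC_PHYS_PAGES is a common extension rather than strict POSIX.

```c
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

/* Total physical RAM in bytes, or 0 if it cannot be determined.
 * _SC_PHYS_PAGES is an extension, so fall back to 0 without it. */
uint64_t total_ram_bytes(void)
{
#if defined(_SC_PHYS_PAGES) && defined(_SC_PAGESIZE)
    long pages = sysconf(_SC_PHYS_PAGES);
    long page_size = sysconf(_SC_PAGESIZE);
    if (pages > 0 && page_size > 0)
        return (uint64_t)pages * (uint64_t)page_size;
#endif
    return 0;
}

/* Parse "MemAvailable: N kB" from a /proc/meminfo-style stream.
 * Returns the value in bytes, or 0 if the field is not present. */
uint64_t parse_mem_available(FILE *f)
{
    char line[256];
    unsigned long long kib;

    while (fgets(line, sizeof(line), f) != NULL) {
        if (sscanf(line, "MemAvailable: %llu kB", &kib) == 1)
            return (uint64_t)kib * 1024;
    }
    return 0;
}

/* Hypothetical policy: use MemAvailable when it is known (Linux),
 * otherwise fall back to the portable total_ram / 4. */
uint64_t default_memlimit(uint64_t total_ram, uint64_t mem_available)
{
    if (mem_available != 0)
        return mem_available;
    return total_ram / 4;
}
```

A caller on Linux would open /proc/meminfo, pass it to
parse_mem_available(), and hand the result to default_memlimit()
together with total_ram_bytes(). On other systems the second argument
would simply be 0.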

> For 32bit applications I would cap that limit to 2.5 GiB or so. The
> reason is that the *normal* case is to run 32bit application on a
> 32bit kernel and so likely only 3GiB can be addressed at most (minus
> a few details like linked in libs, NULL page, guard pages and so on).
> The 32bit application on 64bit kernel is probably a shortcut where
> something is done in a 32bit chroot - like building a package.
> 
> I'm not sure what a sane upper limit is on other OSes. Limiting it on
> 32bit probably does more good than bad if there is no -M parameter.

I think a generic cap needs to be below 2 GiB, for example, in case
32-bit MIPS can do only 2 GiB. There could be OS+arch-specific
exceptions though.

The code currently in xz.git uses 1400 MiB. There needs to be some
extra room if repeated mallocs and frees fragment the address space a
little. Perhaps it's too conservative but it allows eight compression
threads at the default xz -6, and one thread at -9 in threaded mode (so
it can create a file that can be decompressed in threaded mode).
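
As a sketch of what such a cap amounts to (again with made-up names,
not the actual xz code), the limit is clamped only when the process has
32-bit pointers. The pointer size is a parameter here purely so that
both cases can be exercised on any machine:

```c
#include <stddef.h>
#include <stdint.h>

#define MIB (UINT64_C(1) << 20)

/* The 1400 MiB value comes from the text above; the macro name is
 * hypothetical. */
#define MEMLIMIT_CAP_32BIT (UINT64_C(1400) * MIB)

/* Clamp the limit when the address space is 32 bits or smaller.
 * Callers would normally pass sizeof(void *) as pointer_bytes. */
uint64_t cap_memlimit_for_arch(uint64_t limit, size_t pointer_bytes)
{
    if (pointer_bytes <= 4 && limit > MEMLIMIT_CAP_32BIT)
        return MEMLIMIT_CAP_32BIT;
    return limit;
}
```

In real use the call would be
cap_memlimit_for_arch(limit, sizeof(void *)), so a 64-bit build keeps
whatever limit was computed earlier.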

> > An alternative "fix" for the liblzma case could be adding a simple
> > API function that would scale down the number of threads in a
> > lzma_mt structure based on a memory usage limit and if the
> > application is 32 bits. Currently the thread count and LZMA2
> > settings adjusting code is in xz, not in liblzma.  
> 
> It might help. dpkg checks the memlimit with
> lzma_stream_encoder_mt_memusage() and decreases the memory limit until
> it fits. It looks simpler compared to rpm's attempt and various
> exceptions.

Now that the lzma_mt structure already contains memlimit_threading, a
flag could be added to use it to reduce the number of threads at encoder
initialization. I suppose reducing the thread count would go a long
way. It doesn't affect the compressed output, so it can be done even
when people want reproducible output.
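
The thread-reduction idea is roughly the loop below. This is only an
illustration: fit_threads() and toy_memusage() are hypothetical names,
and the cost model of 100 MiB per thread is invented so that the
example is self-contained. With liblzma the callback would presumably
wrap lzma_stream_encoder_mt_memusage() on an lzma_mt whose threads
field is set to the candidate count.

```c
#include <stdint.h>

/* Callback that estimates encoder memory usage for a thread count. */
typedef uint64_t (*memusage_fn)(uint32_t threads);

/* Drop threads one at a time until the estimate fits the limit.
 * Never goes below one thread, even if that still exceeds the limit;
 * the caller has to handle that case separately (for example by
 * lowering the LZMA2 settings, which does change the output). */
uint32_t fit_threads(uint32_t threads, uint64_t memlimit,
                     memusage_fn memusage)
{
    while (threads > 1 && memusage(threads) > memlimit)
        --threads;
    return threads;
}

/* Invented cost model for demonstration: 100 MiB per thread. */
uint64_t toy_memusage(uint32_t threads)
{
    return (uint64_t)threads * (UINT64_C(100) << 20);
}
```

With the toy model, asking for eight threads under a 350 MiB limit
ends up with three threads. Only the thread count changes, so the
compressed output stays the same.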

> > The idea for the current 4020 MiB special limit is based on a patch
> > that was in use in FreeBSD to solve the problem of 32-bit xz on
> > 64-bit kernel. So at least FreeBSD should be supported to not make
> > 32-bit xz worse under 64-bit FreeBSD kernel.  
> 
> Is this a common case?

I don't *know* but I guess some people build 32-bit packages on a
64-bit kernel, so it may be a common enough use case.

> While poking around, Linux has this personality() syscall/function.
> There is a flag called PER_LINUX32_3GB and PER_LINUX_32BIT which are
> set if the command is invoked with `linux32' say
>       linux32 xz
> 
> then it would have that flag set and xz could act on it. It is not set by
> starting a 32bit application on a 64bit kernel on its own or on a
> 32bit kernel. I don't know if this is common practise but I use this
> in my chroots. So commands like `uname -m' return `i686' instead of
> `x86_64'. If other chroot environments do it as well then it could be
> used as a hack to assume that it is run on 64bit kernel. That is if
> we want that of course :)

I haven't looked at this but it sounds like it could be useful. If xz
knows that it has 4 GiB of address space the default limit could be much
higher.
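
For reference, a Linux-only sketch of the check might look like this.
It assumes that `linux32' selects the PER_LINUX32 execution domain,
which I haven't verified, and the function names are hypothetical.
Calling personality() with 0xffffffff only queries the current value
without changing it.

```c
#include <stdbool.h>
#include <sys/personality.h>

/* True if the execution domain is PER_LINUX32. PER_MASK strips flag
 * bits such as ADDR_LIMIT_3GB, so PER_LINUX32_3GB matches as well. */
bool persona_is_linux32(unsigned long persona)
{
    return (persona & PER_MASK) == PER_LINUX32;
}

/* Query the personality of the running process. Returns false if
 * the query fails. */
bool running_under_linux32(void)
{
    int persona = personality(0xffffffff);

    if (persona == -1)
        return false;
    return persona_is_linux32((unsigned long)persona);
}
```

If running_under_linux32() returns true in a 32-bit build, xz could
take it as a hint that the kernel is 64-bit and raise the default
limit accordingly.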

-- 
Lasse Collin
