Re: [xz-devel] [BUG] Issue with xz-java: Unknown Filter ID

2024-07-18 Thread Lasse Collin
On 2024-07-18 Brett Okken wrote:
> There were a number of "unsigned" related operations added in 1.8.
> I cannot find any of the official doc, but here is a decent
> summarization of what was added:
> https://www.baeldung.com/java-unsigned-arithmetic

Thanks! I suspect that RangeDecoder was the only performance-critical
place with unsigned integers.

-- 
Lasse Collin



Re: [xz-devel] [BUG] Issue with xz-java: Unknown Filter ID

2024-07-18 Thread Brett Okken
Glad it worked.
There were a number of "unsigned" related operations added in 1.8.
I cannot find any of the official doc, but here is a decent
summarization of what was added:
https://www.baeldung.com/java-unsigned-arithmetic

On Thu, Jul 18, 2024 at 11:16 AM Lasse Collin  wrote:
>
> On 2024-07-18 Brett Okken wrote:
> > Did you try out the Integer.compareUnsigned method as an alternative?
>
> No, I was too dumb to even look for it. :-( This code was written when
> Java 6 was the latest stable release, and back then it seemed that Java
> was quite negative about supporting unsigned integers.
>
> Integer.compareUnsigned seems to give the same performance as the
> "long" method on x86-64. So compareUnsigned is the obvious choice. I
> have committed it to master.
>
> Thanks!
>
> --
> Lasse Collin



Re: [xz-devel] [BUG] Issue with xz-java: Unknown Filter ID

2024-07-18 Thread Lasse Collin
On 2024-07-18 Brett Okken wrote:
> Did you try out the Integer.compareUnsigned method as an alternative?

No, I was too dumb to even look for it. :-( This code was written when
Java 6 was the latest stable release, and back then it seemed that Java
was quite negative about supporting unsigned integers.

Integer.compareUnsigned seems to give the same performance as the
"long" method on x86-64. So compareUnsigned is the obvious choice. I
have committed it to master.
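
For readers unfamiliar with the alternatives, here is a rough sketch of
three common ways to do an unsigned "less than" on Java ints. It is not
the actual RangeDecoder code: the class and method names are made up
for this example, and it is an assumption that the "long" method means
widening both values to long with a mask.

```java
final class UnsignedCompareDemo {
    // Flip the sign bit so that unsigned order maps onto signed order.
    // This works on any Java version.
    static boolean lessThanFlip(int a, int b) {
        return (a ^ 0x80000000) < (b ^ 0x80000000);
    }

    // Widen to long and mask off the sign extension (assumed to be
    // what the "long" method refers to).
    static boolean lessThanLong(int a, int b) {
        return (a & 0xFFFFFFFFL) < (b & 0xFFFFFFFFL);
    }

    // Integer.compareUnsigned has been available since Java 8.
    static boolean lessThanJdk(int a, int b) {
        return Integer.compareUnsigned(a, b) < 0;
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, Integer.MAX_VALUE, Integer.MIN_VALUE, -1};
        for (int a : samples) {
            for (int b : samples) {
                boolean expected = lessThanJdk(a, b);
                if (lessThanFlip(a, b) != expected
                        || lessThanLong(a, b) != expected) {
                    throw new IllegalStateException(a + " vs " + b);
                }
            }
        }
        System.out.println("all three methods agree");
    }
}
```

All three give the same answer, so the choice comes down to readability
and to what the JIT compiler happens to generate on a given platform.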

Thanks!

-- 
Lasse Collin



Re: [xz-devel] [BUG] Issue with xz-java: Unknown Filter ID

2024-07-18 Thread Brett Okken
Lasse,

On the rc_dec_speed PR, it appears to me the main change that might
speed things up is this unsigned comparison?
https://github.com/tukaani-project/xz-java/compare/master...rc_dec_speed#diff-8a1afbf1609c4b2d7813b299fce056f7ddd58a4a24ff02b01c2fdba38ff7fd0dL24-R23

Did you try out the Integer.compareUnsigned method as an alternative?
https://docs.oracle.com/javase/8/docs/api/java/lang/Integer.html#compareUnsigned-int-int-

My observation historically has been that moving to 64bit operations is slower.

I will try to find some time to take a closer look at the PR you referenced.

Brett

On Thu, Jul 18, 2024 at 9:13 AM Lasse Collin  wrote:
>
> On 2024-03-07 Lasse Collin wrote:
> > On 2024-03-05 Dennis Ens wrote:
> > > > I hope 1.10 could be done in a month or two but I don't want to
> > > > make any promises or serious predictions. Historically those
> > > > haven't been accurate at all.
> > >
> > > I'll hope it's on the sooner side then.
>
> 1.10 is getting closer to being ready. I'd like to fix this-escape
> warnings still as those are a sign of design errors. More info here:
>
> https://github.com/tukaani-project/xz-java/pull/18
>
> I also created a branch rc_dec_speed but didn't create a PR for it. It
> speeds up decompression on x86-64 with OpenJDK 22 roughly 4 %. I'm not
> sure if it should be included in 1.10 in some form. If there is
> interest, let's create a separate thread for it, or I can create a PR if
> that is preferred.
>
> So now you and other people with Java knowledge can easily help because
> especially the PR #18 doesn't need much knowledge of the project
> internals. Thanks!
>
> I currently don't know what I should post to xz-devel and what to
> GitHub. Many public communication channels make things hard to follow.
>
> --
> Lasse Collin
>



Re: [xz-devel] [BUG] Issue with xz-java: Unknown Filter ID

2024-07-18 Thread Lasse Collin
On 2024-03-07 Lasse Collin wrote:
> On 2024-03-05 Dennis Ens wrote:
> > > I hope 1.10 could be done in a month or two but I don't want to
> > > make any promises or serious predictions. Historically those
> > > haven't been accurate at all.
> > 
> > I'll hope it's on the sooner side then.

1.10 is getting closer to being ready. I'd like to fix this-escape
warnings still as those are a sign of design errors. More info here:

https://github.com/tukaani-project/xz-java/pull/18

I also created a branch rc_dec_speed but didn't create a PR for it. It
speeds up decompression on x86-64 with OpenJDK 22 roughly 4 %. I'm not
sure if it should be included in 1.10 in some form. If there is
interest, let's create a separate thread for it, or I can create a PR if
that is preferred.

So now you and other people with Java knowledge can easily help because
especially the PR #18 doesn't need much knowledge of the project
internals. Thanks!

I currently don't know what I should post to xz-devel and what to
GitHub. Many public communication channels make things hard to follow.

-- 
Lasse Collin



Re: [xz-devel] xz-java and newer java

2024-07-11 Thread Brett Okken
Looks great to me.
I apologize for not getting to the arm stuff myself.

Thanks,
Brett

On Thu, Jul 11, 2024 at 11:47 AM Lasse Collin 
wrote:

> On 2024-03-26 Lasse Collin wrote:
> > I suppose the ARM64 speed is still to be determined by you or someone
> > else.
>
> I was given results from one ARM64 system with Java 23:
>   - Java 8 code: 5.6 s
>   - Basic:   3.8 s
>   - UnalignedLongLE: 2.4 s
>
> The test command:
>
> time head -c1 /dev/zero \
> | java -jar build/jar/XZEncDemo.jar > /dev/null
>
> It's a similar enough result as on x86-64.
>
> Is the bytearrayview branch ready for merging?
>
> --
> Lasse Collin
>
>


Re: [xz-devel] Require CMake 3.20?

2024-07-11 Thread Lasse Collin
On 2024-07-11 Christoph Biedl wrote:
> Lasse Collin wrote...
> 
> > There weren't any comments in the CMake thread on GitHub[3] and no
> > one has objected on xz-devel so far either. Considering that
> > Autotools support isn't going away anytime soon, I think requiring
> > CMake 3.20 is fine.  
> 
> Just adding my 2¢, I consider your argument sane, and I appreciate a
> lot that you care about older releases. Your notion became rare,
> please stay that way.

Thanks!

I have pushed the CMake >= 3.20 requirement and cleanup to master.

-- 
Lasse Collin



Re: [xz-devel] Require CMake 3.20?

2024-07-11 Thread Christoph Biedl
Lasse Collin wrote...

> There weren't any comments in the CMake thread on GitHub[3] and no one
> has objected on xz-devel so far either. Considering that Autotools
> support isn't going away anytime soon, I think requiring CMake 3.20
> is fine.

Just adding my 2¢, I consider your argument sane, and I appreciate a lot
that you care about older releases. Your notion became rare, please stay
that way.

Christoph




Re: [xz-devel] xz-java and newer java

2024-07-11 Thread Lasse Collin
On 2024-03-26 Lasse Collin wrote:
> I suppose the ARM64 speed is still to be determined by you or someone
> else.

I was given results from one ARM64 system with Java 23:
  - Java 8 code: 5.6 s
  - Basic:   3.8 s
  - UnalignedLongLE: 2.4 s

The test command:

time head -c1 /dev/zero \
| java -jar build/jar/XZEncDemo.jar > /dev/null

It's a similar enough result as on x86-64.

Is the bytearrayview branch ready for merging?

-- 
Lasse Collin



Re: [xz-devel] Require CMake 3.20?

2024-07-11 Thread Lasse Collin
On 2024-07-07 Sebastian Andrzej Siewior wrote:
> On 2024-07-06 15:42:41 [+0300], Lasse Collin wrote:
> > Does anyone think it's too early to require CMake >= 3.20?  
> 
> Debian Bookworm (stable) has cmake 3.25 and the release before it has
> it in backports. So if one rolls their own xz and is using one of those
> two distros and prefers to avoid autotools then it would work.

There was a bug report with Debian 11's CMake 3.18.4.[1] It was valuable
as it made me spot issues that were bugs with a new CMake too. Since
the build process didn't complain about CMake 3.18, the reporter didn't
have a reason to check Debian backports.

LTS distros with old CMake:
  - Ubuntu 20.04 has 3.16.3 (no backports)
  - CentOS 7 has 3.17.5 in EPEL

Kitware provides Ubuntu 20.04 repository with CMake for x86-64 and
32/64-bit ARM.[2] But people might not want to use third-party
repositories.

There weren't any comments in the CMake thread on GitHub[3] and no one
has objected on xz-devel so far either. Considering that Autotools
support isn't going away anytime soon, I think requiring CMake 3.20
is fine.

[1] https://github.com/tukaani-project/xz/issues/129
[2] https://apt.kitware.com/
[3] https://github.com/tukaani-project/xz/issues/68

-- 
Lasse Collin



Re: [xz-devel] Require CMake 3.20?

2024-07-07 Thread Sebastian Andrzej Siewior
On 2024-07-06 15:42:41 [+0300], Lasse Collin wrote:
> Does anyone think it's too early to require CMake >= 3.20?

Debian Bookworm (stable) has cmake 3.25 and the release before it has it
in backports. So if one rolls their own xz and is using one of those two
distros and prefers to avoid autotools then it would work.

Sebastian



[xz-devel] Require CMake 3.20?

2024-07-06 Thread Lasse Collin
While there has been initial work towards Meson support, I wanted to
finish the CMake support first. It should now be in a pretty good shape
in the master branch. There are quite a few cleanups, config variables
were renamed (which breaks backward compatibility), and a few issues
were fixed that were present in CMake support in XZ Utils 5.6.2.

Currently CMake >= 3.14 is required. Increasing the requirement to
CMake >= 3.20 would allow a few small cleanups. Translation support and
relocatable liblzma.pc only work with CMake >= 3.20, so builds with
older CMake are already not fully featured. However, there might be use
cases where the missing features don't matter.

Does anyone think it's too early to require CMake >= 3.20?

The CMake changes in the master branch are significant enough that
they might not be backported to 5.6.x, so this would be about 5.8.0
which should be released at some point this year.

The Autotools support won't go away in 5.8.0, thus the CMake version
requirement matters only if one cannot or doesn't want to use the
Autotools-based build.

-- 
Lasse Collin



[xz-devel] Build system fix for XZ Utils 5.2.13, 5.4.7, and 5.6.2

2024-06-19 Thread Lasse Collin
XZ Utils 5.2.13, 5.4.7, and 5.6.2 packages were created using a Git
snapshot of GNU Libtool. The benefit is that there tend to be fixes
that aren't in a stable release yet. At the same time there is a higher
risk of new bugs. This time there unfortunately was a bug that breaks
building of shared libraries on some systems like mips64.

I'm sorry for the hassle. In any case, don't blame Libtool upstream for
this.

The same patch applies to the three XZ Utils releases:


https://github.com/tukaani-project/xz/releases/download/v5.6.2/xz-5213-547-562-libtool.patch


https://github.com/tukaani-project/xz/releases/download/v5.6.2/xz-5213-547-562-libtool.patch.sig

Libtool upstream fix:


https://git.savannah.gnu.org/cgit/libtool.git/commit/?id=9a4a02615c9e7cbcfd690ed31874822a7d6aaea2

Other information:

https://lore.kernel.org/distributions/3299713.44csPzL39Z@pinacolada/

https://bugs.gentoo.org/934370

-- 
Lasse Collin



[xz-devel] XZ Utils 5.2.13, 5.4.7, and 5.6.2

2024-05-29 Thread Lasse Collin
XZ Utils 5.2.13, 5.4.7, and 5.6.2 are available at
. The releases are signed with my key. The same
key was used for 5.2.11 and 5.4.2 and older. Fingerprint:

3690 C240 CE51 B467 0D30  AD1C 38EE 757D 6918 4620

Sam James is now a supporting maintainer. The x...@tukaani.org email
address forwards to me and him.

Special thanks to Andres Freund for discovering the backdoor in 5.6.0
and 5.6.1.

Extracts from the NEWS file:

5.6.2 (2024-05-29)

* Remove the backdoor (CVE-2024-3094).

* Not changed: Memory sanitizer (MSAN) has a false positive
  in the CRC CLMUL code which also makes OSS Fuzz unhappy.
  Valgrind is smarter and doesn't complain.

  A revision to the CLMUL code is coming anyway and this issue
  will be cleaned up as part of it. It won't be backported to
  5.6.x or 5.4.x because the old code isn't wrong. There is
  no reason to risk introducing regressions in old branches
  just to silence a false positive.

* liblzma:

- lzma_index_decoder() and lzma_index_buffer_decode(): Fix
  a missing output pointer initialization (*i = NULL) if the
  functions are called with invalid arguments. The API docs
  say that such an initialization is always done. In practice
  this matters very little because the problem can only occur
  if the calling application has a bug and these functions
  return LZMA_PROG_ERROR.

- lzma_str_to_filters(): Fix a missing output pointer
  initialization (*error_pos = 0). This is very similar
  to the fix above.

- Fix C standard conformance with function pointer types.

- Remove GNU indirect function (IFUNC) support. This is *NOT*
  done for security reasons even though the backdoor relied on
  this code. The performance benefits of IFUNC are too tiny in
  this project to make the extra complexity worth it.

- FreeBSD on ARM64: Add error checking to CRC32 instruction
  support detection.

- Fix building with NVIDIA HPC SDK.

* xz:

- Fix a C standard conformance issue in --block-list parsing
  (arithmetic on a null pointer).

- Fix a warning from GNU groff when processing the man page:
  "warning: cannot select font 'CW'"

* xzdec: Add support for Linux Landlock ABI version 4. xz already
  had the v3-to-v4 change but it had been forgotten from xzdec.

* Autotools-based build system (configure):

- Symbol versioning variant can now be overridden with
  --enable-symbol-versions. Documentation in INSTALL was
  updated to match.

- Add new configure option --enable-doxygen to enable
  generation and installation of the liblzma API documentation
  using Doxygen. Documentation in INSTALL and PACKAGERS was
  updated to match.

* CMake:

- Fix detection of Linux Landlock support. The detection code
  in CMakeLists.txt had been sabotaged.

- Disable symbol versioning on non-glibc Linux to match what
  the Autotools build does. For example, symbol versioning
  isn't enabled with musl.

- Symbol versioning variant can now be overridden by setting
  SYMBOL_VERSIONING to "OFF", "generic", or "linux".

- Add support for all tests in typical build configurations.
  Now the only difference in test coverage compared to Autotools
  is that the CMake-based build will skip more tests if features
  are disabled. Such builds are only for special cases like
  embedded systems.

- Separate the CMake code for the tests into tests/tests.cmake.
  It is used conditionally, thus it is possible to

  rm -rf tests

  and the CMake-based build will still work normally except
  that no tests are then available.

- Add an option ENABLE_DOXYGEN to enable generation and
  installation of the liblzma API documentation using Doxygen.

* Documentation:

- Omit the Doxygen-generated liblzma API documentation from the
  package. Instead, the generation and installation of the API
  docs can be enabled with a configure or CMake option if
  Doxygen is available.

- Remove the XZ logo which was used in the API documentation.
  The logo has been retired and isn't used by the project
  anymore. However, it's OK to use it in contexts that refer
  to the backdoor incident.

- Remove the PDF versions of the man pages from the source
  package. These existed primarily for users of operating
  systems which don't come with tools to render man page
  source files. The plain text versions are still included
  in doc/man/txt. PDF files can still be generated to doc/man,
  if the required tools are available, using "make pdf" after
  running "configure".

Re: [xz-devel] xz-java and newer java

2024-03-26 Thread Lasse Collin
On 2024-03-20 Brett Okken wrote:
> The jdk8 changes show nice improvements over head. My assumption is
> that having less math going on in the offsets of the while loop
> allowed the JVM to optimize better.

Sounds good, thanks! :-)

> I am surprised with the binary math behind your handling of long
> comparisons here:

I had to refresh my memory as I hadn't commented it in memcmplen.h. Now
it is (based on Agner Fog's microarchitecture.pdf):

  - On some x86-64 processors (Intel Sandy Bridge to Tiger Lake),
    sub+jz and sub+jnz can be fused but xor+jz or xor+jnz cannot.
    Thus using subtraction has potential to be a tiny amount faster
    since the code checks if the difference is non-zero.

  - Some processors (Intel Pentium 4) used to have more ALU
    resources for add/sub instructions than and/or/xor.

So in the C code it's not a huge thing and in Java it's probably
about nothing. But there is no real downside to using subtraction.

I understand how xor seems like the more obvious choice. However, when looking
for the lowest differing bit, subtraction will make that bit 1 and the
bits below it 0. Only the bits above the 1 will differ between
subtraction and xor but those bits are irrelevant here.
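
The equivalence can be checked with a small sketch like the one below.
It is only an illustration, not code from the project, and the class
and method names are invented. It assumes the two longs were read in
little-endian order, so that bit index divided by 8 gives the index of
the first differing byte.

```java
final class FirstDiffDemo {
    // Index (0-7) of the lowest byte where a and b differ, or 8 if
    // the two values are equal, computed from the xor of the inputs.
    static int firstDiffByteXor(long a, long b) {
        return Long.numberOfTrailingZeros(a ^ b) >>> 3;
    }

    // Same result using subtraction: the bits below the lowest
    // differing bit are equal, so they subtract to 0 without a borrow,
    // and the lowest differing bit itself always becomes 1.
    static int firstDiffByteSub(long a, long b) {
        return Long.numberOfTrailingZeros(a - b) >>> 3;
    }

    public static void main(String[] args) {
        long a = 0x1122334455667788L;
        long b = 0x1122334455AA7788L; // differs from a only in byte 2

        // The two bit patterns differ above the lowest set bit only.
        System.out.println(Long.toBinaryString(a ^ b));
        System.out.println(Long.toBinaryString(a - b));

        System.out.println(firstDiffByteXor(a, b));
        System.out.println(firstDiffByteSub(a, b));
    }
}
```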

I created a new branch, bytearrayview, which combines the CRC64 edits
with the encoder speed changes as they share the ByteArrayView class
(formerly ArrayUtil).

> > I still need to check a few of your edits if some of them should be
> > included. :-)  
> 
> I think the changes to LZMAEncoderNormal as part of this PR to avoid
> the negative length comparison would be good to carry forward.

Done, I hope.

> 1. Use an interface with implementation chosen statically to separate
> out the implementation options.

I had an early version that used separate implementation classes but I
must have done something wrong as that version was *clearly* slower. So
I tried it again and it's as you say, no speed difference. :-)

> 2. Allow specifying the implementation to use with a system property.

Done. I hope it's done in a sensible enough way. The Java < 9 code is
completely separate so it cannot be chosen. The property needs to be
documented somewhere too.

I suppose the ARM64 speed is still to be determined by you or someone
else.

-- 
Lasse Collin



Re: [xz-devel] xz-java and newer java

2024-03-20 Thread Brett Okken
> I still need to check a few of your edits if some of them should be
> included. :-)

I think the changes to LZMAEncoderNormal as part of this PR to avoid the
negative length comparison would be good to carry forward. It basically
compares to the MATCH_LEN_MIN instead of 0. This can avoid some (short)
calls to getMatchLen whose results are going to be ignored anyway on the
very next line.
https://github.com/tukaani-project/xz-java/pull/13/commits/544e449446a3d652de2a5f170d197aef695f12ec#diff-e6858bec0a168955b2ad68bbe89af8ab9ca7b9b1ebf2d9b8bc362fb80dab2967

> I pushed basic docs for getMatchLen.

Thanks - that is very helpful.

> I can wait for the summary, thanks.

The jdk8 changes show nice improvements over head. My assumption is that
having less math going on in the offsets of the while loop allowed the JVM
to optimize better.
Benchmark                     (file)           (preset)  Mode  Cnt     Score      Error  Units
XZCompressionBenchmark.head   ihe_ovly_pr.dcm         3  avgt    3     0.617 ±    0.159  ms/op
XZCompressionBenchmark.lasse  ihe_ovly_pr.dcm         3  avgt    3     0.556 ±    0.163  ms/op
XZCompressionBenchmark.head   ihe_ovly_pr.dcm         6  avgt    3     2.908 ±    0.346  ms/op
XZCompressionBenchmark.lasse  ihe_ovly_pr.dcm         6  avgt    3     2.437 ±    0.160  ms/op
XZCompressionBenchmark.head   image1.dcm              3  avgt    3  2106.954 ± 1295.185  ms/op
XZCompressionBenchmark.lasse  image1.dcm              3  avgt    3  1703.705 ±  482.628  ms/op
XZCompressionBenchmark.head   image1.dcm              6  avgt    3  4304.648 ± 1650.114  ms/op
XZCompressionBenchmark.lasse  image1.dcm              6  avgt    3  3430.697 ±  129.481  ms/op
XZCompressionBenchmark.head   large.xml               3  avgt    3   805.220 ± 1094.696  ms/op
XZCompressionBenchmark.lasse  large.xml               3  avgt    3   658.586 ±   31.645  ms/op
XZCompressionBenchmark.head   large.xml               6  avgt    3  6743.478 ± 1634.641  ms/op
XZCompressionBenchmark.lasse  large.xml               6  avgt    3  5880.570 ±  587.226  ms/op


Defining an interface to defer the implementation to has no impact on
performance. Here are the results of LZUtil using an implementation which
matches what you have for jdk 8.

Benchmark                                    (file)           (preset)  Mode  Cnt     Score      Error  Units
XZCompressionBenchmark.compress_legacy_loop  ihe_ovly_pr.dcm         3  avgt    3     0.548 ±    0.056  ms/op
XZCompressionBenchmark.compress_legacy_loop  ihe_ovly_pr.dcm         6  avgt    3     2.493 ±    0.097  ms/op
XZCompressionBenchmark.compress_legacy_loop  image1.dcm              3  avgt    3  1720.038 ±  237.015  ms/op
XZCompressionBenchmark.compress_legacy_loop  image1.dcm              6  avgt    3  3671.539 ± 2016.282  ms/op
XZCompressionBenchmark.compress_legacy_loop  large.xml               3  avgt    3   667.045 ±  108.601  ms/op
XZCompressionBenchmark.compress_legacy_loop  large.xml               6  avgt    3  5842.107 ±  552.634  ms/op

> Thanks. I was already tilted towards not using Unsafe and now I'm even
> more. The speed benefit of Unsafe over VarHandle should be tiny enough.
> It feels better that memory safety isn't ignored on any JDK version.

I have no problem with that. The performance differences between Unsafe and
VarHandle are very minimal (and sometimes reverse when bounds checks are
introduced to use of Unsafe).


I am surprised with the binary math behind your handling of long
comparisons here:
https://github.com/tukaani-project/xz-java/compare/master...array_compare#diff-1c6fd3fbd64728f8d99a692827015a1bd7341a0dc651cf6205cc72024e90b065R141-R147
Specifically you are using subtraction instead of XOR (like I did here)
https://github.com/tukaani-project/xz-java/pull/13/files#diff-2fc691ea3e96cf4821f4eceac43919cb659e7ae91b4e6919e35fb25f37439d3dR118-R127

Is there an advantage? By not supporting Unsafe, you do not have to deal
with big-endian, so that makes this approach possible. I personally find
that XOR more clearly answers the question being asked (which is the first
byte that differs). My first reaction was that subtraction would not
produce the correct results.


Generally, I really like what you have. I would propose 2 changes:
1. Use an interface with implementation chosen statically to separate out
the implementation options. This makes it much easier to unit test all the
implementations. I also find that it makes the code easier to read/reason
about by being more modular.
2. Allow specifying the implementation to use with a system property. This
would be unlikely to be used outside of benchmarking, but would provide
options for users on unusual hardware.
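
A minimal sketch of what these two proposals could look like together
is below. Everything in it is hypothetical: the interface, the class
names, the getMatchLen signature, and the property name
"example.matchlen.impl" are invented for illustration and are not taken
from the branch. It needs Java 9 or later because of Arrays.mismatch.

```java
import java.util.Arrays;

final class MatchLenDemo {
    // Hypothetical property name, not one that XZ for Java defines.
    static final String IMPL_PROPERTY = "example.matchlen.impl";

    // Hypothetical interface: length of the common prefix of the data
    // at off1 and off2 in buf, capped at lenLimit.
    interface Finder {
        int getMatchLen(byte[] buf, int off1, int off2, int lenLimit);
    }

    // Plain byte-by-byte loop.
    static final class Basic implements Finder {
        @Override
        public int getMatchLen(byte[] buf, int off1, int off2, int lenLimit) {
            int len = 0;
            while (len < lenLimit && buf[off1 + len] == buf[off2 + len])
                ++len;
            return len;
        }
    }

    // Arrays.mismatch returns -1 when the two ranges are identical.
    static final class Mismatch implements Finder {
        @Override
        public int getMatchLen(byte[] buf, int off1, int off2, int lenLimit) {
            int i = Arrays.mismatch(buf, off1, off1 + lenLimit,
                                    buf, off2, off2 + lenLimit);
            return i < 0 ? lenLimit : i;
        }
    }

    // Separate from the static field so that unit tests can create
    // and exercise every implementation by name.
    static Finder select(String name) {
        switch (name) {
            case "basic":
                return new Basic();
            case "mismatch":
                return new Mismatch();
            default:
                throw new IllegalArgumentException(
                        "Unknown implementation: " + name);
        }
    }

    // Chosen once at class initialization; overridable with
    // -Dexample.matchlen.impl=basic on the command line.
    static final Finder IMPL =
            select(System.getProperty(IMPL_PROPERTY, "mismatch"));

    public static void main(String[] args) {
        byte[] buf = {1, 2, 3, 4, 1, 2, 3, 9};
        System.out.println(IMPL.getClass().getSimpleName()
                + ": " + IMPL.getMatchLen(buf, 0, 4, 4));
    }
}
```

Keeping the selection in a static final field means the JIT sees a
single implementation at the call sites, which is one plausible reason
an interface would add no measurable overhead.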

Brett







On Tue, Mar 12, 2024 at 12:55 PM Lasse Collin 
wrote:

> On 2024-03-12 Brett Okken wrote:
> > I am still working on digesting your branch.
>
> I still need to check a few of your edits if some of them should be
> included. :-)
>
> > The difference in method signature is subtle, but I think a key part

Re: [xz-devel] xz-java and newer java

2024-03-12 Thread Lasse Collin
On 2024-03-12 Brett Okken wrote:
> I am still working on digesting your branch.

I still need to check a few of your edits if some of them should be
included. :-)

> The difference in method signature is subtle, but I think a key part
> of the improvements you are getting. Could you add javadoc to more
> clearly describe how the args are to be interpreted and what the
> return value means?

I pushed basic docs for getMatchLen.

Once crc64_varhandle2 is merged then array_compare should use ArrayUtil
too. It doesn't make a difference in speed.

> I am playing with manually unrolling the java 8 byte-by-byte impl
> along with tests comparing unsafe, var handle, and vector approaches.
> These tests take a long time to run, so it will be a couple days
> before I have complete results. Do you want data as I have it (and it
> is interesting), or wait for summary?

I can wait for the summary, thanks.

> I am not sure when I will get opportunity to test out arm64.

If someone has, for example, a Raspberry Pi, the compression of zeros
test is simple enough to do and at least on x86-64 has clear enough
difference. It's an over-simplified test but it's a data point still.

> I do have some things still on jdk 8, but only decompression. Surveys
> seem to indicate quite a bit of jdk 8 still in use, but I have no
> personal need.

Thanks. I was already tilted towards not using Unsafe and now I'm even
more. The speed benefit of Unsafe over VarHandle should be tiny enough.
It feels better that memory safety isn't ignored on any JDK version. If
a bug was found, it's nicer to not wonder if Unsafe had a role in it.
This is better for security too.

In my previous email I wondered if using Unsafe only with Java 8 would
make upgrading to newer JDK look bad if newer JDK used VarHandle
instead of Unsafe. Perhaps that worry was overblown. But the other
reasons and keeping the code simpler make me want to avoid Unsafe.

(C code via JNI wouldn't be memory safe but then the speed benefits
should be much more significant too.)

-- 
Lasse Collin



Re: [xz-devel] xz-java and newer java

2024-03-12 Thread Brett Okken
I am still working on digesting your branch. The difference in method
signature is subtle, but I think a key part of the improvements you are
getting. Could you add javadoc to more clearly describe how the args are to
be interpreted and what the return value means?

I am playing with manually unrolling the java 8 byte-by-byte impl along
with tests comparing unsafe, var handle, and vector approaches. These tests
take a long time to run, so it will be a couple days before I have complete
results. Do you want data as I have it (and it is interesting), or wait for
summary? I am not sure when I will get opportunity to test out arm64. That
could be a while yet.

I do have some things still on jdk 8, but only decompression. Surveys seem
to indicate quite a bit of jdk 8 still in use, but I have no personal need.

Brett

On Sun, Mar 10, 2024 at 2:49 PM Lasse Collin 
wrote:

> On 2024-03-09 Brett Okken wrote:
> > When I tested graviton2 (arm64) previously, Arrays.mismatch was
> > better than comparing longs using a VarHandle.
>
> Sounds promising. :-) However, your array_comparison_performance
> handles the last 1-7 bytes byte-by-byte. My array_compare branch
> reserves extra 7 bytes at the end of the array so that one can safely
> read up to 7 bytes more than one actually needs. This way no bounds
> checks are needed (even with Unsafe). This might affect the comparison
> between Arrays.mismatch and VarHandle if the results were close before.
>
> > I do like Unsafe as an option for jdk 8 users on x86 or arm64.
>
> Unsafe seems very slightly faster than VarHandle. If Java 8 uses Unsafe,
> should newer versions do too? It could be counter-productive if Java 8
> was faster, even if the difference was tiny.
>
> Do you have use cases that are (for now) stuck on Java 8 or is your
> wish a more generic one?
>
> --
> Lasse Collin
>


Re: [xz-devel] xz-java and newer java

2024-03-10 Thread Lasse Collin
On 2024-03-09 Brett Okken wrote:
> When I tested graviton2 (arm64) previously, Arrays.mismatch was
> better than comparing longs using a VarHandle.

Sounds promising. :-) However, your array_comparison_performance
handles the last 1-7 bytes byte-by-byte. My array_compare branch
reserves extra 7 bytes at the end of the array so that one can safely
read up to 7 bytes more than one actually needs. This way no bounds
checks are needed (even with Unsafe). This might affect the comparison
between Arrays.mismatch and VarHandle if the results were close before.
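
To make the padding idea concrete, here is a rough sketch, not the code
from the array_compare branch. The class name, method names, and the
getMatchLen signature are assumptions made for this example. It
requires Java 9 or later and assumes the caller guarantees that
max(off1, off2) + lenLimit + 7 does not exceed buf.length.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.nio.ByteOrder;
import java.util.Arrays;

final class PaddedMatchLenDemo {
    // Extra bytes kept at the end of the buffer so that an 8-byte read
    // starting at the last valid byte stays inside the array.
    static final int PADDING = 7;

    private static final VarHandle LONG_LE =
            MethodHandles.byteArrayViewVarHandle(
                    long[].class, ByteOrder.LITTLE_ENDIAN);

    // Copies the data into a buffer that has PADDING spare bytes.
    static byte[] padded(int... data) {
        byte[] buf = new byte[data.length + PADDING];
        for (int i = 0; i < data.length; ++i)
            buf[i] = (byte) data[i];
        return buf;
    }

    // Compares 8 bytes at a time. Thanks to the padding there is no
    // separate byte-by-byte loop for the last 1-7 bytes; a mismatch
    // found past lenLimit is simply clamped away.
    static int getMatchLen(byte[] buf, int off1, int off2, int lenLimit) {
        int len = 0;
        while (len < lenLimit) {
            long x = (long) LONG_LE.get(buf, off1 + len);
            long y = (long) LONG_LE.get(buf, off2 + len);
            long diff = x - y;
            if (diff != 0) {
                len += Long.numberOfTrailingZeros(diff) >>> 3;
                return Math.min(len, lenLimit);
            }
            len += 8;
        }
        return lenLimit;
    }

    public static void main(String[] args) {
        byte[] buf = padded(1, 2, 3, 4, 5, 1, 2, 3, 4, 9);
        System.out.println(getMatchLen(buf, 0, 5, 5));
        System.out.println(Arrays.toString(buf));
    }
}
```

VarHandle still does its own bounds checking internally, so this stays
memory safe. The padding only removes the need for special handling of
the tail in the calling code.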

> I do like Unsafe as an option for jdk 8 users on x86 or arm64.

Unsafe seems very slightly faster than VarHandle. If Java 8 uses Unsafe,
should newer versions do too? It could be counter-productive if Java 8
was faster, even if the difference was tiny.

Do you have use cases that are (for now) stuck on Java 8 or is your
wish a more generic one?

-- 
Lasse Collin



Re: [xz-devel] xz-java and newer java

2024-03-09 Thread Brett Okken
When I tested graviton2 (arm64) previously, Arrays.mismatch was better than
comparing longs using a VarHandle.

The benefits are definitely with content that compresses more - because
there are more long matches.

I do like Unsafe as an option for jdk 8 users on x86 or arm64.

On Sat, Mar 9, 2024 at 3:51 PM Lasse Collin 
wrote:

> I created a branch array_compare. It has a simple version for Java <= 8
> which seems very slightly faster than the current code in master, at
> least when tested with OpenJDK 21. For Java >= 9 there is
> Arrays.mismatch for portability and VarHandle for x86-64 and ARM64.
> These are clearly faster than the basic version.
>
> sun.misc.Unsafe would be a little faster than VarHandle but I feel it's
> not enough to be worth the downsides (non-standard and not memory safe).
>
> 32-bit archs I didn't include, for now at least, since if people want
> speed I hope they don't run 32-bit Java.
>
> Speed differences are very minor when testing with files that don't
> compress extremely well. That was the problem I had with my earlier
> test results. With files that have compression ratio like 0.05 the
> speed differences are clear.
>
> I cannot test on ARM64 so it would be great if someone can, comparing
> the three versions. The most extreme difference is when compressing
> just zeros:
>
> time head -c1 /dev/zero \
> | java -jar build/jar/XZEncDemo.jar > /dev/null
>
> Internal docs should be added to the branch and perhaps there are other
> related optimizations to do still. So it's not fully finished yet but
> now it's ready for testing and feedback. For example, some tweaks from
> your array_comp_incremental could be considered after testing.
>
> --
> Lasse Collin
>


Re: [xz-devel] xz-java and newer java

2024-03-09 Thread Lasse Collin
I created a branch array_compare. It has a simple version for Java <= 8
which seems very slightly faster than the current code in master, at
least when tested with OpenJDK 21. For Java >= 9 there is
Arrays.mismatch for portability and VarHandle for x86-64 and ARM64.
These are clearly faster than the basic version.

sun.misc.Unsafe would be a little faster than VarHandle but I feel it's
not enough to be worth the downsides (non-standard and not memory safe).

32-bit archs I didn't include, for now at least, since if people want
speed I hope they don't run 32-bit Java.

Speed differences are very minor when testing with files that don't
compress extremely well. That was the problem I had with my earlier
test results. With files that have compression ratio like 0.05 the
speed differences are clear.

I cannot test on ARM64 so it would be great if someone can, comparing
the three versions. The most extreme difference is when compressing
just zeros:

time head -c1 /dev/zero \
| java -jar build/jar/XZEncDemo.jar > /dev/null

Internal docs should be added to the branch and perhaps there are other
related optimizations to do still. So it's not fully finished yet but
now it's ready for testing and feedback. For example, some tweaks from
your array_comp_incremental could be considered after testing.

-- 
Lasse Collin



[xz-devel] XZ Utils 5.6.1

2024-03-09 Thread Jia Tan
XZ Utils 5.6.1 is available at .

Here is an extract from the NEWS file:

5.6.1 (2024-03-09)

* liblzma: Fixed two bugs relating to GNU indirect function (IFUNC)
  with GCC. The more serious bug caused a program linked with
  liblzma to crash on start up if the flag -fprofile-generate was
  used to build liblzma. The second bug caused liblzma to falsely
  report an invalid write to Valgrind when loading liblzma.

* xz: Changed the messages for thread reduction due to memory
  constraints to only appear under the highest verbosity level.

* Build:

- Fixed a build issue when the header file 
  was present on the system but the Landlock system calls were
  not defined in .

- The CMake build now warns and disables NLS if both gettext
  tools and pre-created .gmo files are missing. Previously,
  this caused the CMake build to fail.

* Minor improvements to man pages.

* Minor improvements to tests.

---

Jia Tan



Re: [xz-devel] [BUG] Issue with xz-java: Unknown Filter ID

2024-03-07 Thread Lasse Collin
On 2024-03-05 Dennis Ens wrote:
> > I hope 1.10 could be done in a month or two but I don't want to
> > make any promises or serious predictions. Historically those
> > haven't been accurate at all.  
> 
> I'll hope it's on the sooner side then. Is there a reason that
> xz-java is so far behind its counterpart?

These are unpaid hobby projects and the maintainers work on things they
happen to find interesting. The focus was on XZ Utils for quite a long
time; now more attention is returning to XZ for Java.

> It seems those filters have been in that version for a while, and it
> seems strange they aren't compatible with each other. Maybe this
> should be made more clear in the README?

The README file in XZ for Java 1.9 specifies that the code implements
the .xz file format specification version 1.0.4. That doesn't include
the ARM64 or RISC-V filters.

The ARM64 filter was already in the master branch. The RISC-V filter is
there now too, among a few other changes. The README now refers to spec
version 1.2.0.

I understand it can be cryptic to refer to a spec version but obviously
one cannot list what future things are missing. One could list
supported filters but in theory something else could be extended too.

> I don't see anything about contributing on the xz-java github page.
> What are the best practices for contributing to this project?

I'm not sure if there is anything specific. Chatting on #tukaani can be
good to get ideas discussed quickly but it requires that people happen
to be online at the same time.

> > The encoder implementations have some minor differences which
> > affect both output and speed. Different releases can in theory
> > have different output. XZ Utils output might change in future
> > versions too.  
> 
> I see, that makes sense. I'm glad the difference is explainable and
> not a bug. Can you explain exactly what the differences are?

I don't remember much now. It's minor details but minor differences
affect output already.

> Does xz-java always do a better job compressing since it resulted in a
> smaller file?

They should be very close in practice. You need to compare to XZ Utils
in single-threaded mode: xz -T1

-- 
Lasse Collin



Re: [xz-devel] xz-java and newer java

2024-03-07 Thread Lasse Collin
On 2024-02-29 Brett Okken wrote:
> > Thanks! Ideally there would be one commit to add the minimal
> > portable version, then separate commits for each optimized variant.
> 
> Would you like me to remove the Unsafe based impl from
> https://github.com/tukaani-project/xz-java/pull/13?

There are new commits in master now and those might slightly conflict
with your PR (@Override additions). I'm playing around a bit and
learning about the faster methods still. So right now I don't have
wishes for changes; I don't want to request anything when there's a
possibility that some other way might end up looking more preferable.

In general, I would prefer splitting to more commits. Using your PR as
an example:

  1. Adding the changes to lz/*.java and the portable *Array*.java
 code required by those changes.

  2. Adding one advanced implementation that affects only the
 *Array*.java files.

  3. Repeat step 2. until all implementations are added.

When reasonably possible, the line length should be under 80 chars.

> > So far I have given it only a quick try. array_comp_incremental
> > seems faster than xz-java.git master. Compression time was reduced
> > by about 10 %. :-) This is with OpenJDK 21.0.2, only a quick test,
> > and my computer is old so I don't doubt your higher numbers.  
> 
> How are you testing? I am using jmh, so it has a warm up period before
> actually measuring, giving the jvm plenty of opportunity to perform
> optimizations. If you are doing single shot executions to compress a
> file, that could provide pretty different results.

I was simply timing XZEncDemo at the default preset (6). I had hoped
that big files (binary and source packages) that take tens of seconds
to compress, repeating each test a few times, would work well enough.
But perhaps the difference is big enough only with certain types of
files.
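
The warm-up effect mentioned above can be illustrated with a crude
harness. This is only a sketch: the class name WarmupSketch and its toy
workload are made up for illustration and are not part of xz-java or
jmh. It times the same pure-Java loop several times in one JVM; the
first rounds usually include interpreter and JIT compilation time, which
a single-shot run cannot separate from steady-state speed.

```java
// Hypothetical illustration; not code from xz-java or jmh.
final class WarmupSketch {
    // Toy stand-in for match finding: for each position, count how many
    // bytes (at most 32) match the data that starts one byte earlier.
    static long workload(byte[] buf) {
        long sum = 0;
        for (int pos = 1; pos < buf.length; ++pos) {
            int limit = Math.min(32, buf.length - pos);
            int len = 0;
            while (len < limit && buf[pos - 1 + len] == buf[pos + len])
                ++len;
            sum += len;
        }
        return sum;
    }

    // Runs the workload 'rounds' times in the same JVM and returns the
    // wall-clock time of each round in nanoseconds.
    static long[] timeRounds(byte[] buf, int rounds) {
        long[] nanos = new long[rounds];
        long sink = 0;
        for (int i = 0; i < rounds; ++i) {
            long start = System.nanoTime();
            sink += workload(buf);
            nanos[i] = System.nanoTime() - start;
        }
        // Use the result so that the JIT cannot discard the work.
        if (sink < 0)
            throw new AssertionError("workload sum cannot be negative");
        return nanos;
    }

    public static void main(String[] args) {
        byte[] buf = new byte[1 << 22]; // 4 MiB of zeros
        long[] nanos = timeRounds(buf, 10);
        for (int i = 0; i < nanos.length; ++i)
            System.out.println("round " + i + ": "
                    + nanos[i] / 1000 + " us");
    }
}
```

If the early rounds are clearly slower than the later ones, a
measurement that covers only one short run is dominated by warm-up.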

On 2024-03-05 Brett Okken wrote:
> I have added a comment to the PR with updated benchmark results:
> https://github.com/tukaani-project/xz-java/pull/13#issuecomment-1977705691

Thanks! I'm not sure if I read the results well enough. The "Error"
column seems to have oddly high values on several lines. If the same
test set is run again, are the results in the "Score" column similar
enough between the two runs, retaining the speed order of the
implementations being tested?

If the first file is only ~66KB, I wonder if other factors like
initializing large arrays in the classes take so much time that
differences in array comparison speeds become hard to measure.

When each test is repeated by the benchmarking framework, each run has
to allocate the classes again. Perhaps it might trigger garbage
collection. Did you have ArrayCache enabled?

ArrayCache.setDefaultCache(BasicArrayCache.getInstance());

I suppose optimizing only for new JDK version(s) would be fine if it
makes things easier. That is, it could be enough that performance
doesn't get worse on Java 8.

If the indirection adds overhead, would it make sense to have a
preprocessing step that creates .java file variants that directly use
the optimized methods? So LZMAEncoder.getInstance could choose at
runtime if it should use LZMAEncoderNormalPortable or
LZMAEncoderNormalUnsafe or some other implementation. That is, if this
cannot be done with multi-release JAR. It's not a pretty solution but if
it is faster then it could be one option, maybe.
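
A minimal sketch of such a runtime choice, under stated assumptions: the
class ImplChooserSketch, the selection rules, and the name
LZMAEncoderNormalModern are hypothetical, and the sketch only returns a
class name instead of constructing an encoder. It reads the standard
java.specification.version system property, which is "1.8" on Java 8
and a plain number such as "21" on newer releases.

```java
// Hypothetical illustration; none of these classes exist in xz-java.
final class ImplChooserSketch {
    // Converts a java.specification.version value to a feature number:
    // "1.8" -> 8, "9" -> 9, "21" -> 21.
    static int featureVersion(String spec) {
        if (spec.startsWith("1."))
            spec = spec.substring(2);
        return Integer.parseInt(spec);
    }

    // Picks an implementation name once, at startup. The rules are
    // assumptions made only for this example.
    static String choose(int feature, boolean unsafeAllowed) {
        if (feature >= 9)
            return "LZMAEncoderNormalModern"; // hypothetical name
        if (unsafeAllowed)
            return "LZMAEncoderNormalUnsafe";
        return "LZMAEncoderNormalPortable";
    }

    public static void main(String[] args) {
        String spec = System.getProperty("java.specification.version");
        System.out.println(choose(featureVersion(spec), false));
    }
}
```

The point of choosing once is that the hot loops then call one concrete
class directly instead of going through an extra layer on every
comparison.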

Negative lenLimit currently occurs in two places (at least). Perhaps it
should be handled in those places instead of requiring the array
comparison to support it (the C code in liblzma does it like that).
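
As a rough sketch of that split (not the code from the PR or from
liblzma): the comparison helpers below assume a non-negative limit, and
a caller-side wrapper deals with a negative lenLimit before calling
them. The class MatchLenSketch and its method names are hypothetical.
Arrays.mismatch is a standard method in Java 9 and later.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration; not code from xz-java or from the PR.
final class MatchLenSketch {
    // Portable version: counts how many bytes are equal at buf[a...]
    // and buf[b...], checking at most 'limit' bytes. Assumes limit >= 0.
    static int portable(byte[] buf, int a, int b, int limit) {
        int len = 0;
        while (len < limit && buf[a + len] == buf[b + len])
            ++len;
        return len;
    }

    // Java 9+ version. Arrays.mismatch returns -1 when the two ranges
    // are identical, which here means that all 'limit' bytes matched.
    static int withMismatch(byte[] buf, int a, int b, int limit) {
        int m = Arrays.mismatch(buf, a, a + limit, buf, b, b + limit);
        return m < 0 ? limit : m;
    }

    // Caller-side handling: a non-positive limit means there is
    // nothing to compare, so the helpers never see it.
    static int matchLen(byte[] buf, int a, int b, int limit) {
        if (limit <= 0)
            return 0;
        return withMismatch(buf, a, b, limit);
    }

    public static void main(String[] args) {
        byte[] buf = "abcabcabd".getBytes(StandardCharsets.US_ASCII);
        System.out.println(portable(buf, 0, 3, 6));
        System.out.println(matchLen(buf, 0, 3, 6));
        System.out.println(matchLen(buf, 0, 3, -1));
    }
}
```

With this layout the optimized variants only need to be correct for
limits of zero or more.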

-- 
Lasse Collin



Re: [xz-devel] xz-java and newer java

2024-03-05 Thread Brett Okken
I have added a comment to the PR with updated benchmark results:
https://github.com/tukaani-project/xz-java/pull/13#issuecomment-1977705691

On Fri, Mar 1, 2024 at 6:23 AM Brett Okken  wrote:
>
> I found and resolved the difference:
> https://github.com/tukaani-project/xz-java/pull/13/commits/1e4550e06d8cbec4079b2b2fba4a2245307cc4e6
>
> It was indeed in BT4 and had to do with searching for the
> niceLenLimit. I will update the benchmarks over the weekend, as they
> take some time to run.
>
> Brett
>
> On Thu, Feb 29, 2024 at 8:47 PM Brett Okken  wrote:
> >
> > > Thanks! Ideally there would be one commit to add the minimal portable
> > > version, then separate commits for each optimized variant.
> >
> > Would you like me to remove the Unsafe based impl from
> > https://github.com/tukaani-project/xz-java/pull/13?
> >
> > > So far I have given it only a quick try. array_comp_incremental seems
> > > faster than xz-java.git master. Compression time was reduced by about
> > > 10 %. :-) This is with OpenJDK 21.0.2, only a quick test, and my
> > > computer is old so I don't doubt your higher numbers.
> >
> > How are you testing? I am using jmh, so it has a warm up period before
> > actually measuring, giving the jvm plenty of opportunity to perform
> > optimizations. If you are doing single shot executions to compress a
> > file, that could provide pretty different results.
> >
> > > With array_comparison_performance the improvement seems to be less,
> > > maybe 5 %. I didn't test much yet but it still seems clear that
> > > array_comp_incremental is faster on my computer.
> >
> > Going back to the previous question, this could be due to the fact that I
> > collapsed some class hierarchy in the _incremental pr. This could take
> > the optimizer a bit longer to figure out.
> >
> > > However, your code produces different output compared to xz-java.git
> > > master so the speed comparison isn't entirely fair. I assume there was
> > > no intent to affect the encoder output with these changes so I wonder
> > > what is going on. Both of your branches produce the same output so it's
> > > something common between them that makes the difference.
> >
> > This was definitely not the intent, and I had not noticed this previously.
> >
> > With the 3 files I test with, none have any difference with preset of
> > 3. The smallest file (ihe_ovly_pr.cm) also has no difference at preset
> > 6.
> >
> > With the ~25MB image1.dcm (mostly a greyscale bmp), the PR versions
> > produce more compressed content at preset 6.
> > 1.9 = 4,041,476
> > PR = 4,004,156
> >
> > There is a smaller difference with the ~50MB xml file, but strangely,
> > the PR version is slightly bigger.
> > 1.9 = 1,589,512
> > PR = 1,589,564
> >
> > Given that I am only seeing differences with preset of 6, I am
> > guessing the difference must be in BT4.
> > The result still seems to be valid (at least the java XZInputStream
> > reads it back correctly).
> > There is clearly a subtle "defect" somewhere, but I cannot tell if it
> > is in the current trunk or the PR. My best guess is that there is an
> > off by 1 error in one or the other.
> >
> > Brett
> >
> > On Thu, Feb 29, 2024 at 11:35 AM Lasse Collin  
> > wrote:
> > >
> > > On 2024-02-25 Brett Okken wrote:
> > > > I created https://github.com/tukaani-project/xz-java/pull/13 with the
> > > > bare bones changes to utilize a utility for array comparisons and an
> > > > Unsafe implementation.
> > > > When/if that is reviewed and approved, we can move on through the
> > > > other implementation options.
> > >
> > > Thanks! Ideally there would be one commit to add the minimal portable
> > > version, then separate commits for each optimized variant.
> > >
> > > So far I have given it only a quick try. array_comp_incremental seems
> > > faster than xz-java.git master. Compression time was reduced by about
> > > 10 %. :-) This is with OpenJDK 21.0.2, only a quick test, and my
> > > computer is old so I don't doubt your higher numbers.
> > >
> > > With array_comparison_performance the improvement seems to be less,
> > > maybe 5 %. I didn't test much yet but it still seems clear that
> > > array_comp_incremental is faster on my computer.
> > >
> > > However, your code produces different output compared to xz-java.git
> > > master so the speed comparison isn't entirely fair. I assume there was
> > > no intent to affect the encoder output with these changes so I wonder
> > > what is going on. Both of your branches produce the same output so it's
> > > something common between them that makes the difference.
> > >
> > > I plan to get back to this next week.
> > >
> > > > > One thing I wonder is if JNI could help.
> > > >
> > > > It would most likely make things faster, but also more complicated. I
> > > > like the java version for the simplicity. I am not necessarily looking
> > > > to compete with native performance, but would like to get improvements
> > > > where they are reasonably available. Here there is some complexity 

Re: [xz-devel] [BUG] Issue with xz-java: Unknown Filter ID

2024-03-05 Thread Dennis Ens
> I hope 1.10 could be done in a month or two but I don't want to make any
> promises or serious predictions. Historically those haven't been
> accurate at all.

I'll hope it's on the sooner side then. Is there a reason that xz-java is so 
far behind its counterpart? It seems those filters have been in that version 
for a while, and it seems strange they aren't compatible with each other. Maybe 
this should be made more clear in the README? 

I'd be happy to help out to make them compatible. I don't see anything about 
contributing on the xz-java github page. What are the best practices for 
contributing to this project?

> However, Java in general is slower. Some compressors have a Java API but
> the performance-critical code is native code. For example, java.util.zip
> calls into native code from zlib. XZ for Java doesn't use any native
> code (for now at least).

That's good to know. It seems like if it's using native code it should be 
possible to get the speeds pretty similar. It sounds like an interesting 
problem to tackle.

> XZ for Java lacks threading still. Implementing it is among the most
> important tasks in XZ for Java. It helps with big files like your test
> file but makes the compressed file a little bigger. From your numbers I'm
> not certain if you used xz in threaded mode or not. The time difference
> looks unusually high for single-threaded mode for both compression and
> decompression. The difference for a big input file in threaded mode
> looks small though (unless it had lots of trivially-compressible
> sections).

> XZ Utils 5.6.0 also enables threaded mode by default.

That would explain a lot of the speed difference I noticed then. I was using 
the latest code from master, so I should have been running in threaded mode. 
It's great how large an improvement that can make, although I know it's always 
complicated to implement threading without any bugs.

> The encoder implementations have some minor differences which affect
> both output and speed. Different releases can in theory have different
> output. XZ Utils output might change in future versions too.

I see, that makes sense. I'm glad the difference is explainable and not a bug. 
Can you explain exactly what the differences are? Does xz-java always do a 
better job compressing since it resulted in a smaller file?



Re: [xz-devel] [BUG] Issue with xz-java: Unknown Filter ID

2024-03-05 Thread Lasse Collin
On 2024-03-05 Dennis Ens wrote:
> > The XZ for Java development is becoming active again but it may
> > still take a while until the next stable release is out. A few
> > other things are waiting in the queue from the past three years.  
> 
> Ah, I see. Thank you for the answer. Do you have a timeline of when
> the changes are expected?

I hope 1.10 could be done in a month or two but I don't want to make any
promises or serious predictions. Historically those haven't been
accurate at all.

> First, xz-java seems much slower. I tested compressing and
> decompressing a ~1.2 gigabyte file, and xz-java took 17m32.345s
> compared to xz's 7m7.615s to compress. Decompressing was 0m21.760s to
> 0m6.223s. Is there anything that can be done to improve the speed of
> the Java version, or is C just a much more efficient programming
> language?

Brett Okken's patches (originally from early 2021) should improve
compression speed. They are currently under review. Those are one of
the things to get into the next stable release.

However, Java in general is slower. Some compressors have a Java API but
the performance-critical code is native code. For example, java.util.zip
calls into native code from zlib. XZ for Java doesn't use any native
code (for now at least).

XZ for Java lacks threading still. Implementing it is among the most
important tasks in XZ for Java. It helps with big files like your test
file but makes the compressed file a little bigger. From your numbers I'm
not certain if you used xz in threaded mode or not. The time difference
looks unusually high for single-threaded mode for both compression and
decompression. The difference for a big input file in threaded mode
looks small though (unless it had lots of trivially-compressible
sections).

In single-threaded mode, I would expect compressing with xz to take
around 30-40 % less time than XZ for Java but your numbers show 60 %
time reduction.
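
The arithmetic behind that figure can be checked from the times quoted
above. A small sketch (the class name TimeReductionSketch is only for
illustration); it computes how much less time xz took relative to
xz-java, which comes out at roughly 59 % for compression and roughly
71 % for decompression.

```java
import java.util.Locale;

// Hypothetical helper for checking the percentages in this mail.
final class TimeReductionSketch {
    // Converts a "17m32.345s" style time to seconds.
    static double seconds(int minutes, double secs) {
        return minutes * 60 + secs;
    }

    // Percentage by which 'faster' is shorter than 'slower'.
    static double reductionPercent(double slower, double faster) {
        return 100.0 * (1.0 - faster / slower);
    }

    public static void main(String[] args) {
        double comp = reductionPercent(seconds(17, 32.345),
                                       seconds(7, 7.615));
        double decomp = reductionPercent(seconds(0, 21.760),
                                         seconds(0, 6.223));
        System.out.println(String.format(Locale.ROOT,
                "compression: %.1f %% less time", comp));
        System.out.println(String.format(Locale.ROOT,
                "decompression: %.1f %% less time", decomp));
    }
}
```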

XZ Utils 5.6.0 added x86-64 assembly (GCC & Clang only) which reduces
per-thread decompression time by 20-40 % depending on the file and the
computer. So that increases the difference between XZ Utils and XZ for
Java too: decompression time can be roughly 50 % less with XZ Utils
5.6.0 in single-threaded mode on x86-64 compared to XZ for Java.

XZ Utils 5.6.0 also enables threaded mode by default.

> Also, I noticed that the results of compressing the files were
> different sizes. They both worked, so I don't know if it's an issue,
> but it does seem strange. The xz-java one was slightly smaller than
> the xz one.

The encoder implementations have some minor differences which affect
both output and speed. Different releases can in theory have different
output. XZ Utils output might change in future versions too.

-- 
Lasse Collin



Re: [xz-devel] [BUG] Issue with xz-java: Unknown Filter ID

2024-03-05 Thread Dennis Ens
> > The files specifically were good-1-arm64-lzma2-1.xz and
> > good-1-arm64-lzma2-2.xz and good-1-riscv-lzma2-1.xz and
> > good-1-riscv-lzma2-2.xz. These did seem to work fine when I tried
> > with xz, but not with xz-java. Do you think there might be a fix
> > available for this soon?
> 
> 
> XZ for Java 1.9 doesn't have ARM64 or RISC-V filter. The master branch
> has ARM64 filter. RISC-V filter will likely be there this week.
> 
> The XZ for Java development is becoming active again but it may still
> take a while until the next stable release is out. A few other things
> are waiting in the queue from the past three years.


Ah, I see. Thank you for the answer. Do you have a timeline of when the changes 
are expected?

I started to use xz, and was able to decompress the files without issue. 
Messing around with xz and xz-java, I noticed a few other things though. 

First, xz-java seems much slower. I tested compressing and decompressing a ~1.2 
gigabyte file, and xz-java took 17m32.345s compared to xz's 7m7.615s to 
compress. Decompressing was 0m21.760s to 0m6.223s. Is there anything that can 
be done to improve the speed of the Java version, or is C just a much more 
efficient programming language?

Also, I noticed that the results of compressing the files were different sizes. 
They both worked, so I don't know if it's an issue, but it does seem strange. 
The xz-java one was slightly smaller than the xz one.

Thanks again for the help.

--
Dennis Ens




Re: [xz-devel] [BUG] Issue with xz-java: Unknown Filter ID

2024-03-05 Thread Lasse Collin
On 2024-03-05 Dennis Ens wrote:
> The files specifically were good-1-arm64-lzma2-1.xz and
> good-1-arm64-lzma2-2.xz and good-1-riscv-lzma2-1.xz and
> good-1-riscv-lzma2-2.xz. These did seem to work fine when I tried
> with xz, but not with xz-java. Do you think there might be a fix
> available for this soon?

XZ for Java 1.9 doesn't have ARM64 or RISC-V filter. The master branch
has ARM64 filter. RISC-V filter will likely be there this week.
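
For reference, the two IDs from the error messages map to these filters
(10 is ARM64, 11 is RISC-V, as the failing test files show). The lookup
below is only a sketch: the class FilterIdSketch is hypothetical and is
not how XZ for Java is structured, and the other IDs in the table are
listed as I understand the .xz file format specification.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration; not code from XZ for Java.
final class FilterIdSketch {
    static final Map<Long, String> NAMES = new LinkedHashMap<>();
    static {
        NAMES.put(0x03L, "Delta");
        NAMES.put(0x04L, "x86");
        NAMES.put(0x05L, "PowerPC");
        NAMES.put(0x06L, "IA-64");
        NAMES.put(0x07L, "ARM");
        NAMES.put(0x08L, "ARM-Thumb");
        NAMES.put(0x09L, "SPARC");
        NAMES.put(0x0AL, "ARM64");   // "Unknown Filter ID 10" in 1.9
        NAMES.put(0x0BL, "RISC-V");  // "Unknown Filter ID 11" in 1.9
        NAMES.put(0x21L, "LZMA2");
    }

    // Returns a readable name, or a message similar to the one in the
    // bug report when the ID is not in the table.
    static String describe(long id) {
        String name = NAMES.get(id);
        if (name == null)
            return "Unknown Filter ID " + id;
        return name + " (ID " + id + ")";
    }

    public static void main(String[] args) {
        System.out.println(describe(10));
        System.out.println(describe(11));
        System.out.println(describe(12));
    }
}
```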

The XZ for Java development is becoming active again but it may still
take a while until the next stable release is out. A few other things
are waiting in the queue from the past three years.

-- 
Lasse Collin



[xz-devel] [BUG] Issue with xz-java: Unknown Filter ID

2024-03-04 Thread Dennis Ens


Hi all,

I think I may have found a bug in xz-java version 1.9 (the newest, I 
think). While trying to decompress a file, I got an error message saying 
"Unknown Filter ID 11". To try to solve the issue, I tried to run several of 
the test cases with XZDecDemo.jar. Most of them worked fine, but I noticed 
issues with two arm files and two riscv files. The arm files failed with 
"Unknown Filter ID 10" and the riscv files failed with "Unknown Filter ID 11". 

The files specifically were good-1-arm64-lzma2-1.xz and good-1-arm64-lzma2-2.xz 
and good-1-riscv-lzma2-1.xz and good-1-riscv-lzma2-2.xz. These did seem to work 
fine when I tried with xz, but not with xz-java. Do you think there might be a 
fix available for this soon?

Thank you.

--
Dennis Ens





Re: [xz-devel] xz-java and newer java

2024-03-01 Thread Brett Okken
I found and resolved the difference:
https://github.com/tukaani-project/xz-java/pull/13/commits/1e4550e06d8cbec4079b2b2fba4a2245307cc4e6

It was indeed in BT4 and had to do with searching for the
niceLenLimit. I will update the benchmarks over the weekend, as they
take some time to run.

Brett

On Thu, Feb 29, 2024 at 8:47 PM Brett Okken  wrote:
>
> > Thanks! Ideally there would be one commit to add the minimal portable
> > version, then separate commits for each optimized variant.
>
> Would you like me to remove the Unsafe based impl from
> https://github.com/tukaani-project/xz-java/pull/13?
>
> > So far I have given it only a quick try. array_comp_incremental seems
> > faster than xz-java.git master. Compression time was reduced by about
> > 10 %. :-) This is with OpenJDK 21.0.2, only a quick test, and my
> > computer is old so I don't doubt your higher numbers.
>
> How are you testing? I am using jmh, so it has a warm up period before
> actually measuring, giving the jvm plenty of opportunity to perform
> optimizations. If you are doing single shot executions to compress a
> file, that could provide pretty different results.
>
> > With array_comparison_performance the improvement seems to be less,
> > maybe 5 %. I didn't test much yet but it still seems clear that
> > array_comp_incremental is faster on my computer.
>
> Going back to the previous question, this could be due to the fact that I
> collapsed some class hierarchy in the _incremental pr. This could take
> the optimizer a bit longer to figure out.
>
> > However, your code produces different output compared to xz-java.git
> > master so the speed comparison isn't entirely fair. I assume there was
> > no intent to affect the encoder output with these changes so I wonder
> > what is going on. Both of your branches produce the same output so it's
> > something common between them that makes the difference.
>
> This was definitely not the intent, and I had not noticed this previously.
>
> With the 3 files I test with, none have any difference with preset of
> 3. The smallest file (ihe_ovly_pr.cm) also has no difference at preset
> 6.
>
> With the ~25MB image1.dcm (mostly a greyscale bmp), the PR versions
> produce more compressed content at preset 6.
> 1.9 = 4,041,476
> PR = 4,004,156
>
> There is a smaller difference with the ~50MB xml file, but strangely,
> the PR version is slightly bigger.
> 1.9 = 1,589,512
> PR = 1,589,564
>
> Given that I am only seeing differences with preset of 6, I am
> guessing the difference must be in BT4.
> The result still seems to be valid (at least the java XZInputStream
> reads it back correctly).
> There is clearly a subtle "defect" somewhere, but I cannot tell if it
> is in the current trunk or the PR. My best guess is that there is an
> off by 1 error in one or the other.
>
> Brett
>
> On Thu, Feb 29, 2024 at 11:35 AM Lasse Collin  
> wrote:
> >
> > On 2024-02-25 Brett Okken wrote:
> > > I created https://github.com/tukaani-project/xz-java/pull/13 with the
> > > bare bones changes to utilize a utility for array comparisons and an
> > > Unsafe implementation.
> > > When/if that is reviewed and approved, we can move on through the
> > > other implementation options.
> >
> > Thanks! Ideally there would be one commit to add the minimal portable
> > version, then separate commits for each optimized variant.
> >
> > So far I have given it only a quick try. array_comp_incremental seems
> > faster than xz-java.git master. Compression time was reduced by about
> > 10 %. :-) This is with OpenJDK 21.0.2, only a quick test, and my
> > computer is old so I don't doubt your higher numbers.
> >
> > With array_comparison_performance the improvement seems to be less,
> > maybe 5 %. I didn't test much yet but it still seems clear that
> > array_comp_incremental is faster on my computer.
> >
> > However, your code produces different output compared to xz-java.git
> > master so the speed comparison isn't entirely fair. I assume there was
> > no intent to affect the encoder output with these changes so I wonder
> > what is going on. Both of your branches produce the same output so it's
> > something common between them that makes the difference.
> >
> > I plan to get back to this next week.
> >
> > > > One thing I wonder is if JNI could help.
> > >
> > > It would most likely make things faster, but also more complicated. I
> > > like the java version for the simplicity. I am not necessarily looking
> > > to compete with native performance, but would like to get improvements
> > > where they are reasonably available. Here there is some complexity in
> > > supporting multiple implementations for different versions and/or
> > > architectures, but that complexity does not intrude into the core of
> > > the xz code.
> >
> > I think your thoughts are similar to mine here. Java version is clearly
> > slower but it's nicer code to read too. A separate class for buffer
> > comparisons indeed doesn't hurt the readability of the core code.
> >
> > 

Re: [xz-devel] xz-java and newer java

2024-02-29 Thread Brett Okken
> Thanks! Ideally there would be one commit to add the minimal portable
> version, then separate commits for each optimized variant.

Would you like me to remove the Unsafe based impl from
https://github.com/tukaani-project/xz-java/pull/13?

> So far I have given it only a quick try. array_comp_incremental seems
> faster than xz-java.git master. Compression time was reduced by about
> 10 %. :-) This is with OpenJDK 21.0.2, only a quick test, and my
> computer is old so I don't doubt your higher numbers.

How are you testing? I am using jmh, so it has a warm up period before
actually measuring, giving the jvm plenty of opportunity to perform
optimizations. If you are doing single shot executions to compress a
file, that could provide pretty different results.

> With array_comparison_performance the improvement seems to be less,
> maybe 5 %. I didn't test much yet but it still seems clear that
> array_comp_incremental is faster on my computer.

Going back to the previous question, this could be due to the fact that I
collapsed some class hierarchy in the _incremental pr. This could take
the optimizer a bit longer to figure out.

> However, your code produces different output compared to xz-java.git
> master so the speed comparison isn't entirely fair. I assume there was
> no intent to affect the encoder output with these changes so I wonder
> what is going on. Both of your branches produce the same output so it's
> something common between them that makes the difference.

This was definitely not the intent, and I had not noticed this previously.

With the 3 files I test with, none have any difference with preset of
3. The smallest file (ihe_ovly_pr.cm) also has no difference at preset
6.

With the ~25MB image1.dcm (mostly a greyscale bmp), the PR versions
produce more compressed content at preset 6.
1.9 = 4,041,476
PR = 4,004,156

There is a smaller difference with the ~50MB xml file, but strangely,
the PR version is slightly bigger.
1.9 = 1,589,512
PR = 1,589,564
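
To put the two size differences in proportion, here is a small sketch
that computes them from the numbers above (the class name
SizeDiffSketch is only for illustration). The image shrinks by 37,320
bytes, a bit under 1 %, while the xml file grows by 52 bytes, which is
a few thousandths of a percent.

```java
import java.util.Locale;

// Hypothetical helper for comparing the output sizes listed above.
final class SizeDiffSketch {
    // Size change in bytes going from 'base' to 'other'.
    static long diff(long base, long other) {
        return other - base;
    }

    // Same change as a percentage of 'base'.
    static double percent(long base, long other) {
        return 100.0 * (other - base) / base;
    }

    public static void main(String[] args) {
        long[][] sizes = {
            {4_041_476L, 4_004_156L}, // image1.dcm: 1.9, PR
            {1_589_512L, 1_589_564L}, // xml file:   1.9, PR
        };
        for (long[] s : sizes)
            System.out.println(String.format(Locale.ROOT,
                    "%d -> %d: %+d bytes (%+.4f %%)",
                    s[0], s[1], diff(s[0], s[1]), percent(s[0], s[1])));
    }
}
```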

Given that I am only seeing differences with preset of 6, I am
guessing the difference must be in BT4.
The result still seems to be valid (at least the java XZInputStream
reads it back correctly).
There is clearly a subtle "defect" somewhere, but I cannot tell if it
is in the current trunk or the PR. My best guess is that there is an
off by 1 error in one or the other.

Brett

On Thu, Feb 29, 2024 at 11:35 AM Lasse Collin  wrote:
>
> On 2024-02-25 Brett Okken wrote:
> > I created https://github.com/tukaani-project/xz-java/pull/13 with the
> > bare bones changes to utilize a utility for array comparisons and an
> > Unsafe implementation.
> > When/if that is reviewed and approved, we can move on through the
> > other implementation options.
>
> Thanks! Ideally there would be one commit to add the minimal portable
> version, then separate commits for each optimized variant.
>
> So far I have given it only a quick try. array_comp_incremental seems
> faster than xz-java.git master. Compression time was reduced by about
> 10 %. :-) This is with OpenJDK 21.0.2, only a quick test, and my
> computer is old so I don't doubt your higher numbers.
>
> With array_comparison_performance the improvement seems to be less,
> maybe 5 %. I didn't test much yet but it still seems clear that
> array_comp_incremental is faster on my computer.
>
> However, your code produces different output compared to xz-java.git
> master so the speed comparison isn't entirely fair. I assume there was
> no intent to affect the encoder output with these changes so I wonder
> what is going on. Both of your branches produce the same output so it's
> something common between them that makes the difference.
>
> I plan to get back to this next week.
>
> > > One thing I wonder is if JNI could help.
> >
> > It would most likely make things faster, but also more complicated. I
> > like the java version for the simplicity. I am not necessarily looking
> > to compete with native performance, but would like to get improvements
> > where they are reasonably available. Here there is some complexity in
> > supporting multiple implementations for different versions and/or
> > architectures, but that complexity does not intrude into the core of
> > the xz code.
>
> I think your thoughts are similar to mine here. Java version is clearly
> slower but it's nicer code to read too. A separate class for buffer
> comparisons indeed doesn't hurt the readability of the core code.
>
> On the other hand, if Java version happened to be used a lot then JNI
> could save both time (up to 50 %) and even electricity. java.util.zip
> uses native zlib for the performance-critical code.
>
> In the long run both faster Java code and JNI might be worth doing.
> There's more than enough pure Java stuff to do for now so any JNI
> thoughts have to wait.
>
> --
> Lasse Collin
>



Re: [xz-devel] xz-java and newer java

2024-02-29 Thread Lasse Collin
On 2024-02-25 Brett Okken wrote:
> I created https://github.com/tukaani-project/xz-java/pull/13 with the
> bare bones changes to utilize a utility for array comparisons and an
> Unsafe implementation.
> When/if that is reviewed and approved, we can move on through the
> other implementation options.

Thanks! Ideally there would be one commit to add the minimal portable
version, then separate commits for each optimized variant.

So far I have given it only a quick try. array_comp_incremental seems
faster than xz-java.git master. Compression time was reduced by about
10 %. :-) This is with OpenJDK 21.0.2, only a quick test, and my
computer is old so I don't doubt your higher numbers.

With array_comparison_performance the improvement seems to be less,
maybe 5 %. I didn't test much yet but it still seems clear that
array_comp_incremental is faster on my computer.

However, your code produces different output compared to xz-java.git
master so the speed comparison isn't entirely fair. I assume there was
no intent to affect the encoder output with these changes so I wonder
what is going on. Both of your branches produce the same output so it's
something common between them that makes the difference.

I plan to get back to this next week.

> > One thing I wonder is if JNI could help.  
> 
> It would most likely make things faster, but also more complicated. I
> like the java version for the simplicity. I am not necessarily looking
> to compete with native performance, but would like to get improvements
> where they are reasonably available. Here there is some complexity in
> supporting multiple implementations for different versions and/or
> architectures, but that complexity does not intrude into the core of
> the xz code.

I think your thoughts are similar to mine here. Java version is clearly
slower but it's nicer code to read too. A separate class for buffer
comparisons indeed doesn't hurt the readability of the core code.

On the other hand, if Java version happened to be used a lot then JNI
could save both time (up to 50 %) and even electricity. java.util.zip
uses native zlib for the performance-critical code.

In the long run both faster Java code and JNI might be worth doing.
There's more than enough pure Java stuff to do for now so any JNI
thoughts have to wait.

-- 
Lasse Collin



Re: [xz-devel] [PATCH] xz: Avoid warnings due to memlimit if threads are in auto mode.

2024-02-29 Thread Lasse Collin
On 2024-02-28 Sebastian Andrzej Siewior wrote:
> On 2024-02-28 18:45:03 [+0200], Lasse Collin wrote:
> > V_DEBUG was committed to the master and v5.6 branches a few moments
> > ago, so yes, your plan sounds good. :-) Feel free to do it as you
> > prefer, either just making the change or picking the other simple
> > fixes from v5.6 as well.  
> 
> Perfect. I just took the patch.

Thanks! :-)

> > Hopefully the already-added workarounds in other packages don't
> > cause any unwanted side effects in the future.  
> 
> The plan was to revert it. All good.

:-)

There is a branch "memavail" on GitHub with experimental support for
MemAvailable from Linux /proc/meminfo. It needs discussion and
feedback (likely in a new thread). There is no rush as it's not for
5.6.x anyway.

-- 
Lasse Collin



Re: [xz-devel] [PATCH] xz: Avoid warnings due to memlimit if threads are in auto mode.

2024-02-28 Thread Lasse Collin
On 2024-02-28 Sebastian Andrzej Siewior wrote:
> I see. In that case let me throw this to V_DEBUG Debian-wise and sync
> with xz upstream once a new release is up or so. I have two packages
> that fail because of this and dpkg added a workaround. So instead of
> adding another workaround to another package I would fix this on the
> xz side. Sounds good?

V_DEBUG was committed to the master and v5.6 branches a few moments ago,
so yes, your plan sounds good. :-) Feel free to do it as you prefer,
either just making the change or picking the other simple fixes from
v5.6 as well.

Hopefully the already-added workarounds in other packages don't cause
any unwanted side effects in the future.

Thanks!

-- 
Lasse Collin



Re: [xz-devel] [PATCH] xz: Avoid warnings due to memlimit if threads are in auto mode.

2024-02-28 Thread Sebastian Andrzej Siewior
On 2024-02-28 13:00:08 [+0200], Lasse Collin wrote:
> > > There are also messages that are shown when memory limit does affect
> > > compressed output (switching to single-threaded mode and LZMA2
> > > dictionary size adjustment). The verbosity requirement of these
> > > messages isn't being changed now.  
> > 
> > This sounds like you accept this change in principle but are thinking
> > if V_VERBOSE or V_DEBUG is the right thing.
> 
> Me and three other people on IRC think it should be changed but there
> is no consensus yet what exactly is the best (your patch, -v, or -vv).
> This is about the thread count messages only as (since 5.4.0) automatic
> thread count doesn't affect the compressed output.
> 
> There is some discussion also here:
> 
> https://github.com/tukaani-project/xz/issues/89

I see. In that case let me throw this to V_DEBUG Debian-wise and sync
with xz upstream once a new release is up or so. I have two packages
that fail because of this and dpkg added a workaround. So instead of
adding another workaround to another package I would fix this on the xz
side. Sounds good?

Sebastian



Re: [xz-devel] [PATCH] xz: Avoid warnings due to memlimit if threads are in auto mode.

2024-02-28 Thread Lasse Collin
On 2024-02-27 Sebastian Andrzej Siewior wrote:
> On 2024-02-27 19:17:48 [+0200], Lasse Collin wrote:
> >   - The silencing could be done with -q as well though.  
> 
> Wouldn't -q also silence some legitimate warnings?

Yes. When compressing from stdin to stdout, there aren't many possible
warnings but there are still a few rare ones. So -q isn't ideal to get
rid of thread count reduction messages.

> Isn't the automatic memory usage accurate?

It's simply 25 % of total RAM. The Linux-specific MemAvailable from
/proc/meminfo didn't get into 5.6.0. Perhaps it could be done in the
next development cycle, and maybe also look for similar features on a
few other OSes.
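
To make the difference between the two approaches concrete, here is a
minimal C sketch. It is not the code from xz or from the "memavail"
branch: the function names meminfo_kib(), default_memlimit() and
memlimit_from_meminfo() are made up for illustration, and the policy of
preferring MemAvailable over 25 % of MemTotal is only an assumption about
how such a feature could behave.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

// Find "key:" at the start of a line in /proc/meminfo-style text and
// return its value in KiB, or 0 if the key is missing or malformed.
// (Hypothetical helper, not part of xz.)
static uint64_t
meminfo_kib(const char *text, const char *key)
{
	assert(text != NULL && key != NULL);
	const size_t key_len = strlen(key);
	const char *line = text;

	while (line != NULL && *line != '\0') {
		if (strncmp(line, key, key_len) == 0
				&& line[key_len] == ':') {
			unsigned long long kib;
			if (sscanf(line + key_len + 1, "%llu", &kib) == 1)
				return kib;

			return 0;
		}

		// Move to the beginning of the next line, if any.
		line = strchr(line, '\n');
		if (line != NULL)
			++line;
	}

	return 0;
}

// The automatic limit as described above: 25 % of total RAM.
static uint64_t
default_memlimit(uint64_t total_ram)
{
	return total_ram / 4;
}

// Assumed policy for illustration only: use MemAvailable when the
// kernel reports it, otherwise fall back to 25 % of MemTotal.
static uint64_t
memlimit_from_meminfo(const char *text)
{
	const uint64_t avail = meminfo_kib(text, "MemAvailable");
	if (avail != 0)
		return avail * 1024;

	return default_memlimit(meminfo_kib(text, "MemTotal") * 1024);
}
```

In real use the text would come from reading /proc/meminfo; taking a
string as input keeps the sketch easy to test on any system.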

> Not sure if documenting it in the man-page would help here.

One issue is that currently the message tells about thread count
reduction and what the memlimit is but not how much memory is actually
required. One needs to use -vv to get the usage info.

Documenting on the man page could be good if it can be explained in an
understandable way and people can find it there. The man page is long
already.

The less average users *need* to understand the details the better.

> > There are also messages that are shown when memory limit does affect
> > compressed output (switching to single-threaded mode and LZMA2
> > dictionary size adjustment). The verbosity requirement of these
> > messages isn't being changed now.  
> 
> This sounds like you accept this change in principle but are thinking
> if V_VERBOSE or V_DEBUG is the right thing.

Me and three other people on IRC think it should be changed but there
is no consensus yet what exactly is the best (your patch, -v, or -vv).
This is about the thread count messages only as (since 5.4.0) automatic
thread count doesn't affect the compressed output.

There is some discussion also here:

https://github.com/tukaani-project/xz/issues/89

-- 
Lasse Collin



Re: [xz-devel] [PATCH] xz: Avoid warnings due to memlimit if threads are in auto mode.

2024-02-27 Thread Sebastian Andrzej Siewior
On 2024-02-27 19:17:48 [+0200], Lasse Collin wrote:
> Thanks for the patch! We discussed a bit on IRC and everyone thinks
> it's on the right track but we are pondering the implementation details
> still.
> 
> The thread count messages are shown in situations which don't affect
> the compressed output, and thus the importance of these messages isn't
> so high. Originally they were there to reduce the chance of people
> asking why xz isn't using as many threads as requested.

Understood.

> We are considering simply changing those two message() calls to always
> use V_VERBOSE or V_DEBUG instead of the current V_WARNING. So automatic
> vs. manual number of threads wouldn't affect it like it does in your
> patch. Comparing your approach and this simpler one:
> 
>   + There are scripts that take a user-specified number for
>     parallelization and that number is passed to multiple tools, not
>     just xz. Keeping xz -T16 silent about thread count reduction can
>     make sense in this case.
> 
>   - The silencing could be done with -q as well though.

Wouldn't -q also silence some legitimate warnings?

> There are pros and cons between V_VERBOSE and V_DEBUG.
> 
> For (de)compression, a single -v sets V_VERBOSE and activates the
> progress indicator. If the thread count messages are shown at -v, on
> some systems progress indicator usage would get the message about
> reduced thread count as well.
> 
>   + It works as a hint that increasing the memory usage limits manually
>     might allow more threads to be used.
> 
>   - If one uses progress indicator frequently, the thread count
>     reduction message might become slightly annoying as the information
>     is already known by the user.
> 
>   - Progress indicator can be used in non-interactive cases (when
>     stderr isn't a terminal). Then xz only prints a final summary per
>     file. This likely is not a common use case but the thread count
>     messages would be here as well.
> 
> V_DEBUG is set when -v is used twice (-vv).
> 
>   + Regular progress indicator uses wouldn't get extra messages.
> 
>   - A larger number of users might not become aware that they aren't
>     getting as many threads as they could because the automatic memory
>     usage limit is too low to allow more threads.

Isn't the automatic memory usage accurate? Because then there is hardly
anything you could do about it. Maybe kill chrome…
On the other hand, if they decompress and they have 32 CPUs and xz
reduces it to 16 threads then there is no "loss" if the file has 16
blocks or fewer :)
Not sure if documenting it in the man-page would help here.

> There are also messages that are shown when memory limit does affect
> compressed output (switching to single-threaded mode and LZMA2
> dictionary size adjustment). The verbosity requirement of these messages
> isn't being changed now.

This sounds like you accept this change in principle but are thinking if
V_VERBOSE or V_DEBUG is the right thing.

Sebastian



Re: [xz-devel] [PATCH] xz: Avoid warnings due to memlimit if threads are in auto mode.

2024-02-27 Thread Lasse Collin
On 2024-02-26 Sebastian Andrzej Siewior wrote:
> Print the warning about reduced threads only if the number is selected
> - automatically and the user asked to be verbose (-v)
> - explicitly by the user

Thanks for the patch! We discussed a bit on IRC and everyone thinks
it's on the right track but we are pondering the implementation details
still.

The thread count messages are shown in situations which don't affect
the compressed output, and thus the importance of these messages isn't
so high. Originally they were there to reduce the chance of people
asking why xz isn't using as many threads as requested.

We are considering simply changing those two message() calls to always
use V_VERBOSE or V_DEBUG instead of the current V_WARNING. So automatic
vs. manual number of threads wouldn't affect it like it does in your
patch. Comparing your approach and this simpler one:

  + There are scripts that take a user-specified number for
    parallelization and that number is passed to multiple tools, not
    just xz. Keeping xz -T16 silent about thread count reduction can
    make sense in this case.

  - The silencing could be done with -q as well though.

There are pros and cons between V_VERBOSE and V_DEBUG.

For (de)compression, a single -v sets V_VERBOSE and activates the
progress indicator. If the thread count messages are shown at -v, on
some systems progress indicator usage would get the message about
reduced thread count as well.

  + It works as a hint that increasing the memory usage limits manually
    might allow more threads to be used.

  - If one uses progress indicator frequently, the thread count
    reduction message might become slightly annoying as the information
    is already known by the user.

  - Progress indicator can be used in non-interactive cases (when
    stderr isn't a terminal). Then xz only prints a final summary per
    file. This likely is not a common use case but the thread count
    messages would be here as well.

V_DEBUG is set when -v is used twice (-vv).

  + Regular progress indicator uses wouldn't get extra messages.

  - A larger number of users might not become aware that they aren't
    getting as many threads as they could because the automatic memory
    usage limit is too low to allow more threads.

There are also messages that are shown when memory limit does affect
compressed output (switching to single-threaded mode and LZMA2
dictionary size adjustment). The verbosity requirement of these messages
isn't being changed now.
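
The effect of moving a message from V_WARNING to V_VERBOSE or V_DEBUG can
be modelled with a small C sketch. This is a simplified model, not the
message code in xz: V_WARNING, V_VERBOSE and V_DEBUG are the names used
above, while V_SILENT, V_ERROR and both helper functions are assumptions
added only to make the example complete.

```c
#include <assert.h>
#include <stdbool.h>

// Verbosity levels in increasing order. Only V_WARNING, V_VERBOSE and
// V_DEBUG are named in this thread; the rest is assumed.
enum message_verbosity {
	V_SILENT,
	V_ERROR,
	V_WARNING,
	V_VERBOSE,
	V_DEBUG,
};

// Hypothetical helper: the level reached from the default (V_WARNING)
// after `quiet` uses of -q and `verbose` uses of -v.
static enum message_verbosity
verbosity_from_options(int quiet, int verbose)
{
	assert(quiet >= 0 && verbose >= 0);
	int v = (int)V_WARNING + verbose - quiet;

	if (v < (int)V_SILENT)
		v = (int)V_SILENT;

	if (v > (int)V_DEBUG)
		v = (int)V_DEBUG;

	return (enum message_verbosity)v;
}

// A message is printed only if its required level does not exceed
// the current verbosity.
static bool
message_is_shown(enum message_verbosity current,
		enum message_verbosity required)
{
	return required <= current;
}
```

In this model a V_WARNING message is printed by default and hidden by -q,
a V_VERBOSE message needs -v, and a V_DEBUG message needs -vv, which
matches the trade-offs listed above.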

-- 
Lasse Collin



[xz-devel] [PATCH] xz: Avoid warnings due to memlimit if threads are in auto mode.

2024-02-26 Thread Sebastian Andrzej Siewior
From: Sebastian Andrzej Siewior 

If threads are automatically selected then it is possible that their
number needs to get reduced in order not to exceed the current memory
limit. This "reducing" forces a warning which is printed on stderr to
inform the user.
The information is probably not something the user would be interested
in since he did not explicitly ask for the additional threads and so any
number of threads would probably do it without raising an eyebrow.
The downside of this warning is that a few testsuites capture the output
of stderr and complain now that something went wrong.

Print the warning about reduced threads only if the number is selected
- automatically and the user asked to be verbose (-v)
- explicitly by the user

Signed-off-by: Sebastian Andrzej Siewior 
---
 src/xz/coder.c| 13 +++--
 src/xz/hardware.c |  7 +++
 src/xz/hardware.h |  2 ++
 3 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/src/xz/coder.c b/src/xz/coder.c
index 4efaa802b9bbc..e5e30558aedf6 100644
--- a/src/xz/coder.c
+++ b/src/xz/coder.c
@@ -580,8 +580,13 @@ coder_set_compression_settings(void)
message_bug();
 
if (memory_usage <= memory_limit) {
+   enum message_verbosity v = V_WARNING;
+
+   if (hardware_threads_are_automatic())
+   v = V_VERBOSE;
+
// The memory usage is now low enough.
-   message(V_WARNING, _("Reduced the number of "
+   message(v, _("Reduced the number of "
"threads from %s to %s to not exceed "
"the memory usage limit of %s MiB"),
uint64_to_str(
@@ -601,7 +606,11 @@ coder_set_compression_settings(void)
// time the soft limit will never make xz fail and never make
// xz change settings that would affect the compressed output.
if (hardware_memlimit_mtenc_is_default()) {
-   message(V_WARNING, _("Reduced the number of threads "
+   enum message_verbosity v = V_WARNING;
+
+   if (hardware_threads_are_automatic())
+   v = V_VERBOSE;
+   message(v, _("Reduced the number of threads "
"from %s to one. The automatic memory usage "
"limit of %s MiB is still being exceeded. "
"%s MiB of memory is required. "
diff --git a/src/xz/hardware.c b/src/xz/hardware.c
index 952652fecb8d9..c1d54a5910b7a 100644
--- a/src/xz/hardware.c
+++ b/src/xz/hardware.c
@@ -195,6 +195,13 @@ hardware_memlimit_mtenc_get(void)
 }
 
 
+extern bool
+hardware_threads_are_automatic(void)
+{
+   return threads_are_automatic;
+}
+
+
 extern bool
 hardware_memlimit_mtenc_is_default(void)
 {
diff --git a/src/xz/hardware.h b/src/xz/hardware.h
index 25b351e32b195..e4cfe299d2b2d 100644
--- a/src/xz/hardware.h
+++ b/src/xz/hardware.h
@@ -25,6 +25,8 @@ extern uint32_t hardware_threads_get(void);
 /// This can be true even if the number of threads is one.
 extern bool hardware_threads_is_mt(void);
 
+/// Returns true if the number of threads has been set automatically.
+extern bool hardware_threads_are_automatic(void);
 
 /// Set the memory usage limit. There are separate limits for compression,
 /// decompression (also includes --list), and multithreaded decompression.
-- 
2.43.0




Re: [xz-devel] xz-java and newer java

2024-02-25 Thread Brett Okken
> Thanks! It could be good to split into smaller commits to make reviewing
> easier.

I created https://github.com/tukaani-project/xz-java/pull/13 with the
bare bones changes to utilize a utility for array comparisons and an
Unsafe implementation.
When/if that is reviewed and approved, we can move on through the
other implementation options.
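
For readers unfamiliar with the idea behind such an array comparison
utility, here is a rough C analogue. It is not the code from the pull
request (which is Java), and the function names are invented for this
sketch. It shows the general technique only: compare eight bytes per step
and fall back to a bytewise loop for the mismatching word and the tail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Return the number of leading bytes that are equal in a and b,
// examining at most `limit` bytes, one byte at a time.
static size_t
match_len_bytewise(const uint8_t *a, const uint8_t *b, size_t limit)
{
	size_t len = 0;
	while (len < limit && a[len] == b[len])
		++len;

	return len;
}

// Same result as match_len_bytewise() but eight bytes per step while
// the buffers keep matching.
static size_t
match_len_wordwise(const uint8_t *a, const uint8_t *b, size_t limit)
{
	assert(a != NULL && b != NULL);
	size_t len = 0;

	// memcpy() keeps the loads valid for unaligned addresses;
	// compilers typically turn each call into a single load.
	while (limit - len >= sizeof(uint64_t)) {
		uint64_t x;
		uint64_t y;
		memcpy(&x, a + len, sizeof(x));
		memcpy(&y, b + len, sizeof(y));
		if (x != y)
			break;

		len += sizeof(uint64_t);
	}

	// Finish byte by byte: either the word that differed or the
	// short tail that didn't fill a whole word.
	return len + match_len_bytewise(a + len, b + len, limit - len);
}
```

Finishing the mismatching word bytewise avoids any dependency on byte
order. A faster variant could locate the first differing byte from the
XOR of the two words, at the cost of endianness-specific code.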

> One thing I wonder is if JNI could help.

It would most likely make things faster, but also more complicated. I
like the java version for the simplicity. I am not necessarily looking
to compete with native performance, but would like to get improvements
where they are reasonably available. Here there is some complexity in
supporting multiple implementations for different versions and/or
architectures, but that complexity does not intrude into the core of
the xz code.

Thanks,
Brett

On Mon, Feb 19, 2024 at 1:32 PM Lasse Collin  wrote:
>
> On 2024-02-19 Brett Okken wrote:
> > I have created a pr to the GitHub project.
> >
> > https://github.com/tukaani-project/xz-java/pull/12
>
> Thanks! It could be good to split into smaller commits to make reviewing
> easier.
>
> > It is not clear to me if that is actually seeing active dev on the
> > Java project yet.
>
> I see now that there are quite a few things on GH. I had forgotten to
> turn email notifications on for the xz-java project; clearly those
> aren't on by default. :-( But likely not much would have been done even
> if I had noticed those issues and PRs earlier so the main problem is
> that the silence has been impolite. I'm sorry.
>
> XZ Utils 5.6.0 has to be released this month since there was a wish to
> get it into the next Ubuntu LTS. I'm hoping that next month something
> will finally get done around XZ for Java. We'll see.
>
> One thing I wonder is if JNI could help. Optimizing the Java code can
> help a bit but I suspect that it still won't be very fast. So far it has
> been nice that the Java code is quite readable and I would like to keep it
> that way in the future too.
>
> --
> Lasse Collin



[xz-devel] XZ Utils 5.6.0

2024-02-24 Thread Jia Tan
XZ Utils 5.6.0 is available at .

There currently are no plans to maintain the 5.4.x branch, but releases
could be made if there is community interest.

Here is an extract from the NEWS file:

5.6.0 (2024-02-24)

This bumps the minor version of liblzma because new features were
added. The API and ABI are still backward compatible with liblzma
5.4.x and 5.2.x and 5.0.x.

NOTE: As described in the NEWS for 5.5.2beta, the core components
are now under the BSD Zero Clause License (0BSD).

Since 5.5.2beta:

* liblzma:

    - Disabled the branchless C variant in the LZMA decoder based
      on the benchmark results from the community.

    - Disabled x86-64 inline assembly on x32 to fix the build.

* Sandboxing support in xz:

    - Landlock is now used even when xz needs to create files.
      In this case the sandbox has to be more permissive than
      when no files need to be created. A similar thing was
      already in use with pledge(2) since 5.3.4alpha.

    - Landlock and pledge(2) are now stricter when reading from
      more than one input file and only writing to standard output.

    - Added support for Landlock ABI version 4.

* CMake:

    - Default to -O2 instead of -O3 with CMAKE_BUILD_TYPE=Release.
      -O3 is not useful for speed and makes the code larger.

    - Now builds lzmainfo and lzmadec.

    - xzdiff, xzgrep, xzless, xzmore, and their symlinks are now
      installed. The scripts are also tested during "make test".

    - Added translation support for xz, lzmainfo, and the
      man pages.

    - Applied the symbol versioning workaround for MicroBlaze that
      is used in the Autotools build.

    - The general XZ Utils and liblzma API documentation is now
      installed.

    - The CMake component names were changed a little and several
      were added. liblzma_Runtime and liblzma_Development are
      unchanged.

    - Minimum required CMake version is now 3.14. However,
      translation support is disabled with CMake versions
      older than 3.20.

    - The CMake-based build is now close to feature parity with the
      Autotools-based build. Most importantly a few tests aren't
      run yet. Testing the CMake-based build on different operating
      systems would be welcome now. See the comment at the top of
      CMakeLists.txt.

* Fixed a bug in the Autotools feature test for ARM64 CRC32
  instruction support for old versions of Clang. This did not
  affect the CMake build.

* Windows:

    - The build instructions in INSTALL and windows/INSTALL*.txt
      were revised completely.

    - windows/build-with-cmake.bat along with the instructions
      in windows/INSTALL-MinGW-w64_with_CMake.txt should make
      it very easy to build liblzma.dll and xz.exe on Windows
      using CMake and MinGW-w64 with either GCC or Clang/LLVM.

    - windows/build.bash was updated. It now works on MSYS2 and
      on GNU/Linux (cross-compiling) to create a .zip and .7z
      package for 32-bit and 64-bit x86 using GCC + MinGW-w64.

* The TODO file is no longer installed as part of the
  documentation. The file is out of date and does not reflect
  the actual tasks that will be completed in the future.

* Translations:

    - Translated lzmainfo man pages are now installed. These
      had been forgotten in earlier versions.

    - Updated Croatian, Esperanto, German, Hungarian, Korean,
      Polish, Romanian, Spanish, Swedish, Vietnamese, and Ukrainian
      translations.

    - Updated German, Korean, Romanian, and Ukrainian man page
      translations.

* Added a few tests.

---

Jia Tan



Re: [xz-devel] Testing LZMA_RANGE_DECODER_CONFIG

2024-02-20 Thread Lasse Collin
On 2024-02-19 Sebastian Andrzej Siewior wrote:
> Okay, so the input matters, too. I tried 1GiB urandom (so it does not
> compress so well) but that went quicker than expected…

urandom should be incompressible. When LZMA2 cannot compress a chunk it
stores it in uncompressed form. Decompression is like "cat with CRC".

> I found 3 idle x86 boxes and re-ran a test with Linux's perf on them
> and the arm64 box. I ran all flavours for the two archives. On RiscV I
> did the 'xz -t' thing because perf seems not to be supported well or
> I lack access.

Great work! Thanks!

On IRC one person ran a bunch of tests too. On ARM64 the results were
mixed. A variant that was better with GCC could be worse with Clang. So
those weren't as clear as your results but they too made me think that
using 0 for non-x86-64 is the way to go for 5.6.0.

Your x86-64 asm variant results were interesting too. Seems that the bit
0x100 isn't good with GCC although the difference is small. I confirmed
this on the tests I did on Celeron G1620 (Ivy Bridge). So I wonder if
0x0F0 should be the x86-64 variant to use in xz 5.6.0 with GCC.

On another machine with Clang 16, 0x100 is 8 % faster with Linux kernel
source. So the difference is somewhat big. It's still slightly slower
than the GCC version. This is on Phenom II X4 920.

Since 0x100 is only a little worse with GCC, using it for both GCC and
Clang could be OK. An #ifdef __clang__ could be used too but perhaps
it's not great in the long term. Something has to be chosen for 5.6.0;
further tweaks can be made later.

By the way, the "time" command gives more precise results than "xz -v".
I use

TIMEFORMAT=$'\nreal\t%3R\nuser\t%3U\nsys\t%3S\ncpu%%\t%P'

in bash to keep the output as seconds instead of minutes and seconds.

-- 
Lasse Collin



Re: [xz-devel] Testing LZMA_RANGE_DECODER_CONFIG

2024-02-20 Thread Sebastian Andrzej Siewior

On 2024-02-18 22:35:03 [+0200], Lasse Collin wrote:
> The balance between the hottest locations in the decompressor code
> varies depending on the input file. Linux kernel source compresses very
> well (ratio is about 0.10). This reduces the benefit of branchless
> code. On my main computer I still get about 2 % time reduction with =3.

Okay, so the input matters, too. I tried 1GiB urandom (so it does not
compress so well) but that went quicker than expected… Anyway.
I found 3 idle x86 boxes and re-ran a test with Linux's perf on them and
the arm64 box. I ran all flavours for the two archives. On RiscV I did the
'xz -t' thing because perf seems not to be supported well or I lack
access.

The task is pinned to a single CPU, which means it can't be migrated to
another core and xz observes only one "core" (and does not spawn
threads). So it is single-threaded.

Intel(R) Xeon(R) Platinum 8176M CPU:

|  Performance counter stats for './xz_0x000_gcc -t linux-6.7.5.tar.xz' (5 runs):
|
|       13.384,81 msec task-clock       #    1,000 CPUs utilized     ( +-  0,05% )
|              21      context-switches #    1,569 /sec              ( +-  2,61% )
|               0      cpu-migrations   #    0,000 /sec
|             119      page-faults      #    8,891 /sec              ( +-  0,34% )
|  28.041.975.275      cycles           #    2,095 GHz               ( +-  0,05% )
|  32.576.330.155      instructions     #     1,16  insn per cycle   ( +-  0,00% )
|   4.304.914.251      branches         #  321,627 M/sec             ( +-  0,00% )
|     567.850.712      branch-misses    #    13,19% of all branches  ( +-  0,02% )
|
|        13,38558 +- 0,00707 seconds time elapsed  ( +-  0,05% )
|
|  Performance counter stats for './xz_0x003_gcc -t linux-6.7.5.tar.xz' (5 runs):
|
|       12.853,67 msec task-clock       #    1,000 CPUs utilized     ( +-  0,03% )
|              18      context-switches #    1,400 /sec              ( +-  5,72% )
|               0      cpu-migrations   #    0,000 /sec
|             220      page-faults      #   17,116 /sec              ( +- 45,95% )
|  26.929.223.135      cycles           #    2,095 GHz               ( +-  0,03% )
|  42.017.609.529      instructions     #     1,56  insn per cycle   ( +-  0,00% )
|   3.226.245.101      branches         #  250,998 M/sec             ( +-  0,00% )
|     299.814.626      branch-misses    #     9,29% of all branches  ( +-  0,11% )
|
|        12,85438 +- 0,00395 seconds time elapsed  ( +-  0,03% )

Missed branches dropped; the instruction count went up but insn per cycle
improved. Fewer idle cycles. Worth ~0.5 sec.

|  Performance counter stats for './xz_0x00f_gcc -t linux-6.7.5.tar.xz' (5 runs):
|
|       12.872,36 msec task-clock       #    1,000 CPUs utilized     ( +-  0,01% )
|              17      context-switches #    1,321 /sec              ( +-  6,55% )
|               0      cpu-migrations   #    0,000 /sec
|             220      page-faults      #   17,091 /sec              ( +- 45,98% )
|  26.968.386.196      cycles           #    2,095 GHz               ( +-  0,01% )
|  44.566.213.262      instructions     #     1,65  insn per cycle   ( +-  0,00% )
|   2.957.642.049      branches         #  229,767 M/sec             ( +-  0,00% )
|     249.987.257      branch-misses    #     8,45% of all branches  ( +-  0,05% )
|
|        12,87303 +- 0,00115 seconds time elapsed  ( +-  0,01% )

Slightly worse vs previous.

|  Performance counter stats for './xz_0x1f0_gcc -t linux-6.7.5.tar.xz' (5 runs):
|
|        9.740,84 msec task-clock       #    1,000 CPUs utilized     ( +-  0,02% )
|              21      context-switches #    2,156 /sec              ( +-  6,14% )
|               0      cpu-migrations   #    0,000 /sec
|             216      page-faults      #   22,175 /sec              ( +- 46,95% )
|  20.407.560.821      cycles           #    2,095 GHz               ( +-  0,02% )
|  34.751.763.859      instructions     #     1,70  insn per cycle   ( +-  0,00% )
|   3.182.093.181      branches         #  326,676 M/sec             ( +-  0,00% )
|

Re: [xz-devel] xz-java and newer java

2024-02-19 Thread Lasse Collin
On 2024-02-19 Brett Okken wrote:
> I have created a pr to the GitHub project.
> 
> https://github.com/tukaani-project/xz-java/pull/12

Thanks! It could be good to split into smaller commits to make reviewing
easier.

> It is not clear to me if that is actually seeing active dev on the
> Java project yet.

I see now that there are quite a few things on GH. I had forgotten to
turn email notifications on for the xz-java project; clearly those
aren't on by default. :-( But likely not much would have been done even
if I had noticed those issues and PRs earlier so the main problem is
that the silence has been impolite. I'm sorry.

XZ Utils 5.6.0 has to be released this month since there was a wish to
get it into the next Ubuntu LTS. I'm hoping that next month something
will finally get done around XZ for Java. We'll see.

One thing I wonder is if JNI could help. Optimizing the Java code can
help a bit but I suspect that it still won't be very fast. So far it has
been nice that the Java code is quite readable and I would like to keep it
that way in the future too.

-- 
Lasse Collin



Re: [xz-devel] Re: improve java delta performance

2024-02-19 Thread Brett Okken
I have created a pr to the GitHub project with these changes.

https://github.com/tukaani-project/xz-java/pull/11/files

Thanks,
Brett

On Thu, Mar 31, 2022 at 4:33 PM Lasse Collin 
wrote:

> > On Thu, May 6, 2021 at 4:18 PM Brett Okken 
> > wrote:
> >
> > > These changes reduce the time of DeltaEncoder by ~65% and
> > > DeltaDecoder by ~40%, assuming using arrays that are several KB in
> > > size.
>
> On 2022-02-12 Brett Okken wrote:
> > Can this be reviewed?
>
> It looks reasonable but I try to focus on XZ Utils at the moment.
>
> The Delta code in XZ Utils is also very simple and could be optimized
> the same way. But since Delta isn't used alone (it's used together with
> LZMA2) I suspect the overall improvement isn't big. It could still be
> done as it is simple but I won't look at it now.
>
> For the ArrayUtil patch, it's a complex one and I'm not able to look at
> it for now.
>
> --
> Lasse Collin
>


Re: [xz-devel] xz-java and newer java

2024-02-19 Thread Brett Okken
I have created a pr to the GitHub project.

https://github.com/tukaani-project/xz-java/pull/12

It is not clear to me if that is actually seeing active dev on the Java
project yet.

Thanks,
Brett

On Sat, Feb 12, 2022 at 11:45 AM Brett Okken 
wrote:

> Can this be taken up again?
>
> On Wed, Mar 24, 2021 at 6:20 AM Brett Okken 
> wrote:
>
>> I grabbed an older version in the last mail. This is the updated
>> version for aarch64.
>>
>


Re: [xz-devel] Testing LZMA_RANGE_DECODER_CONFIG

2024-02-18 Thread Lasse Collin
The balance between the hottest locations in the decompressor code
varies depending on the input file. Linux kernel source compresses very
well (ratio is about 0.10). This reduces the benefit of branchless
code. On my main computer I still get about 2 % time reduction with =3.

On another x86-64 computer I don't see any difference between =0 and =3
with the Linux kernel source. On the same machine, decompression time
of warzone2100-data[1] from Debian is reduced by 10.5 % with =3 compared
to =0. It's a package that doesn't compress so well (ratio is about
0.75). On my main computer the time reduction from =0 to =3 is 8.5 %.
All numbers are with GCC.

Of course, on x86-64 the =0 vs. =3 test isn't that interesting since the
asm is so much better. But this highlights how much the test file
choice can make a difference.

[1] https://packages.debian.org/bookworm/all/warzone2100-data/download

-- 
Lasse Collin



Re: [xz-devel] Testing LZMA_RANGE_DECODER_CONFIG

2024-02-18 Thread Lasse Collin
On 2024-02-17 Sebastian Andrzej Siewior wrote:
> I did some testing on !x86. I changed LZMA_RANGE_DECODER_CONFIG to
> different values, ran a test, and looked at the MiB/s value. xz_0 means
> LZMA_RANGE_DECODER_CONFIG was 0, xz_1 means the define was set to 1. I
> touched src/liblzma/lzma/lzma_decoder.c and rebuilt xz. I pinned the
> shell to a single CPU and ran the archive test (-tv) on one file
> three times.

Great to see testing! The testing method is fine. If pinning to a
single core, I assume --threads=1 was set as well because
multithreading is the default now.

Branchless code can help when branch prediction penalties are high. So
it will depend on the processor (not just the instruction set).

On x86-64, there was a clear improvement with the branchless C code. It
was a little more with Clang than GCC. So if easily possible, also
testing with Clang could be useful. Testing your script on x86-64 could
be worth it too, to check that at least on x86-64 you get an improvement
with =1 and =3 compared to =0. (The bit 1 makes the main difference; 2
should have a small effect, and 4 and 8 are questionable and perhaps
not worth benchmarking until the usefulness of =1 or =3 is clear.)

If the branchless C code is not consistent outside x86-64, then 5.6.0
likely should stick to =0. From your results it seems that the other
tweaks to the code provided a minor improvement on non-x86-64 still.
(The tweaks that LZMA_RANGE_DECODER_CONFIG doesn't affect.)
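
To illustrate what "branchless" means here, below is a minimal C sketch
of decoding one bit with an LZMA-style range decoder, once with a
conditional branch and once with a mask. It is not the liblzma code and
it does not correspond to any particular LZMA_RANGE_DECODER_CONFIG bit;
the struct and function names are made up, and range normalization
(reading more input when the range gets small) is left out.

```c
#include <assert.h>
#include <stdint.h>

// LZMA probability model: 11-bit probabilities adapted with a shift of 5.
#define BIT_MODEL_TOTAL_BITS 11
#define BIT_MODEL_TOTAL (1u << BIT_MODEL_TOTAL_BITS)
#define MOVE_BITS 5

// Hypothetical minimal decoder state for this sketch.
struct rc_state {
	uint32_t range;
	uint32_t code;
};

// Conventional version: one conditional branch per decoded bit.
static uint32_t
rc_bit_branchy(struct rc_state *rc, uint16_t *prob)
{
	assert(*prob < BIT_MODEL_TOTAL);
	const uint32_t p = *prob;
	const uint32_t bound = (rc->range >> BIT_MODEL_TOTAL_BITS) * p;

	if (rc->code < bound) {
		rc->range = bound;
		*prob = (uint16_t)(p + ((BIT_MODEL_TOTAL - p) >> MOVE_BITS));
		return 0;
	}

	rc->range -= bound;
	rc->code -= bound;
	*prob = (uint16_t)(p - (p >> MOVE_BITS));
	return 1;
}

// Branchless version: both outcomes are computed and a mask selects
// the right one, so there is nothing for the branch predictor to miss.
static uint32_t
rc_bit_branchless(struct rc_state *rc, uint16_t *prob)
{
	assert(*prob < BIT_MODEL_TOTAL);
	const uint32_t p = *prob;
	const uint32_t bound = (rc->range >> BIT_MODEL_TOTAL_BITS) * p;

	// mask is all ones when the decoded bit is 1, otherwise zero.
	const uint32_t bit = rc->code >= bound;
	const uint32_t mask = 0u - bit;

	rc->range = (bound & ~mask) | ((rc->range - bound) & mask);
	rc->code -= bound & mask;

	const uint32_t up = (BIT_MODEL_TOTAL - p) >> MOVE_BITS;
	const uint32_t down = p >> MOVE_BITS;
	*prob = (uint16_t)(p + (up & ~mask) - (down & mask));
	return bit;
}
```

The branchless version does more arithmetic per bit, so whether it wins
depends on how costly mispredicted branches are on the processor and on
how predictable the input is, which fits the mixed results in this
thread.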

Thanks!

-- 
Lasse Collin



[xz-devel] Testing LZMA_RANGE_DECODER_CONFIG

2024-02-17 Thread Sebastian Andrzej Siewior
Hi,

I did some testing on !x86. I changed LZMA_RANGE_DECODER_CONFIG to
different values, ran a test, and looked at the MiB/s value. xz_0 means
LZMA_RANGE_DECODER_CONFIG was 0, xz_1 means the define was set to 1. I
touched src/liblzma/lzma/lzma_decoder.c and rebuilt xz. I pinned the
shell to a single CPU and ran the archive test (-tv) on one file three
times. These are the results:

arm64 (Lenovo HR350A):
=== xz 5.4.1 ===
linux-6.7.5.tar.xz (1/1)
  100 % 134.9 MiB / 1,386.4 MiB = 0.097   110 MiB/s   0:12
linux-6.7.5.tar.xz (1/1)
  100 % 134.9 MiB / 1,386.4 MiB = 0.097   110 MiB/s   0:12
linux-6.7.5.tar.xz (1/1)
  100 % 134.9 MiB / 1,386.4 MiB = 0.097   110 MiB/s   0:12

=== ./xz_0 ===
linux-6.7.5.tar.xz (1/1)
  100 % 134.9 MiB / 1,386.4 MiB = 0.097   115 MiB/s   0:12
linux-6.7.5.tar.xz (1/1)
  100 % 134.9 MiB / 1,386.4 MiB = 0.097   115 MiB/s   0:12
linux-6.7.5.tar.xz (1/1)
  100 % 134.9 MiB / 1,386.4 MiB = 0.097   115 MiB/s   0:12

=== ./xz_1 ===
linux-6.7.5.tar.xz (1/1)
  100 % 134.9 MiB / 1,386.4 MiB = 0.097   108 MiB/s   0:12
linux-6.7.5.tar.xz (1/1)
  100 % 134.9 MiB / 1,386.4 MiB = 0.097   108 MiB/s   0:12
linux-6.7.5.tar.xz (1/1)
  100 % 134.9 MiB / 1,386.4 MiB = 0.097   108 MiB/s   0:12

=== ./xz_3 ===
linux-6.7.5.tar.xz (1/1)
  100 % 134.9 MiB / 1,386.4 MiB = 0.097   109 MiB/s   0:12
linux-6.7.5.tar.xz (1/1)
  100 % 134.9 MiB / 1,386.4 MiB = 0.097   109 MiB/s   0:12
linux-6.7.5.tar.xz (1/1)
  100 % 134.9 MiB / 1,386.4 MiB = 0.097   109 MiB/s   0:12

=== ./xz_7 ===
linux-6.7.5.tar.xz (1/1)
  100 % 134.9 MiB / 1,386.4 MiB = 0.097   109 MiB/s   0:12
linux-6.7.5.tar.xz (1/1)
  100 % 134.9 MiB / 1,386.4 MiB = 0.097   109 MiB/s   0:12
linux-6.7.5.tar.xz (1/1)
  100 % 134.9 MiB / 1,386.4 MiB = 0.097   109 MiB/s   0:12

=== ./xz_f ===
linux-6.7.5.tar.xz (1/1)
  100 % 134.9 MiB / 1,386.4 MiB = 0.097   107 MiB/s   0:12
linux-6.7.5.tar.xz (1/1)
  100 % 134.9 MiB / 1,386.4 MiB = 0.097   107 MiB/s   0:12
linux-6.7.5.tar.xz (1/1)
  100 % 134.9 MiB / 1,386.4 MiB = 0.097   107 MiB/s   0:12



RiscV (HiFive Unmatched)
=== xz 5.4.5 ===
linux-6.7.5.tar.xz (1/1)
  100 % 134,9 MiB / 1.386,4 MiB = 0,097    30 MiB/s   0:45
linux-6.7.5.tar.xz (1/1)
  100 % 134,9 MiB / 1.386,4 MiB = 0,097    30 MiB/s   0:46
linux-6.7.5.tar.xz (1/1)
  100 % 134,9 MiB / 1.386,4 MiB = 0,097    30 MiB/s   0:45

=== ./xz_0 ===
linux-6.7.5.tar.xz (1/1)
  100 % 134,9 MiB / 1.386,4 MiB = 0,097    32 MiB/s   0:43
linux-6.7.5.tar.xz (1/1)
  100 % 134,9 MiB / 1.386,4 MiB = 0,097    32 MiB/s   0:43
linux-6.7.5.tar.xz (1/1)
  100 % 134,9 MiB / 1.386,4 MiB = 0,097    32 MiB/s   0:43

=== ./xz_1 ===
linux-6.7.5.tar.xz (1/1)
  100 % 134,9 MiB / 1.386,4 MiB = 0,097    31 MiB/s   0:44
linux-6.7.5.tar.xz (1/1)
  100 % 134,9 MiB / 1.386,4 MiB = 0,097    31 MiB/s   0:44
linux-6.7.5.tar.xz (1/1)
  100 % 134,9 MiB / 1.386,4 MiB = 0,097    31 MiB/s   0:44

=== ./xz_3 ===
linux-6.7.5.tar.xz (1/1)
  100 % 134,9 MiB / 1.386,4 MiB = 0,097    30 MiB/s   0:45
linux-6.7.5.tar.xz (1/1)
  100 % 134,9 MiB / 1.386,4 MiB = 0,097    31 MiB/s   0:45
linux-6.7.5.tar.xz (1/1)
  100 % 134,9 MiB / 1.386,4 MiB = 0,097    31 MiB/s   0:45

=== ./xz_7 ===
linux-6.7.5.tar.xz (1/1)
  100 % 134,9 MiB / 1.386,4 MiB = 0,097    31 MiB/s   0:45
linux-6.7.5.tar.xz (1/1)
  100 % 134,9 MiB / 1.386,4 MiB = 0,097    31 MiB/s   0:44
linux-6.7.5.tar.xz (1/1)
  100 % 134,9 MiB / 1.386,4 MiB = 0,097    31 MiB/s   0:44

=== ./xz_f ===
linux-6.7.5.tar.xz (1/1)
  100 % 134,9 MiB / 1.386,4 MiB = 0,097    30 MiB/s   0:46
linux-6.7.5.tar.xz (1/1)
  100 % 134,9 MiB / 1.386,4 MiB = 0,097    30 MiB/s   0:45
linux-6.7.5.tar.xz (1/1)
  100 % 134,9 MiB / 1.386,4 MiB = 0,097    30 MiB/s   0:45


Based on this, it looks like the `0' variant is the best one. Is my test
too simple to cover "everything / a wide range of decodings"?

Sebastian



[xz-devel] XZ Utils 5.5.2beta

2024-02-14 Thread Jia Tan
This is the first release made under the 0BSD license. Please let us
know if there are any concerns about the license change. We are looking
forward to releasing 5.6.0 later this month!

XZ Utils 5.5.2beta is available at .

Here is an extract from the NEWS file:

5.5.2beta (2024-02-14)

* Licensing change: The core components are now under the
  BSD Zero Clause License (0BSD). In XZ Utils 5.4.6 and older
  and 5.5.1alpha these components are in the public domain and
  obviously remain so; the change affects the new releases only.

  0BSD is an extremely permissive license which doesn't require
  retaining or reproducing copyright or license notices when
  distributing the code, thus in practice there is extremely
  little difference to public domain.

* liblzma

- Significant speed optimizations to the LZMA decoder were
  made. There are now three variants that can be chosen at
  build time:

* Basic C version: This is a few percent faster than
  5.4.x due to some new optimizations.

* Branchless C: This is currently the default on platforms
  for which there is no assembly code. This should be a few
  percent faster than the basic C version.

* x86-64 inline assembly. This works with GCC and Clang.

  The default choice can currently be overridden by setting
  LZMA_RANGE_DECODER_CONFIG in CPPFLAGS: 0 means the basic
  version and 3 means the branchless C version.

- Optimized the CRC32 calculation on ARM64 platforms using the
  CRC32 instructions. The instructions are optional in ARMv8.0
  and are required in ARMv8.1 and later. Runtime detection for
  the instruction is used on GNU/Linux, FreeBSD, Windows, and
  macOS. If the compiler flags indicate unconditional CRC32
  instruction support (+crc) then the generic version is not
  built.

* Added lz4 support to xzdiff/xzcmp and xzgrep.

* Man pages of xzdiff/xzcmp, xzgrep, and xzmore were rewritten
  to simplify licensing of the man page translations.

* Translations:

- Updated Chinese (simplified), German, Korean, Polish,
  Romanian, Spanish, Swedish, and Ukrainian translations.

- Updated German, Korean, Romanian, and Ukrainian man page
  translations.

* Small improvements to the tests.

* Added doc/examples/11_file_info.c. It was added to the Git
  repository in 2017 but was never included in the distribution
  tarballs.

* Removed doc/examples_old. These were from 2012.

* Removed the macos/build.sh script. It had not been updated
  since 2013.

---
Jia Tan
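
To illustrate what "branchless" means for the decoder variants in the
NEWS extract above, here is a minimal sketch of decoding one bit with an
LZMA-style range decoder, written once with a branch and once without.
It is not liblzma's actual code: the names rc_state, rc_bit_branchy and
rc_bit_branchless are made up for this example, and normalization
(refilling range and code from the input) is left out.

```c
#include <stdint.h>

/* Hypothetical, simplified range decoder state (no input handling). */
typedef struct {
    uint32_t range;
    uint32_t code;
} rc_state;

/* Decode one bit with an if/else. prob is an 11-bit probability. */
static unsigned rc_bit_branchy(rc_state *rc, uint16_t *prob)
{
    uint32_t bound = (rc->range >> 11) * (*prob);

    if (rc->code < bound) {
        rc->range = bound;
        *prob = (uint16_t)(*prob + ((2048 - *prob) >> 5));
        return 0;
    }

    rc->range -= bound;
    rc->code -= bound;
    *prob = (uint16_t)(*prob - (*prob >> 5));
    return 1;
}

/* Same state update, but both outcomes are computed and one is
 * selected with a mask instead of a conditional jump. */
static unsigned rc_bit_branchless(rc_state *rc, uint16_t *prob)
{
    uint32_t p = *prob;
    uint32_t bound = (rc->range >> 11) * p;
    uint32_t bit = rc->code >= bound;   /* 0 or 1 */
    uint32_t mask = 0u - bit;           /* 0 or 0xFFFFFFFF */

    rc->range = (bound & ~mask) | ((rc->range - bound) & mask);
    rc->code -= bound & mask;

    uint32_t up = p + ((2048 - p) >> 5);    /* new prob if bit == 0 */
    uint32_t down = p - (p >> 5);           /* new prob if bit == 1 */
    *prob = (uint16_t)((up & ~mask) | (down & mask));

    return bit;
}
```

Both functions leave the state and the probability in the same
condition for the same input. Whether the branchless form is actually
faster depends on the CPU and the compiler.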



[xz-devel] XZ projects license change proposal

2024-02-08 Thread Lasse Collin
Hello!

I have made a post on GitHub about possibly moving from public domain
to BSD Zero Clause License:

https://github.com/tukaani-project/xz/issues/79

Feedback is welcome. Feel free to comment on GitHub, privately via email
to x...@tukaani.org, or on the xz-devel mailing list.

Thank you!

PS. XZ for Java has been idle longer than expected but it should
finally get at least some attention in the coming months.

-- 
Lasse Collin



[xz-devel] XZ Utils 5.4.6, 5.5.1alpha, and website changes

2024-01-26 Thread Jia Tan
The XZ specific content has been moved from 
to . The old links will be kept working via
redirections.

Additionally, the official XZ Embedded Git repository is now on GitHub
. The repository on
git.tukaani.org will be maintained as a mirror with some delay.

XZ Utils 5.4.6 and 5.5.1alpha are available at
.

Here is an extract from the NEWS file:

5.4.6 (2024-01-26)

* Fixed a bug involving internal function pointers in liblzma not
  being initialized to NULL. The bug can only be triggered if
  lzma_filters_update() is called on a LZMA1 encoder, so it does
  not affect xz or any application known to us that uses liblzma.

* xz:

- Fixed a regression introduced in 5.4.2 that caused encoding
  in the raw format to unnecessarily fail if --suffix was not
  used. For instance, the following command no longer reports
  that --suffix must be used:

  echo foo | xz --format=raw --lzma2 | wc -c

- Fixed an issue on MinGW-w64 builds that prevented reading
  from or writing to non-terminal character devices like NUL.

* Added a new test.


5.5.1alpha (2024-01-26)

* Added a new filter for RISC-V binaries. The filter can be used
  for 32-bit and 64-bit binaries with either little or big
  endianness. In liblzma, the Filter ID is LZMA_FILTER_RISCV (0x0B)
  and the xz option is --riscv. liblzma filter string syntax
  recognizes this filter as "riscv".

* liblzma:

- Added lzma_mt_block_size() to recommend a Block size for
  multithreaded encoding.

- Added CLMUL-based CRC32 on x86-64 and E2K with runtime
  processor detection. Similar to CRC64, on 32-bit x86 it
  isn't available unless --disable-assembler is used.

- Implemented GNU indirect function (IFUNC) as a runtime
  function dispatching method for CRC32 and CRC64 fast
  implementations on x86. Only GNU/Linux (glibc) and FreeBSD
  builds will use IFUNC, unless --enable-ifunc is specified to
  configure.

- Added definitions of mask values like
  LZMA_INDEX_CHECK_MASK_CRC32 to .

- The XZ logo is now included in the Doxygen generated
  documentation. It is licensed under Creative Commons
  Attribution-ShareAlike 4.0.

* xz:

- Multithreaded mode is now the default. This improves
  compression speed and creates .xz files that can be
  decompressed multithreaded at the cost of increased memory
  usage and slightly worse compression ratio.

- Added new command line option --filters to set the filter
  chain using liblzma filter string syntax.

- Added new command line options --filters1 ... --filters9 to
  set additional filter chains using liblzma filter string
  syntax. The --block-list option now allows specifying filter
  chains that were set using these new options.

- Added support for Linux Landlock as a sandboxing method.

- xzdec now supports pledge(2), Capsicum, and Linux Landlock as
  sandboxing methods.

- Progress indicator time stats remain accurate after pausing
  xz with SIGTSTP.

- Ported xz and xzdec to Windows MSVC. Visual Studio 2015 or
  later is required.

* CMake Build:

- Supports pledge(2), Capsicum, and Linux Landlock sandboxing
  methods.

- Replacement functions for getopt_long() are used on platforms
  that do not have it.

* Enabled unaligned access by default on PowerPC64LE and on RISC-V
  targets that define __riscv_misaligned_fast.

* Tests:

- Added two new fuzz targets to OSS-Fuzz.

- Implemented Continuous Integration (CI) testing using
  GitHub Actions.

* Changed quoting style from `...' to '...' in all messages,
  scripts, and documentation.

* Added basic Codespell support to help catch typo errors.

---
Jia Tan



[xz-devel] XZ Utils 5.4.5

2023-11-01 Thread Jia Tan
XZ Utils 5.4.5 is available at  and
.

Here is an extract from the NEWS file:

5.4.5 (2023-11-01)

* liblzma:

- Use __attribute__((__no_sanitize_address__)) to avoid address
  sanitization with CRC64 CLMUL. It uses 16-byte-aligned reads
  which can extend past the bounds of the input buffer and
  inherently trigger address sanitization errors. This isn't
  a bug.

- Fixed an assertion failure that could be triggered by a large
  unpadded_size argument. It was verified that there was no
  other bug than the assertion failure.

- Fixed a bug that prevented building with Windows Vista
  threading when __attribute__((__constructor__)) is not
  supported.

* xz now properly handles special files such as "con" or "nul" on
  Windows. Before this fix, the following wrote "foo" to the
  console and deleted the input file "con_xz":

  echo foo | xz > con_xz
  xz --suffix=_xz --decompress con_xz

* Build systems:

- Allow builds with Windows win95 threading and small mode when
  __attribute__((__constructor__)) is supported.

- Added a new line to liblzma.pc for MSYS2 (Windows):

  Cflags.private: -DLZMA_API_STATIC

  When compiling code that will link against static liblzma,
  the LZMA_API_STATIC macro needs to be defined on Windows.

- CMake specific changes:

* Fixed a bug that allowed CLOCK_MONOTONIC to be used even
  if the check for it failed.

* Fixed a bug where configuring CMake multiple times
  resulted in HAVE_CLOCK_GETTIME and HAVE_CLOCK_MONOTONIC
  not being set.

* Fixed the build with MinGW-w64-based Clang/LLVM 17.
  llvm-windres now has more accurate GNU windres emulation
  so the GNU windres workaround from 5.4.1 is needed with
  llvm-windres version 17 too.

* The import library on Windows is now properly named
  "liblzma.dll.a" instead of "libliblzma.dll.a".

* Fixed a bug causing the Ninja Generator to fail on
  UNIX-like systems. This bug was introduced in 5.4.0.

* Added a new option to disable CLMUL CRC64.

* A module-definition (.def) file is now created when
  building liblzma.dll with MinGW-w64.

* The pkg-config liblzma.pc file is now installed on all
  builds except when using MSVC on Windows.

* Added large file support by default for platforms that
  need it to handle files larger than 2 GiB. This includes
  MinGW-w64, even 64-bit builds.

* Small fixes and improvements to the tests.

* Updated translations: Chinese (simplified) and Esperanto.



[xz-devel] Re: [xz-devel] [PATCH] [xz-embedded] Fix condition that automatically define XZ_DEC_BCJ

2023-09-08 Thread Jules Maselbas



On September 8, 2023 3:05:59 PM GMT+02:00, Lasse Collin wrote:
> On 2023-09-07 Jules Maselbas wrote:
> > The XZ_DEC_BCJ macro was not defined when only selecting the ARM64 BCJ
> > decoder, leading to no BCJ decoder being compiled.
> > 
> > The macro that selects XZ_DEC_BCJ if any of the BCJ decoders is
> > selected was missing a case for the recently added ARM64 BCJ decoder.
> > 
> > Also the macro `defined(XZ_DEC_ARM)` was used twice in the condition
> > for selecting XZ_DEC_BCJ, so this patch replaces one with
> > XZ_DEC_ARM64.
> 
> Thanks! I kept the ordering of the filter names the same as elsewhere
> in the file and in xz_dec_bcj.c.
OK great

> The ARM64 filter still hasn't been submitted to Linux but it's on the
> to-do list.
cool, i've just pushed the ARM64 filter to the barebox bootloader, see:
https://lore.barebox.org/barebox/b4b0c086-ee4a-4cbb-85fb-3499f8de4...@zdiv.net/T/#t




Re: [xz-devel] [PATCH] [xz-embedded] Fix condition that automatically define XZ_DEC_BCJ

2023-09-08 Thread Lasse Collin
On 2023-09-07 Jules Maselbas wrote:
> The XZ_DEC_BCJ macro was not defined when only selecting the ARM64 BCJ
> decoder, leading to no BCJ decoder being compiled.
> 
> The macro that selects XZ_DEC_BCJ if any of the BCJ decoders is
> selected was missing a case for the recently added ARM64 BCJ decoder.
> 
> Also the macro `defined(XZ_DEC_ARM)` was used twice in the condition
> for selecting XZ_DEC_BCJ, so this patch replaces one with
> XZ_DEC_ARM64.

Thanks! I kept the ordering of the filter names the same as elsewhere
in the file and in xz_dec_bcj.c.

The ARM64 filter still hasn't been submitted to Linux but it's on the
to-do list.

-- 
Lasse Collin



[xz-devel] [PATCH] [xz-embedded] Fix condition that automatically define XZ_DEC_BCJ

2023-09-07 Thread Jules Maselbas
The XZ_DEC_BCJ macro was not defined when only selecting the ARM64 BCJ
decoder, leading to no BCJ decoder being compiled.

The macro that selects XZ_DEC_BCJ if any of the BCJ decoders is selected was
missing a case for the recently added ARM64 BCJ decoder.

Also the macro `defined(XZ_DEC_ARM)` was used twice in the condition for
selecting XZ_DEC_BCJ, so this patch replaces one with XZ_DEC_ARM64.

Signed-off-by: Jules Maselbas 
---
 linux/lib/xz/xz_private.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/linux/lib/xz/xz_private.h b/linux/lib/xz/xz_private.h
index e3bba7b..94b0350 100644
--- a/linux/lib/xz/xz_private.h
+++ b/linux/lib/xz/xz_private.h
@@ -102,7 +102,7 @@
 #ifndef XZ_DEC_BCJ
 #  if defined(XZ_DEC_X86) || defined(XZ_DEC_POWERPC) \
     || defined(XZ_DEC_IA64) || defined(XZ_DEC_ARM) \
-    || defined(XZ_DEC_ARM) || defined(XZ_DEC_ARMTHUMB) \
+    || defined(XZ_DEC_ARM64) || defined(XZ_DEC_ARMTHUMB) \
     || defined(XZ_DEC_SPARC)
 #  define XZ_DEC_BCJ
 #  endif
-- 
2.42.0
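
A quick way to see the effect of the fix is to copy the corrected
condition into a standalone file, enable only XZ_DEC_ARM64, and check
whether XZ_DEC_BCJ ends up defined. The sketch below does that. The
helper bcj_enabled() is hypothetical and exists only for this check; it
is not part of xz_private.h.

```c
/* Simulate a configuration that selects only the ARM64 BCJ decoder. */
#define XZ_DEC_ARM64

/* The condition as it reads after the patch. */
#ifndef XZ_DEC_BCJ
#  if defined(XZ_DEC_X86) || defined(XZ_DEC_POWERPC) \
      || defined(XZ_DEC_IA64) || defined(XZ_DEC_ARM) \
      || defined(XZ_DEC_ARM64) || defined(XZ_DEC_ARMTHUMB) \
      || defined(XZ_DEC_SPARC)
#    define XZ_DEC_BCJ
#  endif
#endif

/* Hypothetical helper: 1 if the BCJ code would be compiled in. */
static int bcj_enabled(void)
{
#ifdef XZ_DEC_BCJ
    return 1;
#else
    return 0;
#endif
}
```

With the old condition, which tested XZ_DEC_ARM twice and never tested
XZ_DEC_ARM64, the same configuration would leave XZ_DEC_BCJ undefined.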




[xz-devel] XZ Utils 5.4.4

2023-08-02 Thread Jia Tan
XZ Utils 5.4.4 is available at  and
.

Here is an extract from the NEWS file:

5.4.4 (2023-08-02)

* liblzma and xzdec can now build against WASI SDK when threading
  support is disabled. xz and tests don't build yet.

* CMake:

- Fixed a bug preventing other projects from including liblzma
  multiple times using find_package().

- Don't create broken symlinks in Cygwin and MSYS2 unless
  supported by the environment. This prevented building for the
  default MSYS2 environment. The problem was introduced in
  xz 5.4.0.

* Documentation:

- Small improvements to man pages.

- Small improvements and typo fixes for liblzma API
  documentation.

* Tests:

- Added a new section to INSTALL to describe basic test usage
  and address recent questions about building the tests when
  cross compiling.

- Small fixes and improvements to the tests.

* Translations:

- Fixed a mistake that caused one of the error messages to not
  be translated. This only affected versions 5.4.2 and 5.4.3.

- Updated the Chinese (simplified), Croatian, Esperanto, German,
  Korean, Polish, Romanian, Spanish, Swedish, Ukrainian, and
  Vietnamese translations.

- Updated the German, Korean, Romanian, and Ukrainian man page
  translations.



[xz-devel] XZ Utils 5.2.12 and 5.4.3

2023-05-04 Thread Jia Tan
XZ Utils 5.2.12 and 5.4.3 are available at .

The Doxygen-generated liblzma API documentation is available at
.

This is the last planned release for the 5.2 branch. Patches or new
releases may still be made if severe bugs are found.

The OpenPGP signing key used for this and future releases has changed
to my key (). The fingerprint is:

22D4 65F2 B4C1 7380 3B20  C6DE 59FC F207 FEA7 F445

The new key is available at https://tukaani.org/misc/jia_tan_pubkey.txt

Here is an extract from the NEWS file:

5.2.12 (2023-05-04)

* Fixed a build system bug that prevented building liblzma as a
  shared library when configured with --disable-threads. This bug
  affected releases 5.2.6 to 5.2.11 and 5.4.0 to 5.4.2.

* Include  for Windows intrinsic functions where they are
  needed. This fixed a bug that prevented building liblzma using
  clang-cl on Windows.

* Minor update to the Croatian translation. The small change
  applies to a string in both 5.2 and 5.4 branches.

5.4.3 (2023-05-04)

* All fixes from 5.2.12

* Features in the CMake build can now be disabled as CMake cache
  variables, similar to the Autotools build.

* Minor update to the Croatian translation.



[xz-devel] XZ Utils 5.2.11 and 5.4.2

2023-03-18 Thread Lasse Collin
XZ Utils 5.2.11 and 5.4.2 are available at .

The Doxygen-generated liblzma API documentation is now available online
at .

Please let us know if there is interest in more releases for the
5.2 branch. Jia Tan and I will plan further bug-fix releases for this
branch only if people will use it.

Future release tarballs might be signed by Jia Tan. Recently he has done
most of the work in XZ Utils. :-)

Here is an extract from the NEWS file:

5.2.11 (2023-03-18)

* Removed all possible cases of null pointer + 0. It is undefined
  behavior in C99 and C17. This was detected by a sanitizer and had
  not caused any known issues.

* Build systems:

- Added a workaround for building with GCC on MicroBlaze Linux.
  GCC 12 on MicroBlaze doesn't support the __symver__ attribute
  even though __has_attribute(__symver__) returns true. The
  build is now done without the extra RHEL/CentOS 7 symbols
  that were added in XZ Utils 5.2.7. The workaround only
  applies to the Autotools build (not CMake).

- CMake: Ensure that the C compiler language is set to C99 or
  a newer standard.

- CMake changes from XZ Utils 5.4.1:

* Added a workaround for a build failure with
  windres from GNU binutils.

* Included the Windows resource files in the xz
  and xzdec build rules.

5.4.2 (2023-03-18)

* All fixes from 5.2.11 that were not included in 5.4.1.

* If xz is built with support for the Capsicum sandbox but running
  in an environment that doesn't support Capsicum, xz now runs
  normally without sandboxing instead of exiting with an error.

* liblzma:

- Documentation was updated to improve the style, consistency,
  and completeness of the liblzma API headers.

- The Doxygen-generated HTML documentation for the liblzma API
  header files is now included in the source release and is
  installed as part of "make install". All JavaScript is
  removed to simplify license compliance and to reduce the
  install size.

- Fixed a minor bug in lzma_str_from_filters() that produced
  too many filters in the output string instead of reporting
  an error if the input array had more than four filters. This
  bug did not affect xz.

* Build systems:

- autogen.sh now invokes the doxygen tool via the new wrapper
  script doxygen/update-doxygen, unless the command line option
  --no-doxygen is used.

- Added microlzma_encoder.c and microlzma_decoder.c to the
  VS project files for Windows and to the CMake build. These
  should have been included in 5.3.2alpha.

* Tests:

- Added a test to the CMake build that was forgotten in the
  previous release.

- Added and refactored a few tests.

* Translations:

- Updated the Brazilian Portuguese translation.

- Added Brazilian Portuguese man page translation.

-- 
Lasse Collin
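
For readers unfamiliar with the "null pointer + 0" item in the NEWS
extract above: an empty buffer is often passed around as (NULL, 0), and
adding 0 to such a pointer is still pointer arithmetic on a null
pointer. A minimal sketch of one way to avoid it follows. The helper
ex_advance() is hypothetical and is not how liblzma was changed; it
only shows the general idea of skipping the arithmetic when the offset
is zero.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: return buf advanced by n bytes.
 * When n == 0 no pointer arithmetic is done, so buf may be NULL.
 * For n > 0 the caller must pass a valid buffer of at least n bytes.
 */
static const uint8_t *ex_advance(const uint8_t *buf, size_t n)
{
    return n == 0 ? buf : buf + n;
}
```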



[xz-devel] XZ Utils 5.4.1, xz.git on GitHub

2023-01-11 Thread Jia Tan
The primary Git repository of XZ Utils is now on
. It is also an alternative
method for sending bug reports, patches, and general discussion. The
old repository at  will be maintained
as a mirror and will be updated with some delay.

The forum on Sourceforge.net is no longer used.

The IRC channel #tukaani on Libera Chat has been a little more active
recently. :-)

XZ Utils 5.4.1 is available at . Here is an
extract from the NEWS file:

5.4.1 (2023-01-11)

* liblzma:

- Fixed the return value of lzma_microlzma_encoder() if the
  LZMA options lc/lp/pb are invalid. Invalid lc/lp/pb options
  made the function return LZMA_STREAM_END without encoding
  anything instead of returning LZMA_OPTIONS_ERROR.

- Windows / Visual Studio: Work around a possible compiler bug
  when targeting 32-bit x86 and compiling the CLMUL version of
  the CRC64 code. The CLMUL code isn't enabled by the Windows
  project files but it is in the CMake-based builds.

* Build systems:

- Windows-specific CMake changes:

* Don't try to enable CLMUL CRC64 code if _mm_set_epi64x()
  isn't available. This fixes CMake-based build with Visual
  Studio 2013.

* Created a workaround for a build failure with windres
  from GNU binutils. It is used only when the C compiler
  is GCC (not Clang). The workaround is incompatible
  with llvm-windres, resulting in "XZ\x20Utils" instead
  of "XZ Utils" in the resource file, but without the
  workaround llvm-windres works correctly. See the
  comment in CMakeLists.txt for details.

* Included the resource files in the xz and xzdec build
  rules. Building the command line tools is still
  experimental but possible with MinGW-w64.

- Visual Studio: Added stream_decoder_mt.c to the project
  files. Now the threaded decompressor lzma_stream_decoder_mt()
  gets built. CMake-based build wasn't affected.

- Updated windows/INSTALL-MSVC.txt to mention that CMake-based
  build is now the preferred method with Visual Studio. The
  project files will probably be removed after 5.4.x releases.

- Changes to #defines in config.h:

* HAVE_DECL_CLOCK_MONOTONIC was replaced by
  HAVE_CLOCK_MONOTONIC. The old macro was always defined
  in configure-generated config.h to either 0 or 1. The
  new macro is defined (to 1) only if the declaration of
  CLOCK_MONOTONIC is available. This matches the way most
  other config.h macros work and makes things simpler with
  other build systems.

* HAVE_DECL_PROGRAM_INVOCATION_NAME was replaced by
  HAVE_PROGRAM_INVOCATION_NAME for the same reason.

* Tests:

- Fixed test script compatibility with ancient /bin/sh
  versions. Now the five test_compress_* tests should
  no longer fail on Solaris 10.

- Added and refactored a few tests.

* Translations:

- Updated the Catalan and Esperanto translations.

- Added Korean and Ukrainian man page translations.
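
The config.h change described above (HAVE_DECL_CLOCK_MONOTONIC replaced
by HAVE_CLOCK_MONOTONIC) is about how the macro has to be tested. The
sketch below simulates a system without CLOCK_MONOTONIC and shows why
the always-defined 0/1 style is easy to get wrong. The three function
names are made up for this example.

```c
/* Old style: the macro is always defined, to either 0 or 1. */
#define HAVE_DECL_CLOCK_MONOTONIC 0

/* New style: the macro is defined (to 1) only when the feature is
 * available, so on this simulated system it stays undefined. */
/* #define HAVE_CLOCK_MONOTONIC 1 */

/* Correct test for the old macro: look at its value. */
static int old_macro_with_if(void)
{
#if HAVE_DECL_CLOCK_MONOTONIC
    return 1;
#else
    return 0;
#endif
}

/* Mistaken test for the old macro: it is defined even when 0. */
static int old_macro_with_ifdef(void)
{
#ifdef HAVE_DECL_CLOCK_MONOTONIC
    return 1;
#else
    return 0;
#endif
}

/* The new macro works with a plain #ifdef, like most config.h macros. */
static int new_macro_with_ifdef(void)
{
#ifdef HAVE_CLOCK_MONOTONIC
    return 1;
#else
    return 0;
#endif
}
```

Here old_macro_with_ifdef() wrongly reports the feature as present,
while the other two report it as missing.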



[xz-devel] XZ Utils 5.2.10 and 5.4.0

2022-12-13 Thread Lasse Collin
XZ Utils 5.2.10 and 5.4.0 are available at .

The old stable branch 5.2.x will be maintained for a while so that
those who don't want to move to a new stable release immediately can
still get bug fixes.

Here is an extract from the NEWS file:

5.2.10 (2022-12-13)

* xz: Don't modify argv[] when parsing the --memlimit* and
  --block-list command line options. This fixes confusing
  arguments in process listing (like "ps auxf").

* GNU/Linux only: Use __has_attribute(__symver__) to detect if
  that attribute is supported. This fixes build on Mandriva where
  Clang is patched to define __GNUC__ to 11 by default (instead
  of 4 as used by Clang upstream).


5.4.0 (2022-12-13)

This bumps the minor version of liblzma because new features were
added. The API and ABI are still backward compatible with liblzma
5.2.x and 5.0.x.

Since 5.3.5beta:

* All fixes from 5.2.10.

* The ARM64 filter is now stable. The xz option is now --arm64.
  Decompression requires XZ Utils 5.4.0. In the future the ARM64
  filter will be supported by XZ for Java, XZ Embedded (including
  the version in Linux), LZMA SDK, and 7-Zip.

* Translations:

- Updated Catalan, Croatian, German, Romanian, and Turkish
  translations.

- Updated German man page translations.

- Added Romanian man page translations.

Summary of new features added in the 5.3.x development releases:

* liblzma:

- Added threaded .xz decompressor lzma_stream_decoder_mt().
  It can use multiple threads with .xz files that have multiple
  Blocks with size information in Block Headers. The threaded
  encoder in xz has always created such files.

  Single-threaded encoder cannot store the size information in
  Block Headers even if one used LZMA_FULL_FLUSH to create
  multiple Blocks, so this threaded decoder cannot use multiple
  threads with such files.

  If there are multiple Streams (concatenated .xz files), one
  Stream will be decompressed completely before starting the
  next Stream.

- A new decoder flag LZMA_FAIL_FAST was added. It makes the
  threaded decompressor report errors soon instead of first
  flushing all pending data before the error location.

- New Filter IDs:
* LZMA_FILTER_ARM64 is for ARM64 binaries.
* LZMA_FILTER_LZMA1EXT is for raw LZMA1 streams that don't
  necessarily use the end marker.

- Added lzma_str_to_filters(), lzma_str_from_filters(), and
  lzma_str_list_filters() to convert a preset or a filter chain
  string to a lzma_filter[] and vice versa. These should make
  it easier to write applications that allow users to specify
  custom compression options.

- Added lzma_filters_free() which can be convenient for freeing
  the filter options in a filter chain (an array of lzma_filter
  structures).

- lzma_file_info_decoder() makes it a little easier to get
  the Index field from .xz files. This helps in getting the
  uncompressed file size, but an easy-to-use random access
  API, which has existed in XZ for Java for a long time, is
  still missing.

- Added lzma_microlzma_encoder() and lzma_microlzma_decoder().
  It is used by erofs-utils and may be used by others too.

  The MicroLZMA format is a raw LZMA stream (without end marker)
  whose first byte (always 0x00) has been replaced with
  bitwise-negation of the LZMA properties (lc/lp/pb). It was
  created for use in EROFS but may be used in other contexts
  as well where it is important to avoid wasting bytes for
  stream headers or footers. The format is also supported by
  XZ Embedded (the XZ Embedded version in Linux got MicroLZMA
  support in Linux 5.16).

  The MicroLZMA encoder API in liblzma can compress into a
  fixed-sized output buffer so that as much data is compressed
  as can be fit into the buffer while still creating a valid
  MicroLZMA stream. This is needed for EROFS.

- Added lzma_lzip_decoder() to decompress the .lz (lzip) file
  format version 0 and the original unextended version 1 files.
  Also lzma_auto_decoder() supports .lz files.

- lzma_filters_update() can now be used with the multi-threaded
  encoder (lzma_stream_encoder_mt()) to change the filter chain
  after LZMA_FULL_BARRIER or LZMA_FULL_FLUSH.

- In lzma_options_lzma, allow nice_len = 2 and 3 with the match
  finders that require at least 3 or 4. Now it is internally
  rounded up if needed.

- CLMUL-based CRC64 on x86-64 and E2K with runtime processor
  detection. On 32-bit x86 it currently isn't available unless
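
Going back to the MicroLZMA description above, the first byte of a
MicroLZMA stream can be sketched as follows. This assumes the usual
LZMA properties byte layout, (pb * 5 + lp) * 9 + lc, as used in the
.lzma header. All function names are made up for this example and are
not liblzma API.

```c
#include <stdint.h>

/* LZMA properties byte: (pb * 5 + lp) * 9 + lc. */
static uint8_t ex_props_byte(unsigned lc, unsigned lp, unsigned pb)
{
    return (uint8_t)((pb * 5 + lp) * 9 + lc);
}

/* First byte of a MicroLZMA stream: bitwise negation of the props. */
static uint8_t ex_microlzma_first_byte(unsigned lc, unsigned lp, unsigned pb)
{
    return (uint8_t)~ex_props_byte(lc, lp, pb);
}

/* Recover lc/lp/pb from the first byte. Returns 0 on success and
 * -1 if the byte cannot encode lc <= 8, lp <= 4, pb <= 4. */
static int ex_microlzma_props_decode(uint8_t first_byte,
        unsigned *lc, unsigned *lp, unsigned *pb)
{
    unsigned props = (uint8_t)~first_byte;

    if (props > (4 * 5 + 4) * 9 + 8)
        return -1;

    *pb = props / (9 * 5);
    props -= *pb * 9 * 5;
    *lp = props / 9;
    *lc = props - *lp * 9;
    return 0;
}
```

With lc=3, lp=0, pb=2 the properties byte is 0x5D, so the stream would
start with 0xA2.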
  

[xz-devel] XZ Utils 5.3.5beta

2022-12-01 Thread Lasse Collin
There were technical issues on the tukaani.org website in the past 24 hours. 
These should have now been fixed. Sorry for the inconvenience.

XZ Utils 5.3.5beta is available at . Here is an
extract from the NEWS file:

5.3.5beta (2022-12-01)

* All fixes from 5.2.9.

* liblzma:

- Added new LZMA_FILTER_LZMA1EXT for raw encoder and decoder to
  handle raw LZMA1 streams that don't have end of payload marker
  (EOPM) alias end of stream (EOS) marker. It can be used in
  filter chains, for example, with the x86 BCJ filter.

- Added lzma_str_to_filters(), lzma_str_from_filters(), and
  lzma_str_list_filters() to make it easier for applications
  to get custom compression options from a user and convert
  it to an array of lzma_filter structures.

- Added lzma_filters_free().

- lzma_filters_update() can now be used with the multi-threaded
  encoder (lzma_stream_encoder_mt()) to change the filter chain
  after LZMA_FULL_BARRIER or LZMA_FULL_FLUSH.

- In lzma_options_lzma, allow nice_len = 2 and 3 with the match
  finders that require at least 3 or 4. Now it is internally
  rounded up if needed.

- ARM64 filter was modified. It is still experimental.

- Fixed LTO build with Clang if -fgnuc-version=10 or similar
  was used to make Clang look like GCC >= 10. Now it uses
  __has_attribute(__symver__) which should be reliable.

* xz:

- --threads=+1 or -T+1 is now a way to put xz into multi-threaded
  mode while using only one worker thread.

- In --lzma2=nice=NUMBER allow 2 and 3 with all match finders
  now that liblzma handles it.

* Updated translations: Chinese (simplified), Korean, and Turkish.

-- 
Lasse Collin



[xz-devel] XZ Utils 5.2.9

2022-11-30 Thread Lasse Collin
XZ Utils 5.2.9 is available at . Here is an
extract from the NEWS file:

5.2.9 (2022-11-30)

* liblzma:

- Fixed an infinite loop in LZMA encoder initialization
  if dict_size >= 2 GiB. (The encoder only supports up
  to 1536 MiB.)

- Fixed two cases of invalid free() that can happen if
  a tiny allocation fails in encoder re-initialization
  or in lzma_filters_update(). These bugs had some
  similarities with the bug fixed in 5.2.7.

- Fixed lzma_block_encoder() not allowing the use of
  LZMA_SYNC_FLUSH with lzma_code() even though it was
  documented to be supported. The sync-flush code in
  the Block encoder was already used internally via
  lzma_stream_encoder(), so this was just a missing flag
  in the lzma_block_encoder() API function.

- GNU/Linux only: Don't put symbol versions into static
  liblzma as it breaks things in some cases (and even if
  it didn't break anything, symbol versions in static
  libraries are useless anyway). The downside of the fix
  is that if the configure options --with-pic or --without-pic
  are used then it's not possible to build both shared and
  static liblzma at the same time on GNU/Linux anymore;
  with those options --disable-static or --disable-shared
  must be used too.

* New email address for bug reports is  which
  forwards messages to Lasse Collin and Jia Tan.

-- 
Lasse Collin



Re: [xz-devel] [PATCH 1/2] Add support openssl's SHA256 implementation

2022-11-30 Thread Lasse Collin
On 2022-11-30 Lasse Collin wrote:
> Are there other good library options?

If the goal is to use SHA instructions on x86 then intrinsics in the C
code with runtime CPU detection are an option too. It's done in
crc64_fast.c in 5.3.4alpha already.

-- 
Lasse Collin



Re: [xz-devel] [PATCH 1/2] Add support openssl's SHA256 implementation

2022-11-30 Thread Lasse Collin
Hello!

This could be good as an optional feature, disabled by default so that
an extra dependency doesn't get added accidentally. It's too late for
5.4.0 but perhaps in 5.4.1 or .2.

The biggest problem with the patch is that it lacks error checking:

  - EVP_MD_CTX_new() can return NULL if memory allocation fails. Man
page doesn't document this but source code makes it clear.

  - EVP_get_digestbyname() can return NULL on failure. Perhaps this
could be replaced with EVP_sha256()? It seems to return a pointer
to a statically-allocated structure and man page implies that it
cannot fail.

  - EVP_DigestInit_ex(), EVP_DigestUpdate(), and EVP_DigestFinal_ex()
can in theory fail, perhaps not in practice, I don't know.

Currently it is assumed in liblzma that initialization cannot fail so
that would need to be changed. It could be good to check the return
values from EVP_DigestUpdate() and EVP_DigestFinal_ex() too. Since it
is unlikely that EVP_DigestUpdate() fails it could perhaps be OK to
store the failure code and only return it for lzma_check_finish() but
I'm not sure if that is acceptable.

The configure options perhaps should be --with instead of --enable since
it adds a dependency on another package, if one wants to stick to
Autoconf's guidelines. (It's less clear if --enable-external-sha256
should be --with since it only affects what to use from the OS base
libraries. In any case it won't be changed as it would affect
compatibility with build scripts.)

Are there other good library options? For example, Nettle's SHA-256
functions don't need any error checking but I haven't checked the
performance.

Is it a mess for distributions if a dependency of liblzma gets its
soname bumped and then liblzma needs to be rebuilt without changing its
soname? I suppose such things happen all the time but when a library is
needed by a package manager it might perhaps have extra worries.

-- 
Lasse Collin
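
The idea of storing the failure and only returning it from
lzma_check_finish() could look roughly like the sketch below. It uses a
toy backend instead of OpenSSL so that it stays self-contained: every
name starting with ex_ is hypothetical, the "digest" is a plain byte
sum rather than SHA-256, and fail_update is only a hook to simulate a
failing backend call.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an external hash library. */
typedef struct {
    uint32_t sum;       /* toy digest, not SHA-256 */
    bool fail_update;   /* hook: make updates fail */
} ex_backend;

static bool ex_backend_update(ex_backend *b, const uint8_t *buf, size_t size)
{
    if (b->fail_update)
        return false;

    for (size_t i = 0; i < size; ++i)
        b->sum += buf[i];

    return true;
}

typedef struct {
    ex_backend backend;
    bool failed;        /* sticky error flag */
} ex_check;

static void ex_check_init(ex_check *c)
{
    c->backend.sum = 0;
    c->backend.fail_update = false;
    c->failed = false;
}

/* Update cannot report errors to its caller, so a backend failure is
 * remembered and further updates are skipped. */
static void ex_check_update(ex_check *c, const uint8_t *buf, size_t size)
{
    if (!c->failed && !ex_backend_update(&c->backend, buf, size))
        c->failed = true;
}

/* Finish reports the deferred error, if any. */
static bool ex_check_finish(ex_check *c, uint32_t *out)
{
    if (c->failed)
        return false;

    *out = c->backend.sum;
    return true;
}
```

The drawback is the one raised above: the caller keeps feeding data
after the first failure and only learns about it at the very end.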



Re: [xz-devel] RHEL7 ABI patch (913ddc5) breaks linking on ia64

2022-11-24 Thread Lasse Collin
On 2022-11-23 Sebastian Andrzej Siewior wrote:
> 3x to be exact:
> - 1x shared with threads
> - 1x static with threads
> - 1x non-shared, no threads, no encoders, just xzdec.
> 
> There are three build folders in the end. The full one gets a make install,
> the others get xzdec/liblzma.a extracted.

Thanks! I remember the details now, it's excellent.

I figured out a way to make everything just work in the common case. If
--with-pic or --without-pic is used then building both shared and
static liblzma at the same time isn't possible (configure will fail).
That is, --with-pic or --without-pic requires that also
--disable-shared or --disable-static is used on GNU/Linux.

It's in xz.git now and will be in the next releases (5.2.9 is needed to
fix other bugs) so I hope any workarounds can be removed from distros
after that.

Thanks to Adrian for reporting the bug!

-- 
Lasse Collin



Re: [xz-devel] RHEL7 ABI patch (913ddc5) breaks linking on ia64

2022-11-23 Thread John Paul Adrian Glaubitz

Hi Sebastian!

On 11/23/22 21:40, John Paul Adrian Glaubitz wrote:
>> Adrian, could you please remove the -dev package from the buildd? Then
>> it should work (either way I'm going to disable the versions for static
>> builds).
>
> The chroots are regularly regenerated by a cron job using debootstrap.
>
> We don't have any particular setting to explicitly pull in liblzma-dev
> on ia64, see [1]. So, I have honestly no clue why it was installed.


It's pulled in by libunwind-dev which is AFAIK always installed when building
C/C++ code on ia64. The external libunwind library is a hard requirement on
ia64.

Adrian

--
 .''`.  John Paul Adrian Glaubitz
: :' :  Debian Developer
`. `'   Physicist
  `-GPG: 62FF 8A75 84E0 2956 9546  0006 7426 3B37 F5B5 F913




Re: [xz-devel] RHEL7 ABI patch (913ddc5) breaks linking on ia64

2022-11-23 Thread Sebastian Andrzej Siewior
On 2022-11-23 21:12:53 [+0100], To Lasse Collin wrote:
> > It is fine to build *static* liblzma with --disable-symbol-versions on
> > all archs. Debian-specific workaround is fine in the short term but
> > this should be fixed upstream. One method would be to disable the extra
> > symbols on ia64 but that is not a real fix. Perhaps it's not really
> > possible as long as the main build system is Autotools, I don't
> > currently know.
> 
> I'm not sure what others do but it might be reasonable to disable symbol
> versions for static linking/compiling since there should be no need for
> them.
> I kicked off a mariadb build on amd64 with liblzma-dev as an additional
> dependency just to see if it fails.

Just for the record: The mariadb build on amd64 with liblzma-dev
installed passed. So this was not it…

Sebastian



Re: [xz-devel] RHEL7 ABI patch (913ddc5) breaks linking on ia64

2022-11-23 Thread Sebastian Andrzej Siewior
On 2022-11-23 22:26:39 [+0200], Lasse Collin wrote:
> On 2022-11-23 John Paul Adrian Glaubitz wrote:
> > Well, Debian builds both the static and dynamic libraries in separate
> > steps, so I'm not sure whether the autotools build system would be
> > able to detect that.
> 
> I would assume the separate steps means running configure twice, once
> to disable static build and once to disable shared build.

3x to be exact:
- 1x shared with threads
- 1x static with threads
- 1x non-shared, no threads, no encoders, just xzdec.

There are three build folders in the end. The full one gets a make install,
the others get xzdec/liblzma.a extracted.

Sebastian



Re: [xz-devel] RHEL7 ABI patch (913ddc5) breaks linking on ia64

2022-11-23 Thread John Paul Adrian Glaubitz

Hi!

On 11/23/22 21:12, Sebastian Andrzej Siewior wrote:

>> Can it build against liblzma.so if liblzma.a isn't available?
>
> mariadb does not depend on liblzma-dev for building. The build log says:
> | -- The following features have been disabled:
> | * INNODB_LZMA, LZMA compression in the InnoDB storage engine
>
> The amd64 buildd has liblzma5 installed - not liblzma-dev. So it can't
> compile against it nor link statically. The ia64 buildd however has
> liblzma-dev installed so the options are there. I *think* only the
> testsuite (or whatever these few binaries were) link statically against
> it and not the software package as a whole.
>
> Adrian, could you please remove the -dev package from the buildd? Then
> it should work (either way I'm going to disable the versions for static
> builds).


The chroots are regularly regenerated by a cron job using debootstrap.

We don't have any particular setting to explicitly pull in liblzma-dev
on ia64, see [1]. So, I have honestly no clue why it was installed.

One could run

# debootstrap --foreign --no-check-gpg --arch=ia64 --variant=buildd

and see if that creates a chroot with liblzma-dev pre-installed.

Let me try a native run on yttrium.

Adrian


[1] https://salsa.debian.org/debian-ports-team/dsa-puppet


--
 .''`.  John Paul Adrian Glaubitz
: :' :  Debian Developer
`. `'   Physicist
  `-GPG: 62FF 8A75 84E0 2956 9546  0006 7426 3B37 F5B5 F913




Re: [xz-devel] RHEL7 ABI patch (913ddc5) breaks linking on ia64

2022-11-23 Thread Lasse Collin
On 2022-11-23 John Paul Adrian Glaubitz wrote:
> Well, Debian builds both the static and dynamic libraries in separate
> steps, so I'm not sure whether the autotools build system would be
> able to detect that.

I would assume the separate steps means running configure twice, once
to disable static build and once to disable shared build.

> I would make --enable-static and --enable-symbol-versions mutually
> exclusive so that the configure fails if both are enabled.

I was thinking of a slightly friendlier approach so that the combination
--disable-shared --enable-static would imply --disable-symbol-versions
on GNU/Linux (it doesn't matter elsewhere for now). It's good if people
never need to use the option *-symbol-versions. The defaults need to be
as good as easily possible. Using  --disable-symbol-versions as a
temporary workaround is fine but if it is needed in the long term then
something is broken.

-- 
Lasse Collin



Re: [xz-devel] RHEL7 ABI patch (913ddc5) breaks linking on ia64

2022-11-23 Thread John Paul Adrian Glaubitz

Hi Lasse!

On 11/23/22 20:52, Lasse Collin wrote:

> On 2022-11-23 John Paul Adrian Glaubitz wrote:
>> So, for now, we should build the static library with
>> "--disable-symbol-versions".
>
> An ugly workaround in upstream could be to make configure fail on
> GNU/Linux if both shared and static libs are about to be built. That
> is, show an error message describing that one thing has to be built at
> a time. It's not pretty but with Autotools I don't see any other way
> except dropping the RHEL/CentOS 7 compat symbols completely. Static
> libs shouldn't have symbol versions (no matter which arch), somehow it
> just doesn't always create problems.

Well, Debian builds both the static and dynamic libraries in separate steps,
so I'm not sure whether the autotools build system would be able to detect
that.

> That is, it would be mandatory to use either --disable-static or
> --disable-shared to make configure pass. Or would it be less bad to
> default to shared-only build and require the use of both
> --disable-shared --enable-static to get static build? I don't like any
> of these but I don't have better ideas.

I would make --enable-static and --enable-symbol-versions mutually exclusive
so that the configure fails if both are enabled.

Adrian

--
 .''`.  John Paul Adrian Glaubitz
: :' :  Debian Developer
`. `'   Physicist
  `-GPG: 62FF 8A75 84E0 2956 9546  0006 7426 3B37 F5B5 F913




Re: [xz-devel] RHEL7 ABI patch (913ddc5) breaks linking on ia64

2022-11-23 Thread Lasse Collin
On 2022-11-23 John Paul Adrian Glaubitz wrote:
> So, for now, we should build the static library with
> "--disable-symbol-versions".

An ugly workaround in upstream could be to make configure fail on
GNU/Linux if both shared and static libs are about to be built. That
is, show an error message describing that one thing has to be built at
a time. It's not pretty but with Autotools I don't see any other way
except dropping the RHEL/CentOS 7 compat symbols completely. Static
libs shouldn't have symbol versions (no matter which arch), somehow it
just doesn't always create problems.

That is, it would be mandatory to use either --disable-static or
--disable-shared to make configure pass. Or would it be less bad to
default to shared-only build and require the use of both
--disable-shared --enable-static to get static build? I don't like any
of these but I don't have better ideas.

Thoughts? 

-- 
Lasse Collin



Re: [xz-devel] RHEL7 ABI patch (913ddc5) breaks linking on ia64

2022-11-23 Thread John Paul Adrian Glaubitz

Hi Sebastian!

On 11/23/22 20:29, Sebastian Andrzej Siewior wrote:

> On 2022-11-23 14:33:39 [+0100], John Paul Adrian Glaubitz wrote:
>> @Sebastian: Can you do that? Does anything speak against that?
>
> No, let me do that.

Great, thank you!

Don't forget to reference Debian bug #1024516 in the debian/changelog ;-).

Thanks,
Adrian

--
 .''`.  John Paul Adrian Glaubitz
: :' :  Debian Developer
`. `'   Physicist
  `-GPG: 62FF 8A75 84E0 2956 9546  0006 7426 3B37 F5B5 F913




Re: [xz-devel] RHEL7 ABI patch (913ddc5) breaks linking on ia64

2022-11-23 Thread Sebastian Andrzej Siewior
On 2022-11-23 14:33:39 [+0100], John Paul Adrian Glaubitz wrote:
> @Sebastian: Can you do that? Does anything speak against that?

No, let me do that.

> Adrian

Sebastian



Re: [xz-devel] RHEL7 ABI patch (913ddc5) breaks linking on ia64

2022-11-23 Thread Lasse Collin
On 2022-11-23 John Paul Adrian Glaubitz wrote:
> On 11/23/22 12:31, Lasse Collin wrote:
> > (1) Does this make the problem go away?  
> 
> Yes, that fixes the linker problem for me. At least in the case of
> mariadb-10.6.

Why does it want static liblzma.a in the first place? It sounds weird
to require rebuilding of mariadb-10.6 every time liblzma is updated.

Can it build against liblzma.so if liblzma.a isn't available?

It is fine to build *static* liblzma with --disable-symbol-versions on
all archs. Debian-specific workaround is fine in the short term but
this should be fixed upstream. One method would be to disable the extra
symbols on ia64 but that is not a real fix. Perhaps it's not really
possible as long as the main build system is Autotools, I don't
currently know.

I'm still curious why exactly one symbol (lzma_get_progress) looks
special in the readelf output. For some reason no other symbols with
the symver declarations are there. Does it happen because of something
in XZ Utils or is it weird behavior in the toolchain that creates the
static lib?

One can wonder if it was a mistake to try to clean up the issues that
started from the RHEL/CentOS 7 patch since now it has created a new
problem. On the other hand, the same could have happened if this kind of
symbol versioning had been done to avoid bumping the soname (which
hopefully will never happen though).

-- 
Lasse Collin



Re: [xz-devel] RHEL7 ABI patch (913ddc5) breaks linking on ia64

2022-11-23 Thread John Paul Adrian Glaubitz

Hi Lasse!

On 11/23/22 12:31, Lasse Collin wrote:

> On 2022-11-23 John Paul Adrian Glaubitz wrote:
>> I guess the additional unwind section breaks your workaround, so the
>> best might be to just disable this workaround on ia64 using the
>> configure flag, no?
>
> There currently is no configure option to only disable the CentOS 7
> workaround symbols. They are enabled if $host_os matches linux* and
> --disable-symbol-versions wasn't used. Disabling symbol versions from
> liblzma.so.5 will cause problems as they have been used since 5.2.0 and
> many programs and libraries will expect to find XZ_5.0 and XZ_5.2.
>
> Having the symbol versions in a static library doesn't make much sense
> though. Perhaps this is a bug in XZ Utils. As a test, the static
> liblzma.a could be built without symbol versions with --disable-shared
> --disable-symbol-versions:
>
> (1) Does this make the problem go away?

Yes, that fixes the linker problem for me. At least in the case of mariadb-10.6.

So, for now, we should build the static library with
"--disable-symbol-versions".

@Sebastian: Can you do that? Does anything speak against that?

Adrian

--
 .''`.  John Paul Adrian Glaubitz
: :' :  Debian Developer
`. `'   Physicist
  `-GPG: 62FF 8A75 84E0 2956 9546  0006 7426 3B37 F5B5 F913




Re: [xz-devel] RHEL7 ABI patch (913ddc5) breaks linking on ia64

2022-11-23 Thread Lasse Collin
On 2022-11-23 John Paul Adrian Glaubitz wrote:
> I guess the additional unwind section breaks your workaround, so the
> best might be to just disable this workaround on ia64 using the
> configure flag, no?

There currently is no configure option to only disable the CentOS 7
workaround symbols. They are enabled if $host_os matches linux* and
--disable-symbol-versions wasn't used. Disabling symbol versions from
liblzma.so.5 will cause problems as they have been used since 5.2.0 and
many programs and libraries will expect to find XZ_5.0 and XZ_5.2.

Having the symbol versions in a static library doesn't make much sense
though. Perhaps this is a bug in XZ Utils. As a test, the static
liblzma.a could be built without symbol versions with --disable-shared
--disable-symbol-versions:

(1) Does this make the problem go away?

(2) Do the failing builds even require that liblzma.a is present
on the system?

I don't know how to avoid symvers in a static library as, to my
understanding, GNU Libtool doesn't add any -DBUILDING_SHARED_LIBRARY
kind of flag which would allow using a #ifdef to know when to use the
symbol versions. Libtool does add -DDLL_EXPORT when building a shared
library on Windows but that's not useful here.
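
A minimal sketch of what such an #ifdef could look like if a macro of that kind existed. BUILDING_SHARED_LIBRARY is the hypothetical macro named above (Libtool does not define it), and demo_get_progress with version DEMO_5.2 is a made-up symbol, not part of liblzma.

```c
#include <stdint.h>

#if defined(BUILDING_SHARED_LIBRARY) && defined(__ELF__)
/*
 * Shared build: bind the implementation to a versioned default symbol.
 * Linking this branch needs a linker version script that declares
 * the DEMO_5.2 version node.
 */
__asm__(".symver demo_get_progress_impl,demo_get_progress@@DEMO_5.2");
#else
/* Static or non-ELF build: use the plain, unversioned symbol name. */
#define demo_get_progress_impl demo_get_progress
#endif

uint64_t demo_get_progress_impl(void)
{
    return 42; /* placeholder value for the sketch */
}
```

Compiled without the macro, the object file exports only demo_get_progress with no "@" in its name, which is what a static library should contain.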

(Switching to another build system would avoid some other Libtool
problems too like wrong shared library versioning on some OSes. However,
Autotools-based build system is able to produce usable xz on quite a
few less-common systems that some other build systems don't support.)

A workaround to this workaround could be to disable the CentOS 7
symbols on ia64 by default. Adding an explicit configure option is
possible too, if needed. But the first step should be to understand
what is going on since the same problem could appear in the future if
symbol versions are used for providing compatibility with an actual ABI
change (hopefully not needed but still).

> Older versions are available through Debian Snapshots:
> 
> > http://snapshot.debian.org/package/xz-utils/  

liblzma.a in liblzma-dev_5.2.5-2.1_ia64.deb doesn't have any "@XZ" in
it which is expected. This looks normal:

: [0x18c0-0x1990], info at +0x100

> > Many other functions are listed in those .IA_64.unwind
> > sections too but lzma_get_progress is the only one that has "@XZ"
> > as part of the function name.  
> 
> Hmm, that definitely seems the problem. Could it be that the symbols
> that are exported on ia64 need some additional naming?

It seems weird that only one symbol is affected. Perhaps it's a bug in
the toolchain creating liblzma.a. However, perhaps the main bug is that
XZ Utils build puts symbol versions into a static liblzma. :-(

> I think we can waive CentOS 7 compatibility on Debian unstable
> ia64.

There is no official CentOS 7 for ia64 but that isn't the whole story
as the broken patch has been used elsewhere too. Not having those extra
symbols would still be fine in practice. :-)

-- 
Lasse Collin



Re: [xz-devel] RHEL7 ABI patch (913ddc5) breaks linking on ia64

2022-11-23 Thread John Paul Adrian Glaubitz

(resent because I got Sergei's email address wrong)

Hello Lasse!

On 11/23/22 00:11, Lasse Collin wrote:

> On 2022-11-22 Sebastian Andrzej Siewior wrote:
>> This looks like it is statically linked against liblzma.
>
> The shared libs in Debian seem to be correct as you managed to answer
> right before my email. Thanks! But the above comment made me look at
> Debian's liblzma.a. The output of
>
>  readelf -aW usr/lib/ia64-linux-gnu/liblzma.a
>
> includes the following two lines in both 5.2.7 and 5.3.4alpha:
>
>  Unwind section '.IA_64.unwind' at offset 0x2000 contains 15 entries:
>  [...]
>  : [0x1980-0x1a50], info at +0x108

I guess the additional unwind section breaks your workaround, so the best might
be to just disable this workaround on ia64 using the configure flag, no?

> There are no older versions on the mirror so I didn't check what
> pre-5.2.7 would have. But .IA_64.unwind is an ia64-specific thing.

Older versions are available through Debian Snapshots:

http://snapshot.debian.org/package/xz-utils/

> Many other functions are listed in those .IA_64.unwind
> sections too but lzma_get_progress is the only one that has "@XZ"
> as part of the function name.

Hmm, that definitely seems the problem. Could it be that the symbols
that are exported on ia64 need some additional naming?

> I don't understand these details but I wanted to let you know anyway in
> case it isn't a coincidence that lzma_get_progress appears in a special
> form in both liblzma.a and in the linker error messages. The error has
> @@XZ_5.2 (which even 5.2.0 has in shared liblzma.so.5) but here the
> static lib has @XZ_5.2.2 which exists solely for CentOS 7 compatibility.

I think we can waive CentOS 7 compatibility on Debian unstable ia64.

Let me CC Sergei Trofimovich from Gentoo who has a more in-depth knowledge
on the ia64 architecture.

Adrian

--
 .''`.  John Paul Adrian Glaubitz
: :' :  Debian Developer
`. `'   Physicist
  `-GPG: 62FF 8A75 84E0 2956 9546  0006 7426 3B37 F5B5 F913



Re: [xz-devel] RHEL7 ABI patch (913ddc5) breaks linking on ia64

2022-11-22 Thread Lasse Collin
On 2022-11-22 Sebastian Andrzej Siewior wrote:
> This looks like it is statically linked against liblzma.

The shared libs in Debian seem to be correct as you managed to answer
right before my email. Thanks! :-) But the above comment made me look at
Debian's liblzma.a. The output of

readelf -aW usr/lib/ia64-linux-gnu/liblzma.a

includes the following two lines in both 5.2.7 and 5.3.4alpha:

Unwind section '.IA_64.unwind' at offset 0x2000 contains 15 entries:
[...]
: [0x1980-0x1a50], info at +0x108

There are no older versions on the mirror so I didn't check what
pre-5.2.7 would have. But .IA_64.unwind is an ia64-specific thing.
Many other functions are listed in those .IA_64.unwind
sections too but lzma_get_progress is the only one that has "@XZ"
as part of the function name.

I don't understand these details but I wanted to let you know anyway in
case it isn't a coincidence that lzma_get_progress appears in a special
form in both liblzma.a and in the linker error messages. The error has
@@XZ_5.2 (which even 5.2.0 has in shared liblzma.so.5) but here the
static lib has @XZ_5.2.2 which exists solely for CentOS 7 compatibility.

lzma_cputhreads doesn't show the same special behavior in ia64 liblzma.a
even though lzma_cputhreads is handled exactly like lzma_get_progress in
the liblzma C code and linker script.

-- 
Lasse Collin



Re: [xz-devel] RHEL7 ABI patch (913ddc5) breaks linking on ia64

2022-11-22 Thread Lasse Collin
On 2022-11-22 John Paul Adrian Glaubitz wrote:
> Does anyone have a clue why this particular change may have broken
> the linking on ia64?

Thanks for your report. This is important to fix.

What do these commands print? Fix the path to liblzma.so.5 if needed.

readelf --dyn-syms -W /lib/liblzma.so.5 \
| grep lzma_get_progress

readelf --dyn-syms -W /lib/liblzma.so.5 \
| grep lzma_stream_encoder_mt_memusage

The first should print 2 lines and the second 3 lines. The rightmost
columns should be like these:

FUNC    GLOBAL DEFAULT   11 lzma_get_progress@@XZ_5.2
FUNC    GLOBAL DEFAULT   11 lzma_get_progress@XZ_5.2.2
FUNC    GLOBAL DEFAULT   11 lzma_stream_encoder_mt_memusage@@XZ_5.2
FUNC    GLOBAL DEFAULT   11 lzma_stream_encoder_mt_memusage@XZ_5.1.2alpha
FUNC    GLOBAL DEFAULT   11 lzma_stream_encoder_mt_memusage@XZ_5.2.2

Pay close attention to @ vs. @@. The XZ_5.2 must be the ones with @@.
If you see the same as above then I don't have a clue.
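
The @ vs. @@ distinction can be checked mechanically. Below is a small sketch that classifies a symbol name as printed by readelf for defined symbols; classify_symver and the enum are hypothetical helper names, not part of binutils or liblzma.

```c
#include <stddef.h>
#include <string.h>

enum symver_kind {
    SYMVER_NONE,       /* no version suffix at all */
    SYMVER_DEFAULT,    /* name@@VERSION: the default version */
    SYMVER_NONDEFAULT  /* name@VERSION: a compatibility version */
};

/*
 * Classify a symbol string such as "lzma_get_progress@@XZ_5.2".
 * On return *version points at the version name inside sym,
 * or is NULL if the symbol has no version.
 */
static enum symver_kind classify_symver(const char *sym, const char **version)
{
    const char *at = strchr(sym, '@');

    if (at == NULL) {
        *version = NULL;
        return SYMVER_NONE;
    }

    if (at[1] == '@') {
        *version = at + 2;
        return SYMVER_DEFAULT;
    }

    *version = at + 1;
    return SYMVER_NONDEFAULT;
}
```

For the expected output above, XZ_5.2 should be the only version classified as SYMVER_DEFAULT for each function.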

By any chance, was XZ Utils built with GCC older than 10 using
link-time optimization (LTO, -flto)? As my commit message describes
and NEWS warns, GCC < 10 and LTO will not produce correct results
due to the symbol versions. It should work fine with GCC >= 10 or Clang.

For what it is worth, when I wrote the patch I tested it on
Slackware 10.1 (32-bit x86) that has GCC 3.3.4 and it worked perfectly
there. This symbol version stuff isn't a new thing so it really should
work.

-- 
Lasse Collin



Re: [xz-devel] RHEL7 ABI patch (913ddc5) breaks linking on ia64

2022-11-22 Thread Sebastian Andrzej Siewior
On 2022-11-22 18:51:49 [+0100], John Paul Adrian Glaubitz wrote:
> Hello!
Hi,

> [ 36%] Linking CXX shared module ha_archive.so
> cd /<>/builddir/storage/archive && /usr/bin/cmake -E 
> cmake_link_script CMakeFiles/archive.dir/link.txt --verbose=1
> /usr/bin/c++ -fPIC -g -O2 -ffile-prefix-map=/<>=. 
> -specs=/usr/share/dpkg/pie-compile.specs -Wformat -Werror=format-security 
> -Wdate-time -D_FORTIFY_SOURCE=2 -Wdate-time -D_FORTIFY_SOURCE=2 -pie -fPIC 
> -fstack-protector --param=ssp-buffer-size=4 -O2 -g -static-libgcc 
> -fno-omit-frame-pointer -fno-strict-aliasing -Wno-uninitialized 
> -fno-omit-frame-pointer -D_FORTIFY_SOURCE=2 -DDBUG_OFF -Wall -Wenum-compare 
> -Wenum-conversion -Wextra -Wformat-security -Wmissing-braces 
> -Wno-format-truncation -Wno-init-self -Wno-nonnull-compare 
> -Wno-unused-parameter -Woverloaded-virtual -Wnon-virtual-dtor -Wvla 
> -Wwrite-strings -specs=/usr/share/dpkg/pie-link.specs -Wl,-z,relro,-z,now 
> -shared  -o ha_archive.so CMakeFiles/archive.dir/azio.c.o 
> CMakeFiles/archive.dir/ha_archive.cc.o  ../../libservices/libmysqlservices.a 
> -lz
> /usr/bin/ld: warning: -z relro ignored
> /usr/bin/ld: ha_archive.so: version node not found for symbol 
> lzma_get_progress@@XZ_5.2
> /usr/bin/ld: failed to set dynamic section sizes: bad value
> collect2: error: ld returned 1 exit status
> make[4]: *** [storage/archive/CMakeFiles/archive.dir/build.make:118: 
> storage/archive/ha_archive.so] Error 1
> make[4]: Leaving directory '/<>/builddir'
> make[3]: *** [CMakeFiles/Makefile2:4913: 
> storage/archive/CMakeFiles/archive.dir/all] Error 2

I'm not sure if this an ia64 issue or something else is missing. Looking
at the symbols:

| bigeasy@yttrium:~$ readelf -W --dyn-syms /lib/ia64-linux-gnu/liblzma.so.5|grep lzma_get_progress
|160: 7480   208 FUNC    GLOBAL DEFAULT   12 lzma_get_progress@@XZ_5.2
|161: 7480   208 FUNC    GLOBAL DEFAULT   12 lzma_get_progress@XZ_5.2.2
| bigeasy@yttrium:~$ readelf -W --dyn-syms /usr/bin/xz|grep progress
| 45:  0 FUNC    GLOBAL DEFAULT  UND lzma_get_progress@XZ_5.2 (8)

The @@ thingy is used in the library to mark the default symbol. So
liblzma provides two lzma_get_progress and default is XZ_5.2. The XZ
binary picked it up properly. Looking around in your build:

| bigeasy@yttrium:~$ readelf -W --dyn-syms ../glaubitz/mariadb-10.6/mariadb-10.6-10.6.11/builddir/client/mariadb-conv |grep lzma_get_progress
|812: 0011c140   208 FUNC    GLOBAL DEFAULT   14 lzma_get_progress@@XZ_5.2
|813: 0011c140   208 FUNC    GLOBAL DEFAULT   14 lzma_get_progress@XZ_5.2.2

This looks like it is statically linked against liblzma. I didn't find
lzma_get_progress anywhere else. So it looks like this function isn't
used by mariadb itself but appears due to static linking somewhere and
asks for trouble. I didn't find any reference to lzma_get_progress in
/lib/ia64-linux-gnu/libgcc_s.so.1, /lib/ia64-linux-gnu/libz.so.1.2.13,
ha_archive.cc.o or libmysqlservices.a. This seems to be all that is
passed to the compiler for linking.

Sebastian



[xz-devel] RHEL7 ABI patch (913ddc5) breaks linking on ia64

2022-11-22 Thread John Paul Adrian Glaubitz

Hello!

Recently, several packages have started failing to build on Debian unstable
ia64 when linking against liblzma. The error was always the same and
indicates a problem with the symbols exported by liblzma:

[ 36%] Linking CXX shared module ha_archive.so
cd /<>/builddir/storage/archive && /usr/bin/cmake -E 
cmake_link_script CMakeFiles/archive.dir/link.txt --verbose=1
/usr/bin/c++ -fPIC -g -O2 -ffile-prefix-map=/<>=. 
-specs=/usr/share/dpkg/pie-compile.specs -Wformat -Werror=format-security -Wdate-time 
-D_FORTIFY_SOURCE=2 -Wdate-time -D_FORTIFY_SOURCE=2 -pie -fPIC -fstack-protector 
--param=ssp-buffer-size=4 -O2 -g -static-libgcc -fno-omit-frame-pointer 
-fno-strict-aliasing -Wno-uninitialized -fno-omit-frame-pointer -D_FORTIFY_SOURCE=2 
-DDBUG_OFF -Wall -Wenum-compare -Wenum-conversion -Wextra -Wformat-security 
-Wmissing-braces -Wno-format-truncation -Wno-init-self -Wno-nonnull-compare 
-Wno-unused-parameter -Woverloaded-virtual -Wnon-virtual-dtor -Wvla -Wwrite-strings 
-specs=/usr/share/dpkg/pie-link.specs -Wl,-z,relro,-z,now -shared  -o ha_archive.so 
CMakeFiles/archive.dir/azio.c.o CMakeFiles/archive.dir/ha_archive.cc.o  
../../libservices/libmysqlservices.a -lz
/usr/bin/ld: warning: -z relro ignored
/usr/bin/ld: ha_archive.so: version node not found for symbol 
lzma_get_progress@@XZ_5.2
/usr/bin/ld: failed to set dynamic section sizes: bad value
collect2: error: ld returned 1 exit status
make[4]: *** [storage/archive/CMakeFiles/archive.dir/build.make:118: 
storage/archive/ha_archive.so] Error 1
make[4]: Leaving directory '/<>/builddir'
make[3]: *** [CMakeFiles/Makefile2:4913: 
storage/archive/CMakeFiles/archive.dir/all] Error 2

Upon closer inspection, I noticed that the change 913ddc5 looked very
suspicious, and indeed reverting the following change fixes the issue so
that linking against liblzma works again on Debian unstable ia64:

commit 913ddc5572b9455fa0cf299be2e35c708840e922
Author: Lasse Collin 
Date:   Sun Sep 4 23:23:00 2022 +0300

liblzma: Vaccinate against an ill patch from RHEL/CentOS 7.

The relevant bug report in Debian is #1024516 [2].

Does anyone have a clue why this particular change may have broken the
linking on ia64?

Thanks,
Adrian


[1] https://buildd.debian.org/status/fetch.php?pkg=mariadb-10.6=ia64=1%3A10.6.11-1=1669022458=0
[2] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1024516


--
 .''`.  John Paul Adrian Glaubitz
: :' :  Debian Developer
`. `'   Physicist
  `-GPG: 62FF 8A75 84E0 2956 9546  0006 7426 3B37 F5B5 F913



Re: [xz-devel] [PATCH] add xz arm64 bcj filter support

2022-11-17 Thread Lasse Collin
Hello!

On 2021-09-02 Liao Hua wrote:
> +#define LZMA_FILTER_ARM64 LZMA_VLI_C(0x0a)

Is this ID 0x0A in actual use somewhere? Can it be used in the official
.xz format for something else than the filter you submitted?

On 2021-09-08 Lasse Collin wrote:
> On 2021-09-02 Liao Hua wrote:
> > We have some questions about xz bcj filters.
> > 1. Why ARM and ARM-Thumb bcj filters are little endian only?  
> 
> Perhaps it's an error. Long ago when I wrote the docs, I knew that the
> ARM filters worked on little endian code but didn't know how big
> endian ARM was done.

I read about this and if I have understood correctly, in the past big
endian ARM could use big endian instruction encoding too but nowadays
instructions are always in little endian order, even if data access is
big endian. The endianness in the docs is about instruction encoding.
The filters don't care about data access.

The mention of endianness has been removed in 5.3.4alpha (and thus
5.4.0) since it is more confusing than useful.

The PowerPC filter is indeed big endian only. Little endian PowerPC
would need a new filter. Filtering little endian PowerPC code would
give an improvement in compression comparable to what the current big
endian filter gives.

> > 2. Why there is no arm64 bcj filter? Are there any technical risks?
> > Or other considerations?  
> 
> It just hasn't been done, no other reason.

There will probably be a new ARM64 filter in 5.4.0. The exact design is
still not frozen. Different parameters work a little better or worse in
different situations. It doesn't seem practical to make a tunable
filter since few people would try different settings and it would make
the code slower and a little bigger (which matters in XZ Embedded).
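
To show the general idea behind a BCJ filter for ARM64, here is a minimal sketch that converts only the BL (branch with link) instruction between relative and absolute addressing. It is an illustration under my own assumptions, not the submitted patch and not the ARM64 filter design discussed above, which is not frozen; demo_arm64_bl and the helper names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* ARM64 instructions are stored in little endian order. */
static uint32_t read32le(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8)
            | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

static void write32le(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/*
 * BL has 100101 in the top six bits and a 26-bit word offset below.
 * Encoding adds the instruction's own position (in words) to the offset
 * so that calls to the same target become identical byte sequences.
 * Decoding subtracts the position again.
 */
static void demo_arm64_bl(uint8_t *buf, size_t size,
        uint32_t start_pos, int is_encoder)
{
    for (size_t i = 0; i + 4 <= size; i += 4) {
        uint32_t instr = read32le(buf + i);

        if ((instr & 0xFC000000U) != 0x94000000U)
            continue;

        uint32_t pc = (start_pos + (uint32_t)i) >> 2;
        uint32_t imm = instr & 0x03FFFFFFU;
        uint32_t val = is_encoder ? imm + pc : imm - pc;

        write32le(buf + i, 0x94000000U | (val & 0x03FFFFFFU));
    }
}
```

Two BL instructions at offsets 0 and 8 that both call address 0x100 hold different relative offsets (0x40 and 0x3E words) but encode to the same bytes, which is what helps the LZMA2 stage find matches.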

With ARM64 it is good to use --lzma2=lc=2,lp=2 instead of the default
lc=3,lp=0. This alone can give a little over 1 % smaller file.
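
A rough sketch of why lp=2 suits 4-byte instructions: in LZMA the probability context for a literal is picked from the low lp bits of the position and the high lc bits of the previous byte. The function below follows that textbook rule; demo_literal_context is a hypothetical name and the code is not copied from liblzma.

```c
#include <stdint.h>

/*
 * Return the index of the literal context for the byte at position pos.
 * lc = number of high bits taken from the previous byte,
 * lp = number of low bits taken from the position.
 * LZMA2 requires lc + lp <= 4.
 */
static uint32_t demo_literal_context(uint32_t lc, uint32_t lp,
        uint64_t pos, uint8_t prev_byte)
{
    uint32_t pos_bits = (uint32_t)(pos & ((UINT64_C(1) << lp) - 1));
    return (pos_bits << lc) + ((uint32_t)prev_byte >> (8 - lc));
}
```

With lc=2,lp=2 the context depends on pos modulo 4, so each of the four bytes of an aligned ARM64 instruction gets its own set of statistics. With the default lc=3,lp=0 the position is ignored.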

-- 
Lasse Collin



[xz-devel] XZ Utils 5.3.4alpha

2022-11-15 Thread Lasse Collin
XZ Utils 5.3.4alpha is available at . Here is an
extract from the NEWS file:

5.3.4alpha (2022-11-15)

* All fixes from 5.2.7 and 5.2.8.

* liblzma:

- Minor improvements to the threaded decoder.

- Added CRC64 implementation that uses SSSE3, SSE4.1, and CLMUL
  instructions on 32/64-bit x86 and E2K. On 32-bit x86 it's
  not enabled unless --disable-assembler is used but then
  the non-CLMUL code might be slower. Processor support is
  detected at runtime so this is built by default on x86-64
  and E2K. On these platforms, if compiler flags indicate
  unconditional CLMUL support (-msse4.1 -mpclmul) then the
  generic version is not built, making liblzma 8-9 KiB smaller
  compared to having both versions included.

  With extremely compressible files this can make decompression
  up to twice as fast but with typical files 5 % improvement
  is a more realistic expectation.

  The CLMUL version is slower than the generic version with
  tiny inputs (especially at 1-8 bytes per call, but up to
  16 bytes). In normal use in xz this doesn't matter at all.

- Added an experimental ARM64 filter. This is *not* the final
  version! Files created with this experimental version won't
  be supported in the future versions! The filter design is
  a compromise where improving one use case makes some other
  cases worse.

- Added decompression support for the .lz (lzip) file format
  version 0 and the original unextended version 1. See the
  API docs of lzma_lzip_decoder() for details. Also
  lzma_auto_decoder() supports .lz files.

- Building with --disable-threads --enable-small
  is now thread-safe if the compiler supports
  __attribute__((__constructor__))

* xz:

- Added support for OpenBSD's pledge(2) as a sandboxing method.

- Don't mention endianness for ARM and ARM-Thumb filters in
  --long-help. The filters only work for little endian
  instruction encoding but modern ARM processors using
  big endian data access still use little endian
  instruction encoding. So the help text was misleading.
  In contrast, the PowerPC filter is only for big endian
  32/64-bit PowerPC code. Little endian PowerPC would need
  a separate filter.

- Added --experimental-arm64. This will be renamed once the
  filter is finished. Files created with this experimental
  filter will not be supported in the future!

- Added new fields to the output of xz --robot --info-memory.

- Added decompression support for the .lz (lzip) file format
  version 0 and the original unextended version 1. It is
  autodetected by default. See also the option --format on
  the xz man page.

* Scripts now support the .lz format using xz.

* Build systems:

- New #defines in config.h: HAVE_ENCODER_ARM64,
  HAVE_DECODER_ARM64, HAVE_LZIP_DECODER, HAVE_CPUID_H,
  HAVE_FUNC_ATTRIBUTE_CONSTRUCTOR, HAVE_USABLE_CLMUL

- New configure options: --disable-clmul-crc,
  --disable-microlzma, --disable-lzip-decoder, and
  'pledge' is now an option in --enable-sandbox (but
  it's autodetected by default anyway).

- INSTALL was updated to document the new configure options.

- PACKAGERS now also lists --disable-microlzma and
  --disable-lzip-decoder as configure options that must
  not be used in builds for non-embedded use.

* Tests:

- Fix some of the tests so that they skip instead of fail if
  certain features have been disabled with configure options.
  It's still not perfect.

- Other improvements to tests.

* Updated translations: Croatian, Finnish, Hungarian, Polish,
  Romanian, Spanish, Swedish, and Ukrainian.
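
For readers who want to cross-check the CRC64 values discussed in the
liblzma item above, here is a minimal bitwise reference sketch in C. It is
not the liblzma implementation (which uses table-driven and CLMUL code
paths), and the function name crc64_xz_bitwise is made up for this
illustration. It assumes the CRC64 parameters used for the .xz check:
reflected polynomial 0xC96C5795D7870F42, all-ones initial value, and a
final bitwise inversion.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper for illustration; not part of the liblzma API. */
uint64_t
crc64_xz_bitwise(const uint8_t *buf, size_t size)
{
	/* Reflected form of the ECMA-182 polynomial. */
	const uint64_t poly = UINT64_C(0xC96C5795D7870F42);
	uint64_t crc = ~UINT64_C(0);

	for (size_t i = 0; i < size; ++i) {
		crc ^= buf[i];

		/* Process one input byte, least significant bit first. */
		for (int bit = 0; bit < 8; ++bit) {
			if (crc & 1)
				crc = (crc >> 1) ^ poly;
			else
				crc >>= 1;
		}
	}

	return ~crc;
}
```

A slow loop like this is only useful as a reference when testing faster
variants; with inputs of just a few bytes per call the difference between
implementations matters little anyway.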

-- 
Lasse Collin



[xz-devel] XZ Utils 5.2.8

2022-11-13 Thread Lasse Collin
XZ Utils 5.2.8 is available at . Here is an
extract from the NEWS file:

5.2.8 (2022-11-13)

* xz:

- If xz cannot remove an input file when it should, this
  is now treated as a warning (exit status 2) instead of
  an error (exit status 1). This matches GNU gzip and it
  is more logical as at that point the output file has
  already been successfully closed.

- Fix handling of .xz files with an unsupported check type.
  Previously xz printed a warning message for such files but
  then behaved as if an error had occurred (didn't decompress,
  exit status 1). Now a warning is printed, decompression
  is done anyway, and exit status is 2. This used to work
  slightly before 5.0.0. In practice this bug matters only
  if xz has been built with some check types disabled. As
  instructed in PACKAGERS, such builds should be done in
  special situations only.

- Fix "xz -dc --single-stream tests/files/good-0-empty.xz"
  which failed with "Internal error (bug)". That is,
  --single-stream was broken if the first .xz stream in
  the input file didn't contain any uncompressed data.

- Fix displaying file sizes in the progress indicator when
  working in passthru mode and there are multiple input files.
  Just like "gzip -cdf", "xz -cdf" works like "cat" when the
  input file isn't a supported compressed file format. In
  this case the file size counters weren't reset between
  files so with multiple input files the progress indicator
  displayed an incorrect (too large) value.

* liblzma:

- API docs in lzma/container.h:
  * Update the list of decoder flags in the decoder
    function docs.
  * Explain LZMA_CONCATENATED behavior with .lzma files
    in lzma_auto_decoder() docs.

- OpenBSD: Use HW_NCPUONLINE to detect the number of
  available hardware threads in lzma_cputhreads().

- Fix use of wrong macro to detect x86 SSE2 support.
  __SSE2_MATH__ was used with GCC/Clang but the correct
  one is __SSE2__. The first one means that SSE2 is used
  for floating point math which is irrelevant here.
  The affected SSE2 code isn't used on x86-64 so this affects
  only 32-bit x86 builds that use -msse2 without -mfpmath=sse
  (there is no runtime detection for SSE2). It improves LZMA
  compression speed (not decompression).

- Fix the build with Intel C compiler 2021 (ICC, not ICX)
  on Linux. It defines __GNUC__ to 10 but doesn't support
  the __symver__ attribute introduced in GCC 10.

* Scripts: Ignore warnings from xz by using --quiet --no-warn.
  This is needed if the input .xz files use an unsupported
  check type.

* Translations:

- Updated Croatian and Turkish translations.

- One new translation wasn't included because it needed
  technical fixes. It will be in the upcoming 5.4.0. No new
  translations will be added to the 5.2.x branch anymore.

- Renamed the French man page translation file from
  fr_FR.po to fr.po and thus also its install directory
  (like /usr/share/man/fr_FR -> .../fr).

- Man page translations for upcoming 5.4.0 are now handled
  in the Translation Project.

* Update doc/faq.txt a little so it's less out-of-date.
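
The __SSE2_MATH__ versus __SSE2__ item in the liblzma list above can be
illustrated with a small compile-time check. This is only a sketch, not
the code from liblzma, and built_with_sse2() is a hypothetical helper
name. It assumes GCC/Clang define __SSE2__ whenever SSE2 instructions are
enabled (for example with -msse2), and that MSVC signals the same via
_M_X64 or _M_IX86_FP >= 2.

```c
#include <stdbool.h>

/* Hypothetical helper for illustration; not from liblzma. */
bool
built_with_sse2(void)
{
#if defined(__SSE2__)
	/* GCC/Clang: SSE2 instructions are enabled. This does not */
	/* depend on -mfpmath=sse, unlike __SSE2_MATH__. */
	return true;
#elif defined(_MSC_VER) && (defined(_M_X64) \
		|| (defined(_M_IX86_FP) && _M_IX86_FP >= 2))
	/* MSVC: x86-64, or 32-bit x86 built with /arch:SSE2 or higher. */
	return true;
#else
	return false;
#endif
}
```

Testing __SSE2_MATH__ instead would return false for a 32-bit build that
uses -msse2 without -mfpmath=sse, which is the case the fix addresses.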

-- 
Lasse Collin



Re: [xz-devel] [PATCH 0/2] tests: Disable bits that require the [encoder|threads]

2022-10-23 Thread Jia Tan
> Sure!
>
> Let me know if you need any more data.

Thanks! That should be all that I need. From your current build script
it looks like multilib_src_configure does not build xz and xzdec, which
explains why the script tests are skipped. I see that the skip messages
from the scripts could be a little more helpful in the future...

Jia Tan



Re: [xz-devel] [PATCH 0/2] tests: Disable bits that require the [encoder|threads]

2022-10-23 Thread Sam James


> On 23 Oct 2022, at 14:34, Jia Tan  wrote:
> 
> Hi!
> 
>> This definitely improves the situation. However, in Gentoo, we
>> allow optionally disabling 'extra-filters', described to users as:
>> ```
>>Build additional filters that are not
>>used in any of the default xz presets. This includes delta
>>and BCJ coders, additional match finders and SHA256 
>> checks.
>> ```
>> 
>> When this flag is disabled (i.e. no extra-filters), we pass the following
>> options to configure:
>> ```
>> /var/tmp/portage/app-arch/xz-utils-/work/xz-utils-/configure 
>> --prefix=/usr --build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu 
>> --mandir=/usr/share/man --infodir=/usr/share/info --datadir=/usr/share 
>> --sysconfdir=/etc --localstatedir=/var/lib --disable-dependency-tracking 
>> --disable-silent-rules --docdir=/usr/share/doc/xz-utils- 
>> --htmldir=/usr/share/doc/xz-utils-/html --with-sysroot=/ 
>> --libdir=/usr/lib64 --enable-threads --enable-nls --disable-static 
>> --enable-encoders=lzma1,lzma2 --enable-decoders=lzma1,lzma2 
>> --enable-match-finders=hc3,hc4,bt4 --enable-checks=crc32,crc64
> 
> Thanks for reporting this. Can you attach your entire test-suite.log
> file? I am wondering why test_files.sh and test_compress_* skip
> instead of fail on your configuration. Fixing the test_bcj_exact_size
> issue is simple and I already submitted a patch for it to Lasse.
> test_files.sh and test_compress_* deserve a proper rewrite, but that
> probably will not happen before 5.4.0 so the short term solution may
> be to have them skip if the configurations differ too far from the
> default.

Sure!

Let me know if you need any more data.



test-suite.log
Description: Binary data

> 
> Jia Tan





Re: [xz-devel] [PATCH 0/2] tests: Disable bits that require the [encoder|threads]

2022-10-23 Thread Jia Tan
Hi!

> This definitely improves the situation. However, in Gentoo, we
> allow optionally disabling 'extra-filters', described to users as:
> ```
> Build additional filters that are not
> used in any of the default xz presets. This includes delta
> and BCJ coders, additional match finders and SHA256 
> checks.
> ```
>
> When this flag is disabled (i.e. no extra-filters), we pass the following
> options to configure:
> ```
> /var/tmp/portage/app-arch/xz-utils-/work/xz-utils-/configure 
> --prefix=/usr --build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu 
> --mandir=/usr/share/man --infodir=/usr/share/info --datadir=/usr/share 
> --sysconfdir=/etc --localstatedir=/var/lib --disable-dependency-tracking 
> --disable-silent-rules --docdir=/usr/share/doc/xz-utils- 
> --htmldir=/usr/share/doc/xz-utils-/html --with-sysroot=/ 
> --libdir=/usr/lib64 --enable-threads --enable-nls --disable-static 
> --enable-encoders=lzma1,lzma2 --enable-decoders=lzma1,lzma2 
> --enable-match-finders=hc3,hc4,bt4 --enable-checks=crc32,crc64

Thanks for reporting this. Can you attach your entire test-suite.log
file? I am wondering why test_files.sh and test_compress_* skip
instead of fail on your configuration. Fixing the test_bcj_exact_size
issue is simple and I already submitted a patch for it to Lasse.
test_files.sh and test_compress_* deserve a proper rewrite, but that
probably will not happen before 5.4.0 so the short term solution may
be to have them skip if the configurations differ too far from the
default.

Jia Tan



Re: [xz-devel] [PATCH 0/2] tests: Disable bits that require the [encoder|threads]

2022-10-20 Thread Sam James


> On 20 Oct 2022, at 14:26, Jia Tan  wrote:
> 
> Hi!
> 
>> the Debian xz-utils package builds the xzdec binary package which is
>> configured as "--disable-encoders --disable-threads". With these options
>> the test suite can't link due to missing encoder or thread related
>> functions.
>> The two patches are what I needed to get it built with these two options.
>> This is for the 5.3.3alpha version.
> 
> Thank you for reporting this and for your patches. We made a few minor
> changes to extend your patch to also compile and skip tests if
> encoders, threads, or decoders were disabled, including the script
> tests. These changes have been committed to master, so they will be
> included in the upcoming 5.4.0 release. If we have another alpha or
> beta release prior to 5.4.0, the commits will be included in those
> releases too.


This definitely improves the situation. However, in Gentoo, we
allow optionally disabling 'extra-filters', described to users as:
```
Build additional filters that are not
used in any of the default xz presets. This includes delta
and BCJ coders, additional match finders and SHA256 
checks.
```

When this flag is disabled (i.e. no extra-filters), we pass the following
options to configure:
```
/var/tmp/portage/app-arch/xz-utils-/work/xz-utils-/configure 
--prefix=/usr --build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu 
--mandir=/usr/share/man --infodir=/usr/share/info --datadir=/usr/share 
--sysconfdir=/etc --localstatedir=/var/lib --disable-dependency-tracking 
--disable-silent-rules --docdir=/usr/share/doc/xz-utils- 
--htmldir=/usr/share/doc/xz-utils-/html --with-sysroot=/ 
--libdir=/usr/lib64 --enable-threads --enable-nls --disable-static 
--enable-encoders=lzma1,lzma2 --enable-decoders=lzma1,lzma2 
--enable-match-finders=hc3,hc4,bt4 --enable-checks=crc32,crc64
```

This results in the following test failures on master as of today:
```
make[2]: Entering directory 
'/var/tmp/portage/app-arch/xz-utils-/work/xz-utils--abi_x86_32.x86/tests'
make[3]: Entering directory 
'/var/tmp/portage/app-arch/xz-utils-/work/xz-utils--abi_x86_32.x86/tests'
SKIP: test_files.sh
SKIP: test_compress_prepared_bcj_x86
SKIP: test_compress_prepared_bcj_sparc
SKIP: test_compress_generated_random
SKIP: test_compress_generated_text
SKIP: test_compress_generated_abc
PASS: test_hardware
PASS: test_check
PASS: test_filter_flags
PASS: test_stream_flags
SKIP: test_block_header
PASS: test_memlimit
PASS: test_vli
FAIL: test_bcj_exact_size
PASS: test_index

Testsuite summary for XZ Utils 5.3.3alpha

# TOTAL: 15
# PASS:  7
# SKIP:  7
# XFAIL: 0
# FAIL:  1
# XPASS: 0
# ERROR: 0

See tests/test-suite.log
Please report to lasse.col...@tukaani.org

```

From tests/test-suite.log:
```
FAIL: test_bcj_exact_size
=

=== test_bcj_exact_size.c ===
SKIP: test_exact_size [test_bcj_exact_size.c:27] PowerPC BCJ encoder and/or 
decoder is disabled
FAIL: test_empty_block [test_bcj_exact_size.c:103] assert_enum_eq: 
'lzma_stream_buffer_decode(, 0, ((void *)0), empty_bcj_lzma2, _pos, 
in_size, out, _pos, 0) == LZMA_OPTIONS_ERROR' but expected '... = LZMA_OK'
---
# TOTAL: 2
# PASS:  0
# SKIP:  1
# FAIL:  1
# ERROR: 0
=== END ===
FAIL test_bcj_exact_size (exit status: 1)
```

The tests pass if I turn 'extra-filters' back on.

This is the current build script used:
https://gitweb.gentoo.org/repo/gentoo.git/tree/app-arch/xz-utils/xz-utils-.ebuild?id=5a8ce9b83b02f2b5a2e276e3d02f5436d3dce4ac.

Best,
sam




Re: [xz-devel] [PATCH 0/2] tests: Disable bits that require the [encoder|threads]

2022-10-20 Thread Jia Tan
Hi!

> the Debian xz-utils package builds the xzdec binary package which is
> configured as "--disable-encoders --disable-threads". With these options
> the test suite can't link due to missing encoder or thread related
> functions.
> The two patches are what I needed to get it built with these two options.
> This is for the 5.3.3alpha version.

Thank you for reporting this and for your patches. We made a few minor
changes to extend your patch to also compile and skip tests if
encoders, threads, or decoders were disabled, including the script
tests. These changes have been committed to master, so they will be
included in the upcoming 5.4.0 release. If we have another alpha or
beta release prior to 5.4.0, the commits will be included in those
releases too.
Thanks again for your help!

Jia Tan



Re: [xz-devel] XZ Utils 5.3.3alpha

2022-10-03 Thread Sebastian Andrzej Siewior
On 2022-09-30 20:23:06 [+0300], Lasse Collin wrote:
> On 2022-09-29 Guillem Jover wrote:
> > On Wed, 2022-09-28 at 21:41:59 +0800, Jia Tan wrote:
> > > […] The
> > > interface for liblzma and xz for the multi threaded decoder does not
> > > have any planned changes, so things could probably be developed and
> > > tested using 5.3.3.  
> > 
> > Ah, thanks, that's reassuring then. It's one of the things I was
> > worried about when having to decide whether to merge the patch I've
> > got implementing this support into dpkg. So, once the alpha version
> > has been packaged for Debian experimental, I'll test the patch and
> > commit it.
> 
> There are no planned changes but that isn't a *promise* that there won't
> be any changes before 5.4.0.
>
> I don't track API or ABI compatibility within development releases and
> thus binaries linked against shared liblzma from one alpha/beta release
> won't run with liblzma from the next alpha/beta *if* they depend on
> unstable symbols (symbol versioning stops it). This includes the xz
> binary itself and would include dpkg too if it uses the threaded
> decoder.

That should be no problem. The last alpha version has been uploaded to
debian experimental and it is exactly the place for such things. So dpkg
could be linked against that version in experimental and will never
enter an official release.

Sebastian



Re: [xz-devel] XZ Utils 5.3.3alpha

2022-09-30 Thread Lasse Collin
On 2022-09-29 Guillem Jover wrote:
> On Wed, 2022-09-28 at 21:41:59 +0800, Jia Tan wrote:
> > […] The
> > interface for liblzma and xz for the multi threaded decoder does not
> > have any planned changes, so things could probably be developed and
> > tested using 5.3.3.  
> 
> Ah, thanks, that's reassuring then. It's one of the things I was
> worried about when having to decide whether to merge the patch I've
> got implementing this support into dpkg. So, once the alpha version
> has been packaged for Debian experimental, I'll test the patch and
> commit it.

There are no planned changes but that isn't a *promise* that there won't
be any changes before 5.4.0.

I don't track API or ABI compatibility within development releases and
thus binaries linked against shared liblzma from one alpha/beta release
won't run with liblzma from the next alpha/beta *if* they depend on
unstable symbols (symbol versioning stops it). This includes the xz
binary itself and would include dpkg too if it uses the threaded
decoder.

Sometimes it can be worked around with distro-specific patches but
that's extra hassle and can go wrong too. Please don't end up with a
result similar to what happened with RHEL/CentOS 7, which ended up
affecting users of other distributions too (this is included in 5.2.7):


https://git.tukaani.org/?p=xz.git;a=commitdiff;h=913ddc5572b9455fa0cf299be2e35c708840e922

So while I encourage testing, one needs to be careful when it can
affect critical tools in the operating system. :-)

-- 
Lasse Collin



Re: [xz-devel] XZ Utils 5.3.3alpha

2022-09-30 Thread Lasse Collin
On 2022-09-28 Jia Tan wrote:
> On 2022-09-27 Sebastian Andrzej Siewior wrote:
> > Okay, so that is what you are tracking. I remember that there was a
> > stall in the decoding but I don't remember how it played out.
> >
> > I do remember that I had something for memory allocation/ limit but
> > I don't remember if we settled on something or if discussion is
> > needed. Also how many decoding threads make sense, etc.  
> 
> We ended up changing xz to use (total_ram / 4) as the default "soft
> limit". If the soft limit is reached, xz will decode single threaded.
> The "hard limit" shares the same environment variable and xz option
> (--memlimit-decompress).

There is also the 1400 MiB cap for 32-bit executables.
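
To make the arithmetic concrete, the default soft limit described above
could be sketched in C roughly as follows. This is an illustration only,
not the code in xz: the function name default_soft_memlimit() and its
parameters are made up, and applying the 1400 MiB cap directly to the
total_ram / 4 value is an assumption about how the two rules combine.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper for illustration; not from xz or liblzma. */
uint64_t
default_soft_memlimit(uint64_t total_ram, bool is_32bit)
{
	/* 1400 MiB expressed in bytes. */
	const uint64_t cap_32bit = UINT64_C(1400) << 20;

	/* A quarter of the total RAM is the starting point. */
	uint64_t limit = total_ram / 4;

	/* Assumption: 32-bit executables never get more than the cap. */
	if (is_32bit && limit > cap_32bit)
		limit = cap_32bit;

	return limit;
}
```

With 16 GiB of RAM this gives 4 GiB for a 64-bit executable and 1400 MiB
for a 32-bit one; with 2 GiB of RAM both get 512 MiB.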

The memory limiting in threaded decompression (two separate limits in
parallel) is one thing where feedback would be important as after the
liblzma API, ABI and xz tool syntax are in a stable release, backward
compatibility has to be maintained.

Another thing needing feedback is the new behavior of -T0 when no
memlimit has been specified. Now it has a default soft limit. I hope it
is an improvement but quite possibly it could be improved. Your
suggestion to use MemAvailable on Linux is one thing that could be
included if people think it is a good way to go as a Linux-specific
behavior (having more benefits than downsides).

These are documented on the xz man page. I hope it is clear enough. It
feels a bit complicated, which is a bad sign but on the other hand I
feel the underlying problem isn't as trivial as it seems on the surface.

So far Jia Tan and I have received no feedback about these things at
all. I would prefer to hear the complaints before 5.4.0 is out. :-)

> > This reminds me that I once posted a patch to use openssl for the
> > sha256.
> > https://www.mail-archive.com/xz-devel@tukaani.org/msg00429.html
> >
> > Some distro is using sha256 instead of crc64 by default, I don't
> > remember which one… Not that I care personally ;)  
> 
> I am unsure if we will have time to include your sha256 patch, but if
> we finish all the tasks with extra time it may be considered.

There's more to this than available time. 5.1.2alpha added support for
using SHA-256 from the OS base libraries (not OpenSSL) but starting with
5.2.3 it is disabled by default. Some OS libs use (or used to use) the
same symbol names for SHA-256 functions as OpenSSL while having
incompatible ABI. This led to weird problems when an application
needed both liblzma and OpenSSL as liblzma ended up calling OpenSSL
functions. Plus, some of the OS-specific implementations were slower
than the C code in liblzma (OpenSSL would be faster).

OpenSSL's license has compatibility questions with GNU GPL. If I
remember correctly, some distributions consider OpenSSL to be part of
the core operating system and thus avoid the compatibility problem with
the GPL. I'm not up to date on how distros handle it in 2022 but perhaps
it should be taken into account so that apps depending on liblzma won't
get legally unacceptable OpenSSL linkage. So if OpenSSL support is
added it likely should be disabled by default in configure.ac.

> > > This is everything currently planned.

Translations need to be updated too once the strings and man pages are
close to final. A development release needs to be sent to the
Translation Project at some point. If people want to translate the man
pages too, they will need quite a bit of time.

-- 
Lasse Collin



[xz-devel] XZ Utils 5.2.7

2022-09-30 Thread Lasse Collin
XZ Utils 5.2.7 is available at . Here is an
extract from the NEWS file:

5.2.7 (2022-09-30)

* liblzma:

- Made lzma_filters_copy() never modify the destination
  array if an error occurs. lzma_stream_encoder() and
  lzma_stream_encoder_mt() already assumed this. Before this
  change, if a tiny memory allocation in lzma_filters_copy()
  failed it would lead to a crash (invalid free() or invalid
  memory reads) in the cleanup paths of these two encoder
  initialization functions.

- Added missing integer overflow check to lzma_index_append().
  This affects xz --list and other applications that decode
  the Index field from .xz files using lzma_index_decoder().
  Normal decompression of .xz files doesn't call this code
  and thus most applications using liblzma aren't affected
  by this bug.

- Single-threaded .xz decoder (lzma_stream_decoder()): If
  lzma_code() returns LZMA_MEMLIMIT_ERROR it is now possible
  to use lzma_memlimit_set() to increase the limit and continue
  decoding. This was supposed to work from the beginning
  but there was a bug. With other decoders (.lzma or
  threaded .xz decoder) this already worked correctly.

- Fixed accumulation of integrity check type statistics in
  lzma_index_cat(). This bug made lzma_index_checks() return
  only the type of the integrity check of the last Stream
  when multiple lzma_indexes were concatenated. Most
  applications don't use these APIs but in xz it made
  xz --list not list all check types from concatenated .xz
  files. In xz --list --verbose only the per-file "Check:"
  lines were affected and in xz --robot --list only the "file"
  line was affected.

- Added ABI compatibility with executables that were linked
  against liblzma in RHEL/CentOS 7 or other liblzma builds
  that had copied the problematic patch from RHEL/CentOS 7
  (xz-5.2.2-compat-libs.patch). For the details, see the
  comment at the top of src/liblzma/validate_map.sh.

  WARNING: This uses __symver__ attribute with GCC >= 10.
  In other cases the traditional __asm__(".symver ...")
  is used. Using link-time optimization (LTO, -flto) with
  GCC versions older than 10 can silently result in
  broken liblzma.so.5 (incorrect symbol versions)! If you
  want to use -flto with GCC, you must use GCC >= 10.
  LTO with Clang seems to work even with the traditional
  __asm__(".symver ...") method.

* xzgrep: Fixed compatibility with old shells that break if
  comments inside command substitutions have apostrophes (').
  This problem was introduced in 5.2.6.

* Build systems:

- New #define in config.h: HAVE_SYMBOL_VERSIONS_LINUX

- Windows: Fixed liblzma.dll build with Visual Studio project
  files. It broke in 5.2.6 due to a change that was made to
  improve CMake support.

- Windows: Building liblzma with UNICODE defined should now
  work.

- CMake files are now actually included in the release tarball.
  They should have been in 5.2.5 already.

- Minor CMake fixes and improvements.

* Added a new translation: Turkish
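
As background for the lzma_index_append() item above, this is the general
shape of an overflow check for unsigned sizes in C. It is a generic
sketch and not the actual fix in liblzma; the helper name
checked_add_u64() and the idea of passing an explicit upper bound are
assumptions made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper for illustration; not from liblzma.
 * Adds b to *sum only if the result stays within max.
 * On failure *sum is left unmodified.
 */
bool
checked_add_u64(uint64_t *sum, uint64_t b, uint64_t max)
{
	/* Compare against the remaining headroom instead of */
	/* computing *sum + b, which could wrap around. */
	if (*sum > max || b > max - *sum)
		return false;

	*sum += b;
	return true;
}
```

The point is that the comparison is done before the addition, so the
check itself cannot overflow.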

-- 
Lasse Collin



Re: [xz-devel] XZ Utils 5.3.3alpha

2022-09-29 Thread Guillem Jover
On Wed, 2022-09-28 at 21:41:59 +0800, Jia Tan wrote:
> […] The
> interface for liblzma and xz for the multi threaded decoder does not
> have any planned changes, so things could probably be developed and
> tested using 5.3.3.

Ah, thanks, that's reassuring then. It's one of the things I was
worried about when having to decide whether to merge the patch I've got
implementing this support into dpkg. So, once the alpha version has been
packaged for Debian experimental, I'll test the patch and commit it.

> This would actually help us because having people
> test and give us feedback on both performance and the interface would
> help before committing to things in the stable release.

Given that this will be disabled at configure time (until the support
is in Debian unstable), I'm not sure we'll have many people testing
this, but I guess it will make it possible for people wanting to test
it to do it more easily. And there's always the option to do that over
the entire Debian archive or similar.

Thanks,
Guillem



Re: [xz-devel] XZ Utils 5.3.3alpha

2022-09-28 Thread Jia Tan
> Okay, so that is what you are tracking. I remember that there was a
> stall in the decoding but I don't remember how it played out.
>
> I do remember that I had something for memory allocation/ limit but I
> don't remember if we settled on something or if discussion is needed.
> Also how many decoding threads make sense, etc.

We ended up changing xz to use (total_ram / 4) as the default "soft
limit". If the soft limit is reached, xz will decode single threaded.
The "hard limit" shares the same environment variable and xz option
(--memlimit-decompress).

> > - New ARM64 filter needs to be properly coordinated to other xz
> > implementations and documented.
> > - Converting tests to the new tuktest framework. Most of the tests
> > have been written, but they still need to be reviewed.
> > - liblzma and xz functionality to convert a string into a filter
> > chain. A draft of this is on the mailing list already, but the syntax
> > needs finalizing and the code was not polished.
> > - A patch for .lz support needs review.
> > - A patch for crc64 optimizations needs review.
>
> This reminds me that I once posted a patch to use openssl for the
> sha256.
> https://www.mail-archive.com/xz-devel@tukaani.org/msg00429.html
>
> Some distro is using sha256 instead of crc64 by default, I don't remember
> which one… Not that I care personally ;)

I am unsure if we will have time to include your sha256 patch, but if
we finish all the tasks with extra time it may be considered.

> > - Misc. minor bug fixes.
> >
> > This is everything currently planned. Most things are done and just
> > need review and minor improvements. Don't worry, multi threaded
> > decompression will be coming to xz in a stable release very soon!
>
> Okay. That is good to hear. I would like to get it in Debian and have
> dpkg support for the upcoming stable release. The earlier the better
> since this affects quite a large part of the system. The toolchain
> freeze is in January and I think that dpkg is part of it (or people will
> probably get very nervous if such a change gets integrated later in the
> cycle).

Thank you for notifying us about the January freeze. I hope this is
the extra bit of motivation needed for us to release 5.4.0 as soon as
it is ready.

Lasse is confident that we will have the release by December. I will
see what we can do to make it as early in December as possible since I
understand not wanting to make large changes just before a freeze. The
interface for liblzma and xz for the multi threaded decoder does not
have any planned changes, so things could probably be developed and
tested using 5.3.3. This would actually help us because having people
test and give us feedback on both performance and the interface would
help before committing to things in the stable release.

Thanks again Sebastian for your contributions to both xz and Debian's use of xz!

Jia Tan



Re: [xz-devel] XZ Utils 5.3.3alpha

2022-09-27 Thread Sebastian Andrzej Siewior
On 2022-09-27 21:29:07 [+0800], Jia Tan wrote:
> > Are there any open issues? If not, what needs to be done before the
> > final release can happen?
> 
> The 5.4.0 release that will contain the multi threaded decoder is
> planned for December. The list of open issues related to 5.4.0 in
> general that I am tracking are:
> 
> - Final tweaks to multi threaded decoder (error handling may need
> improvements since the worker threads stay running in some cases when
> they should not).

Okay, so that is what you are tracking. I remember that there was a
stall in the decoding but I don't remember how it played out.

I do remember that I had something for memory allocation/ limit but I
don't remember if we settled on something or if discussion is needed.
Also how many decoding threads make sense, etc.

> - New ARM64 filter needs to be properly coordinated to other xz
> implementations and documented.
> - Converting tests to the new tuktest framework. Most of the tests
> have been written, but they still need to be reviewed.
> - liblzma and xz functionality to convert a string into a filter
> chain. A draft of this is on the mailing list already, but the syntax
> needs finalizing and the code was not polished.
> - A patch for .lz support needs review.
> - A patch for crc64 optimizations needs review.

This reminds me that I once posted a patch to use openssl for the
sha256. 
https://www.mail-archive.com/xz-devel@tukaani.org/msg00429.html

Some distro is using sha256 instead of crc64 by default, I don't remember
which one… Not that I care personally ;)

> - Misc. minor bug fixes.
>
> This is everything currently planned. Most things are done and just
> need review and minor improvements. Don't worry, multi threaded
> decompression will be coming to xz in a stable release very soon!

Okay. That is good to hear. I would like to get it in Debian and have
dpkg support for the upcoming stable release. The earlier the better
since this affects quite a large part of the system. The toolchain
freeze is in January and I think that dpkg is part of it (or people will
probably get very nervous if such a change gets integrated later in the
cycle).

> Jia Tan

Sebastian


