On 2024-03-11 21:23:03 [+0000], Amin Bandali wrote:
> Hi,
Hi,

> On Mon, Mar 11, 2024 at 05:55:31PM +0100, Sebastian Andrzej Siewior wrote:
> > On 2024-03-11 00:05:54 [+0000], Amin Bandali wrote:
> > > Hi Sebastian, all,
> > Hi,
> > 
> > > Will this fix be enough for addressing all cases, though?
> > 
> > I think so. Do you have a test case for me to check?
> 
> Not about pristine-xz specifically; I meant more in the context of
> other devel tools like gbp et al.

Ah, okay. pristine-tar was the only tool that had CI failures during the
upload of the new xz-utils to experimental. I don't know of any other
tools that need to recreate the same binary file.

> > Who is handling the compression in the first place here?
> 
> In the case of "gbp import-orig --uscan", gbp invokes uscan, part of
> the devscripts package which has several perl modules including
> Devscripts::Compression which is a sort of a wrapper around dpkg's
> Dpkg::Compression, which will ultimately run the 'xz' executable.
> 
> In some other cases like "gbp import-orig --filter" mentioned by
> Andrey, gbp does the compression itself.  Which is why I suggested
> that 'Opts' in gbp.pkg.compressor may need to be updated to add '-T1'
> for calls to 'xz'.

Okay. I wouldn't recommend using -T1. It forces xz to produce a single
block using a single thread. The default (without passing the -T
argument) allows xz to use multiple threads and compress into
multiple blocks, which in turn can be decompressed using multiple
threads.
Forcing -T1 means single-threaded compression and decompression.
pristine-tar can handle both cases.

> > The idea is to pass -T1 to xz if nothing was recorded in pristine-tar's
> > delta information. If the -T argument then everything keeps working
> > as-is. If you use gbp to repack the tar archive then I would recommend
> > to no pass -T1 and to use multi-threaded compression. pristine-tar
> > will recongnise this and record this information.
> 
> Sorry I don't think I fully understood this bit.  Could you please
> explain again, perhaps a bit more verbosely?

If you run "pristine-tar gendelta", pristine-tar creates a .delta
file, which is a tar.gz file containing a few files, including the actual
delta from `xdelta' and a file called `wrapper'. The `wrapper' file is
also a tar.gz file; it holds files describing the invocation of the
compression tool, including the arguments required to reproduce the
exact .xz output (from the tar input). Prior to 1.50+nmu1,
pristine-tar did not record the -T argument there unless multi-threaded
compression was used, in which case it used -Tcpus and recorded that.
Since 1.50+nmu1, I made pristine-tar always record the -T argument in
the wrapper file: either -Tcpus in the multi-threaded case, as before, or
-T1 in the single-threaded, single-block case.
That means reproduction always has the matching -T argument. If you
get an older archive which lacks the -T argument, pristine-tar will
assume -T1, which was the old default.
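
To illustrate the fallback, here is a minimal sketch in Python. It is
not pristine-tar's actual code (pristine-tar is written in Perl); the
function names are made up for this mail, and it only looks for -T or
--threads as separate arguments, not for -T bundled with other short
options:

```python
def has_threads_option(args):
    """Return True if the recorded xz arguments already contain -T/--threads."""
    for arg in args:
        # Covers "-T1", "-T4", a bare "-T" followed by a value,
        # "--threads=4" and "--threads 4".
        if arg.startswith("-T") or arg.startswith("--threads"):
            return True
    return False


def effective_xz_args(recorded_args):
    """Hypothetical helper: arguments to use when reproducing the .xz file.

    Older deltas may lack a -T argument; in that case fall back to -T1,
    which matches the old single-threaded default.
    """
    args = list(recorded_args)  # do not modify the caller's list
    if not has_threads_option(args):
        args.append("-T1")
    return args
```

A delta recorded without any -T argument ends up with -T1 appended,
while one that already carries a -T argument is passed through
unchanged.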

> To clarify, the use-cases described earlier involving gbp and
> devscripts aren't necessarily related to pristine-xz, used for
> regenerating pristine xz files; rather, about the generation or
> repacking of xz files *before* they are handed to pristine-xz for
> processing and storage in the repo.  I was trying to imply that
> similarly to how you sent patches for pristine-tar to adapt it for
> changes in xz-utils, that similar patches are probably also needed
> for gbp and devscripts.  Does that make sense?

So gbp and devscripts should be able to deal with xz as-is, since they
don't have any expectations about the resulting binary file. They are
happy as long as the input can be compressed/decompressed. pristine-tar
is the only tool, to the best of my knowledge, that requires
binary-identical output. Therefore I would keep gbp and devscripts as-is
and prefer multi-threaded compression & decompression.
dpkg has used multi-threaded compression for a while and multi-threaded
decompression since Bookworm.

> Thanks,
> -amin

Sebastian
