Bug#1068705: diffoscope crashes on libscout 2.3.2-3 build on unstable but not bullseye

2024-04-10 Thread Fay Stegerman
* Fay Stegerman  [2024-04-11 04:28]:
> * Holger Levsen  [2024-04-11 02:14]:
> > > unzip does seem to extract all the files, though it errors out.  Not sure what
> > > diffoscope should do here.  This is definitely a broken ZIP file.  That bug
> > > should probably be reported against libscout or whatever tooling it used to
> > > create that JAR.
> > 
> > I agree it's more complicated, but fundamentally, diffoscope should *not* crash
> > here! (but rather report the broken zip file.)
> 
> I think we all agree it shouldn't crash :)
> 
> What I meant is that I'm not sure it should simply catch the error, report the
> file as broken, and not attempt extraction, or if it makes sense to attempt to
> work around this issue, at least in cases like this specific one where the
> entries are exact duplicates and the files can presumably be safely extracted.
> I think my workaround (which could be implemented slightly differently as well,
> without modifying the ZipFile, but processing it differently in diffoscope)
> would accomplish that for this JAR at least.  I could make an MR for that.
> Though as I said I will also report this upstream to cpython, probably tomorrow.
> 
> - Fay

The attached patch avoids the crash in this case, FWIW.  I would still recommend
catching the error for other cases.

- Fay
diff --git a/diffoscope/comparators/zip.py b/diffoscope/comparators/zip.py
index 2a27042a..4bfb1527 100644
--- a/diffoscope/comparators/zip.py
+++ b/diffoscope/comparators/zip.py
@@ -182,7 +182,12 @@ class ZipDirectory(Directory, ArchiveMember):
 
 class ZipContainer(Archive):
     def open_archive(self):
-        return zipfile.ZipFile(self.source.path, "r")
+        zf = zipfile.ZipFile(self.source.path, "r")
+        self.name_to_info = {}
+        for info in zf.infolist():
+            if info.filename not in self.name_to_info:
+                self.name_to_info[info.filename] = info
+        return zf
 
     def close_archive(self):
         self.archive.close()
@@ -199,7 +204,8 @@ class ZipContainer(Archive):
         ).encode(sys.getfilesystemencoding(), errors="replace")
 
         try:
-            with self.archive.open(member_name) as source, open(
+            info = self.name_to_info[member_name]
+            with self.archive.open(info) as source, open(
                 targetpath, "wb"
             ) as target:
                 shutil.copyfileobj(source, target)
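
For illustration, here is a standalone sketch of the idea in the patch above:
build a name-to-ZipInfo map in which the first entry for each name wins, then
open members via the ZipInfo object rather than by name.  The helper names
first_entries and build_duplicate_zip are made up for this example and are not
part of diffoscope.  The test archive is an assumption too: it merely contains
two distinct entries with the same name, so it does not reproduce the
overlapping entries found in libscout.jar.

```python
import io
import warnings
import zipfile


def first_entries(zf):
    """Map each member name to the first ZipInfo that carries it."""
    name_to_info = {}
    for info in zf.infolist():
        # setdefault keeps the earliest entry and ignores later duplicates.
        name_to_info.setdefault(info.filename, info)
    return name_to_info


def build_duplicate_zip():
    """Return an in-memory ZIP holding two entries named a.txt (test data)."""
    buf = io.BytesIO()
    with warnings.catch_warnings():
        # zipfile warns about the duplicate name; silence that for the demo.
        warnings.simplefilter("ignore", UserWarning)
        with zipfile.ZipFile(buf, "w") as out:
            out.writestr("a.txt", b"first")
            out.writestr("a.txt", b"second")
    buf.seek(0)
    return buf


with zipfile.ZipFile(build_duplicate_zip()) as zf:
    infos = first_entries(zf)
    # Opening by ZipInfo bypasses the name lookup inside ZipFile.
    with zf.open(infos["a.txt"]) as source:
        first_payload = source.read()
    # Opening by name resolves to the last entry with that name.
    by_name_payload = zf.read("a.txt")
```

Keeping the map outside the ZipFile object means nothing inside the zipfile
module is modified, at the cost of one extra lookup per extracted member.
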
___
Reproducible-builds mailing list
Reproducible-builds@alioth-lists.debian.net
https://alioth-lists.debian.net/cgi-bin/mailman/listinfo/reproducible-builds


Bug#1068705: diffoscope crashes on libscout 2.3.2-3 build on unstable but not bullseye

2024-04-10 Thread Fay Stegerman
* Holger Levsen  [2024-04-11 02:14]:
> > unzip does seem to extract all the files, though it errors out.  Not sure what
> > diffoscope should do here.  This is definitely a broken ZIP file.  That bug
> > should probably be reported against libscout or whatever tooling it used to
> > create that JAR.
> 
> I agree it's more complicated, but fundamentally, diffoscope should *not* crash
> here! (but rather report the broken zip file.)

I think we all agree it shouldn't crash :)

What I meant is that I'm not sure it should simply catch the error, report the
file as broken, and not attempt extraction, or if it makes sense to attempt to
work around this issue, at least in cases like this specific one where the
entries are exact duplicates and the files can presumably be safely extracted.
I think my workaround (which could be implemented slightly differently as well,
without modifying the ZipFile, but processing it differently in diffoscope)
would accomplish that for this JAR at least.  I could make an MR for that.
Though as I said I will also report this upstream to cpython, probably tomorrow.

- Fay



Bug#1068705: diffoscope crashes on libscout 2.3.2-3 build on unstable but not bullseye

2024-04-10 Thread Chris Lamb
Fay Stegerman wrote:

> Salsa is probably better for figuring out what to do next, but I get
> these mails too :)

Oh, hey! o/

> unzip does seem to extract all the files, though it errors out.  Not sure what
> diffoscope should do here.  This is definitely a broken ZIP file.

First; great debugging there, thank you. :)

Okay, separate from your suggestion that a bug should be filed against
libscout with its broken zip file, I think that diffoscope should not
traceback and crash on this particular input. We do this elsewhere with
(most) invalid inputs and it makes a lot of sense here as well.

I'll modify diffoscope tomorrow morning to catch the specific
exception being thrown by Python's builtin zipfile module and add a
suitable message as a user-visible 'comment' — again, something we have
plenty of prior art for elsewhere in the codebase. Thanks again.
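
As a rough sketch of what catching that exception could look like (this is
not diffoscope's actual code; the function name read_member_or_comment and the
wording of the comment are hypothetical), the BadZipFile raised by the zipfile
module can be turned into a message instead of a traceback:

```python
import os
import tempfile
import zipfile


def read_member_or_comment(path, member_name):
    """Return (data, comment); comment is set when the archive is malformed."""
    try:
        with zipfile.ZipFile(path) as zf, zf.open(member_name) as source:
            return source.read(), None
    except zipfile.BadZipFile as exc:
        # Report the problem instead of letting the traceback escape.
        return None, f"Malformed ZIP archive: {exc}"


# Demonstration on a file that is not a ZIP archive at all.
with tempfile.TemporaryDirectory() as tmp:
    bogus = os.path.join(tmp, "bogus.jar")
    with open(bogus, "wb") as fh:
        fh.write(b"this is not a zip file")
    data, comment = read_member_or_comment(bogus, "a.txt")
```

Only BadZipFile is caught here, so unrelated errors (a missing member raises
KeyError, for instance) would still propagate.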


Best wishes,

-- 
  o
⬋   ⬊  Chris Lamb
   o o reproducible-builds.org 
⬊   ⬋
  o



Bug#1068705: diffoscope crashes on libscout 2.3.2-3 build on unstable but not bullseye

2024-04-10 Thread Holger Levsen
On Thu, Apr 11, 2024 at 01:48:18AM +0200, Fay Stegerman wrote:
> Salsa is probably better for figuring out what to do next, but I get these mails
> too :)

:)
 
> The libscout.jar has duplicate ZIP entries in the central directory, pointing to
> the same actual entry in the ZIP.  So the "overlapped entries" error is entirely
> correct, even if it's not a zip bomb.

ah!

> unzip does seem to extract all the files, though it errors out.  Not sure what
> diffoscope should do here.  This is definitely a broken ZIP file.  That bug
> should probably be reported against libscout or whatever tooling it used to
> create that JAR.

I agree it's more complicated, but fundamentally, diffoscope should *not* crash
here! (but rather report the broken zip file.)

thanks!


-- 
cheers,
Holger

 ⢀⣴⠾⠻⢶⣦⠀
 ⣾⠁⢠⠒⠀⣿⡁  holger@(debian|reproducible-builds|layer-acht).org
 ⢿⡄⠘⠷⠚⠋⠀  OpenPGP: B8BF54137B09D35CF026FE9D 091AB856069AAA1C
 ⠈⠳⣄

I’ve said it once, and I’ll say it a thousand times: If the penalty for
breaking a law is a fine, then that law only exists for the poor.




Bug#1068705: diffoscope crashes on libscout 2.3.2-3 build on unstable but not bullseye

2024-04-10 Thread Fay Stegerman
* Fay Stegerman  [2024-04-11 01:48]:
> * Holger Levsen  [2024-04-10 19:43]:
> > On Wed, Apr 10, 2024 at 06:12:21PM +0100, Chris Lamb wrote:
> > > Holger Levsen wrote:
> > > 
> > > > when building libscout 2.3.2-3 on current unstable, the result is also 
> > > > unreproducible, but diffoscope crashes when analysing the diff.
> > > I think this is somewhat related to:
> > >   https://salsa.debian.org/reproducible-builds/diffoscope/-/issues/362
> > > … which was said to be fixed by Fay in 
> > > cc3b077f6ef97b4e20036e9823926fe633c7d4d0
> > > that released as diffoscope version 263 on 2024-04-05.
> > > However, I can see that the current output of libscout/amd64 on
> > > tests.reproducible-builds.org is failing with this very version:
> > 
> > yes, indeed.
> > 
> > also, this happened before too, I'm sure about at least with diffoscope 260 
> > already.
> >  
> > > Will loop Fay in via Salsa presently.
> > 
> > thank you!
> 
> Salsa is probably better for figuring out what to do next, but I get these mails
> too :)
> 
> The libscout.jar has duplicate ZIP entries in the central directory, pointing to
> the same actual entry in the ZIP.  So the "overlapped entries" error is entirely
> correct, even if it's not a zip bomb.
> 
>   >>> import zipfile
>   >>> zf = zipfile.ZipFile("libscout.jar")
>   >>> fh = zf.open("javax/annotation/CheckForNull.class")
>   zipfile.BadZipFile: Overlapped entries: 'javax/annotation/CheckForNull.class' (possible zip bomb)
[...]

I do have a workaround of sorts for this specific case of duplicate entries.
I'll open a cpython issue to report it upstream, though they may not consider
this a bug, and may even consider it the correct behaviour.  Not sure myself
tbh :)

  >>> for info in reversed(zf.infolist()):
  ...   zf.NameToInfo[info.filename] = info
  >>> fh = zf.open("javax/annotation/CheckForNull.class") # works now
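
To spell out why iterating in reverse helps: plain assignment into a dict
keeps the last value written, so walking infolist() backwards leaves the
first entry for each name in place.  A small self-contained sketch follows.
The test archive is an assumption (two distinct entries sharing a name, not
the overlapping entries from libscout.jar), and NameToInfo is the same
undocumented CPython attribute used in the snippet above:

```python
import io
import warnings
import zipfile

# Build an in-memory ZIP with two entries named a.txt (test data only).
buf = io.BytesIO()
with warnings.catch_warnings():
    # zipfile warns about the duplicate name; silence that for the demo.
    warnings.simplefilter("ignore", UserWarning)
    with zipfile.ZipFile(buf, "w") as out:
        out.writestr("a.txt", b"first")
        out.writestr("a.txt", b"second")

zf = zipfile.ZipFile(io.BytesIO(buf.getvalue()))
# By default a lookup by name resolves to the last entry with that name.
before = zf.read("a.txt")
# Walking backwards means the first entry is assigned last and so wins.
for info in reversed(zf.infolist()):
    zf.NameToInfo[info.filename] = info
after = zf.read("a.txt")
zf.close()
```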

- Fay



Bug#1068705: diffoscope crashes on libscout 2.3.2-3 build on unstable but not bullseye

2024-04-10 Thread Fay Stegerman
* Holger Levsen  [2024-04-10 19:43]:
> On Wed, Apr 10, 2024 at 06:12:21PM +0100, Chris Lamb wrote:
> > Holger Levsen wrote:
> > 
> > > when building libscout 2.3.2-3 on current unstable, the result is also 
> > > unreproducible, but diffoscope crashes when analysing the diff.
> > I think this is somewhat related to:
> >   https://salsa.debian.org/reproducible-builds/diffoscope/-/issues/362
> > … which was said to be fixed by Fay in 
> > cc3b077f6ef97b4e20036e9823926fe633c7d4d0
> > that released as diffoscope version 263 on 2024-04-05.
> > However, I can see that the current output of libscout/amd64 on
> > tests.reproducible-builds.org is failing with this very version:
> 
> yes, indeed.
> 
> also, this happened before too, I'm sure about at least with diffoscope 260 
> already.
>  
> > Will loop Fay in via Salsa presently.
> 
> thank you!

Salsa is probably better for figuring out what to do next, but I get these mails
too :)

The libscout.jar has duplicate ZIP entries in the central directory, pointing to
the same actual entry in the ZIP.  So the "overlapped entries" error is entirely
correct, even if it's not a zip bomb.

  >>> import zipfile
  >>> zf = zipfile.ZipFile("libscout.jar")
  >>> fh = zf.open("javax/annotation/CheckForNull.class")
  zipfile.BadZipFile: Overlapped entries: 'javax/annotation/CheckForNull.class' (possible zip bomb)
  >>> len([i for i in zf.infolist() if i.filename == "javax/annotation/CheckForNull.class"])
  2
  >>> len(zf.namelist()) - len(set(zf.namelist()))
  35
  >>> x, y = [i for i in zf.infolist() if i.filename == "javax/annotation/CheckForNull.class"]
  >>> x.header_offset
  23065534
  >>> y.header_offset
  23065534
  >>> x._end_offset
  23065890
  >>> y._end_offset
  23065534
  >>> zf.open(x)
  
  >>> zf.open(y)
  Traceback (most recent call last):
  zipfile.BadZipFile: Overlapped entries: 'javax/annotation/CheckForNull.class' (possible zip bomb)

$ unzip -q -d foo libscout.jar
error: invalid zip file with overlapped components (possible zip bomb)

unzip does seem to extract all the files, though it errors out.  Not sure what
diffoscope should do here.  This is definitely a broken ZIP file.  That bug
should probably be reported against libscout or whatever tooling it used to
create that JAR.

FWIW, it seems the libscout.jar files in both .deb files are identical apart
from timestamps and the ordering of entries in the ZIP.
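
The checks from the session above can be wrapped into a small helper that
uses only the public filename and header_offset attributes of ZipInfo.  This
is a sketch: the function names duplicate_entries and shares_local_header are
made up for illustration, and the demo archive is generated on the fly, so
its duplicates have different offsets, unlike those in libscout.jar.

```python
import collections
import io
import warnings
import zipfile


def duplicate_entries(zf):
    """Return {name: [header_offset, ...]} for names listed more than once."""
    offsets = collections.defaultdict(list)
    for info in zf.infolist():
        offsets[info.filename].append(info.header_offset)
    return {name: offs for name, offs in offsets.items() if len(offs) > 1}


def shares_local_header(offsets):
    """True if every duplicate points at the same local file header."""
    return len(set(offsets)) == 1


# Demo archive: a.txt is written twice, b.txt once.
buf = io.BytesIO()
with warnings.catch_warnings():
    # zipfile warns about the duplicate name; silence that for the demo.
    warnings.simplefilter("ignore", UserWarning)
    with zipfile.ZipFile(buf, "w") as out:
        out.writestr("a.txt", b"first")
        out.writestr("a.txt", b"second")
        out.writestr("b.txt", b"unique")

with zipfile.ZipFile(io.BytesIO(buf.getvalue())) as zf:
    dups = duplicate_entries(zf)
```

For a JAR like the one described here, shares_local_header would be expected
to return True for each duplicated name, since both central directory entries
report the same header_offset.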

- Fay



Re: Bug#882511: dpkg-buildpackage: should allow caller to force inclusion of source in buildinfo

2024-04-10 Thread Vagrant Cascadian
On 2024-04-09, Guillem Jover wrote:
> I've now finished the change I had in that branch, which implements
> support so that dpkg-buildpackage can be passed a .dsc or a source-dir,
> and in the former will first extract it, and for both then it will
> change directory to the source tree. If it got passed a .dsc then it
> will instruct dpkg-genbuildinfo to include a ref to it.
>
> Which I think accomplishes the requested behavior in a safe way? I've
> attached what I've got, which I'm planning on merging for 1.22.7. I'll
> probably split that into two commits though before merging.

Had a chance to take this for a test run, and it appears to work, though
with a few surprises...

  dpkg-buildpackage -- hello_2.10-3.dsc

This ends up regenerating the .dsc, since the default is
--build=any,all,source ... which may leave a different .dsc checksum in the
.buildinfo than that of the .dsc passed on the command line. That makes some
sense, but maybe it would be better to error out? I would not expect the
.dsc to be regenerated if you're passing dpkg-buildpackage a .dsc!


  dpkg-buildpackage --build=any,all -- /path/to/hello_2.10-3.dsc

Fails to find the .dsc file, as it appears to extract the sources to
hello-2.10 and then expects to find ../hello_2.10-3.dsc


All that said ... this seemed to work for me:

  dpkg-buildpackage --build=any,all -- hello_2.10-3.dsc

So yay, progress! Thanks!


None of the above cases clean up the hello-2.10 directory extracted from the
.dsc file, so before re-running any of them you need to manually clean that up,
run from a clean directory, or experience various failure modes with the
existing hello-2.10 directory.


So a few little glitches, but overall this seems close to something we
have really wanted for reproducible builds! And just for good measure,
thanks!


live well,
  vagrant




Bug#1068705: diffoscope crashes on libscout 2.3.2-3 build on unstable but not bullseye

2024-04-10 Thread Holger Levsen
On Wed, Apr 10, 2024 at 06:12:21PM +0100, Chris Lamb wrote:
> Holger Levsen wrote:
> 
> > when building libscout 2.3.2-3 on current unstable, the result is also 
> > unreproducible, but diffoscope crashes when analysing the diff.
> I think this is somewhat related to:
>   https://salsa.debian.org/reproducible-builds/diffoscope/-/issues/362
> … which was said to be fixed by Fay in 
> cc3b077f6ef97b4e20036e9823926fe633c7d4d0
> that released as diffoscope version 263 on 2024-04-05.
> However, I can see that the current output of libscout/amd64 on
> tests.reproducible-builds.org is failing with this very version:

yes, indeed.

also, this happened before too, I'm sure about at least with diffoscope 260 
already.
 
> Will loop Fay in via Salsa presently.

thank you!


-- 
cheers,
Holger

 ⢀⣴⠾⠻⢶⣦⠀
 ⣾⠁⢠⠒⠀⣿⡁  holger@(debian|reproducible-builds|layer-acht).org
 ⢿⡄⠘⠷⠚⠋⠀  OpenPGP: B8BF54137B09D35CF026FE9D 091AB856069AAA1C
 ⠈⠳⣄

Fischers Fritz fischt Plastik.




Bug#1068705: diffoscope crashes on libscout 2.3.2-3 build on unstable but not bullseye

2024-04-10 Thread Chris Lamb
Holger Levsen wrote:

> when building libscout 2.3.2-3 on current unstable, the result is also 
> unreproducible, but diffoscope crashes when analysing the diff.

I think this is somewhat related to:

  https://salsa.debian.org/reproducible-builds/diffoscope/-/issues/362

… which was said to be fixed by Fay in cc3b077f6ef97b4e20036e9823926fe633c7d4d0
that released as diffoscope version 263 on 2024-04-05.

However, I can see that the current output of libscout/amd64 on
tests.reproducible-builds.org is failing with this very version:

  Tue Apr  9 12:14:14 UTC 2024  I: diffoscope 263 will be used to compare the two builds:

  From https://gist.github.com/lamby/e5db96d4d61612485a469b826590192e/raw
  (saved output for posterity)

Will loop Fay in via Salsa presently.


Regards,

-- 
  ,''`.
 : :'  : Chris Lamb
 `. `'`  la...@debian.org  chris-lamb.co.uk
   `-



Please review the draft for March's report

2024-04-10 Thread Chris Lamb
Hi all,

Sorry for the delay in getting this out — it was, quite genuinely, a
bumper amount of things that needed condensing, rewriting and
generally getting into readable shape. Anyway, if folks would be so
kind as to review the draft for last month's report here:

  https://reproducible-builds.org/reports/2024-03/?draft

… or, via the Git repository itself:

  https://salsa.debian.org/reproducible-builds/reproducible-website/blob/master/_reports/2024-03.md

I intend to publish it no earlier than:

  $ date -d 'Thu, 11 Apr 2024 17:30:00 +0100'

  https://time.is/compare/1730_11_Apr_2024_in_BST

§

As ever, please feel free to commit/push to drafts directly without the
overhead of sending patches or merge requests. You should make your changes
to the "_reports/2024-03.md" file in the "reproducible-website" repository:

  $ git clone https://salsa.debian.org/reproducible-builds/reproducible-website
  $ cd reproducible-website
  $ sensible-editor _reports/2024-03.md

I am happy to reword and/or rework additions prior to publishing. If you
currently do not have access to the above repository, you can request access
by following the instructions at:

  https://reproducible-builds.org/contribute/salsa/


Regards,

-- 
  o
⬋   ⬊  Chris Lamb
   o o reproducible-builds.org 
⬊   ⬋
  o
