Re: Does diffoscope compares disk partitions

2023-03-02 Thread intrigeri
Hi,

Mattia Rizzolo (2023-03-01):
> On Wed, Mar 01, 2023 at 10:55:41AM +, Chris Lamb wrote:
>> > Does it support disk partitions or am I missing something?
>>
>> The short answer is that diffoscope *should* support comparing your
>> partition images properly, instead of falling back to a raw
>> xxd(1) comparison. The reason diffoscope doesn't do that right now is
>> either due to a bug, or we just need to extend support for this
>> particular type of partition.
>
> Mhh, I don't think it should.
> I don't recall ever seeing anything related to GPT or MBR, so I don't
> think diffoscope supports whole disk images with partition tables.

IIRC at Tails, we happily used diffoscope to compare our "USB images",
which are full disk images with a GPT and an ESP (+ the hybrid GPT
machinery to make them bootable as legacy BIOS too).

Take care,
-- 
intrigeri


Re: Does diffoscope compares disk partitions

2023-03-02 Thread Arnout Engelen
On Thu, Mar 2, 2023, at 02:09, John Gilmore wrote:
> I have been surprised at how much effort has gone into "diffoscope" as a
> total fraction of the Reproducible Builds effort.

How do you know?

> Perhaps it is a case
> akin to the drunk looking for his keys under the streetlight where he
> can see, rather than in the dark where he dropped them.  (It's easier to
> hack diffoscope than to hack thousands of irreproducible packages.)

I suspect it might be a visibility effect, but the other way around: diffoscope 
issues may be more likely to be discussed on this list than issues with 
individual packages. If you look at the monthly reports there's a healthy batch 
of distro work happening each month, and even there it might be 
skewed: it's much easier to enumerate the changes to diffoscope, than it is to 
gather reproducibility work from all over the internet.

I, for one, spend *way* more time fixing irreproducible packages (where 
diffoscope is an amazing tool) than working on diffoscope itself (I only did 
some issue reporting and testing). I rarely post about it here, and I've been 
really bad at making it visible in the monthly reports as well - I should get 
back into that habit ;).


Kind regards,

Arnout


Re: Does diffoscope compares disk partitions

2023-03-02 Thread Mattia Rizzolo
On Wed, Mar 01, 2023 at 10:07:33PM -0800, Vagrant Cascadian wrote:
> I daresay it is more akin to someone looking for lost keys by inventing
> a flashlight to look near where they dropped them, and they happen to
> have dropped the keys into a bin with miscellaneous arbitrary pieces and
> bits of things, many of which are shaped and/or sized roughly like your
> keys... and wow, nobody had thought before to make a flashlight shine at so
> many different wavelengths, or detect the density of the materials, or
> produce a harmonic resonance with certain materials, or a high intensity
> laser to burn off all the organic detritus that has accumulated,
> and... only to find out that people have been losing their keys in this
> bin for decades, and we found someone else's keys too! Oh, look, with
> this small tweak, we could also detect antique coins...


Wow…

We should quote this in our blog post :>

-- 
regards,
Mattia Rizzolo

GPG Key: 66AE 2B4A FCCF 3F52 DA18  4D18 4B04 3FCD B944 4540  .''`.
More about me:  https://mapreri.org : :'  :
Launchpad user: https://launchpad.net/~mapreri  `. `'`
Debian QA page: https://qa.debian.org/developer.php?login=mattia  `-




Re: Does diffoscope compares disk partitions -> The German word is 'jein' (yes and no)

2023-03-01 Thread Roland Clobus

Hello,

On 01/03/2023 22:59, Thomas Schmitt wrote:
> Mattia Rizzolo wrote:
>> ...
>
> ... The file tree and the data files' content is only
> a part of a bootable ISO 9660 image. There's executable code in the
> blind spots of such a view: MBR legacy BIOS boot code, EFI programs in
> the EFI partition, "hidden" El Torito boot images, ...
>
>> I know that thing has been used to work on reproducible bootable images,
>
> Possibly because typical pitfalls aren't tested. Like FAT timestamps in
> the EFI partition.


In live-build, the FAT timestamps in the EFI partition are carefully
ironed out [1][2]. Before the fixes, they did show up in the diffoscope
output, but only in the hex dump.
After I realised that the differences in the hex dump were actually
timestamps (or timestamp related), I could remove the differences.
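
(Editorial sketch, not taken from live-build or diffoscope: one way to
recognise such bytes in a hex dump is to decode them as FAT timestamps.
The bit layout and the offsets 22 and 24 follow the standard FAT short
directory entry; the function names are made up for this illustration.)

```python
import datetime
import struct


def decode_fat_timestamp(date_field: int, time_field: int) -> datetime.datetime:
    """Decode the packed 16-bit FAT date and time fields.

    Date: bits 15-9 = years since 1980, bits 8-5 = month, bits 4-0 = day.
    Time: bits 15-11 = hours, bits 10-5 = minutes, bits 4-0 = seconds / 2.
    Raises ValueError for fields that do not form a valid calendar date.
    """
    year = 1980 + ((date_field >> 9) & 0x7F)
    month = (date_field >> 5) & 0x0F
    day = date_field & 0x1F
    hour = (time_field >> 11) & 0x1F
    minute = (time_field >> 5) & 0x3F
    second = (time_field & 0x1F) * 2  # FAT stores seconds in 2-second steps
    return datetime.datetime(year, month, day, hour, minute, second)


def entry_write_time(entry: bytes) -> datetime.datetime:
    """Return the last-write timestamp of a 32-byte FAT short directory entry."""
    if len(entry) != 32:
        raise ValueError("a FAT directory entry is 32 bytes long")
    # Little-endian: write time at offset 22, write date at offset 24.
    time_field, date_field = struct.unpack_from("<HH", entry, 22)
    return decode_fat_timestamp(date_field, time_field)
```

If two images differ only in bytes that decode to plausible build times,
that is a strong hint that the difference is timestamp related.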


From the original mail:
> ** 2023-02-27 14:09:12 D: diffoscope.comparators.utils.command: Executing xxd {}
> ** Killed

I've seen this too. You'll need loads of memory and temporary disk
space, especially in the initial stages of finding the root cause of the
differences (comparing two xxd outputs of 6 GB files can produce a lot of
output).
The original mail also hints at a possible root cause: the timestamps
inside each partition have not been clamped to SOURCE_DATE_EPOCH. See
e.g. [3] for how it is done in live-build.
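
(Editorial sketch of what "clamping" means here. This is not the
live-build code from [3]; the function name clamp_mtimes is an assumption
made for this illustration. It lowers every modification time that is
newer than the given epoch and leaves older ones untouched.)

```python
import os


def clamp_mtimes(root: str, epoch: int) -> int:
    """Clamp mtimes under root to at most epoch; return how many were changed.

    Symbolic links are skipped, because changing a link's own timestamp
    is not supported on every platform.
    """
    clamped = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        # dirpath covers every directory, including root itself.
        candidates = [dirpath] + [os.path.join(dirpath, n) for n in filenames]
        for path in candidates:
            if os.path.islink(path):
                continue
            if os.stat(path).st_mtime > epoch:
                os.utime(path, (epoch, epoch))  # sets atime and mtime
                clamped += 1
    return clamped


# Typical use, assuming SOURCE_DATE_EPOCH is set in the environment:
#   clamp_mtimes("binary/", int(os.environ["SOURCE_DATE_EPOCH"]))
```

The directory name "binary/" in the usage comment is only a placeholder.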


> It is possible to make reproducible ISOs by xorriso and SOURCE_DATE_EPOCH.
> But the input files for the ISO production have to be identical, too.

Indeed, before the meta-content of ISO files can be considered, its 
payload should have been harmonized. Live-build does it in [3].


> Both, the use of SOURCE_DATE_EPOCH and care for identical input are not
> tradition with bootable ISO production.
>
> So another reason for no protests might be that bootable ISOs aren't often
> challenged by reproducibility tests which use diffoscope.


I agree that having a specialised comparator would be nice, but even the
hex dump has fulfilled its purpose. The live ISO images are now 100%
reproducible (the header contains a lot of the magic described in the
previous mails). The ISO images are tested with diffoscope on a daily
basis in Jenkins [4].


With kind regards,
Roland

[1] https://sources.debian.org/src/live-build/1:20230131/scripts/build/binary_grub-efi/?hl=262#L262
[2] https://sources.debian.org/src/live-build/1:20230131/scripts/build/efi-image/?hl=79#L79
[3] https://sources.debian.org/src/live-build/1:20230131/scripts/build/binary_iso/?hl=180#L180
[4] https://jenkins.debian.net/view/live/




Re: Does diffoscope compares disk partitions

2023-03-01 Thread Vagrant Cascadian
On 2023-03-01, John Gilmore wrote:
>>> So, overall, I actually don't think that diffoscope has the requested
>>> support, and it's not "just" a bug of failed identification.
>
> I have been surprised at how much effort has gone into "diffoscope" as a
> total fraction of the Reproducible Builds effort.  Perhaps it is a case
> akin to the drunk looking for his keys under the streetlight where he
> can see, rather than in the dark where he dropped them.

I daresay it is more akin to someone looking for lost keys by inventing
a flashlight to look near where they dropped them, and they happen to
have dropped the keys into a bin with miscellaneous arbitrary pieces and
bits of things, many of which are shaped and/or sized roughly like your
keys... and wow, nobody had thought before to make a flashlight shine at so
many different wavelengths, or detect the density of the materials, or
produce a harmonic resonance with certain materials, or a high intensity
laser to burn off all the organic detritus that has accumulated,
and... only to find out that people have been losing their keys in this
bin for decades, and we found someone else's keys too! Oh, look, with
this small tweak, we could also detect antique coins...


> (It's easier to hack diffoscope than to hack thousands of
> irreproducible packages.)

Fixing reproducibility issues blindfolded does not seem like an
efficient way to fix issues either. We have already fixed tens of
thousands of issues, and have thousands more that we are working on.
Diffoscope is a highly useful tool towards that end.


live well,
  vagrant




Re: Does diffoscope compares disk partitions

2023-03-01 Thread John Gilmore
>> So, overall, I actually don't think that diffoscope has the requested
>> support, and it's not "just" a bug of failed identification.

I have been surprised at how much effort has gone into "diffoscope" as a
total fraction of the Reproducible Builds effort.  Perhaps it is a case
akin to the drunk looking for his keys under the streetlight where he
can see, rather than in the dark where he dropped them.  (It's easier to
hack diffoscope than to hack thousands of irreproducible packages.)  I
for one am happy that diffoscope DOESN'T have support for umpteen disk
partitioning schemes and file system formats.

John

PS:  Has anyone on the list considered writing an article for the
Journal of Irreproducible Results about our effort?


Re: Does diffoscope compares disk partitions

2023-03-01 Thread Thomas Schmitt
Hi,

Mattia Rizzolo wrote:
> I already knew that hybrid ISO 9660 are pretty much devil's offspring,
> and I guess I could have done without your introduction to them /o\

Since i am helping to create them i feel obligated to warn the public.
{:)


> FWIW, currently diffoscope _does_ have an iso9660 comparator, but that
> just runs libarchive on it, so I _believe_ it ignores the existence of
> a possible partition table and just goes straight to the really
> compliant "data part" of the image.

That's a bit naive. The file tree and the data files' content is only
a part of a bootable ISO 9660 image. There's executable code in the
blind spots of such a view: MBR legacy BIOS boot code, EFI programs in
the EFI partition, "hidden" El Torito boot images, ...


> I know that thing has been used to work on reproducible bootable images,

Possibly because typical pitfalls aren't tested. Like FAT timestamps in
the EFI partition.


> so I guess it just always happened that the "hybrid part" of them were
> already reproducible :3

It is possible to make reproducible ISOs by xorriso and SOURCE_DATE_EPOCH.
But the input files for the ISO production have to be identical, too.
Both, the use of SOURCE_DATE_EPOCH and care for identical input are not
tradition with bootable ISO production.

So another reason for no protests might be that bootable ISOs aren't often
challenged by reproducibility tests which use diffoscope.


Have a nice day :)

Thomas



Re: Does diffoscope compares disk partitions

2023-03-01 Thread Mattia Rizzolo
Hello Thomas,

On Wed, Mar 01, 2023 at 09:44:56PM +0100, Thomas Schmitt wrote:
> There are various layouts. Some are quite awkward and invite getting
> lost in the woods. (Actually they lure firmwares out of the woods.)

Duh.
I already knew that hybrid ISO 9660 are pretty much devil's offspring,
and I guess I could have done without your introduction to them /o\

> All these images contain block ranges which are neither covered by the
> ISO 9660 file tree nor by the partitions.
> One will always first have to compare such ISOs byte-by-byte and only in
> case of mismatch look into the ISO filesystem and partitions for a more
> qualified presentation of the differences.

FWIW, currently diffoscope _does_ have an iso9660 comparator, but that
just runs libarchive on it, so I _believe_ it ignores the existence of
a possible partition table and just goes straight to the really
compliant "data part" of the image.

I know that thing has been used to work on reproducible bootable images,
so I guess it just always happened that the "hybrid part" of them were
already reproducible :3
Else I reckon somebody would have provided some improvements in this
area already.

-- 
regards,
Mattia Rizzolo

GPG Key: 66AE 2B4A FCCF 3F52 DA18  4D18 4B04 3FCD B944 4540  .''`.
More about me:  https://mapreri.org : :'  :
Launchpad user: https://launchpad.net/~mapreri  `. `'`
Debian QA page: https://qa.debian.org/developer.php?login=mattia  `-




Re: Does diffoscope compares disk partitions

2023-03-01 Thread Thomas Schmitt
Hi,

Mattia Rizzolo wrote:
> So, overall, I actually don't think that diffoscope has the requested
> support, and it's not "just" a bug of failed identification.

If it gets support for detecting partitions in image files then it will
have to handle hybrid bootable ISO 9660 images which bear partition
tables for the case that they get copied onto a USB stick.
There are various layouts. Some are quite awkward and invite getting
lost in the woods. (Actually they lure firmwares out of the woods.)

E.g. debian-11.5.0-amd64-netinst.iso has two partitions in the MBR table
plus a GPT header which most readers won't take into account but which
will appease older versions of OVMF virtual EFI. MBR partition 1 marks the
whole ISO 9660 filesystem. MBR partition 2 marks a data file range within
that ISO filesystem (i.e. abominable partition nesting). The data file is
actually a FAT filesystem image, which could possibly begin with an MBR
with partition table. Further there is an Apple Partition Map with a
single partition marking the FAT filesystem image for EFI.
This partition layout is an invention of Matthew J. Garrett for Fedora.

ubuntu-22.04-desktop-amd64.iso has a nearly properly announced GPT with
the tiny flaw of an MBR partition 2 of type 0x00 which carries the
"bootable" flag. That's a lure for some legacy BIOS laptops which don't
consider a disk for booting if it does not have a partition with the flag.
The 3 non-nested GPT partitions mark the ISO filesystem, the EFI partition,
and the traditional 300 KiB of padding. The EFI partition is not a data
file in the ISO filesystem.
The confusion is completed by the fact that the overall image is a
mountable ISO filesystem which claims the whole size of the image file
including the EFI partition and the padding. Nevertheless GPT partition 1
does not start at the start of the image file but is an ISO filesystem too,
which bears the same files as the overall ISO filesystem.

grub-mkrescue produces for x86_64 a boringly spec-compliant ISO with GPT
at the cost that there is no mountable ISO partition in the GPT. But the
image also contains an Apple Partition Map with a partition pointing to
a HFS+ filesystem which shows the same files as the ISO 9660 filesystem.
(It is of course possible to mount the overall image file as ISO 9660.)

Quite like a normal disk image is Fedora-Everything-netinst-x86_64-37-1.7.iso
with its 3 GPT partitions. The only unusual aspect is the mountability of
the overall image file with an ISO filesystem which claims the block range
of all three partitions.
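
(Editorial sketch: the MBR oddities described above can be inspected by
reading the four 16-byte partition entries at offset 446 of the first
512-byte block. The names MbrPartition and parse_mbr are assumptions for
this illustration and are not part of diffoscope or xorriso.)

```python
import struct
from typing import List, NamedTuple


class MbrPartition(NamedTuple):
    index: int         # 1-based slot number in the MBR table
    bootable: bool     # status byte has the 0x80 "bootable" flag
    type_id: int       # partition type, e.g. 0xEE for a protective GPT entry
    start_lba: int     # first block, counted in 512-byte blocks
    sector_count: int  # number of blocks


def parse_mbr(sector0: bytes) -> List[MbrPartition]:
    """Parse the MBR partition table from the first 512 bytes of an image."""
    if len(sector0) < 512 or sector0[510:512] != b"\x55\xaa":
        raise ValueError("no 0x55 0xAA boot signature at offset 510")
    partitions = []
    for i in range(4):
        entry = sector0[446 + 16 * i:446 + 16 * (i + 1)]
        if entry == bytes(16):
            continue  # completely unused slot
        # status, 3 bytes CHS start, type, 3 bytes CHS end, LBA start, count
        status, type_id, start_lba, count = struct.unpack("<B3xB3xII", entry)
        partitions.append(
            MbrPartition(i + 1, bool(status & 0x80), type_id, start_lba, count)
        )
    return partitions
```

Only all-zero slots are skipped, so an entry of type 0x00 that carries
nothing but the bootable flag is still reported.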


All these images contain block ranges which are neither covered by the
ISO 9660 file tree nor by the partitions.
One will always first have to compare such ISOs byte-by-byte and only in
case of mismatch look into the ISO filesystem and partitions for a more
qualified presentation of the differences.
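
(Editorial sketch of such a byte-by-byte first pass. It reads both files
in chunks, so memory use stays small even for multi-gigabyte images. The
name first_difference is an assumption for this illustration.)

```python
from typing import Optional


def first_difference(path_a: str, path_b: str,
                     chunk_size: int = 1 << 20) -> Optional[int]:
    """Return the offset of the first differing byte, or None if identical.

    If one file is a strict prefix of the other, the length of the
    shorter file is returned.
    """
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if a != b:
                for i, (x, y) in enumerate(zip(a, b)):
                    if x != y:
                        return offset + i
                # Common part is equal, so the chunks differ only in length.
                return offset + min(len(a), len(b))
            if not a:
                return None  # both files ended without a mismatch
            offset += len(a)
```

Only when this returns an offset is it worth mapping that offset to a
partition or to a file inside the ISO filesystem.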


Have a nice day :)

Thomas



RE: Does diffoscope compares disk partitions

2023-03-01 Thread Venkata.Pyla
Hi Chris, 

Thanks for your reply, please see my inline comments.

>-----Original Message-----
>From: Chris Lamb 
>Sent: 01 March 2023 16:26
>To: pyla venkata(TSIP TMIEC ODG Porting) tsip.com>; rb-general 
>Subject: Re: Does diffoscope compares disk partitions
>
>Hey Venkata,
>
>> Does it support disk partitions or am I missing something?
>
>The short answer is that diffoscope *should* support comparing your
>partition images properly, instead of falling back to a raw
>xxd(1) comparison. The reason diffoscope doesn't do that right now is either
>due to a bug, or we just need to extend support for this particular type of
>partition.
>
>Correctly detecting DOS/MBR files is somewhat more fiddly than one might
>think, but the pertinent part of the debug log is this:
>
>> image1.wic not identified by any comparator. Magic says: DOS/MBR boot
>> sector; partition 1 : ID=0xee, start-CHS (0x0,0,2), end-CHS
>> (0x3ff,255,63), startsector 1, 12546899 sectors, extended partition
>> table (last)
>
>Would it be possible for you to share the two .wic images somewhere?
>In fact, if you could re-file this issue in our bug tracker, that would be 
>great:
>
>  https://salsa.debian.org/reproducible-builds/diffoscope/-/issues
>

Thanks, I created an issue and shared the details in the ticket
https://salsa.debian.org/reproducible-builds/diffoscope/-/issues/333

>(And just for clarity, the ".wic" files are files containing raw partitions, 
>but the
>".disk" files contain entire disk images including a partition table?)
>

Yes, "*.wic" images are disk images with a partition table and "*.disk"
images are images without a partition table.

>
>Best wishes,
>
>--
>  o
>⬋   ⬊  Chris Lamb
>   o o reproducible-builds.org 
>⬊   ⬋
>  o


Re: Does diffoscope compares disk partitions

2023-03-01 Thread Mattia Rizzolo
On Wed, Mar 01, 2023 at 10:55:41AM +, Chris Lamb wrote:
> > Does it support disk partitions or am I missing something?
> 
> The short answer is that diffoscope *should* support comparing your
> partition images properly, instead of falling back to a raw
> xxd(1) comparison. The reason diffoscope doesn't do that right now is
> either due to a bug, or we just need to extend support for this
> particular type of partition.

Mhh, I don't think it should.
I don't recall ever seeing anything related to GPT or MBR, so I don't
think diffoscope supports whole disk images with partition tables.

> Correctly detecting DOS/MBR files is somewhat more fiddly than one
> might think, but the pertinent part of the debug log is this:

Indeed…

For a time we used the "DOS/MBR" magic string as a flag for FAT16/32
images; however, we moved over to using the headers instead.

I don't think we have anything actually handling an actual MBR, however.
And this is a GPT partition table, which is different yet again, and I
think also not handled.
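
(Editorial sketch of why the identification is fiddly. FAT boot sectors
and MBRs both end in the bytes 0x55 0xAA, so that signature alone says
little. A GPT disk additionally has the ASCII signature "EFI PART" at the
start of the second block, normally announced by an MBR entry of type
0xEE. The names below are assumptions for this illustration, not
diffoscope code, and 512-byte blocks are assumed by default.)

```python
from typing import NamedTuple


class TableHints(NamedTuple):
    boot_signature: bool  # 0x55 0xAA at offset 510 (MBR or FAT boot sector)
    protective_mbr: bool  # some MBR slot has type 0xEE
    gpt_header: bool      # "EFI PART" found at the start of block 1


def partition_table_hints(head: bytes, sector_size: int = 512) -> TableHints:
    """Collect cheap hints from the first two blocks of a disk image."""
    boot_signature = len(head) >= 512 and head[510:512] == b"\x55\xaa"
    protective_mbr = boot_signature and any(
        head[446 + 16 * i + 4] == 0xEE for i in range(4)
    )
    gpt_header = head[sector_size:sector_size + 8] == b"EFI PART"
    return TableHints(boot_signature, protective_mbr, gpt_header)
```

The three flags are reported separately because an image may carry a GPT
header without a matching 0xEE entry in its MBR, or the other way round.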

> > image1.wic not identified by any comparator. Magic says: DOS/MBR boot 
> >  sector; partition 1 : ID=0xee, start-CHS (0x0,0,2), end-CHS 
> >  (0x3ff,255,63), startsector 1, 12546899 sectors, extended partition 
> >  table (last)
> 
> Would it be possible for you to share the two .wic images somewhere?
> In fact, if you could re-file this issue in our bug tracker, that
> would be great:
> 
>   https://salsa.debian.org/reproducible-builds/diffoscope/-/issues
> 
> (And just for clarity, the ".wic" files are files containing raw
> partitions, but the ".disk" files contain entire disk images including
> a partition table?)

.disk is a new one for me, but looking at the diffoscope output from
Venkata I reckon this is actually a single file system (ext4) image?

Also, I never saw a .wic file (never even heard of one), but since Venkata
shared an `fdisk -l` of such a file, I reckon that one is a full disk
dump including a partition table.



So, overall, I actually don't think that diffoscope has the requested
support, and it's not "just" a bug of failed identification.


-- 
regards,
Mattia Rizzolo

GPG Key: 66AE 2B4A FCCF 3F52 DA18  4D18 4B04 3FCD B944 4540  .''`.
More about me:  https://mapreri.org : :'  :
Launchpad user: https://launchpad.net/~mapreri  `. `'`
Debian QA page: https://qa.debian.org/developer.php?login=mattia  `-




Re: Does diffoscope compares disk partitions

2023-03-01 Thread Chris Lamb
Hey Venkata,

> Does it support disk partitions or am I missing something?

The short answer is that diffoscope *should* support comparing your
partition images properly, instead of falling back to a raw
xxd(1) comparison. The reason diffoscope doesn't do that right now is
either due to a bug, or we just need to extend support for this
particular type of partition.

Correctly detecting DOS/MBR files is somewhat more fiddly than one
might think, but the pertinent part of the debug log is this:

> image1.wic not identified by any comparator. Magic says: DOS/MBR boot 
>  sector; partition 1 : ID=0xee, start-CHS (0x0,0,2), end-CHS 
>  (0x3ff,255,63), startsector 1, 12546899 sectors, extended partition 
>  table (last)

Would it be possible for you to share the two .wic images somewhere?
In fact, if you could re-file this issue in our bug tracker, that
would be great:

  https://salsa.debian.org/reproducible-builds/diffoscope/-/issues

(And just for clarity, the ".wic" files are files containing raw
partitions, but the ".disk" files contain entire disk images including
a partition table?)


Best wishes,

-- 
  o
⬋   ⬊  Chris Lamb
   o o reproducible-builds.org 
⬊   ⬋
  o