Re: Does diffoscope compares disk partitions
Hi,

Mattia Rizzolo (2023-03-01):
> On Wed, Mar 01, 2023 at 10:55:41AM +, Chris Lamb wrote:
>> > Does it support disk partitions or am I missing something?
>>
>> The short answer is that diffoscope *should* support comparing your
>> partition images properly, instead of falling back to a raw
>> xxd(1) comparison. The reason diffoscope doesn't do that right now is
>> either due to a bug, or we just need to extend support for this
>> particular type of partition.
>
> Mhh, I don't think it should.
> I don't recall ever seeing anything related to GPT or MBR, so I don't
> think diffoscope supports whole disk images with a partition table.

IIRC at Tails, we happily used diffoscope to compare our "USB images",
which are full disk images with a GPT and an ESP (+ the hybrid GPT
machinery to make them bootable as legacy BIOS too).

Take care,
-- 
intrigeri
Re: Does diffoscope compares disk partitions
On Thu, Mar 2, 2023, at 02:09, John Gilmore wrote:
> I have been surprised at how much effort has gone into "diffoscope" as a
> total fraction of the Reproducible Builds effort.

How do you know?

> Perhaps it is a case
> akin to the drunk looking for his keys under the streetlight where he
> can see, rather than in the dark where he dropped them. (It's easier to
> hack diffoscope than to hack thousands of irreproducible packages.)

I suspect it might be a visibility effect, but the other way around:
diffoscope issues may be more likely to be discussed on this list than
issues with individual packages.

If you look at the monthly reports there's a healthy batch of distro
work happening each month, and even there it might be skewed: it's much
easier to enumerate the changes to diffoscope than it is to gather
reproducibility work from all over the internet.

I, for one, spend *way* more time fixing irreproducible packages (where
diffoscope is an amazing tool) than working on diffoscope itself (I only
did some issue reporting and testing). I rarely post about it here, and
I've been really bad at making it visible in the monthly reports as
well - I should get back into that habit ;).

Kind regards,

Arnout
Re: Does diffoscope compares disk partitions
On Wed, Mar 01, 2023 at 10:07:33PM -0800, Vagrant Cascadian wrote:
> I daresay it is more akin to someone looking for lost keys by inventing
> a flashlight to look near where they dropped them, and they happen to
> have dropped the keys into a bin with miscellaneous arbitrary pieces and
> bits of things, many of which are shaped and/or sized roughly like your
> keys... and wow, nobody had thought before to make a flashlight shine at
> so many different wavelengths, or detect the density of the materials, or
> produce a harmonic resonance with certain materials, or a high intensity
> laser to burn off all the organic detritus that has accumulated,
> and... only to find out that people have been losing their keys in this
> bin for decades, and we found someone else's keys too! Oh, look, with
> this small tweak, we could also detect antique coins...

Wow… We should quote this in our blog post :>

-- 
regards,
                        Mattia Rizzolo

GPG Key: 66AE 2B4A FCCF 3F52 DA18  4D18 4B04 3FCD B944 4540      .''`.
More about me:  https://mapreri.org                             : :'  :
Launchpad user: https://launchpad.net/~mapreri                  `. `'`
Debian QA page: https://qa.debian.org/developer.php?login=mattia  `-
Re: Does diffoscope compares disk partitions -> The German word is 'jein' (yes and no)
Hello,

On 01/03/2023 22:59, Thomas Schmitt wrote:
> Mattia Rizzolo wrote:
> ...
> ...
> The file tree and the data files' content is only a part of a bootable
> ISO 9660 image. There's executable code in the blind spots of such a view:
> MBR legacy BIOS boot code, EFI programs in the EFI partition, "hidden"
> El Torito boot images, ...
>
>> I know that thing has been used to work on reproducible bootable image,
>
> Possibly because typical pitfalls aren't tested. Like FAT timestamps in
> the EFI partition.

In live-build, the FAT timestamps in the EFI partition are carefully
ironed out [1][2]. Before the fixes, they did show up in the diffoscope
output, but only in the hex dump. Once I realised that the differences in
the hex dump were actually timestamps (or timestamp related), I could
remove them.

From the original mail:

  2023-02-27 14:09:12 D: diffoscope.comparators.utils.command: Executing xxd {}
  Killed

I've seen this too. You'll need loads of memory and temporary disk
space, especially in the initial stages of finding the root cause of the
differences (comparing two xxd outputs of 6GB files can produce lots of
output).

The original mail also hints at a possible root cause: the timestamps
inside each partition have not been clamped to SOURCE_DATE_EPOCH. See
e.g. [3] for how it is done in live-build.

> It is possible to make reproducible ISOs by xorriso and SOURCE_DATE_EPOCH.
> But the input files for the ISO production have to be identical, too.

Indeed, before the meta-content of ISO files can be considered, their
payload should have been harmonized. Live-build does it in [3].

> Both, the use of SOURCE_DATE_EPOCH and care for identical input are not
> tradition with bootable ISO production.
>
> So another reason for no protests might be that bootable ISOs aren't
> often challenged by reproducibility tests which use diffoscope.

I agree that having a specialised comparator would be nice, but even the
hex dump has fulfilled its purpose.

The live ISO images are now 100% reproducible (the header contains a lot
of the magic described in the previous mails). The ISO images are tested
with diffoscope on a daily basis in Jenkins [4].

With kind regards,
Roland

[1] https://sources.debian.org/src/live-build/1:20230131/scripts/build/binary_grub-efi/?hl=262#L262
[2] https://sources.debian.org/src/live-build/1:20230131/scripts/build/efi-image/?hl=79#L79
[3] https://sources.debian.org/src/live-build/1:20230131/scripts/build/binary_iso/?hl=180#L180
[4] https://jenkins.debian.net/view/live/
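The clamping step mentioned above can be illustrated with a minimal Python sketch. This is not the live-build code referenced in [3] (which is a shell script); the function name `clamp_mtimes` and its arguments are hypothetical, and the sketch only covers ordinary files and directories in a staging tree. Timestamps stored inside a FAT image would need to be handled by the tool that creates that image.

```python
import os


def clamp_mtimes(root: str, epoch: int) -> int:
    """Set mtime to `epoch` for every entry under `root` that is newer.

    Hypothetical helper for illustration only. Symlinks are skipped,
    and atime is set to `epoch` as well for changed entries.
    Returns the number of entries that were changed.
    """
    paths = [root]
    for dirpath, dirnames, filenames in os.walk(root):
        paths.extend(os.path.join(dirpath, name) for name in dirnames + filenames)

    changed = 0
    for path in paths:
        if os.path.islink(path):
            continue  # leave symlinks alone in this sketch
        if os.stat(path).st_mtime > epoch:
            os.utime(path, (epoch, epoch))
            changed += 1
    return changed
```

A build script could call it as `clamp_mtimes("binary", int(os.environ["SOURCE_DATE_EPOCH"]))` just before the image is assembled, where the directory name `binary` is again only an assumption.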
Re: Does diffoscope compares disk partitions
On 2023-03-01, John Gilmore wrote:
>>> So, overall, I actually don't think that diffoscope has the requested
>>> support, and it's not "just" a bug of failed identification.
>
> I have been surprised at how much effort has gone into "diffoscope" as a
> total fraction of the Reproducible Builds effort. Perhaps it is a case
> akin to the drunk looking for his keys under the streetlight where he
> can see, rather than in the dark where he dropped them.

I daresay it is more akin to someone looking for lost keys by inventing
a flashlight to look near where they dropped them, and they happen to
have dropped the keys into a bin with miscellaneous arbitrary pieces and
bits of things, many of which are shaped and/or sized roughly like your
keys... and wow, nobody had thought before to make a flashlight shine at
so many different wavelengths, or detect the density of the materials, or
produce a harmonic resonance with certain materials, or a high intensity
laser to burn off all the organic detritus that has accumulated,
and... only to find out that people have been losing their keys in this
bin for decades, and we found someone else's keys too! Oh, look, with
this small tweak, we could also detect antique coins...

> (It's easier to hack diffoscope than to hack thousands of
> irreproducible packages.)

Fixing reproducibility issues blindfolded does not seem like an
efficient way to fix issues either.

We have already fixed tens of thousands of issues, and have thousands
more that we are working on. Diffoscope is a highly useful tool towards
that end.

live well,
  vagrant
Re: Does diffoscope compares disk partitions
>> So, overall, I actually don't think that diffoscope has the requested
>> support, and it's not "just" a bug of failed identification.

I have been surprised at how much effort has gone into "diffoscope" as a
total fraction of the Reproducible Builds effort. Perhaps it is a case
akin to the drunk looking for his keys under the streetlight where he
can see, rather than in the dark where he dropped them. (It's easier to
hack diffoscope than to hack thousands of irreproducible packages.)

I for one am happy that diffoscope DOESN'T have support for umpteen disk
partitioning schemes and file system formats.

	John

PS: Has anyone on the list considered writing an article for the
Journal of Irreproducible Results about our effort?
Re: Does diffoscope compares disk partitions
Hi,

Mattia Rizzolo wrote:
> I already knew that hybrid ISO 9660 are pretty much devil's offspring,
> and I guess I could have done without your introduction to them /o\

Since I am helping to create them I feel obligated to warn the public. {:)

> FWIW, currently diffoscope _does_ have a iso9660 comparator, but that
> just runs libarchive on it, so I _believe_ it ignores the existence of
> an eventual partition table and just goes straight to the really
> compliant "data part" of the image.

That's a bit naive.
The file tree and the data files' content are only a part of a bootable
ISO 9660 image. There's executable code in the blind spots of such a view:
MBR legacy BIOS boot code, EFI programs in the EFI partition, "hidden"
El Torito boot images, ...

> I know that thing has been used to work on reproducible bootable image,

Possibly because typical pitfalls aren't tested. Like FAT timestamps in
the EFI partition.

> so I guess it just always happened that the "hybrid part" of them were
> already reproducible :3

It is possible to make reproducible ISOs with xorriso and SOURCE_DATE_EPOCH.
But the input files for the ISO production have to be identical, too.

Neither the use of SOURCE_DATE_EPOCH nor care for identical input is
traditional in bootable ISO production.

So another reason for no protests might be that bootable ISOs aren't often
challenged by reproducibility tests which use diffoscope.

Have a nice day :)

Thomas
Re: Does diffoscope compares disk partitions
Hello Thomas,

On Wed, Mar 01, 2023 at 09:44:56PM +0100, Thomas Schmitt wrote:
> There are various layouts. Some are quite awkward and invite for getting
> lost in the woods. (Actually they lure firmwares out of the woods.)

Duh.
I already knew that hybrid ISO 9660 are pretty much devil's offspring,
and I guess I could have done without your introduction to them /o\

> All these images contain block ranges which are neither covered by the
> ISO 9660 file tree nor by the partitions.
> One will always first have to compare such ISOs byte-by-byte and only in
> case of mismatch look into the ISO filesystem and partitions for a more
> qualified presentation of the differences.

FWIW, currently diffoscope _does_ have an iso9660 comparator, but that
just runs libarchive on it, so I _believe_ it ignores the existence of
a possible partition table and just goes straight to the really
compliant "data part" of the image.

I know that thing has been used to work on reproducible bootable images,
so I guess it just always happened that the "hybrid part" of them was
already reproducible :3
Otherwise I reckon somebody would have provided some improvements in this
area already.

-- 
regards,
                        Mattia Rizzolo

GPG Key: 66AE 2B4A FCCF 3F52 DA18  4D18 4B04 3FCD B944 4540      .''`.
More about me:  https://mapreri.org                             : :'  :
Launchpad user: https://launchpad.net/~mapreri                  `. `'`
Debian QA page: https://qa.debian.org/developer.php?login=mattia  `-
Re: Does diffoscope compares disk partitions
Hi,

Mattia Rizzolo wrote:
> So, overall, I actually don't think that diffoscope has the requested
> support, and it's not "just" a bug of failed identification.

If it gets support for detecting partitions in image files then it will
have to handle hybrid bootable ISO 9660 images which bear partition tables
for the case that they get copied onto a USB stick.

There are various layouts. Some are quite awkward and invite getting
lost in the woods. (Actually they lure firmwares out of the woods.)

E.g. debian-11.5.0-amd64-netinst.iso has two partitions in the MBR table
plus a GPT header which most readers won't take into account but which
will appease older versions of OVMF virtual EFI.
MBR partition 1 marks the whole ISO 9660 filesystem. MBR partition 2 marks
a data file range within that ISO filesystem (i.e. abominable partition
nesting). The data file is actually a FAT filesystem image, which could
possibly begin with an MBR with partition table.
Furthermore, there is an Apple Partition Map with a single partition
marking the FAT filesystem image for EFI.
This partition layout is an invention of Matthew J. Garrett for Fedora.

ubuntu-22.04-desktop-amd64.iso has a nearly properly announced GPT with
the tiny flaw of an MBR partition 2 of type 0x00 which carries the
"bootable" flag. That's a lure for some legacy BIOS laptops which don't
consider a disk for booting if it does not have a partition with the flag.
The 3 non-nested GPT partitions mark the ISO filesystem, the EFI partition,
and the traditional 300 KiB of padding. The EFI partition is not a data
file in the ISO filesystem.
The confusion is completed by the fact that the overall image is a
mountable ISO filesystem which claims the whole size of the image file
including the EFI partition and the padding. Nevertheless GPT partition 1
does not start at the start of the image file but is an ISO filesystem
too, which bears the same files as the overall ISO filesystem.

For x86_64, grub-mkrescue produces a boringly specs-compliant ISO with
GPT, at the cost that there is no mountable ISO partition in the GPT.
But the image also contains an Apple Partition Map with a partition
pointing to a HFS+ filesystem which shows the same files as the
ISO 9660 filesystem.
(It is of course possible to mount the overall image file as ISO 9660.)

Quite like a normal disk image is
Fedora-Everything-netinst-x86_64-37-1.7.iso with its 3 GPT partitions.
The only unusual aspect is the mountability of the overall image file
with an ISO filesystem which claims the block range of all three
partitions.

All these images contain block ranges which are neither covered by the
ISO 9660 file tree nor by the partitions.
One will always first have to compare such ISOs byte-by-byte and only in
case of mismatch look into the ISO filesystem and partitions for a more
qualified presentation of the differences.

Have a nice day :)

Thomas
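The "compare byte-by-byte first" step can be sketched in a few lines of Python. This is only an illustration of the idea and not something diffoscope provides in this form; the function name `first_difference` and the chunk size are assumptions. The files are read in chunks so that multi-gigabyte images do not have to fit in memory.

```python
from typing import Optional


def first_difference(path_a: str, path_b: str,
                     chunk_size: int = 1 << 20) -> Optional[int]:
    """Return the byte offset of the first difference, or None if identical.

    If one file is a strict prefix of the other, the returned offset is
    the length of the shorter file.
    """
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if a != b:
                # Locate the exact differing byte inside this chunk.
                for i, (x, y) in enumerate(zip(a, b)):
                    if x != y:
                        return offset + i
                # No mismatch in the common part: the lengths differ.
                return offset + min(len(a), len(b))
            if not a:
                return None  # both files ended without a difference
            offset += len(a)
```

Dividing the returned offset by the sector size (commonly 512 bytes) gives the block number, which can then be matched against the partition table or the ISO 9660 structures to see which area the difference falls into.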
RE: Does diffoscope compares disk partitions
Hi Chris,

Thanks for your reply, please see my inline comments.

>-----Original Message-----
>From: Chris Lamb
>Sent: 01 March 2023 16:26
>To: pyla venkata(TSIP TMIEC ODG Porting) tsip.com>; rb-general
>Subject: Re: Does diffoscope compares disk partitions
>
>Hey Venkata,
>
>> Does it support disk partitions or am I missing something?
>
>The short answer is that diffoscope *should* support comparing your
>partition images properly, instead of falling back to a raw
>xxd(1) comparison. The reason diffoscope doesn't do that right now is either
>due to a bug, or we just need to extend support for this particular type of
>partition.
>
>Correctly detecting DOS/MBR files is somewhat more fiddly than one might
>think, but the pertinent part of the debug log is this:
>
>> image1.wic not identified by any comparator. Magic says: DOS/MBR boot
>> sector; partition 1 : ID=0xee, start-CHS (0x0,0,2), end-CHS
>> (0x3ff,255,63), startsector 1, 12546899 sectors, extended partition
>> table (last)
>
>Would it be possible for you to share the two .wic images somewhere?
>In fact, if you could re-file this issue in our bug tracker, that would be
>great:
>
>  https://salsa.debian.org/reproducible-builds/diffoscope/-/issues
>

Thanks, I created an issue and shared the details in the ticket:
https://salsa.debian.org/reproducible-builds/diffoscope/-/issues/333

>(And just for clarity, the ".wic" files are files containing raw partitions,
>but the ".disk" files contain entire disk images including a partition table?)
>

Yes, "*.wic" images are disk images with a partition table and "*.disk"
images are images without a partition table.

>
>Best wishes,
>
>--
>      o
>    ⬋   ⬊      Chris Lamb
>   o     o     reproducible-builds.org
>    ⬊   ⬋
>      o
Re: Does diffoscope compares disk partitions
On Wed, Mar 01, 2023 at 10:55:41AM +, Chris Lamb wrote:
> > Does it support disk partitions or am I missing something?
>
> The short answer is that diffoscope *should* support comparing your
> partition images properly, instead of falling back to a raw
> xxd(1) comparison. The reason diffoscope doesn't do that right now is
> either due to a bug, or we just need to extend support for this
> particular type of partition.

Mhh, I don't think it should.
I don't recall ever seeing anything related to GPT or MBR, so I don't
think diffoscope supports whole disk images with a partition table.

> Correctly detecting DOS/MBR files is somewhat more fiddly than one
> might think, but the pertinent part of the debug log is this:

Indeed… For a time we used "DOS/MBR" as the flag for FAT16/32 images;
however, we moved over to using the headers instead. I don't think we
have anything handling an actual MBR, however.
And this is a GPT partition table, which is different again, and I think
also not handled.

> > image1.wic not identified by any comparator. Magic says: DOS/MBR boot
> > sector; partition 1 : ID=0xee, start-CHS (0x0,0,2), end-CHS
> > (0x3ff,255,63), startsector 1, 12546899 sectors, extended partition
> > table (last)
>
> Would it be possible for you to share the two .wic images somewhere?
> In fact, if you could re-file this issue in our bug tracker, that
> would be great:
>
>   https://salsa.debian.org/reproducible-builds/diffoscope/-/issues
>
> (And just for clarity, the ".wic" files are files containing raw
> partitions, but the ".disk" files contain entire disk images including
> a partition table?)

.disk is a new one for me, but looking at the diffoscope output from
Venkata I reckon this is actually a single file system (ext4) image?
Also, I never saw a .wic file (never heard of it, even), but since
Venkata shared a `fdisk -l` of such a file, I reckon that one is a full
disk dump including a partition table.

So, overall, I actually don't think that diffoscope has the requested
support, and it's not "just" a bug of failed identification.

-- 
regards,
                        Mattia Rizzolo

GPG Key: 66AE 2B4A FCCF 3F52 DA18  4D18 4B04 3FCD B944 4540      .''`.
More about me:  https://mapreri.org                             : :'  :
Launchpad user: https://launchpad.net/~mapreri                  `. `'`
Debian QA page: https://qa.debian.org/developer.php?login=mattia  `-
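To make the "ID=0xee" detail in the magic output concrete, here is a small Python sketch that reads the four primary MBR entries and checks for a GPT header behind a protective MBR. It is not diffoscope code (the thread concludes that diffoscope has no such handling); the names `MbrEntry`, `parse_mbr` and `has_gpt` are hypothetical, and a 512-byte sector size is assumed.

```python
import struct
from typing import List, NamedTuple

SECTOR_SIZE = 512  # assumption: 512-byte logical sectors


class MbrEntry(NamedTuple):
    bootable: bool
    type_id: int
    start_lba: int
    sectors: int


def parse_mbr(sector0: bytes) -> List[MbrEntry]:
    """Return the non-empty primary partition entries of an MBR sector."""
    if len(sector0) < 512 or sector0[510:512] != b"\x55\xaa":
        raise ValueError("no MBR boot signature")
    entries = []
    for i in range(4):
        raw = sector0[446 + 16 * i:446 + 16 * (i + 1)]
        if raw == bytes(16):
            continue  # completely unused slot
        # status, (CHS start skipped), type, (CHS end skipped), LBA start, count
        status, type_id, start_lba, sectors = struct.unpack("<B3xB3xII", raw)
        entries.append(MbrEntry(bool(status & 0x80), type_id, start_lba, sectors))
    return entries


def has_gpt(image_start: bytes) -> bool:
    """True if a type 0xEE entry exists and LBA 1 carries the GPT signature."""
    protective = any(e.type_id == 0xEE for e in parse_mbr(image_start[:512]))
    signature = image_start[SECTOR_SIZE:SECTOR_SIZE + 8]
    return protective and signature == b"EFI PART"
```

Reading the first 1024 bytes of an image and passing them to `has_gpt` would be enough for this check. Entries of type 0x00 are deliberately kept unless the whole slot is zero, since hybrid images may carry a type 0x00 entry that only has the bootable flag set.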
Re: Does diffoscope compares disk partitions
Hey Venkata,

> Does it support disk partitions or am I missing something?

The short answer is that diffoscope *should* support comparing your
partition images properly, instead of falling back to a raw
xxd(1) comparison. The reason diffoscope doesn't do that right now is
either due to a bug, or we just need to extend support for this
particular type of partition.

Correctly detecting DOS/MBR files is somewhat more fiddly than one
might think, but the pertinent part of the debug log is this:

> image1.wic not identified by any comparator. Magic says: DOS/MBR boot
> sector; partition 1 : ID=0xee, start-CHS (0x0,0,2), end-CHS
> (0x3ff,255,63), startsector 1, 12546899 sectors, extended partition
> table (last)

Would it be possible for you to share the two .wic images somewhere?
In fact, if you could re-file this issue in our bug tracker, that
would be great:

  https://salsa.debian.org/reproducible-builds/diffoscope/-/issues

(And just for clarity, the ".wic" files are files containing raw
partitions, but the ".disk" files contain entire disk images including
a partition table?)

Best wishes,

-- 
      o
    ⬋   ⬊      Chris Lamb
   o     o     reproducible-builds.org
    ⬊   ⬋
      o