Bug#788364: libmagic1: misdetect some Coreboot images as text
Jérémy Bobbio wrote...
> diffoscope is the tool that we have created as part of the “reproducible
> builds” effort to understand differences between two builds.
(...)

Eh, no worries :) I saw your presentation at DebConf, and of course I'm
interested in supporting diffoscope.

> diffoscope uses libmagic (through its Python bindings) to identify the
> format of the files it's trying to compare. Some coreboot images are
> misdetected as text files which results in garbled diffoscope output.

At a quick glance (I might be wrong) I failed to find the magic
0x4F524243 in the images attached to the initial report. OTOH, these
images start with a huge amount (7.3 Mbyte) of \xff octets. file/libmagic
don't look that far into files anyway, so it might be impossible to
detect coreboot image files properly. Reiner also provided some
information that points in the same direction.

Still, such a sequence must not be detected as text. I'll prepare a
patch for upstream.

    Christoph
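
For illustration, the quick check described above (how long is the leading run of \xff octets, and does the magic 0x4F524243 appear anywhere in the image?) could be sketched in Python roughly as follows. This is only a sketch: the helper names are hypothetical, and it assumes the magic is stored big-endian, i.e. as the bytes "ORBC"; if an image stores it differently, the search would need adjusting.

```python
# Assumption: the magic 0x4F524243 is stored big-endian, i.e. as b"ORBC".
CBFS_MAGIC = bytes.fromhex("4F524243")


def leading_ff_run(data: bytes) -> int:
    """Return the number of consecutive 0xff octets at the start of data."""
    return len(data) - len(data.lstrip(b"\xff"))


def find_magic(data: bytes, magic: bytes = CBFS_MAGIC) -> int:
    """Return the offset of the first occurrence of magic, or -1 if absent."""
    return data.find(magic)


def summarize(data: bytes) -> dict:
    """Collect the two observations in one place (hypothetical helper)."""
    return {
        "size": len(data),
        "leading_ff": leading_ff_run(data),
        "magic_offset": find_magic(data),
    }
```

A tool would call summarize() on the raw bytes of an image; a large "leading_ff" value combined with a "magic_offset" far into the file would match the situation described above, where the header lies beyond the part of the file that gets inspected.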
Bug#788364: libmagic1: misdetect some Coreboot images as text
retitle 788364 diffoscope: garbled output when comparing some Coreboot images
clone 788364 -1
reassign -1 libmagic1
severity -1 libmagic1 normal
retitle -1 libmagic1: misdetect Coreboot images as text files
thanks

Hi Christoph,

diffoscope is the tool that we have created as part of the “reproducible
builds” effort to understand differences between two builds. We now
also use it to compare builds of Coreboot images.

diffoscope uses libmagic (through its Python bindings) to identify the
format of the files it's trying to compare. Some coreboot images are
misdetected as text files which results in garbled diffoscope output.

The proper way to detect Coreboot images is probably to look for a CBFS
header. cbfs_find_header() is how upstream does it:
http://review.coreboot.org/gitweb?p=coreboot.git;a=blob;f=util/cbfstool/cbfs_image.c;h=c40bd6641

I could tell diffoscope to detect Coreboot images with a similar
mechanism but it would probably be better to teach libmagic to do it.
Is that easily doable?

Reiner Herrmann:
> file detects them as plain-text:
>
> > /tmp/b1_coreboot.rom: ISO-8859 text, with very long lines, with no line
> > terminators
> > /tmp/b2_coreboot.rom: ISO-8859 text, with very long lines, with no line
> > terminators
>
> That's why diffoscope also treats them as text.
> I'm not sure this can/should be fixed inside diffoscope, as we rely on
> libmagic detecting them correctly.

Reiner, I remember you had a look into this during DebConf. Have you
made any progress?

-- 
Lunar
lu...@debian.org
Bug#788364: libmagic1: misdetect some Coreboot images as text
Hi Lunar,

On Thu, Sep 03, 2015 at 04:26:57PM +0200, Jérémy Bobbio wrote:
> diffoscope uses libmagic (through its Python bindings) to identify the
> format of the files it's trying to compare. Some coreboot images are
> misdetected as text files which results in garbled diffoscope output.
>
> Proper way to detect Coreboot images is probably to look for a CBFS
> header. cbfs_find_header() is how upstream does it:
> http://review.coreboot.org/gitweb?p=coreboot.git;a=blob;f=util/cbfstool/cbfs_image.c;h=c40bd6641
>
> I could tell diffoscope to detect Coreboot images with a similar
> mechanism but it would probably be better to teach libmagic to do it.
> Is that easily doable?
>
> Reiner Herrmann:
> > file detects them as plain-text:
> >
> > > /tmp/b1_coreboot.rom: ISO-8859 text, with very long lines, with no line
> > > terminators
> > > /tmp/b2_coreboot.rom: ISO-8859 text, with very long lines, with no line
> > > terminators
> >
> > That's why diffoscope also treats them as text.
> > I'm not sure this can/should be fixed inside diffoscope, as we rely on
> > libmagic detecting them correctly.
>
> Reiner, I remember you had a look into this during DebConf. Have you
> made any progress?

Unfortunately I haven't found any easy solution to it.
It looked like magic(5) files require constant offsets for checking
magic numbers. And I also didn't see a way to look at an offset
backwards from the end of a file (where CBFS images have an offset to
the header).

I just had another look and saw that the "type" field can also be a
search. So it could be possible to detect them via pattern files.
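
To make the "offset backwards from the end of a file" idea concrete, here is a minimal Python sketch of how such a lookup might work outside of magic(5). It is not a copy of cbfs_find_header(); it rests on assumptions that are not confirmed in this thread: that the last four bytes of the image hold a little-endian signed 32-bit offset relative to the end of the file, and that the header starts with the magic 0x4F524243 stored big-endian. The function name is hypothetical.

```python
import struct

CBFS_HEADER_MAGIC = 0x4F524243  # magic mentioned earlier in this bug report


def find_cbfs_header(data: bytes):
    """Return the header offset if the trailing pointer leads to the magic.

    Assumptions (unverified here): the final 4 bytes are a little-endian
    signed offset relative to the end of the image, and the magic is
    stored big-endian at the start of the header.
    """
    size = len(data)
    if size < 8:
        return None
    # Read the trailing pointer and turn it into an absolute file offset.
    (rel,) = struct.unpack_from("<i", data, size - 4)
    offset = size + rel
    # Reject pointers that fall outside the image.
    if offset < 0 or offset > size - 4:
        return None
    (magic,) = struct.unpack_from(">I", data, offset)
    return offset if magic == CBFS_HEADER_MAGIC else None
```

A check of this kind depends on the file size, which is exactly what a fixed-offset magic(5) entry cannot express; that is why a search-based pattern, or a dedicated check in diffoscope, would be the alternatives.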