Matthew Booth:
> On Fri, 2013-11-22 at 20:14 +0000, Richard W.M. Jones wrote:
>> On Fri, Nov 22, 2013 at 05:56:00PM +0000, adrelanos wrote:
>>> Thank you all for your suggestions!
>>>
>>> Richard W.M. Jones:
>>>> I keep meaning to write a comprehensive "virt-diff" tool. I needed it
>>>> myself just yesterday.
>>>
>>> Most interesting. I guess there are two reasons for creating such a
>>> tool: just compare the images (show the diff) and/or check for malicious
>>> additions in the other image.
>>>
>>> Did you consider implementing the former or both?
>>
>> For all the reasons that Alex goes into, it would just be for checking
>> NON-malicious differences. The use case is to reverse engineer what
>> files change in a guest when you perform an action (eg. install a
>> Windows driver or run some Linux administrative command).
>>
>> [...]
>>> At the moment I am not trying to write a virt-diff like tool, but
>>> something simpler. A tool to create a report of all of a vm image's
>>> contents. (Checksums for all files, filesystem, for MBR and Volume Boot
>>> Record.) When publishing VM images, it might be useful to publish such a
>>> report together with the image, so others who re-build from source can
>>> be certain, they ended up with a very similar image. When having created
>>> two such reports, one could easily get a virt-diff like tool.
>>
>> I think Matt Booth was doing something like this for Windows systems,
>> with the aim of being able to recreate a Windows VM from a (smaller)
>> description. Don't know what state that was/is in.
>
> I wrote a POC tool to store an MD5 of every file on a Windows
> filesystem. It looked like a good idea for what it was, but not very
> applicable here.
>
>> [...]
>>> What other data can there be outside the filesystem?
>>>
>>> I can think of:
>>>
>>> - MBR
>>> - Volume Boot Record
>>>
>>> Anything else?
>>
>> Potentially all unused space inside and between partitions /
>> filesystems / logical volumes. The boot loader is sometimes stored in
>> the space between the MBR and the first partition. Other peculiar
>> things lie in other spaces.
>
> Any mechanism for doing volume management. e.g. MBR, GPT, LVM (Linux),
> LDM (Windows). Sometimes these overlap and interact in complex ways,
> e.g. LDM has an MBR and a GPT, both of which it ignores in favour of its
> own metadata.
>
>> However if you don't care about guests that are malicious / hiding
>> data, then you can ignore everything except for the MBR and any
>> non-zero data between the MBR and the first partition. Note for GPT
>> you have to take into account two partition tables as well.
>>
>>> If these have been compared, the compared image should be as safe to use
>>> as the original one?
>>>
>>> (I could imagine that there can be extra data outside filesystem, maybe
>>> in regions outside the partition table, but those data shouldn't get
>>> executed after starting the image in a VM.)
>
> I'm coming in to this discussion late, so I don't know what you're doing
> or how paranoid you need to be.

A few years ago, I would have said very paranoid. Otherwise, I wouldn't
do it in the first place. :) Nowadays, after the news coverage, I'd say
no paranoia at all, just reasonable precautions. ;)

> However, cranking up the paranoia a
> little, imagine the following scenario:
>
> There's a bug in a critical boot element which means the boot relies on
> uninitialised disk space. As it happens, in a normal installation this
> uninitialised disk space is always safe and it's located somewhere which
> will rarely, if ever, be touched, so nobody has ever noticed it.
> (Paranoia level: state actor. Somebody put the bug there deliberately.)
> Malicious person modifies the uninitialised disk space. Your tool will
> never notice. The boot process is now compromised.
>
> You could probably come up with more with a few minutes of thought. I'm
> pretty sure a dedicated team given a few months to work on this project
> could come up with some inventive ideas :)

I hope you are wrong. :)

I am going to ask for more feedback on another mailing list after the
initial implementation of the script is done. (At the moment I am making
good progress: the initial report creation script is almost finished, and
I am currently ironing out a few non-deterministic /var/cache... files
and folders and recreating them during the first boot.)

_______________________________________________
Libguestfs mailing list
https://www.redhat.com/mailman/listinfo/libguestfs
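
For reference, the check Richard describes in the quoted text (look at the
MBR and at any non-zero data between the MBR and the first partition) could
be sketched roughly as follows. This is only a minimal Python sketch under
stated assumptions, not the report script mentioned in this thread: it
assumes a raw disk image, 512-byte sectors and a classic MBR partition
table, and the names first_partition_lba and boot_area_report are made up
for illustration. It does not handle GPT's two partition tables, LVM or LDM
metadata, or any other unused space.

```python
import hashlib
import struct
from typing import Optional

SECTOR = 512  # assumption: 512-byte logical sectors


def first_partition_lba(mbr: bytes) -> Optional[int]:
    """Return the lowest start LBA among the non-empty primary MBR entries."""
    if len(mbr) < SECTOR or mbr[510:512] != b"\x55\xaa":
        raise ValueError("no MBR boot signature found")
    starts = []
    for i in range(4):
        # Four 16-byte partition entries start at byte offset 446.
        entry = mbr[446 + 16 * i : 446 + 16 * (i + 1)]
        ptype = entry[4]                                # 0x00 means unused
        (lba_start,) = struct.unpack_from("<I", entry, 8)
        if ptype != 0 and lba_start > 0:
            starts.append(lba_start)
    return min(starts) if starts else None


def boot_area_report(image_path: str) -> dict:
    """Checksum the MBR and the gap between it and the first partition."""
    with open(image_path, "rb") as f:
        mbr = f.read(SECTOR)
        first = first_partition_lba(mbr)
        # The gap covers sectors 1 .. first-1; empty if no partition is found.
        gap = f.read((first - 1) * SECTOR) if first else b""
    return {
        "mbr_sha256": hashlib.sha256(mbr).hexdigest(),
        "gap_sha256": hashlib.sha256(gap).hexdigest(),
        "gap_bytes": len(gap),
        "gap_nonzero": any(gap),  # True if any byte in the gap is non-zero
    }
```

Running boot_area_report() on two images and comparing the resulting
dictionaries would only show whether these two regions match; as Matthew
points out, it says nothing about other uninitialised space on the disk.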
