This is sort of off topic and not directly related to HLFS, but considering the subject I thought I'd shoot this out at you guys.
Transparently Transmitting Checksums With Archives

I have recently had my mind on MD5 sums for file downloads and such. I had put very little thought toward md5sums beyond normal use, but it struck me that this could be made easier: md5sums would be easier to transmit with the actual file rather than separately. The idea is to make MD5 checksums implicit. For a given compression format, say gzip, first compress the archive, then make an MD5 checksum of it. Once both are set up, wrap the original archive and its checksum together in a second (tar) archive. Any application could then untar the original file, run the checksum automatically, and only continue if the checksum passes. This method requires nothing new to be installed, though a small tool could be installed to handle extraction and auto-check the checksum on extract.

Here is a bash script I use to create a TMG file (TMG = Tar, MD5, Gzip):

BEGIN_SCRIPT
#!/bin/bash
# Archive, compress, and checksum each argument, then wrap the
# compressed archive and its .md5 file into a single .tmg file.
for i in "$@" ; do
    I=${i%/}    # strip a trailing slash, if any
    echo "Attempting to archive, compress, and checksum: $I"
    tar --numeric-owner -pc "$I" | gzip --best > "$I.tgz" &&
    echo "Archived & compressed: $I.tgz" &&
    md5sum "$I.tgz" > "$I.md5" &&
    echo "Created MD5 checksum: $I.md5" &&
    tar --numeric-owner -pc "$I.tgz" "$I.md5" > "$I.tmg" &&
    rm -f "$I.tgz" "$I.md5" &&
    echo "Created TMG file: $I.tmg" ||
    echo "An error occurred while processing $I" >&2
done
END_SCRIPT

System Integrity Scans

With checksums, one could also perform regular checkups on the state of system libraries and programs (binaries in general). Given that binaries do not normally change except when they are updated, one could create a root-only directory (say /checksum, with mode drwx------) and have the init program run regular checkups on the state of the binaries. For security, this integrity check would let the system identify a potentially infected or damaged binary.
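Returning to the TMG format for a moment: the extraction side could work as a mirror image of the creation script above, unpacking the outer tar, verifying the inner archive against its .md5, and extracting the payload only on a match. A minimal sketch, assuming GNU tar and coreutils md5sum; the tmg_extract function name is my own, not an existing tool:

```shell
#!/bin/bash
# Hypothetical TMG extractor: unpack the outer tar into a scratch
# directory, verify the inner .tgz against its .md5, and extract
# the payload into the current directory only if the sum matches.
tmg_extract() {
    local tmg=$1 work status
    work=$(mktemp -d) || return 1
    tar -xf "$tmg" -C "$work" &&
    ( cd "$work" && md5sum -c --quiet ./*.md5 ) &&
    tar -xzf "$work"/*.tgz
    status=$?
    rm -rf "$work"
    return $status
}
```

On a checksum mismatch, md5sum -c exits non-zero, the && chain stops, and nothing is extracted, which is exactly the "only continue if the checksum passes" behavior described above.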
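The scan just described can be sketched as a small script: record a baseline of sums on the first run, then verify against it on every later run. This is only an illustration of the idea, assuming GNU md5sum and find; the check_integrity name and the file layout are mine, not an existing tool:

```shell
#!/bin/bash
# Hypothetical integrity scan: on the first run, record MD5 sums of
# every file under a directory into a baseline file (kept, per the
# idea above, in a root-only /checksum directory, mode 0700); on
# later runs, verify the files against that baseline.
check_integrity() {
    local basedir=$1 sumfile=$2
    if [ ! -f "$sumfile" ]; then
        find "$basedir" -type f -exec md5sum {} + > "$sumfile"
        echo "baseline recorded in $sumfile"
    elif md5sum -c --quiet "$sumfile"; then
        echo "ok"
    else
        echo "WARNING: checksum failure under $basedir" >&2
        # at this point one could quarantine the file, strip its
        # execute bit, and reinstall or rebuild it
        return 1
    fi
}
```

Run from cron, something like "check_integrity /sbin /checksum/sbin.md5" once a day would match the midnight-check idea suggested later in this post.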
In the rare case of a Linux virus, the md5sum would detect a potentially infected or insecure binary through a checksum failure. The system could then move the infected file somewhere safe, make it un-executable, and attempt to replace it. On a package-managed system, the application performing the checksum could attempt to download the correct binary and replace the old one. On a source-based system with a set of how-to-compile rules, a recompilation of the infected or damaged file could be performed automatically. I imagine this would also be useful for setting off alarms telling the user to check the hardware and/or filesystem(s) for problems.

Even on an embedded system, there should be little overhead in performing these checksums. A less aggressive approach would be to check timestamps and only verify the files that changed, but that helps less against hardware or filesystem failures.

A quick test against my /sbin/, which contains 28M of programs:

Checksum space usage: 1.1M
Time to checksum /sbin/:
real    0m1.401s
user    0m0.210s
sys     0m0.187s

For my unusual /bin/ directory of 168M:

Checksum space usage: 4.5M
Time to checksum /bin/:
real    0m4.943s
user    0m1.133s
sys     0m1.010s

So on a standard desktop computer, these checksums take a negligible amount of time and resources. Performing the checks once a day at midnight would prove useful.

-- 
Kevin Day
