Hi,

Ouch. In jigdo-lite it is not easy to have the downloaded files verified
against the checksums of the expected FileParts.
Steve, I need a decision on which direction I should go:

- Check the .jigdo MD5s in jigdo-lite itself.

- Check in jigdo-file, with a new option --warn-unused-file to enable my
  "POSSIBLE FILE CORRUPTION" test while jigdo-lite is cycling between
  downloading and jigdo-file "make-image" scanning. (I expect this test
  to produce lots of false positives if jigdo-file used it when
  exploiting a large local pool tree.)

- Declare "Won't fix" and have other fun.

---------------------------------------------------------------------

Things which are so far OK for an MD5 check in jigdo-lite:

The list of files to download is obtained by a run of

  jigdo-file print-missing-all ...

This is not too bad, because it delivers not only a list of possible URLs
per file (usually one per file) but also an MD5 in jigdo-file's modified
base64 encoding. The jigdo-file command "md5sum" is supposed to produce a
disk file's MD5 in the same format, so a comparison would be possible.

If I add "http://archive.debian.org/..." to the [Servers] list in the
.jigdo file, I get per missing file two URLs, one encoded MD5, and an
empty line:

  http://archive.debian.org/.../openssh-client-udeb_5.5p1-6+squeeze3_amd64.udeb
  http://us.cdimage.debian.org/.../openssh/openssh-client-udeb_5.5p1-6+squeeze3_amd64.udeb
  MD5Sum:BjBWgpWgZYkV0gdXgcpm5A

  http://archive.debian.org/.../reiserfsprogs-udeb_3.6.21-1_amd64.udeb
  http://us.cdimage.debian.org/.../reiserfsprogs-udeb_3.6.21-1_amd64.udeb
  MD5Sum:HEsrTtJufOa50DKzAIQ3EA

jigdo-lite seems to expect up to 8 such URLs per file. See it counting on
its fingers in line 591:

  for pass in x xx xxx xxxx xxxxx xxxxxx xxxxxxx xxxxxxxx; do
    ...
    while $readLine url <&3; do
      count="x$count"
      ...
      if test "$count" != "$pass"; then continue; fi

Up to 10 collected URLs are then handed as arguments to the function
fetchAndMerge, which not only downloads them but also runs jigdo-file to
put them into the emerging ISO. So this is where the verifying would have
to happen.
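To illustrate, here is a minimal sketch (not code from jigdo-lite) of how
that print-missing-all output could be parsed into one record per file
part, so that the encoded MD5 travels alongside its URLs. The function
name parse_missing is my own invention; it only does text grouping and
does not download anything:

```shell
#!/bin/sh
# Hypothetical helper: read "jigdo-file print-missing-all" style output
# (URL lines, an "MD5Sum:" line, then a blank line per file part) and
# emit one line per part: "<encoded md5> <url1> <url2> ..."
parse_missing() {
  urls=""
  while IFS= read -r line; do
    case "$line" in
      "")        urls="" ;;                                # blank line ends a group
      MD5Sum:*)  printf '%s %s\n' "${line#MD5Sum:}" "$urls" ;;
      *)         urls="$urls$line " ;;                     # collect alternative URLs
    esac
  done
}

# Demo with the two sample blocks from above ("..." paths left as-is;
# nothing is fetched, this is pure text processing):
parse_missing <<'EOF'
http://archive.debian.org/.../openssh-client-udeb_5.5p1-6+squeeze3_amd64.udeb
http://us.cdimage.debian.org/.../openssh/openssh-client-udeb_5.5p1-6+squeeze3_amd64.udeb
MD5Sum:BjBWgpWgZYkV0gdXgcpm5A

http://archive.debian.org/.../reiserfsprogs-udeb_3.6.21-1_amd64.udeb
http://us.cdimage.debian.org/.../reiserfsprogs-udeb_3.6.21-1_amd64.udeb
MD5Sum:HEsrTtJufOa50DKzAIQ3EA
EOF
```

Each output line could then be compared against what "jigdo-file md5sum"
reports for the downloaded file.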
I made a plan for how to pass the MD5s of the URLs as further arguments
to fetchAndMerge. Since the encoded MD5s are single words, one could send
them down as the first argument, do "shift 1", and then give the
remaining arguments to function "fetch" for download.

---------------------------------------------------------------------

But then it becomes ugly: Now fetchAndMerge has URLs for wget and
corresponding MD5s for files. It would need to deduce the file paths from
the URLs in order to run "jigdo-file md5sum" on them. jigdo-file
make-image gets the root of the file pool. I do not dare to guess whether
only the freshly downloaded files are in there. If others are present,
the relation between downloaded files and MD5s would derail.

---------------------------------------------------------------------

Possible workaround:

I am now exploring the effort of introducing a new option for jigdo-file:

  --warn-unused-file  [make-image]
                      Complain if a submitted file matches none of the
                      wanted checksums

It shall control whether the messages

  POSSIBLE FILE CORRUPTION: Offered file did not fit into the template.
  POSSIBLY CORRUPTED: `...path...'

shall be emitted if a file does not match any checksum wanted by the
template. The option will be disabled by default. The jigdo-lite function
fetchAndMerge could set it.

I am still unsure whether jigdo-lite should use it by default. In my
tests with "netinst" and "businesscard" images it produced no false
positives. If it encounters surplus files which were not freshly
downloaded, it would report them but would not confuse them with others.
I can invest a few dozen GB of netload into larger tests, if this is
desired.

---------------------------------------------------------------------

Well, given the fact that this is only for the unusual case of damaged
files on the fallback server, one could easily argue that the risk of a
regression is not outweighed by the potential benefit.

Have a nice day :)

Thomas
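P.S.: For clarity, the shift-based argument plan could look roughly like
the sketch below. This is NOT the real fetchAndMerge; the function name,
the one-MD5-per-URL pairing, and the printf placeholders are my
simplifying assumptions (in reality one MD5 covers a group of alternative
URLs for the same file part):

```shell
#!/bin/sh
# Hypothetical sketch of the plan: the encoded MD5s travel as one
# space-separated word list in argument 1; after "shift 1" the positional
# parameters are pure URLs again, as the existing download code expects.
fetchAndMerge_sketch() {
  md5_list=$1                   # e.g. "BjBWgpWgZYkV0gdXgcpm5A HEsrTtJufOa50DKzAIQ3EA"
  shift 1
  i=1
  for url in "$@"; do
    # pick the i-th MD5 from the list (assumed to be in URL order)
    md5=$(printf '%s\n' "$md5_list" | cut -d ' ' -f "$i")
    printf 'would fetch %s and expect MD5 %s\n' "$url" "$md5"
    i=$((i + 1))
  done
  # the real function would now download via fetch "$@" and then run
  # jigdo-file make-image on the pool
}

fetchAndMerge_sketch \
  "BjBWgpWgZYkV0gdXgcpm5A HEsrTtJufOa50DKzAIQ3EA" \
  "http://archive.debian.org/.../openssh-client-udeb_5.5p1-6+squeeze3_amd64.udeb" \
  "http://archive.debian.org/.../reiserfsprogs-udeb_3.6.21-1_amd64.udeb"
```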