On octidi 8 Fructidor, year CCXXIV, Thomas Schmitt wrote:
>   dd if=/dev/cdrom bs=1M count=$blocks of=/media/richard/myisos/dvd_1.iso

Useless use of dd. head -c will perform as well, without the need for
arithmetic. And, for a DVD (but not a CD), I think just cat without isosize
would work as well.
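
To make the head -c suggestion concrete, here is a minimal sketch. The
helper name copy_image and the output file name are made up for
illustration; isosize (from util-linux) prints the filesystem size in bytes
by default, which is why no block arithmetic is needed.

```shell
# copy_image SRC DEST SIZE (hypothetical helper, not from the original mail)
# Copies exactly the first SIZE bytes of SRC into DEST.
copy_image() {
    head -c "$3" "$1" > "$2"
}

# On a real drive it could be called like this (paths are assumptions):
#   copy_image /dev/cdrom dvd_1.iso "$(isosize /dev/cdrom)"
```

The byte count matters mostly for CDs, where reading to the end of the
device may fail; for a DVD, as said above, plain cat would probably do.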

> A discussion on reproducible-builds a year ago yielded that the file
> content sorting order by libisofs did not match the sorting order of
> directory records in the tree. This was fixed by release 1.4.2.

While reading the beginning of your mail, I was about to point out that files
in ISO 9660 filesystems are written in sequence and should not cause much
seeking. Obviously, you already knew that and had thought of it.

Any idea how they managed that? Naively, I imagine that creating the
directory index and then creating the file entries is done from the same
in-memory data structure after a single sort.

> Nevertheless it turns out that the layers of Debian GNU/Linux 8 still
> do a poor job. I repacked the ISO by xorriso-1.4.5 and verified that
> the data extents are sorted according to the sorting of the ECMA-119
> and Rock Ridge tree. Simple tree traversal or alphabetically sorted
> tree traversal would yield smooth reading, but cp -r has different ideas
> about sequence.

You can use -v to easily see the order cp chooses. AFAIK, cp -r does no
sorting and uses the order from the kernel, and the kernel, for ISO 9660,
uses the order in the directory data, so that should work OK.
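
A small sketch of how the order could be captured, assuming GNU cp. The
helper name log_copy_order and the paths in the usage comment are
hypothetical.

```shell
# log_copy_order SRC DEST LOG (hypothetical helper)
# Copies SRC recursively to DEST and saves cp's verbose output, one line per
# copied entry, in the order cp processed them.
log_copy_order() {
    cp -rv "$1" "$2" > "$3"
}

# Possible use (mount point and target are assumptions):
#   log_copy_order /media/cdrom/pool /tmp/pool /tmp/order.log
#   head /tmp/order.log
```

Comparing that log with the order of the directory records on the disc
would show whether cp really follows the kernel's readdir order.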

Actually, since rsync does its own sorting, it could lead to worse
results in this case.

Still, we are speaking of a Debian install CD: the bulk of the data should
be made of a pool directory with only subdirectories on one level containing
plain files. All with file names from the almost-portable character set (+
and ~ are used), in lowercase. There is not much room for sorting
discrepancies.

But IIRC, ISO 9660 stores the whole directory structure first and only then
the files' payload. That could be the explanation, since cp -r reads the
directories as they come (and even, apparently, descends into subdirectories
only after it has copied the plain files). The seeking would look like this:

readdir pool
readdir pool/a [no seeking]
copy pool/a/a-1.deb [seeking over the rest of the directory structure]
copy pool/a/a.orig.tar.gz [no seeking]
readdir pool/b [seeking back]
...

In that case, running "find /media/cdrom > /dev/null" repeatedly to keep the
whole structure of the hierarchy in the inodes and dentries cache could
speed things up.

And also, rsync would speed things up since it establishes a list of all the
files to copy before starting, and its sort order should yield the same
result with these particular file names. I do not have an optical drive at
hand to check.
