On Fri, Jul 06, 2012 at 12:44:24PM +0200, Joerg Schilling wrote:
>> Tim Kientzle <[email protected]> wrote:
>>
>> > To be honest, I've considered altering bsdtar's directory-traversal
>> > code so that it always sorts the first 100 names in a directory
>> > and then leaves the rest unsorted. That would give fully-sorted
>> > output for almost all cases and avoid the memory consumption
>> > (and slow performance) on very large directories.
>>
>> 100 directory entries looks a bit small.
>>
>> average values seem to be 5..50 entries per directory, 100 is a number that
>> looks too close to values from every day.
>>
>> But in general: as directories usually don't have too many entries, it is
>> not a problem to read a directory at once (and this is why star uses this
>> method since 10 years).

To be honest myself, I sometimes have to deliver directories that contain
about 50000 files, which I would like sorted. In any case, I prefer to rely
on a single additional option on my tar command line (in TAR_OPTIONS in this
case) rather than on a complex and error-prone find incantation.

If tar does not provide this, either directly (as an option) or indirectly
(by just replacing 100 with 50000 in the code), I feel capable of inserting
some dirty malloc/qsort/free somewhere in tar/src/create.c, not far from
where --no-recursion currently takes effect.

Denis Excoffier.
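
For illustration only, the malloc/qsort/free idea could look roughly like the
sketch below. It is not code from tar/src/create.c and does not use tar's
internal data structures; it relies only on POSIX opendir/readdir and the
standard qsort. The names read_dir_sorted, sort_names, free_names and
cmp_names are hypothetical, and plain strcmp ordering is an assumption (a
locale-aware order would need strcoll instead).

```c
#include <dirent.h>
#include <stdlib.h>
#include <string.h>

/* qsort comparator: each element is a char *, so we receive char **. */
int cmp_names(const void *a, const void *b)
{
    const char *const *pa = a;
    const char *const *pb = b;
    return strcmp(*pa, *pb);
}

/* Sort an array of n name pointers in place (byte-wise strcmp order). */
void sort_names(char **names, size_t n)
{
    if (n > 1)
        qsort(names, n, sizeof *names, cmp_names);
}

/* Release every name and the array itself. */
void free_names(char **names, size_t n)
{
    for (size_t i = 0; i < n; i++)
        free(names[i]);
    free(names);
}

/* Read all entries of `path` (except "." and "..") into a malloc'ed
 * array, sort it, and store the entry count in *count.
 * Returns NULL if the directory cannot be opened or memory runs out. */
char **read_dir_sorted(const char *path, size_t *count)
{
    DIR *dir = opendir(path);
    if (dir == NULL)
        return NULL;

    size_t n = 0, cap = 64;
    char **names = malloc(cap * sizeof *names);
    if (names == NULL) {
        closedir(dir);
        return NULL;
    }

    struct dirent *ent;
    while ((ent = readdir(dir)) != NULL) {
        if (strcmp(ent->d_name, ".") == 0 || strcmp(ent->d_name, "..") == 0)
            continue;
        if (n == cap) {                      /* grow the array geometrically */
            char **grown = realloc(names, 2 * cap * sizeof *names);
            if (grown == NULL)
                goto fail;
            names = grown;
            cap *= 2;
        }
        size_t len = strlen(ent->d_name) + 1;
        names[n] = malloc(len);              /* copy: d_name is reused by readdir */
        if (names[n] == NULL)
            goto fail;
        memcpy(names[n], ent->d_name, len);
        n++;
    }
    closedir(dir);

    sort_names(names, n);
    *count = n;
    return names;

fail:
    closedir(dir);
    free_names(names, n);
    return NULL;
}
```

With about 50000 entries this keeps one pointer plus one short string per
entry in memory, which is the cost being weighed in the thread against the
"sort only the first 100 names" heuristic.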
