Samuel Thibault, 2012-01-17 12:03:41 +0100 : [...]
> I'm not sure to understand what you mean exactly. If you have even
> just a hundred files of the same size, you will need ten thousand file
> comparisons!

I'm sure that can be optimised. Read all 100 files in parallel,
comparing the blocks found at the same offset. You need to perform 99
comparisons on each block for as long as the blocks are identical; when
one of the 99 doesn't match, you can split your set of files at this
offset into at least 2 equivalence classes, which you treat as separate
subsets from then on. A subset with only one file can be eliminated
from the rest of the scan, and even if there are only multiple-file
subsets, the number of comparisons to be performed at further steps is
reduced by at least one.

Roland.
-- 
Roland Mas

You can tune a filesystem, but you can't tuna fish.
  -- in the tunefs(8) manual page.
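For illustration, the lockstep scan described above could be sketched roughly as follows. This is only a sketch under assumptions, not code from this thread: the function name `duplicate_groups` and the 64 KiB block size are hypothetical, and grouping blocks in a dict stands in for the explicit 99 comparisons against a reference block. It also keeps every file open at once, so a real tool would have to mind the file descriptor limit.

```python
from collections import defaultdict

BLOCK_SIZE = 64 * 1024  # hypothetical value, chosen only for this sketch


def duplicate_groups(paths, block_size=BLOCK_SIZE):
    """Return lists of paths whose contents are byte-for-byte identical.

    All files are read in lockstep, one block per round. Each round splits
    every candidate group by block content; a group that shrinks to a
    single file is dropped from the rest of the scan.
    """
    handles = {p: open(p, "rb") for p in paths}
    try:
        pending = [list(paths)] if len(paths) > 1 else []
        identical = []
        while pending:
            next_pending = []
            for group in pending:
                # Partition the group by the block read at the current offset.
                buckets = defaultdict(list)
                for p in group:
                    buckets[handles[p].read(block_size)].append(p)
                for block, members in buckets.items():
                    if len(members) < 2:
                        continue  # unique so far: no need to read further
                    if block == b"":
                        identical.append(members)  # all reached EOF together
                    else:
                        next_pending.append(members)
            pending = next_pending
        return identical
    finally:
        for h in handles.values():
            h.close()
```

Called on the list of same-sized candidates, e.g. `duplicate_groups(["a.bin", "b.bin", "c.bin"])` (hypothetical file names), it reads each file at most once and stops reading a file as soon as it is alone in its class.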