On 2026-03-22 16:52, Terence Kelly wrote:

1. Can you confirm that shuf uses the Fisher-Yates/Durstenfeld unbiased shuffle algorithm?

Never heard of that name for that algorithm, which I consider obvious. As I recall, I independently invented a superset of it in 2006 and shuf uses this superset algorithm, which is equivalent to Fisher-Yates/Durstenfeld in the special case you're surely thinking of. See coreutils/gl/lib/randperm.c's randperm_new.
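
For readers who have not seen the algorithm under either name, here is a minimal sketch. It is not the randperm.c code. It assumes that "superset" means producing only the first h elements of a random permutation of n elements, with h == n giving the full Fisher-Yates/Durstenfeld shuffle; that reading is an assumption, not something stated above. The names partial_shuffle, pick_fn, struct script and scripted_pick are hypothetical, and scripted_pick is a deterministic stand-in that only makes the sketch checkable; a real caller would supply an unbiased picker.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical picker type: must return a uniform value in [0, bound). */
typedef size_t (*pick_fn)(size_t bound, void *state);

/* Partial Durstenfeld shuffle (illustrative, not coreutils code).
   Afterwards a[0..h-1] holds the first h elements of a permutation
   of the original a[0..n-1]; with h == n this is the full shuffle. */
static void partial_shuffle(int *a, size_t h, size_t n,
                            pick_fn pick, void *state)
{
    assert(h <= n);
    for (size_t i = 0; i < h; i++) {
        /* Choose uniformly among the n - i not-yet-fixed slots. */
        size_t j = i + pick(n - i, state);
        int tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
    }
}

/* Deterministic stand-in picker that replays a fixed list of choices. */
struct script {
    const size_t *choices;
    size_t len, pos;
};

static size_t scripted_pick(size_t bound, void *state)
{
    struct script *s = state;
    assert(s->pos < s->len && s->choices[s->pos] < bound);
    return s->choices[s->pos++];
}
```

The result is uniform only if every call to the picker is itself uniform over [0, bound), which is what question 2 is about.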


2. Can you confirm that shuf avoids modulo bias --- the infamous and widespread bug, "random_number%N" --- when it makes equiprobable selections in its implementation of the shuffle algorithm?

Yes. See coreutils/gl/lib/randint.c's randint_genmax.
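
To illustrate the bug in question and one standard way around it, here is a sketch of rejection sampling. It is not the randint_genmax code, whose details may differ. The names uniform_below, word_fn, struct replay and replay_next are hypothetical; replay_next just replays a fixed array of 32-bit words so the behavior can be checked by hand.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical raw source: returns uniform 32-bit words. */
typedef uint32_t (*word_fn)(void *state);

/* Return a uniform value in [0, n) by rejection sampling.
   A plain "r % n" is biased whenever 2^32 is not a multiple of n,
   because the low residues occur one extra time.  Here any r above
   the largest multiple of n is discarded and a fresh word is drawn. */
static uint32_t uniform_below(uint32_t n, word_fn next, void *state)
{
    assert(n > 0);
    /* excess == 2^32 mod n, computed without 64-bit arithmetic. */
    uint32_t excess = (UINT32_MAX % n + 1) % n;
    uint32_t max_ok = UINT32_MAX - excess; /* accept r <= max_ok */
    for (;;) {
        uint32_t r = next(state);
        if (r <= max_ok)
            return r % n;
    }
}

/* Deterministic stand-in source that replays a fixed array of words. */
struct replay {
    const uint32_t *words;
    size_t len, pos;
};

static uint32_t replay_next(void *state)
{
    struct replay *s = state;
    assert(s->pos < s->len);
    return s->words[s->pos++];
}
```

For example, with n = 3 the word 4294967295 is rejected, since only values up to 4294967294 are accepted, and a following word 7 yields 7 % 3 = 1.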


3. Can you confirm that when shuf's "--random-source=FILE" option is used, the specified file is the sole source of random bits for all of shuf's behavior, and that if the same random source file is used again, holding all other shuf options & inputs constant, then shuf will emit the same output?

Yes. See coreutils/gl/lib/randread.c's randread_new and its callers.


Before I can recommend shuf, however, I must confirm that it is free of several defects that are often found in other random permutation software.

I suggest recommending coreutils 9.6 (2025-01-17) or later, due to the bug fixed here (a bug that's not on your list...):

https://cgit.git.savannah.gnu.org/cgit/coreutils.git/commit/?id=bfbb3ec7f798b179d7fa7b42673e068b18048899

The bug doesn't matter if you use --random-source=FILE, as that option bypasses the bug.


