This is in response to a mail I found in the archive from "Wed, 25 Aug 2004 16:14:15 +0200" that adds an -R option to sort randomly. I've been seeking to add such functionality to sort for a while.
1. It looks like this isn't in 5.2.1 yet, has something been added to cvs? Is something planned? Is somebody working on a patch? 2. Was this patch actually tested? It doesn't work for me, e.g.: $ sort -R < /usr/share/dict/words | tail -n 15 more palming palindromes muting mutilation palindrome's mopes palest mister palette pall moribund mooted morbid mordant I tried modifying sort in this way a long time ago, by having the key comparison functions return random bits, and found that it gave similar results. The reason is that gnu 'sort' uses mergesort - but problems may arise with other algorithms as well. When you're merging two lists, if you pick elements randomly from one list or the other with equal probability, you'll find that after 'n' picks you'll be on average sqrt(n) farther into one list or the other, so when you finish one list, you still have about sqrt(n) leftover elements in the other list. These end up consecutive in your result. After several merges, the cumulative effect is that the tail of your list is not very well sorted. A better algorithm would be as follows: combine each key you want to randomize with some salt (your "random seed"), and compute its hash. (By the way, it would be best to be able to randomize on a per-key basis) Then compare as you would other keys. It is hard to see how this could give incorrect results. Furthermore, it has other desired properties: reproducibility across systems, no dependence on libc's rand(). Would the maintainers be receptive to a patch which accomplished this if it followed the style guidelines etc.? I think a feature like this would be quite useful to all manner of users, from those involved in statistical investigations to those who want to randomize their mp3's without installing a separate piece of software. 3. Is there a way to subscribe to this mailing list? Thanks, Frederik Eaton _______________________________________________ Bug-coreutils mailing list Bug-coreutils@gnu.org http://lists.gnu.org/mailman/listinfo/bug-coreutils