Frederik Eaton <[EMAIL PROTECTED]> wrote:
> So, what is the current state of things? Who is in charge of accepting
> patches? Are we decided that a 'shuffle' command but no 'sort -R'
> facility would be best, or that it would be good to have both, or is
> it still in question whether either would be accepted?

I am the official `maintainer', but Paul Eggert has been making most
of the changes recently.

It looks like there are some desirable features that can be provided
only by a shuffle-enabled program that is key-aware. Key specification
and the comparison code are already part of sort. Obviously,
duplicating all of that in a separate program is not an option. I
don't relish the idea of factoring out sort's line- and key-handling
code either, but it might be feasible.

However, I do like the idea of a new program that simply outputs a
random permutation of its input records, and that does it well, and
repeatably. The Unix tool philosophy certainly does encourage the
`perform one task and do it well' approach. Since doing it well
includes handling input larger than available virtual memory, this is
not trivial -- and it is well suited to the coreutils, i.e., it's not
easily implementable as a script.

Initially, I was inclined to say that adding both the new program (no
key support) and related functionality to sort was desirable. Thinking
of the limits of robustness of such a new program, I realized that if
the input is sufficiently large and not seekable (e.g., from a pipe),
then the program will have to resort to writing temporary files, much
as sort already does. More duplicated effort, determining how much
memory to use (like sort's --buffer-size=SIZE option), managing the
temporary files, ensuring that they're removed upon interrupt, etc.
But maybe not prohibitive.

The new program would also have to have an option like sort's -z,
--zero-terminated option, and --temporary-directory=DIR, and
--output=FILE. In effect, it would need all of sort's options that
don't relate to sorting.

So implementing a robust shuffle program, even one without key
handling capabilities, would require much of the infrastructure
already present in sort.c.

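For illustration only, here is a minimal Python sketch (not coreutils
code) of the two behaviors discussed above: a repeatable shuffle of
input records, and a key-aware variant that orders records by a salted
hash of one field, so that records with equal keys stay adjacent. The
function names, the seed parameter, and the whitespace field splitting
are all assumptions made for this sketch. It works entirely in memory,
so it sidesteps the temporary-file problem rather than solving it.

```python
import hashlib
import random


def split_records(data: bytes, zero_terminated: bool = False) -> list:
    """Split input on newline, or on NUL when zero_terminated is set
    (hypothetical analogue of sort's -z option)."""
    delim = b"\0" if zero_terminated else b"\n"
    records = data.split(delim)
    # A trailing delimiter leaves one empty piece at the end; drop it.
    if records and records[-1] == b"":
        records.pop()
    return records


def shuffle_records(records: list, seed: int) -> list:
    """Return a random permutation of records.

    Using a private generator seeded with `seed` makes the result
    repeatable: the same seed and input give the same order.
    """
    out = list(records)
    random.Random(seed).shuffle(out)
    return out


def shuffle_by_key(records: list, seed: int, field: int) -> list:
    """Order records by a salted hash of one whitespace-separated field.

    Records whose key fields are equal hash to the same value, so they
    end up next to each other; different seeds give different group
    orders. Assumes 0 <= seed < 2**64.
    """
    salt = seed.to_bytes(8, "big")

    def rank(record: bytes) -> bytes:
        fields = record.split()
        # A missing field is treated as an empty key (an assumption).
        key = fields[field] if field < len(fields) else b""
        return hashlib.sha256(salt + key).digest()

    # sorted() is stable, so records with equal keys keep input order.
    return sorted(records, key=rank)
```

Calling shuffle_records(recs, 42) twice returns the same permutation,
while shuffle_by_key(recs, 42, 0) keeps all records sharing a first
field together. The second function has to parse keys, which is the
part that would overlap with sort's existing key-handling code.
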
It sure sounds like shuffle and sort should share a lot of code, one
way or another, so why not have them share the line- and key-handling
code, too?

I won't rule out adding a new program, like shuffle, but I confess I'm
less inclined now than when I started typing this message.

_______________________________________________
Bug-coreutils mailing list
Bug-coreutils@gnu.org
http://lists.gnu.org/mailman/listinfo/bug-coreutils