On Thursday 02 June 2005 10:31, Jim Meyering wrote:
> It sure sounds like shuffle and sort should share a lot of code,
> one way or another, so why not have them share the line- and key-
> handling code, too?  I won't rule out adding a new program, like
> shuffle, but I confess I'm less inclined now than when I started
> typing this message.

Now that I've heard some examples, I think sort --random is an excellent idea, and I hope it is included in coreutils at some point. sort has scope far beyond shuffle.

Assuming sort --random's eventual entry to coreutils is a given, I think we have two questions to decide on:

a) Implementation details aside, does a shuffle command merit entry?
b) Examining implementation details, what is the best way to go?

My take is that shuffle is good for the lazy man: as currently written, it is destructive, replacing a file A with random(A). I intend (after I add -z and -o, of course :] ) to add a --head (-h) option, so that if all we want is the first line, that's all we have to process. My thought is that these properties make shuffle ideal for simple, quick hitters via "system"-type calls in the various scripting languages that are commonly used.

To help illustrate, here are the common use cases I envision. In each case the first command uses shuffle and the second is the sort --random equivalent:

USE CASE 1: Randomizing file contents

    shuffle *
    ls * | xargs -l -ii sort --random "i" -o "i"

USE CASE 2: Grabbing a file at random

    ls * | shuffle -h 1
    ls * | sort --random | head -n 1

USE CASE 3: Generating a list of random files

    find . -name \*.mp3 | shuffle -o "playlist.m3u"
    find . -name \*.mp3 | sort --random -o "playlist.m3u"

Efficiency-wise, I think shuffle will run quicker, but that may not be an issue given that the average case is small.

For question a) above, I'm thinking there is room. In the same way we have both grep -r and find . | xargs grep, I'm assuming we can have both shuffle and sort (from a user's perspective).

If we assume that the potential exists for both sort and shuffle, the devil is then in the details. How much of sort would exist in shuffle, or vice versa? Should there be a GNU coreutils include that deals specifically with temp files, for use by any utility? Ahhh, questions, questions... I'm not sure how we should approach it. Is the answer to b) unknown at present?

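To make the --head idea above a bit more concrete, here is a minimal sketch in Python (not the actual shuffle code, which is not shown in this message). It assumes the input lines are already in memory and uses a partial Fisher-Yates shuffle, so only the first k positions are ever settled. The name shuffle_head and its parameters are made up for illustration.

```python
import random


def shuffle_head(lines, k, rng=None):
    """Return k lines chosen uniformly at random, in random order.

    Hypothetical helper illustrating a --head style option: a partial
    Fisher-Yates shuffle that stops after k swaps instead of shuffling
    the whole input.
    """
    if rng is None:
        rng = random.Random()
    pool = list(lines)           # copy so the caller's list is untouched
    k = max(0, min(k, len(pool)))
    for i in range(k):
        # Choose uniformly from the not-yet-settled tail pool[i:].
        j = rng.randrange(i, len(pool))
        pool[i], pool[j] = pool[j], pool[i]
    return pool[:k]


if __name__ == "__main__":
    names = ["a.mp3", "b.mp3", "c.mp3", "d.mp3"]
    # Roughly what "ls * | shuffle -h 1" would do: pick one name at random.
    print(shuffle_head(names, 1))
```

The whole input still has to be read, but the shuffling work is proportional to k rather than to the number of lines, which is the saving the -h option is after.
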
Trying to get our arms around the issue could lead to a great deal of analysis paralysis, though I'm always willing to try. If we agree that a) is a given, maybe we should just try to add the N-scale code to shuffle, with a parallel --random effort in sort? Then we can operate in hindsight, refactoring and adjusting as necessary. One potential plan, at any rate.

Of course, if we agree that shuffle should not be included, then no harm, no foul either. But, as it sounds like sort --random is far from trivial, sometimes a bird in the hand...?

Thoughts?

Thanks,
Davis

_______________________________________________
Bug-coreutils mailing list
Bug-coreutils@gnu.org
http://lists.gnu.org/mailman/listinfo/bug-coreutils