[EMAIL PROTECTED] (Paul Jarc) writes:

>> Sort of, but not quite.
>
> I couldn't find the "not quite" part of your explanation.

Well, I tried. :-)

>> "sort -rR" should output in the reverse order of "sort -R".
>
> Nit: they shouldn't expect that unless they also specify a seed.

Yes, of course.

> But sort -R can still provide this just by permuting the original
> input order, rather than the correct sort order.

I don't understand this claim. If "sort -R" operates by permuting the
original input order, and then sorts the result, then it will generate
the same output as if it hadn't permuted anything (assuming there are
no ties).

> we do:
> $ sort -R A > B
> $ sort -R --seed=deadbeef A > A1
> $ sort -R --seed=deadbeef A > A2
> $ sort -R --seed=deadbeef B > B1
> $ sort -R --seed=deadbeef B > B2
>
> Then we should expect that A1 and A2 have the same contents, and that
> B1 and B2 have the same contents. But the TODO requirement would also
> ensure that A1/A2 have the same contents as B1/B2.

Yes, assuming no ties.

> Is that really needed?

If it's not needed, then why is this relevant to "sort"? You are
asking for a program that randomly permutes its input. Then let's
design another program to do that, and not get bogged down with how
its features work together with "sort"'s existing zoo of options.

> I'm also not sure that clustering lines with equivalent sort keys is
> desirable.

Again, it depends on whether you want something relevant to the
collating order (i.e., a sort), or you want something that's
completely irrelevant (i.e., a permutation). If the latter, then I
suspect we should be talking about a different tool.

>>>>> This means that two different files, that happen to sort to the
>>>>> same output, should give the same output when randomized with
>>>>> the same SEED. Is that right? [*]
>>>> if you sort a permutation of the same input file
>>>> with the same --random-seed=SEED option twice, you'll get the same
>>>> output. [**]
>> If two files sort to the same output, then they're permutations of
>> each other. So [**] implies [*]. (The converse does not hold. See
>> what I mean about the logic being tricky here?...)
>
> No, I think [*] implies [**] only. [*] is the more general case
> placing a requirement on all permutations of the same input; [**] is
> the special case where the two files are the same permutation of the
> same input.

Ah, OK, I think I see the problem. By [**] I meant that if you sort
two permutations of the same input file, and use the same random seed
for both sorts, you'll get the same output. This is roughly the same
as [*], then. I say "roughly" because it's not clear from either
statement what should be done with ties.

_______________________________________________
Bug-coreutils mailing list
[EMAIL PROTECTED]
http://lists.gnu.org/mailman/listinfo/bug-coreutils
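
The distinction drawn in this message, between a seeded random *sort* and a seeded random *permutation*, can be illustrated with a rough Python sketch. This is not how "sort" implements anything; the helper names `random_sort` and `seeded_shuffle` are hypothetical, and the choice of SHA-256 as the keyed hash is an assumption made only for illustration. The first helper orders lines by a seeded hash of their content, so its output depends only on which lines are present, and identical lines end up adjacent. The second permutes positions, so its output depends on the input order.

```python
import hashlib
import random


def random_sort(lines, seed):
    """Order lines by a seeded hash of each line (hypothetical helper).

    The result depends only on the multiset of lines and the seed,
    not on the order in which the lines arrive.
    """
    def key(line):
        # Assumption: SHA-256 over "seed NUL line" stands in for a seeded hash.
        return hashlib.sha256((seed + "\0" + line).encode("utf-8")).digest()

    return sorted(lines, key=key)


def seeded_shuffle(lines, seed):
    """Permute lines by position with a seeded PRNG (hypothetical helper).

    The result depends on the input order as well as on the seed.
    """
    out = list(lines)
    random.Random(seed).shuffle(out)
    return out


a = ["pear", "apple", "fig", "apple"]
b = ["apple", "fig", "apple", "pear"]  # a permutation of a

# Same seed, permuted inputs: the hash-based sort agrees on both.
print(random_sort(a, "deadbeef") == random_sort(b, "deadbeef"))

# Same seed, same input: the shuffle is reproducible, but nothing
# guarantees that it agrees across different input orders.
print(seeded_shuffle(a, "deadbeef") == seeded_shuffle(a, "deadbeef"))
```

Under these assumptions the hash-based version satisfies [*] for inputs that are permutations of each other, and ties (identical lines) are clustered because they hash to the same key. The shuffle version only guarantees that repeating the run on the same file with the same seed reproduces the output.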