Re: Fisher-Yates shuffle

abigail Tue, 09 Apr 2002 09:26:07 -0700

On Tue, Apr 09, 2002 at 01:49:01PM +0300, Ilmari Karonen wrote:
> 
> On Fri, 5 Apr 2002 [EMAIL PROTECTED] wrote:
> > 
> >     sub shuffle {
> >         for (my  $i = @_;  $i;) {
> >              my  $j = rand $i --;
> >              @_ [$i => $j] = @_ [$j => $i]
> >         }
> >         @_;
> >     }
> 
> I could live with that.  To really go into the FAQ, though, it'd need an
> explanatory paragraph or two, about how it works and why.  And I'd also
> insist on using a while loop instead of for, even if it costs a line.


I think a while loop isn't right. You have an iterator with an obvious
starting value, which gets decremented with the same amount in each
iteration. If that isn't screaming 'for', you might as well start
deprecating 'for'.

> > Perhaps this is the best of both worlds:
> >     - It takes a list as argument.
> >     - It's in-situ.
> >     - In list context, it returns the shuffled list.
> >     - In scalar (and hence void) context, it doesn't require linear
> >       additional memory.
> 
> It doesn't, unfortunately, allow for the idiom one expects from a
> copy-and-shuffle function, namely
> 
>   my @shuffled = shuffle @original;
> 
> In fact, it's not really copy-and-shuffle at all -- it's an in-place
> shuffle with an optional *trailing* copy operation.  Not very useful,
> really.

It allows for:

    my @shuffled = shuffle 1 .. 10;

which in, my opinion, is very intuitive. If the function doesn't 
return a list in list context, you would have to write that as

    shuffle my @shuffled = 1 .. 10;

which not only is harder to understand, but makes people wonder
why one calls an array '@shuffled' when you assign an ordered
list to it. ;-)

> It does have an advantage over the FAQ solution in that it takes a list
> instead of an arrayref.  I might, in fact, prefer to omit the last
> statement entirely, so that the function does an in-place shuffle of its
> arguments and returns nothing in any context.

It will always return *something* in scalar or list context. You might
as well have it "do the right thing" in list context. Isn't that what's
Perl all about?  ;-)

> Putting these ideas together, I'd suggest something like this:
> 
> =pod
> 
> Alternatively, you may use this function, which shuffles its arguments
> in place.  The algorithm is known as a Fisher-Yates shuffle, and can be
> proven to produce a uniform distribution of permutations, provided that
> the random number generator is sufficiently random.
> 
>     sub shuffle {
>         my $i = @_;                     # length of @_ array
>         while ($i) {
>              my $j = rand $i--;
>              @_[$i, $j] = @_[$j, $i];   # 0 <= int($j) <= $i
>         }
>     }
> 
>     # usage examples:
>     shuffle @array;
>     shuffle $a, $b, $c;
>     shuffle my @shuffled = @original;
> 
> It may be educational to work out the proof of correctness by yourself
> -- to get you started, consider the part of the array below C<$i> as a
> pool from which elements get picked one at a time at random.  Note that
> it's surprisingly easy to introduce subtle bugs into the algorithm, for
> example by replacing C<$i--> with C<--$i>.  Can you see why?


I do think it needs a reference to Knuth [1]. Or to the original publication
of Fisher and Yates [2].

[1] D. E. Knuth: I<The Art of Computer Programming>, Volume 2,
    Third edition. Section 3.4.2, Algorithm P, pp 145. Reading:
    Addison-Wesley, 1997. ISBN: 0-201-89684-2.

[2] R. A. Fisher and F. Yates: I<Statistical Tables>. London, 1938.
    Example 12.



Abigail

Re: Fisher-Yates shuffle

Reply via email to