rpjday wrote:
>
> actually, if you examine the algorithm, the probability that line n
> will be chosen as the "new" random line as you read through the file
> is precisely 1/n, so all lines are equally likely.
I don't think that is what is happening here; I don't think that all
lines are equally likely. To me (and maybe I am being obtuse here; it
has happened before), only the probability that the last one will be
overwritten is 1/n, and n changes for each line read. In fact, it may
approach equality (I don't really know the math). Now that I look at
it, it would seem to be weighted in the opposite direction from what I
originally thought. It would be very hard to get the first line, since
each line after it has a 1/n chance of overwriting it, and once
overwritten it can never be gotten back. That means that even after the
second iteration there is a 0.5 probability that it would still be
there, and after the 3rd a 0.5 * 0.333 = 0.1665 probability that it
would still be there ... I will grant that the last one has a 1/n
chance, though.
It has been a really long time since my one stats class, but something
just does not look quite right to me:
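
    #!/usr/bin/perl

    srand;                      # seed the random number generator
    my $line;
    while (<>)
    {
        # $. is the current input line number, so rand($.) < 1
        # is true with probability 1/$.
        if (rand($.) < 1)
        {
            $line = $_;         # replace the current choice
        }
    }
    print $line;
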
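One way to settle this without trusting anybody's arithmetic would be to run the same selection rule many times over a small input and count how often each line wins. Here is a rough sketch; the `pick_line` helper, the five-line input and the trial count are my own choices for illustration, not part of the original script. If all lines are equally likely, each fraction should come out near 1/5; if the first line really is hard to get, it should show up far less often than the last.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Pick one element using the same rule as the script above:
# element number $n replaces the current choice when rand($n) < 1.
sub pick_line {
    my @lines = @_;
    my $choice;
    my $n = 0;
    for my $l (@lines) {
        $n++;
        $choice = $l if rand($n) < 1;
    }
    return $choice;
}

my @lines  = map { "line $_" } 1 .. 5;   # arbitrary small input
my $trials = 100_000;                    # arbitrary trial count

# Tally how often each line ends up as the final choice.
my %count;
$count{ pick_line(@lines) }++ for 1 .. $trials;

# Print the raw count and the observed fraction for each line.
for my $l (@lines) {
    my $c = $count{$l} // 0;
    printf "%-8s %6d  (%.3f)\n", $l, $c, $c / $trials;
}
```

The counts will vary a little from run to run, so it is the rough shape of the distribution that matters, not the exact numbers.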
Thanks for your patience.
Bret