In article <[EMAIL PROTECTED]>,
Paul Crowley  <[EMAIL PROTECTED]> wrote:
> [EMAIL PROTECTED] (Ian Goldberg) writes:
> > The expected number of collisions you get if you sample S items out of
> > a universe of size U (=2^N in the above case) is about (S^2)/U.
> 
> I know this is a month old but I'm only now catching up on the
> newsgroup.  I'd be surprised if the expected number isn't about
> (S^2)/2U where S is large but much smaller than U.  Where S is small
> you have to use S(S-1)/2U.  The source of error in this reckoning is
> that you treat every pair of members of S as independent, but it's
> very small when U is much bigger than S.
> 
> If the expected number of collisions is x, the probability of *no*
> collisions is I think roughly e^-x.

Looks right to me.
(I suspect Ian meant to absorb this small error in the word `about'.)

> If I keep generating new members of S from U until I get a collision,
> does anyone know the expected size of S when I succeed?

Yes.  \sqrt{\pi U / 2}.  In general, finding k collisions is expected
to take \sqrt{k \pi U / 2} trials.  The bible on this topic is
P. Flajolet and A.M. Odlyzko, ``Random mapping statistics'', EUROCRYPT '89.

Reply via email to