> One more note on the issue of hashing functions and collisions. > > The number of "collisions" needs to be split from the number > of "probes". Once you have a collision, you may make multiple probes. > > The collision occurs when the initial spot is busy. > The probes are the lookups at subsequent locations. > > Linear probing suffers from two problems: > Clustering acts to increase collisions > Clustering forces a longer string of probes > > These are two different issues, but both afffect the running time > of your algorithm.
I had overlooked this distinction. When I correct for this, use a table of size 2003, and use linear probing with increment 1, I get about 280 collisions for the quiche.txt
