On Wed, Aug 17, 2011 at 12:40 PM, Andrew Wiley <wiley.andre...@gmail.com> wrote:
>
> On Tue, Aug 16, 2011 at 7:29 PM, bearophile <bearophileh...@lycos.com> wrote:
>>
>> Walter Bright:
>>
>> > > I think there are search trees, like red-black trees, that guarantee
>> > > an O(n log n) worst case. Am I wrong?
>> >
>> > Just feed it more data.
>>
>> If you feed it more data, even if all items produce collisions because
>> they all hash to the same bucket, using red-black trees to handle the
>> items within a bucket keeps the behaviour at O(n log n), which is
>> usually fast enough. With Python and the new D AAs you instead get
>> quadratic behaviour, which causes trouble well before physical RAM is
>> exhausted.
>
> The problem here is that algorithmic complexity doesn't tell the whole
> story. We can talk about what "should" be faster all day, but unless we
> can show there's really a problem here, it doesn't seem worth worrying
> about.
> And if it was changed in the past because linked lists turned out to be
> faster, I'd guess actual benchmarking was used back then to determine
> which was better.
>
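To make the quadratic behaviour concrete, here is a minimal Python sketch (not D's or CPython's actual AA implementation; `ChainedMap` and `BadKey` are illustrative names) of a chained hash map under an adversarial workload where every key hashes to the same bucket. Each insert then scans the whole chain, so n inserts cost n(n-1)/2 key comparisons in total:

```python
# Sketch of a hash map using separate chaining with plain lists.
# Instrumented to count key comparisons; not a real AA implementation.

class ChainedMap:
    def __init__(self, nbuckets=8):
        self.buckets = [[] for _ in range(nbuckets)]
        self.comparisons = 0  # instrumentation only

    def insert(self, key, value):
        chain = self.buckets[hash(key) % len(self.buckets)]
        for i, (k, _) in enumerate(chain):
            self.comparisons += 1
            if k == key:
                chain[i] = (key, value)  # overwrite existing key
                return
        chain.append((key, value))


class BadKey:
    """Adversarial key type: every instance hashes to the same bucket."""
    def __init__(self, n):
        self.n = n
    def __hash__(self):
        return 0  # force all keys into one chain
    def __eq__(self, other):
        return isinstance(other, BadKey) and self.n == other.n


m = ChainedMap()
n = 200
for i in range(n):
    m.insert(BadKey(i), i)

# Total comparisons = 0 + 1 + ... + (n-1) = n*(n-1)/2, i.e. quadratic.
print(m.comparisons)  # 19900 for n = 200
```

With a balanced tree per bucket instead of a list, the same adversarial workload would cost O(n log n) total, which is bearophile's point.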
There's a lot more to making a hash map robust than changing the lookup
scheme. There's also a large variety of collision-resolution methods, each
with pros and cons depending on the data set and lookup patterns you're
dealing with.

For a built-in AA, I feel simplicity is a much more achievable goal than
trying to solve everybody's specific problems, so the current solution is
probably a good choice. Just my two cents.

Josh.
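For contrast with chaining, here is a minimal sketch of one of the other collision-resolution families Josh alludes to: open addressing with linear probing. This is an illustrative toy (no resizing or deletion, so it assumes the table never fills), not any real AA implementation:

```python
# Sketch of open addressing with linear probing: on collision, walk
# forward to the next free slot instead of chaining. Better cache
# locality than linked lists, but clustering hurts at high load factors.

class ProbingMap:
    def __init__(self, capacity=16):
        # Toy sketch: fixed capacity, no resizing -- must stay under-full.
        self.slots = [None] * capacity  # each slot: (key, value) or None

    def insert(self, key, value):
        i = hash(key) % len(self.slots)
        while self.slots[i] is not None and self.slots[i][0] != key:
            i = (i + 1) % len(self.slots)  # probe the next slot
        self.slots[i] = (key, value)

    def lookup(self, key):
        i = hash(key) % len(self.slots)
        while self.slots[i] is not None:
            if self.slots[i][0] == key:
                return self.slots[i][1]
            i = (i + 1) % len(self.slots)
        raise KeyError(key)


m = ProbingMap()
m.insert("a", 1)
m.insert("b", 2)
print(m.lookup("a"))  # 1
print(m.lookup("b"))  # 2
```

Which scheme wins depends on load factor, key distribution, and cache behaviour, which is exactly why benchmarking on realistic data, as Andrew suggests, matters more than asymptotic arguments alone.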