Thank you very much for your explanation! I made a mistake that I said the hash value should be recalculated each time the dict resize, what I wanted to say was that the position of each item should be recalculated.
Maybe I should take a look at the source code of python dict some day. I'll some here for help then. Thanks again! Tim Peters wrote: > [EMAIL PROTECTED] > > Python dict is a hash table, isn't it? > > Yup. > > > I know that hashtable has the concept of "bucket size" and "min bucket > > count" stuff, > > Some implementations of hash tables do. Python's does not. Python's > uses what's called "open addressing" instead. > > > and they should be configurable so I can set them to the proper value > > when I know how many items I'm going to handle. > > > > If these two values can't be set, the hashtable will give them default > > values. When there are more and more items being added to the > > hashtable, it increase its buckets and copy the old items into the new > > one and re-calculate the hash value of each item. > > That's several distinct claims, each of which is true of some hash > table implementations but not of others. Python's implementation has > no "buckets", so all that's simply irrelevant. Python's > implementation (not all do) saves the hash value of each item, so when > it does need to grow (or shrink) the space allocated to the table, it > does not need to recalculate the hash value of any item. > > You should also note that copying a dict key or value (no matter of > what type) consists in its entirety of copying one machine address (a > 4- or 8-byte pointer, depending on platform). > > > I think this will waste some time doing the increasing-copying thing. > > It does take time to resize. "Waste" is a value judgment ;-) > > > If I know I'm going to handle about 20000 items, I can set the size of > > hashtable to 30000. > > > > So, can I do this in python? > > You can't. Unless you build up to 20000 items over and over (many > thousands of times), it's unlikely you'd be able to measure the time > difference even if you could. -- http://mail.python.org/mailman/listinfo/python-list