"spir" <denis.s...@gmail.com> wrote in message news:mailman.2909.1301443345.4748.digitalmars-d-le...@puremagic.com... > On 03/30/2011 01:24 AM, Nick Sabalausky wrote: >> My understanding of hash tables is that they allocate a fixed size array >> and >> map keys to indicies within the range 0..predefined_length_of_the_AA. >> >> So I've been wondering, how many elements do D's built-in AAs have? And >> what's the content of each one, just a single pointer? > > Each element is a data structure, often called bucket (typically a link > list), storing (key:value) pairs for which the key, once hashed and > modulo-ed, maps to the given index. That's why the famous O(1) lookup time > for hash tables is very theoretic: the constant part holds average time > for hashing which is very variable, plus another average time for linearly > traversing the bucket. The latter part depends on table load factor (= > number of elements / number of buckets) and proper statistical repartition > of keys into buckets. > > The key (!) points are finding a good hash func to "linearize" said > repartition (which depends on actual key-data domain, not only on their > type...), but better ones rapidly eat much time (ones used in practice are > rather simple & fast); and finding optimal load factor, and growth scheme. > In practice, all of this tends to make hash tables an implementation > nightmare (for me). I'd love to find practicle alternatives, but efficient > binary trees also are complex and even more depend on kind of keys, I > guess. >
Right, I know that, but that's not what I was asking. Take this hypothetical implementation: struct Bucket(TKey, TVal) { ... } enum hashTableSize = ...; struct Hash(TKey, TVal) { Bucket!(TKey, TVal)[hashTableSize] data; TVal get(TKey key) { ... } void set(TKey key, TVal value) { ... } } I assume that D's AA are something at least vaguely like that. My questions are: 1. What does D use for "hashTableSize"? Or does "hashTableSize" vary? If it varies, what's a typical rough ballpark size? (And just out of curiosity, if it varies, is it decided at compile-time, or does it change even at runtime?) 2. What is "sizeof(Bucket(TKey, TVal))"? And I mean the shallow size, not deep size. Is it dependent on TKey or TVal? Or is it just simply a pointer to the start of a linked list (and therefore "sizeof(size_t)")?