Re: OrderedTable is not an ordered table

lightness1024 Wed, 21 Mar 2018 12:30:05 +0100

@cblake - ok perfect. also naming things search-aware is important so I concur.

The python's OrderedDict: I didn't know that, it's a good thing nim is not
alone then. And in fact the presence of this is enough to not go and rename it.
(less disturbance)

In C++ the set/map is usually implemented with a red black tree. It fits a
category which the language lawyers call a "node base container". Sorts are
log(N) and happens at each operation that must trigger a rebalancing of the
tree. (insert/remove)

The undered_set/unordered_map stuff are hash tables, implemented with an
index(hashes)-array of pointers to linked lists. This allows for stable memory
positioning of the values during the needs for rehashings. (decoupling index
from values) Which allows operations to be "non iterator invalidating".

The hashes are not bound to the table's size, which means that they pass
through a modulo operator before being used as indices. The statistics say that
modulo is unfair in most cases, but the most fair for prime numbers. Which
means that the "buckets" (hash table reservation size) are maintained and
resized by jumps to the next reasonable prime number.

A problem of this scheme is that memory gets fragmented horribly, requiring
allocators for serious work. I have written an open address hash map (here
[https://sourceforge.net/projects/cgenericopenaddresshashmap/](https://sourceforge.net/projects/cgenericopenaddresshashmap/))
that allows to keep everything in a flat buffer. This has always better
performance from small types.

The nim implementation appears to require powers of two for the table size,
which means that the modulo must be done by discarding the most significant
bits of the hashes, which loses entropy compared to the prime numbers solution.

Re: OrderedTable is not an ordered table

Reply via email to