On 03/23/2010 08:11 AM, Michael Rynn wrote:
On Mon, 22 Mar 2010 16:59:36 -0500, Andrei Alexandrescu wrote:

On 03/22/2010 04:03 PM, Andrei Alexandrescu wrote:
On 03/22/2010 03:36 PM, Walter Bright wrote:
Andrei Alexandrescu wrote:
Better suggestions are always welcome. For integrals I'm unclear on
what we could use to make things better. (Clearly we could and should
get rid of the extraneous field.)

Unfortunately, it won't be much of a win. Memory allocation is done in
buckets of size 16, 32, 64, etc. Reducing the node size for a
uint[uint] from 16 to 12 saves no memory at all.

As we discussed, if nodes are allocated in bunches, you could store 5
nodes in a 64-byte block instead of 4. That's more than a 25% increase
in effectiveness because the per-block bookkeeping is also slashed by
5.

Andrei

One more idea before I forget: the bunches approach requires using a
freelist for nodes that are available but not used. (Freelists are a
good idea whether or not bunches are used.)

One good possibility is to store the head of the freelist in a static
variable, such that e.g. all int[string] instances use the same
freelist. And there's no thread contention issue because each thread has
its own freelist.


Andrei

I just committed a aaA.d version that uses some heap node memory
management, although its on a per class instance basis. Also it dispences
with a separate hash storage for keys<= size_t.  Sharing would mean some
working out of which classes share the same sized node blocks.
Much easier to implement class sharing using templates.

See the code here : http://www.dsource.org/projects/aa/browser/trunk/
druntime/aaA.d

See the benchmarks and comments here : http://www.dsource.org/projects/aa/
wiki/DrunTimeAA.

The result squeezes some extra performance for integer or less sized keys
(about 20% faster for lookups).


For the druntime, the generic C interface constrains the kinds of
specialized AA versions that can be instantiated using run-time TypeInfo
and var-arged calls. Maybe a class / interface direct calls would work
better. Just imagine at runtime looking at the TypeInfo_AssociatedArray
and trying to work out exactly which template is going to be instantiated.

And having a low code size, one or few versions fit  all approach, that
performance sacrifice is unavoidable in a basic runtime libary.

But in template land , options and combinations and gains abound.

Absolutely!

This is solid work. It would be interesting to assess how your implementation and the old built-in hashes compare with Walter's new implementation, which uses a singly-linked list instead of a tree. Do you have what it takes to check out and build druntime and phobos?


Andrei

Reply via email to