On 03/22/2010 02:50 PM, Daniel Keep wrote:

How about just *fixing* the hashtable so it doesn't generate false
pointers in the first place?  Maybe I'm just strange, but that seems
like a more direct, efficacious solution...

This was of course the first thing Walter and I discussed. It turns out that that would necessitate precise GC, which we don't have yet.

As for the redundancy, I was under the impression that the hash was
cached so make resizing more efficient: if the number of buckets
changes, you have to re-insert all the nodes.  If you don't store the
hash, you have to recompute it (which is expensive for anything other
than ints (wherein D laughably 'hashes' by returning the ints value)).

What would be a better hashing scheme for ints?

Given that I already think this is a silly way to go about solving the
false pointer issue, I don't see any benefit to doing this other than to
give the CPU something to do while it waits for memory reads.

Sadly, I lack the background to make any sort of objective judgement of
the hash function *itself*, so I'll just reiterate what I've heard
repeated to me over and over again by numerous people: D's builtin hash
function sucks like a black holes.

Then you might be glad to hear that Walter has recently improved the default hashtable implementation significantly (at least for heavy-duty applications).

We're using Paul Hsieh's hash function for strings and general memory, which is no slouch.

http://www.dsource.org/projects/druntime/browser/trunk/src/rt/util/hash.d

Better suggestions are always welcome. For integrals I'm unclear on what we could use to make things better. (Clearly we could and should get rid of the extraneous field.)


Andrei

Reply via email to