Finn Bock wrote:
I would guess that doing ~6 string compares to navigate the binary tree (with 148 color keywords) is slower than one string hash, ~1.2 int compares and one string compare. But I haven't measured it, so you might be well be right. Many keyword sets for other properties are much smaller and could perhaps benefit from a more suitable collection type.
[J.Pietschmann]
I meant setup effort, although a binary tree will most likely do additional memory management. You are right about the lookup. Just for curiosity, where do you get the 1.2 int comparisions? A perfect hash should not have collisions.
I was comparing a standard HashMap with your binary tree. A perfect hash would likely have a more complicated hash function and of course zero int compares.
It might also be interesting how a trie or ternary tree (as used for hyphenation patterns) would compare to hash maps for keywords (in terms of setup costs, lookup costs and memory). I have doing a study of various Java implementations on my todo list but didn't quite get around to do this.
Very interesting indeed.
regards, finn