davidl:
> Something like following can substitute dynamic array:
> struct da(T){
>      T* ptr;
>      int len, capacity;
>      T opIndex(int i){}
> ...

On 32 bit systems that struct is 3*4 bytes long, but the current GC allocates 
small sizes using blocks of memory long as powers of 2, so I think that struct 
uses 4*4 bytes anyway. So the 4th word can be used for something else. A good 
way to use it may be to store there a hash value (by default is 0, that means 
"hash not computed yet").
Hash values are really useful, especially for immutable arrays/strings, because 
for example to tell two strings are equal you can first compare their pointer 
and length, and then their hash values. If the hash values are present, then 
most of the times you can tell if they are different.
Having hash values is very useful for AA keys/set items too, because for 
example to compute the intersection among things that have a hash value, you 
can use the hash value to see if they are different in a very quickly way. So 
your set operations become very fast (this is how Python set operations work).
Hash values for strings can be useful for the GC too. Some people have shown 
that in long-running Java programs many strings are duplicated. So if you have 
immutable strings you can store only one copy of a string, saving lot of 
memory. The GC can perform such compaction using the hash values to speed up 
the tests necessary to see if two strings are different.

Bye,
bearophile

Reply via email to