> On May 10, 2014, at 11:38 PM, Robbert van Dalen <[email protected]> 
> wrote:
> 
> 
>> On 11 May 2014, at 02:00, Mark van Gulik <[email protected]> wrote:
>> 
>> For simplicity, consider tuples and variables.  A variable containing a 
>> tuple is still the same variable if you change what tuple it contains.  So 
>> equality for variables is based on the variables' identity.
> 
> how is the hash of a variable determined?
> does it only depend on the variables’ identity (aka, its memory location), or 
> is it also dependent on its content?

A variable's hash value is permanent, and can be considered to have been 
generated randomly when the variable was created.  However, most variables 
never get put into a hashed structure, so we actually postpone generating the 
hash until someone requests it.  We also generate (and save into a dedicated 
slot of the variable) the hash for the relatively infrequent case that a 
variable becomes shared.  If we didn't do this we might have two fibers (bound 
to two distinct threads) attempting to fetch the hash value simultaneously.  
Each would see that the hash slot was not yet occupied and would generate, 
cache in the slot, and return a different hash value.  It would require using a 
lock in the hash operation to avoid this race, so we avoid the cost by simply 
generating a hash, if one hasn't already been computed, at the last moment 
before the variable becomes shared.

So to directly answer your question, the hash of the variable is random but 
permanent, and is not affected by the variable's current content.


>> So combining these principles, a tuple of variables has equality semantics 
>> that depend on the identities of the variables.  Two tuples of variables are 
>> the same tuple of variables if they refer to the same (identical) variables 
>> in the same positions.  So in one sense the tuple has "identity", in that 
>> its equality depends on identity of its components, but in a stronger sense 
>> a tuple is really only relying on equality of its components (which in this 
>> case was equivalent to identity).
> 
> what happens with the hash of set - that contains a variable that has been 
> mutated?

Viewing the system as variables and immutable objects, and ignoring the mutable 
optimization, the hash of a value *never* changes.  Similarly, the equality or 
inequality of two objects never changes.

However, if we take into account our mutable object trick, a hash value can 
change -- precisely when that can't be detected because an object's storage has 
been recycled because the last reference to it was dropped.  You can think of 
the mutable object mechanism as being a memory management trick.  In fact, any 
garbage collected system already has a level of description at which storage 
for an object is recycled, such that new objects can occupy the storage that 
old ones used to occupy, and apparently breaking basic principles of equality, 
identity, and hashing.  That's not the right level of description, just like 
saying that the electrons flowing through the transistors that make up a 
flip-flop (or alternatively the electron holes migrating backwards) don't 
preserve the identity of the ones and zeros.  The simplifying assumption of a 
macro-state (only two of which exist for the flop-flop) is the same kind of 
abstraction that we use when talking about the semantics actually exposed to 
the Avail programmer.  In essence, the Avail VM *implements* a higher level set 
of semantics, that in which only variables and completely immutable objects can 
be discussed meaningfully.

So if an immutable set contains some variables, it will always contain those 
variables.  And since the variables are the same variables even after having 
different values assigned to them, the set remains unchanged.

An alternative implementation for variables might clarify things further.  Say 
that instead of a slot within the variable for its current value, we instead 
used a giant, global, mutable Java Map<> from each variable to its value.  
Using that mechanism we wouldn't expect that binding a new value to a variable 
via this Map<> would cause a variable to now be a different variable.  And yet 
this implementation should have the same semantics as our actual implementation 
in which a slot is reserved within each variable to hold the value that it's 
bound to.  Maybe the appropriate mental model is that the variable's current 
value slot isn't considered part of the variable, but is just an implementation 
trick that allows us to look up the value that Avail currently has bound to 
that variable, more efficiently than with a global Map<>.

Reply via email to