Re: Hashes, Stringification, Hashing and Strings

Mike Lambert Tue, 16 Apr 2002 10:40:29 -0700

Speaking of which, how do we ensure the immutability of keys being put
into the hash? I think Perl copied the string, so that:

$b = "aa";
$a{$b} = 1;
chop $b;
print $a{"aa"};

still works.
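
To make the copy-on-insert behaviour concrete, here is a minimal sketch in Python rather than Perl. It is an illustration only, not how Perl implements it. It assumes a bytearray as a stand-in for a mutable Perl string, with bytes(key) taking the snapshot that the table keeps:

```python
# Sketch only: Python stands in for Perl here, and bytearray plays the
# role of a mutable Perl string.
table = {}
key = bytearray(b"aa")

# bytes(key) takes an immutable snapshot, so the table owns its own copy.
table[bytes(key)] = 1

# Rough equivalent of `chop $b`: drop the last byte of the original.
key.pop()

print(key)           # the caller's buffer is now one byte shorter
print(table[b"aa"])  # the stored key is unaffected by the chop
```

The point is just that the table never shares storage with the caller's variable, so later changes to the variable cannot invalidate the entry.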

If we start storing full thingies into the keys of a hash, we either need
to make deep copies of these, or copy enough to ensure the hashing
function has all that it needs.

Say we do:
$b = new Cat();
$a{$b} = 1;
$b->somefunctionthatchangesthehashvalue();
Now $a{$b} doesn't find anything, since $b was hashed under its old identity.
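
The same failure can be reproduced in a language that already allows objects as keys. The following is a rough Python sketch, not Perl 6 semantics. The Cat class, its lives field and lose_a_life() are hypothetical names, and the hash is deliberately tied to mutable state:

```python
class Cat:
    """Hypothetical class whose hash depends on mutable state."""

    def __init__(self, lives):
        self.lives = lives

    def __eq__(self, other):
        return isinstance(other, Cat) and self.lives == other.lives

    def __hash__(self):
        # The hash follows the current value of a mutable field.
        return hash(self.lives)

    def lose_a_life(self):
        self.lives -= 1


table = {}
cat = Cat(9)
table[cat] = 1

# Change the state the hash was computed from, after insertion.
cat.lose_a_life()

# The entry is still stored, but neither the mutated object nor a fresh
# object carrying the old value can be used to look it up.
found_by_same_object = cat in table
found_by_old_value = Cat(9) in table
print(found_by_same_object, found_by_old_value, len(table))
```

The entry stays in the table, filed under a hash that no key can produce any more, which is exactly the "old identity" problem.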

Also, Perl allowed this before:
$b = "aa";
$a{$b} = 1;
print $a{"aa"};

Will it allow this?
$b = new Cat();
%a{$b} = 1;
print %a{new Cat()};

If it does, how does it determine equality of two objects? What if Cat
includes a counter that determines which number it is in some total
ordering of cats. Will the above still work?

Java had all sorts of problems because its Collections API forces the
user to keep an object immutable while it is used as a key. It's a pain
in the ass, and the source of stupid bugs, where your object changes its
hash value without being rehashed. How will Perl handle this?

If it automatically rehashed objects in whatever tables they are stored in
(forget the speed hit for now), then we still have the problem of:

$a = "a";
%c{$a} = 1;
$b = "aa";
%c{$b} = 2;
chop $b;
print %c{"a"}; #what gets printed?

According to perl 5 semantics, this would print 1. But according to the
rehashing rule above, this is ambiguous: after the chop, both $a and $b
rehash to the key "a", and nothing says which value should win. So
that's not an option.
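
A small Python sketch of that ambiguity, under stated assumptions: rehash() is a hypothetical helper that models "automatic rehashing" by rebuilding the table from the live key objects, and bytearray again stands in for a mutable Perl string. It is not a proposal for how Perl would do it.

```python
def rehash(entries):
    """Rebuild a table from (mutable key, value) pairs; later pairs win."""
    return {bytes(key): value for key, value in entries}


a = bytearray(b"a")
b = bytearray(b"aa")
entries = [(a, 1), (b, 2)]

b.pop()  # rough equivalent of `chop $b`; both keys now read "a"

# The surviving value depends only on the order the entries are visited.
forward = rehash(entries)
backward = rehash(reversed(entries))
print(forward[b"a"], backward[b"a"])
```

Two entries collapse into one, and which value survives is decided by traversal order rather than by anything the programmer wrote.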

I personally liked the stringification of keys. It made things a LOT
simpler. :)

Finally, one last option is to hash based on some memory address, so that
when we store objects in there, we can be sure that no two of them share
a key, and we need not worry about the hash value changing.
Assuming we work around our copying GC in some way to get a unique object
identity value, we still have the problem of two equivalent strings
hashing to different places, or two equivalent arrays hashing to different
values. I can make a case for the latter being a good thing, but not the
former, as it breaks perl5 semantics.

Two options I see are:
a) stringify everything, and make life simpler. I never really had a
problem with stringified keys...
b) strings get hashed like perl5. Everything else gets hashed based
on some unique object identifier (memory address) that's constant
throughout the life of the object. This means that for two equal objects
constructed at different points in time to find the same hashtable
entry, their class would need to overload the operator:hash() function
to return a string.
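
For what it's worth, option b) is roughly the split Python makes, so it can be sketched there. This is an analogy only: Cat and NamedCat are hypothetical classes, and Python's __hash__/__eq__ pair is standing in for the proposed operator:hash().

```python
class Cat:
    """Hypothetical class with no hash override: hashed by identity."""

    def __init__(self, name):
        self.name = name


class NamedCat(Cat):
    """Hypothetical class that opts in to value-based hashing."""

    def __eq__(self, other):
        return isinstance(other, NamedCat) and self.name == other.name

    def __hash__(self):
        # Plays the role of an overloaded operator:hash() returning a string.
        return hash(self.name)


by_identity = {Cat("Tom"): 1}
by_value = {NamedCat("Tom"): 1}

# A second, separately constructed object only matches if the class
# supplied its own hash and equality.
identity_hit = Cat("Tom") in by_identity
value_hit = NamedCat("Tom") in by_value

# Strings always hash by content, however they were built.
string_hit = "aa" in {"a" + "a": 1}
print(identity_hit, value_hit, string_hit)
```

The default is safe against mutation but treats every object as distinct, and value equality becomes something a class has to ask for explicitly.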

Or has this matter been thought through already?

Thanks,
Mike Lambert

Aaron Sherman wrote:

> Date: 16 Apr 2002 12:13:55 -0400
> From: Aaron Sherman <[EMAIL PROTECTED]>
> To: Perl6 Language List <[EMAIL PROTECTED]>
> Subject: Hashes, Stringification, Hashing and Strings
>
> In this example:
>
>       %hash = ($a=>$b);
>
> $a can be anything. In fact, since Perl6 promises to retain the original
> value of $a, we're rather encouraged to store complex data there. But,
> this poses a problem. The key to use for hashing might not ideally be
> the string representation.
>
> For example, if I'm hashing a collection of special file objects, it
> might be useful in one context to do:
>
>       method operator:qq () {
>               return read_whole_file();
>       }
>
> But, if I have 100 of these objects, and I try to put them in a hash,
> life could get ugly fast!
>
> More helpful would be to have this builtin:
>
>       method operator:hash () {
>               return operator:qq();
>       }
>
> and then allow me to override it in my class like so:
>
>       method operator:hash () {
>               return filename();
>       }
>
> This allows me to specify separate hashing and stringification methods,
> but retains Perl's original default of combining the two.
>
>
>
