Stupid Hash Tricks

Nutter, Mark Fri, 20 Apr 2001 06:10:07 -0700
Here's some real basic info about hashes...may be useful to newbies...

@ary = ('a', 'b', 'c', 'b', 'a');
%hsh = (); # empty hash

foreach $item (@ary)
{
        $hsh{$item} = 1;
}

@ary = keys %hsh;

What does @ary contain now?

You can think of a hash as being like an array that is indexed by strings
instead of by numbers.  A useful side effect of this approach is that the
keys of the hash function like a set -- each key appears once and only once.
So, in the code above, we have an array which may contain duplicate entries
(in this case, 'a' and 'b' are duplicated).  We also have an empty hash.
In the foreach loop, we add each array item as a new item in the hash, with
a value of 1.  Following each item individually:

$item  %hsh              #comment
a     (a=>1)
b     (a=>1, b=>1)
c     (a=>1, b=>1, c=>1)
b     (a=>1, b=>1, c=>1) # sets the 'b' item to one again
a     (a=>1, b=>1, c=>1) # sets the 'a' item to one again
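
If you'd like to watch that trace happen, something like the following should
do it.  This is only a sketch: show_hash is a made-up helper name, and the keys
are sorted purely so the printout is stable, since Perl hands back hash keys in
no particular order.

```perl
# show_hash is a hypothetical helper, not part of the original code.
# It formats a hash as (a=>1, b=>1, ...) with the keys sorted for display.
sub show_hash
{
        my %h = @_;
        return '(' . join(', ', map { $_ . '=>' . $h{$_} } sort keys %h) . ')';
}

@ary = ('a', 'b', 'c', 'b', 'a');
%hsh = (); # empty hash

foreach $item (@ary)
{
        $hsh{$item} = 1;
        # print the current item and the hash as it stands after this step
        print "$item     ", show_hash(%hsh), "\n";
}
```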

Every time you use the same key, you are accessing the same hash item--no
matter how many times 'a' appears in the original array, the hash will have
only one entry for 'a'.  Thus, by using the code above to put array items
into a hash, we are eliminating all duplicates from the original array.

   bash$ perl -e '
   > @ary = qw(a b c b a); # qw() avoids nesting single quotes inside the shell-quoted script
   > %hsh = (); # empty hash
   >
   > foreach $item (@ary)
   > {
   > $hsh{$item} = 1;
   > }
   >
   > @ary = keys %hsh;
   >
   > print join("\n", sort @ary), "\n";'
   a
   b
   c
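
One thing to watch: keys returns the keys in no guaranteed order, so the
de-duplicated list won't necessarily come back in the order the items first
appeared.  If the original order matters, a common idiom is to use the hash
only to remember what has already been seen.  A minimal sketch (the names
%seen and @uniq are just illustrative):

```perl
@ary = ('a', 'b', 'c', 'b', 'a');
%seen = (); # remembers which items we have already met
@uniq = (); # items in order of first appearance

foreach $item (@ary)
{
        # $seen{$item}++ returns the old count, which is 0 (false) the
        # first time, so each item is pushed only on its first appearance
        push @uniq, $item unless $seen{$item}++;
}

print join("\n", @uniq), "\n";
```

Here the hash is doing the same "each key only once" job as before, but the
array, not the hash, decides the order.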

Now, here's an exercise for the beginner.  What does the "mystery_sub"
routine do in the following code?

# Given:  @a is an array which contains no duplicate entries
# Given:  @b is an array which contains no duplicate entries

sub mystery_sub
{
        # note: in a real program we'd pass @a and @b in as arguments
        # to the function, but we're keeping this simple so we'll
        # just use @a and @b as global variables

        %h = (); # start from an empty hash so repeated calls don't pile up counts
        foreach $i (@a, @b) # did you know you can combine arrays like this? :)
        {
          $h{$i}++;
        }
        @result = ();
        foreach $k (keys %h)
        {
          $h{$k}--;
          push @result, $k if $h{$k};
        }
        return @result;
}
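
The comment in the sub mentions passing @a and @b in as arguments.  Perl
flattens arrays in a plain argument list into one long list, so the usual
approach is to pass references instead.  Here's a sketch of that variant,
under a made-up name (mystery_sub_args) and with made-up sample data; the
logic is unchanged, only the plumbing is different:

```perl
# mystery_sub_args is a hypothetical name for this variant.
# It takes two array references instead of reading the globals @a and @b.
sub mystery_sub_args
{
        my ($aref, $bref) = @_;
        my %h;          # lexical, so it starts out empty on every call
        my @result;

        foreach my $i (@$aref, @$bref)
        {
                $h{$i}++;
        }
        foreach my $k (keys %h)
        {
                $h{$k}--;
                push @result, $k if $h{$k};
        }
        return @result;
}

# example call; the sample data is invented for illustration
my @a = ('x', 'y', 'z');
my @b = ('y', 'z', 'w');
my @r = mystery_sub_args(\@a, \@b);
print join(' ', sort @r), "\n";
```

Try it with a few different inputs before deciding what the routine does.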