RE: Stupid Hash Tricks

Nutter, Mark Fri, 20 Apr 2001 10:22:41 -0700
> Ok I know what it does ('cause I ran it, see below) but I 
> still don't fully
> understand how.  

Well, it's a trick, based on the "givens" that neither array contained any
duplicates.  If each item appears at most once per array, then all we need
to do is count the number of times each item appears:

  $h{$i}++;  # keep a running count of how many times
             # we've seen this item

Any item that appears in only one array will have $h{$i} equal to 1, and any
item that appears in both will have $h{$i} equal to 2.  The next trick
relies on the fact that in Perl "0" is the same as "false" and "1" (or any
other non-zero value) equates to "true."  If each item in %h has a value of
either 1 or 2, and we subtract one from each value, we get results of 0
(false) or 1 (true).  So in the second foreach loop, we do the subtraction,
then use Perl's logical shorthand to test whether or not to keep a
particular item:

foreach $k (keys %h)
{
        $h{$k}--; # turn 1 -> 0 (i.e. "false"), and
                # turn 2 -> 1 (i.e. "true")
        push @result, $k      # keep current key IF
           if $h{$k};
}

The push adds $k to the @result array, but the "if" qualifier makes sure
this only happens if $h{$k} is non-zero, which will only happen for items
that were in both arrays.  It could be written equally well as

        if ($h{$k})
        {
                push @result, $k;
        }

but the shorthand method is a common (and handy) Perl idiom.
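
To see both loops working together, here is a minimal sketch.  The name
common_items and the sample lists are made up for illustration (this is a
stand-in, not the original mystery_sub), and it still assumes that neither
list contains duplicates:

```perl
use strict;
use warnings;

# Hypothetical stand-in for the sub discussed in this thread.
# Assumes no item is repeated within either input list.
sub common_items {
    my %h;
    $h{$_}++ foreach @_;          # count how often each item was seen

    my @result;
    foreach my $k (keys %h) {
        $h{$k}--;                 # 1 -> 0 ("false"), 2 -> 1 ("true")
        push @result, $k if $h{$k};
    }

    my @sorted = sort @result;    # sorted only so the output is predictable
    return @sorted;
}

my @first  = qw(a b c d);
my @second = qw(c d e f);
my @common = common_items(@first, @second);
print "@common\n";                # prints "c d"
```

If an item could show up twice in the same list, its count would reach 2
without it being in both lists, so the trick would give a wrong answer.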

Of course, if you change just one word, you can turn mystery_sub into a
function that returns all items that appear in @a OR in @b, but not in
both...

> Also can you give a little insight into 
> passing arrays to
> subroutines/functions.  I can pass them alright but have 
> problems accessing
> them.  I use $_[0] but it doesn't seem to work for arrays. 

The catch here is that Perl does not support arrays of arrays, at least not
without using references.  Here's an example:

@a = ('a', 'b', 'c');
@b = (1, 2, 3);
@c = (@a, @b);

How many items does @c contain?  Here's the trick:  @c does *not* contain 2
arrays.  @c contains six items, 'a', 'b','c',1, 2, 3.  When you combine 2
arrays, the result is a single array that contains all the items that were
in the original 2 arrays.
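
If you want to check this for yourself, a quick sketch along these lines
will do it (the "my" declarations are only there to keep "use strict"
happy):

```perl
use strict;
use warnings;

my @a = ('a', 'b', 'c');
my @b = (1, 2, 3);
my @c = (@a, @b);

print scalar(@c), "\n";   # prints 6: six items, not two arrays
print $c[3], "\n";        # prints 1: the first item that came from @b
```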

To get a little more specific, you need to understand the difference between
an array, and a list.  They're not quite the same.  An array is a variable
that holds a list, and a list is the "value" of an array.  A list plus a
list equals a big list -- the second list is just tacked on to the end of
the first.  Consider the statement @c = (@a, @b);  From the compiler's point
of view, it looks something like this:

<array> = ( <array>, <array> )

The <array> on the left is a place to put a list.  To find out what list to
put there, the compiler has to evaluate (i.e. compute the value of) the
right side.

( <array>, <array> ) looks like a list of arrays, but you can't have a list
of arrays because an array is just a container.  So the compiler has to get
the *value* of each array, and the value of an array is a list.  Thus, we
can reduce our original statement to:

<array> = ( <list>, <list> )

A list, in turn, can be evaluated as a sequence of items, i.e. <item>,
<item>, <item>...  This turns our expression into

<array> = ( <item>, <item>, <item>, <item>, <item>... )

Can you tell which items came from the first list and which came from the
second?  Neither can Perl!  There is no way--at this point it's all one big
list.  So what happens when you try to pass arguments to a subroutine?  As
you know, whenever you call a subroutine, Perl assigns all the arguments to
the @_ array.  In effect, Perl does an implicit statement like this:

@_ = (@arg1, @arg2, ...)

But as we've just seen, this destroys any distinction that may have existed
between @arg1 and @arg2.  All the array items get smooshed together into one
big array.  You can have all your items, but unless you happen to know for
sure exactly how many items are in each array, you won't be able to
reconstruct the original arrays from inside your subroutine.
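
Here is a small sketch of what the subroutine actually sees.  The name
describe_args and the sample arrays are invented for this example:

```perl
use strict;
use warnings;

# Hypothetical helper: reports how many arguments arrived
# and what the first one is.
sub describe_args {
    return (scalar(@_), $_[0]);
}

my @letters = qw(a b c);
my @numbers = (1, 2, 3, 4);

my ($count, $first) = describe_args(@letters, @numbers);
print "$count\n";   # prints 7: all the items in one flat list
print "$first\n";   # prints a: $_[0] is one item, not the whole array
```

This is why $_[0] doesn't seem to work for arrays: it only ever holds the
first item of the flattened list.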

The solution to the problem is to use references, which is a topic in and of
itself, but here's a quick-n-dirty preview:

@a = qw(a b c);
@b = (1, 2, 3);

any_sub(\@a, \@b); # *references* to arrays!

sub any_sub
{
  my @arg1 = @{$_[0]}; # de-reference first ref
  my @arg2 = @{$_[1]}; # ditto
  foreach my $i (0 .. $#arg1) # every index of @arg1
  {
    print $arg1[$i], ", ", $arg2[$i], "\n";
  }
}

The backslash in front of the @ tells Perl to make a reference, or pointer,
to the array.  A reference is a single item, or "scalar" in Perl lingo, so
you can pass as many array references as you like without Perl jumping in
and mucking about with the contents of your arrays.  To turn an array
reference back into an array, surround it with @{}:

@a = qw( a b c );
@b = @{\@a};  # a silly thing to do, but it shows
              # how to turn an array reference
              # back into an array.
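
As a rough sketch of why this matters, here is a subroutine that gets two
arrays of different lengths and can still tell them apart.  The name
array_sizes and the sample data are made up for the example:

```perl
use strict;
use warnings;

# Hypothetical helper: takes two array references and
# returns the size of each array.
sub array_sizes {
    my ($first_ref, $second_ref) = @_;   # two scalars, one per array
    my @first  = @{$first_ref};          # de-reference into a copy
    my @second = @{$second_ref};
    return (scalar(@first), scalar(@second));
}

my @letters = qw(a b c);
my @numbers = (1, 2, 3, 4);

my ($n_letters, $n_numbers) = array_sizes(\@letters, \@numbers);
print "$n_letters letters, $n_numbers numbers\n";   # prints "3 letters, 4 numbers"
```

Because @_ holds just two references here, the boundary between the two
arrays survives the trip into the subroutine.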