RE: Should MY:: be a real symbol table?

Brent Dax Sun, 02 Sep 2001 21:02:00 -0700
# -----Original Message-----
# From: Dan Sugalski [mailto:[EMAIL PROTECTED]]
# Sent: Sunday, September 02, 2001 1:37 PM
# To: Brent Dax
# Cc: Simon Cozens; [EMAIL PROTECTED]
# Subject: RE: Should MY:: be a real symbol table?
#
#
# On Sun, 2 Sep 2001, Brent Dax wrote:
# >
# > Perhaps I wasn't entirely clear.  I'm suggesting array-of-symbol-tables
# > instead of array-of-arrays.
#
# It is right now in perl 5, and that isn't changing. How do you think
# string eval finds lexicals in scope by name now? :)

        "As mentioned briefly above, the xcv_padlist element holds a
        pointer to an AV. This array, the padlist, contains the
        names and values of lexicals in the current code
        block...The first element of the padlist - called the
        'padname' - is an array containing the names of the
        variables, and the other elements are lists of the current
        values of those variables. Why do we have several lists of
        current values? Because a CV may be entered several
        times - for instance, when a subroutine recurses."

--from Simon Cozens's tutorial on Perl 5 internals.

What I'm suggesting is that, instead of the padlist's AV containing
arrays, it should contain stashes, otherwise indistinguishable from the
ones used for global variables.  This way we don't have to recode the
lookup or do incredibly fancy stuff to support %MY::.  It may even speed
up eval STRING or other situations in which lookup is at runtime.  And
we should be able to resolve at least to the HE level at compile-time,
if you're worried about it slowing down runtime.
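For reference, here is a minimal Perl 5 sketch of the two kinds of by-name lookup being compared: a global stash, which is just a hash of globs keyed by symbol name, and eval STRING resolving a lexical at runtime.  The package name Demo and the variable names are made up for illustration; nothing here is proposed Perl 6 syntax.

```perl
use strict;
use warnings;

package Demo;            # hypothetical package, only for this example
our $counter = 42;

package main;

# A global stash is a hash keyed by symbol name; each value is a glob.
my $has_entry = exists $Demo::{counter} ? 1 : 0;

# Pull the scalar slot out of that glob.
my $via_stash = ${ *{ $Demo::{counter} }{SCALAR} };

# A lexical has no stash entry, but string eval can still find it
# by name at runtime (today, via the pad).
my $secret   = 7;
my $name     = 'secret';
my $via_eval = eval "\$$name";

print "stash entry exists: $has_entry\n";
print "via stash: $via_stash, via eval: $via_eval\n";
```

If pads held stashes, the second lookup could presumably go through the same hash machinery as the first.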

However, we might even be able to do this without a padlist!  How?  By
using temp() on the MY:: stash.  Observe:

        sub factorial {
                my($x)=@_;
                if($x<=1) {return 1}
                else {return $x*factorial($x-1)}
        }

This could be translated by the parser to something like this:

        sub factorial {
                temp($MY::x)=@_;
                if($MY::x<=1) {return 1}
                else {return $MY::x*factorial($MY::x-1)}
        }

Assuming that each subroutine is given one MY:: stash, what
functionality has been lost?  $x gets a new SV for each call into
factorial() (via the call to temp()), and since each subroutine has its
own MY:: stash, any routines that might be called by factorial() won't
be able to see factorial()'s MY::.  Finally, this opens up the
possibility of easily-implemented static my() variables--just eliminate
the call to temp().
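The closest thing Perl 5 has to temp() is local(), so the idea can be roughly approximated today.  This is only a sketch under assumptions: the package Factorial::MY is a made-up stand-in for factorial()'s private MY:: stash, and $calls is a made-up "static" that simply skips the local().

```perl
use strict;
use warnings;

package Factorial::MY;   # hypothetical stand-in for factorial()'s MY:: stash

our $x;                  # plays the role of $MY::x
our $calls = 0;          # never local()ized, so it acts like a static

sub factorial {
    local $x = shift;    # like temp($MY::x)=@_: fresh value for this call
    $calls++;            # survives across calls because it is not localized
    return 1 if $x <= 1; # <= so that 0 terminates as well
    return $x * factorial($x - 1);
}

package main;

my $result = Factorial::MY::factorial(5);
print "5! = $result after $Factorial::MY::calls calls\n";
```

Once the outermost call returns, $Factorial::MY::x is back to its previous (undefined) value, while $Factorial::MY::calls keeps its count.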

The basic principle behind this is to ask one simple question:  "how are
temp() variables different from my() variables?"  Ignoring closures and
implementation details (like that my() variables are stored in a pad),
the only difference is that temp() variables stay visible to code called
from the subroutine they were temp()orized in, while my() variables don't.  The
simple way to emulate this is to make sure that no subroutine can see
another's MY:: stash.  Since any subroutine can have my() variables, we
simply have to make sure each sub's MY:: stash blocks out each other
sub's.
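The visibility difference is easy to see in Perl 5, again using local() as the stand-in for temp().  The sub and variable names below are made up for the example.

```perl
use strict;
use warnings;

our $name = 'outer';          # package variable, lives in a stash

sub callee { return $name }   # always reads the package variable

sub caller_temp {
    local $name = 'temp';     # dynamic scope: callee() sees this value
    return callee();
}

sub caller_my {
    my $name = 'my';          # lexical scope: invisible to callee()
    return callee();
}

print caller_temp(), "\n";    # temp
print caller_my(),   "\n";    # outer
print callee(),      "\n";    # outer
```

Giving each sub its own MY:: stash would, in effect, stop callee() from sharing a stash with caller_temp(), which is what makes the temp()ed variable behave like a my() one.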

There is a possible caveat with inner blocks--how does an outer block
get, er, blocked from accessing an inner block's my() variables?
However, I think this isn't really that big a problem, and can easily be
solved with properties:

        sub foo {
                # $x not declared yet, so it's out of scope

                if(@_) {
                        my($x);
                        # now $x is in scope

                        # when we re-enter foo(), $x is not in
                        # scope in there.
                        foo();
                        # but by the time we get back, $x is back
                        # in scope again.
                }

                # $x is out of scope again
        }

becomes

        sub foo {
                temp($MY::x);
                $MY::x is out_of_scope();
                #since $MY::x is marked as out of scope,
                #attempts to access it in this block are
                #unsuccessful...

                if(@_) {
                        temp($MY::x);
                        # but in here, it's temp()ed again
                        # and the out_of_scope property is
                        # not in effect.

                        # when we re-enter foo, $x will be
                        # temp()ed again and the new $x
                        # will be marked as out of scope...
                        foo();
                        # but by the time we get back, we've
                        # got our $x back and we can access
                        # it.
                }

                # the inner temp is out of effect, so once
                # again $x is marked as out of scope.
        }
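The save-and-restore half of this can be tried out in Perl 5 with nested local() calls.  Big assumption: there is no out_of_scope property in Perl 5, so this sketch uses an undefined value as a crude stand-in for "marked out of scope"; @log, describe() and the depth argument are made up to show what each point in the code observes.

```perl
use strict;
use warnings;

our $x;      # stand-in for $MY::x
our @log;    # records what each step observed

# undef stands in for the hypothetical out_of_scope property
sub describe { return defined $x ? $x : 'out of scope' }

sub foo {
    my ($depth) = @_;
    local $x;                                 # outer temp(): unusable value
    push @log, "enter $depth: " . describe();

    if ($depth > 0) {
        local $x = "x at depth $depth";       # inner temp(): usable value
        push @log, "inner $depth: " . describe();
        foo($depth - 1);                      # re-entry hides our $x...
        push @log, "back $depth: " . describe();   # ...and we get it back
    }

    # the inner local() has been undone here
    push @log, "leave $depth: " . describe();
    return;
}

foo(1);
print "$_\n" for @log;
```

A real out_of_scope property would have to refuse access outright rather than just read as undefined, which this sketch does not attempt.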

I don't think the argument that my() variables should be faster than
our() variables is a good one.  I see this as a side effect of the
current implementation, not a feature that I should base my decision to
use my() on.  I use my() variables because they're safer and easier to
use, not because they're faster.

--Brent Dax
[EMAIL PROTECTED]

"...and if the answers are inadequate, the pumpqueen will be overthrown
in a bloody coup by programmers flinging dead Java programs over the
walls with a trebuchet."