Lexical variables, scratchpads, closures, ...

Jerome Vouillon Wed, 31 Jul 2002 09:04:30 -0700


Let us think a bit about the implementation of lexical variables.


Assignment

  First, let us consider how to compile a variable assignment such
  as:
     $x = $y
  where both $x and $y are lexical variables.  At first, one may think
  that this can be compiled simply into a register assignment:
     P0 = P1
  where P0 and P1 are the PMC registers standing respectively for $x
  and $y.

  But if we look at the example below, we see that this will not work
  in general.
      sub foo {
        my $x = 13;
        my $y = 17;
        return (sub { print "$x\n"; },
                sub { $x = $y; });
      }
      ($print, $assign) = foo();
      &$assign();
      &$print();
  Indeed, the returned subroutines will not have access to the registers
  nor to the stack frame of the subroutine "foo".  This implies that
  variables must be allocated in the heap.

  It turns out that PMCs provide the right operations to implement
  variables:  we can compile
       my $x = 13;
       my $y = 17;
       $x = $y
  into
       new P0, .PerlInt          # Create a new integer variable ($x)
       set P0, 13                # Set its value to 13
       new P1, .PerlInt          # Create a new integer variable ($y)
       set P1, 17                # Set its value to 17
       set_pmc P0, P1            # Assign the value of $y to $x
  ("set" invokes the "set_integer_native" method;
   "set_pmc" is not implemented yet, but is mentioned in PDD02.)

  So, we should really view a PMC not as a Perl value, but
  rather as a Perl variable.
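
  To make this concrete, here is a minimal sketch in Python rather
  than Parrot assembly (the assembly above depends on set_pmc, which
  is not implemented yet).  The class PerlIntVar is a hypothetical
  stand-in for a PerlInt PMC; its method names follow the operations
  mentioned above, but this only models the behaviour and is not how
  Parrot implements it.  The point is that assignment copies a value
  into an existing heap-allocated container, so both closures keep
  seeing the same variable after foo has returned.

```python
class PerlIntVar:
    """Hypothetical stand-in for a PerlInt PMC: a heap-allocated container."""

    def __init__(self):
        self.value = 0

    def set_integer_native(self, n):
        # Models "set P0, 13": store a native integer in the container.
        self.value = n

    def set_pmc(self, other):
        # Models "set_pmc P0, P1": copy the value, keep the container.
        self.value = other.value


def foo():
    x = PerlIntVar()
    x.set_integer_native(13)
    y = PerlIntVar()
    y.set_integer_native(17)
    # Both closures capture the same containers, which outlive foo's frame.
    return (lambda: x.value), (lambda: x.set_pmc(y))


read_x, assign = foo()
before = read_x()   # value of $x before the assignment closure runs
assign()            # runs "$x = $y"
after = read_x()    # value of $x seen by the other closure
print(before, after)
```

  Had the closures captured plain values instead of containers, the
  assignment made by the second closure would not be visible to the
  first one.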

:= operator

  Actually, things are a bit more complicated if we take the :=
  operator into account.  There seem to be two ways to implement it:
  - use an "alias" PMC which forwards all method calls to another PMC:
    the operation "$x := $y" would then turn $x into an "alias" PMC
    pointing to $y;
  - implement a variable not as a PMC, but as a heap-allocated pointer
    to a PMC.

  The first possibility is probably more efficient because we avoid
  some memory allocations and indirections, but it is more complex to
  implement.  So I will only consider the second one.
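
  The second possibility can be sketched as follows, in Python and
  under assumptions of my own: PMC, Variable, assign and bind are
  hypothetical names used only for illustration.  A variable is a
  heap-allocated slot holding a pointer to a PMC; "=" copies a value
  into the PMC the slot points to, while ":=" makes the slot point to
  another PMC.

```python
class PMC:
    """Hypothetical value container."""

    def __init__(self, value):
        self.value = value


class Variable:
    """Hypothetical heap-allocated slot holding a pointer to a PMC."""

    def __init__(self, pmc):
        self.pmc = pmc


def assign(dst, src):
    # "$dst = $src": copy the value into the PMC that dst points to.
    dst.pmc.value = src.pmc.value


def bind(dst, src):
    # "$dst := $src": make dst point to the very same PMC as src.
    dst.pmc = src.pmc


x = Variable(PMC(13))
y = Variable(PMC(17))
z = Variable(PMC(0))

bind(x, y)               # $x := $y
assign(z, y)             # $z = $y
y.pmc.value = 42         # later change made through $y

aliased = x.pmc.value    # $x follows $y, since they share one PMC
copied = z.pmc.value     # $z keeps the value copied earlier
print(aliased, copied)
```

  The cost of this approach is the extra indirection on every access,
  which is the trade-off mentioned above.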

Scratchpads

  We need to allocate an area in the heap for each lexical variable.
  Instead of allocating this area one variable at a time, we can
  allocate a single "scratchpad" value for all variables of a block:
  this is more efficient.

  The compiler can keep track of the index of the variables in the
  scratchpad.  So, the scratchpad can be implemented as an array of
  PMCs. (We will probably need some faster opcodes to access the
  array: there is no need to perform a method call, nor to do any
  bounds checking.)
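
  As an illustration, here is a small Python sketch of this
  bookkeeping.  The helpers assign_indices and new_pad and the class
  Var are hypothetical; index 0 is kept free because the rest of this
  message uses it for the parent scratchpad.  The compiler decides
  the indices once, and the generated code then uses plain indexed
  access with no name lookup at run time.

```python
class Var:
    """Hypothetical stand-in for a variable PMC."""

    def __init__(self, value):
        self.value = value


def assign_indices(names):
    # Compile-time step: give each lexical of a block a fixed index.
    # Index 0 is reserved for the parent scratchpad (an assumption
    # taken from the later sections of this message).
    return {name: i for i, name in enumerate(names, start=1)}


def new_pad(size, parent=None):
    # Run-time step: one allocation for all the variables of a block.
    return [parent] + [None] * size


indices = assign_indices(["$x", "$y"])
pad = new_pad(len(indices))
pad[indices["$x"]] = Var(13)
pad[indices["$y"]] = Var(17)

# Generated code would contain the literal indices 1 and 2.
x_value = pad[1].value
y_value = pad[2].value
print(indices, x_value, y_value)
```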

MY

  To implement MY, we need to be able to access a variable by its
  name.  For this, the compiler can generate statically a hash mapping
  the variable name to its index in the scratchpad.  Then,
  MY{"foo"} will look up the index of the variable "foo" in the
  hash and use the index to get the variable PMC in the scratchpad.

  To access the scratchpad of an outer block, we need a way to move
  from a scratchpad to its parent scratchpad.  This is trivial if we
  always set field 0 of a scratchpad to point to its parent
  scratchpad.  Likewise, we need a statically generated linked list of
  hashes, one describing each scratchpad.
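
  The lookup described above could look like the following Python
  sketch.  The function lookup and the tuple layout of the static
  descriptors are assumptions made for this example: each descriptor
  is a pair (hash from name to index, parent descriptor), and each
  scratchpad is a list whose field 0 is the parent scratchpad.

```python
def lookup(name, pad, names):
    """Find the variable called `name`, walking outward through blocks.

    `pad` is a run-time scratchpad (field 0 holds the parent scratchpad).
    `names` is the matching static descriptor: (index hash, parent).
    """
    while pad is not None and names is not None:
        index_of, parent_names = names
        if name in index_of:
            return pad[index_of[name]]
        # Not declared in this block: move to the enclosing block,
        # following both linked structures in step.
        pad, names = pad[0], parent_names
    raise KeyError(name)


# Static descriptors, generated by the compiler.
outer_names = ({"$x": 1}, None)
inner_names = ({"$y": 1}, outer_names)

# Run-time scratchpads; strings stand in for the variable PMCs.
outer_pad = [None, "x-variable"]
inner_pad = [outer_pad, "y-variable"]

found_inner = lookup("$y", inner_pad, inner_names)
found_outer = lookup("$x", inner_pad, inner_names)
print(found_inner, found_outer)
```

  Only accesses by name pay for the hash lookup; ordinary accesses
  can still use the indices known at compile time.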

Closures

  A subroutine must have access to the scratchpads of all the
  enclosing blocks.  As the scratchpads are linked, it is sufficient
  to add a pointer to the immediately enclosing scratchpad to the
  closure (Sub class).

  Then, the example

      sub foo {
        my $x = 13;
        my $y = 17;
        return (sub { print "$x\n"; },
                sub { $x = $y; });
      }

  would be compiled into

      foo:  # We assume the closure is in P0
            # We extract the parent scratchpad from the closure and put it
            # in P1
            get_pad P1, P0
            # We allocate a new scratchpad
            new P2, .Array
            # The first field of the scratchpad contains the parent scratchpad
            set P2[0], P1
            # We allocate the $x variable
            new P3, .PerlInt
            set P3, 13
            set P2[1], P3
            # We allocate the $y variable
            new P4, .PerlInt
            set P4, 17
            set P2[2], P4
            # We create the first closure
            new P5, .Sub
            set_code P5, sub1
            set_pad P5, P2
            # We create the second closure
            new P6, .Sub
            set_code P6, sub2
            set_pad P6, P2
            # We put them into an array
            new P0, .PerlArray
            set P0[0], P5
            set P0[1], P6
            # We return the array
            ret
      sub1: # We assume the closure is in P0
            # We extract the parent scratchpad from the closure and put it
            # in P1
            get_pad P1, P0
            # We get the $x variable and put it in P2
            set P2, P1[1]
            # ... we print the variable value
            ret
      sub2: # We assume the closure is in P0
            # We extract the parent scratchpad from the closure and put it
            # in P1
            get_pad P1, P0
            # We get the $x and $y variables
            set P2, P1[1]
            set P3, P1[2]
            # We assign the value of $y to $x
            set_pmc P2, P3
            ret
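
  Since the assembly above uses operations that do not exist yet
  (get_pad, set_pad, set_code, set_pmc), here is a Python model of
  the same scheme that can actually be run.  Everything in it is a
  sketch under my own assumptions: the classes PerlInt and Sub are
  hypothetical, a Sub is modelled as a code pointer plus a
  scratchpad, and the closure is passed to the code as its first
  argument, mirroring "the closure is in P0".

```python
class PerlInt:
    """Hypothetical variable PMC."""

    def __init__(self, value=0):
        self.value = value

    def set_pmc(self, other):
        # Copy the value of the other variable into this one.
        self.value = other.value


class Sub:
    """Hypothetical extended Sub: a code pointer plus a scratchpad."""

    def __init__(self, code, pad):
        self.code = code
        self.pad = pad

    def __call__(self):
        # The closure itself is handed to the code, as in P0.
        return self.code(self)


def foo(closure):
    parent = closure.pad            # get_pad P1, P0
    pad = [parent, None, None]      # new scratchpad, field 0 = parent
    pad[1] = PerlInt(13)            # $x
    pad[2] = PerlInt(17)            # $y
    return [Sub(sub1, pad), Sub(sub2, pad)]


def sub1(closure):
    pad = closure.pad               # get_pad P1, P0
    return pad[1].value             # stands in for printing $x


def sub2(closure):
    pad = closure.pad               # get_pad P1, P0
    pad[1].set_pmc(pad[2])          # $x = $y


printer, assigner = Sub(foo, None)()
first = printer()    # $x as initialised by foo
assigner()
second = printer()   # $x after the second closure has run
print(first, second)
```

  Both closures receive the same scratchpad object, which is why the
  assignment made by the second one is observed by the first.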

Conclusion

  It seems to me that to implement lexical variables, we only need to
  implement the set_pmc method and to extend the Sub class so that it
  contains both a code pointer and a scratchpad.

  In particular, we don't need any special support for scratchpads, at
  least for the moment.

  What do you think?

-- Jerome
