One question about the names you generate for the function table in combination with opcode caches.

Let's assume I have APC installed, and do the following:

foo.php:
$foo = function() {
  echo "foo";
};

bar.php:
include('foo.php');

All works fine, and cached versions of both files would be created.


However, if I now change bar.php to this:
$bar = function() {
  echo "bar";
};
include('foo.php');

wouldn't I get a "cannot redefine function __compiled_lambda_0" fatal error if foo.php was loaded from the opcode cache?
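
A minimal sketch of that scenario in plain PHP, for illustration only. It assumes (this is a guess, not something taken from the patch) that lambda names come from a counter that starts at 0 for every compilation, and that a cached file keeps the name it was compiled with. The $functionTable array and the declareLambda() helper are hypothetical stand-ins for the engine's function table:

```php
<?php
// Hypothetical model, not the engine: the function table is a plain array,
// and declaring a name that is already present is treated as fatal.
$functionTable = array();

function declareLambda(array &$table, $name) {
    if (isset($table[$name])) {
        throw new RuntimeException("cannot redefine function $name");
    }
    $table[$name] = true;
}

// bar.php was edited, so it is recompiled; its first lambda gets counter 0.
declareLambda($functionTable, '__compiled_lambda_0');

// foo.php comes from the opcode cache with the name it received when it was
// compiled on its own, which (by assumption) was also counter 0.
try {
    declareLambda($functionTable, '__compiled_lambda_0');
    $result = 'ok';
} catch (RuntimeException $e) {
    $result = $e->getMessage();
}

echo $result, "\n";
```

If the names were instead derived from something unique per file, the second declaration in this model would use a different key and would not collide.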


While this is a constructed example, it could easily occur with conditional includes in environments that use opcode caches.

Or am I completely wrong? :p

- David



On 22.12.2007 at 16:08, Christian Seiler wrote:

Hi,

I was following this thread and came upon Jeff's posting on how closures
could be implemented in PHP.

Since I would find the feature to be EXTREMELY useful, I decided to
actually implement it more or less the way Jeff proposed. So, here's the
patch (against PHP_5_3, I can write one against HEAD if you wish):

http://www.christian-seiler.de/temp/closures-php-5-3.patch

I started with Wez's patch for adding anonymous functions that aren't
closures. I changed it to make sure no shift/reduce or reduce/reduce
conflicts occur in the grammar. Then I started implementing the actual
closure stuff. It was fun because I learned quite a lot about how PHP
actually works.

I had the following main goals while developing the patch:

1. Don't reinvent the wheel.
2. Don't break anything unless absolutely necessary.
3. Keep it simple.

Jeff proposed a new type of zval that holds additional information about
the function that is to be called. Adding a new type of zval would need
changes throughout the ENTIRE PHP source and probably also throughout
quite a few scripts. But fortunately, PHP already possesses a zval that
supports the storage of arbitrary data while being very lightweight:
Resources. So I simply added a new resource type that stores zend
functions. The $var = function () {}; will now make $var a resource (of
the type "anonymous function").

Anonymous functions are ALWAYS defined at compile time, no matter where
they occur. They are simply named __compiled_lambda_1234 and added to the
global function table. But instead of simply returning the string
'__compiled_lambda_1234', I introduced a new opcode that will create
references to the correct local variables that are referenced inside the
function.

For example, if you have:

$func = function () {
 echo "Hello World\n";
};

This will result in an anonymous function called '__compiled_lambda_0'
that is added to the function table at compile time. The opcode for the
assignment to $func will be something like:

        1  ZEND_DECLARE_ANON_FUNC     ~0      '__compiled_lambda_0'
        2  ASSIGN                             !0, ~0

The ZEND_DECLARE_ANON_FUNC opcode handler does the following:

It creates a new zend_function, copies the contents of the entire
structure of the function table entry corresponding to
'__compiled_lambda_0' into that new structure, increments the refcount,
registers it as a resource and returns that resource so it can be
assigned to the variable.

Now, have a look at a real closure:

$string = "Hello World!\n";
$func = function () {
 lexical $string;
 echo $string;
};

This will result in the same opcode as above. But here, three additional
things happen:

1. The compiler sees the keyword 'lexical' and records that a variable
called 'string' should be used inside the closure.

2. The opcode handler sees that a variable named 'string' is marked as
lexical in the function definition. Therefore it creates a reference to
it in a HashTable of the COPIED zend_function (that will be stored in
the resource).

3. The 'lexical $string;' translates into a FETCH opcode that will work
in exactly the same way as 'static' or 'global' - only fetching it from
the additional HashTable in the zend_function structure.
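
The copy-and-bind behaviour described above can be modelled in userland PHP. This is only a sketch under stated assumptions and not the C code from the patch: the function table entry and the copied zend_function are represented as plain arrays, and declareAnonFunc(), 'lexical_names' and 'lexicals' are hypothetical names chosen for the illustration:

```php
<?php
// Hypothetical model of the opcode handler: copy the function table entry
// and give the copy its own table of references to the lexical variables.
function declareAnonFunc(array $entry, array &$localScope) {
    $copy = $entry;                 // arrays are copied by value
    $copy['lexicals'] = array();    // stands in for the extra HashTable
    foreach ($entry['lexical_names'] as $name) {
        // Bind by reference, so later changes in the defining scope
        // are visible through the copy.
        $copy['lexicals'][$name] = &$localScope[$name];
    }
    return $copy;
}

// The compile-time entry only records WHICH variables are lexical.
$functionTable = array(
    '__compiled_lambda_0' => array('lexical_names' => array('string')),
);

$scope = array('string' => "Hello World!\n");
$closure = declareAnonFunc($functionTable['__compiled_lambda_0'], $scope);

$scope['string'] = "Changed\n";        // modify the variable after binding
echo $closure['lexicals']['string'];   // the copy sees the new value
```

In this model the entry in $functionTable is never modified; only the per-closure copy carries the bound references, which mirrors why the references are kept in the COPIED structure.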

The resource destructor makes sure that the HashTable containing the
references to the lexical variables is correctly destroyed upon
destruction of the resource. It does NOT destroy other parts of the
function structure because they will be freed when the function is
removed from the global function table.

With these changes, closures work in PHP.

Some caveats / bugs / todo:

* Calling anonymous functions by name directly is problematic if there
  are lexical variables that need to be assigned. I added checks to
  make sure this case does not happen.

* In the opcode handler, error handling needs to be added.

* If somebody removes the function from the global function table
  (e.g. with runkit), the new opcode will return NULL instead of
  a resource (error handling is missing). Since I do increment the
  refcount of the zend_function, it SHOULD not cause segfaults or
  memory leaks, but I haven't tested it.

* $this is kind of a problem, because all the fetch handlers in PHP
  make sure $this is a special kind of variable. For the first version
  of the patch I chose not to care about this because what still works
  is e.g. the following:

      $object = $this;
      $func = function () {
        lexical $object;
        // do something
      };

  Also, inside the closures, the class context is not preserved, so
  accessing private / protected members is not possible.

  I'm not sure this actually represents a problem because you can
  always use normal local variables to pass values between closure
  and calling method and make the calling method change the properties
  itself.

* I've had some problems with eval(), have a look at the following
  code:

      $func = eval ('return function () { echo "Hello World!\n"; };');
      $func();

  With plain PHP, this seems to work; with the VLD extension loaded
  (which shows the opcodes), it crashes. I don't know if that's a
  problem with eval() or just with VLD and I didn't have time to
  investigate it further.

* Oh, yes, 'lexical' is now a keyword. Although I really don't think
  that TOO many people use that as an identifier, so it probably won't
  hurt THAT much.

Apart from the points above, it really works, even with complex stuff. Let
me show you some examples:

1. Customized array_filter:

function filter_larger ($array, $min = 42) {
 $filter = function ($value) {
   lexical $min;
   return ($value >= $min);
 };
 return array_filter ($array, $filter);
}

$arr = array (41, 43);
var_dump (filter_larger ($arr)); // 43
var_dump (filter_larger ($arr, 40)); // 41, 43
var_dump (filter_larger ($arr, 44)); // empty

2. Jeff's example:

function getAdder($x) {
 return function ($y) {
   lexical $x;
   return $x + $y;
 };
}

$plusFive = getAdder(5);
$plusTen = getAdder(10);

echo $plusFive(4)."\n"; // 9
echo $plusTen(7)."\n"; // 17

3. Nested closures

$outer = function ($value) {
 return function () {
   lexical $value;
   return $value * 2;
 };
};

$duplicator = $outer (4);
echo $duplicator ()."\n"; // 8
$duplicator = $outer (8);
echo $duplicator ()."\n"; // 16

[Ok, yeah, that example is quite stupid and should NOT be used as an
example for good code. ;-) But it's simple and demonstrates the
possibilities.]

It would be great if somebody could review the patch because I'm sure
some parts can still be cleaned up or improved. And it would be even
better if this feature made it into PHP. ;-)

Regards,
Christian

PS: I'd like to thank Derick Rethans for his GREAT Vulcan Logic
Disassembler - without it, development would have been a LOT more painful.

PPS: Oh, yeah, if it should be legally necessary, I grant the right to
anybody to use this patch under any OSI certified license you may want
to choose.

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php


