Re: [PHP-DEV] PATCH: Implementing closures in PHP (was: anonymous functions in PHP)
Hello Christian,

I have put your proposal as a link to a PHP GSoC 2008 idea here:

http://wiki.php.net/gsoc/2008

Feel invited to add to this idea in whatever way you want :-)

marcus

Saturday, December 22, 2007, 4:08:04 PM, you wrote:

> Hi,
>
> I was following this thread and came upon Jeff's posting on how closures could be implemented in PHP. [...]
Re: [PHP-DEV] PATCH: Implementing closures in PHP
On Sun, Dec 23, 2007 at 02:40:02PM +0100, Christian Seiler wrote:

> I now updated the patch so that this problem is addressed. You will find it here:
>
> http://www.christian-seiler.de/temp/closures-php-5-3-v2.patch

A few minor items from a quick read-through of the patch:

1. There appear to be some spurious whitespace insertions in this version of the patch.

2. The terms lambda and anonymous function are being used interchangeably. If we're going to introduce new terminology, it would be good to pick one name and use it consistently. I don't have a preference for which one is ultimately chosen.

   The term lexical could also be considered a competing term as it's used in part of the patch.

3. The is_anonymous flags could be zend_bool values instead of bare integers, although that breaks the precedent started by some related flags (such as is_method).

4. This part of the zend_vm_def.h diff looks wrong (a stray f):

       -/*
       +f/*

Looks great overall!

--
Jon Parise (jon of php.net) :: The PHP Project (http://www.php.net/)

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
Re: [PHP-DEV] PATCH: Implementing closures in PHP
Hi!

Thanks for reading through!

> 1. There appear to be some spurious whitespace insertions in this version of the patch.

Oh, that's probably my editor, I'll fix that.

> 2. The terms lambda and anonymous function are being used interchangeably. If we're going to introduce new terminology, it would be good to pick one name and use it consistently. I don't have a preference for which one is ultimately chosen.

Well, create_function uses an already-existing EG(lambda_count) and names the function __lambda_$counter, so I thought I'd use CG(compiled_lambda_count) and name them __compiled_lambda_... But since anonymous functions aren't REAL lambdas, I named them anonymous elsewhere. But you're right, introducing duplicate terminology is a bad idea. I'll change everything to lambda for consistency, even though it's technically not 100% correct.

> The term lexical could also be considered a competing term as it's used in part of the patch.

'lexical' is only used for the variables that are passed into the closure, not for the closure itself.

> 3. The is_anonymous flags could be zend_bool values instead of bare integers, although that breaks the precedent started by some related flags (such as is_method).

You're right, zend_bool is a better idea. Since PHP 5.3 is going to break binary compatibility anyway, would it do any harm to change the types of the existing flags, too?

> 4. This part of the zend_vm_def.h diff looks wrong (a stray f):
>
>     -/*
>     +f/*

WTF? I thought I had already fixed that. Hmm, obviously I hadn't...

> Looks great overall!

Thanks!

Merry Christmas,
Christian
Re: [PHP-DEV] PATCH: Implementing closures in PHP
On Mon, Dec 24, 2007 at 04:14:00PM +0100, Christian Seiler wrote:

> > 3. The is_anonymous flags could be zend_bool values instead of bare integers, although that breaks the precedent started by some related flags (such as is_method).
>
> You're right, zend_bool is a better idea. Since PHP 5.3 is going to break binary compatibility anyway, would it do any harm changing the types of the existing flags, too?

I don't have a strong opinion one way or another, but other folks might. I think using it would make the API clearer, and there might be memory size benefits when used as a structure member.

On the other hand, changing those integers to zend_bool (unsigned char) types could potentially have a runtime performance cost on some platforms (depending on how those values are being marshaled around), but that would have to be measured.

--
Jon Parise (jon of php.net) :: The PHP Project (http://www.php.net/)
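The size argument above can be illustrated with a small C sketch. It is only an illustration under stated assumptions: the two structures are hypothetical stand-ins that carry nothing but the two flags named in the thread, and they are not the engine's real definitions. The zend_bool typedef follows the "unsigned char" description given in the message.

```c
#include <stddef.h>

/* zend_bool as described in the thread: an unsigned char. */
typedef unsigned char zend_bool;

/* Hypothetical structure that stores its flags as bare integers. */
struct flags_as_int {
    int is_method;
    int is_anonymous;
};

/* The same hypothetical structure with zend_bool flags. */
struct flags_as_bool {
    zend_bool is_method;
    zend_bool is_anonymous;
};

/* Bytes saved per structure by switching both flags to zend_bool. */
size_t flag_bytes_saved(void)
{
    return sizeof(struct flags_as_int) - sizeof(struct flags_as_bool);
}
```

In a real structure the saving depends on the neighbouring members, because padding can absorb the difference, and any runtime cost would still have to be measured, as the message says.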
Re: [PHP-DEV] PATCH: Implementing closures in PHP
Any idea about the possibility of a hash conflict?
Re: [PHP-DEV] PATCH: Implementing closures in PHP
Hi!

> Just a minor note: the semicolon after the closing brace seems superfluous. Is there any reason for it?

Unfortunately, yes. The problem is that the closure must be an expression so it can a) be assigned to a variable and b) be returned directly. And since the expression is only a part of a statement, I have no influence at that point in the grammar on whether a semicolon should follow or not. I don't see any way of removing the semicolon without a) making the language inconsistent and/or b) adding a lot of bloat to the grammar.

Regards,
Christian
Re: [PHP-DEV] PATCH: Implementing closures in PHP
Hi David!

> One question about the names you generate for the function table in combination with opcode caches. [...]

I now updated the patch so that this problem is addressed. You will find it here:

http://www.christian-seiler.de/temp/closures-php-5-3-v2.patch

The compiled functions are now named __compiled_lambda_$hash_$counter, where $counter is a per-file lambda counter and $hash is a hash made from the file name (see hash_compiled_filename in zend_compile.c). I wasn't sure how good Zend's hash function is with duplicates, so I hashed the file name AND its basename. Feel free to change that function if you have a better idea or know it's safe to only hash the filename itself because duplicates are too rare.

Regards,
Christian
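The naming scheme described above can be sketched roughly as follows. This is a sketch under assumptions, not the patch's code: sketch_hash is a stand-in string hash (a DJB-style "times 33" loop), not hash_compiled_filename, and both function names as well as the hexadecimal formatting of the hash are made up for illustration.

```c
#include <stdio.h>

/* Stand-in string hash (DJB-style "times 33").
 * NOT the patch's hash_compiled_filename; chosen only for illustration. */
static unsigned long sketch_hash(const char *s)
{
    unsigned long h = 5381UL;
    while (*s) {
        h = h * 33UL + (unsigned char)*s++;
    }
    return h;
}

/* Writes "__compiled_lambda_<hash>_<counter>" into buf.
 * filename: the file being compiled; counter: per-file lambda counter.
 * Returns whatever snprintf returns. */
int sketch_lambda_name(char *buf, size_t len,
                       const char *filename, unsigned int counter)
{
    return snprintf(buf, len, "__compiled_lambda_%lx_%u",
                    sketch_hash(filename), counter);
}
```

With a scheme like this, the first lambda in foo.php and the first lambda in bar.php get different names as long as the two file names hash differently, which is exactly the property the question about hash conflicts is probing.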
Re: [PHP-DEV] PATCH: Implementing closures in PHP
Cool. I hope I'll have the time to give that a shot over the holidays. Great work, nice patch (from what I, with my rubbish knowledge, can tell :D)

David

On 23.12.2007 at 14:40, Christian Seiler wrote:

> Hi David!
>
> I now updated the patch so that this problem is addressed. [...]
Re: [PHP-DEV] PATCH: Implementing closures in PHP (was: anonymous functions in PHP)
2007/12/22, Christian Seiler [EMAIL PROTECTED]:

> Hi,
>
> I was following this thread and came upon Jeff's posting on how closures could be implemented in PHP. Since I would find the feature to be EXTREMELY useful, I decided to actually implement it more or less the way Jeff proposed. So, here's the patch (against PHP_5_3, I can write one against HEAD if you wish):
>
> http://www.christian-seiler.de/temp/closures-php-5-3.patch

Hi,

I finally got some time to check out your patch (the first and the second version). It looks cool and works as expected in most of the things I've tried.

Nevertheless, I don't need this feature; it doesn't provide any advantage over the functor objects I've been using for a long time. Closures and anonymous functions are harder to maintain and debug, and they are not as elegant as their counterparts in functional languages (lambda functions). On the other hand, functor objects are more maintainable, scalable, elegant, and versatile (and PHP already supports them... it could be improved though).

Bottom line: excellent work. If a considerable part of the PHP user base needs this, excellent, they have an implementation which seems viable. But, sorry, I couldn't care less about closures or anonymous functions. Anyway, my opinion doesn't matter, I'm just one user.

Best Regards,

Martin Alterisio
[PHP-DEV] PATCH: Implementing closures in PHP (was: anonymous functions in PHP)
Hi,

I was following this thread and came upon Jeff's posting on how closures could be implemented in PHP. Since I would find the feature to be EXTREMELY useful, I decided to actually implement it more or less the way Jeff proposed. So, here's the patch (against PHP_5_3, I can write one against HEAD if you wish):

http://www.christian-seiler.de/temp/closures-php-5-3.patch

I started with Wez's patch for adding anonymous functions that aren't closures. I changed it to make sure no shift/reduce or reduce/reduce errors occur in the grammar. Then I started implementing the actual closure stuff. It was fun because I learned quite a lot about how PHP actually works. I had the following main goals while developing the patch:

1. Don't reinvent the wheel.
2. Don't break anything unless absolutely necessary.
3. Keep it simple.

Jeff proposed a new type of zval that holds additional information about the function that is to be called. Adding a new type of zval would need changes throughout the ENTIRE PHP source and probably also throughout quite a few scripts. But fortunately, PHP already possesses a zval that supports the storage of arbitrary data while being very lightweight: Resources.

So I simply added a new resource type that stores zend functions. The statement

    $var = function () {};

will now make $var a resource (of the type "anonymous function").

Anonymous functions are ALWAYS defined at compile time, no matter where they occur. They are simply named __compiled_lambda_1234 and added to the global function table. But instead of simply returning the string '__compiled_lambda_1234', I introduced a new opcode that will create references to the correct local variables that are referenced inside the function.

For example, if you have:

    $func = function () {
        echo "Hello World\n";
    };

This will result in an anonymous function called '__compiled_lambda_0' that is added to the function table at compile time. The opcode for the assignment to $func will be something like:

    1  ZEND_DECLARE_ANON_FUNC  ~0  '__compiled_lambda_0'
    2  ASSIGN                  !0, ~0

The ZEND_DECLARE_ANON_FUNC opcode handler does the following: It creates a new zend_function, copies the contents of the entire structure of the function table entry corresponding to '__compiled_lambda_0' into that new structure, increments the refcount, registers it as a resource and returns that resource so it can be assigned to the variable.

Now, have a look at a real closure:

    $string = "Hello World!\n";
    $func = function () {
        lexical $string;
        echo $string;
    };

This will result in the same opcode as above. But here, three additional things happen:

1. The compiler sees the keyword 'lexical' and stores the information that a variable called 'string' should be used inside the closure.

2. The opcode handler sees that a variable named 'string' is marked as lexical in the function definition. Therefore it creates a reference to it in a HashTable of the COPIED zend_function (that will be stored in the resource).

3. The 'lexical $string;' translates into a FETCH opcode that will work in exactly the same way as 'static' or 'global', only fetching it from the additional HashTable in the zend_function structure.

The resource destructor makes sure that the HashTable containing the references to the lexical variables is correctly destroyed upon destruction of the resource. It does NOT destroy other parts of the function structure because they will be freed when the function is removed from the global function table.

With these changes, closures work in PHP.

Some caveats / bugs / todo:

* Calling anonymous functions by name directly is problematic if there are lexical variables that need to be assigned. I added checks to make sure this case does not happen.

* In the opcode handler, error handling needs to be added.

* If somebody removes the function from the global function table (e.g. with runkit), the new opcode will return NULL instead of a resource (error handling is missing). Since I do increment the refcount of the zend_function, it SHOULD not cause segfaults or memory leaks, but I haven't tested it.

* $this is kind of a problem, because all the fetch handlers in PHP make sure $this is a special kind of variable. For the first version of the patch I chose not to care about this because what still works is e.g. the following:

      $object = $this;
      $func = function () {
          lexical $object;
          // do something
      };

  Also, inside the closures, the class context is not preserved, so accessing private / protected members is not possible. I'm not sure this actually represents a problem because you can always use normal local variables to pass values between closure and calling method and make the calling method change the properties itself.

* I've had some problems with eval(), have a look at the following code:

      $func = eval ('return function ()
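The copy-and-bind step described in this message can be sketched in plain C. This is a toy model under stated assumptions, not code from the patch: every toy_* name, the fixed-size arrays and the use of int values are hypothetical, and the real zend_function, HashTable, resource registration and refcounting are left out entirely.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_LEXICALS 4

/* Toy stand-in for a compiled function and the names it declares lexical. */
struct toy_function {
    const char *name;
    const char *lexical_names[MAX_LEXICALS];
    int lexical_count;
    /* Only meaningful on copies: references to the bound variables. */
    int *lexical_refs[MAX_LEXICALS];
    int (*body)(struct toy_function *self);
};

/* A local variable in the scope where the closure is created. */
struct toy_variable {
    const char *name;
    int value;
};

/* Roughly the step the message attributes to the new opcode handler:
 * copy the function table entry, then bind each lexical name to the
 * matching local variable by reference. */
struct toy_function *toy_declare_closure(const struct toy_function *entry,
                                         struct toy_variable *locals,
                                         int local_count)
{
    struct toy_function *copy = malloc(sizeof *copy);
    int i, j;

    if (copy == NULL) {
        return NULL;
    }
    *copy = *entry; /* copy the whole structure */
    for (i = 0; i < copy->lexical_count; i++) {
        copy->lexical_refs[i] = NULL;
        for (j = 0; j < local_count; j++) {
            if (strcmp(copy->lexical_names[i], locals[j].name) == 0) {
                copy->lexical_refs[i] = &locals[j].value;
            }
        }
    }
    return copy;
}

/* Destructor: releases only the copy, never the shared table entry. */
void toy_destroy_closure(struct toy_function *copy)
{
    free(copy);
}

/* Example body: reads the first lexical variable through its reference. */
int toy_read_first_lexical(struct toy_function *self)
{
    return self->lexical_refs[0] ? *self->lexical_refs[0] : -1;
}
```

Because the copy holds a reference rather than a value, a later change to the local variable is visible when the copied function runs, while the original table entry stays unbound.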
Re: [PHP-DEV] PATCH: Implementing closures in PHP
Hi!

> typo alert:

Oh, thanks (don't know how it got in there ;-)), I fixed that, same address:

http://www.christian-seiler.de/temp/closures-php-5-3.patch

> Very impressive patch, I'll be interested to try it out when I get a chance.

Thanks!

Christian
Re: [PHP-DEV] PATCH: Implementing closures in PHP (was: anonymous functions in PHP)
One question about the names you generate for the function table in combination with opcode caches. Let's assume I have APC installed, and do the following:

foo.php:

    $foo = function() { echo "foo"; };

bar.php:

    include('foo.php');

All works fine, and cached versions of both files would be created. However, if I now change bar.php to this:

    $bar = function() { echo "bar"; };
    include('foo.php');

wouldn't I get a "cannot redefine function __compiled_lambda_0" fatal error if foo.php was loaded from the opcode cache?

While this is a constructed example, it could easily occur with conditional includes in environments that use opcode caches.

Or am I completely wrong? :p

- David

On 22.12.2007 at 16:08, Christian Seiler wrote:

> Hi,
>
> I was following this thread and came upon Jeff's posting on how closures could be implemented in PHP. [...]
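The collision in the scenario above can be modelled with a toy function table in C. This is a sketch under assumptions: the toy_* names and the fixed-size table are hypothetical, and the model only captures the point that a name built from nothing but a counter is the same for the recompiled bar.php and for the cached foo.php.

```c
#include <stdio.h>
#include <string.h>

#define MAX_FUNCS 8
#define NAME_LEN 64

/* Toy stand-in for the global function table: a list of names. */
struct toy_function_table {
    char names[MAX_FUNCS][NAME_LEN];
    int count;
};

/* Returns 0 on success, -1 if the name is already registered
 * (the "cannot redefine function" case) or the table is full. */
int toy_register(struct toy_function_table *t, const char *name)
{
    int i;

    for (i = 0; i < t->count; i++) {
        if (strcmp(t->names[i], name) == 0) {
            return -1;
        }
    }
    if (t->count == MAX_FUNCS) {
        return -1;
    }
    strncpy(t->names[t->count], name, NAME_LEN - 1);
    t->names[t->count][NAME_LEN - 1] = '\0';
    t->count++;
    return 0;
}

/* Name built from a counter alone, as in the first version of the patch. */
void toy_counter_only_name(char *buf, size_t len, unsigned int counter)
{
    snprintf(buf, len, "__compiled_lambda_%u", counter);
}
```

If bar.php is recompiled and its first lambda gets counter 0, while the cached foo.php still carries a lambda that was also named with counter 0, the second toy_register call fails, which corresponds to the fatal error asked about above.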
Re: [PHP-DEV] PATCH: Implementing closures in PHP (was: anonymous functions in PHP)
I have another observation about names. Instead of using an arbitrary name as the name of the function, wouldn't it be possible to let the name be derived from the function body? E.g., you could take the function body's tokens and create a hash from them. This would have two implications:

1) Multiple definitions of the same function would be optimised into one. And more importantly,

2) it would be possible to serialize/unserialize a closure.

Of course, this won't work if an anonymous function is a resource, since resources can't be serialized. This would work for Wez' original patch though.

--
troels
Re: [PHP-DEV] PATCH: Implementing closures in PHP (was: anonymous functions in PHP)
2007/12/22, Christian Seiler [EMAIL PROTECTED]:

> PPS: Oh, yeah, if it should be legally necessary, I grant the right to anybody to use this patch under any OSI certified license you may want to choose.

That's very kind of you but, if it was explained to me correctly, you don't have copyright on a patch. If I understood the legalese correctly, you retain authorship of the code provided, but this kind of derived work, which has no purpose outside the original, cannot be claimed through copyright. You wrote it specifically for the PHP codebase, and it has no meaning outside it.

Consider what happens to the code you write to maintain an in-house application of a company. You cannot claim copyright on that codebase; it remains with the company. But if you were to develop an app from scratch for the same company, the company has the right to use it in-house, but if they want to distribute it, they have to ask for your consent, because you're one of the copyright holders.

Also consider what happens to reviewers who check and correct works from other people, such as syntax and grammar checking in literature, or code reviewing in software. They can claim authorship of their corrections, but they cannot claim copyright.

Anyway, it's too complicated. The bottom line is that copyright cannot be claimed on patches. Which makes complete sense: imagine what would happen to open source if it wasn't this way... the licensing documentation would take up more space than the code itself...
Re: [PHP-DEV] PATCH: Implementing closures in PHP
Hi!

I'm going to answer everybody at once, if that's OK.

David Zülke wrote:

> One question about the names you generate for the function table in combination with opcode caches. [...] While this is a constructed example, it could easily occur with conditional includes with environments that use opcode caches.

Oh, yes, that's true, I didn't think of that, thanks a lot for pointing that out! Tomorrow I'll post an updated patch which will make sure this doesn't happen (see below).

troels knak-nielsen wrote:

> I have another observation about names. Instead of using an arbitrary name, as the name of the function, wouldn't it be possible to let the name be derived from the function-body. Eg., if you took the function-body's tokens and created a hash from them. This would have two implications: 1) Multiple definitions of the same function would be optimised into one. And more importantly 2) , it would be possible to serialize/unserialize a closure. Of course, this won't work if an anonymous function is a resource, since resources can't be serialized. This would work for Wez' original patch though.

Thanks for the suggestions, and although I don't completely agree with you, you pointed me in the direction I'm leaning towards now, so thanks. :-)

First of all: I don't quite understand what you mean when you want to serialize a function (closure or not)? The opcodes? Ok, sure, with the current PHP implementation you can serialize the variable used to CALL the function (e.g. with $func = 'str_replace'; $func is only a string and can be serialized). But where would you need that? (Ok, for normal functions that are named this could actually be useful, but for anonymous functions?)

Using the function body itself as a hashed value for the function name would require some _major_ changes to the parser. Currently, the function name has to be known before the function body starts. Also, the tokens inside the function would have to be tracked and stored in an array somehow. This would be quite a performance penalty during compile time.

But: The idea I had thanks to you is to use the file name and a per-file counter for the function name. So the name would be something like (symbolically speaking)

    '__compiled_lambda_' + hashfunction(__FILE__) + '_' + per_file_counter

That would solve the problem with opcode caches while not causing any real performance penalties (the hash of the loaded file would only have to be calculated once).

As I also wrote above, I'll post an updated patch tomorrow that will address this problem.

Martin Alterisio wrote:

> That's very kind of you but, if I was explained right, you don't have copyright on a patch. [...]

I don't agree at all (except for the part where a patch is a derived work) but since I don't want to talk about copyright laws here (no offense, but I'd rather spend my time writing code), I think we both can live with the following statement: If you're right, the PHP team can use the patch anyway, so it isn't a problem. If I'm right, I've given the PHP team the necessary permission to use it, so it isn't a problem either.

Regards,
Christian
Re: [PHP-DEV] PATCH: Implementing closures in PHP
On Dec 23, 2007 2:23 AM, Christian Seiler [EMAIL PROTECTED] wrote:

> First of all: I don't quite understand what you mean when you want to serialize a function (closure or not)? The opcodes? Ok, sure, with the current PHP implementation you can serialize the variable used to CALL the function (e.g. with $func = 'str_replace'; $func is only a string

Admittedly, I got the idea in relation to Wez' patch. There, it would be trivial to serialize/unserialize the function pointer. I suppose that isn't immediately possible if the closure is a resource type.

> and can be serialized). But where would you need that? (Ok, for normal functions that are named this could actually be useful, but for anonymous functions?)

If a closure could be serialized, it would allow for continuation-style applications. I'm not sure how good an idea this is in web applications anyway, so just ignore that.

Just a minor note: the semicolon after the closing brace seems superfluous. Is there any reason for it?

Otherwise excellent work there.

--
troels