Re: Re[5]: [PHP-DEV] Design of the Zend Engine's Instruction Set

2009-08-16 Thread Paul Biggar
Hi Lothar,

On Sun, Aug 16, 2009 at 6:53 PM, Lothar Scholz wrote:
> PB> Yes, by many times. Part of that might be the expense of PHP's weak
> PB> typing and references.
>
> No. Smalltalk, Javascript, Lua they all have the same problem with it

No, I disagree. Javascript, Lua and Smalltalk are much simpler
languages. None of them have references, which make many of the
implementation techniques more complicated. The weak typing you find
in Lua and Javascript is not as complex as in PHP. Lua has no
variable-variables. And there is plenty going on behind the scenes
in PHP that makes improving the interpreter hard. Take a look at Section 2 of
my optimizer paper to see some of the challenges
(https://www.cs.tcd.ie/~pbiggar/#wip-optimizer)


> and they solve it in much more clever and performant ways. Javascript

Yes, they often do. But there is still a larger challenge with PHP.


> with its class free OO is even harder and with V8 they have a much
> much better solution.

It's very difficult for an interpreter to take advantage of PHP's
static class hierarchy. So it's not much of an advantage.

V8 is a JIT, so it's unfair to compare it to a straightforward
interpreter. JITs have plenty of disadvantages, most of all
portability - V8 runs on very few platforms.


> It's just the implementation - not the language.

In my opinion, it's almost certainly both. I've argued that pretty
strongly in my papers.


> 
> This sucks so huge and is such a
> fucking lazy argument of an incompetent language development team.

I think it's more the lack of a language implementation team. I don't
think there is anybody spending significant time on making the PHP
interpreter faster.


> I've seen that you talked about branch prediction misses in your
> paper. Did you ever tried to compile the original PHP with different
> compilers, for example from intel or sun studio? Did you test or have
> you ever heard of someone who tested the influence of the profiling
> feedback that modern C compilers offer? (Well don't ask me - it's
> somewhere on my TODO list - but i also haven't had the time yet).

I tried FDO with gcc 4.4 on phc generated code, but not on the PHP
interpreter. I've not looked at other compilers either.


Paul

-- 
Paul Biggar
paul.big...@gmail.com

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: Re[5]: [PHP-DEV] Design of the Zend Engine's Instruction Set

2009-08-16 Thread Paul Biggar
Hi Tom,

On Sun, Aug 16, 2009 at 7:16 PM, Tom Boutell wrote:
> Re: the performance of PHP, if it's badly implemented, shouldn't
> Quercus (a reimplementation of PHP in Java) run rings around it?
>
> In reality, Quercus is faster than PHP without APC, but with APC the
> Quercus team themselves admit it only "roughly matches" the original
> PHP in speed.

Having spent a few years working on a PHP compiler (phc -
phpcompiler.org), I can suggest a few reasons for this. Mostly, it's
hard to get fast when you have a small team. The effort involved in
even making a language match the underspecified, ad-hoc nature of
PHP's reference implementation is staggering. Roadsend seem to have
the same problem.

Secondly, all of PHP's libraries are similarly underspecified. They are
all written using the Zend API, and there are about 5000 functions. A
compelling reimplementation must reimplement large portions of this.
It's no small challenge.


In short, I believe the lack of great speedups from other PHP
implementations (including my own) can be attributed at least
partially to the implementation difficulty and under-specification of
PHP the language.


Paul

-- 
Paul Biggar
paul.big...@gmail.com




Re: Re[3]: [PHP-DEV] Design of the Zend Engine's Instruction Set

2009-08-16 Thread Paul Biggar
On Sun, Aug 16, 2009 at 9:58 AM, Lothar Scholz wrote:
>
>>>> where the instruction encoding
>>>> is much more complex then for instance for the
>>>> JVM, Python, or Smalltalk.
>
> PB>> Yes, definitely.
>
> And again we see that complexity is just that complex but not
> necessarily good. As all others outperform PHP in all basic
> operations.

Yes, by many times. Part of that might be the expense of PHP's weak
typing and references. They cause most operations to be more
expensive, even if they are not used in that particular operation.

I did a performance analysis of PHP in a paper I have in submission.
You might find it interesting (comments welcome):
https://www.cs.tcd.ie/~pbiggar/wip-sac-journal.pdf. There was also a
POPL paper this year on copy-on-write in PHP, and a PLDI paper on the
performance effects of PHP's memory allocator, both of which might
interest you.


Thanks,
Paul


-- 
Paul Biggar
paul.big...@gmail.com




Re: [PHP-DEV] Design of the Zend Engine's Instruction Set

2009-08-16 Thread Paul Biggar
Hi Stefan,

On Sat, Aug 15, 2009 at 8:52 PM, Stefan Marr wrote:
> Sometimes, it would be really interesting to know
> where some of the used ideas are coming from
> and what the reasoning was. I tend to think that its rather unlikely that
> they
> are pulled out of thin air. Some parts of the model remind me of CISC
> instruction
> sets... 3-address form, register-memory model...

I think they are pulled out of thin air. More specifically, I think
there are optimizations heaped upon optimizations heaped upon an
initial implementation. It seems that each new release of PHP has a
small speed improvement based on some optimization performed, but that
there has been no major rearchitecture since the addition of a
bytecode based interpreter in PHP 4. I do not know how that was
designed though, maybe others do?

One thing I do find interesting is that the register machine nature of
PHP comes from an optimization called "compiled variables". CVs point
to symbol-table entries, but without them, I'm not sure whether we
would still call PHP a register machine. Any thoughts?


Thanks,
Paul

-- 
Paul Biggar
paul.big...@gmail.com




Re: [PHP-DEV] Design of the Zend Engine's Instruction Set

2009-08-15 Thread Paul Biggar
Hi Stefan,

On Thu, Aug 13, 2009 at 1:42 PM, Stefan Marr wrote:
> Hello internals:
>
> I had a look at the Zend Engine to understand some
> details about its internal design with respect
> to its opcodes and machine model.


To start with, the best reference about the Zend engine that I know of
is a presentation by Andy Wharmby at IBM:
www.zapt.info/PHPOpcodes_Sep2008.odp. It should answer a lot of your
questions.



> Would like to ask you for some comments if the
> following sounds wrong or misinterpreted to you:
>
>
> So, the basic design of the Zend Engine is a
> a stack-based interpreter for a fixed length

No, it's a register-based interpreter. There is a stack, but that's used
for calling functions only. The operands to the opcodes are pointed to
by the opcodes in the case of compiled variables, or in symbol tables
otherwise. That's as close to a register machine as we can get, I
think, but it's not very close to a stack machine. In a stack-based VM,
the operands to an opcode would be implicit, with add for example
using the top two stack operands, and that's not the case at all.
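
To make the distinction concrete, here is a toy sketch in C. It is not
Zend Engine code: the toy_* names and struct layouts are hypothetical,
chosen only to contrast an add with implicit operands (stack machine)
with an add whose operands and result are named by the instruction
itself (register style, three-address form).

```c
#include <stddef.h>

/* Toy stack machine: ADD carries no operands. It pops two values from
 * the evaluation stack and pushes their sum. */
typedef struct {
    long slots[16];
    size_t top;              /* number of values currently on the stack */
} toy_stack;

void toy_push(toy_stack *s, long v) { s->slots[s->top++] = v; }
long toy_pop(toy_stack *s) { return s->slots[--s->top]; }

void toy_stack_add(toy_stack *s)
{
    long rhs = toy_pop(s);
    long lhs = toy_pop(s);
    toy_push(s, lhs + rhs);
}

/* Toy register-style ADD: the instruction names both operands and the
 * destination explicitly, here as indexes into a variable table. */
typedef struct {
    size_t op1;              /* slot of the first operand  */
    size_t op2;              /* slot of the second operand */
    size_t result;           /* slot that receives the sum */
} toy_reg_add;

void toy_reg_add_exec(const toy_reg_add *ins, long *vars)
{
    vars[ins->result] = vars[ins->op1] + vars[ins->op2];
}
```

In the first form the compiler has to emit separate push instructions
before the add; in the second, a single instruction describes the whole
operation, which is closer to what the opcodes with op1, op2 and result
look like.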


> instruction set (76byte on a 32bit architecture),

Andy's presentation says 96 bytes, but that might be 64 bit. I presume
this means sizeof(struct _zend_op)?


> where the instruction encoding
> is much more complex then for instance for the
> JVM, Python, or Smalltalk.

Yes, definitely.



> Even so, the source code is compiled to a linearized
> instruction stream, the instruction itself contain not just opcode and
> operands.
>
> The version I looked at had some 136 opcodes encoded
> in one byte, but the rest of the instruction has
> many similarities with a AST representation.

Are you referring to the IS_TMP_VAR type of a znode?


> Instructions encode:
>  - a function pointer to the actual handler which is
>   used to execute it

The type of interpreter dispatch used can be chosen at configure-time
using the --with-vm-kind flag. The call-based interpreter is the
default. I've heard the others are buggy, but I'm not certain where I
heard that.
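
For readers unfamiliar with the term, a call-based interpreter can be
sketched roughly as below. This is a toy illustration under my own
assumptions, not the Zend source: the toy_* types and handlers are
hypothetical. The point is only that each instruction stores a pointer
to its handler, and the main loop calls through that pointer.

```c
#include <stddef.h>

struct toy_vm;
struct toy_op;

/* A handler executes one instruction; it returns 0 to continue and
 * nonzero to stop the interpreter loop. */
typedef int (*toy_handler)(struct toy_vm *vm, const struct toy_op *op);

typedef struct toy_op {
    toy_handler handler;     /* function pointer stored in the instruction */
    size_t op1, op2, result; /* operand and result slots */
} toy_op;

typedef struct toy_vm {
    long vars[8];
    size_t pc;               /* index of the current instruction */
} toy_vm;

int toy_add(toy_vm *vm, const toy_op *op)
{
    vm->vars[op->result] = vm->vars[op->op1] + vm->vars[op->op2];
    vm->pc++;
    return 0;
}

int toy_halt(toy_vm *vm, const toy_op *op)
{
    (void)vm;
    (void)op;
    return 1;
}

/* Dispatch loop: one indirect call per executed instruction. */
void toy_run(toy_vm *vm, const toy_op *code)
{
    vm->pc = 0;
    while (code[vm->pc].handler(vm, &code[vm->pc]) == 0) {
        /* the handler has already advanced vm->pc */
    }
}
```

The other dispatch kinds replace the indirect call with a different
mechanism; the handler bodies themselves stay conceptually the same.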


> However, its not a simple, single stack model,
> but uses several purpose-specific stacks.

How so?


> What I am not so sure about is especially the
> semantics of the result field and the pointer
> to the other function (op_array).
>
> Would be grateful if someone could comment on that.

I'm not sure what's confusing about the result field? It points to a
zval, same as op1 and op2.

I _think_ that op_array is used to attach extra information to the
opcode by special extensions. I can't think of an example off the top
of my head.



> I am also not really sure with these complexity,
> whether is not actually some kind of abstract syntax
> tree instead of a instruction set like Java
> bytecode. Thats not a technical problem, but merely
> an academic question to categorize/characterize PHP.

I think the result field of a znode can make it seem like that, but I
would characterize it as you have before: an instruction set just like
Java bytecode. Way more complicated, obviously, but I don't think it's
very close to an AST. Certainly the interpreter does not really
resemble an AST walker.



I hope I answered what you were looking for. I'm not certain about a
few of my answers, since I've really avoided the interpreter in my
work, but I think most of it is OK.



Best of luck,
Paul



-- 
Paul Biggar
paul.big...@gmail.com




Re: [PHP-DEV] Can't use embeded sapi

2009-07-13 Thread Paul Biggar
Hi Thomas,

On Mon, Jul 13, 2009 at 9:15 PM, Thomas Koch wrote:
> gcc  -c -I/usr/local/include/php/ -I/usr/local/include/php/main -
> I/usr/local/include/php/Zend -I/usr/local/include/php/TSRM -Wall -g -o
> worker.o worker.c
> gcc  -L/usr/local/lib -lphp5 -o worker worker.o
> worker.o: In function `main':
> /var/checkouts/gearman-php-worker/worker.c:5: undefined reference to
> `php_embed_init'
> /var/checkouts/gearman-php-worker/worker.c:6: undefined reference to
> `php_embed_shutdown'
> collect2: ld returned 1 exit status
> make: *** [all] Error 1

This worked for me, with your commands and program. Since it's a linker
error, it seems likely that libphp5.so isn't installed properly.
You didn't say whether you ran make install. If you did, you may
want to verify that there is a libphp5.so in /usr/local/lib, and that
it is a shared library.



Paul


-- 
Paul Biggar
paul.big...@gmail.com




Re: [PHP-DEV] Type hinting - Request for Discussion

2009-07-09 Thread Paul Biggar
On Thu, Jul 9, 2009 at 5:20 PM, Lukas Kahwe Smith wrote:
> Actually, in terms of weak typing we are now at 4 supporters of the general
> idea:
> Paul B.,

Better make that 3 supporters. After seeing how complicated it was to
explain the semantics of a "sensible middle", I think the patch that
Ilia currently has is close to ideal (the strict semantics are trivial
to understand, the weak ones are the same as what's already here).
Naturally, I might change my position upon seeing the new RFCs.

Paul


-- 
Paul Biggar
paul.big...@gmail.com




Re: [PHP-DEV] Type hinting - Request for Discussion

2009-07-09 Thread Paul Biggar
On Thu, Jul 9, 2009 at 1:33 PM, Dennis Haarbrink wrote:
> I mean, libraries may have all kinds of quirks, you may not be satisfied with 
> the supplied API or maybe even the entire philosophy behind it, but that 
> shouldn't stop you from using it or is it?

You must learn what these quirks are. Likewise, every developer must
learn the quirks of the "type hinting" as they will almost certainly
be required to interact with code that uses it.

In an extreme case, suppose that we completely balls-ed up the
semantics of type checks to be really really hard to use, but people
used them anyway. That would lead to a sharp learning curve. The
people who are against this proposal, or any other, know they will be
required to use it when it comes in, and have an interest in making
sure it is clean (whatever that means in each case). We shouldn't
dismiss their objections on the argument that they aren't required to
use it. If it's in the language, they are required to learn it.

Paul



-- 
Paul Biggar
paul.big...@gmail.com




Re: [PHP-DEV] Type hinting - Request for Discussion

2009-07-09 Thread Paul Biggar
Hi Dennis,

On Thu, Jul 9, 2009 at 11:49 AM, Dennis Haarbrink wrote:
> The most important thing is that it is _optional_, you have the _choice_ to
> use it.

Sorry to go a little bit off-topic, but this idea has been bandied
around a lot over the last week, and I'd like to respond to it.

Although the proposed checks are optional, nobody has a "_choice_" to
use a new language feature. Though you may choose not to use a feature
in your code for whatever reason, it will appear in code from your
colleagues, clients, or upstream libraries. Every feature in PHP will
at some point be put in front of every PHP developer.

As a result, the argument that "you don't like it, so what, you have a
choice not to use it" is simply incorrect. Though at some point the
majority may have to overrule the views of the minority, please do not
dismiss them out of hand with this logic.


Thanks,
Paul

-- 
Paul Biggar
paul.big...@gmail.com




Re: [PHP-DEV] Type hinting - Request for Discussion

2009-07-09 Thread Paul Biggar
On Thu, Jul 9, 2009 at 10:46 AM, Zeev Suraski wrote:
> I believe that
> if we had a 'clean' Weak typing RFC as well as a Strict typing RFC, each
> with its pros and cons - there would be very different results.
>
> The two approaches were never compared head-to-head, with the pros and cons
>
> Another is that the difference between the
> two approaches, as well as the issues with strict typing - aren't fully
> understood by the audience - and I believe that this is actually the case.
>
> the
> discussion missed key ingredients, we should discuss it further.

I think you are right that the discussions weren't sufficiently clear,
and this could be resolved by better RFCs. This is obviously how it
should have been done in the first place (look at the success of
Python's PEPs).

I'm not sure this will change things, and worry it might derail what
we have now, but those are not good reasons to stop the discussion.


Thanks,
Paul


-- 
Paul Biggar
paul.big...@gmail.com




Re: [PHP-DEV] Type hinting - Request for Discussion

2009-07-09 Thread Paul Biggar
Hi Zeev,

I've been keeping a very close eye, and have taken part in a good few
discussions, including putting my own idea on the table.

On Thu, Jul 9, 2009 at 8:47 AM, Zeev Suraski wrote:
> The options as I see them:
>
> 1.  Do nothing.  I think the vote established that this is not what people
> want.

Agreed.


> 2.  Move forward with the semantics of the current patch.  This approach has
> substantial drawbacks in my opinion, and is a strong misfit with the rest of
> PHP (again, in my opinion).  See previous posts from Stas, Rasmus, myself
> and maybe others illustrating this point.  Lukas's 'Strict and Weak Typing
> RFC' (http://wiki.php.net/rfc/typecheckingstrictandweak) has some good
> insight as to why strict typing is problematic in PHP.

I believe nearly all opinions voiced have wanted the current patch.
Many more were in favour of strict typing only, but for the few
dissenters, like myself, Ilia included casting semantics using (int)
syntax.

There are I think maybe 5 or 6 people who have expressed opinions
against this. All others have been pro (I'm not counting the vote
here, I'm looking at discussions).


> 3.  Implement 'weak' typing.  What does that mean?  Conversion rules will be
> very similar to those for internal functions, but slightly stricter (like I
> said, if we were to rewrite the conversion rules for internal functions
> today - we'd probably make them slightly stricter, too).  For example, if
> you denote that an argument is supposed to be an integer, then a string that
> looks like an integer (e.g. "17") would be silently converted to an integer,
> however, a string that does not look like an integer (e.g. "foo") will error
> out.  We need to come up with a complete pass/fail table, but that would be
> the theme.  Unlike option #2, this fits the rest of PHP quite well (except
> for minor inconsistencies with internal functions - but these are dwarfed in
> comparison to the inconsistencies of option #2).

This seems like a good idea, but others disagree. I wrote a set of
rules for this, which are very much how you describe them: slightly
stricter than casting, but weaker than strict typing. In the
discussion, there was very little support, and so I withdrew it.
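
As a rough illustration of the "slightly stricter than casting" theme,
here is a C sketch of a check that accepts a string only if the whole
string looks like an integer. It is not the rule set I posted and not
code from any patch: toy_weak_int is a hypothetical name, and using
strtol (which tolerates leading whitespace and a sign) is my own
simplification.

```c
#include <errno.h>
#include <stdlib.h>

/* Returns 1 and stores the value in *out when s is entirely a decimal
 * integer; returns 0 otherwise (the "error out" case). */
int toy_weak_int(const char *s, long *out)
{
    char *end;
    long v;

    if (s == NULL || *s == '\0') {
        return 0;            /* nothing to convert */
    }
    errno = 0;
    v = strtol(s, &end, 10);
    if (errno == ERANGE || end == s || *end != '\0') {
        return 0;            /* overflow, no digits, or trailing garbage */
    }
    *out = v;
    return 1;
}
```

With this sketch "17" converts and "foo" is rejected, as in the quoted
description. Whether " 17" or "17 " should pass is exactly the kind of
detail a complete pass/fail table has to settle, and where reasonable
people start to disagree.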


> Two other issues that we need to revisit:
> 1.  Agree on when it will be implemented - I think it's pretty clear it
> should not go to 5.3.

The last discussion certainly made this clear. It should be cleared up
whether people want this in 5.4 (if it happens). I think it's
established that people want this in 6.


> 2.  Make sure people understand *why* we're implementing it - performance is
> certainly *not* the reason.

I believe there were only two misguided souls who believed this would
lead to better performance. Everyone else wanted it on its merits.


> Before moving ahead with votes or committing the code, I suggest we focus on
> the pros & cons of options #2 and #3.  Since the inconsistencies introduced
> by option #2 are very substantial (between internal and userland functions,
> and also how values behave in PHP in general), I suggest we focus on use
> cases where option #2 would be truly needed and superior to option #3.  In
> other words - what is the added value and is it worth the price?

It doesn't sound like you are aware of the casting semantics that Ilia
added. They are largely consistent with the internal function
semantics.

function x (int $x) { /* fail on non-int */ }
function x ((int) $x) { /* cast parameter to an int, just like internals functions */ }

Indeed this adds the ability to make your function work like an
internals function (useful perhaps for library interfaces).
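
The two behaviours can be sketched in C on a toy tagged value. This is
only an illustration of the semantics as I understand them, not code
from Ilia's patch: the toy_* names are hypothetical, and the string
conversion via strtol is an approximation of PHP's cast rules.

```c
#include <stdlib.h>

typedef enum { TOY_LONG, TOY_DOUBLE, TOY_STRING } toy_type;

typedef struct {
    toy_type type;
    union { long l; double d; const char *s; } u;
} toy_val;

/* Strict hint: succeed only when the value already is an integer. */
int toy_strict_int(const toy_val *v, long *out)
{
    if (v->type != TOY_LONG) {
        return 0;            /* fail on non-int */
    }
    *out = v->u.l;
    return 1;
}

/* Cast hint: accept anything and convert it to an integer. */
long toy_cast_int(const toy_val *v)
{
    switch (v->type) {
    case TOY_LONG:
        return v->u.l;
    case TOY_DOUBLE:
        return (long)v->u.d; /* truncates; assumes the value fits in a long */
    case TOY_STRING:
        return strtol(v->u.s, NULL, 10); /* leading digits, else 0 */
    }
    return 0;
}
```

The strict form has exactly one rule, and the cast form reuses
conversions people already know, so neither introduces a third set of
rules.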


> I think we can take Lukas's RFC and either change it or write something
> based on it for weak typing only.  If people here find it useful I'll go
> ahead and do that.

I believe people don't want this. I wrote a set of rules, Lukas wrote
the RFC, and neither got anything like the support that strict typing
got.

One large problem is that it introduced a new 3rd set of rules, and
this set is not intuitive but rather slightly subjective (reasonable
people might disagree on whether a bool should be weakly converted to
an int, for example). The strict type checks are at least simple, and
the current weak ones are what people are used to.


Obviously I'm not against discussion, but we just had this discussion.


Thanks,
Paul



-- 
Paul Biggar
paul.big...@gmail.com




Re: [PHP-DEV] Type hinting/casting request for vote

2009-07-08 Thread Paul Biggar
2009/7/8 Ilia Alshanetsky :
> All of the identified issues can be resolved and none of them represent a
> major challenge to address. However, if there is no consensus to put this in

OK, but you had not said you would resolve them. I would appreciate
some detail on what you will do to address them.


> the near future (which at this point is 5.3), I have hard time justifying
> spending further time on this. The original patch that was posted, that did
> break BC was far simpler and featureless, the changes since (which took a
> fair amount of work) were specifically made to address some of the main
> concerns that were on the list. I feel what is on the table right now is
> pretty close to what a final product could be, to have a vote on it. If

Personally, I had gotten the impression that this was the final
product, and that we could take it or leave it.


> decision is made to proceed within a practical release schedule, then
> suffice to say that I'd be more then happy to put further time to address
> the minor issues indicated.

I recommend:

 - stop the vote
 - address the issues (possibly deferring them until after voting, or whatever)
 - write an RFC
 - wait for Lukas to finish what he's doing
 - new vote, more options (5.3.x/5.4/6.0, Lukas'/yours, make it clear
what we're voting for)


Thanks,
Paul






-- 
Paul Biggar
paul.big...@gmail.com




Re: [PHP-DEV] Type hinting/casting request for vote

2009-07-07 Thread Paul Biggar
2009/7/7 Johannes Schlüter :
> On Mon, 2009-07-06 at 20:52 -0400, Ilia Alshanetsky wrote:
>> Last week or so there was a fairly detailed discussion on the
>> internals list regarding type hinting based on my original patch.


> Having an "old" 5.3 extension with a typehint expecting an array
> arg_info.array_type_hint will be set to 1.
> The newly compiled engine with this patch will then do
>
> +               /* existing type already matches the hint or forced type */
> +               if (Z_TYPE_P(arg) == cur_arg_info->array_type_hint || 
> Z_TYPE_P(arg) == (cur_arg_info->array_type_hint ^ (1<<7))) {
>
> as it's main type check, but Z_TYPE_P(arg) will be IS_ARRAY (5) which
> doesn't match the 1 provided by the old extension, other checks in there
> will fail too, so the valid param will be rejected whereas an integer
> (IS_LONG 1) will be accepted where the extension expects an array.

I raised this in my review, to which Ilia replied "It should be fine"
(http://news.php.net/php.internals/44707). I would not have thought it
would be fine.

I had been thinking that Ilia would have to hack it to make 1 mean
array in this case, which would be ugly, but workable. Based on the
arguments in this thread, I believe it shouldn't go into 5.3 at all.
Are we allowed to break the ABI for 5.4? (I would think so, but I'm
not sure.)
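
The mismatch Johannes describes can be reproduced with a small C
sketch. This is a toy model, not the patch itself: the toy_* names are
hypothetical, and the type codes are taken from the quoted message
rather than checked against the headers. All that matters is that the
integer code equals 1 and the array code does not.

```c
/* Type codes as stated in the quoted message (illustrative only). */
enum { TOY_IS_LONG = 1, TOY_IS_ARRAY = 5 };

/* Same shape as the quoted condition: accept when the argument's type
 * equals the hint, or the hint with bit 7 flipped (the forced form). */
int toy_type_matches(int arg_type, int hint)
{
    return arg_type == hint || arg_type == (hint ^ (1 << 7));
}

/* The "ugly, but workable" hack: read a legacy hint of 1 as "array". */
int toy_type_matches_compat(int arg_type, int hint)
{
    if (hint == 1) {
        hint = TOY_IS_ARRAY;
    }
    return toy_type_matches(arg_type, hint);
}
```

With an old extension passing 1, toy_type_matches rejects an array and
accepts an integer, which is the reported bug. The compat variant fixes
that case, but in this sketch it can then no longer express a genuine
integer hint as 1, which is why I call it ugly.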



Overall, I'm very disappointed with the way this has been conducted.
When reviews were posted they were not replied to (Stas posted
http://news.php.net/php.internals/44710, I posted
http://news.php.net/php.internals/44706, and I don't see any replies
except a cursory response to mine). Furthermore:
  - the RFC process has been wilfully ignored (despite multiple requests)
  - a vote was asked for when Lukas was still trying to discuss his proposal
  - the vote was take it or leave it
  - there has been a general attitude of "throwing the toys out of the pram"


I am mostly for the patch, and I 100% support the idea. However, I
feel I have to vote against it, and urge others to do the same, until
the entire mess is rectified.

Ilia, I respect the work you have put into this, but I would ask you
to withdraw the patch and the vote until these things have been sorted
out.


-1

Thanks,
Paul



-- 
Paul Biggar
paul.big...@gmail.com




Re: [PHP-DEV] Type hinting/casting request for vote

2009-07-07 Thread Paul Biggar
On Tue, Jul 7, 2009 at 6:41 PM, Lukas Kahwe Smith wrote:
> B) foo($bar);
> function bar((int) $bar) {}
>
> What am I really saying with B)?
> I don't care what you give me, I am going to use it as an int anyways?

Exactly. Very simple. I would phrase it as "I'll accept anything and
cast it to an int".


> Of course I am also quite opposed to sticking this into 5.3.

On which grounds? If you don't like the feature, please cast a -1
vote. If it's because of the BC problems, I believe you and Johannes
have veto power on what goes into 5.3.x? If so, do you intend to use
it?


> Finally I would like to ask to rename this entire feature (including what we
> currently already have) to "type checking" or something other than "hint" in
> the documentation.

Seconded.


Thanks,
Paul


-- 
Paul Biggar
paul.big...@gmail.com




Re: [PHP-DEV] Re: Type hinting/casting request for vote

2009-07-07 Thread Paul Biggar
On Tue, Jul 7, 2009 at 3:53 PM, Greg Beaver wrote:
> 2) have 1 syntax, T_*_CAST as in function ((int) $a) {} and have that
> mean strict type hinting.

I don't think this is a good option. Having features dictated by trying
to keep the lexer/parser clean is a bad idea. If the lexer cannot be
fixed, I think the better option is to ruin things for people calling
things Bool, rather than compromising the feature. (I am assuming that
people are still able to call things \mynamespace\bool -- if not, that
changes things).

Thanks,
Paul

-- 
Paul Biggar
paul.big...@gmail.com




Re: [PHP-DEV] Re: Type hinting/casting request for vote

2009-07-07 Thread Paul Biggar
On Tue, Jul 7, 2009 at 3:15 PM, Stan Vassilev wrote:
> I'm really puzzled why a non-essential, and for the past months (years?)
> controversial and always rejected feature such as strict type hints, has
> everyone turning 180 degrees and putting it in a minor release in the course
> of a week.

Because nearly everyone wants it, and people think it might actually
get in this time. It was never rejected, people simply couldn't agree
on what it was they wanted, so it never got in. Nearly there,
though...


Paul


-- 
Paul Biggar
paul.big...@gmail.com




Re: [PHP-DEV] weak and strict type checking RFC

2009-07-06 Thread Paul Biggar
Hi Lukas,

On Mon, Jul 6, 2009 at 11:03 PM, Lukas Kahwe Smith wrote:
> Ok, I have updated the RFC now with a table that shows that I expect to pass
> and fail. Its fairly late, so I might have missed something. In general what
> I am proposing is more lax than is_*() for the most part. Especially when it
> comes to checking strings.

I hope you have missed some things (or that they are typos) because
otherwise a good chunk of this is plain lunacy.

value             string  float  int   numeric  scalar  bool  array
0 (integer)       fail    fail   pass  pass     pass    pass  fail
1 (integer)       fail    pass   pass  pass     pass    pass  fail

0 fails conversion to a float, but 1 and 12 succeed?


12 (double)       fail    pass   pass  pass     pass    fail  fail

It may seem like a good idea for 12.0 to pass the int check, but
what happens when 12.0 passes and the result of a computation such as
144.0/12 does not (since it might not be exactly 12.0 due to
floating-point error)?
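
A small C sketch makes the concern concrete, assuming IEEE 754 doubles.
The toy_* names are hypothetical. Note that 144.0/12 itself happens to
be exact, so the sketch uses (0.1 + 0.7) * 10, a well-known case where
the result looks like 8 on paper but is stored as slightly less.

```c
/* Returns 1 when d has no fractional part, i.e. when a rule of the form
 * "a float with an integral value passes the int check" would accept it. */
int toy_is_integral(double d)
{
    /* Guard the range so the conversion to long long is well defined. */
    if (!(d > -1e15 && d < 1e15)) {
        return 0;
    }
    return (double)(long long)d == d;
}

/* Computes (0.1 + 0.7) * 10. The volatile stores force each intermediate
 * result to be rounded to double precision. */
double toy_eight_ish(void)
{
    volatile double sum = 0.1 + 0.7;     /* slightly less than 0.8 */
    volatile double product = sum * 10;  /* slightly less than 8   */
    return product;
}
```

Here toy_is_integral(12.0) accepts, while toy_is_integral(toy_eight_ish())
rejects, and truncating the latter gives 7 rather than 8. A check that
depends on how a float was computed is hard to explain to users.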




'0' (string)      pass    fail   fail  pass     pass    pass  fail
'1' (string)      pass    fail   fail  pass     pass    pass  fail
'12' (string)     pass    pass   pass  pass     pass    fail  fail

Absolute lunacy. Please let this be a typo.

'12.0' (string)   pass    pass   pass  pass     pass    fail  fail
'12.34' (string)  pass    pass   fail  pass     pass    fail  fail

As above.


I think you need to present this information better. One advantage of
Ilia's proposal is that it is very clear. It does two things: strong
type check, or the same casts that currently exist. I think you need
to say what changes you are introducing, and why they are introduced.
The same flaw existed with my proposal: I don't think anyone wants a
3rd set of rules.


> I do not understand why its suddenly so urgent to get this feature in(*),
> that people already speak about frustration over the process after just a

I think it's because this same issue has been going on for so long, and
seems to be so very close now. This idea has been punted around in
various forms and patches for years at this stage.



> few days. We dont need years and usually not months, but this is not the
> addition of some function. Its an extension to the language syntax, so I
> think its totally normal that we talk about this for at least a month.

Well yes. But with near consensus, there is a danger of a 90%
good-enough patch being derailed by new proposals, and I get the
impression most people would be happier with the 90% patch.


> shouldn't we
> rather talk about finding a better release process (maybe build on top of
> recent developments in the version control world) that enables us to more
> quickly get x.y releases out without preventing bigger features like unicode
> from ever maturing?

I've heard you mention this before. Please roll it into an RFC so we
can think about it (FWIW, the idea that newer version control systems
will somehow change everything makes little sense, so I think a lot of
detail is required here).


Thanks,
Paul

--
Paul Biggar
paul.big...@gmail.com




Re: [PHP-DEV] weak and strict type checking RFC

2009-07-04 Thread Paul Biggar
On Sat, Jul 4, 2009 at 7:12 PM, Lukas Kahwe Smith wrote:
>> I can't see the difference between your proposal and the conclusion I
>> reached yesterday?
>>
>> (which was that there is a near consensus around strict checks by
>> default, with casts allowed with some syntax).
>
> Well to me it Sounded like you wanted to Rely on Standard Type juggling and
> what i am proposing is more strict than that. More over i am Not convinced
> that strict should Be the Default.

I don't know what you mean by standard type-juggling. Your proposal
really does not outline what you want very much, just what you're
against. As for strictness, if your proposal suggests that strict
typing is the default, I cannot see where.

As I see it, each proposal is a very minor variation on the other. My
proposal had an extra layer, but it wasn't well received, so I've
withdrawn it.

Ilia has a patch now that does what I understand you want, using the
(int) syntax, with strict by default. I may be wrong, but I believe
the only thing left to argue about is strict versus weak by default. I
was originally of the opinion that weak typing should be the default.
However, it had barely any support, whereas there was great support
for strict by default.


Thanks,
Paul


-- 
Paul Biggar
paul.big...@gmail.com




Re: [PHP-DEV] RFC: Type hinting revisited for PHP 5.3

2009-07-04 Thread Paul Biggar
the hint or forced type */
+   if (Z_TYPE_P(arg) == cur_arg_info->array_type_hint || Z_TYPE_P(arg) == (cur_arg_info->array_type_hint ^ (1<<7))) {
+   return 1;
+   }
+
+   /* NULL type give, check if parameter is optional */

I can't parse this comment.







+   case IS_NUMERIC:
+   switch (Z_TYPE_P(arg)) {
+   case IS_STRING:
+   if (is_numeric_string(Z_STRVAL_P(arg), Z_STRLEN_P(arg), NULL, NULL, 0)) {
+   return 1;
+   } else {
+   goto type_error;
+   }
+   break;
+   case IS_BOOL:
+   case IS_LONG:
+   case IS_DOUBLE:
+   return 1;
+   default:
+   goto type_error;
+   }
+   break;

I don't think bool should be in "numeric".


Index: Zend/zend.h
===
RCS file: /repository/ZendEngine2/zend.h,v
retrieving revision 1.293.2.11.2.9.2.37
diff -u -p -a -d -u -r1.293.2.11.2.9.2.37 zend.h
--- Zend/zend.h 17 Jun 2009 08:55:23 -  1.293.2.11.2.9.2.37
+++ Zend/zend.h 4 Jul 2009 17:20:50 -
@@ -536,6 +536,16 @@ typedef int (*zend_write_func_t)(const c
+/* used for forcing method/function parameter type */
+#define FORCE_BOOL (IS_BOOL | (1<<7))
+#define FORCE_STRING   (IS_STRING | (1<<7))
+#define FORCE_LONG (IS_LONG | (1<<7))
+#define FORCE_DOUBLE   (IS_DOUBLE | (1<<7))
+#define FORCE_ARRAY    (IS_ARRAY | (1<<7))

Can we have a macro for 1 << 7? It's used in quite a few places.
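
A minimal sketch of what such a macro could look like. The names below (SK_FORCE_TYPE_FLAG, SK_FORCE_TYPE and the two helpers) are hypothetical and not part of the patch, and the SK_IS_* ids are local stand-ins so the snippet compiles on its own rather than the real values from Zend/zend.h.

```c
#include <assert.h>

/* Stand-in type ids for illustration only; not the Zend definitions. */
enum { SK_IS_LONG = 1, SK_IS_DOUBLE = 2, SK_IS_BOOL = 3, SK_IS_ARRAY = 4, SK_IS_STRING = 6 };

/* Hypothetical name for the bit the patch spells as (1<<7). */
#define SK_FORCE_TYPE_FLAG (1u << 7)

/* Build a "forced" hint from a plain type id. */
#define SK_FORCE_TYPE(type) ((unsigned char)((type) | SK_FORCE_TYPE_FLAG))

#define SK_FORCE_BOOL   SK_FORCE_TYPE(SK_IS_BOOL)
#define SK_FORCE_STRING SK_FORCE_TYPE(SK_IS_STRING)
#define SK_FORCE_LONG   SK_FORCE_TYPE(SK_IS_LONG)
#define SK_FORCE_DOUBLE SK_FORCE_TYPE(SK_IS_DOUBLE)
#define SK_FORCE_ARRAY  SK_FORCE_TYPE(SK_IS_ARRAY)

/* Nonzero if the hint carries the force flag. */
static int sk_hint_is_forced(unsigned char hint)
{
    return (hint & SK_FORCE_TYPE_FLAG) != 0;
}

/* Strip the force flag to recover the underlying type id. */
static unsigned char sk_hint_base_type(unsigned char hint)
{
    unsigned char base = (unsigned char)(hint & ~SK_FORCE_TYPE_FLAG);
    assert(base < SK_FORCE_TYPE_FLAG); /* the flag bit is cleared */
    return base;
}
```

With helpers like these, a check such as the `^ (1<<7)` comparison in the patch could be written in terms of sk_hint_base_type(), which keeps the magic number in one place.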


Index: Zend/zend_language_parser.y
===
RCS file: /repository/ZendEngine2/zend_language_parser.y,v
retrieving revision 1.160.2.4.2.8.2.35
diff -u -p -a -d -u -r1.160.2.4.2.8.2.35 zend_language_parser.y
--- Zend/zend_language_parser.y 26 Mar 2009 12:37:17 -  1.160.2.4.2.8.2.35
+++ Zend/zend_language_parser.y 4 Jul 2009 17:20:50 -
@@ -128,6 +128,14 @@
 %token T_DOUBLE_ARROW
 %token T_LIST
 %token T_ARRAY
+%token T_BOOL_HINT
+%token T_STRING_HINT
+%token T_INT_HINT
+%token T_DOUBLE_HINT
+%token T_RESOURCE_HINT
+%token T_NUMERIC_HINT
+%token T_SCALAR_HINT
+%token T_OBJECT_HINT
 %token T_CLASS_C
 %token T_METHOD_C
 %token T_FUNC_C

Can you use T_BOOL_CHECK etc instead of T_BOOL_HINT?




@@ -661,10 +682,10 @@ lexical_vars:
  lexical_var_list:
-   lexical_var_list ',' T_VARIABLE { zend_do_fetch_lexical_variable(&$3, 0 TSRMLS_CC); }
-   |   lexical_var_list ',' '&' T_VARIABLE { zend_do_fetch_lexical_variable(&$4, 1 TSRMLS_CC); }
-   |   T_VARIABLE  { zend_do_fetch_lexical_variable(&$1, 0 TSRMLS_CC); }
-   |   '&' T_VARIABLE  { zend_do_fetch_lexical_variable(&$2, 1 TSRMLS_CC); }
+   lexical_var_list ',' T_VARIABLE { zend_do_fetch_lexical_variable(&$3, 0 TSRMLS_CC); }
+   |   lexical_var_list ',' '&' T_VARIABLE { zend_do_fetch_lexical_variable(&$4, 1 TSRMLS_CC); }
+   |   T_VARIABLE  { zend_do_fetch_lexical_variable(&$1, 0 TSRMLS_CC); }
+   |   '&' T_VARIABLE  { zend_do_fetch_lexical_variable(&$2, 1 TSRMLS_CC); }
 ;

I can't see what was changed here.


Index: Zend/zend_language_scanner.l
===
RCS file: /repository/ZendEngine2/zend_language_scanner.l,v
retrieving revision 1.131.2.11.2.13.2.40
diff -u -p -a -d -u -r1.131.2.11.2.13.2.40 zend_language_scanner.l
--- Zend/zend_language_scanner.l 5 May 2009 01:35:44 -   1.131.2.11.2.13.2.40
+++ Zend/zend_language_scanner.l 4 Jul 2009 17:20:50 -
@@ -1158,6 +1158,38 @@ NEWLINE ("\r"|"\n"|"\r\n")
return T_ARRAY;
 }

+("bool"|"boolean") {
+   return T_BOOL_HINT;
+}
+
+("string"|"binary"|"unicode") {
+   return T_STRING_HINT;
+}

Someone asked on your last patch about that "unicode", with relation
to 5.3. I think it might be a nice idea for forward compatibility, so
no objections, but I wanted to ask your plan for 5.3 with this.



+"object" {
+   return T_OBJECT_HINT;
+}

Great.


There is a good argument for allowing "mixed", and a tiny argument for
allowing "unset"/"null". It would be great if you could add these. I
think that "callback" would be too hard, but if anyone comes up with
an easy way, that would be cool too.


Thanks for all your work on this,
Paul




-- 
Paul Biggar
paul.big...@gmail.com




Re: [PHP-DEV] weak and strict type checking RFC

2009-07-04 Thread Paul Biggar
Hi Lukas,

On Sat, Jul 4, 2009 at 7:20 AM, Lukas Kahwe Smith wrote:
> Last evening I put together a quick proposal for a weak and strict checking
> approach, since I felt that things were being concluded a bit prematurely.
> More importantly I detailed the issues I see with a pure strict type
> checking only approach.


I can't see the difference between your proposal and the conclusion I
reached yesterday?

(which was that there is a near consensus around strict checks by
default, with casts allowed with some syntax).


Thanks,
Paul


-- 
Paul Biggar
paul.big...@gmail.com




Re: [PHP-DEV] Re: Flexible type hinting

2009-07-03 Thread Paul Biggar
Hi Lukas,

On Fri, Jul 3, 2009 at 2:35 PM, Lukas Kahwe Smith wrote:
> hmm not sure that the consensus of a "weak" check would be an automatic
> cast. Actually that wouldnt be much of a check in that case. I think the
> other side is more asking for what Ilia already begun with "numeric" in the
> sense of a weak checker. After the check it could go ahead and cast too for
> all I care, but it shouldnt be the equivalent of a cast. But maybe you are
> not implying that and I am just getting confused by the syntax.

I think people wanted an automatic cast for the weak check. The idea
would be that if you hinted "(int)", you could be guaranteed that you
got an int. The syntax was suggested (I believe, it wasn't my
suggestion) to look like casting.

Thanks,
Paul




-- 
Paul Biggar
paul.big...@gmail.com




Re: [PHP-DEV] RFC: Type hinting revisited for PHP 5.3

2009-07-03 Thread Paul Biggar
Hi Ilia,

On Wed, Jul 1, 2009 at 7:07 PM, Stanislav Malyshev wrote:
>> The patch is available here: http://ia.gd/patch/type_hint_53.txt
>
> Technical comment: as this patch changes binary API this shouldn't happen in
> 5.3 branch. So maybe it's better to make it for 6.


Index: Zend/zend_compile.h
===
RCS file: /repository/ZendEngine2/zend_compile.h,v
retrieving revision 1.316.2.8.2.12.2.40
diff -u -p -a -d -u -r1.316.2.8.2.12.2.40 zend_compile.h
--- Zend/zend_compile.h 5 Jun 2009 23:20:59 -   1.316.2.8.2.12.2.40
+++ Zend/zend_compile.h 1 Jul 2009 16:45:02 -
@@ -175,7 +175,7 @@ typedef struct _zend_arg_info {
zend_uint name_len;
const char *class_name;
zend_uint class_name_len;
-   zend_bool array_type_hint;
+   zend_uint type_hint;
zend_bool allow_null;
zend_bool pass_by_reference;
zend_bool return_reference;


I think you could make this work for 5.3, if it used the old
"zend_bool array_type_hint". A zend_bool is 8 bits, so that's plenty.
It would be a little bit messy, but I'm fairly confident it could be
made work.
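
A rough sketch of the idea, under stated assumptions: sk_bool and sk_uint stand in for zend_bool and zend_uint, the structs reproduce only the fields visible in the diff above, and SK_FORCE_FLAG, SK_MAX_TYPE_ID and sk_pack_hint() are hypothetical names. It illustrates why widening the field shifts the members after it (presumably the binary API concern), while packing a type id plus one flag bit into the existing 8-bit field would not.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned char sk_bool; /* stand-in for zend_bool (8 bits) */
typedef unsigned int  sk_uint; /* stand-in for zend_uint */

/* Part of the struct as it looks before the patch (fields from the diff). */
struct sk_arg_info_old {
    sk_uint     name_len;
    const char *class_name;
    sk_uint     class_name_len;
    sk_bool     array_type_hint;
    sk_bool     allow_null;
    sk_bool     pass_by_reference;
    sk_bool     return_reference;
};

/* The same fields with the hint widened, as the patch does. */
struct sk_arg_info_new {
    sk_uint     name_len;
    const char *class_name;
    sk_uint     class_name_len;
    sk_uint     type_hint;
    sk_bool     allow_null;
    sk_bool     pass_by_reference;
    sk_bool     return_reference;
};

#define SK_FORCE_FLAG  0x80u /* hypothetical name for (1<<7) */
#define SK_MAX_TYPE_ID 0x7Fu /* largest id that leaves the flag bit free */

/* Pack a type id and the force flag into the existing 8-bit field. */
static sk_bool sk_pack_hint(unsigned int type_id, int forced)
{
    assert(type_id <= SK_MAX_TYPE_ID);
    return (sk_bool)(type_id | (forced ? SK_FORCE_FLAG : 0u));
}
```

In the old layout allow_null sits one byte after the hint; in the widened layout it moves by sizeof(sk_uint), which is the kind of shift that already-compiled extensions would not expect.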


Thanks,
Paul



-- 
Paul Biggar
paul.big...@gmail.com




Re: [PHP-DEV] RFC: Type hinting revisited for PHP 5.3

2009-07-03 Thread Paul Biggar
Hi Ilia,

Your patch doesn't support a "null" (or maybe it should be called
"unset") type check. Its uses would be rare, but I think it should be
present for completeness.

Thanks,
Paul



On Wed, Jul 1, 2009 at 5:59 PM, Ilia Alshanetsky wrote:
> There has been quite a bit of discussion on this list, IRC, developer
> meetings, etc... about introduction of type hinting to PHP. Most people
> appear to think that this would be a good idea, but there is a reason why it
> is not in PHP already. The main source of conflict appears to be that in
> some cases typical type hinting is just too strict for PHP's typeless nature
> (most people expect that "1" == 1, while int type hint would definitely
> reject string "1").  Personally, I disagree with that opinion, but I can
> understand people who raise that issue. At work we've been using PHP 5.2
> with type hinting for nearly 2 years now with great success, it makes code
> much easier to read and understand and the security benefit of type hinting
> is not to be under valued. In many cases type hinting can present a last
> line of defense against unexpected input for numeric fields, which are
> typically abused to do SQL injection.
>
> I've taken a few hours this morning to port my 5.2 type hinting patch to
> 5.3. In recognition of a need for a more 'flexible' numeric type I've
> introduced (numeric) type hint that would allow bool/int/float data types as
> well as a string containing a numeric entity as identified by
> is_numeric_string(). For completion i've also added (scalar) data type that
> will allow any scalar data element.
>
> The patch is available here: http://ia.gd/patch/type_hint_53.txt
>
> It should be noted that this patch is fully compatible with opcode caches
> and and requires no changes on the part of an opcode cache such as APC to
> work.
>
> My hope is that the latest changes will allow this to become a standard part
> of PHP.
>
> Ilia Alshanetsky
>
> P.S.
>
> It should be noted that this is not the first idea for type hints, that
> credit goes to Hannes Magnusson who had posted a similar patch on the
> internals list back in 2006. Also, back in 2008 Felipe Pena wrote a type
> hinting patch for PHP that is available on wiki.php.net.



-- 
Paul Biggar
paul.big...@gmail.com




[PHP-DEV] Re: Flexible type hinting

2009-07-03 Thread Paul Biggar
Hi Ilia,

On Thu, Jul 2, 2009 at 3:35 AM, Paul Biggar wrote:
> Thanks to Ilia for getting the ball rolling on scalar type hinting.
> I believe I have a solution that caters to each crowd, without being
> too complicated.

My impression is that there is only minor support for the "flexible"
approach, and that even then, everyone prefers strict checking by
default. So the ideal seems to be your patch, with the addition of

   function x ((int) $casted_parameter) { ... }

Since this can be built as an extra step on top of your patch, this
can be added later (although it would obviously be great if you
preferred to add it now...). I recommend you proceed with the next
step of getting your patch accepted (I presume an RFC or something).


Thanks,
Paul

-- 
Paul Biggar
paul.big...@gmail.com




Re: [PHP-DEV] Re: Flexible type hinting

2009-07-02 Thread Paul Biggar
On Thu, Jul 2, 2009 at 3:17 PM, Jonathan Bond-Caron wrote:
> From userland, I'm a big fan of this proposal / agree to the principle 
> although I'm not convinced that
>
> function($quantity)
> function(int $quantity)
> function(+int $quantity)
> function(-int $quantity)
>
> looks appropriate...
>
> Userland Note: The "-int" semantics confuses me, I realize it would cast 
> null's etc... but is it really a requirement? Can it be dropped?

Yes, it looks off, doesn't it? I quite like ~int, which you suggested
below, for the casting. It makes me think "this is kinda an int".


> function($quantity)
> function(int $quantity)  paul's (S) casting
> function(int! $quantity) STRICT type  --- seems appropriate for php, follows 
> CSS/web style '!important'

Yes, I like that too.


> To me, strict as default doesn't seem appropriate in php. But if it does, 
> this syntax could be interesting:
>
> function($quantity)
> function(int $quantity)  STRICT type
> function(~int $quantity) paul's (S) casting

This is also a nice, internally-consistent syntax.


If you had something for (H) in your suggestions, I'd be happier. We
need to cast there too.


Thanks,
Paul


-- 
Paul Biggar
paul.big...@gmail.com




Re: [PHP-DEV] Re: Flexible type hinting

2009-07-02 Thread Paul Biggar
On Thu, Jul 2, 2009 at 1:53 PM, Ilia Alshanetsky wrote:
> Paul's proposal is some part does not make sense because it allows weak type
> hinting, which should not be used if you need type hinting. The whole idea
> about type hinting is definition of strict interfaces, not loosely based
> one. That's just my opinion, which admittedly I feel fairly strongly about.

No, the idea of type hinting is that you hint to the user about what
type the interface expects. You have implemented "strong type
checking" instead. I wonder that you cannot see the conflict between
the term "type hinting" and "strict interfaces". I would be happier if
you renamed your patch "optional strong scalar type checking", which
is what it is.

I have described a system that allows "strong type checking", "type
hinting", and "type casting", which I think is all the use cases.

Thanks,
Paul

-- 
Paul Biggar
paul.big...@gmail.com




[PHP-DEV] Re: Flexible type hinting

2009-07-02 Thread Paul Biggar
On Thu, Jul 2, 2009 at 1:43 PM, Ilia Alshanetsky wrote:
>
> On 2-Jul-09, at 4:45 AM, Paul Biggar wrote:
>> I think you might not have read what I suggested (it is different than
>> the one I emailed to you privately). What you want is fully supported.
>> If you must be passed an int, use the +int hint.
>
> I'd rather use -int, then +int and make people who want loose typing do the
> extra bit, because native typing should be strict.

I think everyone wants their favourite to be default. I was eager to
have the typing that's currently in the manual be the default, since
that's what people are used to seeing. I suggested that strong type
checks use +int exactly because that's different to what's in the
manual, so there would be no confusion.

I'm really looking to get people to agree to the principle that we
would like to be able to hint every signature without large changes to
the manual, since we've been looking at it for years.



Thanks,
Paul




-- 
Paul Biggar
paul.big...@gmail.com




[PHP-DEV] Re: Flexible type hinting

2009-07-02 Thread Paul Biggar
On Thu, Jul 2, 2009 at 5:26 AM, Ilia Alshanetsky wrote:

> 1) Strict type hinting helps to solve bugs, both the ones made out of
> careless/missing validation as well as subtle logic bugs that often take
> hours to resolve. I can tell you that within a week of implementing type
> hints we've been able to identify 30-40 bugs within a period of day. Many of
> which would not have been detected with "flexible" type hints that Paul is
> suggesting here is one example:

I think you might not have read what I suggested (it is different than
the one I emailed to you privately). What you want is fully supported.
If you must be passed an int, use the +int hint.



> type. The code also includes bits from PEAR and external libs like fpdf and
> guess what those have no type hints and they work along side with type
> hinted code without any issues.

It is instructive that PEAR could not use your proposed hints. The
"flexible" system would work fine though.



> situations. Of all the replies I see no objection so far and the only
> complaint (funnily enough) is about existence of IS_NUMERIC.

Yes. Nobody wants numeric. It doesn't hint at anything.


> I do not wish to start a flame war or arrive at a wishy washy compromise
> that does not provide a solid solution. If the majority disagrees with the

I think my "flexible" system is not a wishy washy compromise (the one
I sent you by private email was). I think rather it has all the
advantages you want, all the advantages I want, and even supports what
Stas wants.



> to work in a future with a library/framework that is strict about its input
> or some far fetched idea that it will change the very nature of PHP.

I don't think we are worried about it changing PHP, or about libraries
using strict type hints. We are worried that libraries will use no
hints, because the ones on offer are not useful to them.



Please take a read over my full proposal. I think you'll find that
it supports everyone's features without compromise.

Thanks,
Paul


-- 
Paul Biggar
paul.big...@gmail.com




Re: [PHP-DEV] Re: Flexible type hinting

2009-07-02 Thread Paul Biggar
On Thu, Jul 2, 2009 at 9:27 AM, Stanislav Malyshev wrote:
> There is a functionality handling default values (or, more precisely,
> optional arguments) with inheritance, etc.

Are optional argument specifications inherited? I did not think they
were, but it's not clear if you are saying they are.

My understanding of PHP's optional arguments is that they only apply
to a function once it is called.

> And it is doing something - if
> you declare interface as foo(int $a) and implement it as foo(string $a)
> there may be a lot of WTF happening.

I mean that it is never called, therefore the type hints are not
checked, and this is fine.

Paul


-- 
Paul Biggar
paul.big...@gmail.com




Re: [PHP-DEV] Re: Flexible type hinting

2009-07-02 Thread Paul Biggar
On Thu, Jul 2, 2009 at 8:28 AM, Stanislav Malyshev wrote:
> Also, looking at the patch I think it doesn't cover the matter of inheriting
> the typehinted methods - i.e. if there's a typehinted method, could I
> override it with non-typehinted version or vice versa? What about typehinted
> interfaces?

I don't think we need to worry about this. Consider it to be the same
as default values. So yes, yes, allowed, but they don't do anything since
those functions are never called.


Paul


> --
> Stanislav Malyshev, Zend Software Architect
> s...@zend.com   http://www.zend.com/
> (408)253-8829   MSN: s...@zend.com
>



-- 
Paul Biggar
paul.big...@gmail.com




[PHP-DEV] Flexible type hinting

2009-07-01 Thread Paul Biggar
Hi folks,

Thanks to Ilia for getting the ball rolling on scalar type hinting.

It seems there are 3 camps:
  - (C) the type checking camp: "when I say 'int' I mean 'int'". This
is what Ilia's patch does.
  - (H) the type hinting crowd: 'int' is a hint to the user that an
int is expected. This gels well with PHP's weakly typed scalars. I
think few people are in this crowd, but a lot of the (S) crowd are
mistakenly thought to be.
  - (S) the "sensible" middle: 'int' means an integer of course. The
manual is written somewhere between (S) and (H).

I believe I have a solution that caters to each crowd, without being
too complicated.

There are advantages and disadvantages to all of these:

 - The main disadvantage of each system is that it doesn't provide what
the other systems allow. Strong is too strong for many. Weak is too
weak for most.
 - Ilia had a very good point against (H), which is that many
functions return NULL or FALSE, and there are lots of errors when
these are automatically (and silently) converted to 0 or "". (H) will
not catch anything.
 - A strong argument against (C) is that it has no parallel with how
scalars are currently handled in PHP.
 - A (I think weak) argument for (C) is that this is how object type
hinting works.
 - An argument for (H)/(S) is that the manual has been written in this
style, using this syntax.
 - A good argument against (C) is that it cannot be used to hint PHP's
builtin functions.
 - The (C) crowd suggested numeric and scalar to the (H) crowd, but I
don't think they were impressed.
 - I don't think there is a strong case for a strongly typed bool.


Here is the solution:

By default, use (S). The semantics of (S) are roughly provided in a
table at the bottom. The idea is that for ints, we take "5", and 5,
and fail on "str", FALSE, resource, etc.

Allow a very easy way to get (C) and (H) using '+' and '-'. "+int"
means fail on anything but an int. This is (C). "-int" means "I expect
an int, but I'll take whatever you give me, and cast it to an int".
This is (H). (H) is for those times where neither (C) nor (S) are
suitable, which occurs in the standard library a lot. I hope that it
wouldn't be used much otherwise.

With each case, the function author can expect that if they ask
for X, they will get an X.

I think numeric isn't required anymore, which is good.

Example:

function add_user (+string $name, string $phone_number, int $age, +int
$friend_count, resource $photo) { ... }




We may bike shed for a while about the choice of +/- vs "strict int"
or "weak int", as well as some of the choices in (S). Let's argue about
the overall idea first, and get to specifics later.

If people like this, I can work on the patch.


Thanks,
Paul


* This is a suggested semantics for (S) 

Each line is in the form: "Run-time type -> type hint = result". You
may read "x -> y = z" as "an x passed to a hinted parameter y gives a
z". * means all types I didn't mention explicitly. ?? means reasonable
people may disagree. I would lean towards FAIL in these cases.


array -> array = array
* -> array = FAIL

numeric string -> int = cast to int
real -> int = cast to int
int -> int = int
* -> int = FAIL

int -> numeric = int
real -> numeric = real
string -> numeric = real/int
bool -> numeric = ??
* -> numeric = FAIL

int -> bool = bool
bool -> bool = bool
null -> bool = false
real -> bool = bool
string -> bool = bool
* -> bool = ??


null -> null = null
* -> null = FAIL


array -> scalar = FAIL
int -> scalar = int
bool -> scalar = bool
null -> scalar = null
real -> scalar = real
string -> scalar = string
resource -> scalar = FAIL
object -> scalar = FAIL
MyObj -> scalar = FAIL

* -> mixed = *

int -> real = real
real -> real = real
numeric string -> real = real
* -> real = FAIL

array -> string = FAIL
int -> string = string
bool -> string = FAIL
null -> string = FAIL
real -> string = string
string -> string = string
resource -> string = FAIL
object -> string = __toString() or FAIL


resource -> resource = resource
* -> resource = FAIL

object -> object = object
MyObj -> object = MyObj
* -> object = FAIL

MyObj -> MyObj = MyObj
* -> MyObj = FAIL



* This is a suggested semantics for (H) 

Whatever is passed will be cast to whatever you ask for, using
existing casting rules, even if that's stupid.


* This is a suggested semantics for (C)

If you ask for X, it must be X, except:
object with __toString() -> string = string

Anything else is FAIL (which I believe is an E_RECOVERABLE_ERROR).
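
To make the three modes concrete, here is a minimal sketch of how an 'int' hint might treat a value under (C), (S) and (H). Everything in it is an assumption for illustration: the sk_* names, the tiny value struct, and the numeric-string test (a crude strtod() check, not PHP's is_numeric_string()). The (H) branch uses strtol() as a stand-in for the existing cast rules rather than reproducing them, and range checks on the conversions are omitted.

```c
#include <assert.h>
#include <stdlib.h>

enum sk_type { SK_NULL, SK_BOOL, SK_INT, SK_REAL, SK_STRING };
enum sk_mode { SK_MODE_C, SK_MODE_S, SK_MODE_H };

struct sk_value {
    enum sk_type type;
    long         i; /* used for SK_BOOL and SK_INT */
    double       d; /* used for SK_REAL */
    const char  *s; /* used for SK_STRING */
};

/* Crude numeric-string test: the whole non-empty string must parse. */
static int sk_parse_numeric(const char *s, double *out)
{
    char *end;
    if (s == NULL || *s == '\0') {
        return 0;
    }
    *out = strtod(s, &end);
    return *end == '\0';
}

/* Returns 1 and stores an int in *out on success, 0 for FAIL. */
static int sk_check_int(enum sk_mode mode, const struct sk_value *v, long *out)
{
    double d;
    assert(v != NULL && out != NULL);

    if (v->type == SK_INT) {      /* every mode accepts a genuine int */
        *out = v->i;
        return 1;
    }
    if (mode == SK_MODE_C) {      /* (C): anything else fails */
        return 0;
    }
    switch (v->type) {
    case SK_REAL:                 /* (S) and (H): real -> cast to int */
        *out = (long)v->d;
        return 1;
    case SK_STRING:
        if (sk_parse_numeric(v->s, &d)) {
            *out = (long)d;       /* numeric string -> cast to int */
            return 1;
        }
        if (mode == SK_MODE_S) {
            return 0;             /* (S): non-numeric string fails */
        }
        *out = v->s ? strtol(v->s, NULL, 10) : 0; /* (H): cast anyway */
        return 1;
    case SK_BOOL:
        if (mode == SK_MODE_S) {
            return 0;             /* (S): "* -> int = FAIL" */
        }
        *out = v->i ? 1 : 0;
        return 1;
    case SK_NULL:
        if (mode == SK_MODE_S) {
            return 0;
        }
        *out = 0;
        return 1;
    default:
        return 0;
    }
}
```

Under this sketch the string "5" passes (S) and (H) but fails (C), while "str" fails (S) and (C) and is silently turned into 0 by (H), which is the behaviour the objection to (H) is about.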



-- 
Paul Biggar
paul.big...@gmail.com




Re: [PHP-DEV] RFC: Type hinting revisited for PHP 5.3

2009-07-01 Thread Paul Biggar
On Wed, Jul 1, 2009 at 10:53 PM, Hannes
Magnusson wrote:
> On Wed, Jul 1, 2009 at 23:36, Paul Biggar wrote:
>> On Wed, Jul 1, 2009 at 10:23 PM, Hannes
>> Magnusson wrote:
>>> On Wed, Jul 1, 2009 at 23:07, Paul Biggar wrote:
>>>> So, what you're saying is, the patch already handles coercion? If
>>>> that's the case, then problem solved.
>>>
>>> The patch offers scalar type _hinting_. Not type _casting_.
>>>
>>> Type hinting in PHP works very simply: If the value doesn't type-match
>>> the argument information (arginfo internally) then it will be rejected
>>> and E_RECOVERABLE_ERROR thrown.
>>>
>>> In most circumstances that error is fatal. However. If the user
>>> chooses then he can ignore that error (by creating his own error
>>> handler) and continue the execution.
>>>
>>> Type _hinting_ is in no way related to type _casting_.
>>
>>
>> It should be.
>
> function foo(string $str, array $arr) {}
>
> foo(false, "foobar");
>
> You are saying that the first argument should be casted, but not the second?
> Or are you planning on breaking pretty much every single application using 
> PHP5?

I'm sorry, I don't see what you're saying?



>> Yes. Therefore only the scalar and numeric types are useful. Nobody
>> wants to use an 'int' hint that fails on numeric strings.
>
> I do. I don't only deal with $_REQUEST stuff.
> I don't have the resources to go the Y! route. I write bunch of stuff
> in PHP. Real type hinting would help _alot_.

My point is that type hints should be like what we've been using for
years in the docs.

There is obviously a tension here. People want two different features.
I'm not sure I see a way to reconcile that (unless you'd like 'strict
int' or 'is int'?)




>> Also, I don't know what happens for string hints when you pass an
>> object with a __toString handler, but it should be allowed.
>
> Apply the patch and try?
> It would be neat if people would do a quick readthrough the patch
> before arguing against it :)

I did of course read the patch. I ask questions the way I do to avoid
confrontation, which is all too prevalent on this list.

I wanted to know what the code was intended to do, not what it does.
The patch wasn't clear without context, and it had no comments or
tests.

(FYI, I did a fairly detailed review of the type hinting patch last
year, which was ignored, so I'm reluctant to put the same effort in
here).


Thanks,
Paul



-- 
Paul Biggar
paul.big...@gmail.com




Re: [PHP-DEV] RFC: Type hinting revisited for PHP 5.3

2009-07-01 Thread Paul Biggar
On Wed, Jul 1, 2009 at 10:23 PM, Hannes
Magnusson wrote:
> On Wed, Jul 1, 2009 at 23:07, Paul Biggar wrote:
>> So, what you're saying is, the patch already handles coercion? If
>> that's the case, then problem solved.
>
> The patch offers scalar type _hinting_. Not type _casting_.
>
> Type hinting in PHP works very simply: If the value doesn't type-match
> the argument information (arginfo internally) then it will be rejected
> and E_RECOVERABLE_ERROR thrown.
>
> In most circumstances that error is fatal. However. If the user
> chooses then he can ignore that error (by creating his own error
> handler) and continue the execution.
>
> Type _hinting_ is in no way related to type _casting_.


It should be.

The current type hinting is for objects, which in PHP are strongly
typed. We plan to extend it to scalars, which in PHP are weakly typed.
Adding a strong type system for scalars goes against the rest of the
language.

As Stas said:
> With this patch, we won't have one logic anymore - we'd have two logics - one 
> for typehinted functions (reject everything that doesn't match the type) and 
> one for the rest of the language (try to coerce
> types). Two logics in one language is usually not good.

PHP already has 2 type systems. I don't think that adding a 3rd one is
complementary.



> Furthermore, the patch introduces couple of new types, "scalar" and
> "numeric". These are "magic types" and do value-to-real-type
> comparison. No type casting.
> The scalar type hint accepts strings, booleans, ints and floats.
> The numeric type hint accepts strings (that pass is_numeric()),
> booleans, ints and floats.


Yes. Therefore only the scalar and numeric types are useful. Nobody
wants to use an 'int' hint that fails on numeric strings.

Also, I don't know what happens for string hints when you pass an
object with a __toString handler, but it should be allowed.


Thanks,
Paul

-- 
Paul Biggar
paul.big...@gmail.com




Re: [PHP-DEV] RFC: Type hinting revisited for PHP 5.3

2009-07-01 Thread Paul Biggar
On Wed, Jul 1, 2009 at 10:02 PM, Hannes
Magnusson wrote:
> On Wed, Jul 1, 2009 at 22:50, Stanislav Malyshev wrote:
>> No, internal typehint doesn't work the way "int" typehint works with this
>> patch. Internal typehint (zend_parse_parameters) do conversions, see
>
> You are wrong. Internal type hinting is done in the form of argument
> information.
> Those are identical to the userspace type hinting.
>
> ZEND_BEGIN_ARG_INFO(arginfo_foo, 0)
>    ZEND_ARG_OBJ_INFO(0, MyClass, argumentName, 0)
> ZEND_END_ARG_INFO();
>
> is the same as the userspace form of:
>
> function foo(MyClass $argumentName) {}
>
> The current patch is missing a ZEND_ARG_STRING_INFO(0, argumentName,
> 0) which would be the same as
> function foo(string $argumentName){}
> If that is the only thing you are worrying about then that's easily fixed.
>
>
> zend_parse_parameters(... abcdefg)
> is the same as
> function($a, $b, $c..) { $a = (int) $a; $b = (string) $b; $c = (array) $c...}


So, what you're saying is, the patch already handles coercion? If
that's the case, then problem solved.

(But I didn't think it did.)

Thanks,
Paul


-- 
Paul Biggar
paul.big...@gmail.com




Re: [PHP-DEV] RFC: Type hinting revisited for PHP 5.3

2009-07-01 Thread Paul Biggar
On Wed, Jul 1, 2009 at 8:04 PM, Stanislav Malyshev wrote:
>> I agree. We won't be able to use an int type for something which
>> should take an int. That might not matter in user code, but if we
>> cannot actually type hint internals functions then its a problem.
>
> Internal functions have types, however parameters of different types are
> usually converted, not rejected.

Yes, rejection is bad. Coercion is good. (Silently accepting is bad).


>> My feeling is that scalars should be automatically coerced to the
>> correct type, if it makes sense to do so. (ie reject non-numeric
>
> That (coercion) is what internal functions do, but not what the proposed
> patch does (except for "numeric" hint).

Right. I think we're arguing on the same side. I'm just saying that if
the manual says 'int', we should be able to use int, and not be
required to use 'numeric'.


Thanks,
Paul




-- 
Paul Biggar
paul.big...@gmail.com




Re: [PHP-DEV] RFC: Type hinting revisited for PHP 5.3

2009-07-01 Thread Paul Biggar
On Wed, Jul 1, 2009 at 7:44 PM, Stanislav Malyshev wrote:
> Hi!
>
>> As far as your point goes, numeric hint addresses it.
>
> Numeric hint addresses one scenario only. It doesn't address conversions to
> strings or booleans, for example (even C allows you to use int as boolean!
> :).

I agree. We won't be able to use an int type for something which
should take an int. That might not matter in user code, but if we
cannot actually type hint internal functions then it's a problem.

I think it should be a requirement that internal functions should be
able to be type hinted using what the manual says. We don't need to be
too strict on that, but if the manual says int, it should be hintable
with int, and accept "1".

My feeling is that scalars should be automatically coerced to the
correct type, if it makes sense to do so. (ie reject non-numeric
strings for ints).
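
A minimal sketch of the coercion rule described above, written in userland
PHP. The helper name coerce_to_int is hypothetical and is not part of the
patch under discussion; it only illustrates "coerce when it makes sense,
otherwise reject" for the int case.

```php
<?php
// Hypothetical helper, for illustration only: accepts ints and
// integer-looking strings, rejects everything else.
function coerce_to_int($value)
{
    if (is_int($value)) {
        return $value;
    }
    // Optional minus sign followed by digits only, e.g. "1" or "-42".
    if (is_string($value) && preg_match('/\A-?\d+\z/', $value) === 1) {
        return (int) $value;
    }
    throw new InvalidArgumentException('cannot coerce value to int');
}
```

With this rule, coerce_to_int("1") gives the integer 1, while
coerce_to_int("abc") is rejected instead of being silently turned into 0.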

Thanks,
Paul






Re: [PHP-DEV] RFC: Type hinting revisited for PHP 5.3

2009-07-01 Thread Paul Biggar
Hi Ilia,

This is great.

On Wed, Jul 1, 2009 at 5:59 PM, Ilia Alshanetsky wrote:
> I've taken a few hours this morning to port my 5.2 type hinting patch to
> 5.3. In recognition of a need for a more 'flexible' numeric type I've
> introduced (numeric) type hint that would allow bool/int/float data types as
> well as a string containing a numeric entity as identified by
> is_numeric_string(). For completion i've also added (scalar) data type that
> will allow any scalar data element.

I think this will go a long way to addressing people's concerns when
this came up previously.


> The patch is available here: http://ia.gd/patch/type_hint_53.txt

I presume the idea is that some people (if they so choose) would want
to type hint every parameter in their program. To facilitate this, I
might suggest a "mixed" hint (like in the docs), and a null hint
(though I'm not sure if it would be called "null" or "unset" or both).


Finally, I don't want to ruin this, but last time there was
disagreement over whether numbers should be coerced to the specified
types, or left alone. What does your patch do?

ie function x (int $x) { echo is_int ($x); } x ("5");



Thanks,
Paul


> It should be noted that this is not the first idea for type hints, that
> credit goes to Hannes Magnusson who had posted a similar patch on the
> internals list back in 2006. Also, back in 2008 Felipe Pena wrote a type
> hinting patch for PHP that is available on wiki.php.net.






[PHP-DEV] Re: PHC Dataflow

2009-06-30 Thread Paul Biggar
 a small set of hidden operations, like setting
$php_errormsg. Once you spot when those happen, you're probably home
free.


> evals, and includes I was considering inserting a fake def at each one of

That is the approach often taken:
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.33.6974  (look
for virtual variable). Although now that I think about it, that paper
is hard, so I wouldn't bother. I've chatted with experts about what
that paper really means, and people don't really know.

So in this case, your use of variable $x would be chained to the def
of the fake var for $$a, which would be chained to all defs before
that.

However, I think they're rare enough that you can either stop
optimizing the function, or that you won't suffer much performance
penalty from over-writing every variable.


> those. The fake def could be marked specially in order to allow dead code
> elimination the ability to know that these aren't real defs and maybe make
> some observations about that (or whatever).

Don't bother. You won't gain anything. I wouldn't spend time trying to
make every single case precise.


> As for array/object stuff... I’m
> not extremely interested in trying to track this at the moment. For one I
> don’t think doing so will yield many results with the limited scope
> pecl/optimizer has for analysis plus I have no plans for using that data at

Good, that sounds right.


> the moment. The only big advantage is it might help narrow down data types
> for other vars better.  The other big area I haven’t looked at are
> references. The easiest approach I see here is to just duplicate the def-use
> graph for each reference from the point after the reference was created.

I'm not sure what you mean by "duplicate the def-use graph".
Basically, you just need to keep track of which variables might
reference other variables, and possibly whether they "might" or
"must". Variables which are passed into functions, come as return
values, are global, are parameters, etc, fall into the set of "might
be referenced". You should be able to access all the signatures of
called functions, so as to avoid the results being too bad.

> This of course could lead to a lot of extra junk in code with lots of
> references. Plus references to specific array indexes etc might turn out to
> be a big pain.  Heh sorry for the open endedness here but... Any
> Ideas/comments on this approach?

Don't model array indices or fields (especially fields). Always assume
you never know the value of a referenced variable. If there are lots
of references, and you are using too much memory, give up after a
certain threshold.


Some advice that I would have liked to have taken: give up on things
that look like they might be hard. (I wasn't in a position to do that,
because the hard thing is what I'm doing my PhD on, but you don't need
to do it all).



Some other things I would add:
 - focus on constants, DCE, and bytecode specific stuff. You're not
compiling here.
 - use SCCP for constant propagation. Use the DCE algorithm from
"Engineering a Compiler" by Cooper/Torczon.


Anyway, it sounds like you really know what you're at. You must have
been studying. What texts are you using? If you don't have it, I'm a
big fan of Cooper/Torczon -- it's very pragmatic.



Paul


> On 6/29/09 11:04 AM, "Paul Biggar"  wrote:
>
> Hey Graham,
>
> Sorry for the delay, I meant to send this over the weekend.
>
> On Thu, Jun 25, 2009 at 2:50 AM, Paul Biggar wrote:
>>
>> When I say model, I mean that we "abstractly" "interpret" the program.
>> It's like we're interpreting it, but we don't have the actual values,
>> so we model all possible values at a certain point. We do this using 3
>> combined analyses: constant propagation, alias analysis, and type
>> inference. We also model definition of constants and the callgraph. We
>> don't currently (but plan to in the future) model integer- and
>> string-value-ranges, included files, or "truth analysis" (though that
>> would only take about an hour, I think).
>
> (If you're already familiar with how compiler optimization works, the
> first few paragraphs won't be anything new to you. I'll gloss over the
> simple compiler terms, please email me if googling hasn't worked for
> you; I'll keep explaining the advanced concepts).
>
> So the abstract interpretation then. We analyse a representation
> called the "Medium-level Intermediate Representation" (MIR), which is
> pretty close to PHP bytecode. We start by invoking the first method,
> __MAIN__, which represents the global scope.
>
>
> Upon entering a method, we convert it into a control-flow-gra

[PHP-DEV] Re: PHC Dataflow

2009-06-29 Thread Paul Biggar
Hey Graham,

Sorry for the delay, I meant to send this over the weekend.

On Thu, Jun 25, 2009 at 2:50 AM, Paul Biggar wrote:
>
> When I say model, I mean that we "abstractly" "interpret" the program.
> It's like we're interpreting it, but we don't have the actual values,
> so we model all possible values at a certain point. We do this using 3
> combined analyses: constant propagation, alias analysis, and type
> inference. We also model definition of constants and the callgraph. We
> don't currently (but plan to in the future) model integer- and
> string-value-ranges, included files, or "truth analysis" (though that
> would only take about an hour, I think).

(If you're already familiar with how compiler optimization works, the
first few paragraphs won't be anything new to you. I'll gloss over the
simple compiler terms, please email me if googling hasn't worked for
you; I'll keep explaining the advanced concepts).

So the abstract interpretation then. We analyse a representation
called the "Medium-level Intermediate Representation" (MIR), which is
pretty close to PHP bytecode. We start by invoking the first method,
__MAIN__, which represents the global scope.


Upon entering a method, we convert it into a control-flow-graph (CFG),
which has one "basic block" per statement. We add the entry block to a
"worklist", which is processed roughly in FIFO order. After we process
a statement for the first time, we add its successors to the worklist.
In the case of loops, we may process the same statement again (or a
few times) -- in which case we only add the successor if the results
of the analysis have changed (aka we iterate until we reach a
"fixed-point").
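
The worklist loop described above can be sketched as follows. This is a toy
under stated assumptions, not phc's implementation: the CFG is a plain
array, the state is a set of block names, and run_worklist, union_sets and
the transfer callback are hypothetical names invented for the example.

```php
<?php
// Union of two sets stored as sorted lists, so equal sets compare with ===.
function union_sets(array $a, array $b)
{
    $merged = array_unique(array_merge($a, $b));
    sort($merged);
    return $merged;
}

// $successors: hypothetical CFG, block name => list of successor names.
// $transfer:   assumed callback (block, input state) => output state.
function run_worklist(array $successors, $entry, $transfer)
{
    $in = array($entry => array());  // state flowing into each block
    $out = array();                  // last result computed for each block
    $worklist = array($entry);       // processed in FIFO order

    while (count($worklist) > 0) {
        $block = array_shift($worklist);
        $result = $transfer($block, $in[$block]);
        if (array_key_exists($block, $out) && $out[$block] === $result) {
            continue;                // unchanged: do not re-queue successors
        }
        $out[$block] = $result;
        foreach ($successors[$block] as $next) {
            // Merge the new result into whatever already reaches $next.
            $in[$next] = isset($in[$next])
                ? union_sets($in[$next], $result)
                : $result;
            if (!in_array($next, $worklist, true)) {
                $worklist[] = $next;
            }
        }
    }
    return $out;
}

// Toy analysis: which blocks may have run by the end of each block.
$cfg = array(
    'entry' => array('loop'),
    'loop'  => array('loop', 'exit'),
    'exit'  => array(),
);
$out = run_worklist($cfg, 'entry', function ($block, array $state) {
    return union_sets($state, array($block));
});
```

The 'loop' block is processed twice: the second visit produces the same
result as the first, so its successors are not queued again and the
analysis stops.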

This is all just like normal dataflow analysis. One twist we have is
that we use "conditional constant propagation". This means that if we
have something like:

if ($myvar) $x = 5; else $x = 6;

and we know that $myvar is 7, then we don't process the $x = 6, since
it can't occur.
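
As a rough sketch of that branch decision (the function name and the way
the condition is passed in are assumptions made for this example, not phc
code): when the condition is a known constant, only the branch PHP would
actually take is analysed.

```php
<?php
// Decide which branches of "if ($cond) ... else ..." need to be analysed.
// $known is true when the analysis has an exact constant for the condition,
// in which case $value holds that constant.
function reachable_branches($known, $value = null)
{
    if (!$known) {
        return array('then', 'else');  // unknown condition: both may run
    }
    // Known constant: PHP's own truthiness rules pick the branch.
    return $value ? array('then') : array('else');
}

// $myvar is known to be 7, so only "$x = 5" is processed.
$branches = reachable_branches(true, 7);
```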


At each "program point" (ie after each block) we store the complete
state of the analysis at this point. What exactly is stored depends on
the analysis. For constant-propagation, we store a "lattice" of
constants. That means that we know each variable is one of:
- TOP: the value is undefined
- BOTTOM: the variable may have multiple values
- or we know the exact constant that variable represents

With PHP, there is a slight difference from "normal" lattice based
analysis, in that TOP can represent NULL, since variables are
automatically initialized to NULL. When we find another value for $x,
we merge it with the current value. While (TOP + some value V => V),
(NULL + non-NULL => BOTTOM).
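
The merge rules above can be sketched like this. The array representation
and the lattice_* function names are assumptions for illustration; NULL is
treated here as an ordinary known constant, which is one reading of the
rule that NULL merged with a non-NULL value gives BOTTOM.

```php
<?php
// Three kinds of lattice value: TOP, a known constant, or BOTTOM.
function lattice_top()
{
    return array('kind' => 'top');
}

function lattice_bottom()
{
    return array('kind' => 'bottom');
}

function lattice_const($value)
{
    return array('kind' => 'const', 'value' => $value);
}

// Merge the values reaching a point from two different paths.
function lattice_merge(array $a, array $b)
{
    if ($a['kind'] === 'top') {
        return $b;                     // TOP + V => V
    }
    if ($b['kind'] === 'top') {
        return $a;
    }
    if ($a['kind'] === 'bottom' || $b['kind'] === 'bottom') {
        return lattice_bottom();       // once unknown, always unknown
    }
    // Two constants: keep the value only if both paths agree exactly.
    return $a['value'] === $b['value'] ? $a : lattice_bottom();
}
```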

For type-inference, we use a set of type names for each variable. The
types are int, bool, unset, real, string, resource, array, and any
concrete classname. We don't have a BOTTOM (most conservative result),
because not knowing one type is the end of the analysis. In theory,
this means that in some unusual cases (like "new $classname.$num") the
analysis will only terminate by running out of memory. Oh well.
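
For comparison with the constant lattice, merging type sets is just set
union. A small sketch, assuming the type names are stored as array keys
(the merge_types name and this representation are made up for the
example):

```php
<?php
// Type sets are stored as array keys, so "+" (array union) merges them.
function merge_types(array $a, array $b)
{
    $merged = $a + $b;
    ksort($merged);   // canonical order, so equal sets compare with ===
    return $merged;
}

// $x is an int on one path, and an int or a string on the other.
$then_types = array('int' => true);
$else_types = array('string' => true, 'int' => true);
$x_types = array_keys(merge_types($then_types, $else_types));
```

After the merge, $x_types holds 'int' and 'string'. There is no BOTTOM
here: the set simply keeps growing as new type names are discovered.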

For constants (as in "define ('MYCONST', 5);"), we store the value of
each constant so far in the program. We use the lattice approach from
constant-propagation, but unknown values only come up if you define
the same constant in different branches, which isn't too common I
think. We currently don't model these well enough to be able to
resolve a call to defined(), but this wouldn't be too hard.

Finally, the really big and important analysis is alias analysis. I'll
go into some detail in a moment. We store at each program point a
complete points-to graph of the alias results, and use copy-on-write
to avoid a massive explosion in space and execution time (not very
successfully).


Alias analysis is the really important analysis. I mentioned before
that not knowing a single type will ruin the analysis. When I said
above that we stored a constant lattice and a set of types for each
variable, that wasn't exactly true. We actually store them for each
object, field, array, index, symbol table and variable. The alias
analysis allows us to tell the interaction between all of these.

You might notice that these three concepts are mostly the same: object
is-to field as array is-to index as symbol-table is-to variable. In
the Zend engine, these are all implemented using hashtables. In our
analysis, these are also all modelled the same way.

Our points-to graph is just a graph with 3 kinds of nodes:
 - "storage nodes" represent objects, arrays and symbol tables,
 - "index nodes" represent fields, variables and array indices,
 - "value nodes" represent scalar values of index nodes.

There are 3 kinds of edges
 - "field edges" go fr

Re: [PHP-DEV] Re: PHC Dataflow

2009-06-25 Thread Paul Biggar
Hi Nuno,

On Thu, Jun 25, 2009 at 9:08 PM, Nuno Lopes wrote:
>> So we do whole program analysis. We start at the first statement in
>> global scope, model it, then add all its successors to a queue, and
>> keep modelling until we get to the end of the program. When we get to
>> a method call, we start modelling the execution of the method. At the
>> end of the method, we propagate the results back to the caller, model
>> the return value, and carry on modelling the caller.

This is just an outline. I imagine this description can be applied to
a lot of compilers and a lot of analysis algorithms.


> it sounds like this algorithm:
> Precise interprocedural dataflow analysis via graph reachability, POPL'95
> http://portal.acm.org/citation.cfm?id=199462

I'm not familiar with this paper. However, from the abstract, it looks
like it wouldn't handle alias analysis (that's not to say it couldn't be
extended to do it). More importantly though, PHP's semantics are truly
special, so most analyses for "traditional" languages don't apply
well. Of course, many of the techniques do.

If you're looking for papers that describe the general approach, try
http://portal.acm.org/citation.cfm?id=178264 and
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.52.2637.

(FYI, people without ACM access can nearly always find those papers by
googling the title).

Paul



>
> Nuno
>






[PHP-DEV] Re: PHC Dataflow

2009-06-24 Thread Paul Biggar
Hi folks,

>> On Wed, Jun 24, 2009 at 11:42 PM, Graham Kelly
>> wrote:
>>> You mentioned a couple of weeks ago that you would be interested in
>>> further
>>> discussing dataflow for PHP. I briefly looked at the dataflow branch of
>>> PHC.
>>> It looks like you have been doing a LOT of work in this area. I would
>>> like
>>> to move the optimizer more towards this direction with the analysis and
>>> was
>>> wondering if maybe you could share some insights.

>> I would love a more in-depth description of the method you are using. As
>> for
>>
>> I have no problem with you CCing internals on it.

Following last week's start, this is a continuation of my discussion
with Graham Kelly about the design of the phc optimizer, so as to
better influence the PECL_optimizer. I've CCed PHP-internals since I
was asked to, and most people who would be interested are here. Please
let me know if it is too much noise, and we can take it elsewhere. I
would strongly welcome feedback.


So, I have a ton of information here. I'll try to organize it into a
few chunks, since there is way too much for one email. I'll start with
a quick intro and overview.

The phc optimizer will form the bulk of my PhD thesis (coming soon, I
swear). I've spent about 2 years on it, not including extra time I
spent on phc itself. The source for the optimizer is online
(http://code.google.com/p/phc/source/browse/#svn/branches/dataflow/src/optimize),
and there is some other information about phc too
(http://phpcompiler.org/documentation.html). The state of the
optimizer is that the prototype is almost finished. It handles most
things. I plan to finish the prototype in about a week, write a paper,
then write my thesis.


The primary assumption I made is that the optimizer should be
whole-program. I'll justify that in a moment. This means we need to
see the source of all the code for a PHP program. This means it needs
to be a "deployment-time" optimizer, meaning after you've installed
the program, set the configuration variables, installed the plugins,
etc. A change in the source (ie addition of extra plugins) means the
optimizer will need to run again. Fiddling with .ini settings may
require a recompile, depending on the setting. I don't think this
hurts most people who would be serious about optimization (say, people
who run APC).

The main reason for this assumption is that we are unable to get any
results at all using intra-procedural analysis (intra-procedural means
you look at one function at a time). Consider:

function myfunc ($x, $y) { ... }

Since we know nothing about the types of $x and $y, there is little we
can say about the function. It may be $x and $y are references to each
other (via call-time-pass-by-ref), and that modifying one will modify
the other. It may be that $x is an object, and that echoing it will
call the __toString handler, possibly changing the value of $y, $this,
etc.
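
A small runnable illustration of the __toString case. The Sneaky class and
its static counter are hypothetical; the counter stands in for any state
(globals, $y, $this) that a handler could modify behind the function's
back.

```php
<?php
// Hypothetical class: converting it to a string has a side effect.
class Sneaky
{
    public static $calls = 0;

    public function __toString()
    {
        self::$calls++;   // state changes that myfunc() never mentions
        return 'sneaky';
    }
}

function myfunc($x, $y)
{
    // Looks side-effect free, but string conversion may run user code.
    return "$x" . $y;
}

$result = myfunc(new Sneaky(), 'abc');
```

Looking at myfunc alone, nothing says that string interpolation can change
program state, which is why an analysis limited to one function has to
assume the worst.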

There are a few more cases like this, but the hardest is the case
where $x has object_handlers. In this case, many simple operations
might trigger bizarre effects. As a worst-case example, the "read"
handler can be invoked by a read to $x, changing variables in the
active-symbol-table (local scope), setting each of them to have a
read-handler. In this way, unknown results can propagate to callers,
callees, globals, etc, making all analysis pretty much worthless. This
is an extreme case, which you could probably ignore, but the other
examples above couldn't be ignored.


So we do whole program analysis. We start at the first statement in
global scope, model it, then add all its successors to a queue, and
keep modelling until we get to the end of the program. When we get to
a method call, we start modelling the execution of the method. At the
end of the method, we propagate the results back to the caller, model
the return value, and carry on modelling the caller.

(The technical names for this are "abstract interpretation", or
"symbolic execution", though we're kinda abusing both terms. Really,
this is pretty similar to dataflow analysis).

When I say model, I mean that we "abstractly" "interpret" the program.
It's like we're interpreting it, but we don't have the actual values,
so we model all possible values at a certain point. We do this using 3
combined analyses: constant propagation, alias analysis, and type
inference. We also model definition of constants and the callgraph. We
don't currently (but plan to in the future) model integer- and
string-value-ranges, included files, or "truth analysis" (though that
would only take about an hour, I think).

I think that's probably enough for now. I'll talk about those 3
analyses tomorrow. Questions and feedback are welcome (especially if I
didn't explain clearly enough - academic stuff can be tough to explain
to a lay audience).

Thanks,
Paul







[PHP-DEV] Re: [PHP-CVS] cvs: phpruntests /code-samples arrayTest.php

2009-06-21 Thread Paul Biggar
Hi Georg,

Thanks for the quick reply.

On Sun, Jun 21, 2009 at 8:22 PM, Georg Gradwohl wrote:
> thanks for the notice. To cut a long story short: we've discussed the use of
> the SPL-File-Iterator in the runtests-project. So I created a very simple
> test which passes through an array to compare the performance of the
> spl-iterator to a for-loop.
>
> So I think this is no overlap with the benchmark-project, it's just a small
> helper to alleviate our decision.

Ah ok, I can see that. Thanks.

 FWIW, I consider the benchmarks to be a subproject of the run-tests
framework, at least in the long run.


> PHP 5.3.0RC2 (cli)
> g2-mbp:code-samples g2$ php arrayTest.php 
> size:    
> loop:    0.00361 sec (100%)
> spl:     0.02502 sec (693%)
> DIFF:    0.02141 sec
>
> PHP 5.2.8 (cli)
> g2-mbp:code-samples g2$ php arrayTest.php 
> size:    
> loop:    0.00489 sec (100%)
> spl:     0.76226 sec (15588%)
> DIFF:    0.75737 sec


That is interesting. Alexander is making a place for microbenchmarks,
so we might copy this at some point.

Thanks,
Paul


> Am 21.06.09 20:49 schrieb "Paul Biggar" unter :
>
>> Hi Georg,
>>
>> I notice this commit that looks like a benchmark. I'm trying to figure
>> out if the new run-tests rewrite overlaps with the benchmarks GSoC
>> project (you might not have been aware since we have the discussions
>> on php-qa). I wouldn't mind hearing your thoughts so that we avoid
>> duplicated effort.
>>
>> Thanks,
>> Paul
>
>
>
>






[PHP-DEV] Re: [PHP-CVS] cvs: phpruntests /code-samples arrayTest.php

2009-06-21 Thread Paul Biggar
Hi Georg,

I notice this commit that looks like a benchmark. I'm trying to figure
out if the new run-tests rewrite overlaps with the benchmarks GSoC
project (you might not have been aware since we have the discussions
on php-qa). I wouldn't mind hearing your thoughts so that we avoid
duplicated effort.

Thanks,
Paul

On Sun, Jun 21, 2009 at 6:30 PM, Georg Gradwohl wrote:
> g2              Sun Jun 21 17:30:46 2009 UTC
>
>  Added files:
>    /phpruntests/code-samples   arrayTest.php
>  Log:
>  phpruntests - add arrayTest (spl-performace-test)
>
>
> http://cvs.php.net/viewvc.cgi/phpruntests/code-samples/arrayTest.php?view=markup&rev=1.1
> Index: phpruntests/code-samples/arrayTest.php
> +++ phpruntests/code-samples/arrayTest.php
> 
> $size = isset($argv[1]) ? $argv[1] : 9;
>
>
> // create array
>
> $testArray = array();
>
> for ($i=0; $i<$size; $i++) {
>
>        $testArray[$i] = rand(0,9);
> }
>
> print "size:\t$size\n";
> flush();
>
>
> // loop
>
> $s = microtime(true);
>
> for ($i=0; $i<$size; $i++) {
>
>        if (isset($testArray[$i])) {
>
>                if ($i%2 == 0) {
>
>                        $testArray[$i] = 'G';
>                }
>        }
> }
>
> $e = microtime(true);
>
> $tl = round($e-$s, 5);
>
> print "loop:\t$tl sec (100%)\n";
> flush();
>
>
> // iterator
>
> $s = microtime(true);
>
> $testObject = new ArrayObject($testArray);
> $iterator = $testObject->getIterator();
>
> while ($iterator->valid()) {
>
>        if ($iterator->key()%2 == 0) {
>
>                $testObject[$iterator->current()] = 'G';
>        }
>
>        $iterator->next();
> }
>
> $e = microtime(true);
>
> $ti = round($e-$s, 5);
>
> $diff = round($ti/$tl*100, 0);
>
> print "spl:\t$ti sec ($diff%)\n";
> print "DIFF:\t".($ti-$tl)." sec\n";
> flush();
>
>
> ?>
>
>
> --
> PHP CVS Mailing List (http://www.php.net/)
> To unsubscribe, visit: http://www.php.net/unsub.php
>
>






Re: [PHP-DEV] Last steps towards 5.3.0

2009-06-11 Thread Paul Biggar
On Thu, Jun 11, 2009 at 9:35 AM, Nuno Lopes wrote:
> Oh, very nice, Paul ;) I wasn't aware of this script.
> btw, does it perform inlining of included files? (so that you can have a
> self-contained reduced file)

phc supports inlining with --include. So I would guess that calling
$reduce->set_phc ("phc --include") would work, but I haven't tried
it.


Paul





Re: [PHP-DEV] Last steps towards 5.3.0

2009-06-10 Thread Paul Biggar
On Wed, Jun 10, 2009 at 11:10 PM, Paul Biggar wrote:
> I wrote a tool which automatically. It uses the delta debugging

I sent this prematurely. It should read
"I wrote a tool which reduces test cases automatically."

> algorithm, but is very much faster than delta, since it knows how PHP
> works. I hope it's also easier to use - you basically write a function
> which takes source code, and returns true if the bug is still present.
>
> It's at 
> http://code.google.com/p/phc/source/browse/trunk/test/framework/reduce/Reduce.php.
> You need a copy of phc (phpcompiler.org) as well. The usage is
> explained in the comments. I've used it a lot, but I've only just made
> it work more generally. Please report any problems to me.
>
>
> Paul
>
>
> --
> Paul Biggar
> paul.big...@gmail.com
>






Re: [PHP-DEV] Last steps towards 5.3.0

2009-06-10 Thread Paul Biggar
Hi Guilherme, Nuno,

On Wed, Jun 10, 2009 at 9:38 PM, Nuno Lopes wrote:
>> PS: If someone wants to give a hand to find the regression... I can
>> explain how to reproduce.
>
> The easiest way is to use a tool to help reducing the test case
> automatically.
> I don't know about any specific tool for reducing PHP files, but in the past


I wrote a tool which automatically. It uses the delta debugging
algorithm, but is very much faster than delta, since it knows how PHP
works. I hope it's also easier to use - you basically write a function
which takes source code, and returns true if the bug is still present.
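
To show the shape of such a predicate, here is a deliberately simplified
reducer. It is not Reduce.php and does not use its API: reduce_lines is a
hypothetical function that drops one line at a time, which is much cruder
than the delta debugging algorithm, and the "bug" is faked by looking for
a marker string.

```php
<?php
// $still_buggy is the user-written predicate: it takes source code and
// returns true if the bug is still present.
function reduce_lines($source, $still_buggy)
{
    $lines = explode("\n", $source);
    $changed = true;
    while ($changed) {
        $changed = false;
        foreach (array_keys($lines) as $i) {
            $candidate = $lines;
            unset($candidate[$i]);
            if ($still_buggy(implode("\n", $candidate))) {
                $lines = $candidate;  // this line was not needed for the bug
                $changed = true;
            }
        }
    }
    return implode("\n", $lines);
}

// Stand-in predicate: the "bug" is present while the marker survives.
$still_buggy = function ($source) {
    return strpos($source, 'BUG') !== false;
};
$reduced = reduce_lines("a\nBUG\nb", $still_buggy);
```

A real predicate would run the candidate source through PHP and check for
the crash or wrong output being chased.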

It's at 
http://code.google.com/p/phc/source/browse/trunk/test/framework/reduce/Reduce.php.
You need a copy of phc (phpcompiler.org) as well. The usage is
explained in the comments. I've used it a lot, but I've only just made
it work more generally. Please report any problems to me.


Paul





[PHP-DEV] Academic papers on PHP (was Re: [PHP-DEV] RE: Optimizer discussion)

2009-06-07 Thread Paul Biggar
Hi Sebastian,

On Sun, Jun 7, 2009 at 6:56 AM, Sebastian
Bergmann wrote:
> Paul Biggar schrieb:
>> They have a paper on PHP memory usage.
>
>  Link? I am collecting papers that deal with PHP at
>  http://delicious.com/sebastian_bergmann/academic_paper+php

This is great. Below is a list of all the papers I can think of. I
wonder if it's a good idea to move your page to the wiki?

Some of these papers are less "academic" than others - I don't know
where you'd like to draw the line :)

Paul



Papers:

I see you have the Pixy and Minamide papers

This is the PLDI 2009 one. The paper is not yet published, so I can't
find a link.
A Study of Memory Management for Web-based Applications on Multicore Processors
by Hiroshi Inoue, Hideaki Komatsu, and Toshio Nakatani, IBM Tokyo
Research Laboratory


Sound and Precise Analysis of Web Applications for Injection Vulnerabilities
Gary Wassermann, Zhendong Su, PLDI'07.
http://wwwcsif.cs.ucdavis.edu/~wassermg/research/pldi07.pdf

Static Detection of Security Vulnerabilities in Scripting Languages
Yichen Xie and Alex Aiken
http://theory.stanford.edu/~yxie/sec.pdf

@conference{benda06,
   author = {Jan Benda and Tomas Matousek and Ladislav Prosek},
   year = {2006},
   title = {Phalanger: Compiling and Running {PHP} Applications on the
{Microsoft} {.NET} Platform},
   booktitle = {.NET Technologies 2006},
   month = {May},
   location = {Plzen, Czech Republic},
}

@article{johnson06,
   author = {Graeme Johnson and {Zo\"{e}} Slattery},
   title =  "{PHP}: A Language Implementer's Perspective",
   journal ="International PHP Magazine",
   year =   "2006",
   pages =  "24--29",
   month =  "Dec",
}

@techreport{deVries07,
  title = {Design and Implementation of a {PHP} Compiler Front-end},
  author = {Edsko de {Vries} and John Gilbert},
  institution = {Trinity College Dublin},
  type = {Dept. of Computer Science Technical Report},
  number = {TR-2007-47},
  year = {2007}
}

@inproceedings{1480908,
 author = {Tozawa, Akihiko and Tatsubori, Michiaki and Onodera, Tamiya
and Minamide, Yasuhiko},
 title = {Copy-on-write in the PHP language},
 booktitle = {POPL '09: Proceedings of the 36th annual ACM
SIGPLAN-SIGACT symposium on Principles of programming languages},
 year = {2009},
 isbn = {978-1-60558-379-2},
 pages = {200--212},
 location = {Savannah, GA, USA},
 doi = {http://doi.acm.org/10.1145/1480881.1480908},
 publisher = {ACM},
 address = {New York, NY, USA},
 }
http://www.trl.ibm.com/people/mich/pub/200901_popl2009phpsem.pdf

@inproceedings{biggar09,
author = {Paul Biggar and Edsko de Vries and David Gregg},
title = {A Practical Solution for Scripting Language Compilers},
booktitle = {SAC '09: Proceedings of the 2009 ACM symposium on
Applied computing},
year = {2009},
isbn = {978-1-60558-166-8},
pages = {1916--1923},
location = {Honolulu, Hawaii, U.S.A},
publisher = {ACM},
address = {New York, NY, USA},
}
https://www.cs.tcd.ie/~pbiggar/sac-2009.pdf






Re: [PHP-DEV] RE: Optimizer discussion

2009-06-06 Thread Paul Biggar
On Fri, Jun 5, 2009 at 11:23 PM, Nuno Lopes wrote:
> I'm happy there's some interest in a PHP optimizer :)
> I agree with Paul that PECL's optimizer duplicates way too much stuff from
> the Zend engine, which is not practical nor maintainable. (compare for example
> with the simple constant folder I implemented some years ago:
> http://web.ist.utl.pt/nuno.lopes/zend_constant_folding.txt).

This is certainly a much better demonstration of how the optimizer should work.

> About runkit & friends, I wouldn't worry much about them. If you're running
> them probably you also don't care about optimizations. If you want to be
> able to optimize something, you need to remove as many degrees of freedom as
> you can..

This is probably true of runkit. However, I would be careful what you
remove for extra freedom. There is very likely PHP code out there that
relies (possibly by accident) on some edge cases.


> P.S.: I'll try to meet with Paul in PLDI (in a week) and chat about these
> kinds of things. Is anyone else coming that wants to join the discussion?

You should probably mention this is in Dublin.

Some of the IBM Tokyo researchers who work on (or maybe close to)
Project Zero will be there, and might have interesting ideas. They
have a paper on PHP memory usage.



Paul




Re: [PHP-DEV] Compiling 5.3 Snap on OS.X (embed & static/shared compile issue)

2009-06-05 Thread Paul Biggar
Hi Ralph,

On Fri, Jun 5, 2009 at 5:59 PM, Ralph Schindler wrote:
> Similarly described in this bug:
>
>  http://bugs.php.net/bug.php?id=42106

You might also find this relevant: http://bugs.php.net/bug.php?id=44462


Thanks,
Paul





[PHP-DEV] Re: Optimizer discussion

2009-06-05 Thread Paul Biggar
Hi Graham,

On Fri, Jun 5, 2009 at 12:03 PM, Paul Biggar wrote:
>> Why not start off with the big stuff, dataflow. I personally believe that 
>> working out good data flow for PHP is key to getting good optimizations. But 
>> you are right, it's a very tricky thing to do and in some cases impossible. 
>> Ultimately, I would like to move a lot of the optimizer work more into this 
>> direction and use the data flow to build a basic platform for code analysis 
>> on which optimizations can be done. For now though, pecl/optimizer is "dumb" 
>> about data types :-)
>
>
> And now the hard stuff. To avoid me repeating myself, let me just pimp
> my Tech Talk. Have a look at
> http://www.youtube.com/watch?v=kKySEUrP7LA from about the 30:45 mark
> until just before the 47:00 mark (slides at
> https://www.cs.tcd.ie/~pbiggar/paul_biggar_google_18_mar_2009_notes.pdf).
> That highlights most of the problems, and vaguely hints at their
> solution. We can go into much greater detail on the solutions after.


Based on the fact that you want to do dataflow, I wonder if it's a good
idea to think about co-opting the phc optimizer to perform analysis on
bytecode. To my mind this seems much easier than re-implementing from
scratch. As I mentioned before, this incorporates about 2 years of
work (much of it research of course, so it might not take as long to
replicate). This would mean you could go straight to performing
analyses (though there will no doubt be work required on the optimizer
itself).

Technically speaking, this isn't a big problem. We'd probably need to
change the phc MIR to mirror the bytecode (no harm anyway in terms of
correctness), and have a bytecode-reader and -writer (though this
needn't involve serializing - likely a small interface instead).
Politically, I assume it won't be a problem either, since it's in PECL.


Thoughts?

Paul




-- 
Paul Biggar
paul.big...@gmail.com




[PHP-DEV] Re: Optimizer discussion

2009-06-05 Thread Paul Biggar
Hi Graham,

Simple things first:

On Fri, Jun 5, 2009 at 1:08 AM, Graham Kelly  wrote:
> I'm not sure which optimization you are talking about with the GLOBALS stuff 
> but what you're saying makes sense. (It's been a while since I've looked at the 
> code base myself, I'm just getting back to working on it)

I copied that comment straight from the source, but I can't find it
now that I went looking for it. No matter.



> Why not start off with the big stuff, dataflow. I personally believe that 
> working out good data flow for PHP is key to getting good optimizations. But 
> you are right, it's a very tricky thing to do and in some cases impossible. 
> Ultimately, I would like to move a lot of the optimizer work more into this 
> direction and use the data flow to build a basic platform for code analysis 
> on which optimizations can be done. For now though, pecl/optimizer is "dumb" 
> about data types :-)


And now the hard stuff. To avoid me repeating myself, let me just pimp
my Tech Talk. Have a look at
http://www.youtube.com/watch?v=kKySEUrP7LA from about the 30:45 mark
until just before the 47:00 mark (slides at
https://www.cs.tcd.ie/~pbiggar/paul_biggar_google_18_mar_2009_notes.pdf).
That highlights most of the problems, and vaguely hints at their
solution. We can go into much greater detail on the solutions after.



Thanks,
Paul

-- 
Paul Biggar
paul.big...@gmail.com




[PHP-DEV] Optimizer discussion

2009-06-04 Thread Paul Biggar
Graham and I are having a brief chat about the work he's going to do
on the PECL optimizer. People have asked me to do this on-list (they
may have meant the PECL list, but optimizations on PHP seem more
relevant here), so here goes.


Hi Graham,

So the general gist of what I have to say is that dataflow
optimizations on PHP are very difficult, and nearly impossible at the
function-local level. Loop-invariant hoisting and other eliminations
of redundant expression computation are likewise difficult. If you're
planning on working on them, we can go into more detail.


I guess the biggest thing is that I'm wondering what your plans are
for the PECL optimizer? I've spent about 2 years working on the phc
optimizer, (and a bit longer on relevant things) so I hope that my
advice will be relevant.



I've taken a look through the optimizer a few times over the last
while, (and even stolen some ideas from it). Here are my comments on
the current code:

- There is lots of code which reimplements parts of the engine, for
example: ini_bool_decode, optimizer_acosh and friends, optimize_md5,
optimize_crc32, optimize_sha1, optimize_class_exists and friends (to a
lesser extent). There are also lots of constant foldings, like casts
and "0 == false" (etc) in optimize_code_block. I don't understand why
there is logic in the code for that, rather than simply executing the
opcodes, or constructing an eval and executing that.

- is_numeric_result: there has been great effort to figure out numeric
results from pure functions, when it seems straightforward to
optimize the results straight in. Maybe that is being done elsewhere?
If so, there may need to be some care taken to ensure that all
optimizations terminate.

- File system functions are very iffy. I would be surprised if people
have content that reads from files repeatedly, but where the files do
not change, and who are willing to use that flag.

- Most of the identity optimizations aren't safe. $x + 0 !== $x,
unfortunately, due to integer coercions (parallels exist for other
types/operators)

- I think I saw an optimization converting ("45" + $x) into (45+$x) -
that's a great idea, which I will steal.

- How does runkit (and other weird extensions) affect optimizations on
constants, class_exists, etc?

- The optimization "unsafe: optimize out isset()/empty() ops on
GLOBALS['foo'] into $foo " is not safe, as GLOBALS['foo'] may not be
the same variable as $foo ($GLOBALS may be unset, and indeed, there
may be good reasons to do so).


I'm also wondering what the optimizations are on fcall? I couldn't make it out.


That's quite a lot, but it's everything I have on the current PECL optimizer :)


Thanks,
Paul


-- 
Paul Biggar
paul.big...@gmail.com




Re: [PHP-DEV] [PHP] Question about Olivier's method

2009-05-26 Thread Paul Biggar
On Tue, May 26, 2009 at 7:31 PM, Hannes Magnusson
 wrote:
> On Tue, May 26, 2009 at 20:21, Paul Biggar  wrote:
>> "The link below to the Oliver document appears to be broken. Here is
>> one that works: http://citeseer.ist.psu.edu/oliver93decision.html"
>
> That link is broken too, but archive.org has it from 1997;
> http://web.archive.org/web/*/http://www.cs.monash.edu.au/~jono/TechReports/TR173.dgraph.ps

Citeseer keeps a cache (in the top right-hand-side of the page).

Citeseer has been very reliable for the last year or so, but there
were problems with links changing, and an unreliable connection before
that. Maybe archive.org is better as a result.

Paul




-- 
Paul Biggar
paul.big...@gmail.com




Re: [PHP-DEV] [PHP] Question about Olivier's method

2009-05-26 Thread Paul Biggar
On Tue, May 26, 2009 at 7:06 PM, Hannes Magnusson
 wrote:
> On Tue, May 26, 2009 at 16:17, paul (aka: azmodai)

> So, who is this Oliver - and where is that description/pseudo code? :)

A note on that page (from brad dot fish at gmail dot com on
24-Feb-2006 10:30) says:

"The link below to the Oliver document appears to be broken. Here is
one that works: http://citeseer.ist.psu.edu/oliver93decision.html"


Paul




-- 
Paul Biggar
paul.big...@gmail.com




Re: [PHP-DEV] CVS Account Request: dmendolia

2009-05-25 Thread Paul Biggar
Hi folks,

On Tue, May 5, 2009 at 6:19 PM, Davide Mendolia  wrote:
> contribute to php-benchmark
> http://cvs.php.net/viewvc.cgi/php-benchmarks/
>
> My first proposal(raytracer) was committed by Paul Biggar

Do I need to vouch for this? If so, vouched.

Thanks,
Paul



-- 
Paul Biggar
paul.big...@gmail.com




Re: [PHP-DEV] Method call improvements

2009-05-13 Thread Paul Biggar
Hi Stas,

On Tue, May 12, 2009 at 7:24 PM, Stanislav Malyshev  wrote:
> Hi!
>
>> Apologies, I'm not familiar with run-time inheritance in PHP. My
>> understanding was that when a class's source code is compiled, its
>> parent classes must be known. When is this not the case? Must it be
>> known for the class' first instantiation?
>
> No, the problems here are different. The process works as follows:
> 1. Class X source is compiled.
> 2. "X" is added to the class table
> 3. Class Y (extends X) source is compiled.
> 4. Since Y extends X, methods of X are added to methods of Y
> 5. "Y" is added to the class table
>
> Now, adding bytecode caching. Bytecode caching replaces steps 1 and 3 with
> "loaded from cache" - however since the identity of X can change between
> requests, what is stored for step 3 can not bind to X as it is now - for
> that there's step 4 which is executed at runtime, when the line where class
> is defined is executed. That means static table describing class Y can exist
> only after step 4, and it is not cacheable beyond the bounds of one request.

Great explanation, thank you. As far as terminology goes, this is
still static inheritance, as you cannot change a class' parent after
it has been "set" in a request. Run-time inheritance is where it can
change, for example in Javascript where an object's prototype can be
changed. I think you could do lookup caches (ie the OP's patch) either
way, but it's probably cheaper with static inheritance.



> However, if we now are compiling the code such as:
> $a->foo();
> we meet with following challenges:
> 1. We do not know what class $a is (suppose it's X, but in most cases we
> won't know that)
> 2. If we did, we do not know what class X is (definition, as opposed to just
> name) at the compile time (it could be defined later)
> 3. If we knew what class X definition is at compile time, the above would
> preclude us from generating any code that binds to that definition since
> such code would not be cacheable.
>
> These are three independent challenges, without overcoming each of them I do
> not see how virtual table would be helpful.

Yes. As I replied to Dmitry, I clearly wasn't thinking when I
suggested this. FYI, I do type-inference on PHP, and the types here
are difficult to calculate in the general case.

Thanks,
Paul



-- 
Paul Biggar
paul.big...@gmail.com




Re: [PHP-DEV] Method call improvements

2009-05-12 Thread Paul Biggar
On Tue, May 12, 2009 at 12:29 PM, Dmitry Stogov  wrote:
>> But I think its fair to say that it has static inheritance - that is,
>> its full inheritance chain is known before it can be instantiated, and
>> it can never be changed after that.
>
> Right, but it has a lot of dynamic issues anyway. E.g. parent class may be
> changed or loaded from different file.

This is what I'm getting at. How can the parent class be changed? I
can see that it might be deferred, but I don't see how it can be
changed once it's set.


Thanks,
Paul




-- 
Paul Biggar
paul.big...@gmail.com




Re: [PHP-DEV] Method call improvements

2009-05-12 Thread Paul Biggar
On Tue, May 12, 2009 at 9:43 AM, Dmitry Stogov  wrote:
>> Apologies, I'm not familiar with run-time inheritance in PHP. My
>> understanding was that when a class's source code is compiled, its
>> parent classes must be known. When is this not the case?
>
> The parent class may be defined in other file that is loaded at runtime
> using include() statement. It's very usual case. So the PHP first loads the
> include file and then declares child class at runtime.

>> Must it be known for the class' first instantiation?
>
> Of course. :)


On Tue, May 12, 2009 at 7:25 AM, Dmitry Stogov  wrote:
> The real thing is even worse, as during compilation of a class its parent
> class doesn't have to be known. So construction of VMTs becomes a bit
> problematic. BTW we could think in this way...


OK, so I don't understand this exactly. Is it correct to say that if a
class uses inheritance its compilation will be deferred until its
first instantiation? Or is it compiled when it is seen, and its parent
backpatched in later? When is later?

But I think it's fair to say that it has static inheritance - that is,
its full inheritance chain is known before it can be instantiated, and
it can never be changed after that.





>> In the worst case, it _might_ be cheaper to build it at instantiation
>> time, but I would have to look up how expensive that is in a more
>> static language to be sure. Certainly, it is currently so expensive
>> that almost anything else would be better (including the OP's patch).
>
> I don't see how run-time VMT construction may help, because calls to virtual
> methods must know the VMT offset at compile-time.

Right. Construction is fine. Their use is not. I don't know what I was
thinking.
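
The distinction between constructing a VMT and using one can be
sketched in C. Everything below (`klass`, `derive`, `call_virtual`,
the slot constants and the sample methods) is a hypothetical
illustration, not Zend engine code. Building the table at run time is
the easy part; the call only works because the slot number was fixed
when the call site was written.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*method_fn)(int);

/* Hypothetical class record: a table of method pointers indexed by slot. */
typedef struct klass {
    const struct klass *parent;
    size_t n_slots;
    method_fn vmt[4];
} klass;

/* Slot numbers must be agreed on when the call site is compiled. */
enum { SLOT_AREA = 0, SLOT_SCALE = 1 };

/* Sample methods for the illustration. */
int base_area(int x)  { return x * x; }
int base_scale(int x) { return x * 2; }
int child_area(int x) { return x * x * 3; }  /* overrides SLOT_AREA */

/* Construction: copy the parent's table, then overwrite the overridden slot. */
klass derive(const klass *parent, size_t slot, method_fn override) {
    assert(parent != NULL && slot < parent->n_slots);
    klass k = *parent;
    k.parent = parent;
    k.vmt[slot] = override;
    return k;
}

/* Use: a single indexed load, which needs `slot` at the call site. */
int call_virtual(const klass *k, size_t slot, int arg) {
    assert(k != NULL && slot < k->n_slots);
    return k->vmt[slot](arg);
}
```

In a language where the receiver's class is unknown when `$a->foo()`
is compiled, there is no way to pick `slot` ahead of time, which is
the objection raised above.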



So it looks like the best way forwards is still the OP's patch?



Thanks,
Paul



-- 
Paul Biggar
paul.big...@gmail.com




Re: [PHP-DEV] Method call improvements

2009-05-12 Thread Paul Biggar
Hi Stas, Dmitry,

On Tue, May 12, 2009 at 2:01 AM, Stanislav Malyshev  wrote:
> IMHO it's not static enough. I.e., since PHP is not compiled, we can not
>  create VD table for the class until runtime inheritance, which means that
> the code using this class cannot use method resolution more efficient than
> name->function, i.e. hashtable. These lookups can be cached (i.e. CV style)
> but I don't see how they can be altogether prevented.

On Tue, May 12, 2009 at 7:25 AM, Dmitry Stogov  wrote:
> The real thing is even worse, as during compilation of a class its parent
> class doesn't have to be known. So construction of VMTs becomes a bit
> problematic. BTW we could think in this way...


Apologies, I'm not familiar with run-time inheritance in PHP. My
understanding was that when a class's source code is compiled, its
parent classes must be known. When is this not the case? Must it be
known for the class' first instantiation?

In the worst case, it _might_ be cheaper to build it at instantiation
time, but I would have to look up how expensive that is in a more
static language to be sure. Certainly, it is currently so expensive
that almost anything else would be better (including the OP's patch).

Thanks,
Paul


-- 
Paul Biggar
paul.big...@gmail.com




Re: [PHP-DEV] Method call improvements

2009-05-11 Thread Paul Biggar
On Mon, May 11, 2009 at 7:47 PM, Guilherme Blanco
 wrote:
> What's the status on this one?!

I think it died from neglect. But it was a really good idea.


One question that was raised was:

> On Thu, Jan 22, 2009 at 10:20 AM, Dmitry Stogov  wrote:
>> However we can introduce some indirect table associated with op_array, which
>> can be used to implement inline caches without direct opcode modification
>> (in the same way as IS_CV variables work). There are a lot of papers about
>> polymorphic inline caches (e.g.
>> http://research.sun.com/self/papers/pics.html) which we probably should use
>> so as not to reinvent the bicycle.

You can't actually use PICs or even ICs with the Zend engine, because
you can't insert code into the callee method's header (you would need
a JIT). You also wouldn't want to, since PHP can't use the
recompilation techniques that Self had. You can use lookup caches,
which is exactly what the original patch was.

FWIW, since PHP has a static inheritance chain, the best approach
seems to be to build a virtual dispatch table, instead of a hashtable
for functions. However, there might be some esoteric extensions which
make this difficult.
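
For readers unfamiliar with the term, a lookup cache can be sketched
roughly as follows. This is a minimal illustration in C, not the
original patch: `klass`, `call_site_cache`, `lookup_by_name` and
`lookup_cached` are hypothetical names, and a linear search stands in
for the engine's function hashtable. It assumes each call site always
asks for the same method name, so the cache is keyed on the receiver's
class alone.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*method_fn)(int);

typedef struct { const char *name; method_fn fn; } method_entry;

/* Hypothetical class record; the array stands in for a function hashtable. */
typedef struct {
    const method_entry *methods;
    size_t n_methods;
} klass;

/* One cache per call site, stored beside the bytecode, not patched into it. */
typedef struct {
    const klass *cached_class;
    method_fn cached_fn;
    unsigned long misses;
} call_site_cache;

/* Sample methods for the illustration. */
int twice(int x)  { return 2 * x; }
int thrice(int x) { return 3 * x; }

/* Slow path: resolve the method by name. */
method_fn lookup_by_name(const klass *k, const char *name) {
    for (size_t i = 0; i < k->n_methods; i++)
        if (strcmp(k->methods[i].name, name) == 0)
            return k->methods[i].fn;
    return NULL;
}

/* Fast path: reuse the last result while the receiver's class is unchanged. */
method_fn lookup_cached(call_site_cache *c, const klass *k, const char *name) {
    assert(c != NULL && k != NULL && name != NULL);
    if (c->cached_class == k)
        return c->cached_fn;
    c->misses++;
    c->cached_class = k;
    c->cached_fn = lookup_by_name(k, name);
    return c->cached_fn;
}
```

The cached entry stays valid only because a class's method table does
not change once the class has been bound within a request, which is
where the static inheritance chain helps.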



Paul

-- 
Paul Biggar
paul.big...@gmail.com




Re: [PHP-DEV] Bug 47468: enabling readline/libedit/pcntl/ncurses with --enable-embed

2009-05-08 Thread Paul Biggar
Hi Arnaud,

Thanks for looking at this.

On Fri, May 8, 2009 at 2:11 PM, Arnaud Le Blanc  wrote:
> Does the following patch work for you? (use ./buildconf after
> applying, then reconfigure). It does the same thing as yours, but moves
> the decision of allowing "cli" extensions to SAPI's config.m4.

I tested this on CVS 5.3 (with the configure command: ./configure
--with-readline --enable-embed --enable-maintainer-zts
--enable-debug). It applies cleanly, and I checked the readline
symbols were present. I have also forwarded it to two of phc's users
who reported the bug. I expect they'll be back to me within a day or
two, if you prefer to wait for further confirmation.

FWIW, I like the new patch better than my hack. However, I don't
understand how it works from the comment (that bit of acinclude is
very shoddily documented in general).

Thanks again,
Paul



-- 
Paul Biggar
paul.big...@gmail.com




[PHP-DEV] Bug 47468: enabling readline/libedit/pcntl/ncurses with --enable-embed

2009-05-06 Thread Paul Biggar
Hi folks,

Could I get someone to look at http://bugs.php.net/bug.php?id=47468?
It includes a patch which is confirmed to fix the problem.

Thanks,
Paul


-- 
Paul Biggar
paul.big...@gmail.com




Re: [PHP-DEV] RFC: Removing the Zend API

2009-04-05 Thread Paul Biggar
Hi Andi,

> On Wed, Apr 1, 2009 at 3:35 PM, Andi Gutmans  wrote:
>> Hi Paul,
>>
>> This is something I have considered in the past esp. as it would also reduce 
>> dependency of extensions on PHP runtime and make it easier for 3rd parties 
>> to distribute PHP extensions which don't have to be rebuilt per-PHP version. 
>> This is similar to JNI.

I'm working on this a little more atm. When you were considering it
before, was anything committed to paper/email? If so, I would be
interested to see any thoughts or discussions.

Thanks,
Paul

-- 
Paul Biggar
paul.big...@gmail.com




Re: [PHP-DEV] RFC: Removing the Zend API

2009-04-02 Thread Paul Biggar
2009/4/1 Johannes Schlüter :
> Hi,
>
> On Wed, 2009-04-01 at 16:16 +0100, Paul Biggar wrote:
>> I think that to handle more complex cases we need the kind of
>> information which makes it straightforward to easily generate code to
>> make a seamless interface between C and the engine API. The only case
>> I had thought of was to somehow mangle structs/pointers into
>> resources. But I suppose we need lengths for strings. I expect (many?)
>> more of these cases will come up.
>
> Well, as soon as any pointer exists you need manual work for a special
> case. And even when only using integers it's not fully straightforward:
> There are cases where not the full integer range is allowed but just a
> few flags  or some specific range. C programmers will know that, passing
> that 1:1 to PHP userland can be bad.

Well, it depends what the pointer does of course. I don't know if we
need to support the general case of 'anything goes with pointers'.
Instead, I had been thinking that the pointer would be a pointer to a
struct, in the manner of 'OO-in-C'.


> For simple cases http://pecl.php.net/package/ffi might be enough, for
> average cases there are just a few APIs (PHP_FUNCTION,
> zend_parse_parameters, RETURN_*) one has to know for a start for an
> extensions, Hartmut's CodeGen_PECL abstracts that using some XML and
> then there's PEAR's Inline_C as some "weird" approach.

I'll take a look at these, thanks for the pointers. However, the main
idea is not exactly what we use, just that we no longer use the Zend
API.


> I'd be happy to have some simple toolkit for this, but I guess it's
> really hard to make some easy tool which really works in average cases
> not just in proof-of-concept cases. This might also be interesting for
> other projects like ProjectZero (PHP using a JVM) or pipp (using Parrot)

Yes. This is one of the motivations. In theory, Project Zero (et al)
would generate their own code from the library spec. AFAIK, they
currently go through the Zend API, which I believe they're not too happy
about.



Thanks,
Paul

-- 
Paul Biggar
paul.big...@gmail.com




Re: [PHP-DEV] RFC: Removing the Zend API

2009-04-01 Thread Paul Biggar
On Wed, Apr 1, 2009 at 3:35 PM, Andi Gutmans  wrote:
> Hi Paul,
>
> This is something I have considered in the past esp. as it would also reduce 
> dependency of extensions on PHP runtime and make it easier for 3rd parties to 
> distribute PHP extensions which don't have to be rebuilt per-PHP version. 
> This is similar to JNI.


It is similar to JNI. This has been done many times before for many
languages, including Python's Pyrex and ctypes, Ruby's FFI, Java's JNI
and JNA, and no doubt countless others. The only difference here is
that I recommend that we make this the _only_ interface (or as close
as we can make it) from the interpreter internals.



> There are some real challenges though and JNI is a good example of those 
> challenges. In order to completely abstract the API from data structure you 
> need higher level API calls esp. for things like arrays and objects which 
> typically incur a significant performance loss. JNI sucks big time on that 
> front. Also it often leads to additional data copying.

All of this happens at the moment to marshal data into zvals. My RFC
does not intend to add to this complexity, but rather to make it work
exactly the same as it does now. So if currently a library avoids
copying values, it should be possible to keep that property. If the
library cannot currently avoid it, I do not expect to be able to avoid
it with a new scheme.

This is very much more important for PHP than JNI is to Java. Every
library shipped with PHP (including most of SPL I believe) is tightly
coupled to the interpreter. By contrast, the vast majority of Java's
Class library is written in Java.



> Also this doesn't necessarily have to replace the Zend API but in fact be an 
> engine independent API. Over time if everyone adopts then we could get rid of 
> Zend API. However, if what I say above is correct, we may find that it's 
> actually very complementary and that some core extensions prefer to hook into 
> the engine very tightly while third parties (e.g. pdflib) and less core 
> extensions prefer to stick to an independent API which can work across not 
> only mini release of PHP but also minor and in some cases major release of 
> PHP.


It doesn't have to, but I think it should. But it would be insane to
expect a new scheme to replace the current one, unless it works
universally.

Core "extensions", like important array and string functions, will
probably need to be tightly coupled to the interpreter. Some other
extensions would too, like Xdebug. If people could suggest other
extensions which should not be decoupled, I would appreciate it.


> This API would need to be designed in great detail and we would need to make 
> sure it is long lasting.

I could not agree more.



> My 2 cents.
> Andi

Thanks for the input, the more the merrier :)
Paul

>
>> -----Original Message-----
>> From: Paul Biggar [mailto:paul.big...@gmail.com]
>> Sent: Monday, March 30, 2009 4:07 PM
>> To: PHP Internals
>> Subject: [PHP-DEV] RFC: Removing the Zend API
>>
>> Hi,
>>
>> I've added a new RFC to the wiki
>> (http://wiki.php.net/rfc/remove_zend_api). It details a plan to try
>> and decouple the Zend engine from the libraries, in order to allow
>> large scale changes to the Zend engine in the future. The RFC
>> describes a prototype phase of the project, which could reasonably be
>> done within a GSOC project, so I have added it to the GSOC 09 page
>> (http://wiki.php.net/gsoc/2009#prototyping_removal_of_the_zend_api).
>>
>> If anybody has any comments, I'd be delighted to hear them. If anybody
>> knows (or is) a good student looking for a GSOC project (and I've left
>> it late, there are only 3 days left to apply), please encourage the
>> student to look at this. Finally, if anybody is interested in helping
>> mentor this as part of the GSOC, I'd be grateful for the help (I have
>> to start writing my thesis soon).
>>
>> Thanks,
>> Paul
>>
>> --
>> Paul Biggar
>> paul.big...@gmail.com
>>
>> --
>> PHP Internals - PHP Runtime Development Mailing List
>> To unsubscribe, visit: http://www.php.net/unsub.php
>
>



-- 
Paul Biggar
paul.big...@gmail.com




Re: [PHP-DEV] RFC: Removing the Zend API

2009-04-01 Thread Paul Biggar
2009/4/1 Johannes Schlüter :
> Hi,
>
> On Wed, 2009-04-01 at 14:24 +0100, Paul Biggar wrote:
>> > Moreover, in your example in the wiki you don't include how you would do
>> > parameter parsing. Or do you rely on the code generator to look at the C
>> > functions signatures and figure out by itself what to do? (actually there is
>> > some ambiguity, AFAIR, and thus guessing cannot be done reliably)
>>
>> That is exactly right. (I'll make this clearer in the RFC). I can't
>> think of any cases where guessing cannot be done reliably. If you can
>> give me an example, I'll try and address it.
>
> Well, take your example:
>
> void Y(char *, int)
>
> Is the second parameter the length of the string or something
> independent? Is the char* changed? And who is going to free it?

Good points. I had initially thought that there should be some simple
declarative DSL, and later thought 'why can't it be a header file in
the simple case'. I guess this is why.

I think that to handle more complex cases we need the kind of
information which makes it straightforward to easily generate code to
make a seamless interface between C and the engine API. The only case
I had thought of was to somehow mangle structs/pointers into
resources. But I suppose we need lengths for strings. I expect (many?)
more of these cases will come up.
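
As a rough sketch of the extra information a header file alone does
not carry, the two readings of `void Y(char *, int)` could be
described declaratively like this. The `param_role` and `param_spec`
types and the `userland_arity` helper are hypothetical, invented for
this illustration; they are not part of the RFC or of any existing
tool.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical roles a binding generator would need for each C parameter. */
typedef enum {
    ROLE_INT,           /* independent integer argument */
    ROLE_STRING_IN,     /* char* only read by the callee; caller keeps ownership */
    ROLE_STRING_INOUT,  /* char* the callee may modify in place */
    ROLE_LENGTH_OF      /* integer carrying the length of another parameter */
} param_role;

typedef struct {
    param_role role;
    int refers_to;      /* index of the described parameter for ROLE_LENGTH_OF, else -1 */
} param_spec;

/* Number of arguments script code passes: length parameters are
 * supplied by the generated glue code, not by the script. */
size_t userland_arity(const param_spec *params, size_t n) {
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (params[i].role == ROLE_LENGTH_OF) {
            assert(params[i].refers_to >= 0 && (size_t)params[i].refers_to < n);
            continue;
        }
        count++;
    }
    return count;
}
```

Described as a string plus its length, Y takes one script-level
argument; described as a string plus an independent integer, it takes
two. The C signature is the same in both cases, which is why it cannot
be the whole specification.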


(Of course, this is why I recommended a SoC project to try it)


Thanks for the comments, I'll update the RFC.
Paul



-- 
Paul Biggar
paul.big...@gmail.com




Re: [PHP-DEV] RFC: Removing the Zend API

2009-04-01 Thread Paul Biggar
On Tue, Mar 31, 2009 at 9:23 PM, Nuno Lopes  wrote:
> Hi Paul et al.,
>
> I fully understand (and even share) your motivations and goals. However it
> seems to me that describing an extension in PHP will lead to loss of
> performance, as you cannot capture certain C features in PHP. For example,
> there are some internal functions that rely on pointer arithmetic to get
> decent performance.

This is not about capturing every C feature. Instead, it is about
strictly separating the C and PHP code. If someone wants to use C pointer
arithmetic, it is simple to code it on the C side of the line. It's not
necessary to expose the exact C function from the library. Sometimes,
you may wish to have a C function wrapping it, to do some "dirty
tricks".


> Then you may extend to PHP to better capture these "dirty tricks", and then
> you'll end up with some DSL for building PHP extensions. It's not
> necessarily bad, it's just a lot of work.. :)

This - which I'll call the Pyrex model - is one way to go, but it's not
my preference. While I think it beats the current model, I hope that
it won't be required with what I propose in the RFC.


> Moreover, in your example in the wiki you don't include how you would do
> parameter parsing. Or do you rely on the code generator to look at the C
> functions signatures and figure out by itself what to do? (actually there is
> some ambiguity, AFAIR, and thus guessing cannot be done reliably)

That is exactly right. (I'll make this clearer in the RFC). I can't
think of any cases where guessing cannot be done reliably. If you can
give me an example, I'll try and address it.



> To summarize my e-mail, I believe this is a very interesting idea, but needs
> a lot more thinking :)  It's a nice SoC project nevertheless.

It certainly does need more thinking, and I'm hoping that people can
pick holes in it, so that I can fill them. A SoC project would be
ideal, as it would probably expose - and hopefully solve - a great
deal of flaws.


Thanks for your comments, I'll try and update the RFC in response.
Paul


> ----- Original Message -----
>>
>> Hi,
>>
>> I've added a new RFC to the wiki
>> (http://wiki.php.net/rfc/remove_zend_api). It details a plan to try
>> and decouple the Zend engine from the libraries, in order to allow
>> large scale changes to the Zend engine in the future. The RFC
>> describes a prototype phase of the project, which could reasonably be
>> done within a GSOC project, so I have added it to the GSOC 09 page
>> (http://wiki.php.net/gsoc/2009#prototyping_removal_of_the_zend_api).
>>
>> If anybody has any comments, I'd be delighted to hear them. If anybody
>> knows (or is) a good student looking for a GSOC project (and I've left
>> it late, there are only 3 days left to apply), please encourage the
>> student to look at this. Finally, if anybody is interested in helping
>> mentor this as part of the GSOC, I'd be grateful for the help (I have
>> to start writing my thesis soon).
>>
>> Thanks,
>> Paul
>
>



-- 
Paul Biggar
paul.big...@gmail.com




Re: [PHP-DEV] Reserved namespaces

2009-03-30 Thread Paul Biggar
On Tue, Mar 31, 2009 at 12:28 AM, Lukas Kahwe Smith  wrote:
> If nobody proposes something, this will just slide by ..

It seems simple enough to add a note on the manual page reserving the
use of some namespaces:

PHP (even if we never use it, best no-one else does)
internals (ditto)
SPL (reserved for SPL)
PEAR (reserved for PEAR)
PECL (reserved for PECL)

Though we could add a simple check to the parser, I see no real need
if we make this list conspicuous enough. Once the users have been
warned, we are in a better position to administer their kick to the
nuts.
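
If a check were wanted after all, it would be small. The following C
sketch is only an illustration of the idea, not parser code: the
function names are hypothetical, and it assumes the check applies to
the first namespace segment only and is case-insensitive, which the
proposal above does not spell out.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* The list proposed above. */
static const char *const reserved_namespaces[] = {
    "PHP", "internals", "SPL", "PEAR", "PECL"
};

/* Case-insensitive comparison of name[0..len) against a reserved word. */
int segment_equals(const char *name, size_t len, const char *reserved) {
    size_t i;
    for (i = 0; i < len; i++) {
        if (reserved[i] == '\0') return 0;  /* name segment is longer */
        if (tolower((unsigned char)name[i]) != tolower((unsigned char)reserved[i]))
            return 0;
    }
    return reserved[i] == '\0';             /* lengths must match too */
}

/* Returns 1 if the first segment of a backslash-separated name is reserved. */
int is_reserved_namespace(const char *qualified_name) {
    assert(qualified_name != NULL);
    if (*qualified_name == '\\') qualified_name++;  /* skip a leading separator */
    size_t len = 0;
    while (qualified_name[len] != '\0' && qualified_name[len] != '\\') len++;
    size_t n = sizeof reserved_namespaces / sizeof reserved_namespaces[0];
    for (size_t r = 0; r < n; r++)
        if (segment_equals(qualified_name, len, reserved_namespaces[r]))
            return 1;
    return 0;
}
```

Matching whole segments matters here: a user namespace such as
PHPUnit must not be caught just because it starts with PHP.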


Paul



-- 
Paul Biggar
paul.big...@gmail.com




[PHP-DEV] RFC: Removing the Zend API

2009-03-30 Thread Paul Biggar
Hi,

I've added a new RFC to the wiki
(http://wiki.php.net/rfc/remove_zend_api). It details a plan to try
and decouple the Zend engine from the libraries, in order to allow
large scale changes to the Zend engine in the future. The RFC
describes a prototype phase of the project, which could reasonably be
done within a GSOC project, so I have added it to the GSOC 09 page
(http://wiki.php.net/gsoc/2009#prototyping_removal_of_the_zend_api).

If anybody has any comments, I'd be delighted to hear them. If anybody
knows (or is) a good student looking for a GSOC project (and I've left
it late, there are only 3 days left to apply), please encourage the
student to look at this. Finally, if anybody is interested in helping
mentor this as part of the GSOC, I'd be grateful for the help (I have
to start writing my thesis soon).

Thanks,
Paul

-- 
Paul Biggar
paul.big...@gmail.com




Re: [PHP-DEV] Re: cvs: ZendEngine2 / zend_builtin_functions.c /tests 009.phpt php-src/ext/reflection/tests 027.phpt php-src/ext/standard/tests/class_object get_class_error_001.phpt get_class_variati

2009-03-16 Thread Paul Biggar
On Mon, Mar 16, 2009 at 6:46 PM, Stanislav Malyshev  wrote:
> Hi!
>
>> Why do functions return NULL? (I know the technical answer, obviously,
>> I'm interested in the design decision). If FALSE is the error
>> condition, what's the point of a separate NULL error condition?
>
> Because false is not an error condition. It's a legitimate value, which can
> be normal answer for many functions.

Is this not the case for NULL?

Thanks,
Paul


-- 
Paul Biggar
paul.big...@gmail.com




Re: [PHP-DEV] Re: cvs: ZendEngine2 / zend_builtin_functions.c /tests 009.phpt php-src/ext/reflection/tests 027.phpt php-src/ext/standard/tests/class_object get_class_error_001.phpt get_class_variati

2009-03-16 Thread Paul Biggar
2009/3/16 Johannes Schlüter :
> On Mon, 2009-03-16 at 18:41 +0100, Pierre Joye wrote:
>> >> Usually we return NULL in case parameter parsing fails, this is
>> >> documented like this:

> The documentation is just like with other functions returning NULL for
> param parsing and false for other errors. While browsing a bit through
> the docs I couldn't find any function mentioning "Returns NULL in case
> parameter parsing fails" even though most functions do.

Why do functions return NULL? (I know the technical answer, obviously,
I'm interested in the design decision). If FALSE is the error
condition, what's the point of a separate NULL error condition?


Paul




-- 
Paul Biggar
paul.big...@gmail.com




[PHP-DEV] CVS Account Request: pbiggar

2009-02-24 Thread Paul Biggar
Contributing benchmarks




Re: [PHP-DEV] substr passing null...

2009-01-28 Thread Paul Biggar
On Wed, Jan 28, 2009 at 6:39 PM, Dan  wrote:
>>
>> Your assumption is wrong then, NULL isn't treated as not passing a
>> value. The reason it worked with substr was by pure chance.
>>
>
> Out of interest, is there a reason that that is the case? Surely passing
> null would be best treated as the same as passing nothing?

NULL is a value. There is no way to tell that when you pass NULL, you
actually intended to pass nothing and use the default value.
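
To make the distinction concrete, here is a minimal sketch. The function name describe_args and its parameters are made up for illustration; func_num_args() is used to see how many arguments the caller really supplied.

```php
<?php
// Hypothetical function with a default value for its second parameter.
function describe_args($string, $length = 3)
{
    // func_num_args() reports how many arguments the caller really passed.
    $passed = func_num_args() >= 2 ? 'passed' : 'omitted';
    return $passed . ':' . var_export($length, true);
}

echo describe_args('abcdef'), "\n";        // omitted:3  (default used)
echo describe_args('abcdef', null), "\n";  // passed:NULL (NULL is a real value)
```

The callee sees NULL in place of the default, so it cannot treat the two calls alike without extra code of its own.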

Paul


-- 
Paul Biggar
paul.big...@gmail.com




Re: [PHP-DEV] Unexpected behavior of $this->$propertyName[...]

2009-01-15 Thread Paul Biggar
On Thu, Jan 15, 2009 at 7:24 PM, Robert Lemke  wrote:
> Dear internals,
>
> please consider the following code executed with PHP 5.3alpha3:

I suspect neither bug nor feature. I think you expect that
 $this->$propertyName['bar']
is the same as:
 ($this->$propertyName)['bar']
but in fact it is:
 $this->($propertyName['bar'])

So in your example, the index 'bar' is converted to the integer 0, giving
"foo"[0], hence 'f'.
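
The two groupings can be written out explicitly with braces, which should behave the same regardless of how a given PHP version parses the unbraced form. This is only a sketch: the class Example and its properties are invented, and an integer index is used so that it does not depend on how non-numeric string offsets are treated.

```php
<?php
// Hypothetical class for illustration: one property named 'foo', one named 'f'.
class Example
{
    public $foo = array('bar' => 'array element');
    public $f = 'property f';

    public function demo()
    {
        $propertyName = 'foo';
        // Read property 'foo' first, then index the array it holds.
        $first = $this->{$propertyName}['bar'];
        // Index the string 'foo' first (giving 'f'), then read property 'f'.
        $second = $this->{$propertyName[0]};
        return array($first, $second);
    }
}

print_r((new Example())->demo());
```

The first element is 'array element' and the second is 'property f', which is the difference between the reading you expected and the one you got.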


> Is this a bug or missing feature?

It has been pointed out before that one should be able to index an
expression, but currently PHP can only index variables. Personally, it
seems like a bug, but I'm not sure whether or not there's consensus.

Paul


-- 
Paul Biggar
paul.big...@gmail.com




Re: [PHP-DEV] Extending PHP with C++ (Newbie Question)

2008-11-15 Thread Paul Biggar
Hi Chris,

On Sat, Nov 15, 2008 at 3:11 PM, Chris Jiang <[EMAIL PROTECTED]> wrote:
> I've been trying to make a PHP extension for my project, and would really
> like to use C++ instead of C to write the code. I've been searching for some
> tutorial or manual for some time already, but not so lucky finding anything
> useful for newbies like myself.


It looks like http://developers.facebook.com/phpembed/ will give you
what you're looking for.

Paul


-- 
Paul Biggar
[EMAIL PROTECTED]




Re: [PHP-DEV] Howto build statically linked PHP CGI (without shared libs)

2008-09-10 Thread Paul Biggar
Hi Santi,

On Wed, Sep 10, 2008 at 10:39 PM, Santi Saez <[EMAIL PROTECTED]> wrote:
> Jani Taskinen wrote:
>>
>> Wrong: --disable-shared --enable-static

As you say, this doesn't work.

I needed something like this for profiling. I believe I got the
static archive linked, but it was a while ago, so I'm not positive
(sadly, the profiling never worked this way).

Here is the script I used. I don't think the script itself works, but
if you have a look through the last 5 or 6 lines of it, it should be
easy enough.

http://code.google.com/p/phc/source/browse/trunk/misc/old/get_libphp5_archive


Those long lines were created by looking at the make output used for
building the shared archive. I think I had to manually remove .o files
every time the linker complained, which was a lot, a problem that you
don't have with shared libs.

Let me know how you get on,
Paul




-- 
Paul Biggar
[EMAIL PROTECTED]


Re: [PHP-DEV] OS X, enable embed problem, patch

2008-04-30 Thread Paul Biggar
Hi Chris,

I'm very interested in having --enable-embed working on OS X. My
experience with libtool is that it abstracts this sort of thing, so
I'm a little bit confused by your work-around. Could you explain why
you added --enable-embed=dylib instead of fixing
--enable-embed=shared?


Thanks,
Paul


On Wed, Apr 30, 2008 at 6:14 PM, Christopher Thompson
<[EMAIL PROTECTED]> wrote:
> Although there's obviously some debate as to the usefulness of my embed
> patch for OS X, here's a replacement for the Makefile.global bit.  It adds
> $(PHP_LDFLAGS) to the compile line for libphp5.dylib, necessary in many
> cases.
>
>
>  --- php-5.2.5.clean/Makefile.global 2007-08-03 08:01:56.0 -0600
>  +++ php-5.2.5/Makefile.global   2008-04-29 17:00:20.0 -0600
>
>  @@ -17,6 +17,10 @@
> $(LIBTOOL) --mode=link $(CC) $(CFLAGS) $(EXTRA_CFLAGS) -rpath
> $(phptempdir) $(EXTRA_LDFLAGS) $(LDFLAGS) $(PHP_RPATHS) $(PHP_GLOBAL_OBJS)
> $(PHP_SAPI_OBJS) $(EXTRA_LIBS) $(ZEND_EXTRA_LIBS) -o $@
> -@$(LIBTOOL) --silent --mode=install cp $@ 
> $(phptempdir)/$@
> >/dev/null 2>&1
>
>  +libphp$(PHP_MAJOR_VERSION).dylib: $(PHP_GLOBAL_OBJS) $(PHP_SAPI_OBJS)
>  +   $(LIBTOOL) --mode=link $(CC) -dynamiclib -install_name
> $(INSTALL_ROOT)$(prefix)/lib/$@ -current_version $(PHP_VERSION)
> -compatibility_version $(PHP_MAJOR_VERSION) -undefined dynamic_lookup
> $(PHP_RPATHS) $(PHP_LDFLAGS) $(PHP_GLOBAL_OBJS:.lo=.o)
> $(PHP_SAPI_OBJS:.lo=.o) $(EXTRA_LIBS) $(ZEND_EXTRA_LIBS) -o $@
>
>  +   -@$(LIBTOOL) --silent --mode=install cp $@ 
> $(phptempdir)/$@
> >/dev/null 2>&1
>  +
>   libs/libphp$(PHP_MAJOR_VERSION).bundle: $(PHP_GLOBAL_OBJS)
> $(PHP_SAPI_OBJS)
> $(CC) $(MH_BUNDLE_FLAGS) $(CFLAGS_CLEAN) $(EXTRA_CFLAGS) $(LDFLAGS)
> $(EXTRA_LDFLAGS) $(PHP_GLOBAL_OBJS:.lo=.o) $(PHP_SAPI_OBJS:.lo=.o)
> $(PHP_FRAMEWORKS) $(EXTRA_LIBS) $(ZEND_EXTRA_LIBS) -o $@ && cp $@
> libs/libphp$(PHP_MAJOR_VERSION).so
>
>
>
>
>



-- 
Paul Biggar
[EMAIL PROTECTED]




Re: [PHP-DEV] [RFC] Strict type hints (parameter and return value)

2008-04-18 Thread Paul Biggar
On Fri, Apr 18, 2008 at 9:03 AM, Lukas Kahwe Smith <[EMAIL PROTECTED]> wrote:
>  Suddenly I need to ensure that all my variables have the proper types. So
> what will people do?
> 
> So maybe an "(int)" here or there
> is not so much more code to type, but you can already see that code you save
> on one end, you have to start adding on the other, the one which you write
> more often.

You are right, of course. I stated before that having the hints
automatically cast the parameter is the best approach. With automatic
casting, which seems to be the fairly obvious semantics for type hints,
you will not need to cast at call-time.


> More importantly, in some cases simple type juggling with
> "(int)" will not be sufficient, you might have to do some minor checks etc.
> In the past library authors did this as part of the code that the strict
> type hinting camp is now hoping to remove. So again more code in my glue
> code.

There isn't a strict typing camp. Felipe's first patch was just
checking, which didn't provide any guarantees to the library author.
The second was strict typing, which provides guarantees, as you say,
at great expense. I am hoping that the third will be casting, which
provides guarantees, at no call-time expense.
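
The three behaviours can be imitated in user-land to show what each one guarantees. This is a sketch under my own assumptions, not the code from any of the patches: the three function names are hypothetical stand-ins for an 'int' hint.

```php
<?php
// User-land stand-ins for the three behaviours; all three names are hypothetical.

// 1. Checking only: a numeric string passes, but the callee still gets a string.
function hint_checking($x)
{
    if (!is_numeric($x)) {
        throw new InvalidArgumentException('not numeric');
    }
    return $x;
}

// 2. Strict: only a real integer passes, so callers must cast beforehand.
function hint_strict($x)
{
    if (!is_int($x)) {
        throw new InvalidArgumentException('not an int');
    }
    return $x;
}

// 3. Casting: anything numeric is accepted and converted for the callee.
function hint_casting($x)
{
    if (!is_numeric($x)) {
        throw new InvalidArgumentException('not numeric');
    }
    return (int) $x;
}

var_dump(is_int(hint_checking('24')));      // bool(false): no guarantee
var_dump(is_int(hint_strict((int) '24')));  // bool(true): caller had to cast
var_dump(is_int(hint_casting('24')));       // bool(true): no call-time cast
```

Only the second and third variants let the library author rely on having an int, and only the third does so without extra code at each call site.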


Paul

-- 
Paul Biggar
[EMAIL PROTECTED]




Re: [PHP-DEV] [RFC] Strict type hints (parameter and return value)

2008-04-17 Thread Paul Biggar
On Thu, Apr 17, 2008 at 9:04 PM, Stanislav Malyshev <[EMAIL PROTECTED]> wrote:

>  In fact, we have good old conversion system used in parse_parameters. So
> why not use the same rules? That'd make at least some sense - give PHP
> programmers the tools that extension C programmers have. I still wouldn't
> like it too much but at least it would make sense :)

As I read it, these convert the parameter to the expected type, if it
is not already, and fail when this is not possible. All that's missing
is scalar and numeric, to be consistent with the is_*() functions, and
it's perfect.


Paul

-- 
Paul Biggar
[EMAIL PROTECTED]




Re: [PHP-DEV] [RFC] Type hints (parameter and return value)

2008-04-17 Thread Paul Biggar
On Thu, Apr 17, 2008 at 8:41 PM, Stanislav Malyshev <[EMAIL PROTECTED]> wrote:
> > is_* are the important functions, since they are the ones which most
> > resemble the type hints under discussion. Up until now, anyone
>
>  Circular argument here.

Yes, I phrased that badly. I believe it is fair to say that there
would be confusion over why is_int may fail after an 'int' type hint.
More generally, I think it is difficult to explain why an int type
hint does not guarantee an int.



> > implementing this in user-land would use is_int(). It makes perfect
>
>  No, you shouldn't do that - there's absolutely no reason to accept 1 and
> reject '1', unless you have some very rare and special circumstance. If you
> are using is_int for that, your code is wrong.

That is very subjective. I doubt there is 'one true style' of PHP,
despite suggestions I see about 'the PHP way'.  But let us suppose you
would use an int cast instead to ensure you have an int (by type, not
coerced value). The behaviour I recommend is also consistent with
casting. It would be syntactic sugar for

function ($x) { $x = (int) ($x); ... }




> > sense that type hints should mimic this behaviour. Not doing so would
> > be very inconsistent, and so confusing.
> >
>
>  Could we stop using word "consistent" to replace "I like it"?

I am very deliberately not using "consistent" to replace "I like it",
except in the sense that I like consistency. It makes no sense to find
that you do not have an int, when you specify you must. I very much
mean consistency, and do not have a preference for the approach,
except that only one seems perfectly consistent in this case, which is
why I have recommended it.


Paul

-- 
Paul Biggar
[EMAIL PROTECTED]




Re: [PHP-DEV] [RFC] Type hints (parameter and return value)

2008-04-17 Thread Paul Biggar
Hi,

On Thu, Apr 17, 2008 at 8:22 PM, Stanislav Malyshev <[EMAIL PROTECTED]> wrote:
> > >  If you talk about consistency, you should remember that right now no
> > > function and no feature in PHP relies on strict checking of primitive
> > > types.
> > >
> >
> > That is not the case (or I am very much mistaken). In particular,
> > is_int, which I mentioned in my email, is predicated on the IS_LONG
> > type only.
> >
>
>  Read that "no function except for a tiny handful of very special cases like
> is_* and serializes and probably one or two other functions that I forgot",
> ok?


is_* are the important functions, since they are the ones which most
resemble the type hints under discussion. Up until now, anyone
implementing this in user-land would use is_int(). It makes perfect
sense that type hints should mimic this behaviour. Not doing so would
be very inconsistent, and so confusing.


Paul


-- 
Paul Biggar
[EMAIL PROTECTED]




Re: [PHP-DEV] [RFC] Strict type hints (parameter and return value)

2008-04-17 Thread Paul Biggar
Hi Stanislav,



On Thu, Apr 17, 2008 at 7:59 PM, Stanislav Malyshev <[EMAIL PROTECTED]> wrote:
> As a final shock, I am not convinced
> functions in PHP should throw exceptions when parameters don't match - and
> even if they do, first they should attempt to make sense of parameters
> (which makes string hints like "integer" and "string" make no sense - you
> can convert a lot of things to int or string).

This is of course why I suggested that the best approach would be to
automatically convert them, rather than reject a type which can be
coerced appropriately. I would settle for the current proposal, but if
you're going to make the language weakly-typed, which it is, it is
more appropriate to keep this in line with the rest of the language.


Paul


-- 
Paul Biggar
[EMAIL PROTECTED]




Re: [PHP-DEV] [RFC] Type hints (parameter and return value)

2008-04-17 Thread Paul Biggar
On Thu, Apr 17, 2008 at 7:02 PM, Stanislav Malyshev <[EMAIL PROTECTED]> wrote:
> > Your current implementation seems to be inconsistent with both itself
> > and the rest of PHP. I hope this can be rectified before it is
> > included. PHP is inconsistent enough without adding more.
> >
>
>  If you talk about consistency, you should remember that right now no
> function and no feature in PHP relies on strict checking of primitive types.

That is not the case (or I am very much mistaken). In particular,
is_int, which I mentioned in my email, is predicated on the IS_LONG
type only.
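
A quick illustration of that strictness, using only built-in functions:

```php
<?php
// is_int() looks only at the type of the value, never at its content.
var_dump(is_int(24));        // bool(true)
var_dump(is_int('24'));      // bool(false): a numeric string is still a string
var_dump(is_int(24.0));      // bool(false): a float is not an int
var_dump(is_numeric('24'));  // bool(true): is_numeric() inspects the content
```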

Paul


-- 
Paul Biggar
[EMAIL PROTECTED]




Re: [PHP-DEV] [RFC] Strict type hints (parameter and return value)

2008-04-17 Thread Paul Biggar
On Thu, Apr 17, 2008 at 5:24 PM, Felipe Pena <[EMAIL PROTECTED]> wrote:
>  > Thanks for making the change. This is much more consistent. It remains
>  > that there is no 'scalar' or 'numeric' hint. Are you planning on
>  > including them?
>
>  No, i'm not planning to do that.

Ah, I should have been more specific. I meant to ask for those
changes. Can I ask why not?



>  > I also wonder about allowing NULL for an 'array' type hint. Is this
>  > acceptable? The patch seems to indicate it is. If so, why is this
>  > allowed?
>
>  I added yesterday an information about that in the doc:
>
>  "Functions are now able to force parameters to be objects (by specifying
>  the name of the class in the function prototype) or arrays (since PHP
>  5.1). However, if NULL is used as the default parameter value, it will
>  be allowed as an argument for any later call."

That doesn't explain why it's allowed. Can you elaborate?



Thanks,
Paul



>  > On Thu, Apr 17, 2008 at 4:42 PM, Felipe Pena <[EMAIL PROTECTED]> wrote:
>  > > Hi.
>  > >
>  > >  Well, thinking better about the behavior of type hinting, i decided to
>  > >  change the proposal for strict type. I.e. don't accept numeric string
>  > >  as an integer, etc.
>  > >
>  > >  If anyone wants to use type hinting, i believe that it should be strict.
>  > >  Otherwise, it makes more sense to not use it.
>  > >
>  > >  Examples, patches and tests: http://wiki.php.net/rfc/typehint
>  > >
>  > >
>  > >  --
>  > >  Regards,
>  > >  Felipe Pena.
>  > >
>  > >
>  > >
>  >
>  >
>  >
>  --
>  Regards,
>  Felipe Pena.
>
>



-- 
Paul Biggar
[EMAIL PROTECTED]




Re: [PHP-DEV] [RFC] Strict type hints (parameter and return value)

2008-04-17 Thread Paul Biggar
Hi Felipe,

Thanks for making the change. This is much more consistent. It remains
that there is no 'scalar' or 'numeric' hint. Are you planning on
including them?

I also wonder about allowing NULL for an 'array' type hint. Is this
acceptable? The patch seems to indicate it is. If so, why is this
allowed?


Thanks,
Paul

On Thu, Apr 17, 2008 at 4:42 PM, Felipe Pena <[EMAIL PROTECTED]> wrote:
> Hi.
>
>  Well, thinking better about the behavior of type hinting, i decided to
>  change the proposal for strict type. I.e. don't accept numeric string
>  as an integer, etc.
>
>  If anyone wants to use type hinting, i believe that it should be strict.
>  Otherwise, it makes more sense to not use it.
>
>  Examples, patches and tests: http://wiki.php.net/rfc/typehint
>
>
>  --
>  Regards,
>  Felipe Pena.
>
>
>



-- 
Paul Biggar
[EMAIL PROTECTED]




Re: [PHP-DEV] [RFC] Type hints (parameter and return value)

2008-04-17 Thread Paul Biggar
Hi Sam,

On Thu, Apr 17, 2008 at 4:01 PM, Sam Barrow <[EMAIL PROTECTED]> wrote:
>  > 2.) is_int has different semantics to the int type hint. Numeric
>  > strings qualify as the latter, but not the former. In general this is
>  > a problem. It seems type hints can only be made consistent if they
>  > convert the actual parameter to the type which is hinted. (Note that
>  > for call-by-reference, this will change the value in the caller, not
>  > just the copy in the callee - I think this is a good idea). As an
>  > example, this will fail, which it shouldn't: function y (int $x) {
>  > assert (is_int($x)); } y ("24");
>
>  The problem with this is that there's not much point in converting the
>  value. PHP will do that anyway, making this kind of pointless.

That is not quite correct. PHP's weak typing is somewhat inconsistent,
and in the example I included, it will not coerce the value of $x. An
'int' type hint is not the same as is_int (), which is a mistake.

It seems the easiest thing is to make the conversion mandatory at
call-time. Alternatives would include weakening is_int(), or making
the 'int' hint fail for numeric strings (as you mention below). I
believe these two solutions are not as good.



>  Overall, I think type hinting should work by checking the type. If it
>  does not match, raise an error. For example, int means int, not numeric
>  string.
>  This only serves to include an additional type juggling system into php,
>  which is very confusing.

This is one alternative. The aim should be consistency (or as you say,
avoiding confusion). Weak typing is already part of the language, so I
don't believe automatic conversion is inconsistent, though your
suggestion clearly is. However, your suggestion is more consistent
than, and therefore preferable to, the current patch.



Thanks,
Paul

-- 
Paul Biggar
[EMAIL PROTECTED]




Re: [PHP-DEV] [RFC] Type hints (parameter and return value)

2008-04-17 Thread Paul Biggar
Hi,

Your current implementation seems to be inconsistent with both itself
and the rest of PHP. I hope this can be rectified before it is
included. PHP is inconsistent enough without adding more.


1.) There are a number of is_* functions. It seems obvious that the
hints should be consistent with them. So you're missing numeric and
scalar. I'm not clear if there is an is_unicode(), but if so, that
should be consistent too. (I believe scalar is particularly important,
at a language level, since mixed and object are provided).

2.) is_int has different semantics to the int type hint. Numeric
strings qualify as the latter, but not the former. In general this is
a problem. It seems type hints can only be made consistent if they
convert the actual parameter to the type which is hinted. (Note that
for call-by-reference, this will change the value in the caller, not
just the copy in the callee - I think this is a good idea). As an
example, this will fail, which it shouldn't: function y (int $x) {
assert (is_int($x)); } y ("24");

3.) This doesn't seem to be integrated with class type hints, in the
codebase. I wonder why that is? (I don't care all that much, I'm just
wondering).

I hope we can get these ironed out.

Thanks in advance.
Paul


On Thu, Apr 17, 2008 at 5:50 AM, Felipe Pena <[EMAIL PROTECTED]> wrote:
> Hello again!
>
>  Now with both parameter and return value type hints.
>
>  For parameter type hints, i have completed the actual implementation
>  with the leftover php types:
>  - string (binary string and unicode)
>  - integer (accepting numeric string too)
>  - double (accepting numeric string too)
>  - boolean ('0', '1', true, false)
>  - resource
>  - object
>
>  (Yes, following our dynamic typing)
>
>
>  Since then, the patch also includes new methods to Reflection:
>  - isInt()
>  - isDouble()
>  - isBool()
>  - isString()
>  - isObject()
>  - isResource()
>
>
>  And for return value, i have modified and improved the previous patch,
>  and now it is using only the "(type)" notation. Hence, that
>  implementation doesn't make any BC break.
>
>
>  Examples, patches and tests: http://wiki.php.net/rfc/typehint
>
>
>  Thanks.
>
>  --
>  Regards,
>  Felipe Pena.
>
>
>
>



-- 
Paul Biggar
[EMAIL PROTECTED]




[PHP-DEV] Discussion of zval semantics

2008-04-11 Thread Paul Biggar
Hi,

I'm writing a conference paper about phc, and I discuss the PHP C API
in some detail. I've included the text I intend to include in my paper
below. I wonder if any of the experts on the topic see any flaws in my
account. All the details come from Sara Golemon's book, my experience
with the PHP embed SAPI, study of the source code, and following
discussions on this list.



The primitive unit of data in PHP is the zval, a small structure encompassing a
union of values---objects, hashtables, strings and numeric types---and
memory-management counters and flags. A PHP variable is a symbol-table entry
pointing to a zval, and multiple variables can point to the same zval, using
reference counting for memory management.

Objects in PHP are copied by reference. Assignment of primitive values,
however, is by copy, meaning that semantically the l-value becomes a copy of
the r-value. As an optimization, the PHP run-time causes the l-value to share
the r-value's zval, increasing the reference count, and the variables become
part of the same copy-on-write set. Assignment can also be by reference, which
puts the two variables in the same change-on-write set, in a similar fashion.
This sets the is_ref flag of the shared zval, indicating that the variables in
this set all reference each other. Updating a variable which is a reference
updates its zval, changing the value of all the other variables in that
change-on-write set.

Variables in a copy-on-write set share the same zval, but are not semantically
related. Although this is an optimization applied by the PHP run-time, it is a
feature which phc must deal with to interact with the run-time, and so it
reuses it for performance. In order to update the value of a variable in a
copy-on-write set, it must first be separated. A copy of its zval is created---a
deep copy in the case of arrays and strings---and the original zval has its
reference count decremented.  zvals with a reference count of zero are
deallocated.

Variables in a change-on-write set must similarly be separated if they are
assigned to a copy-on-write set. Otherwise, assignment to a variable overwrites
a zval's value field, changing the value of all the variables in that
change-on-write set. Variables with a reference count of one, which are in
neither a copy-on-write nor a change-on-write set, are similarly treated.

The PHP interpreter keeps a reference to a variable's zval in global and
function-local symbol-tables---hashtables indexed by the variable's name. When a
function finishes execution, the local symbol-table is destroyed, decreasing
the reference count of all zvals contained within. The global symbol table is
destroyed at the end of the execution of a script.

As a result of the function-local symbol-table, each PHP variable uses a great
amount of space. The zval itself is 16 bytes long. However, the symbol-table is
a hashtable with a 36 byte bucket. Combined with memory allocation overhead,
each variable occupies 68 bytes on a 32-bit platform [?], and nearly double
that on a 64-bit platform. This means that variable allocations, copies,
separations and deallocations are quite expensive---the PHP interpreter spends
over 20% of its time in memory-management, according to our profiling, which
does not include time spent incrementing and decrementing reference counts. As
a result of the re-use of the PHP run-time, phc is afflicted with the same
problem.
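
The copy-on-write and change-on-write sets described above cannot be inspected directly from user-land, but their effects can be observed. A minimal sketch (the variable names are arbitrary, and it shows only the visible semantics, not the sharing of zvals underneath):

```php
<?php
// Copy-on-write set: $b behaves as a copy of $a, whatever is shared underneath.
$a = 'original';
$b = $a;
$b = 'changed';          // $b is separated; $a is untouched
echo $a, "\n";           // original

// Change-on-write set: $c is a reference to $a, so a write is seen by both.
$c = &$a;
$c = 'via reference';
echo $a, "\n";           // via reference

// Assigning out of a reference set by copy gives an independent variable.
$d = $a;
$d = 'independent';
echo $a, "\n";           // via reference

// Objects: the handle is copied, so both variables see the same object.
$o1 = new stdClass();
$o1->value = 1;
$o2 = $o1;
$o2->value = 2;
echo $o1->value, "\n";   // 2
```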



Thanks in advance,
Paul

-- 
Paul Biggar
[EMAIL PROTECTED]




Re: [PHP-DEV] Refactoring code for namespaces appliance

2007-12-22 Thread Paul Biggar
On Dec 22, 2007 12:02 AM, Martin Alterisio <[EMAIL PROTECTED]> wrote:
> 2007/12/21, Paul Biggar <[EMAIL PROTECTED]>:
>
> > > 4) find all function calls (not method calls) which are not keywords
> (array,
> > > isset, unset, list, etc) and prefix them with ::
> >
> > list, array etc wouldnt be confused with functions. Functions are just
> > methods without target. You'd need a short list of functions such as
> > empty, unset etc to avoid.

> Because isset is actually a keyword.

Right, the phc grammar doesn't differentiate between built-ins and
functions, even though the PHP grammar has rules for each. You just
need a list of the built-ins, such as isset, etc. You'll see when you
start working with phc, but the point I was making was that this isn't
hard.


> > > 6) find static method calls with variables used as class name, and mark
> them
> > > for user handled refactoring
> >
> > I'm not sure why this couldnt be done automatically, but finding
> > static method calls is also easy. You could add comments to mark these
> > fairly easily.
>
> Maybe some workaround can be found to str_replace the namespace separator,
> but I think this wouldn't be optimal and possibly harm code readability and
> maintainability. Consider a factory method:
>
>   class Factory {
>   public static function create($what, $arguments) {
> switch ($what) {
> case 'bananas':
>   $class = 'Food_Fruit_Banana';
>break;
>   ...
>  }
> return $class::create($arguments);
>}
> }
> ?>
>
> An automated refactoring would be tempted to do:
>
>   class Factory {
>public static function create($what, $arguments) {
>  switch ($what) {
>  case 'bananas':
>$class = 'Food_Fruit_Banana';
>break;
>...
>  }
> $temp = str_replace('_', '::', $class);
>  return $temp::create($arguments);
>}
>  }
>  ?>
>
> Which I think is not the proper way to refactor this code.
> This would be, IMHO, the right way to refactor this code:
>
>   class Factory {
>public static function create($what, $arguments) {
>  switch ($what) {
>  case 'bananas':
>$class = 'Food::Fruit::Banana';
>break;
>...
>  }
> return $class::create($arguments);
>}
>  }
>  ?>
>
> Considering the variants and possible uses of this syntax, I'll say it's
> better to leave this job to the coder rather than to an automated job.

True, as I'm sorry to say that we don't have the facilities to
automatically match the definition of variables to where they are
used. However, once you have a list of class names, a global replace on
all strings may be all you need.
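
As a rough sketch of that global replace, done on plain source text rather than through phc: the helper name namespace_class_strings is made up, and it assumes the '::' separator used in this thread and that class names appear as complete quoted strings.

```php
<?php
// Hypothetical helper: rewrite string literals that exactly match a known
// class name, replacing '_' with the namespace separator.
function namespace_class_strings(string $source, array $classNames, string $sep = '::'): string
{
    $map = [];
    foreach ($classNames as $name) {
        $new = str_replace('_', $sep, $name);
        // Match the name only when it is a whole single- or double-quoted string.
        $map["'" . $name . "'"] = "'" . $new . "'";
        $map['"' . $name . '"'] = '"' . $new . '"';
    }
    // strtr() with an array replaces each occurrence once, longest match first.
    return strtr($source, $map);
}

$code = "\$class = 'Food_Fruit_Banana'; \$label = 'some_other_string';";
echo namespace_class_strings($code, ['Food_Fruit_Banana']), "\n";
// $class = 'Food::Fruit::Banana'; $label = 'some_other_string';
```

Strings that merely contain underscores are left alone, because only names from the supplied list are rewritten.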

Paul

-- 
Paul Biggar
[EMAIL PROTECTED]




Re: [PHP-DEV] Refactoring code for namespaces appliance

2007-12-21 Thread Paul Biggar
Hi Martin,

On Dec 21, 2007 8:30 PM, Martin Alterisio <[EMAIL PROTECTED]> wrote:
> I've been pondering about how to algorithmically/mechanically prepare a non
> namespaced code for namespaces appliance. This would be a first step which
> will just prepare the code for namespaces in a safe way, but not to profit
> from aliases created by use. The latter will have to be implemented manually
> (IMHO).

phc (www.phpcompiler.org) would be suitable for this purpose. It has
good support for nearly all of the features you need.


> I've come up with the following, for an organization model of a class per
> file.
> Please review.
>
> 1) find the class definition and reduce the class name to namespace/class
> 2) add namespace declaration on top of file
> 3) replace class name in definition with short name

So far this would be trivial.

> 4) find all function calls (not method calls) which are not keywords (array,
> isset, unset, list, etc) and prefix them with ::

list, array etc wouldn't be confused with functions. Functions are just
methods without a target. You'd need a short list of functions such as
empty, unset etc to avoid.

> 5) find all class names (static method calls (no self:: nor parent::), new,
> catch, type hints, instanceof, extends, implements) and:

All of these are trivial.

> 5a) replace separator used (possibly: _) with namespace separator (::)
> 5b) IF the class has no namespace (ie global namespace) prefix the class
> name with ::
> 5c) IF the namespace of the class starts with the same namespace found in
> (1), remove it or prefix it with namespace::
>   (although this is not necessary)

This would be easy too.

> 6) find static method calls with variables used as class name, and mark them
> for user handled refactoring

I'm not sure why this couldn't be done automatically, but finding
static method calls is also easy. You could add comments to mark these
fairly easily.

> 7) find internal functions that use callables (call_user_func, array_map,
> array_filter, etc) and mark them for user handled refactoring

A lot of this could be automated, but there would of course be some
which would require marking.


> Later on, a methodical way of adding use statements for reducing the class
> names used, can be implemented. Although this might be more complicated than
> it seems if the method has to be careful to avoid creating possible name
> clashes.
>
> If anyone has a comment on this, I'll be more than grateful if you share.


If you need any advice on this, please let me know.


Paul

-- 
Paul Biggar
[EMAIL PROTECTED]

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



[PHP-DEV] Re: Compiled variables and backpatching

2007-10-01 Thread Paul Biggar
Hi,

Stanislav wrote me a very useful reply to my CV questions. I wonder if
anyone has time to give me a quick note on what backpatching is?

On 9/23/07, Paul Biggar <[EMAIL PROTECTED]> wrote:
> What is backpatching?
> Is it related to compiled variables, or are they different concepts?
> Why was backpatching necessary?
> Is this performance related, and what performance impact does backpatching have?

Thanks a lot,
Paul


-- 
Paul Biggar
[EMAIL PROTECTED]




Re: [PHP-DEV] Compiled variables and backpatching

2007-09-23 Thread Paul Biggar
On 9/23/07, Stanislav Malyshev <[EMAIL PROTECTED]> wrote:
> > What is a compiled variable?
> > How does it differ from a non-compiled-variable?
>
> When engine encounters variable like $a what it usually does is it looks
> up current symbol table for an entry named 'a', and then takes zval*
> sitting there. However, since during the same scope the entry for $a
> will be the same (though not a value sitting there!), once found it can
> be stored so future calls to $a will not require looking up in the hash
> table. That's what CVs do. The reason, of course, is performance :)

OK, so it's a bit of caching. So it sounds like it stores a pointer
into the symbol table, so that if $a is redefined it will still be
correct. What happens if the symbol table must be rehashed? Does it
still work, or does it look it up again?

Is there a single CV for each $a in a scope, or is there one per
bytecode operand? In other words, in a function which uses $a a lot,
does it look it up for each bytecode which uses $a, or only once for
that function?
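
To make the question concrete, here is a toy model of the caching
described above, in Python rather than the engine's C. SymbolTable,
Frame and slot are hypothetical names. The model assumes one cache slot
per variable name per function activation, which is only one of the two
possibilities asked about, and it makes no claim about what the engine
actually does.

```python
class SymbolTable:
    """Hash table mapping variable names to mutable one-element cells."""

    def __init__(self):
        self.cells = {}
        self.lookups = 0  # counts hash lookups, for illustration

    def lookup(self, name):
        self.lookups += 1
        # The cell plays the role of the table entry; its content is the value.
        return self.cells.setdefault(name, [None])


class Frame:
    """One function activation with a cache slot per variable name."""

    def __init__(self, symtab, var_names):
        self.symtab = symtab
        self.var_names = var_names         # index assigned at compile time
        self.cv = [None] * len(var_names)  # cached cells, filled lazily

    def slot(self, index):
        # The first use does the hash lookup; later uses hit the cache.
        if self.cv[index] is None:
            self.cv[index] = self.symtab.lookup(self.var_names[index])
        return self.cv[index]
```

In this model every operand that mentions $a carries the same index, so
a function that uses $a many times triggers a single hash lookup.
Assigning to $a changes the content of the cell, not the cell itself,
so the cached reference stays valid. Each cell is allocated separately
from the table, so growing the table does not invalidate the cache
either.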


Thanks for the quick response,
Paul


-- 
Paul Biggar
[EMAIL PROTECTED]




[PHP-DEV] Compiled variables and backpatching

2007-09-23 Thread Paul Biggar
Hi,

I'd be very interested to have an overview of compiled variables and
backpatching within the interpreter. I can find plenty of emails and
code that refers to the two topics, but none that defines them. If
there is an overview somewhere, could somebody point me to it?

If not, I have a few questions. Answers of any length are appreciated.

What is a compiled variable?
How does it differ from a non-compiled-variable?
Why were they introduced?
I presume that this is for performance - what performance impact do they have?

What is backpatching?
Is it related to compiled variables, or are they different concepts?
Why was backpatching necessary?
Is this performance related, and what performance impact does backpatching have?


Thanks in advance,
Paul

-- 
Paul Biggar
[EMAIL PROTECTED]
