Re: [PHP-DEV] How does the interpreter work

2009-12-21 Thread Rob Nicholson

Christian Grobmeier  wrote on 21/12/2009 13:56:08:


> I would like to learn more about how the interpreter works, but I was
> unable to find good documents on the web. Basically I am thinking on
> something about allocation of variables, how does object creation work
> and such stuff. Maybe something on the overall architecture of PHP
> would be of interest too.
>
> In java world there is the JVM specification, I hoped there is
> something for PHP too.

Hi Christian,

The PHP architecture is a little different from the JVM in that it does not
explicitly document/specify the interface between the compiler and the
bytecode/opcode interpreter the way that Java does. It still exists though.

I suggest you look at the links under here:

http://www.php.net/manual/en/internals2.php
In particular:
 http://www.php.net/manual/en/internals2.opcodes.php

Another good reference is Sara Golemon's book "Extending and Embedding PHP".

Andy Wharmby produced a set of charts which you can find on Zoe's blog
here: http://zoomsplatter.blogspot.com/2008/08/php-opcodes.html
These may help you make a quick start on understanding the overall design.

Rob.


-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] RE: Optimizer discussion

2009-06-07 Thread Rob Nicholson
Graham, Paul,

Paul Biggar  wrote on 07/06/2009 02:28:48:
> 
> On Fri, Jun 5, 2009 at 11:23 PM, Nuno Lopes wrote:
> 
> > About runkit & friends, I wouldn't worry much about them. If you're running
> > them you probably also don't care about optimizations. If you want to be
> > able to optimize something, you need to remove as many degrees of freedom
> > as you can.
> 
> This is probably true of runkit. However, I would be careful what you
> remove for extra freedom. There is very likely PHP code out there that
> relies (possibly by accident) on some edge cases.
> 

Firstly, it's great to see more and more folks experimenting with the
implementation of PHP. I think this will be good for the wider PHP community
as the design of PHP and the possible optimisations become better understood.

I think you'll find that there are a lot of "edge cases", as Paul mentions,
in PHP that PHP code relies on. I work on IBM's Project Zero and we have hit
quite a lot of them. Just one example to illustrate:
we found that the evaluation order within assignments is not at all what you
might predict, and that existing PHP applications actually rely on that
evaluation order. Consider the following, where foo(), bar() and baz() have
some coupling:
$a[foo()]=$b[bar()][baz()];
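
One way to see the order for yourself is to have each function record its
own call. This is only a minimal sketch: the bodies of foo(), bar() and baz(),
the record() helper and the CallLog class are all hypothetical stand-ins, and
it deliberately makes no claim about the order you will see, since that is
exactly the engine behaviour in question.

```php
<?php
// Hypothetical call recorder; not part of any real application.
final class CallLog
{
    public static $calls = array();
}

// Note the call, then hand back the value to use as an array key.
function record($name, $result)
{
    CallLog::$calls[] = $name;
    return $result;
}

function foo() { return record('foo', 'k'); }
function bar() { return record('bar', 'x'); }
function baz() { return record('baz', 'y'); }

$a = array();
$b = array('x' => array('y' => 42));

// The statement under discussion.
$a[foo()] = $b[bar()][baz()];

// Print the order in which the engine actually called the functions.
echo implode(', ', CallLog::$calls), "\n";
```

Running the same script on an optimised and an unoptimised engine and
comparing the printed order is a cheap first check before writing a PHPT test.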

Even though the test coverage of the Zend Engine, as measured by line
coverage, is fairly complete, we found that testcases to verify this
behaviour were missing. We've been following a policy of writing new tests
for any such behaviour that we find, so I would suggest that you ensure that
you can run and pass all the PHPT testcases under /tests/lang and under /Zend.

For example, the tests for the behaviour I mention above are
tests/lang/engine_assignExecutionOrder_XXX.phpt

Then, if you find any more PHP code that does not run the same optimised as
it does unoptimised, it would be great if you could contribute testcases for
it.

Actually, for full disclosure, I should say that although most of the tests
we have written are now in CVS, we are still a little behind with
contributing all the engine tests we have written. Hopefully they'll all be
there before you need them.


Rob Nicholson





Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number 
741598. 
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU







Re: [PHP-DEV] RFC: Removing the Zend API

2009-04-06 Thread Rob Nicholson
Paul Biggar  wrote on 31/03/2009 00:06:33:

> 
> I've added a new RFC to the wiki
> (http://wiki.php.net/rfc/remove_zend_api). It details a plan to try
> and decouple the Zend engine from the libraries, in order to allow
> large scale changes to the Zend engine in the future. The RFC
> describes a prototype phase of the project, which could reasonably be
> done within a GSOC project, so I have added it to the GSOC 09 page
> (http://wiki.php.net/gsoc/2009#prototyping_removal_of_the_zend_api).

Hi Paul,


This is certainly an interesting project. I work on Project Zero and
I see from the wiki that you have looked at the approach we have taken.
As you correctly point out, Project Zero wants to allow users to re-use the
majority of PHP extensions without re-writing them and, as you observe,
using the existing interface as we do today brings a number of problems.
We would also like to enable others to attach arbitrary PHP extensions
written in C to Project Zero.

So we would like to see the "PHP Native Interface" be successful and would
like to help if we can.

A few of the most significant issues from my perspective:

1. PHP arrays present a significant issue. Look at the code in array.c.
Much of this code rummages directly in the internals of the Zend Engine
implementation of hashtables, and needs to in order to achieve reasonable
performance. We were unable to attach this code to a JVM implementation of
PHP and rewrote it in Java.

Perhaps we will need to accept that the array manipulation functions and a
small set of other built-in extensions must continue to use the internal
interfaces. It's also worth mentioning that today many extensions make use
of the Zend HashTable implementation for their own purposes (as a general
library function) in addition to using the HashTable as an interface.

2. Memory management. If we separate extensions from the internal
implementation of zvals then it becomes difficult to manage memory allocated
by the extension during a request. This "falls out in the wash" today because
extensions participate in the Zend Engine's reference counting scheme, which
allows memory to be de-allocated once the refcount falls to zero.

3. One logistical problem, it seems to me, is that in order for this project
to gain traction a significant number of extensions would need to adopt it,
and in order for extensions to adopt it we would need to convince their
maintainers that the project had traction. I wonder whether improving the
interface could be combined with some of the Unicode work, so that the
resulting porting work for Unicode was simpler?

Rob Nicholson








Re: [PHP-DEV] Writing PHPT tests

2008-02-02 Thread Rob Nicholson
Hi Zoe,

I notice also that the testcase generator and the testcases we have been
committing differ slightly from the conventions listed at
http://qa.php.net/write-test.php . This was based on feedback we received,
so I think that http://qa.php.net/write-test.php should be updated.
Specifically, the convention of writing multiple small tests to cover
basic, variation and error cases separately differs from the naming
convention suggested on the QA pages. We have also been following a
convention w.r.t. comments in the testcase which I think is helpful, so I'd
suggest we cover this too. Since the testcase generator adds these
comments, it's worth documenting them.

I'll produce a patch to the documentation for review if you like.


Rob Nicholson



From: zoe <[EMAIL PROTECTED]>
To: internals@lists.php.net
Date: 02/02/2008 11:21
Subject: [PHP-DEV] Writing PHPT tests



Hi - for any of you that are writing PHPT tests for existing extensions 
- I put a PHP script called generate_phpt.php into PHP 5.3 yesterday 
which might help a little.

It's quite a simple command line script (Raghu and I wrote it last
year). It works by looking at the {{{proto line for a function in PHP
source code and constructing a test case frame from it. It can be used to
construct very simple test cases - or to turn an existing PHP file into
PHPT format. It doesn't try and guess what the results of a test should
be :-). I will document it properly on qa.php.net later. In the meantime:

php generate_phpt.php --help

tells you what it's supposed to do.

Zoe

PS - It doesn't work for PHP6 right now because the {{{proto line is 
different.


Re: [PHP-DEV] how php knows the charset of my code?

2007-09-27 Thread Rob Nicholson
Suggest you check out the slides from one of Andrei's talks on the subject.
http://www.gravitonic.com/talks/







Re: [PHP-DEV] how php knows the charset of my code?

2007-09-27 Thread Rob Nicholson
drysler,

So you are asking about what PHP6 calls "script_encoding". PHP5 does not
have this concept.

PHP5 does not know the character set that your script is encoded in. The
script can be encoded in any character set that is "ASCII compatible".
ASCII-compatible encodings have the property that any byte value which has
a meaning as an ASCII character has the same meaning in that character set.
iso-8859-1 and utf-8 both have this property, whereas utf-16, for example,
does not. There are rules about what characters can be used in identifiers
(see http://uk.php.net/manual/en/language.functions.php#functions.user-defined),
but you'll see that these are described in terms of byte values. Thus, when
you define an identifier or string literal in a script, it is simply treated
by PHP as a series of byte values, without any understanding of the meaning
of those byte values other than when they happen to represent ASCII
characters.
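
A small sketch may make the byte-level view concrete. It writes the same
word as raw bytes in iso-8859-1 and in utf-8 and checks both against the
identifier rule, which I am quoting from memory of the manual page linked
above, so treat the exact pattern as an assumption. All variable names are
made up for the illustration.

```php
<?php
// "cafe" with an acute accent on the e, written as raw bytes in two
// encodings. Hex escapes keep this source file itself pure ASCII.
$latin1 = "caf\xE9";       // ISO-8859-1: e-acute is the single byte 0xE9
$utf8   = "caf\xC3\xA9";   // UTF-8: e-acute is the two bytes 0xC3 0xA9

// The ASCII letters occupy identical bytes in both encodings.
$sameAsciiPrefix = substr($latin1, 0, 3) === substr($utf8, 0, 3);

// strlen() counts bytes, not characters.
$lengths = array(strlen($latin1), strlen($utf8));

// Identifier rule expressed purely in byte values (assumed from the manual).
$identifier = '/^[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*$/';
$latin1IsIdentifier = preg_match($identifier, $latin1) === 1;
$utf8IsIdentifier   = preg_match($identifier, $utf8) === 1;

// The snippet from the question contains only ASCII bytes.
$snippet = 'public function foo($bar = true) { return self::SOME_CONSTANT; }';
$snippetIsAscii = preg_match('/^[\x00-\x7f]*$/', $snippet) === 1;

var_dump($sameAsciiPrefix, $lengths, $latin1IsIdentifier, $utf8IsIdentifier, $snippetIsAscii);
```

Because the snippet from the question is ASCII only, saving it as iso-8859-1
or as utf-8 produces the same bytes, so PHP5 sees no difference between them.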

Rob.

drysler <[EMAIL PROTECTED]> wrote on 27/09/2007 20:32:09:

> > On 9/27/07, drysler <[EMAIL PROTECTED]> wrote:
> >> Hello,
> >>
> >> i am practising with charsets at the moment and so i thought:
> >>
> >> -> How does PHP know the charset i use in my source-code?
> >> -> Are php-sources limited to specific charsets?
> >> -> In which areas you have to be aware of the source-code-charset?
> >>
> >>
> >> Perhaps somebody here on the list can tell something about these issues?
> >> Thanks!
> >>
> > Unless I'm mistaken, PHP expects the source files to be in the
> > internal charset, which is ISO-8859-1. If you use the mbstring
> > extension, you can use different internal encodings. See:
> > http://www.php.net/mbstring
> > 
> > Another good read on charset vs. PHP is:
> > http://www.phpwact.org/php/i18n/charsets?s=utf
> > 
> > --
> > troels
> 
> 
> I think, the problem may be divided into 2 areas:
> 
> 1) handling charsets of data (e.g. regex or string functions)
> 
> No unsolvable problem. You have to know (and/or validate) the charset of
> the data you process, no matter if typed in in the source code or loaded
> from other data sources. There are "tools and workarounds" available, to
> do the things right.
> 
> 2) paying attention to the charset of the source code
> 
> This is the main issue, i wanted to address with my posting.
> I asked myself, if there can be characters i use as source code, which
> php perhaps can not recognize because of the charset i used in the 
> source-code-document.
> Or perhaps in php are only characters "allowed", which are represented 
> all the same in all supported charsets, so there might be a list of 
> charsets, you can safely use when scripting php.
> 
> I mean, is there a difference (bytes?) writing the following in 
> iso-8859-1 or utf-8?
> 
> public function foo($bar = true) {
>     return self::SOME_CONSTANT;
> }
> 
> And if there is a difference, how php knows what i typed?
> 
> So many questions  :)
> 
> 
> --
> Greetings,
> drysler
> 