Hi Stefan,

On Thu, Aug 13, 2009 at 1:42 PM, Stefan Marr<p...@stefan-marr.de> wrote:
> Hello internals:
>
> I had a look at the Zend Engine to understand some
> details about its internal design with respect
> to its opcodes and machine model.


To start with, the best reference about the Zend engine that I know of
is a presentation by Andy Wharmby at IBM:
www.zapt.info/PHPOpcodes_Sep2008.odp. It should answer a lot of your
questions.



> Would like to ask you for some comments if the
> following sounds wrong or misinterpreted to you:
>
>
> So, the basic design of the Zend Engine is a
> stack-based interpreter for a fixed length

No, it's a register-based interpreter. There is a stack, but that's used
for calling functions only. The operands to the opcodes are pointed to
by the opcodes in the case of compiled variables, or looked up in symbol
tables otherwise. That's as close to a register machine as we can get, I
think, but it's not very close to a stack machine. In a stack-based VM,
the operands to an opcode would be implicit, with add for example
using the top two stack operands, and that's not the case at all.
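
To make the contrast concrete, here is a rough sketch in C. It is not
Zend code: stack_vm, reg_add_insn and the function names are made up for
illustration. The first half is how a stack machine would do an add
(operands implicit), the second half is the register style, where the
instruction itself names its operand and result slots.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stack machine: ADD names no operands. */
typedef struct {
    long slots[16];
    size_t top;                 /* number of values currently on the stack */
} stack_vm;

void stack_push(stack_vm *vm, long v) {
    assert(vm->top < 16);       /* guard against overflow in this toy VM */
    vm->slots[vm->top++] = v;
}

long stack_pop(stack_vm *vm) {
    assert(vm->top > 0);        /* guard against underflow */
    return vm->slots[--vm->top];
}

/* Pops the top two values and pushes their sum. */
void stack_add(stack_vm *vm) {
    long rhs = stack_pop(vm);
    long lhs = stack_pop(vm);
    stack_push(vm, lhs + rhs);
}

/* Hypothetical register-style ADD: operand and result slots are explicit. */
typedef struct {
    size_t op1, op2, result;    /* indices into a table of variables */
} reg_add_insn;

void reg_add(long *vars, const reg_add_insn *insn) {
    vars[insn->result] = vars[insn->op1] + vars[insn->op2];
}
```

With the stack version you push 2 and 3, call stack_add, and the sum is
left on top of the stack. With the register version the instruction
{0, 1, 2} reads slots 0 and 1 and writes slot 2, and no stack is touched.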


> instruction set (76byte on a 32bit architecture),

Andy's presentation says 96 bytes, but that might be 64-bit. I presume
this means sizeof(struct _zend_op)?
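
As a rough illustration of why the number depends on the architecture,
here is a simplified stand-in for such an instruction. The layout is an
assumption for the sake of the example, not the real definition:
sketch_node and sketch_op are made-up names, and the actual struct is in
Zend/zend_compile.h. The point is only that pointers and longs widen on
64-bit, so the instruction size changes with them.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical operand node: a type tag plus a union of possible payloads. */
typedef struct {
    int op_type;
    union {
        double constant;        /* stand-in for an inline constant value */
        unsigned int var;       /* stand-in for a variable slot index */
        void *ptr;              /* stand-in for pointer payloads */
    } u;
} sketch_node;

/* Hypothetical fixed-length instruction: handler, three nodes, extras. */
typedef struct sketch_op {
    void (*handler)(void);      /* pointer width differs between 32/64-bit */
    sketch_node result, op1, op2;
    unsigned long extended_value;
    unsigned int lineno;
    unsigned char opcode;       /* the one-byte opcode itself */
} sketch_op;

size_t sketch_op_size(void) {
    /* The three operand nodes dominate; the opcode is a single byte. */
    assert(sizeof(sketch_op) > 3 * sizeof(sketch_node));
    return sizeof(sketch_op);
}
```

Printing sketch_op_size() from a 32-bit and a 64-bit build of the same
file would show the kind of difference I mean, though the numbers for
this sketch will not match the real struct.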


> where the instruction encoding
> is much more complex than for instance for the
> JVM, Python, or Smalltalk.

Yes, definitely.



> Even so, the source code is compiled to a linearized
> instruction stream, the instruction itself contain not just opcode and
> operands.
>
> The version I looked at had some 136 opcodes encoded
> in one byte, but the rest of the instruction has
> many similarities with an AST representation.

Are you referring to the IS_TMP_VAR type of a znode?


> Instructions encode:
>  - a function pointer to the actual handler which is
>   used to execute it

The type of interpreter dispatch used can be chosen when the executor is
generated, using the --with-vm-kind option of zend_vm_gen.php (CALL,
SWITCH or GOTO). The call-based interpreter is the default. I've heard
the others are buggy, but I'm not certain where I heard that.
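
For what "call-based" means here, a minimal sketch in C. This is not the
Zend executor; insn, vm_state and the op_* handlers are names invented
for the example. Each instruction carries a pointer to its handler, and
the main loop simply calls through that pointer until a handler asks it
to stop.

```c
#include <assert.h>
#include <stddef.h>

typedef struct vm_state vm_state;
typedef struct insn insn;

/* A handler returns nonzero to continue, zero to stop the loop. */
typedef int (*handler_fn)(vm_state *, const insn *);

struct insn {
    handler_fn handler;         /* function pointer stored in the instruction */
    long operand;
};

struct vm_state {
    long acc;                   /* single accumulator, to keep the toy small */
    size_t pc;                  /* index of the current instruction */
};

int op_add(vm_state *s, const insn *i) { s->acc += i->operand; s->pc++; return 1; }
int op_mul(vm_state *s, const insn *i) { s->acc *= i->operand; s->pc++; return 1; }
int op_halt(vm_state *s, const insn *i) { (void)s; (void)i; return 0; }

/* Call-based dispatch: no switch on an opcode, just an indirect call. */
long run_call_based(const insn *code) {
    vm_state s = {0, 0};
    assert(code != NULL);
    for (;;) {
        const insn *cur = &code[s.pc];
        if (!cur->handler(&s, cur)) {
            break;
        }
    }
    return s.acc;
}
```

A switch-based interpreter would instead keep an opcode number in each
instruction and branch on it inside one big function; the goto variant
would jump straight to a label for each opcode.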


> However, it's not a simple, single stack model,
> but uses several purpose-specific stacks.

How so?


> What I am not so sure about is especially the
> semantics of the result field and the pointer
> to the other function (op_array).
>
> Would be grateful if someone could comment on that.

I'm not sure what's confusing about the result field. It points to a
zval, same as op1 and op2.

I _think_ that op_array is used to attach extra information to the
opcode by special extensions. I can't think of an example off the top
of my head.



> I am also not really sure, with this complexity,
> whether it is not actually some kind of abstract syntax
> tree instead of an instruction set like Java
> bytecode. That's not a technical problem, but merely
> an academic question to categorize/characterize PHP.

I think the result field of a znode can make it seem like that, but I
would characterize it as you have before: an instruction set just like
Java bytecode. Way more complicated, obviously, but I don't think it's
very close to an AST. Certainly the interpreter does not really
resemble an AST walker.



I hope I answered what you were looking for. I'm not certain about a
few of my answers, since I've really avoided the interpreter in my
work, but I think most of it is OK.



Best of luck,
Paul



-- 
Paul Biggar
paul.big...@gmail.com

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
