Hi,

> -----Original Message-----
> From: Nikita Popov [mailto:nikita....@gmail.com]
> Sent: Tuesday, June 6, 2017 2:43 PM
> To: François Laupretre <franc...@tekwire.net>
> Cc: PHP internals <internals@lists.php.net>
> Subject: Re: [PHP-DEV] Proposing inclusion of PCS in the 7.2 core distribution
> 
> On Mon, Jun 5, 2017 at 7:46 PM, François Laupretre <franc...@tekwire.net>
> wrote:
> 
> > Hi,
> >
> > PCS provides a fast and easy mechanism to mix C and PHP code in PHP
> > extensions (more about PCS at http://pcs.tekwire.net). Thanks to the
> > PHP 7 performance improvement and the inclusion of opcache in the core,
> > a lot of existing non-performance-critical extension code may now be
> > converted to PHP without significant performance loss (this must be
> > measured case by case, of course, but tests show that opcode-cached
> > PHP code is often faster than C).
> >
> > Another motivation is the lack of extension maintainers. It may be
> > complex to convert a C extension to PHP but, once it's done,
> > maintenance becomes much easier.
> >
> > As one of PCS goals is to allow converting parts of existing core
> > extensions to PHP, it seems natural to initiate the movement by an
> > inclusion of PCS in the core distribution. Then, I and others will
> > start proposing conversions of existing code. IMO, the PDO generic
> > layer is a perfect candidate, but there are many others.
> >
> > Converting existing C code to PHP is not the only usage. With PCS,
> > adding an OO layer to a function-only extension becomes an easy task.
> > Sara recently told about a curl OOP layer
> > (https://gist.github.com/sgolemon/e95bfc34d34c4f63fa953ee9294ae02c).
> > Using PCS, adding such PHP code on top of the curl extension would take
> > less than one hour.
> >
> > I hadn't proposed this so far because the 'cache_key' operation
> > currently proposed for 7.2 is a pre-requisite, as PCS exposes the PHP
> > code it manages via a stream wrapper.
> >
> > So, please give me your thoughts. Suggestions of potential candidates
> > to be rewritten from C to PHP are welcome too.
> >
> 
> Hi,
> 
> First of all: I think the ability to implement parts of PHP extensions in
> PHP is extremely important and will be a game changer in our ability to
> maintain and improve our standard library.
> 
> There are essentially only two good reasons for implementing functionality
> in C: One is performance, the other is FFI. Unfortunately, the requirement
> to use C for everything inside an extension means that we write large
> amounts of C code that does not fall into either of those categories. The
> resulting code is hard to maintain, often subtly buggy and usually not
> consistent with ordinary userland PHP code. Typical issues we see all the
> time are bad or completely absent serialization support, lack of circular
> garbage collection, crashes when the object is improperly initialized and
> bugs appearing when internal classes are extended.
> 
> On top of that, implementing certain functionality in C actually makes the
> resulting code slower than equivalent PHP code. While our virtual machine
> is highly optimized, our internal APIs are often not, or are not typically
> used in their most efficient form. One case where internal code loses is
> the invocation of userland callbacks. Another is access to properties.
> 
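
To make the callback point measurable, here is a minimal sketch (not from the
thread) that times array_map() with a closure against an equivalent foreach
loop. The data size and variable names are arbitrary assumptions, and the
timings depend on PHP version and machine, so treat it as a way to measure
rather than as a result:

```php
<?php
// Sketch: an internal function invoking a userland callback per element,
// compared with the same work done by a plain PHP loop.
// Timings are machine- and version-dependent.

$data = range(1, 200000);
$double = function (int $x): int {
    return $x * 2;
};

$t0 = microtime(true);
$viaInternal = array_map($double, $data); // C code calls back into PHP
$t1 = microtime(true);

$viaLoop = [];
foreach ($data as $x) {                   // stays inside the VM
    $viaLoop[] = $x * 2;
}
$t2 = microtime(true);

printf("array_map + closure: %.4fs\n", $t1 - $t0);
printf("foreach loop:        %.4fs\n", $t2 - $t1);
```

Both variants build the same array, so only the elapsed time differs.
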
> The current situation also has a large and somewhat hidden impact on our API
> design. Due to the large maintenance burden that implementing "proper"
> APIs imposes on us, we tend to go with the simplest possible API. Usually this
> means that we end up directly exposing C binding APIs, even if they are a very
> bad fit for PHP. As already noted in this thread, the current curl API is
> such an example. (I know that some people will argue that it's better to
> expose simple procedural APIs rather than fancy object oriented APIs --
> however, that's a choice that should be made based on technical arguments,
> not due to technical limitations.)
> 
> Some people have mentioned that this is better solved by shipping the PHP code
> separately using composer. While this may be viable for 3rd party extensions
> (and may be preferable if they have large fractions of PHP code), this option
> does not exist for our standard library. We can hardly tell people that they
> should go install a composer package in order to make use of some APIs in our
> standard library.
> 
> Anyway, to get back to the topic of PCS. First, I would recommend targeting
> PHP 7.3 for this change. Feature freeze for 7.2 is in a bit over a month and
> I think we'll want to make some non-trivial changes to how this works if we
> integrate it in PHP. If added to PHP, I think this should be integrated into
> the core, rather than being an extension.
> 
> Here are some random thoughts:
> 
> 1. As far as I understand, PCS relies on autoloading. There are two issues
> here: First, autoloading does not register symbols prior to autoloading.
> This means that functions like get_declared_classes() will not behave as
> expected. Second, autoloading does not support functions. I think both of
> these problems can be solved with some up-front symbol analysis. Lazily
> compiling internal functions should not run into any of the problems we have
> with userland function autoloading.
> 
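
The first issue can be reproduced in plain userland PHP. The sketch below is
an illustration only: the class name LazyDemo and the eval-based loader are
hypothetical stand-ins, not how PCS loads code. It shows that a class served
by an autoloader is invisible to get_declared_classes() until its first use:

```php
<?php
// Hypothetical class name and loader, for illustration only.
spl_autoload_register(function (string $class): void {
    if ($class === 'LazyDemo') {
        // A real loader would include a file; eval keeps this self-contained.
        eval('class LazyDemo { public function hello(): string { return "hi"; } }');
    }
});

// Before first use the symbol is not registered anywhere.
$before = in_array('LazyDemo', get_declared_classes(), true);
$probe  = class_exists('LazyDemo', false); // false = do not trigger autoloading

$obj = new LazyDemo();                     // first use triggers the autoloader

$after = in_array('LazyDemo', get_declared_classes(), true);

var_dump($before, $probe, $after);
```

There is no comparable hook for functions, since spl_autoload_register() is
only consulted for classes, interfaces and traits.
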
> 2. It has already been mentioned in the thread, but what seems to be lacking
> right now is a good way of integrating PHP and C portions. As far as I
> understand, PCS allows you to write an entire class in PHP, but it does not
> allow you to offload parts of the functionality to C without exposing
> additional public APIs. I think there are two things we can do here:
> 
> a) Provide a mechanism that makes certain functions only available inside
> extension PHP code. This would allow exposing some private PHP functions
> which are only used in the internal implementation of the extension.
> 
> b) Allow binding some methods specified in PHP to an internal implementation.
> The way this works in HHVM is that the PHP file contains just a signature,
> with an attribute that signifies that an internal implementation will be
> bound to that function:
> 
> class Bar {
>     <<__Native>>
>     function foo($args);
> }
> 
> This would be useful for other reasons as well. In particular, this could
> become a replacement for the existing arginfo-based signature specification,
> which is somewhat limited and causes discrepancies with userland classes.
> For example, arginfo does not support default values.
> 
A mechanism like the one HHVM has would surely be useful. Where I see a
concern for PHP is that, with the original proposal, a PHP interpreter is
needed to process the PHP partials. At the time the core is compiled, one
cannot be expected to be available. Furthermore, if the freshly compiled
binary were used for this, it could cause problems with FDO/PGO, because the
required preparation tasks would produce training data that is not
necessarily desired. Perhaps that is solvable by extending the build so that
an independent minimal binary is produced just for this purpose. That,
however, might not suffice depending on complexity, e.g. for a PECL ext that
depends on classes of another one and tries to use type hints, classes from
the dependency, etc. Perhaps this needs more evaluation for non-core exts.
It's somewhat of a chicken-and-egg issue.

IMO, in any case, being able to hand pieces off to PHP code would be a huge
win. It would reduce the complexity of the actual C parts and make
development faster and of higher quality. There are always cases where moving
part of the implementation to C is a win in speed or functionality, and cases
where moving a C implementation into userland doesn't make things worse but
simplifies a lot. Currently it works almost only one way: only moving parts
to C is a win in most cases. The flexibility of having both would be a huge
win, too.

I'd also see the topic as tightly coupled with the previous discussions about
Opcache core integration. Looking at what Python does, the ability to
redistribute opcode-cached binaries might make sense. Of course, there are
differences in how it works, so we could introduce some naming/configuration
conventions to pursue the goal. That might also involve a change to the
package.xml spec for PECL exts. In general, I'd put the Opcache integration
in the foreground, as it is the long-standing topic and would make even more
sense once the JIT branch is integrated. A PECL ext can ship sources only,
but given that it's compiled for and on a specific platform, having the right
opcode binaries is more optimal. Maybe supporting both ways, either pure PHP
or opcode binaries, would be useful, too. In any case, Opcache integration
with the core could be a game changer in many topics, as I see it. Another
related topic might be the integration of libffi, also looking at the good
examples from Python.

Thanks

Anatol

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
