On Mon, Jun 5, 2017 at 7:46 PM, François Laupretre <franc...@tekwire.net>
wrote:

> Hi,
>
> PCS provides a fast and easy mechanism to mix C and PHP code in PHP
> extensions (more about PCS at http://pcs.tekwire.net). Thanks to the PHP
> 7 performance improvement and the inclusion of opcache in the core, a lot
> of existing non-performance-critical extension code may now be converted to
> PHP without significant performance loss (this must be measured case by
> case, of course, but tests show that opcode-cached PHP code is often faster
> than C).
>
> Another motivation is the lack of extension maintainers. It may be complex
> to convert a C extension to PHP but, once it's done, maintenance becomes
> much easier.
>
> As one of PCS's goals is to allow converting parts of existing core
> extensions to PHP, it seems natural to initiate the movement by an
> inclusion of PCS in the core distribution. Then, I and others will start
> proposing conversions of existing code. IMO, the PDO generic layer is a
> perfect candidate, but there are many others.
>
> Converting existing C code to PHP is not the only usage. With PCS, adding
> an OO layer to a function-only extension becomes an easy task. Sara
> recently mentioned a curl OOP layer
> (https://gist.github.com/sgolemon/e95bfc34d34c4f63fa953ee9294ae02c).
> Using PCS, adding such PHP code on top of the curl extension would take
> less than one hour.
>
> I hadn't proposed this so far because the 'cache_key' operation currently
> proposed for 7.2 is a pre-requisite, as PCS exposes the PHP code it manages
> via a stream wrapper.
>
> So, please give me your thoughts. Suggestions of potential candidates to
> be rewritten from C to PHP are welcome too.
>

Hi,

First of all: I think the ability to implement parts of PHP extensions in
PHP is extremely important and will be a game changer in our ability to
maintain and improve our standard library.

There are essentially only two good reasons for implementing functionality
in C: One is performance, the other is FFI. Unfortunately, the requirement
to use C for everything inside an extension means that we write large
amounts of C code that do not fall into either of those categories. The
resulting code is hard to maintain, often subtly buggy and usually not
consistent with ordinary userland PHP code. Typical issues we see all the
time are bad or completely absent serialization support, lack of circular
garbage collection, crashes when the object is improperly initialized and
bugs appearing when internal classes are extended.

On top of that, implementing certain functionality in C actually makes the
resulting code slower than equivalent PHP code. While our virtual machine
is highly optimized, our internal APIs are often not, or are not typically
used in their most efficient form. One case where internal code loses is
the invocation of userland callbacks. Another is access to properties.
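
As a rough illustration of the callback case, the sketch below times an
internal function that invokes a userland closure (array_map) against a
plain loop that calls the same closure. The data size and variable names
are arbitrary assumptions, and the timings depend heavily on the PHP
version and configuration, so treat it as a way to measure, not as a
result:

```php
<?php
// Illustrative micro-benchmark; the array size is an arbitrary choice.
$data = range(1, 200000);
$double = function (int $x): int { return $x * 2; };

// Internal function invoking a userland callback for each element.
$t0 = microtime(true);
$viaInternal = array_map($double, $data);
$t1 = microtime(true);

// Equivalent loop where the VM calls the closure directly.
$viaLoop = [];
foreach ($data as $x) {
    $viaLoop[] = $double($x);
}
$t2 = microtime(true);

printf("array_map: %.4fs\nforeach:   %.4fs\n", $t1 - $t0, $t2 - $t1);
```

Both variants produce the same array; only the call path differs.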

The current situation also has a large and somewhat hidden impact on our
API design. Due to the large maintenance burden that implementing "proper"
APIs imposes on us, we tend to go with the simplest possible API. Usually
this means that we end up directly exposing C binding APIs, even if they
are a very bad fit for PHP. As already noted in this thread, the current
curl API is such an example. (I know that some people will argue that it's
better to expose simple procedural APIs rather than fancy object-oriented
APIs -- however, that's a choice that should be made based on technical
arguments, not due to technical limitations.)

Some people have mentioned that this is better solved by shipping the PHP
code separately using composer. While this may be viable for 3rd party
extensions (and may be preferable if they have large fractions of PHP
code), this option does not exist for our standard library. We can hardly
tell people that they should go install a composer package in order to make
use of some APIs in our standard library.

Anyway, to get back to the topic of PCS. First, I would recommend targeting
PHP 7.3 for this change. Feature freeze for 7.2 is in a bit over a month
and I think we'll want to make some non-trivial changes to how this works
if we integrate it in PHP. If added to PHP, I think this should be
integrated into the core, rather than being an extension.

Here are some random thoughts:

1. As far as I understand, PCS relies on autoloading. There are two issues
here: First, autoloading does not register symbols prior to autoloading.
This means that functions like get_declared_classes() will not behave as
expected. Second, autoloading does not support functions. I think both of
these problems can be solved with some up-front symbol analysis. Lazily
compiling internal functions should not run into any of the problems we
have with userland function autoloading.
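
A minimal sketch of the first issue, using plain userland autoloading
rather than PCS itself: the class name LazyDemo and the eval()-based
loader are made up for the example. The class is invisible to
get_declared_classes() until something triggers the autoloader:

```php
<?php
// Hypothetical class; the autoloader defines it lazily via eval().
spl_autoload_register(function (string $class): void {
    if ($class === 'LazyDemo') {
        eval('class LazyDemo {}');
    }
});

// Not loaded yet, so the class is not listed.
$before = in_array('LazyDemo', get_declared_classes(), true);

// Instantiating the class triggers the autoloader.
$obj = new LazyDemo();

// Now the class is declared and shows up in the list.
$after = in_array('LazyDemo', get_declared_classes(), true);

var_dump($before, $after); // bool(false), bool(true)
```

There is no comparable hook for functions: calling an undefined function
fails immediately instead of consulting an autoloader.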

2. It has already been mentioned in the thread, but what seems to be lacking
right now is a good way of integrating PHP and C portions. As far as I
understand, PCS allows you to write an entire class in PHP, but it does not
allow you to offload parts of the functionality to C without exposing
additional public APIs. I think there are two things we can do here:

a) Provide a mechanism that makes certain functions only available inside
extension PHP code. This would allow exposing some private PHP functions
which are only used in the internal implementation of the extension.

b) Allow binding some methods specified in PHP to an internal
implementation. The way this works in HHVM is that the PHP file contains
just a signature, with an attribute that signifies that an internal
implementation will be bound to that function:

class Bar {
    <<__Native>>
    function foo($args);
}

This would be useful for other reasons as well. In particular, this could
become a replacement for the existing arginfo-based signature
specification, which is somewhat limited and causes discrepancies with
userland classes. For example, arginfo does not support default values.
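
The discrepancy can be observed through reflection. The sketch below is
only an illustration: userland_demo is a hypothetical function, and
array_slice is picked arbitrarily as an internal function with an optional
parameter. For the userland function the default value is part of the
signature; for the internal one, reflection can only report what the
internal signature metadata carries, so the last two results depend on the
PHP version:

```php
<?php
// Userland function: the default value is part of the signature.
function userland_demo($limit = 10) {}

$param = (new ReflectionFunction('userland_demo'))->getParameters()[0];
var_dump($param->isDefaultValueAvailable()); // bool(true)
var_dump($param->getDefaultValue());         // int(10)

// Internal function: the third parameter ($length) is optional, but
// whether a default value is reported depends on the PHP version.
$internal = (new ReflectionFunction('array_slice'))->getParameters()[2];
var_dump($internal->isOptional());
var_dump($internal->isDefaultValueAvailable());
```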

Regards,
Nikita
