[PHP-DEV] Operator overloading for userspace objects

jan.h.boehmer Tue, 28 Jan 2020 15:15:21 -0800

Hello everybody,


over the last few days I have experimented a bit with operator overloading
in userspace classes (redefining the meaning of arithmetic operations like
+, -, *, etc. for your own classes).

This could be useful for libraries that implement custom arithmetic objects
(like money values, tensors, etc.) or for things like the Symfony String
component (with the concatenation operator), because it greatly improves
readability:

$x * ($a + $b) instead of $x->multiply($a->add($b))

 

Four years ago there was an RFC on this topic
(https://wiki.php.net/rfc/operator-overloading), which was discussed a bit
(https://externals.io/message/89967), but there was no real outcome.

 

I have tried to implement a proof of concept of the RFC, and I encountered
some problems when implementing the operator functions as (non-static) class
members that are passed only the other operand: what happens when we
encounter an expression like 2/$a, and how can the class distinguish it from
$a/2? Also, not every operation on every structure is commutative (e.g. for
matrices A*B =/= B*A). So I tried a C#-like approach, where the operator
implementations are static functions in the class and both operands are
passed. In my PHP implementation this would look something like this:

 

class X {
    public static function __add($lhs, $rhs) {
        //...
    }
}

 

The class function can thus decide what to do based on both operands (so it
can tell whether the developer wrote 2/$a or $a/2). Also, that way an
implementor cannot return $this by accident, which could lead to unintended
side effects if the result of the operation is later mutated.
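
As a minimal sketch of this point: the Scalar class and the __div() handler
name below are hypothetical (chosen by analogy with __add() above). Stock PHP
never calls such a handler for "/", so the calls are written out explicitly
to show what a patched engine would presumably pass.

```php
<?php
// Hypothetical wrapper class, used only for illustration.
final class Scalar
{
    public float $value;

    public function __construct(float $value)
    {
        $this->value = $value;
    }

    // Assumed handler name. Both operands are passed in source order, so the
    // handler can see on which side of the operator the object appeared.
    public static function __div($lhs, $rhs): self
    {
        $l = $lhs instanceof self ? $lhs->value : (float) $lhs;
        $r = $rhs instanceof self ? $rhs->value : (float) $rhs;

        // Always a new instance: a static handler has no $this to return.
        return new self($l / $r);
    }
}

$a = new Scalar(4.0);
$half = Scalar::__div($a, 2);     // what "$a / 2" would dispatch to
$inverse = Scalar::__div(2, $a);  // what "2 / $a" would dispatch to
```

With a non-static handler that receives only "the other operand", both calls
would look the same from inside the class.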

 

I have adopted the idea of defining a magic function for each operation
(like Python does), because I think it is the clearest way to see which
operators a class implements (this could be useful for static analysis). The
downside of this approach is that it greatly increases the number of magic
functions (my PoC code defines 13 additional magic functions, and the unary
operators are still missing), so some people in the original discussion
suggested defining a single (magic) function that receives the operator and
lets the user code decide what to do. The advantage is that this is very
extensible (with the right parser implementation, you could even define your
own new operators), at the cost that this method becomes very complex for
data structures which use multiple operators (large if-else or switch
constructions, which delegate the logic to the appropriate functions).
Another idea mentioned was to extract interfaces with common functionality
(like Arithmetically, Comparable, etc.), as is done with the ArrayAccess or
Countable interfaces. The problem I see here is that this approach is rather
inflexible, and it would be difficult to extract really universal interfaces
(e.g. vectors do not need a division (/) operation, but the concatenation
operator (.) could be really useful for implementing the dot product). This
would lead either to interfaces that are only partly implemented (with the
remaining methods just throwing exceptions) or to interfaces that contain
only one or two functions (so we would end up with many interfaces instead
of magic functions).
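
To illustrate the single-function variant from the original discussion, here
is a rough sketch. The Vec2 class and the __operator() name are assumptions
made for this example, not part of the PoC, and the handler is called by hand
because stock PHP does not dispatch operators to it.

```php
<?php
// Hypothetical 2D vector, used only for illustration.
final class Vec2
{
    public float $x;
    public float $y;

    public function __construct(float $x, float $y)
    {
        $this->x = $x;
        $this->y = $y;
    }

    // Assumed single entry point: every supported operator adds one branch.
    public static function __operator(string $op, self $lhs, self $rhs)
    {
        switch ($op) {
            case '+':
                return new self($lhs->x + $rhs->x, $lhs->y + $rhs->y);
            case '-':
                return new self($lhs->x - $rhs->x, $lhs->y - $rhs->y);
            case '.':
                // Concatenation operator reused as the dot product.
                return $lhs->x * $rhs->x + $lhs->y * $rhs->y;
            default:
                // Operators without a sensible meaning here (e.g. "/").
                throw new InvalidArgumentException("Unsupported operator: $op");
        }
    }
}

$u = new Vec2(1.0, 2.0);
$v = new Vec2(3.0, 4.0);
$sum = Vec2::__operator('+', $u, $v);
$dot = Vec2::__operator('.', $u, $v);
```

Which operators the class supports is visible only by reading the switch
body, whereas one magic function per operator shows up in the class
signature.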

 

On the topic of which operators should be overloadable: my PoC
implementation has magic functions for the arithmetic operators (+, -, *, /,
%, **), string concatenation (.), and the bitwise operators (>>, <<, &, |,
^). Comparison and equality checks are implemented using a common
__compare() function, which acts like an overload of the spaceship operator.
The comparison operators (<, >, <=, >=, ==) are evaluated depending on
whether it returns -1, 0 or +1. I think this way we can enforce that the
expected standard logic of comparison (e.g. !($a<$b) = ($a>=$b) and
($a<$b) = ($b>$a)) is implemented. Also, I don't think this would restrict
real-world applications much (if you have an example where a separate
definition of < and >= could be useful, please comment on it).
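
A small sketch of how the comparison operators could be derived from a single
three-way result. The Version class and the evaluateComparison() helper are
hypothetical; the helper only stands in for what the engine would do
internally and is not taken from the PoC code.

```php
<?php
// Hypothetical class with a three-way comparison handler.
final class Version
{
    public int $major;
    public int $minor;

    public function __construct(int $major, int $minor)
    {
        $this->major = $major;
        $this->minor = $minor;
    }

    // Returns -1, 0 or +1, like the built-in spaceship operator.
    public static function __compare(self $lhs, self $rhs): int
    {
        return [$lhs->major, $lhs->minor] <=> [$rhs->major, $rhs->minor];
    }
}

// Stand-in for the engine: every comparison operator is derived from the
// single __compare() result, so "<" and ">=" can never disagree.
function evaluateComparison(string $op, Version $lhs, Version $rhs): bool
{
    $order = Version::__compare($lhs, $rhs);
    switch ($op) {
        case '<':
            return $order < 0;
        case '>':
            return $order > 0;
        case '<=':
            return $order <= 0;
        case '>=':
            return $order >= 0;
        case '==':
            return $order === 0;
    }
    throw new InvalidArgumentException("Not a comparison operator: $op");
}

$old = new Version(1, 2);
$new = new Version(1, 10);
$isOlder = evaluateComparison('<', $old, $new);
```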

Unlike the original idea, I don't think it should be possible to overload
the identity operator (===), because it should always be possible to check
whether two objects are really identical (also, every case should be
coverable by equality). The same applies to the logic operators (!, ||, &&):
I think they should always work as intended (other languages like Python and
C# handle it that way too).

For the shorthand assignment operators like += and -= the situation is a bit
more complicated: on the one hand, the user has learned that $a+=1 is just
an abbreviation of $a=$a+1, so this logic should apply to overloaded
operators as well (in C# it is implemented like this). On the other hand, it
could be useful to differentiate between the two cases, so that you can
mutate the object itself (in the += case) instead of returning a new object
instance (the class cannot know that the result is assigned back to its own
variable when $a + 1 is called). Personally, I don't think this would be a
big problem, so my PoC code does not provide a way to override the shorthand
operators. For the increment/decrement operators ($a++) it is similar: it
would be nice to be able to overload them, but on the other hand the use
cases for these operators are really limited besides integer incrementation,
and if you want to trigger something more complex, you should call a method
to make your intent clear.

 

On the topic of the order in which the operators should be executed: besides
the normal precedence (defined by PHP), my code checks whether the operand
on the left side is an object and tries to call the appropriate magic
function on it. If this is not possible, the same is done for the right
operand. This should cover most use cases, with some exceptions: consider an
expression like $a / $b, where $a and $b have different classes (class A and
class B). If class B knows how to divide class A, but class A does not know
about class B, we encounter a problem when evaluating just from left to
right (and checking whether the magic method exists). A solution would be
that object $a can express that it does not know how to handle class B (e.g.
by returning null or throwing a special exception), and PHP can then call
the handler on object $b. I'm not sure how common this problem would be, so
I don't know how useful this feature would be.
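
The left-then-right lookup with a fallback could be sketched roughly as
follows. The dispatchDiv() helper, the __div() handler name and the
"return null means I cannot handle this" convention are assumptions for this
example (null is just one of the two options mentioned above). It is not the
actual engine code from the PoC.

```php
<?php
// Hypothetical class A: only knows how to divide an A by a plain number.
final class A
{
    public float $value;

    public function __construct(float $value)
    {
        $this->value = $value;
    }

    public static function __div($lhs, $rhs)
    {
        if ($lhs instanceof self && is_numeric($rhs)) {
            return new self($lhs->value / $rhs);
        }
        return null; // assumed signal: "I cannot handle these operands"
    }
}

// Hypothetical class B: knows how to divide an A by a B.
final class B
{
    public float $value;

    public function __construct(float $value)
    {
        $this->value = $value;
    }

    public static function __div($lhs, $rhs)
    {
        if ($lhs instanceof A && $rhs instanceof self) {
            return new A($lhs->value / $rhs->value);
        }
        return null;
    }
}

// Stand-in for the engine: ask the left operand first, then the right one.
function dispatchDiv($lhs, $rhs)
{
    foreach ([$lhs, $rhs] as $operand) {
        if (is_object($operand) && method_exists($operand, '__div')) {
            $result = $operand::__div($lhs, $rhs);
            if ($result !== null) {
                return $result;
            }
        }
    }
    throw new TypeError('Unsupported operand types for /');
}

$byNumber = dispatchDiv(new A(10.0), 2);           // handled by A
$byOther  = dispatchDiv(new A(10.0), new B(4.0));  // A declines, B handles it
```

One drawback of using null as the signal is that a handler can then never
return null as a legitimate result, which is an argument for a dedicated
exception or sentinel value instead.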

 

My proof-of-concept implementation can be found here:
https://github.com/jbtronics/php-src

Here you can find some basic demo code using it:
https://gist.github.com/jbtronics/ee6431e52c161ddd006f8bb7e4f5bcd6

 

I would be happy to hear some opinions on this concept and on the idea of
overloadable operators in PHP in general.

 

Thanks and Best regards,

Jan Böhmer
