Re: [PHP-DEV] [RFC[ Property accessor hooks, take 2

Larry Garfield Tue, 27 Feb 2024 15:20:02 -0800

On Sun, Feb 25, 2024, at 10:16 PM, Rowan Tommins [IMSoP] wrote:
> [Including my full previous reply, since the list and gmail currently 
> aren't being friends. Apologies that this leads to rather a lot of 
> reading in one go...]

Eh, I'd prefer a few big emails that come in slowly to lots of little emails 
that come in fast. :-)

>> On 21/02/2024 18:55, Larry Garfield wrote:
>>> Hello again, fine Internalians.
>>>
>>> After much on-again/off-again work, Ilija and I are back with a more 
>>> polished property access hooks/interface properties RFC.
>>
>>
>> Hello, and a huge thanks to both you and Ilija for the continued work 
>> on this. I'd really like to see this feature make it into PHP, and 
>> agree with a lot of the RFC.
>>
>>
>> My main concern is the proliferation of things that look the same but 
>> act differently, and things that look different but act the same:

*snip*

>> - a and b are both what we might call "traditional" properties, and 
>> equivalent to each other; a uses legacy syntax which we haven't 
>> removed for some reason

I don't know why we haven't removed `var` either.  I can't recall the last time 
I saw it in real code.  But that's out of scope here.

*snip*

> I think there's some really great functionality in the RFC, and would 
> love for it to succeed in some form, but I think it would benefit from 
> removing some of the "magic".
>
>
> Regards,
>
> -- 
> Rowan Tommins
> [IMSoP]

I'm going to try and respond to a couple of different points together here, 
including from later in the thread, as it's just easier.

== Re, design philosophy:

> In C#, all "properties" are virtual - as soon as you have any 
> non-default "get", "set" or "init" definition, it's up to you to declare 
> a separate "field" to store the value in. Swift's "computed properties" 
> are similar: if you have a custom getter or setter, there is no backing 
> store; to add behaviour to a "stored property", you use the separate 
> "property observer" hooks.
>
> Kotlin's approach is philosophically the opposite: there are no fields, 
> only properties, but properties can access a hidden "backing field" via 
> the special keyword "field". Importantly, omitting the setter doesn't 
> make the property read-only, it implies set(value) { field = value }

A little history here to help clarify how we ended up where we are: The 
original RFC as we designed it modeled very closely on Swift, with 4 hooks.  
Using get/set at all would create a virtual property and you were on your own, 
while the beforeSet/afterSet hooks would not.  We ran that design by some PHP 
Foundation sponsors a year ago (I don't actually know who, Roman did it for 
us), and the general feedback was "we like the idea, but woof this is 
complicated with all these hooks and having to make my own backing property for 
all these little things.  Couldn't this be simplified?"  We thought a bit more, 
and I off-handedly suggested to Ilija "I mean, would it be possible to just 
detect if a get/set hook is using a backing store and make it automatically?  
Then we could get rid of the before/after hooks."  He gave it a quick try and 
found that was straightforward, so we pivoted to that simplified version.  We 
then realized that we had... mostly just recreated Kotlin's design, so shrugged 
happily and went on with life.

As noted in an earlier email, C#, Kotlin, and Swift all have different stances 
on the variable name for the incoming value.  We originally modeled on Swift so 
had that model (optional newVal name), and also because we liked how compact it 
was.  When we switched to the simplified, incidentally Kotlin-esque approach, 
we just kept the optional variable as it works.

I think where that ended up is pretty nice, personally, even if it is not a 
direct map of any particular other language.

== Re asymmetric typing:

This is capability already present today if using a setter method.  

class Person {
    private $name;

    public function setName(UnicodeString|string $name)
    {
        $this->name  =  $value  instanceof UnicodeString ? $value : new  
UnicodeString($value);         
    }
}

And widening the parameter type in a child class is also entirely legal.  As 
the goal of the RFC is, essentially, "make most common getter/setter patterns 
easy to add to a property without making an API-breaking method, so people 
don't have to add redundant just-in-case getters and setters all the time," 
covering an easy-to-cover use case seems like a good thing to do.  

It also ties into the question of the explict/implicit name, for the reason you 
mentioned earlier (unspecified means mixed), not by intent.  More on that in 
another section.

== Re virtual properties:

Ilija and I talked this through, and there's pros and cons to a `virtual` 
keyword.  Ilija also suggested a `backed` keyword, which forces a backed 
property to exist even if it's not used in the hook itself.

* Adding `virtual` adds more work for the developer, but more clarity.  It 
would also mean $this->$propName or $this->{__PROPERTY__} would work "as 
expected", since there's no auto-detection for virtual-ness.  On the downside, 
if you have a could-be-virtual property but never actually use the backing 
value, you have an extra backing value hanging around in memory that is 
inaccessible normally, but will still show up in some serialization formats, 
which could be unexpected.  If you omit one of the hooks and forget to mark it 
virtual, you'll still get the default of the other operation, which could be 
unexpected.  (Mostly this would be for a virtual-get that accidentally has a 
default setter because you forgot to mark it `virtual`.)
* Doing autodetection as now, but with an added "make a backing value anyway" 
flag would resolve the use case of "My set hook just calls a method, and that 
method sets the property, but since the hook doesn't mention the property it 
doesn't get created" problem.  It would also allow for $this->$propName to work 
if a property is explicitly backed.  On the flipside, it's one more thing to 
think about, and the above example it solves would be trivially solved by 
having the method just return the value to set and letting the set hook do the 
actual write, which is arguably better and more reliable code anyway.
* The status quo (auto-detection based on the presence of $this->propName).  
This has the advantage it "just works" in the 95% case, without having to think 
about anything extra.  The downside is it does have some odd edge cases, like 
needing $this->propName to be explicitly used.  

I don't think any is an obvious winner.  My personal preference would be for 
status quo (auto-detect) or explicit-virtual always.  I could probably live 
with either, though I think I'd personally favor status quo.  Thoughts from 
others?

== Re reference-get

Allowing backed properties to have a reference return creates a situation where 
any writes would then bypass the set hook, and thus any validation implemented 
there.  That is, it makes the validation unreliable.  A major footgun.  The 
question is, do we favor caveat-emptor flexibility or correct-by-construction 
safety?  Personally I always lead toward the latter, though PHP in general 
is... schizophrenic about it, I'd say. :-)

At this point, we'd much rather leave it blocked to avoid the issue; it's 
easier to enable that loophole in the future if we really want it than to get 
rid of it if it turns out to have been a bad idea.

There is one edge case that *might* make sense: If there is no set hook 
defined, then there's no set hook to worry about bypassing.  So it may be safe 
to allow &get on backed properties IFF there is no set hook.  I worry that is 
"one more quirky edge case", though, so as above it may be better to skip for 
now as it's easier to add later than remove.  But if the consensus is to do 
that, we're open to it.  (Question for everyone.)

== Re 

== Re arrays

> Regarding arrays, have you considered allowing array-index writes if
an &get hook is defined? i.e. "$x->foo['bar'] = 42;" could be treated
as semantically equivalent to "$_temp =& $x->foo; $_temp['bar'] = 42;
unset($_temp);"

That's already discussed in the RFC:

> The simplest approach would be to copy the array, modify it accordingly, and 
> pass it to set hook. This would have a large and obscure performance penalty, 
> and a set implementation would have no way of knowing which values had 
> changed, leading to, for instance, code needing to revalidate all elements 
> even if only one has changed. 

Unless we were OK with that bypassing the set hook entirely if defined, which, 
as noted above, means any safety guarantees provided by a set hook are 
bypassed, leading to untrustworthy code.

== Re hook shorthands and return values

Ilija and I have been discussing this for a bit, and we've both budged a 
little. :-)  Here's our counter-proposal:

- Drop the "top level" shorthand, for get-only hooks.
- Keep the => shorthand for the get hook itself.
- For a set hook, the {} form has no return value; set the value yourself 
however you want.
- For a set hook, the => form implies a backed value and will set the property 
to whatever value that evaluates to.

So these are equivalent:

public $foo { set { $this->foo = $value; } }
public $foo { set => $value; }

These are equivalent:

public string $foo {
  get {
    return strtoupper($this->foo);
  }
}
public string $foo { get => strtoupper($this->foo); }

And this goes away:

public string $foo => strtoupper($this->foo);

That covers the common cases with an arrow-function-like syntax that behaves as 
you'd expect (it returns things), and allows a longer version with arbitrarily 
complex logic if desired.  It also means that each syntax variant does mean 
something importantly different.

Would that be an acceptable compromise?  (Question for everyone.)

== Re the $value variable in set

Honestly, Rowan's earlier point here is the strongest argument in favor for me 
of the current RFC approach.  Anywhere else in PHP, something that looks like a 
parameter and has no type, like `($param)`, means its type is `mixed`.  It 
would be weird and confusing to be different here.  That's above and beyond the 
issue of forcing people to retype something obvious every time.  (I cite again, 
recent PHP's trend toward removing needless boilerplate, which is very good.)  
Requiring that the type be specified, for consistency, makes little sense if 
the type is not allowed to vary.  You're just repeating a string from earlier 
on the same line, for no particular benefit.

I genuinely don't understand the pushback on $value.  It's something you learn 
once and never have to think about again.  It's consistent.

Ilija jokingly suggested making it always $value, unconditionally, and allowing 
only the type to be specified if widening:

public int $foo { set(int|float) => floor($value); }

Though I suspect that won't go over well, either. :-)

So what makes the most sense to me is to keep $value optional, but IF you 
specify an alternate name, you must also specify a type (which may be wider).  
So these are equivalent:

public int $foo { set (int $value) => $value + 1 }
public int $foo { set => $value + 1 }

And only those forms are legal.  But you could also do this, if the situation 
called for it:

public int $foo { set(int|float $num) => floor($num) + 1; }

This "all or nothing" approach seems like it strikes the best balance, gives 
the most flexibility where needed while still having the least redundancy when 
not needed, and when a name/type is provided, its behavior is the same as for a 
method being inherited.

Does that sound acceptable?  (Again, question for everyone.)

The alternative that gives the most future-flexibility is to do neither: The 
variable is called $value, period, you can't change it, and you can't change 
the type, either.  There is no () after set, ever.  Punt both of those to a 
later follow-up.  I'd prefer to include both now, but including neither now is 
the next-safer option.

## Regarding $field

Sigh, now y'all like it. :-P  Most of the feedback on this has been negative, 
so I'm inclined to leave it out at this point, unless there's a major swing in 
feedback to bring it back.  But the RFC seems more likely to pass without it 
than with right now.

--Larry Garfield

Re: [PHP-DEV] [RFC[ Property accessor hooks, take 2

Reply via email to