On Sun, Feb 25, 2024, at 10:16 PM, Rowan Tommins [IMSoP] wrote:
> [Including my full previous reply, since the list and gmail currently
> aren't being friends. Apologies that this leads to rather a lot of
> reading in one go...]
Eh, I'd prefer a few big emails that come in slowly to lots of little emails
that come in fast. :-)
>> On 21/02/2024 18:55, Larry Garfield wrote:
>>> Hello again, fine Internalians.
>>>
>>> After much on-again/off-again work, Ilija and I are back with a more
>>> polished property access hooks/interface properties RFC.
>>
>>
>> Hello, and a huge thanks to both you and Ilija for the continued work
>> on this. I'd really like to see this feature make it into PHP, and
>> agree with a lot of the RFC.
>>
>>
>> My main concern is the proliferation of things that look the same but
>> act differently, and things that look different but act the same:
*snip*
>> - a and b are both what we might call "traditional" properties, and
>> equivalent to each other; a uses legacy syntax which we haven't
>> removed for some reason
I don't know why we haven't removed `var` either. I can't recall the last time
I saw it in real code. But that's out of scope here.
*snip*
> I think there's some really great functionality in the RFC, and would
> love for it to succeed in some form, but I think it would benefit from
> removing some of the "magic".
>
>
> Regards,
>
> --
> Rowan Tommins
> [IMSoP]
I'm going to try and respond to a couple of different points together here,
including from later in the thread, as it's just easier.
== Re, design philosophy:
> In C#, all "properties" are virtual - as soon as you have any
> non-default "get", "set" or "init" definition, it's up to you to declare
> a separate "field" to store the value in. Swift's "computed properties"
> are similar: if you have a custom getter or setter, there is no backing
> store; to add behaviour to a "stored property", you use the separate
> "property observer" hooks.
>
> Kotlin's approach is philosophically the opposite: there are no fields,
> only properties, but properties can access a hidden "backing field" via
> the special keyword "field". Importantly, omitting the setter doesn't
> make the property read-only, it implies set(value) { field = value }
A little history here to help clarify how we ended up where we are: The
original RFC as we designed it modeled very closely on Swift, with 4 hooks.
Using get/set at all would create a virtual property and you were on your own,
while the beforeSet/afterSet hooks would not. We ran that design by some PHP
Foundation sponsors a year ago (I don't actually know who, Roman did it for
us), and the general feedback was "we like the idea, but woof this is
complicated with all these hooks and having to make my own backing property for
all these little things. Couldn't this be simplified?" We thought a bit more,
and I off-handedly suggested to Ilija "I mean, would it be possible to just
detect if a get/set hook is using a backing store and make it automatically?
Then we could get rid of the before/after hooks." He gave it a quick try and
found that was straightforward, so we pivoted to that simplified version. We
then realized that we had... mostly just recreated Kotlin's design, so shrugged
happily and went on with life.
As noted in an earlier email, C#, Kotlin, and Swift all have different stances
on the variable name for the incoming value. We originally modeled on Swift so
had that model (optional newVal name), and also because we liked how compact it
was. When we switched to the simplified, incidentally Kotlin-esque approach,
we just kept the optional variable as it works.
I think where that ended up is pretty nice, personally, even if it is not a
direct map of any particular other language.
== Re asymmetric typing:
This is capability already present today if using a setter method.
class Person {
private $name;
public function setName(UnicodeString|string $name)
{
$this->name = $value instanceof UnicodeString ? $value : new
UnicodeString($value);
}
}
And widening the parameter type in a child class is also entirely legal. As
the goal of the RFC is, essentially, "make most common getter/setter patterns
easy to add to a property without making an API-breaking method, so people
don't have to add redundant just-in-case getters and setters all the time,"
covering an easy-to-cover use case seems like a good thing to do.
It also ties into the question of the explict/implicit name, for the reason you
mentioned earlier (unspecified means mixed), not by intent. More on that in
another section.
== Re virtual properties:
Ilija and I talked this through, and there's pros and cons to a `virtual`
keyword. Ilija also suggested a `backed` keyword, which forces a backed
property to exist even if it's not used in the hook itself.
* Adding `virtual` adds more work for the developer, but more clarity. It
would also mean $this->$propName or $this->{__PROPERTY__} would work "as
expected", since there's no auto-detection for virtual-ness. On the downside,
if you have a could-be-virtual property but never actually use the backing
value, you have an extra backing value hanging around in memory that is
inaccessible normally, but will still show up in some serialization formats,
which could be unexpected. If you omit one of the hooks and forget to mark it
virtual, you'll still get the default of the other operation, which could be
unexpected. (Mostly this would be for a virtual-get that accidentally has a
default setter because you forgot to mark it `virtual`.)
* Doing autodetection as now, but with an added "make a backing value anyway"
flag would resolve the use case of "My set hook just calls a method, and that
method sets the property, but since the hook doesn't mention the property it
doesn't get created" problem. It would also allow for $this->$propName to work
if a property is explicitly backed. On the flipside, it's one more thing to
think about, and the above example it solves would be trivially solved by
having the method just return the value to set and letting the set hook do the
actual write, which is arguably better and more reliable code anyway.
* The status quo (auto-detection based on the presence of $this->propName).
This has the advantage it "just works" in the 95% case, without having to think
about anything extra. The downside is it does have some odd edge cases, like
needing $this->propName to be explicitly used.
I don't think any is an obvious winner. My personal preference would be for
status quo (auto-detect) or explicit-virtual always. I could probably live
with either, though I think I'd personally favor status quo. Thoughts from
others?
== Re reference-get
Allowing backed properties to have a reference return creates a situation where
any writes would then bypass the set hook, and thus any validation implemented
there. That is, it makes the validation unreliable. A major footgun. The
question is, do we favor caveat-emptor flexibility or correct-by-construction
safety? Personally I always lead toward the latter, though PHP in general
is... schizophrenic about it, I'd say. :-)
At this point, we'd much rather leave it blocked to avoid the issue; it's
easier to enable that loophole in the future if we really want it than to get
rid of it if it turns out to have been a bad idea.
There is one edge case that *might* make sense: If there is no set hook
defined, then there's no set hook to worry about bypassing. So it may be safe
to allow &get on backed properties IFF there is no set hook. I worry that is
"one more quirky edge case", though, so as above it may be better to skip for
now as it's easier to add later than remove. But if the consensus is to do
that, we're open to it. (Question for everyone.)
== Re
== Re arrays
> Regarding arrays, have you considered allowing array-index writes if
an &get hook is defined? i.e. "$x->foo['bar'] = 42;" could be treated
as semantically equivalent to "$_temp =& $x->foo; $_temp['bar'] = 42;
unset($_temp);"
That's already discussed in the RFC:
> The simplest approach would be to copy the array, modify it accordingly, and
> pass it to set hook. This would have a large and obscure performance penalty,
> and a set implementation would have no way of knowing which values had
> changed, leading to, for instance, code needing to revalidate all elements
> even if only one has changed.
Unless we were OK with that bypassing the set hook entirely if defined, which,
as noted above, means any safety guarantees provided by a set hook are
bypassed, leading to untrustworthy code.
== Re hook shorthands and return values
Ilija and I have been discussing this for a bit, and we've both budged a
little. :-) Here's our counter-proposal:
- Drop the "top level" shorthand, for get-only hooks.
- Keep the => shorthand for the get hook itself.
- For a set hook, the {} form has no return value; set the value yourself
however you want.
- For a set hook, the => form implies a backed value and will set the property
to whatever value that evaluates to.
So these are equivalent:
public $foo { set { $this->foo = $value; } }
public $foo { set => $value; }
These are equivalent:
public string $foo {
get {
return strtoupper($this->foo);
}
}
public string $foo { get => strtoupper($this->foo); }
And this goes away:
public string $foo => strtoupper($this->foo);
That covers the common cases with an arrow-function-like syntax that behaves as
you'd expect (it returns things), and allows a longer version with arbitrarily
complex logic if desired. It also means that each syntax variant does mean
something importantly different.
Would that be an acceptable compromise? (Question for everyone.)
== Re the $value variable in set
Honestly, Rowan's earlier point here is the strongest argument in favor for me
of the current RFC approach. Anywhere else in PHP, something that looks like a
parameter and has no type, like `($param)`, means its type is `mixed`. It
would be weird and confusing to be different here. That's above and beyond the
issue of forcing people to retype something obvious every time. (I cite again,
recent PHP's trend toward removing needless boilerplate, which is very good.)
Requiring that the type be specified, for consistency, makes little sense if
the type is not allowed to vary. You're just repeating a string from earlier
on the same line, for no particular benefit.
I genuinely don't understand the pushback on $value. It's something you learn
once and never have to think about again. It's consistent.
Ilija jokingly suggested making it always $value, unconditionally, and allowing
only the type to be specified if widening:
public int $foo { set(int|float) => floor($value); }
Though I suspect that won't go over well, either. :-)
So what makes the most sense to me is to keep $value optional, but IF you
specify an alternate name, you must also specify a type (which may be wider).
So these are equivalent:
public int $foo { set (int $value) => $value + 1 }
public int $foo { set => $value + 1 }
And only those forms are legal. But you could also do this, if the situation
called for it:
public int $foo { set(int|float $num) => floor($num) + 1; }
This "all or nothing" approach seems like it strikes the best balance, gives
the most flexibility where needed while still having the least redundancy when
not needed, and when a name/type is provided, its behavior is the same as for a
method being inherited.
Does that sound acceptable? (Again, question for everyone.)
The alternative that gives the most future-flexibility is to do neither: The
variable is called $value, period, you can't change it, and you can't change
the type, either. There is no () after set, ever. Punt both of those to a
later follow-up. I'd prefer to include both now, but including neither now is
the next-safer option.
## Regarding $field
Sigh, now y'all like it. :-P Most of the feedback on this has been negative,
so I'm inclined to leave it out at this point, unless there's a major swing in
feedback to bring it back. But the RFC seems more likely to pass without it
than with right now.
--Larry Garfield