> On 28 Feb 2024, at 06:17, Larry Garfield <la...@garfieldtech.com> wrote:
>
> On Sun, Feb 25, 2024, at 10:16 PM, Rowan Tommins [IMSoP] wrote:
>> [Including my full previous reply, since the list and gmail currently
>> aren't being friends. Apologies that this leads to rather a lot of
>> reading in one go...]
>
> Eh, I'd prefer a few big emails that come in slowly to lots of little emails
> that come in fast. :-)
>
>>> On 21/02/2024 18:55, Larry Garfield wrote:
>>>> Hello again, fine Internalians.
>>>>
>>>> After much on-again/off-again work, Ilija and I are back with a more
>>>> polished property access hooks/interface properties RFC.
>>>
>>>
>>> Hello, and a huge thanks to both you and Ilija for the continued work
>>> on this. I'd really like to see this feature make it into PHP, and
>>> agree with a lot of the RFC.
>>>
>>>
>>> My main concern is the proliferation of things that look the same but
>>> act differently, and things that look different but act the same:
>
> *snip*
>
>>> - a and b are both what we might call "traditional" properties, and
>>> equivalent to each other; a uses legacy syntax which we haven't
>>> removed for some reason
>
> I don't know why we haven't removed `var` either. I can't recall the last
> time I saw it in real code. But that's out of scope here.
>
> *snip*
>
>> I think there's some really great functionality in the RFC, and would
>> love for it to succeed in some form, but I think it would benefit from
>> removing some of the "magic".
>>
>>
>> Regards,
>>
>> --
>> Rowan Tommins
>> [IMSoP]
>
>
> I'm going to try and respond to a couple of different points together here,
> including from later in the thread, as it's just easier.
>
> == Re, design philosophy:
>
>> In C#, all "properties" are virtual - as soon as you have any
>> non-default "get", "set" or "init" definition, it's up to you to declare
>> a separate "field" to store the value in. Swift's "computed properties"
>> are similar: if you have a custom getter or setter, there is no backing
>> store; to add behaviour to a "stored property", you use the separate
>> "property observer" hooks.
>>
>> Kotlin's approach is philosophically the opposite: there are no fields,
>> only properties, but properties can access a hidden "backing field" via
>> the special keyword "field". Importantly, omitting the setter doesn't
>> make the property read-only, it implies set(value) { field = value }
>
> A little history here to help clarify how we ended up where we are: The
> original RFC as we designed it modeled very closely on Swift, with 4 hooks.
> Using get/set at all would create a virtual property and you were on your
> own, while the beforeSet/afterSet hooks would not. We ran that design by
> some PHP Foundation sponsors a year ago (I don't actually know who, Roman did
> it for us), and the general feedback was "we like the idea, but woof this is
> complicated with all these hooks and having to make my own backing property
> for all these little things. Couldn't this be simplified?" We thought a bit
> more, and I off-handedly suggested to Ilija "I mean, would it be possible to
> just detect if a get/set hook is using a backing store and make it
> automatically? Then we could get rid of the before/after hooks." He gave it
> a quick try and found that was straightforward, so we pivoted to that
> simplified version. We then realized that we had... mostly just recreated
> Kotlin's design, so shrugged happily and went on with life.
>
> As noted in an earlier email, C#, Kotlin, and Swift all have different
> stances on the variable name for the incoming value. We originally modeled
> on Swift so had that model (optional newVal name), and also because we liked
> how compact it was. When we switched to the simplified, incidentally
> Kotlin-esque approach, we just kept the optional variable as it works.
>
> I think where that ended up is pretty nice, personally, even if it is not a
> direct map of any particular other language.
>
> == Re asymmetric typing:
>
> This is capability already present today if using a setter method.
>
> class Person {
> private $name;
>
> public function setName(UnicodeString|string $name)
> {
> $this->name = $value instanceof UnicodeString ? $value : new
> UnicodeString($value);
> }
> }
>
> And widening the parameter type in a child class is also entirely legal. As
> the goal of the RFC is, essentially, "make most common getter/setter patterns
> easy to add to a property without making an API-breaking method, so people
> don't have to add redundant just-in-case getters and setters all the time,"
> covering an easy-to-cover use case seems like a good thing to do.
>
> It also ties into the question of the explict/implicit name, for the reason
> you mentioned earlier (unspecified means mixed), not by intent. More on that
> in another section.
>
> == Re virtual properties:
>
> Ilija and I talked this through, and there's pros and cons to a `virtual`
> keyword. Ilija also suggested a `backed` keyword, which forces a backed
> property to exist even if it's not used in the hook itself.
>
> * Adding `virtual` adds more work for the developer, but more clarity. It
> would also mean $this->$propName or $this->{__PROPERTY__} would work "as
> expected", since there's no auto-detection for virtual-ness. On the
> downside, if you have a could-be-virtual property but never actually use the
> backing value, you have an extra backing value hanging around in memory that
> is inaccessible normally, but will still show up in some serialization
> formats, which could be unexpected. If you omit one of the hooks and forget
> to mark it virtual, you'll still get the default of the other operation,
> which could be unexpected. (Mostly this would be for a virtual-get that
> accidentally has a default setter because you forgot to mark it `virtual`.)
> * Doing autodetection as now, but with an added "make a backing value anyway"
> flag would resolve the use case of "My set hook just calls a method, and that
> method sets the property, but since the hook doesn't mention the property it
> doesn't get created" problem. It would also allow for $this->$propName to
> work if a property is explicitly backed. On the flipside, it's one more
> thing to think about, and the above example it solves would be trivially
> solved by having the method just return the value to set and letting the set
> hook do the actual write, which is arguably better and more reliable code
> anyway.
> * The status quo (auto-detection based on the presence of $this->propName).
> This has the advantage it "just works" in the 95% case, without having to
> think about anything extra. The downside is it does have some odd edge
> cases, like needing $this->propName to be explicitly used.
>
> I don't think any is an obvious winner. My personal preference would be for
> status quo (auto-detect) or explicit-virtual always. I could probably live
> with either, though I think I'd personally favor status quo. Thoughts from
> others?
>
I agree that a flag to make the field *virtual* (thus disabling the backing
store) makes more sense than a flag to make it backed; It's also easier to
understand when comparing hooked properties with regular properties
(essentially, backed is the default, you have to opt-in to it being virtual). I
don't think the edge cases of "auto" make it worthwhile just to not need
"virtual".
> == Re reference-get
>
> Allowing backed properties to have a reference return creates a situation
> where any writes would then bypass the set hook, and thus any validation
> implemented there. That is, it makes the validation unreliable. A major
> footgun. The question is, do we favor caveat-emptor flexibility or
> correct-by-construction safety? Personally I always lead toward the latter,
> though PHP in general is... schizophrenic about it, I'd say. :-)
>
> At this point, we'd much rather leave it blocked to avoid the issue; it's
> easier to enable that loophole in the future if we really want it than to get
> rid of it if it turns out to have been a bad idea.
>
> There is one edge case that *might* make sense: If there is no set hook
> defined, then there's no set hook to worry about bypassing. So it may be
> safe to allow &get on backed properties IFF there is no set hook. I worry
> that is "one more quirky edge case", though, so as above it may be better to
> skip for now as it's easier to add later than remove. But if the consensus
> is to do that, we're open to it. (Question for everyone.)
>
I don't have strong feeling about this, but in general I usually tend to prefer
options that are consistent, and give power/options to the developer. If
references are opt-in anyway, I see that as accepting the trade-offs. If a
developer doesn't want to allow by-ref modifications of the property, why would
they make it referenceable in the first place? This sounds a bit like
disallowing regular public properties because they might be modified outside
the class - that's kind of the point, surely.
> == Re
>
> == Re arrays
>
>> Regarding arrays, have you considered allowing array-index writes if
> an &get hook is defined? i.e. "$x->foo['bar'] = 42;" could be treated
> as semantically equivalent to "$_temp =& $x->foo; $_temp['bar'] = 42;
> unset($_temp);"
>
> That's already discussed in the RFC:
>
>> The simplest approach would be to copy the array, modify it accordingly, and
>> pass it to set hook. This would have a large and obscure performance
>> penalty, and a set implementation would have no way of knowing which values
>> had changed, leading to, for instance, code needing to revalidate all
>> elements even if only one has changed.
>
> Unless we were OK with that bypassing the set hook entirely if defined,
> which, as noted above, means any safety guarantees provided by a set hook are
> bypassed, leading to untrustworthy code.
>
> == Re hook shorthands and return values
>
> Ilija and I have been discussing this for a bit, and we've both budged a
> little. :-) Here's our counter-proposal:
>
> - Drop the "top level" shorthand, for get-only hooks.
> - Keep the => shorthand for the get hook itself.
> - For a set hook, the {} form has no return value; set the value yourself
> however you want.
> - For a set hook, the => form implies a backed value and will set the
> property to whatever value that evaluates to.
>
> So these are equivalent:
>
> public $foo { set { $this->foo = $value; } }
> public $foo { set => $value; }
>
> These are equivalent:
>
> public string $foo {
> get {
> return strtoupper($this->foo);
> }
> }
> public string $foo { get => strtoupper($this->foo); }
>
> And this goes away:
>
> public string $foo => strtoupper($this->foo);
>
> That covers the common cases with an arrow-function-like syntax that behaves
> as you'd expect (it returns things), and allows a longer version with
> arbitrarily complex logic if desired. It also means that each syntax variant
> does mean something importantly different.
>
> Would that be an acceptable compromise? (Question for everyone.)
>
I think the examples given are clear, and the lack of the top-level short
closure-esque version makes it more obvious. Forgive me, I must have missed
some of the previous comments - is there a reason the 'full' setter can't
return a value, for the sake of consistency? I understand that you don't want
"return to set" to be the *only* option, for the sake of e.g. change/audit
logging type functionality (i.e. set and then some action to record that the
change was made), but it seems a little odd and inconsistent to me that the
return value of a short closure would be used when the return value of the long
version isn't. This isn't really a major issue, I'm just curious if there was
some explanation about it?
> == Re the $value variable in set
>
> Honestly, Rowan's earlier point here is the strongest argument in favor for
> me of the current RFC approach. Anywhere else in PHP, something that looks
> like a parameter and has no type, like `($param)`, means its type is `mixed`.
> It would be weird and confusing to be different here. That's above and
> beyond the issue of forcing people to retype something obvious every time.
> (I cite again, recent PHP's trend toward removing needless boilerplate, which
> is very good.) Requiring that the type be specified, for consistency, makes
> little sense if the type is not allowed to vary. You're just repeating a
> string from earlier on the same line, for no particular benefit.
>
> I genuinely don't understand the pushback on $value. It's something you
> learn once and never have to think about again. It's consistent.
>
> Ilija jokingly suggested making it always $value, unconditionally, and
> allowing only the type to be specified if widening:
>
> public int $foo { set(int|float) => floor($value); }
>
> Though I suspect that won't go over well, either. :-)
>
> So what makes the most sense to me is to keep $value optional, but IF you
> specify an alternate name, you must also specify a type (which may be wider).
> So these are equivalent:
>
> public int $foo { set (int $value) => $value + 1 }
> public int $foo { set => $value + 1 }
>
> And only those forms are legal. But you could also do this, if the situation
> called for it:
>
> public int $foo { set(int|float $num) => floor($num) + 1; }
>
> This "all or nothing" approach seems like it strikes the best balance, gives
> the most flexibility where needed while still having the least redundancy
> when not needed, and when a name/type is provided, its behavior is the same
> as for a method being inherited.
>
> Does that sound acceptable? (Again, question for everyone.)
>
My only question with this is the same as I had in an earlier reply (and I'm
not sure it was ever answered directly?), and you allude to this yourself:
everywhere *else*, `($var)` means a parameter with type `mixed`. Why is the
type *required* here, when you've specifically said you want to avoid
boilerplate? If we're going to assume people can understand that `(implicit
property-type $value) is implicit, surely we can also assume that they will
understand "specifying a parameter without a type" means the parameter has no
type (i.e. is `mixed`).
Again, for myself I'd be likely to type it (or regular parameters, properties,
etc) as `mixed` if that's what I want *anyway*, but the inconsistency here
seems odd, unless there's some until-now unknown drive to deprecate type-less
parameters/properties/etc.
> The alternative that gives the most future-flexibility is to do neither: The
> variable is called $value, period, you can't change it, and you can't change
> the type, either. There is no () after set, ever. Punt both of those to a
> later follow-up. I'd prefer to include both now, but including neither now
> is the next-safer option.
>
>
> ## Regarding $field
>
> Sigh, now y'all like it. :-P Most of the feedback on this has been negative,
> so I'm inclined to leave it out at this point, unless there's a major swing
> in feedback to bring it back. But the RFC seems more likely to pass without
> it than with right now.
>
> --Larry Garfield
>
Cheers
Stephen