On Mon, Mar 4, 2024 at 4:40 PM Stephen Reay <[email protected]> wrote:
>
>
>
> > On 28 Feb 2024, at 06:17, Larry Garfield <[email protected]> wrote:
> >
> > On Sun, Feb 25, 2024, at 10:16 PM, Rowan Tommins [IMSoP] wrote:
> >> [Including my full previous reply, since the list and gmail currently
> >> aren't being friends. Apologies that this leads to rather a lot of
> >> reading in one go...]
> >
> > Eh, I'd prefer a few big emails that come in slowly to lots of little
> > emails that come in fast. :-)
> >
> >>> On 21/02/2024 18:55, Larry Garfield wrote:
> >>>> Hello again, fine Internalians.
> >>>>
> >>>> After much on-again/off-again work, Ilija and I are back with a more
> >>>> polished property access hooks/interface properties RFC.
> >>>
> >>>
> >>> Hello, and a huge thanks to both you and Ilija for the continued work
> >>> on this. I'd really like to see this feature make it into PHP, and
> >>> agree with a lot of the RFC.
> >>>
> >>>
> >>> My main concern is the proliferation of things that look the same but
> >>> act differently, and things that look different but act the same:
> >
> > *snip*
> >
> >>> - a and b are both what we might call "traditional" properties, and
> >>> equivalent to each other; a uses legacy syntax which we haven't
> >>> removed for some reason
> >
> > I don't know why we haven't removed `var` either. I can't recall the last
> > time I saw it in real code. But that's out of scope here.
> >
> > *snip*
> >
> >> I think there's some really great functionality in the RFC, and would
> >> love for it to succeed in some form, but I think it would benefit from
> >> removing some of the "magic".
> >>
> >>
> >> Regards,
> >>
> >> --
> >> Rowan Tommins
> >> [IMSoP]
> >
> >
> > I'm going to try and respond to a couple of different points together here,
> > including from later in the thread, as it's just easier.
> >
> > == Re, design philosophy:
> >
> >> In C#, all "properties" are virtual - as soon as you have any
> >> non-default "get", "set" or "init" definition, it's up to you to declare
> >> a separate "field" to store the value in. Swift's "computed properties"
> >> are similar: if you have a custom getter or setter, there is no backing
> >> store; to add behaviour to a "stored property", you use the separate
> >> "property observer" hooks.
> >>
> >> Kotlin's approach is philosophically the opposite: there are no fields,
> >> only properties, but properties can access a hidden "backing field" via
> >> the special keyword "field". Importantly, omitting the setter doesn't
> >> make the property read-only, it implies set(value) { field = value }
> >
> > A little history here to help clarify how we ended up where we are: The
> > original RFC as we designed it modeled very closely on Swift, with 4 hooks.
> > Using get/set at all would create a virtual property and you were on your
> > own, while the beforeSet/afterSet hooks would not. We ran that design by
> > some PHP Foundation sponsors a year ago (I don't actually know who, Roman
> > did it for us), and the general feedback was "we like the idea, but woof
> > this is complicated with all these hooks and having to make my own backing
> > property for all these little things. Couldn't this be simplified?" We
> > thought a bit more, and I off-handedly suggested to Ilija "I mean, would it
> > be possible to just detect if a get/set hook is using a backing store and
> > make it automatically? Then we could get rid of the before/after hooks."
> > He gave it a quick try and found that was straightforward, so we pivoted to
> > that simplified version. We then realized that we had... mostly just
> > recreated Kotlin's design, so shrugged happily and went on with life.
> >
> > As noted in an earlier email, C#, Kotlin, and Swift all have different
> > stances on the variable name for the incoming value. We originally modeled
> > on Swift so had that model (optional newVal name), and also because we
> > liked how compact it was. When we switched to the simplified, incidentally
> > Kotlin-esque approach, we just kept the optional variable as it works.
> >
> > I think where that ended up is pretty nice, personally, even if it is not a
> > direct map of any particular other language.
> >
> > == Re asymmetric typing:
> >
> > This is capability already present today if using a setter method.
> >
> > class Person {
> > private $name;
> >
> > public function setName(UnicodeString|string $name)
> > {
> > $this->name = $value instanceof UnicodeString ? $value : new
> > UnicodeString($value);
> > }
> > }
> >
> > And widening the parameter type in a child class is also entirely legal.
> > As the goal of the RFC is, essentially, "make most common getter/setter
> > patterns easy to add to a property without making an API-breaking method,
> > so people don't have to add redundant just-in-case getters and setters all
> > the time," covering an easy-to-cover use case seems like a good thing to do.
> >
> > It also ties into the question of the explict/implicit name, for the reason
> > you mentioned earlier (unspecified means mixed), not by intent. More on
> > that in another section.
> >
> > == Re virtual properties:
> >
> > Ilija and I talked this through, and there's pros and cons to a `virtual`
> > keyword. Ilija also suggested a `backed` keyword, which forces a backed
> > property to exist even if it's not used in the hook itself.
> >
> > * Adding `virtual` adds more work for the developer, but more clarity. It
> > would also mean $this->$propName or $this->{__PROPERTY__} would work "as
> > expected", since there's no auto-detection for virtual-ness. On the
> > downside, if you have a could-be-virtual property but never actually use
> > the backing value, you have an extra backing value hanging around in memory
> > that is inaccessible normally, but will still show up in some serialization
> > formats, which could be unexpected. If you omit one of the hooks and
> > forget to mark it virtual, you'll still get the default of the other
> > operation, which could be unexpected. (Mostly this would be for a
> > virtual-get that accidentally has a default setter because you forgot to
> > mark it `virtual`.)
> > * Doing autodetection as now, but with an added "make a backing value
> > anyway" flag would resolve the use case of "My set hook just calls a
> > method, and that method sets the property, but since the hook doesn't
> > mention the property it doesn't get created" problem. It would also allow
> > for $this->$propName to work if a property is explicitly backed. On the
> > flipside, it's one more thing to think about, and the above example it
> > solves would be trivially solved by having the method just return the value
> > to set and letting the set hook do the actual write, which is arguably
> > better and more reliable code anyway.
> > * The status quo (auto-detection based on the presence of $this->propName).
> > This has the advantage it "just works" in the 95% case, without having to
> > think about anything extra. The downside is it does have some odd edge
> > cases, like needing $this->propName to be explicitly used.
> >
> > I don't think any is an obvious winner. My personal preference would be
> > for status quo (auto-detect) or explicit-virtual always. I could probably
> > live with either, though I think I'd personally favor status quo. Thoughts
> > from others?
> >
>
> I agree that a flag to make the field *virtual* (thus disabling the backing
> store) makes more sense than a flag to make it backed; It's also easier to
> understand when comparing hooked properties with regular properties
> (essentially, backed is the default, you have to opt-in to it being virtual).
> I don't think the edge cases of "auto" make it worthwhile just to not need
> "virtual".
>
> > == Re reference-get
> >
> > Allowing backed properties to have a reference return creates a situation
> > where any writes would then bypass the set hook, and thus any validation
> > implemented there. That is, it makes the validation unreliable. A major
> > footgun. The question is, do we favor caveat-emptor flexibility or
> > correct-by-construction safety? Personally I always lead toward the
> > latter, though PHP in general is... schizophrenic about it, I'd say. :-)
> >
> > At this point, we'd much rather leave it blocked to avoid the issue; it's
> > easier to enable that loophole in the future if we really want it than to
> > get rid of it if it turns out to have been a bad idea.
> >
> > There is one edge case that *might* make sense: If there is no set hook
> > defined, then there's no set hook to worry about bypassing. So it may be
> > safe to allow &get on backed properties IFF there is no set hook. I worry
> > that is "one more quirky edge case", though, so as above it may be better
> > to skip for now as it's easier to add later than remove. But if the
> > consensus is to do that, we're open to it. (Question for everyone.)
> >
>
> I don't have strong feeling about this, but in general I usually tend to
> prefer options that are consistent, and give power/options to the developer.
> If references are opt-in anyway, I see that as accepting the trade-offs. If a
> developer doesn't want to allow by-ref modifications of the property, why
> would they make it referenceable in the first place? This sounds a bit like
> disallowing regular public properties because they might be modified outside
> the class - that's kind of the point, surely.
>
> > == Re
> >
> > == Re arrays
> >
> >> Regarding arrays, have you considered allowing array-index writes if
> > an &get hook is defined? i.e. "$x->foo['bar'] = 42;" could be treated
> > as semantically equivalent to "$_temp =& $x->foo; $_temp['bar'] = 42;
> > unset($_temp);"
> >
> > That's already discussed in the RFC:
> >
> >> The simplest approach would be to copy the array, modify it accordingly,
> >> and pass it to set hook. This would have a large and obscure performance
> >> penalty, and a set implementation would have no way of knowing which
> >> values had changed, leading to, for instance, code needing to revalidate
> >> all elements even if only one has changed.
> >
> > Unless we were OK with that bypassing the set hook entirely if defined,
> > which, as noted above, means any safety guarantees provided by a set hook
> > are bypassed, leading to untrustworthy code.
> >
> > == Re hook shorthands and return values
> >
> > Ilija and I have been discussing this for a bit, and we've both budged a
> > little. :-) Here's our counter-proposal:
> >
> > - Drop the "top level" shorthand, for get-only hooks.
> > - Keep the => shorthand for the get hook itself.
> > - For a set hook, the {} form has no return value; set the value yourself
> > however you want.
> > - For a set hook, the => form implies a backed value and will set the
> > property to whatever value that evaluates to.
> >
> > So these are equivalent:
> >
> > public $foo { set { $this->foo = $value; } }
> > public $foo { set => $value; }
> >
> > These are equivalent:
> >
> > public string $foo {
> > get {
> > return strtoupper($this->foo);
> > }
> > }
> > public string $foo { get => strtoupper($this->foo); }
> >
> > And this goes away:
> >
> > public string $foo => strtoupper($this->foo);
> >
> > That covers the common cases with an arrow-function-like syntax that
> > behaves as you'd expect (it returns things), and allows a longer version
> > with arbitrarily complex logic if desired. It also means that each syntax
> > variant does mean something importantly different.
> >
> > Would that be an acceptable compromise? (Question for everyone.)
> >
>
> I think the examples given are clear, and the lack of the top-level short
> closure-esque version makes it more obvious. Forgive me, I must have missed
> some of the previous comments - is there a reason the 'full' setter can't
> return a value, for the sake of consistency? I understand that you don't want
> "return to set" to be the *only* option, for the sake of e.g. change/audit
> logging type functionality (i.e. set and then some action to record that the
> change was made), but it seems a little odd and inconsistent to me that the
> return value of a short closure would be used when the return value of the
> long version isn't. This isn't really a major issue, I'm just curious if
> there was some explanation about it?
>
> > == Re the $value variable in set
> >
> > Honestly, Rowan's earlier point here is the strongest argument in favor for
> > me of the current RFC approach. Anywhere else in PHP, something that looks
> > like a parameter and has no type, like `($param)`, means its type is
> > `mixed`. It would be weird and confusing to be different here. That's
> > above and beyond the issue of forcing people to retype something obvious
> > every time. (I cite again, recent PHP's trend toward removing needless
> > boilerplate, which is very good.) Requiring that the type be specified,
> > for consistency, makes little sense if the type is not allowed to vary.
> > You're just repeating a string from earlier on the same line, for no
> > particular benefit.
> >
> > I genuinely don't understand the pushback on $value. It's something you
> > learn once and never have to think about again. It's consistent.
> >
> > Ilija jokingly suggested making it always $value, unconditionally, and
> > allowing only the type to be specified if widening:
> >
> > public int $foo { set(int|float) => floor($value); }
> >
> > Though I suspect that won't go over well, either. :-)
> >
> > So what makes the most sense to me is to keep $value optional, but IF you
> > specify an alternate name, you must also specify a type (which may be
> > wider). So these are equivalent:
> >
> > public int $foo { set (int $value) => $value + 1 }
> > public int $foo { set => $value + 1 }
> >
> > And only those forms are legal. But you could also do this, if the
> > situation called for it:
> >
> > public int $foo { set(int|float $num) => floor($num) + 1; }
> >
> > This "all or nothing" approach seems like it strikes the best balance,
> > gives the most flexibility where needed while still having the least
> > redundancy when not needed, and when a name/type is provided, its behavior
> > is the same as for a method being inherited.
> >
> > Does that sound acceptable? (Again, question for everyone.)
> >
>
> My only question with this is the same as I had in an earlier reply (and I'm
> not sure it was ever answered directly?), and you allude to this yourself:
> everywhere *else*, `($var)` means a parameter with type `mixed`. Why is the
> type *required* here, when you've specifically said you want to avoid
> boilerplate? If we're going to assume people can understand that `(implicit
> property-type $value) is implicit, surely we can also assume that they will
> understand "specifying a parameter without a type" means the parameter has no
> type (i.e. is `mixed`).
>
> Again, for myself I'd be likely to type it (or regular parameters,
> properties, etc) as `mixed` if that's what I want *anyway*, but the
> inconsistency here seems odd, unless there's some until-now unknown drive to
> deprecate type-less parameters/properties/etc.
>
>
> > The alternative that gives the most future-flexibility is to do neither:
> > The variable is called $value, period, you can't change it, and you can't
> > change the type, either. There is no () after set, ever. Punt both of
> > those to a later follow-up. I'd prefer to include both now, but including
> > neither now is the next-safer option.
> >
> >
> > ## Regarding $field
> >
> > Sigh, now y'all like it. :-P Most of the feedback on this has been
> > negative, so I'm inclined to leave it out at this point, unless there's a
> > major swing in feedback to bring it back. But the RFC seems more likely to
> > pass without it than with right now.
> >
> > --Larry Garfield
> >
>
>
> Cheers
>
>
> Stephen
I would think that simply using return-to-set would be the simplest
solution, if you need to run something after it's set, you can use the
regular way of running code after a return:
try {
return $value + 100;
} finally {
// this runs after returning
}
Robert Landers
Software Engineer
Utrecht NL