Re: [PHP-DEV] [RFC[ Property accessor hooks, take 2

Robert Landers Mon, 04 Mar 2024 07:52:30 -0800

On Mon, Mar 4, 2024 at 4:40 PM Stephen Reay <[email protected]> wrote:
>
>
>
> > On 28 Feb 2024, at 06:17, Larry Garfield <[email protected]> wrote:
> >
> > On Sun, Feb 25, 2024, at 10:16 PM, Rowan Tommins [IMSoP] wrote:
> >> [Including my full previous reply, since the list and gmail currently
> >> aren't being friends. Apologies that this leads to rather a lot of
> >> reading in one go...]
> >
> > Eh, I'd prefer a few big emails that come in slowly to lots of little 
> > emails that come in fast. :-)
> >
> >>> On 21/02/2024 18:55, Larry Garfield wrote:
> >>>> Hello again, fine Internalians.
> >>>>
> >>>> After much on-again/off-again work, Ilija and I are back with a more 
> >>>> polished property access hooks/interface properties RFC.
> >>>
> >>>
> >>> Hello, and a huge thanks to both you and Ilija for the continued work
> >>> on this. I'd really like to see this feature make it into PHP, and
> >>> agree with a lot of the RFC.
> >>>
> >>>
> >>> My main concern is the proliferation of things that look the same but
> >>> act differently, and things that look different but act the same:
> >
> > *snip*
> >
> >>> - a and b are both what we might call "traditional" properties, and
> >>> equivalent to each other; a uses legacy syntax which we haven't
> >>> removed for some reason
> >
> > I don't know why we haven't removed `var` either.  I can't recall the last 
> > time I saw it in real code.  But that's out of scope here.
> >
> > *snip*
> >
> >> I think there's some really great functionality in the RFC, and would
> >> love for it to succeed in some form, but I think it would benefit from
> >> removing some of the "magic".
> >>
> >>
> >> Regards,
> >>
> >> --
> >> Rowan Tommins
> >> [IMSoP]
> >
> >
> > I'm going to try and respond to a couple of different points together here, 
> > including from later in the thread, as it's just easier.
> >
> > == Re, design philosophy:
> >
> >> In C#, all "properties" are virtual - as soon as you have any
> >> non-default "get", "set" or "init" definition, it's up to you to declare
> >> a separate "field" to store the value in. Swift's "computed properties"
> >> are similar: if you have a custom getter or setter, there is no backing
> >> store; to add behaviour to a "stored property", you use the separate
> >> "property observer" hooks.
> >>
> >> Kotlin's approach is philosophically the opposite: there are no fields,
> >> only properties, but properties can access a hidden "backing field" via
> >> the special keyword "field". Importantly, omitting the setter doesn't
> >> make the property read-only, it implies set(value) { field = value }
> >
> > A little history here to help clarify how we ended up where we are: The 
> > original RFC as we designed it modeled very closely on Swift, with 4 hooks. 
> >  Using get/set at all would create a virtual property and you were on your 
> > own, while the beforeSet/afterSet hooks would not.  We ran that design by 
> > some PHP Foundation sponsors a year ago (I don't actually know who, Roman 
> > did it for us), and the general feedback was "we like the idea, but woof 
> > this is complicated with all these hooks and having to make my own backing 
> > property for all these little things.  Couldn't this be simplified?"  We 
> > thought a bit more, and I off-handedly suggested to Ilija "I mean, would it 
> > be possible to just detect if a get/set hook is using a backing store and 
> > make it automatically?  Then we could get rid of the before/after hooks."  
> > He gave it a quick try and found that was straightforward, so we pivoted to 
> > that simplified version.  We then realized that we had... mostly just 
> > recreated Kotlin's design, so shrugged happily and went on with life.
> >
> > As noted in an earlier email, C#, Kotlin, and Swift all have different 
> > stances on the variable name for the incoming value.  We originally modeled 
> > on Swift so had that model (optional newVal name), and also because we 
> > liked how compact it was.  When we switched to the simplified, incidentally 
> > Kotlin-esque approach, we just kept the optional variable as it works.
> >
> > I think where that ended up is pretty nice, personally, even if it is not a 
> > direct map of any particular other language.
> >
> > == Re asymmetric typing:
> >
> > This is capability already present today if using a setter method.
> >
> > class Person {
> >    private $name;
> >
> >    public function setName(UnicodeString|string $name)
> >    {
> >        $this->name  =  $value  instanceof UnicodeString ? $value : new  
> > UnicodeString($value);
> >    }
> > }
> >
> > And widening the parameter type in a child class is also entirely legal.  
> > As the goal of the RFC is, essentially, "make most common getter/setter 
> > patterns easy to add to a property without making an API-breaking method, 
> > so people don't have to add redundant just-in-case getters and setters all 
> > the time," covering an easy-to-cover use case seems like a good thing to do.
> >
> > It also ties into the question of the explict/implicit name, for the reason 
> > you mentioned earlier (unspecified means mixed), not by intent.  More on 
> > that in another section.
> >
> > == Re virtual properties:
> >
> > Ilija and I talked this through, and there's pros and cons to a `virtual` 
> > keyword.  Ilija also suggested a `backed` keyword, which forces a backed 
> > property to exist even if it's not used in the hook itself.
> >
> > * Adding `virtual` adds more work for the developer, but more clarity.  It 
> > would also mean $this->$propName or $this->{__PROPERTY__} would work "as 
> > expected", since there's no auto-detection for virtual-ness.  On the 
> > downside, if you have a could-be-virtual property but never actually use 
> > the backing value, you have an extra backing value hanging around in memory 
> > that is inaccessible normally, but will still show up in some serialization 
> > formats, which could be unexpected.  If you omit one of the hooks and 
> > forget to mark it virtual, you'll still get the default of the other 
> > operation, which could be unexpected.  (Mostly this would be for a 
> > virtual-get that accidentally has a default setter because you forgot to 
> > mark it `virtual`.)
> > * Doing autodetection as now, but with an added "make a backing value 
> > anyway" flag would resolve the use case of "My set hook just calls a 
> > method, and that method sets the property, but since the hook doesn't 
> > mention the property it doesn't get created" problem.  It would also allow 
> > for $this->$propName to work if a property is explicitly backed.  On the 
> > flipside, it's one more thing to think about, and the above example it 
> > solves would be trivially solved by having the method just return the value 
> > to set and letting the set hook do the actual write, which is arguably 
> > better and more reliable code anyway.
> > * The status quo (auto-detection based on the presence of $this->propName). 
> >  This has the advantage it "just works" in the 95% case, without having to 
> > think about anything extra.  The downside is it does have some odd edge 
> > cases, like needing $this->propName to be explicitly used.
> >
> > I don't think any is an obvious winner.  My personal preference would be 
> > for status quo (auto-detect) or explicit-virtual always.  I could probably 
> > live with either, though I think I'd personally favor status quo.  Thoughts 
> > from others?
> >
>
> I agree that a flag to make the field *virtual* (thus disabling the backing 
> store) makes more sense than a flag to make it backed; It's also easier to 
> understand when comparing hooked properties with regular properties 
> (essentially, backed is the default, you have to opt-in to it being virtual). 
> I don't think the edge cases of "auto" make it worthwhile just to not need 
> "virtual".
>
> > == Re reference-get
> >
> > Allowing backed properties to have a reference return creates a situation 
> > where any writes would then bypass the set hook, and thus any validation 
> > implemented there.  That is, it makes the validation unreliable.  A major 
> > footgun.  The question is, do we favor caveat-emptor flexibility or 
> > correct-by-construction safety?  Personally I always lead toward the 
> > latter, though PHP in general is... schizophrenic about it, I'd say. :-)
> >
> > At this point, we'd much rather leave it blocked to avoid the issue; it's 
> > easier to enable that loophole in the future if we really want it than to 
> > get rid of it if it turns out to have been a bad idea.
> >
> > There is one edge case that *might* make sense: If there is no set hook 
> > defined, then there's no set hook to worry about bypassing.  So it may be 
> > safe to allow &get on backed properties IFF there is no set hook.  I worry 
> > that is "one more quirky edge case", though, so as above it may be better 
> > to skip for now as it's easier to add later than remove.  But if the 
> > consensus is to do that, we're open to it.  (Question for everyone.)
> >
>
> I don't have strong feeling about this, but in general I usually tend to 
> prefer options that are consistent, and give power/options to the developer. 
> If references are opt-in anyway, I see that as accepting the trade-offs. If a 
> developer doesn't want to allow by-ref modifications of the property, why 
> would they make it referenceable in the first place? This sounds a bit like 
> disallowing regular public properties because they might be modified outside 
> the class - that's kind of the point, surely.
>
> > == Re
> >
> > == Re arrays
> >
> >> Regarding arrays, have you considered allowing array-index writes if
> > an &get hook is defined? i.e. "$x->foo['bar'] = 42;" could be treated
> > as semantically equivalent to "$_temp =& $x->foo; $_temp['bar'] = 42;
> > unset($_temp);"
> >
> > That's already discussed in the RFC:
> >
> >> The simplest approach would be to copy the array, modify it accordingly, 
> >> and pass it to set hook. This would have a large and obscure performance 
> >> penalty, and a set implementation would have no way of knowing which 
> >> values had changed, leading to, for instance, code needing to revalidate 
> >> all elements even if only one has changed.
> >
> > Unless we were OK with that bypassing the set hook entirely if defined, 
> > which, as noted above, means any safety guarantees provided by a set hook 
> > are bypassed, leading to untrustworthy code.
> >
> > == Re hook shorthands and return values
> >
> > Ilija and I have been discussing this for a bit, and we've both budged a 
> > little. :-)  Here's our counter-proposal:
> >
> > - Drop the "top level" shorthand, for get-only hooks.
> > - Keep the => shorthand for the get hook itself.
> > - For a set hook, the {} form has no return value; set the value yourself 
> > however you want.
> > - For a set hook, the => form implies a backed value and will set the 
> > property to whatever value that evaluates to.
> >
> > So these are equivalent:
> >
> > public $foo { set { $this->foo = $value; } }
> > public $foo { set => $value; }
> >
> > These are equivalent:
> >
> > public string $foo {
> >  get {
> >    return strtoupper($this->foo);
> >  }
> > }
> > public string $foo { get => strtoupper($this->foo); }
> >
> > And this goes away:
> >
> > public string $foo => strtoupper($this->foo);
> >
> > That covers the common cases with an arrow-function-like syntax that 
> > behaves as you'd expect (it returns things), and allows a longer version 
> > with arbitrarily complex logic if desired.  It also means that each syntax 
> > variant does mean something importantly different.
> >
> > Would that be an acceptable compromise?  (Question for everyone.)
> >
>
> I think the examples given are clear, and the lack of the top-level short 
> closure-esque version makes it more obvious. Forgive me, I must have missed 
> some of the previous comments - is there a reason the 'full' setter can't 
> return a value, for the sake of consistency? I understand that you don't want 
> "return to set" to be the *only* option, for the sake of e.g. change/audit 
> logging type functionality (i.e. set and then some action to record that the 
> change was made), but it seems a little odd and inconsistent to me that the 
> return value of a short closure would be used when the return value of the 
> long version isn't. This isn't really a major issue, I'm just curious if 
> there was some explanation about it?
>
> > == Re the $value variable in set
> >
> > Honestly, Rowan's earlier point here is the strongest argument in favor for 
> > me of the current RFC approach.  Anywhere else in PHP, something that looks 
> > like a parameter and has no type, like `($param)`, means its type is 
> > `mixed`.  It would be weird and confusing to be different here.  That's 
> > above and beyond the issue of forcing people to retype something obvious 
> > every time.  (I cite again, recent PHP's trend toward removing needless 
> > boilerplate, which is very good.)  Requiring that the type be specified, 
> > for consistency, makes little sense if the type is not allowed to vary.  
> > You're just repeating a string from earlier on the same line, for no 
> > particular benefit.
> >
> > I genuinely don't understand the pushback on $value.  It's something you 
> > learn once and never have to think about again.  It's consistent.
> >
> > Ilija jokingly suggested making it always $value, unconditionally, and 
> > allowing only the type to be specified if widening:
> >
> > public int $foo { set(int|float) => floor($value); }
> >
> > Though I suspect that won't go over well, either. :-)
> >
> > So what makes the most sense to me is to keep $value optional, but IF you 
> > specify an alternate name, you must also specify a type (which may be 
> > wider).  So these are equivalent:
> >
> > public int $foo { set (int $value) => $value + 1 }
> > public int $foo { set => $value + 1 }
> >
> > And only those forms are legal.  But you could also do this, if the 
> > situation called for it:
> >
> > public int $foo { set(int|float $num) => floor($num) + 1; }
> >
> > This "all or nothing" approach seems like it strikes the best balance, 
> > gives the most flexibility where needed while still having the least 
> > redundancy when not needed, and when a name/type is provided, its behavior 
> > is the same as for a method being inherited.
> >
> > Does that sound acceptable?  (Again, question for everyone.)
> >
>
> My only question with this is the same as I had in an earlier reply (and I'm 
> not sure it was ever answered directly?), and you allude to this yourself: 
> everywhere *else*, `($var)` means a parameter with type `mixed`. Why is the 
> type *required* here, when you've specifically said you want to avoid 
> boilerplate? If we're going to assume people can understand that `(implicit 
> property-type $value) is implicit, surely we can also assume that they will 
> understand "specifying a parameter without a type" means the parameter has no 
> type (i.e. is `mixed`).
>
> Again, for myself I'd be likely to type it (or regular parameters, 
> properties, etc) as `mixed` if that's what I want *anyway*, but the 
> inconsistency here seems odd, unless there's some until-now unknown drive to 
> deprecate type-less parameters/properties/etc.
>
>
> > The alternative that gives the most future-flexibility is to do neither: 
> > The variable is called $value, period, you can't change it, and you can't 
> > change the type, either.  There is no () after set, ever.  Punt both of 
> > those to a later follow-up.  I'd prefer to include both now, but including 
> > neither now is the next-safer option.
> >
> >
> > ## Regarding $field
> >
> > Sigh, now y'all like it. :-P  Most of the feedback on this has been 
> > negative, so I'm inclined to leave it out at this point, unless there's a 
> > major swing in feedback to bring it back.  But the RFC seems more likely to 
> > pass without it than with right now.
> >
> > --Larry Garfield
> >
>
>
> Cheers
>
>
> Stephen


I would think that simply using return-to-set would be the simplest
solution, if you need to run something after it's set, you can use the
regular way of running code after a return:

try {
  return $value + 100;
} finally {
  // this runs after returning
}

Robert Landers
Software Engineer
Utrecht NL

Re: [PHP-DEV] [RFC[ Property accessor hooks, take 2

Reply via email to