Re: [PHP-DEV] [RFC[ Property accessor hooks, take 2

Larry Garfield Wed, 28 Feb 2024 06:27:08 -0800

On Wed, Feb 28, 2024, at 7:49 AM, Stephen Reay wrote:

*snip*


>> == Re virtual properties:
>> 
>> Ilija and I talked this through, and there's pros and cons to a `virtual` 
>> keyword.  Ilija also suggested a `backed` keyword, which forces a backed 
>> property to exist even if it's not used in the hook itself.
>> 
>> * Adding `virtual` adds more work for the developer, but more clarity.  It 
>> would also mean $this->$propName or $this->{__PROPERTY__} would work "as 
>> expected", since there's no auto-detection for virtual-ness.  On the 
>> downside, if you have a could-be-virtual property but never actually use the 
>> backing value, you have an extra backing value hanging around in memory that 
>> is inaccessible normally, but will still show up in some serialization 
>> formats, which could be unexpected.  If you omit one of the hooks and forget 
>> to mark it virtual, you'll still get the default of the other operation, 
>> which could be unexpected.  (Mostly this would be for a virtual-get that 
>> accidentally has a default setter because you forgot to mark it `virtual`.)
>> * Doing autodetection as now, but with an added "make a backing value 
>> anyway" flag would resolve the use case of "My set hook just calls a method, 
>> and that method sets the property, but since the hook doesn't mention the 
>> property it doesn't get created" problem.  It would also allow for 
>> $this->$propName to work if a property is explicitly backed.  On the 
>> flipside, it's one more thing to think about, and the above example it 
>> solves would be trivially solved by having the method just return the value 
>> to set and letting the set hook do the actual write, which is arguably 
>> better and more reliable code anyway.
>> * The status quo (auto-detection based on the presence of $this->propName).  
>> This has the advantage it "just works" in the 95% case, without having to 
>> think about anything extra.  The downside is it does have some odd edge 
>> cases, like needing $this->propName to be explicitly used.  
>> 
>> I don't think any is an obvious winner.  My personal preference would be for 
>> status quo (auto-detect) or explicit-virtual always.  I could probably live 
>> with either, though I think I'd personally favor status quo.  Thoughts from 
>> others?
>> 
>
> I agree that a flag to make the field *virtual* (thus disabling the 
> backing store) makes more sense than a flag to make it backed; It's 
> also easier to understand when comparing hooked properties with regular 
> properties (essentially, backed is the default, you have to opt-in to 
> it being virtual). I don't think the edge cases of "auto" make it 
> worthwhile just to not need "virtual". 

Uh, I can't tell if you're saying "use status quo" or "use the virtual 
keyword."  Please clarify. :-)

>> == Re reference-get
>> 
>> Allowing backed properties to have a reference return creates a situation 
>> where any writes would then bypass the set hook, and thus any validation 
>> implemented there.  That is, it makes the validation unreliable.  A major 
>> footgun.  The question is, do we favor caveat-emptor flexibility or 
>> correct-by-construction safety?  Personally I always lead toward the latter, 
>> though PHP in general is... schizophrenic about it, I'd say. :-)
>> 
>> At this point, we'd much rather leave it blocked to avoid the issue; it's 
>> easier to enable that loophole in the future if we really want it than to 
>> get rid of it if it turns out to have been a bad idea.
>> 
>> There is one edge case that *might* make sense: If there is no set hook 
>> defined, then there's no set hook to worry about bypassing.  So it may be 
>> safe to allow &get on backed properties IFF there is no set hook.  I worry 
>> that is "one more quirky edge case", though, so as above it may be better to 
>> skip for now as it's easier to add later than remove.  But if the consensus 
>> is to do that, we're open to it.  (Question for everyone.)
>> 
>
> I don't have strong feeling about this, but in general I usually tend 
> to prefer options that are consistent, and give power/options to the 
> developer. If references are opt-in anyway, I see that as accepting the 
> trade-offs. If a developer doesn't want to allow by-ref modifications 
> of the property, why would they make it referenceable in the first 
> place? This sounds a bit like disallowing regular public properties 
> because they might be modified outside the class - that's kind of the 
> point, surely.
>
>> == Re 
>> 
>> == Re arrays
>> 
>>> Regarding arrays, have you considered allowing array-index writes if
>> an &get hook is defined? i.e. "$x->foo['bar'] = 42;" could be treated
>> as semantically equivalent to "$_temp =& $x->foo; $_temp['bar'] = 42;
>> unset($_temp);"
>> 
>> That's already discussed in the RFC:
>> 
>>> The simplest approach would be to copy the array, modify it accordingly, 
>>> and pass it to set hook. This would have a large and obscure performance 
>>> penalty, and a set implementation would have no way of knowing which values 
>>> had changed, leading to, for instance, code needing to revalidate all 
>>> elements even if only one has changed.
>> 
>> Unless we were OK with that bypassing the set hook entirely if defined, 
>> which, as noted above, means any safety guarantees provided by a set hook 
>> are bypassed, leading to untrustworthy code.
>> 
>> == Re hook shorthands and return values
>> 
>> Ilija and I have been discussing this for a bit, and we've both budged a 
>> little. :-)  Here's our counter-proposal:

*snip*

> I think the examples given are clear, and the lack of the top-level 
> short closure-esque version makes it more obvious. Forgive me, I must 
> have missed some of the previous comments - is there a reason the 
> 'full' setter can't return a value, for the sake of consistency? I 
> understand that you don't want "return to set" to be the *only* option, 
> for the sake of e.g. change/audit logging type functionality (i.e. set 
> and then some action to record that the change was made), but it seems 
> a little odd and inconsistent to me that the return value of a short 
> closure would be used when the return value of the long version isn't. 
> This isn't really a major issue, I'm just curious if there was some 
> explanation about it?

Mainly because it introduces a lot more complexity.  As far as I'm aware, 
determining at runtime whether or not to make use of the return value is 
impossible.  (Ie, we cannot differentiate between "return", "no return 
statement", and "return null".)  So it would require compile time branching to 
generate two different pathways after lexically detecting if there is a 
"return" token present somewhere in the hook body.  That is probably doable, 
technically, but introduces more complexity to an already necessarily-large 
RFC.  It's also something that is simple enough to add later as an option 
without any BC breaks or implications for other parts of the design, so it's 
safe to punt.  (Some things are harder to punt on than others, as noted, but 
this one is easy/safe to skip for now.)

>> == Re the $value variable in set

*snip*

>> So what makes the most sense to me is to keep $value optional, but IF you 
>> specify an alternate name, you must also specify a type (which may be 
>> wider).  So these are equivalent:
>> 
>> public int $foo { set (int $value) => $value + 1 }
>> public int $foo { set => $value + 1 }
>> 
>> And only those forms are legal.  But you could also do this, if the 
>> situation called for it:
>> 
>> public int $foo { set(int|float $num) => floor($num) + 1; }
>> 
>> This "all or nothing" approach seems like it strikes the best balance, gives 
>> the most flexibility where needed while still having the least redundancy 
>> when not needed, and when a name/type is provided, its behavior is the same 
>> as for a method being inherited.
>> 
>> Does that sound acceptable?  (Again, question for everyone.)
>> 
>
> My only question with this is the same as I had in an earlier reply 
> (and I'm not sure it was ever answered directly?), and you allude to 
> this yourself: everywhere *else*, `($var)` means a parameter with type 
> `mixed`. Why is the type *required* here, when you've specifically said 
> you want to avoid boilerplate? If we're going to assume people can 
> understand that `(implicit property-type $value) is implicit, surely we 
> can also assume that they will understand "specifying a parameter 
> without a type" means the parameter has no type (i.e. is `mixed`).
>
> Again, for myself I'd be likely to type it (or regular parameters, 
> properties, etc) as `mixed` if that's what I want *anyway*, but the 
> inconsistency here seems odd, unless there's some until-now unknown 
> drive to deprecate type-less parameters/properties/etc.

If we went this route, then an untyped set param would likely imply "mixed", 
just like on methods.  Which, since mixed is the super type of everything, 
would still technically work, but would weaken the type enforcement and thus 
static analysis potential.  (Just as a method param can be widened in a child 
class all the way to mixed/omitted, and it would be unwise for all the same 
reasons.)

In the RFC as currently written, omitted means "derive from the property," 
which is a concept that doesn't exist in methods; the closest equivalent would 
be if omitting a type in a child method parameter meant "use the parent's type 
implicitly," which is not how that works right now.

--Larry Garfield

Re: [PHP-DEV] [RFC[ Property accessor hooks, take 2

Reply via email to