> On 28 Feb 2024, at 21:25, Larry Garfield <la...@garfieldtech.com> wrote:
>
> On Wed, Feb 28, 2024, at 7:49 AM, Stephen Reay wrote:
>
> *snip*
>
>>> == Re virtual properties:
>>>
>>> Ilija and I talked this through, and there's pros and cons to a `virtual`
>>> keyword. Ilija also suggested a `backed` keyword, which forces a backed
>>> property to exist even if it's not used in the hook itself.
>>>
>>> * Adding `virtual` adds more work for the developer, but more clarity. It
>>> would also mean $this->$propName or $this->{__PROPERTY__} would work "as
>>> expected", since there's no auto-detection for virtual-ness. On the
>>> downside, if you have a could-be-virtual property but never actually use
>>> the backing value, you have an extra backing value hanging around in memory
>>> that is inaccessible normally, but will still show up in some serialization
>>> formats, which could be unexpected. If you omit one of the hooks and
>>> forget to mark it virtual, you'll still get the default of the other
>>> operation, which could be unexpected. (Mostly this would be for a
>>> virtual-get that accidentally has a default setter because you forgot to
>>> mark it `virtual`.)
>>> * Doing autodetection as now, but with an added "make a backing value
>>> anyway" flag would resolve the use case of "My set hook just calls a
>>> method, and that method sets the property, but since the hook doesn't
>>> mention the property it doesn't get created" problem. It would also allow
>>> for $this->$propName to work if a property is explicitly backed. On the
>>> flipside, it's one more thing to think about, and the above example it
>>> solves would be trivially solved by having the method just return the value
>>> to set and letting the set hook do the actual write, which is arguably
>>> better and more reliable code anyway.
>>> * The status quo (auto-detection based on the presence of $this->propName).
>>> This has the advantage it "just works" in the 95% case, without having to
>>> think about anything extra. The downside is it does have some odd edge
>>> cases, like needing $this->propName to be explicitly used.
>>>
>>> I don't think any is an obvious winner. My personal preference would be
>>> for status quo (auto-detect) or explicit-virtual always. I could probably
>>> live with either, though I think I'd personally favor status quo. Thoughts
>>> from others?
>>>
>>
>> I agree that a flag to make the field *virtual* (thus disabling the
>> backing store) makes more sense than a flag to make it backed; It's
>> also easier to understand when comparing hooked properties with regular
>> properties (essentially, backed is the default, you have to opt-in to
>> it being virtual). I don't think the edge cases of "auto" make it
>> worthwhile just to not need "virtual".
>
> Uh, I can't tell if you're saying "use status quo" or "use the virtual
> keyword." Please clarify. :-)
I'm saying I think an explicit `virtual` keyword is the better option of the
three (because the benefits of the 'auto' mode don't outweigh the edge cases it
introduces).
>
>>> == Re reference-get
>>>
>>> Allowing backed properties to have a reference return creates a situation
>>> where any writes would then bypass the set hook, and thus any validation
>>> implemented there. That is, it makes the validation unreliable. A major
>>> footgun. The question is, do we favor caveat-emptor flexibility or
>>> correct-by-construction safety? Personally I always lead toward the
>>> latter, though PHP in general is... schizophrenic about it, I'd say. :-)
>>>
>>> At this point, we'd much rather leave it blocked to avoid the issue; it's
>>> easier to enable that loophole in the future if we really want it than to
>>> get rid of it if it turns out to have been a bad idea.
>>>
>>> There is one edge case that *might* make sense: If there is no set hook
>>> defined, then there's no set hook to worry about bypassing. So it may be
>>> safe to allow &get on backed properties IFF there is no set hook. I worry
>>> that is "one more quirky edge case", though, so as above it may be better
>>> to skip for now as it's easier to add later than remove. But if the
>>> consensus is to do that, we're open to it. (Question for everyone.)
>>>
>>
>> I don't have strong feeling about this, but in general I usually tend
>> to prefer options that are consistent, and give power/options to the
>> developer. If references are opt-in anyway, I see that as accepting the
>> trade-offs. If a developer doesn't want to allow by-ref modifications
>> of the property, why would they make it referenceable in the first
>> place? This sounds a bit like disallowing regular public properties
>> because they might be modified outside the class - that's kind of the
>> point, surely.
>>
>>> == Re
>>>
>>> == Re arrays
>>>
>>>> Regarding arrays, have you considered allowing array-index writes if
>>> an &get hook is defined? i.e. "$x->foo['bar'] = 42;" could be treated
>>> as semantically equivalent to "$_temp =& $x->foo; $_temp['bar'] = 42;
>>> unset($_temp);"
>>>
>>> That's already discussed in the RFC:
>>>
>>>> The simplest approach would be to copy the array, modify it accordingly,
>>>> and pass it to set hook. This would have a large and obscure performance
>>>> penalty, and a set implementation would have no way of knowing which
>>>> values had changed, leading to, for instance, code needing to revalidate
>>>> all elements even if only one has changed.
>>>
>>> Unless we were OK with that bypassing the set hook entirely if defined,
>>> which, as noted above, means any safety guarantees provided by a set hook
>>> are bypassed, leading to untrustworthy code.
>>>
>>> == Re hook shorthands and return values
>>>
>>> Ilija and I have been discussing this for a bit, and we've both budged a
>>> little. :-) Here's our counter-proposal:
>
> *snip*
>
>> I think the examples given are clear, and the lack of the top-level
>> short closure-esque version makes it more obvious. Forgive me, I must
>> have missed some of the previous comments - is there a reason the
>> 'full' setter can't return a value, for the sake of consistency? I
>> understand that you don't want "return to set" to be the *only* option,
>> for the sake of e.g. change/audit logging type functionality (i.e. set
>> and then some action to record that the change was made), but it seems
>> a little odd and inconsistent to me that the return value of a short
>> closure would be used when the return value of the long version isn't.
>> This isn't really a major issue, I'm just curious if there was some
>> explanation about it?
>
> Mainly because it introduces a lot more complexity. As far as I'm aware,
> determining at runtime whether or not to make use of the return value is
> impossible. (Ie, we cannot differentiate between "return", "no return
> statement", and "return null".) So it would require compile time branching
> to generate two different pathways after lexically detecting if there is a
> "return" token present somewhere in the hook body. That is probably doable,
> technically, but introduces more complexity to an already necessarily-large
> RFC. It's also something that is simple enough to add later as an option
> without any BC breaks or implications for other parts of the design, so it's
> safe to punt. (Some things are harder to punt on than others, as noted, but
> this one is easy/safe to skip for now.)
>
Right, i hadn't considered that whole "everything implicitly returns, even if
it's null" scenario. Makes sense. The inconsistency is still a bit jarring but
I understand the reasoning now, thanks.
>>> == Re the $value variable in set
>
> *snip*
>
>>> So what makes the most sense to me is to keep $value optional, but IF you
>>> specify an alternate name, you must also specify a type (which may be
>>> wider). So these are equivalent:
>>>
>>> public int $foo { set (int $value) => $value + 1 }
>>> public int $foo { set => $value + 1 }
>>>
>>> And only those forms are legal. But you could also do this, if the
>>> situation called for it:
>>>
>>> public int $foo { set(int|float $num) => floor($num) + 1; }
>>>
>>> This "all or nothing" approach seems like it strikes the best balance,
>>> gives the most flexibility where needed while still having the least
>>> redundancy when not needed, and when a name/type is provided, its behavior
>>> is the same as for a method being inherited.
>>>
>>> Does that sound acceptable? (Again, question for everyone.)
>>>
>>
>> My only question with this is the same as I had in an earlier reply
>> (and I'm not sure it was ever answered directly?), and you allude to
>> this yourself: everywhere *else*, `($var)` means a parameter with type
>> `mixed`. Why is the type *required* here, when you've specifically said
>> you want to avoid boilerplate? If we're going to assume people can
>> understand that `(implicit property-type $value) is implicit, surely we
>> can also assume that they will understand "specifying a parameter
>> without a type" means the parameter has no type (i.e. is `mixed`).
>>
>> Again, for myself I'd be likely to type it (or regular parameters,
>> properties, etc) as `mixed` if that's what I want *anyway*, but the
>> inconsistency here seems odd, unless there's some until-now unknown
>> drive to deprecate type-less parameters/properties/etc.
>
> If we went this route, then an untyped set param would likely imply "mixed",
> just like on methods. Which, since mixed is the super type of everything,
> would still technically work, but would weaken the type enforcement and thus
> static analysis potential. (Just as a method param can be widened in a child
> class all the way to mixed/omitted, and it would be unwise for all the same
> reasons.)
>
Having a `mixed` param in the set hook shouldn't weaken the actual backing
parameter though - when the hook writes to `$this->prop`, the parent type is
still enforced, *surely*? If not, why not?
As for how static analysis tools handle this concept - I'd have thought it's
too early to suggest what static analysis tools will or won't support given how
much they already support based on less-formal syntax like docblocks. It's
*already* possible to have a property that is reported (to static analysis
tools/IDEs/etc) as a fixed type, but accepts a wider type on write, with the
current mish-mash of typed properties, docblocks, magic getters and setters,
and the bizarre `unset` behaviour. This is simply converting that into
standardised syntax. The RFC itself proposes a scenario where a wider type is
accepted in the set hook. I find it hard to believe that a static analysis tool
can model "this property is `Foo` but it accepts `string|Foo` on write" but not
"this property is `Foo` but it accepts `mixed` on write". Heck if that's such a
problem what is said tool going to do when someone *explicitly* widens the
parameter to `mixed` in a set hook?
I would argue that if the language is providing better support for typed
properties by adding 'hooks' like this, the need for static analysis of those
specific parts reduces greatly - if someone wants to accept `mixed` when
storing as a string, and convert it in the hook, and the language can **enforce
those types at runtime**, why should some hypothetical static analysis be a
hangup for that?
> In the RFC as currently written, omitted means "derive from the property,"
> which is a concept that doesn't exist in methods; the closest equivalent
> would be if omitting a type in a child method parameter meant "use the
> parent's type implicitly," which is not how that works right now.
For the third time: I'm well aware of how parameter types work everywhere else,
and that's why I'm asking why the same behaviour isn't being followed here?
- You've said you want to avoid boilerplate; and
- You've said you would expect most people to just use the implicit `same-type
$value` parameter; and
- You've acknowledged that the existing 'standard' is that a parameter without
a type is considered `mixed`; and
- You've acknowledged in your RFC that there is a use-case for wanting to
accept a wider type than what a property stores internally.
So why then is it so unacceptable that the existing standard be followed, such
that a set hook with an "untyped" parameter would be treated as `mixed` as it
is everywhere else?
Yes, I know you said "widening to `mixed` is unwise". I don't seem to recall
amongst all the type-related previous RFCs, any that suggested that child
parameters widening to `mixed` (either explicitly or implicitly) should be
deprecated, so I'm sorry but I don't see much value in that argument.
>
> --Larry Garfield
>
Cheers
Stephen