Re: [PHP-DEV] [Pre-RFC] Improve language coherence for the behaviour of offsets and containers

2024-03-17 Thread Rowan Tommins [IMSoP]

On 11/03/2024 12:52, Gina P. Banyard wrote:

I would like to get some initial feedback on an RFC I've been working on for 
the last 5–6 months.
The RFC attempts to explain, and most importantly, improve the semantics around 
$container[$offset] as PHP is currently widely inconsistent.

[...]

RFC: 
https://github.com/Girgias/php-rfcs/blob/master/container-offset-behaviour.md



Hi Gina,

I've just read through this thoroughly, and am simultaneously impressed 
with your investigation, and amazed at how many inconsistencies you found.



I think the proposed granular interfaces absolutely make sense, given 
the different uses people have for such offsets. My only hesitation is 
that if you want "everything", things become quite verbose:


class Foo implements DimensionFetchable, DimensionWritable, 
FetchAppendable, DimensionUnsettable { ... }


function 
bar(DimensionFetchable&DimensionWritable&FetchAppendable&DimensionUnsettable 
$container) { ... }


Unfortunately, I can't think of an easy solution to this without some 
form of type aliases.



As an experiment, I tried writing a variation of Python's "defaultdict" 
[1] using all the new hooks (without actually testing it against any 
implementation). Here's what I came up with: 
https://gist.github.com/IMSoP/fbd60c5379ccefcab6c5af25eacc259b


Most of it is straight-forward, but a couple of things stood out:

* Separating offsetFetch from offsetGet is really useful, because we can 
avoid "auto-vivifying" a key that's only been read, never updated. In 
other words, isset($foo['a']) can remain false after running 
var_dump($foo['a']), but $foo['a']++ should still work.


* The fetchAppend hook is quite confusing to implement, because it's 
used in a few subtly different scenarios. For instance, if it's actually 
$container[][$offset] = $value there is an implicit requirement that 
fetchAppend should return array|DimensionWritable, but presumably that 
has to be enforced after fetchAppend has returned. I'm not sure if 
there's anything that can be improved here; it probably just needs some 
examples in the user manual.


[1] 
https://docs.python.org/3/library/collections.html#collections.defaultdict



Over all, I think this is a really great proposal, and hope it proceeds 
smoothly.


Regards,

--
Rowan Tommins
[IMSoP]


Re: [PHP-DEV] [Pre-RFC] Improve language coherence for the behaviour of offsets and containers

2024-03-12 Thread Rob Landers


On Tue, Mar 12, 2024, at 02:36, Gina P. Banyard wrote:
> On Monday, 11 March 2024 at 16:11, Larry Garfield  
> wrote:
> 
> > 
> > Woof. That's my kind of RFC. :-) The extensive background helps a lot, 
> > thank you.
> > 
> > I am generally in favor of this, and have wanted more fine-grained 
> > ArrayAccess interfaces for a long time.
> > 
> > Thoughts in no particular order:
> > 
> > * "Dimension" is clearly based on the existing engine hooks, but as a 
> > user-space dev, it's a very confusing term. Possibly documentable if 
> > there's an obvious logic for it we could present, but something more 
> > self-evident is probably better.
> 
> I am open to naming changes, but "dimension" is very much the standard and 
> ubiquitous term for this, see multidimensional arrays, n-dimensional array, 
> etc.
> 
> > 
> > * I am not sure combining get and exists into a single interface is right. 
> > I'm not certain it's wrong, either, though. What is the argument for 
> > combining them?
> > 
> > * Do we have some guidance for what offsetGet() should do when 
> > offsetExists() is false? I know there isn't really any now, but it seems 
> > like the sort of thing to think about here.
> 
> I will answer both points together.
> The reason that checking an offset exists is combined with being able to read 
> it is that it just makes sense to be together.
> If not and only the "read" operation is supported, how do you know that an 
> offset actually exists before accessing it? You just access it and get an 
> Exception that you need to catch?
> It is also required for the null coalesce operator to function properly.
> 
> What offsetGet() does if the offset doesn't exist is up to the 
> implementation, you could return null as a default value or throw an 
> exception.
> The only expectation is that *if* the offset exists, then reading the value 
> should be possible.
> 
> If you can only write to a container, then being able to check something 
> exists is somewhat meaningless.
> 
> 
> > 
> > * You sort of skirt around but don't directly address the impact of this 
> > change on the long-standing desire to make array functions accept 
> > arrayified objects. What if any impact would this have there? Eg, could 
> > some array functions be made to accept array|DimensionRead without breaking 
> > the world?
> > 
> > * How, if at all, does this relate to iterable? I think it has no impact, 
> > but since it's in the same problem space it's worth confirming.
> 
> The former is actually also more related to iterable, as most array functions 
> need to be able to traverse the container to be able to do anything 
> meaningful with it.
> Or there would need to be a way to provide a "manifest" of what offsets are 
> set for things like array_search or array_flip.
> 
> > 
> > * You mention at one point applying shim code ArrayAccess to make it work 
> > like the new interfaces. Please expand on that. Do you mean the engine 
> > would automatically insert some shims, like how `__toString()` now 
> > magically implies `implements Stringable`? Or some trait that objects could 
> > use (either a user-space trait or an engine trait) that would provide such 
> > shims in an overridable fashion? I don't fully follow here.
> > 
> > * If I read correctly, an internal object that implements one of the 
> > dimension handlers will automagically appear to user-land as having the 
> > corresponding interface, much like `__toString()`/`Stringable`. Is that 
> > correct? It seemed implied but not fully stated. If so, a brief code 
> > example would help to make it clear in such a long RFC.
> 
> Move around a later point to respond to them together.
> Internal objects don't actually magically implement Stringable, this is 
> something internal objects must do themselves.
> Moreover, internals objects can support string casts without implementing 
> __toString(), see the GMP object which is the only example of this in php-src 
> (and I should fix this or if someone else wants an easy PR to do feel free).
> 
> To understand how the shim works, I first need to explain how interfaces 
> being implemented in a class work.
> 
> Internal interfaces can have a special handler called 
> interface_gets_implemented which gets called during compilation of classes.
> This is the mechanism used by the Throwable and the DateTimeInterface 
> interfaces to prevent userland classes from implementing them.
> The ArrayAccess interface has (and all the other Dimension ones actually 
> have) such a handler, and the "shim" is to set the append, fetch, and 
> fetch-append dimension handlers on the CE to magically support those 
> operations on the class for the time being.
> No methods are created on the class, the new interfaces are not implemented.
> To override the behaviour of append/fetch/fetch-append the relevant new 
> interface should be implemented.
> 
> This is conceived as a temporary measure to ease migration for userland 
> classes that support those opera

Re: [PHP-DEV] [Pre-RFC] Improve language coherence for the behaviour of offsets and containers

2024-03-11 Thread Gina P. Banyard
On Monday, 11 March 2024 at 17:26, Rob Landers  wrote:

> I’m still reading the RFC, but here: 
> https://github.com/Girgias/php-rfcs/blob/master/container-offset-behaviour.md#objects
>
> You list “appending” twice.

Indeed, fixed.

I have also fixed another small mistake in the RFC that used offsetWrite() 
instead of offsetSet() for the method name of the DimensionWrite interface.

Best regards,
Gina P. Banyard

Re: [PHP-DEV] [Pre-RFC] Improve language coherence for the behaviour of offsets and containers

2024-03-11 Thread Gina P. Banyard
On Monday, 11 March 2024 at 16:11, Larry Garfield  
wrote:

> 
> Woof. That's my kind of RFC. :-) The extensive background helps a lot, thank 
> you.
> 
> I am generally in favor of this, and have wanted more fine-grained 
> ArrayAccess interfaces for a long time.
> 
> Thoughts in no particular order:
> 
> * "Dimension" is clearly based on the existing engine hooks, but as a 
> user-space dev, it's a very confusing term. Possibly documentable if there's 
> an obvious logic for it we could present, but something more self-evident is 
> probably better.

I am open to naming changes, but "dimension" is very much the standard and 
ubiquitous term for this, see multidimensional arrays, n-dimensional array, etc.

> 
> * I am not sure combining get and exists into a single interface is right. 
> I'm not certain it's wrong, either, though. What is the argument for 
> combining them?
> 
> * Do we have some guidance for what offsetGet() should do when offsetExists() 
> is false? I know there isn't really any now, but it seems like the sort of 
> thing to think about here.

I will answer both points together.
The reason that checking an offset exists is combined with being able to read 
it is that it just makes sense to be together.
If not and only the "read" operation is supported, how do you know that an 
offset actually exists before accessing it? You just access it and get an 
Exception that you need to catch?
It is also required for the null coalesce operator to function properly.

What offsetGet() does if the offset doesn't exist is up to the implementation, 
you could return null as a default value or throw an exception.
The only expectation is that *if* the offset exists, then reading the value 
should be possible.

If you can only write to a container, then being able to check something exists 
is somewhat meaningless.


> 
> * You sort of skirt around but don't directly address the impact of this 
> change on the long-standing desire to make array functions accept arrayified 
> objects. What if any impact would this have there? Eg, could some array 
> functions be made to accept array|DimensionRead without breaking the world?
> 
> * How, if at all, does this relate to iterable? I think it has no impact, but 
> since it's in the same problem space it's worth confirming.

The former is actually also more related to iterable, as most array functions 
need to be able to traverse the container to be able to do anything meaningful 
with it.
Or there would need to be a way to provide a "manifest" of what offsets are set 
for things like array_search or array_flip.

> 
> * You mention at one point applying shim code ArrayAccess to make it work 
> like the new interfaces. Please expand on that. Do you mean the engine would 
> automatically insert some shims, like how `__toString()` now magically 
> implies `implements Stringable`? Or some trait that objects could use (either 
> a user-space trait or an engine trait) that would provide such shims in an 
> overridable fashion? I don't fully follow here.
> 
> * If I read correctly, an internal object that implements one of the 
> dimension handlers will automagically appear to user-land as having the 
> corresponding interface, much like `__toString()`/`Stringable`. Is that 
> correct? It seemed implied but not fully stated. If so, a brief code example 
> would help to make it clear in such a long RFC.

Move around a later point to respond to them together.
Internal objects don't actually magically implement Stringable, this is 
something internal objects must do themselves.
Moreover, internals objects can support string casts without implementing 
__toString(), see the GMP object which is the only example of this in php-src 
(and I should fix this or if someone else wants an easy PR to do feel free).

To understand how the shim works, I first need to explain how interfaces being 
implemented in a class work.

Internal interfaces can have a special handler called 
interface_gets_implemented which gets called during compilation of classes.
This is the mechanism used by the Throwable and the DateTimeInterface 
interfaces to prevent userland classes from implementing them.
The ArrayAccess interface has (and all the other Dimension ones actually have) 
such a handler, and the "shim" is to set the append, fetch, and fetch-append 
dimension handlers on the CE to magically support those operations on the class 
for the time being.
No methods are created on the class, the new interfaces are not implemented.
To override the behaviour of append/fetch/fetch-append the relevant new 
interface should be implemented.

This is conceived as a temporary measure to ease migration for userland classes 
that support those operations already.


> 
> * Big +1 to removing the magic semi-silent casting when using weird key types.
> 
> * I feel like some of the sections could benefit from more short code 
> examples. Eg, What the heck does fetch-append on a null even look like? That 
> would h

Re: [PHP-DEV] [Pre-RFC] Improve language coherence for the behaviour of offsets and containers

2024-03-11 Thread Rob Landers


On Mon, Mar 11, 2024, at 13:52, Gina P. Banyard wrote:
> Hello internals,
> 
> I would like to get some initial feedback on an RFC I've been working on for 
> the last 5–6 months.
> The RFC attempts to explain, and most importantly, improve the semantics 
> around $container[$offset] as PHP is currently widely inconsistent.
> 
> The RFC is about six thousands words, so I would recommend a cup of 
> tea/coffee before starting to read it.
> 
> The main things which are still in flux is the naming of the new interfaces 
> and methods, the phrasing of error messages, and the behaviour around using 
> null as a container.
> Those are generally indicated with a TODO comment.
> 
> Of note is that I will also be going on holiday for 3 weeks from the 19th of 
> March, so ideally most of the feedback would come during that time frame so 
> that I can finalize the RFC after my holiday.
> 
> RFC: 
> https://github.com/Girgias/php-rfcs/blob/master/container-offset-behaviour.md
> 
> 
> Best regards,
> 
> Gina P. Banyard
> 

I’m still reading the RFC, but here: 
https://github.com/Girgias/php-rfcs/blob/master/container-offset-behaviour.md#objects

You list “appending” twice. 

— Rob

Re: [PHP-DEV] [Pre-RFC] Improve language coherence for the behaviour of offsets and containers

2024-03-11 Thread Larry Garfield
On Mon, Mar 11, 2024, at 12:52 PM, Gina P. Banyard wrote:
> Hello internals,
>
> I would like to get some initial feedback on an RFC I've been working 
> on for the last 5–6 months.
> The RFC attempts to explain, and most importantly, improve the 
> semantics around $container[$offset] as PHP is currently widely 
> inconsistent.
>
> The RFC is about six thousands words, so I would recommend a cup of 
> tea/coffee before starting to read it.
>
> The main things which are still in flux is the naming of the new 
> interfaces and methods, the phrasing of error messages, and the 
> behaviour around using null as a container.
> Those are generally indicated with a TODO comment.
>
> Of note is that I will also be going on holiday for 3 weeks from the 
> 19th of March, so ideally most of the feedback would come during that 
> time frame so that I can finalize the RFC after my holiday.
>
> RFC: 
> https://github.com/Girgias/php-rfcs/blob/master/container-offset-behaviour.md
>
>
> Best regards,
>
> Gina P. Banyard

Woof.  That's my kind of RFC. :-)  The extensive background helps a lot, thank 
you.

I am generally in favor of this, and have wanted more fine-grained ArrayAccess 
interfaces for a long time.

Thoughts in no particular order:

* "Dimension" is clearly based on the existing engine hooks, but as a 
user-space dev, it's a very confusing term.  Possibly documentable if there's 
an obvious logic for it we could present, but something more self-evident is 
probably better.

* I am not sure combining get and exists into a single interface is right.  I'm 
not certain it's wrong, either, though.  What is the argument for combining 
them?

* Do we have some guidance for what offsetGet() should do when offsetExists() 
is false?  I know there isn't really any now, but it seems like the sort of 
thing to think about here.

* You sort of skirt around but don't directly address the impact of this change 
on the long-standing desire to make array functions accept arrayified objects.  
What if any impact would this have there?  Eg, could some array functions be 
made to accept array|DimensionRead without breaking the world?

* How, if at all, does this relate to iterable?  I think it has no impact, but 
since it's in the same problem space it's worth confirming.

* You mention at one point applying shim code ArrayAccess to make it work like 
the new interfaces.  Please expand on that.  Do you mean the engine would 
automatically insert some shims, like how `__toString()` now magically implies 
`implements Stringable`?  Or some trait that objects could use (either a 
user-space trait or an engine trait) that would provide such shims in an 
overridable fashion?  I don't fully follow here.

* Big +1 to removing the magic semi-silent casting when using weird key types.

* I feel like some of the sections could benefit from more short code examples. 
 Eg, What the heck does fetch-append on a null even look like?  That would help 
illustrate why the current behavior is weird, or why some things need to stay 
non-obvious because they're used in odd cases. (Like how $a[1][2] is a by-ref 
fetch internally, something most people don't think about.)

* What would the changes to ArrayObject mean for a backing object that uses 
hooks on some properties?  It says __get/__set would be bypassed.  Would hooks 
also be bypassed?  (I have no clue what the preferred logic here is, honestly.  
Just thinking aloud and hoping hooks pass so that we have to think about this. 
:-) )

* If I read correctly, an internal object that implements one of the dimension 
handlers will automagically appear to user-land as having the corresponding 
interface, much like `__toString()`/`Stringable`.  Is that correct?  It seemed 
implied but not fully stated.  If so, a brief code example would help to make 
it clear in such a long RFC.

Again, thank you for your work on this, and I hope it passes easily.

--Larry Garfield