> On Jun 21, 2024, at 11:42, Niels Dossche <dossche.ni...@gmail.com> wrote:
> 
> On 21/06/2024 14:43, Robert Landers wrote:
>> On Fri, Jun 21, 2024 at 5:08 AM Andreas Hennings <andr...@dqxtech.net> wrote:
>>> 
>>> E.g. should something like array<int> be added to the type system in
>>> the future, or do we leave the type system behind, and rely on the new
>>> "guards"?
>>> public array $values is array<int>
>>> OR
>>> public array<int> $values
>>> 
>>> The concern here would be if in the future we plan to extend the type
>>> system in a way that is inconsistent or incompatible with the pattern
>>> matching system.
>>> 
>>> --- Andreas
>> 
>> I'm always surprised why arrays can't keep track of their internal
>> types. Every time an item is added to the map, just chuck in the type
>> and a count, then if it is removed, decrement the counter, and if
>> zero, remove the type. Thus checking if an array is `array<int>`
>> should be a near O(1) operation. Memory usage might be an issue (a
>> couple bytes per type in the array), but not terrible.... but then
>> again, I've been digging into the type system quite a bit over the
>> last few months.
> 
> And every time a modification happens, directly or indirectly, you'll
> have to modify the counts too. Given how much arrays / hash tables are
> used within the PHP codebase, this will eventually add up to a lot of
> overhead. A lot of internal functions that work with arrays will need
> to be audited and updated too. Lots of potential for introducing bugs.
> It's (unfortunately) not a matter of "just" adding some counts.


This is straying a bit from this RFC's discussion, but I'm wondering if a 
better approach to generics for arrays would be to just not do generics for 
arrays.

Instead, have generics be a class-only thing, and add new built-in types (along 
the lines of the classes/interfaces in the Data Structures extension) 
specifically to provide collection support. This would accomplish several 
things:

* Separate object types (e.g. Array, Map, OrderedMap, Set, SparseArray, etc) 
rather than one "array" type that does everything. Each could have underlying 
storage and accessors optimized for one specific use-case, rather than having 
to be efficient with several different use-cases.
* No BC breaks. array and all the existing array_* functions remain untouched 
and unchanged. Somewhere years down the line, they can be discouraged in favor 
of the new interfaces.
* Being objects, these new data types would all have a fancy OOP interface, 
which could make chaining operations easy.
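
To make the chaining point concrete, here is a rough userland sketch. The 
Sequence class and its filter/map/toArray methods are hypothetical names made 
up for illustration, not an existing or proposed API; a built-in type would 
presumably look different.

```php
<?php
declare(strict_types=1);

// Hypothetical collection class, for illustration only.
final class Sequence
{
    /** @param list<mixed> $items */
    public function __construct(private array $items = []) {}

    // Returns a new Sequence holding only the items $keep accepts.
    public function filter(callable $keep): self
    {
        // array_filter preserves keys, so reindex to keep a plain list.
        return new self(array_values(array_filter($this->items, $keep)));
    }

    // Returns a new Sequence with $fn applied to every item.
    public function map(callable $fn): self
    {
        return new self(array_map($fn, $this->items));
    }

    // Conversion back to a legacy array at an API boundary.
    public function toArray(): array
    {
        return $this->items;
    }
}

$result = (new Sequence([1, 2, 3, 4]))
    ->filter(fn (int $n): bool => $n % 2 === 0)
    ->map(fn (int $n): int => $n * 10)
    ->toArray();
// $result is [20, 40]
```

The equivalent with today's functions nests inside-out: 
array_map($fn, array_values(array_filter($items, $keep))).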

The major interoperability concern in this model would be the cost of 
translating between the new types and legacy array types at API boundaries 
with legacy code. This might limit their utility to greenfield development. But 
since these would be entirely new, opt-in types, there are no direct BC 
concerns, and some of the typechecking perf hit from validating inserts/updates 
could perhaps be elided by the optimizer in the presence of typehints. (e.g. if 
you have an Array<int> and you insert a value the compiler or optimizer can 
prove is an int, you don't need a runtime type check.) Something would probably 
also have to be done to preserve the COW semantics that array has, without 
requiring explicit clone operations.
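
A minimal sketch of those last two points, assuming PHP 8.0 or later. 
TypedList is a hypothetical userland stand-in for something like Array<int>, 
and its get_debug_type() comparison is only an exact-name match, not real 
generics. It shows the per-insert runtime check, and why plain objects don't 
behave like arrays on assignment.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for a built-in Array<T>; illustration only.
final class TypedList
{
    private array $items = [];

    // $type is a name as reported by get_debug_type(), e.g. 'int'.
    public function __construct(private string $type) {}

    // The runtime check an optimizer could skip when the type is provable.
    public function push(mixed $value): void
    {
        if (get_debug_type($value) !== $this->type) {
            throw new TypeError(
                "Expected {$this->type}, got " . get_debug_type($value)
            );
        }
        $this->items[] = $value;
    }

    public function count(): int
    {
        return count($this->items);
    }
}

// Arrays have value semantics: the copy is separated on write.
$a = [1, 2];
$b = $a;
$b[] = 3;
// count($a) is still 2.

// Objects have handle semantics: $x and $y are the same list.
$x = new TypedList('int');
$x->push(1);
$y = $x;
$y->push(2);
// $x->count() is now 2 as well.

// Today an independent copy needs an explicit clone.
$z = clone $x;
$z->push(3);
// $x->count() is still 2; $z->count() is 3.
```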

-John
