> On Jun 21, 2024, at 11:42, Niels Dossche <dossche.ni...@gmail.com> wrote:
>
> On 21/06/2024 14:43, Robert Landers wrote:
>> On Fri, Jun 21, 2024 at 5:08 AM Andreas Hennings <andr...@dqxtech.net> wrote:
>>>
>>> E.g. should something like array<int> be added to the type system in
>>> the future, or do we leave the type system behind, and rely on the new
>>> "guards"?
>>> public array $values is array<int>
>>> OR
>>> public array<int> $values
>>>
>>> The concern here would be if in the future we plan to extend the type
>>> system in a way that is inconsistent or incompatible with the pattern
>>> matching system.
>>>
>>> --- Andreas
>>
>> I'm always surprised why arrays can't keep track of their internal
>> types. Every time an item is added to the map, just chuck in the type
>> and a count, then if it is removed, decrement the counter, and if
>> zero, remove the type. Thus checking if an array is `array<int>`
>> should be a near O(1) operation. Memory usage might be an issue (a
>> couple bytes per type in the array), but not terrible.... but then
>> again, I've been digging into the type system quite a bit over the
>> last few months.
>
> And every time a modification happens, directly or indirectly, you'll
> have to modify the counts too. Given how much arrays / hash tables are
> used within the PHP codebase, this will eventually add up to a lot of
> overhead. A lot of internal functions that work with arrays will need
> to be audited and updated too. Lots of potential for introducing bugs.
> It's (unfortunately) not a matter of "just" adding some counts.

This is straying a bit from this RFC's discussion, but I'm wondering if a
better approach to generics for arrays would be to just not do generics for
arrays.

Instead, have generics be a class-only thing, and add new built-in types (along
the lines of the classes/interfaces in the Data Structures extension)
specifically to provide collection support. This would accomplish several
things:

* Separate object types (e.g. Array, Map, OrderedMap, Set, SparseArray, etc)
rather than one "array" type that does everything. Each could have underlying
storage and accessors optimized for one specific use-case, rather than having
to be efficient with several different use-cases.
* No BC breaks. array and all the existing array_* functions remain untouched
and unchanged. Somewhere years down the line, they can be discouraged in favor
of the new interfaces.
* Being objects, these new data types would all have a fancy OOP interface,
which could make chaining operations easy.
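
To make the chaining point concrete, here's a rough userland sketch of what
one such type might feel like. IntList and its methods are names I made up for
illustration, not anything from the Data Structures extension or an existing
proposal. A real built-in would presumably be generic and implemented in the
engine rather than wrapping a plain array like this does.

```php
<?php
declare(strict_types=1);

// Hypothetical userland stand-in for a built-in Array<int>.
// All names here are illustrative, not an existing or proposed API.
final class IntList implements IteratorAggregate, Countable
{
    /** @var list<int> */
    private array $items;

    public function __construct(int ...$items)
    {
        // The variadic int type makes the engine check every element on entry.
        $this->items = array_values($items);
    }

    public function map(callable $fn): self
    {
        // Spreading into the constructor re-validates each result as int.
        return new self(...array_map($fn, $this->items));
    }

    public function filter(callable $fn): self
    {
        // array_filter() preserves keys, so reindex before spreading.
        return new self(...array_values(array_filter($this->items, $fn)));
    }

    public function toArray(): array
    {
        return $this->items;
    }

    public function count(): int
    {
        return count($this->items);
    }

    public function getIterator(): ArrayIterator
    {
        return new ArrayIterator($this->items);
    }
}

$evens = (new IntList(1, 2, 3, 4))
    ->filter(fn (int $n): bool => $n % 2 === 0)
    ->map(fn (int $n): int => $n * 10);

var_dump($evens->toArray()); // [20, 40]
```

The existing array_* functions are untouched here. The object just delegates
to them internally, which is roughly the "no BC breaks" point above.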

The major interoperability concern in this model would be the cost of
translating between the new types and legacy array types at API boundaries for
legacy code. This might limit utility to greenfield development. But since
these would be entirely new, opt-in types, there are no direct BC concerns, and
maybe some of the typechecking perf hit from validating inserts/updates could
be elided by the optimizer in the presence of typehints (e.g. if you have an
Array<int> and you insert a value the compiler or optimizer can prove is an
int, you don't need to do a runtime type check). There'd also probably have to
be something done to maintain the COW semantics that array has without
requiring explicit clone operations.
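
As a rough illustration of those last two points, here's a userland sketch
under some loud assumptions: TypedList, its constructor argument and
withAppended() are all hypothetical, and the type is passed as a string only
because PHP has no generics today. It shows the per-insert runtime check that
an optimizer might be able to skip, and one possible way to get value-like
behavior without the caller writing clone.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for a built-in Array<T>. The class name, the string
// type argument and withAppended() are assumptions for illustration only.
final class TypedList
{
    private array $items = [];

    public function __construct(private string $type) {}

    public function withAppended(mixed $value): self
    {
        // The runtime check that a compiler or optimizer could skip when it
        // can already prove the value's type.
        if (get_debug_type($value) !== $this->type) {
            throw new TypeError(
                "Expected {$this->type}, got " . get_debug_type($value)
            );
        }

        // Value-like semantics without a clone at the call site: return a
        // modified copy and leave the original object untouched.
        $copy = clone $this;
        $copy->items[] = $value;
        return $copy;
    }

    public function toArray(): array
    {
        return $this->items;
    }
}

$a = new TypedList('int');
$b = $a->withAppended(1)->withAppended(2);

var_dump($a->toArray()); // []
var_dump($b->toArray()); // [1, 2]
```

Copying on every write like this is obviously more expensive than what the
engine does for array, which only separates when the storage is actually
shared. That's the part that would need real engine support.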

-John