> On 1 Feb 2022, at 21:46, tyson andre <tysonandre...@hotmail.com> wrote:
> 
> Hi internals,
> 
>> I've created a new RFC https://wiki.php.net/rfc/deque to add a `final class 
>> Deque`
>> 
>> This is based on the `Teds\Deque` implementation I've worked on
>> for the https://github.com/TysonAndre/pecl-teds PECL.
>> 
>> While `SplDoublyLinkedList` and its subclass `SplQueue`/`SplStack` exist in 
>> the SPL, they have several drawbacks
>> that are addressed by this RFC to add a `Deque` class (to use instead of 
>> those):
>> 
>> 1. `SplDoublyLinkedList` is internally represented by a doubly linked list,
>>    making it use roughly twice as much memory as the proposed `Deque`
>> 2. `push`/`pop`/`unshift`/`shift` from `SplDoublyLinkedList` are slower due 
>> to
>>    needing to allocate or free the linked list nodes.
>> 3. Reading values in the middle of the `SplDoublyLinkedList` is proportional 
>> to the length of the list,
>>    due to needing to traverse the linked list nodes.
>> 4. `foreach` Iteration behavior cannot be understood without knowing what 
>> constructed the
>>    `SplDoublyLinkedList` instance or set the flags.
>> 
>> It would be useful to have an efficient `Deque` container in the standard 
>> library
>> to provide an alternative without those drawbacks,
>> as well as for the following reasons:
>> 
>> 1. To save memory in applications or libraries that may need to store many 
>> lists of values or run for long periods of time.
>>    Notably, PHP's `array` type will never release allocated capacity.
>>    See 
>> https://www.npopov.com/2014/12/22/PHPs-new-hashtable-implementation.html
>> 2. To provide a better alternative to `SplDoublyLinkedList`, `SplStack`, and 
>> `SplQueue`
>>    for use cases that require stacks or queues.
>> 3. As a more efficient option than `array` and `SplDoublyLinkedList`
>>    as a queue or `Deque`, especially for `unshift`.
>> 
>> A `Deque` is more efficient than an `array` when used as a queue, more 
>> readable, and easier to use correctly.
>> While it is possible to efficiently remove elements from the start of an 
>> `array` (in terms of insertion order) (though this makes 
>> reset()/array_key_first() inefficient),
>> it is very inefficient to prepend elements to the start of a large `array` 
>> due to needing to either copy the array
>> or move all elements in the internal array representation,
>> and an `array` would use much more memory than a `Deque` when used that way 
>> (and be slower).
>> 
>> There are also several pitfalls to using an array as a queue for larger 
>> queue sizes,
>> some of which are not obvious and discovered while writing the benchmarks.
>> (Having a better (double-ended) queue datastructure (`Deque`) than the 
>> `SplDoublyLinkedList`
>> would save users from needing to write code with these pitfalls):
>> 
>> 1. `array_key_first()` and reset()`takes time proportional to the number of 
>> elements `unset` from the start of an array,
>>    causing it to unexpectedly be extremely slow (quadratic time) after 
>> unsetting many elements at the start of the queue.
>>    (when the array infrequently runs out of capacity, buckets are moved to 
>> the front)
>> 2. `reset()` or `end()` will convert a variable to a reference,
>>    and php is less efficient at reading or writing to reference.
>>    Opcache is also less efficient at optimizing uses of variables using 
>> references.
>> 3. More obviously, `array_unshift` and `array_shift` will take time 
>> proportional to the number of elements in the array
>>    (to reindex and move existing/remaining elements).
> 
> I plan to start voting on https://wiki.php.net/rfc/deque on Friday, February 
> 4th.
> 
> Several changes have been made to https://wiki.php.net/rfc/deque#changelog
> after the feedback in https://externals.io/message/116100
> 
> - The class is now named `Collections\Deque`
> - The api documentation in https://wiki.php.net/rfc/deque#proposal was 
> expanded for methods.
> - Benchmarks were updated.
> - Like other standard datastructures, iteration over the deque is now over 
> the original object (instead of creating a copy), 
>  and mutating the deque will be reflected in `$iterator->current()` (and 
> moving the end with push()/pop() will affect where iteration ends).
> - Iteration will account for calls to shift/unshift moving the start of the 
> deque.
>  the offsets will be corrected and values won't be skipped or iterated over 
> multiple times.
>  (no matter how many iterators were created by `Deque->getIterator()`)
>  See https://wiki.php.net/rfc/deque#iteration_behavior
> - The get()/set() methods were removed, after feedback in 
> https://externals.io/message/116100#116214
> 
> A WebAssembly demo is available at 
> https://tysonandre.github.io/php-rfc-demo/deque/
> 
> Thanks,
> Tyson
> 
> --
> PHP Internals - PHP Runtime Development Mailing List
> To unsubscribe, visit: https://www.php.net/unsub.php
> 

Hi Tyson,

As a userland dev & library author it’s nice to see some progression on basic 
data structures, so thank you for your efforts on this!


Two little things in the RFC:

The proposed API switches between terms `front`, `back`, `start` and `end` in 
comments - is there meant to be a conceptual difference between front/start and 
end/back ?


In the "Why use this instead of array?” Section, the 3rd point seems cut off:
> Note that starting in php 8.2, array


Cheers

Stephen 

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php

Reply via email to