> On 1 Feb 2022, at 21:46, tyson andre <tysonandre...@hotmail.com> wrote: > > Hi internals, > >> I've created a new RFC https://wiki.php.net/rfc/deque to add a `final class >> Deque` >> >> This is based on the `Teds\Deque` implementation I've worked on >> for the https://github.com/TysonAndre/pecl-teds PECL. >> >> While `SplDoublyLinkedList` and its subclass `SplQueue`/`SplStack` exist in >> the SPL, they have several drawbacks >> that are addressed by this RFC to add a `Deque` class (to use instead of >> those): >> >> 1. `SplDoublyLinkedList` is internally represented by a doubly linked list, >> making it use roughly twice as much memory as the proposed `Deque` >> 2. `push`/`pop`/`unshift`/`shift` from `SplDoublyLinkedList` are slower due >> to >> needing to allocate or free the linked list nodes. >> 3. Reading values in the middle of the `SplDoublyLinkedList` is proportional >> to the length of the list, >> due to needing to traverse the linked list nodes. >> 4. `foreach` Iteration behavior cannot be understood without knowing what >> constructed the >> `SplDoublyLinkedList` instance or set the flags. >> >> It would be useful to have an efficient `Deque` container in the standard >> library >> to provide an alternative without those drawbacks, >> as well as for the following reasons: >> >> 1. To save memory in applications or libraries that may need to store many >> lists of values or run for long periods of time. >> Notably, PHP's `array` type will never release allocated capacity. >> See >> https://www.npopov.com/2014/12/22/PHPs-new-hashtable-implementation.html >> 2. To provide a better alternative to `SplDoublyLinkedList`, `SplStack`, and >> `SplQueue` >> for use cases that require stacks or queues. >> 3. As a more efficient option than `array` and `SplDoublyLinkedList` >> as a queue or `Deque`, especially for `unshift`. >> >> A `Deque` is more efficient than an `array` when used as a queue, more >> readable, and easier to use correctly. >> While it is possible to efficiently remove elements from the start of an >> `array` (in terms of insertion order) (though this makes >> reset()/array_key_first() inefficient), >> it is very inefficient to prepend elements to the start of a large `array` >> due to needing to either copy the array >> or move all elements in the internal array representation, >> and an `array` would use much more memory than a `Deque` when used that way >> (and be slower). >> >> There are also several pitfalls to using an array as a queue for larger >> queue sizes, >> some of which are not obvious and discovered while writing the benchmarks. >> (Having a better (double-ended) queue datastructure (`Deque`) than the >> `SplDoublyLinkedList` >> would save users from needing to write code with these pitfalls): >> >> 1. `array_key_first()` and reset()`takes time proportional to the number of >> elements `unset` from the start of an array, >> causing it to unexpectedly be extremely slow (quadratic time) after >> unsetting many elements at the start of the queue. >> (when the array infrequently runs out of capacity, buckets are moved to >> the front) >> 2. `reset()` or `end()` will convert a variable to a reference, >> and php is less efficient at reading or writing to reference. >> Opcache is also less efficient at optimizing uses of variables using >> references. >> 3. More obviously, `array_unshift` and `array_shift` will take time >> proportional to the number of elements in the array >> (to reindex and move existing/remaining elements). > > I plan to start voting on https://wiki.php.net/rfc/deque on Friday, February > 4th. > > Several changes have been made to https://wiki.php.net/rfc/deque#changelog > after the feedback in https://externals.io/message/116100 > > - The class is now named `Collections\Deque` > - The api documentation in https://wiki.php.net/rfc/deque#proposal was > expanded for methods. > - Benchmarks were updated. > - Like other standard datastructures, iteration over the deque is now over > the original object (instead of creating a copy), > and mutating the deque will be reflected in `$iterator->current()` (and > moving the end with push()/pop() will affect where iteration ends). > - Iteration will account for calls to shift/unshift moving the start of the > deque. > the offsets will be corrected and values won't be skipped or iterated over > multiple times. > (no matter how many iterators were created by `Deque->getIterator()`) > See https://wiki.php.net/rfc/deque#iteration_behavior > - The get()/set() methods were removed, after feedback in > https://externals.io/message/116100#116214 > > A WebAssembly demo is available at > https://tysonandre.github.io/php-rfc-demo/deque/ > > Thanks, > Tyson > > -- > PHP Internals - PHP Runtime Development Mailing List > To unsubscribe, visit: https://www.php.net/unsub.php >
Hi Tyson, As a userland dev & library author it’s nice to see some progression on basic data structures, so thank you for your efforts on this! Two little things in the RFC: The proposed API switches between terms `front`, `back`, `start` and `end` in comments - is there meant to be a conceptual difference between front/start and end/back ? In the "Why use this instead of array?” Section, the 3rd point seems cut off: > Note that starting in php 8.2, array Cheers Stephen -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php