Hi internals, > > > I've created a new RFC https://wiki.php.net/rfc/deque to add a `final > > > class Deque` > > > > > > This is based on the `Teds\Deque` implementation I've worked on > > > for the https://github.com/TysonAndre/pecl-teds PECL. > > > > > > While `SplDoublyLinkedList` and its subclass `SplQueue`/`SplStack` exist > > > in the SPL, they have several drawbacks > > > that are addressed by this RFC to add a `Deque` class (to use instead of > > > those): > > > > > > 1. `SplDoublyLinkedList` is internally represented by a doubly linked > > > list, > > > making it use roughly twice as much memory as the proposed `Deque` > > > 2. `push`/`pop`/`unshift`/`shift` from `SplDoublyLinkedList` are slower > > > due to > > > needing to allocate or free the linked list nodes. > > > 3. Reading values in the middle of the `SplDoublyLinkedList` is > > > proportional to the length of the list, > > > due to needing to traverse the linked list nodes. > > > 4. `foreach` Iteration behavior cannot be understood without knowing what > > > constructed the > > > `SplDoublyLinkedList` instance or set the flags. > > > > > > It would be useful to have an efficient `Deque` container in the standard > > > library > > > to provide an alternative without those drawbacks, > > > as well as for the following reasons: > > > > > > 1. To save memory in applications or libraries that may need to store > > > many lists of values or run for long periods of time. > > > Notably, PHP's `array` type will never release allocated capacity. > > > See > > > https://www.npopov.com/2014/12/22/PHPs-new-hashtable-implementation.html > > > 2. To provide a better alternative to `SplDoublyLinkedList`, `SplStack`, > > > and `SplQueue` > > > for use cases that require stacks or queues. > > > 3. As a more efficient option than `array` and `SplDoublyLinkedList` > > > as a queue or `Deque`, especially for `unshift`. > > > > > > A `Deque` is more efficient than an `array` when used as a queue, more > > > readable, and easier to use correctly. > > > While it is possible to efficiently remove elements from the start of an > > > `array` (in terms of insertion order) (though this makes > > > reset()/array_key_first() inefficient), > > > it is very inefficient to prepend elements to the start of a large > > > `array` due to needing to either copy the array > > > or move all elements in the internal array representation, > > > and an `array` would use much more memory than a `Deque` when used that > > > way (and be slower). > > > > > > There are also several pitfalls to using an array as a queue for larger > > > queue sizes, > > > some of which are not obvious and discovered while writing the benchmarks. > > > (Having a better (double-ended) queue datastructure (`Deque`) than the > > > `SplDoublyLinkedList` > > > would save users from needing to write code with these pitfalls): > > > > > > 1. `array_key_first()` and reset()`takes time proportional to the number > > > of elements `unset` from the start of an array, > > > causing it to unexpectedly be extremely slow (quadratic time) after > > > unsetting many elements at the start of the queue. > > > (when the array infrequently runs out of capacity, buckets are moved > > > to the front) > > > 2. `reset()` or `end()` will convert a variable to a reference, > > > and php is less efficient at reading or writing to reference. > > > Opcache is also less efficient at optimizing uses of variables using > > > references. > > > 3. More obviously, `array_unshift` and `array_shift` will take time > > > proportional to the number of elements in the array > > > (to reindex and move existing/remaining elements). > > > > I plan to start voting on https://wiki.php.net/rfc/deque on Friday, > > February 4th. > > > > Several changes have been made to https://wiki.php.net/rfc/deque#changelog > > after the feedback in https://externals.io/message/116100 > > > > - The class is now named `Collections\Deque` > > - The api documentation in https://wiki.php.net/rfc/deque#proposal was > > expanded for methods. > > - Benchmarks were updated. > > - Like other standard datastructures, iteration over the deque is now over > > the original object (instead of creating a copy), > > and mutating the deque will be reflected in `$iterator->current()` (and > > moving the end with push()/pop() will affect where iteration ends). > > - Iteration will account for calls to shift/unshift moving the start of the > > deque. > > the offsets will be corrected and values won't be skipped or iterated > > over multiple times. > > (no matter how many iterators were created by `Deque->getIterator()`) > > See https://wiki.php.net/rfc/deque#iteration_behavior > > - The get()/set() methods were removed, after feedback in > > https://externals.io/message/116100#116214 > > > > A WebAssembly demo is available at > > https://tysonandre.github.io/php-rfc-demo/deque/ > > I've updated the RFC https://wiki.php.net/rfc/deque yet again (no > implementation changes). I now plan to start voting on Saturday, February 5th. > > I've also updated the WebAssembly demo at > https://tysonandre.github.io/php-rfc-demo/deque/ to include a benchmark to > illustrate the differences in performance of shift/unshift at various > deque/array sizes. > A more detailed section on the performance of unshift/shift compared to array > and existing data structures was added to the RFC. > > I've also added a discussion section for the request to `return $this` > (https://externals.io/message/116100#116967), removed duplicated quotes, and > clarified some parts of the RFC.
There's more changes I found this morning while working on test cases for unrelated datastructures, so I'll be postponing the vote. 1. The maximum `Collections\Deque` size is probably going to be reduced to be the same as the maximum array size. For 64-bit php builds, that's roughly 2 billion (2147483648 elements, using at least 32 gigabytes of memory for array or Deque), which was more memory than I'd assumed (16 bytes per `zval`), and needing that much is expected to be a rare use case that can be revisited, e.g. if array size limits are raised. Collections would need to be converted to/from arrays for serialization, unserialization, debug output (var_export, json_encode, etc.), etc., so having a much larger maximum size didn't seem to have a compelling use case. 2. I found and proposed potential improvements for detection of infinite recursion in var_export/debug_zval_dump/json_encode affecting userland classes, existing spl classes, etc., which seemed worth finishing discussing before starting the vote. 3. Miscellaneous refactorings Thanks, Tyson -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php