Hi Levi Morrison,

> The name "deque" is used in the standard library of these languages:
> 
>  - C++: std::deque
>  - Java: java.util.Deque (interface)
>  - Python: collections.deque
>  - Swift: Collections.Deque (not standard lib, apparently, but Apple
> official? Don't know Swift)
>  - Rust: std::collections::VecDeque
> 
> And these don't have it in the standard library:
>  - Go
>  - C#
>  - C
>  - JavaScript
>
> Anyway, it's pretty decent evidence that:
>  1. This functionality is pretty widely used across languages.
>  2. This functionality should have "deque" be in the name, or be the
> complete name.

Thanks for putting that together.

For anyone wondering about the languages that don't have it:
The first 3 are compiled languages so there's the same performance for a 
standard library 
and third-party library code written in those libraries. 
For C and Go, the standard libraries of C and Go are written in C and Go.
C and Go also have a very minimal standard library as a design goal.
Everything in C and Go is a "native library", so a third-party library and a 
standard library would have the same performance.
(The standard library of C and Go use or allow embedding assembly for 
platform-dependent functionality or optimizations. 
Third party libraries in C or Go can also use inline assembly 
https://stackoverflow.com/a/23796420)
(Not familiar with C#'s standard library, I assume it's similar?)

And browser vendors have put immense amounts of effort on optimizing JavaScript
and the JIT compilers for JavaScript for low memory usage,
supporting inlining, working efficiently on various platforms, etc.

> Discussion naming for "vector" I can understand, as it's less widely
> used or sometimes means something specific to numbers, but there's
> little point in discussing the name "deque." It's well established in
> programming, whether PHP programmers are aware of it or not.

Yes, I'd agree.

It's a well established term in programming, and using a non-standard or more 
verbose term
would make it harder to find/remember this functionality for users moving to 
php from 
other programming languages, 
or for users moving from php to other programming languages.

Learning programming inevitably has unfamiliar terms such as `array`, `printf`, 
etc.
that are not commonly used in mathematics or daily life.

> As I see it, the discussion should be centered around:
>  1. The API Deque provides.
>  2. The package that provides it.
>  3. Whether Deque's API is consistent with other structures in the same 
> package.
>  4. Whether this should be included in php-src or left to extensions.
> 
> I suggest that we try to make PHP as homogenous in each bullet as we
> can with other languages:
>  1. Name it "Deque."
>  2. Put it in the namespace "Collections" (and if included in core,
> then "ext/collections").
>  3. Support common operations on Deque (pushing and popping items to
> both front and back, subscript operator works, iteration, size, etc)
> and add little else (don't be novel here). To me, the biggest thing
> left to discuss is a notion of a maximum size, which in my own
> experience is common for circular buffers (the implementation chosen
> for Deque) but not all languages have this.
>  4. This is less clear, but I'm in favor as long as we can provide a
few other data structures at the same time. Obviously, this a voter
judgement call on the yes/no.

I had planned on a minimal API for Deque.

"Maximum size" shouldn't be an issue for `Deque` specifically, it's a `size_t`. 
It isn't an issue for SplFixedArray and ArrayObject.
(or for 
PHP would just encounter a fatal error due to either

1. Hitting the `memory_limit` or available memory limit
while constructing or appending to the `Deque`
2. Seeing a fatal error due to integer overflow, which would only matter on 
32-bit php.
   (The `Deque` API doesn't have a setSize)
   The `safe_erealloc` macros allow you to do that.

For additions such as `Vector` and `Deque`, because we **can** choose a 
`size_t` (type large enough to store any size of memory) (and because 32-bit 
systems are increasingly rare),
I currently think always emitting an uncatchable fatal error (and exiting 
immediately) for impossibly large values
would be the most consistent (e.g. applications wouldn't reach `catch 
(Throwable $t)` 
blocks meant for a different purpose in an unexpected way, if they failed to 
validate inputs).
This is already done in array and SPL types.

```
php > var_dump(new SplFixedArray(PHP_INT_MAX));
Fatal error: Possible integer overflow in memory allocation 
(9223372036854775807 * 16 + 0) in php shell code on line 1

php > var_dump(array_fill(0, 2**31 - 2, null));

Fatal error: Allowed memory size of 134217728 bytes exhausted (tried to 
allocate 68719476744 bytes) in php shell code on line 1
php > var_dump(array_fill(0, 2**31, null));

Warning: Uncaught ValueError: array_fill(): Argument #2 ($count) is too large 
in php shell code:1
...

// Without memory_limit
php > ini_set('memory_limit', -1);
php > $x = array_fill(0, 2**31 - 1, null);

mmap() failed: [12] Cannot allocate memory

mmap() failed: [12] Cannot allocate memory

Fatal error: Out of memory (allocated 2097152) (tried to allocate 68719476744 
bytes) in php shell code on line 1
```

(If anyone's wondering about the error handling, php arrays have a size of 
signed `int32_t` (i.e. at most ~2 billion elements)
to support common use cases and many array instances, in a memory efficient 
way, while still being efficient in the engine's internals)

- If someone's application or library unexpectedly has users that end up 
needing to process more than 2 billion values
  (64GB of RAM in php 7.0-8.1), e.g. in reading/computing statistics from large 
binary files,
  the addition of `Deque` and `Vector` which don't have arbitrary size limits 
would be useful to them,
  and would reduce complaints or feature requests related to the ValueError.

> Given that I've suggested the most common options for these across
> many languages, it shouldn't be very controversial. The worst bit
> seems like picking the namespace "Collections" as this will break at
> least one package: https://github.com/danielgsims/php-collections. We
> should suggest that they vendor it anyway, as "collections" is common
> e.g. "Ramsey\Collections", "Doctrine\Common\Collections". I don't see
> a good alternative here to "Collections", as we haven't agreed on very
> much on past namespace proposals.

I don't plan to use any of the class names currently found in 
https://github.com/danielgsims/php-collections/tree/2.X/src
for future proposals, so that's less of an issue.

I also don't see good alternatives to "Collections". On a side note, other 
programming languages linked keep referring to collections packages as the 
plural `Collections\`,
and I also keep using the name `Collections` when thinking about it and find 
`use Collection\Deque;` unnatural for a reason I'm not sure about,
so the plural `Collections\` makes more sense.

I'd announced https://externals.io/message/116112 in case anyone had any ideas

- `DataStructures` was mentioned but I consider it too broad.
  https://en.m.wikipedia.org/wiki/List_of_data_structures is a massive list of 
which collections of objects is just a small part.
  It'd make more sense if everything was distributed as a single package such 
as a PECL, though

I also misread 
https://wiki.php.net/rfc/namespaces_in_bundled_extensions#proposal .
Sub-namespaces are allowed and could be used for functions/classlikes that were 
not
collections or ancestor types of collections in the future, if needed 
(`Collections\Xyz\function_dealing_with_collections()`).

Thanks,
Tyson
--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php

Reply via email to