[PHP-DEV] Re: [RFC] [Discussion] Add new function `array_group`

Boro Sitnikovski Wed, 31 May 2023 15:48:01 -0700

On 30.5.2023, at 15:13, Boro Sitnikovski <buritom...@gmail.com> wrote:

Updated the patch: added a test for the increasing-subsequences example and a minor bugfix.

<array_group.patch>

On 30.5.2023, at 13:34, Boro Sitnikovski <buritom...@gmail.com> wrote:

Hello all,

As per the How To Create an RFC instructions, I am sending this e-mail in order to get your feedback on my proposal.

I propose introducing a function to PHP core named `array_group`. This function takes an array and a callback and returns an array of arrays, where each inner array is a group of consecutive elements. This is very similar to Haskell's `groupBy` function.

For some background as to why - usually, when people want to do grouping in PHP, they use hash maps, so something like:

```
<?php
$array = [
    [ 'id' => 1, 'value' => 'foo' ],
    [ 'id' => 1, 'value' => 'bar' ],
    [ 'id' => 2, 'value' => 'baz' ],
];

$groups = [];
foreach ( $array as $element ) {
    $groups[ $element['id'] ][] = $element;
}

var_dump( $groups );
```

This can now be achieved as follows (not preserving keys):

```
<?php
$array = [
    [ 'id' => 1, 'value' => 'foo' ],
    [ 'id' => 1, 'value' => 'bar' ],
    [ 'id' => 2, 'value' => 'baz' ],
];

$groups = array_group( $array, function( $a, $b ) {
    return $a['id'] == $b['id'];
} );
```

The disadvantage of the first approach is that we are limited to equality checks, and we cannot group by, say, `<` or other functions.
Conversely, the advantage of the first approach is that the keys are preserved, and elements needn't be consecutive.
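
To make the `<` case concrete, here is a rough userland sketch of grouping consecutive elements. The name `array_group_sketch` is hypothetical, and comparing each element with the one immediately before it is an assumption about the intended semantics, not a copy of the attached patch.

```php
<?php
// Hypothetical stand-in for the proposed array_group(); the name and the
// adjacent-pair comparison are assumptions, not the patch itself.
function array_group_sketch( array $array, callable $callback ): array {
    $groups      = [];
    $hasPrevious = false;
    $previous    = null;

    foreach ( $array as $element ) {
        // Open a new group for the first element, or when the callback says
        // the previous and current elements do not belong together.
        if ( ! $hasPrevious || ! $callback( $previous, $element ) ) {
            $groups[] = [];
        }
        $groups[ count( $groups ) - 1 ][] = $element;
        $previous    = $element;
        $hasPrevious = true;
    }

    return $groups;
}

// Grouping by `<` splits the input into strictly increasing runs.
$runs = array_group_sketch( [ 1, 2, 3, 2, 5, 4 ], function( $a, $b ) {
    return $a < $b;
} );
// $runs is [[1, 2, 3], [2, 5], [4]]
```

Keys are not preserved in this sketch, which matches the trade-off described above.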

In any case, I think a utility function such as `array_group` will be widely useful.

Please find attached a patch with a proposed implementation. Curious about your feedback.

Best,

Boro Sitnikovski

<array_group.patch>

Thank you all for the comments. I agree that there are many ways to do grouping, but based on the discussion here, I think we discussed two main grouping cases:

1. The grouping that JavaScript/.NET/Lodash/Scala/etc. do (this should be the default of `array_group`)

2. The grouping that Haskell does, the one I proposed earlier (this can be altered in a flag within `array_group`)

Based on this, I'd like to adjust my initial proposal, where we would have the following function: `function array_group(array $array, callable $callback, bool $consecutive_pairs = false): array {}`

If the argument `consecutive_pairs` is false, it will use the callback's return value as the group key ($callback accepts a single element in this case).

Otherwise, it will use the callback's boolean return value to decide whether two consecutive elements belong to the same group ($callback accepts two elements in this case).

(This approach seems to be consistent with `array_filter`, in the sense that the callback accepts one or two arguments.)

With a few example usages:

```
// Input inferred from the result shown below
$arr1 = ['one', 'two', 'three'];

var_dump( array_group($arr1, function( $x ) {
    return (string) strlen( $x );
} ) );

// Producing ['3' => ['one', 'two'], '5' => ['three']]
```

Another one:

```
$arr = [-1,2,-3,-4,2,1,2,-3,1,1,2];

// Third argument enables consecutive_pairs mode (two-argument callback)
$groups = array_group( $arr, function( $p1, $p2 ) {
    return ($p1 > 0) == ($p2 > 0);
}, true );

// Producing [[-1],[2],[-3,-4],[2,1,2],[-3],[1,1,2]]
```
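
For anyone who wants to experiment without applying the patch, the two modes can be emulated in userland roughly as follows. This is only a sketch: the name `array_group_emulated` is hypothetical, and details such as key handling are assumptions that may differ from the PoC.

```php
<?php
// Hypothetical userland emulation of the adjusted proposal; the function
// name and the key handling are assumptions, not the attached patch.
function array_group_emulated( array $array, callable $callback, bool $consecutive_pairs = false ): array {
    $groups = [];

    if ( ! $consecutive_pairs ) {
        // Mode 1: the callback maps each element to a group key.
        foreach ( $array as $element ) {
            $groups[ $callback( $element ) ][] = $element;
        }
        return $groups;
    }

    // Mode 2: the callback decides whether two adjacent elements share a group.
    $first    = true;
    $previous = null;
    foreach ( $array as $element ) {
        if ( $first || ! $callback( $previous, $element ) ) {
            $groups[] = [];
        }
        $groups[ count( $groups ) - 1 ][] = $element;
        $previous = $element;
        $first    = false;
    }
    return $groups;
}

$byLength = array_group_emulated( [ 'one', 'two', 'three' ], function( $x ) {
    return strlen( $x );
} );
// $byLength is [3 => ['one', 'two'], 5 => ['three']]

$bySign = array_group_emulated( [ -1, 2, -3, -4, 2, 1, 2, -3, 1, 1, 2 ], function( $p1, $p2 ) {
    return ( $p1 > 0 ) == ( $p2 > 0 );
}, true );
// $bySign is [[-1], [2], [-3, -4], [2, 1, 2], [-3], [1, 1, 2]]
```

In the first mode the callback's return value becomes an array key, so in this sketch it has to be an int or a string; PHP converts numeric strings such as '3' to integer keys.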

I believe this proposal captures many use cases, beyond the examples we discussed. Curious about any other thoughts.

I'm also attaching a PoC patch that implements this.

array_group.patch
Description: Binary data
