Hi internals,

Sorting function in PHP currently do not guarantee stability, which means
that the result order of elements that compare equal is not defined.

To achieve a stable sort, you need to do something like this (untested):

$arrayAndPos = [];
$pos = 0;
foreach ($array as $value) {
    $arrayAndPos[] = [$value, $pos++];
}
usort($arrayAndPos, function($a, $b) use($compare) {
    return $compare($a[0], $b[0]) ?: $a[1] <=> $b[1];
});
$array = [];
foreach ($arrayAndPos as $elem) {
    $array[] = $elem[0];
}

This is both cumbersome and very inefficient. Those temporary arrays
significantly increase memory usage, and hurt performance of the sort.

I believe that it is important for us to provide a stable sorting option in
the standard library. We could introduce a separate stable function(s) for
this purpose, but I believe we can just as well make all our existing sorts
stable.

This does not require actually switching sorting algorithms to something
like Timsort. We can essentially do the same as the above PHP code, just
much more efficiently. Due to certain implementation details, we can do
this without memory usage increase, and with minimal performance impact.
I've put https://github.com/php/php-src/pull/5236 up as a prototype.

The only issue I ran into is that this change has a negative impact on code
using illegal comparison callbacks like this:

usort($array, function($a, $b) {
    return $a > $b;
});

The return value of the sorting callback should be $a <=> $b, but using $a
> $b also works right now, due to implementation details of the sorting
implementation (only the outcome of $compare($a, $b) > 0 ends up being used
by the sort).

This kind of incorrect code will break under the proposed implementation,
because we will now compare by original position if the comparison function
reports equality. Because the comparator reports equality inconsistently
(it says that $a == $b, but $b != $a), the sort results are also
inconsistent.

What do people think about this? Is there interest in making sorting
stable? Is it okay to break code using illegal comparison callbacks?

Regards,
Nikita

Reply via email to