On Sun, Aug 3, 2025 at 11:14 PM Luc Grosheintz <[email protected]>
wrote:
> An interesting case to consider is:
>
> bool same11(const std::extents<int, dyn, 2, 3>& e1,
> const std::extents<int, dyn, dyn, 3>& e2)
> { return e1 == e2; }
>
> Which has the following properties:
>
> - There's no mismatching static extents, preventing any
> short-circuiting.
>
> - There's a comparison between dynamic and static extents.
>
> - There's one trivial comparison: ... && 3 == 3.
>
> Let E[i] denote the array of static extents, D[k] denote the array of
> dynamic extents and k[i] be the index of the i-th extent in D.
> (Naturally, k[i] is only meaningful if i is a dynamic extent).
>
> The previous implementation results in assembly that's more or less a
> literal translation of:
>
> for (i = 0; i < 3; ++i)
> e1 = E1[i] == -1 ? D1[k1[i]] : E1[i];
> e2 = E2[i] == -1 ? D2[k2[i]] : E2[i];
> if e1 != e2:
> return false
> return true;
>
> While the proposed method results in assembly for
>
> if(D1[0] == D2[0]) return false;
> return 2 == D2[1];
>
> i.e.
>
> 110: 8b 17 mov edx,DWORD PTR [rdi]
> 112: 31 c0 xor eax,eax
> 114: 39 16 cmp DWORD PTR [rsi],edx
> 116: 74 08 je 120 <same11+0x10>
> 118: c3 ret
> 119: 0f 1f 80 00 00 00 00 nop DWORD PTR [rax+0x0]
> 120: 83 7e 04 02 cmp DWORD PTR [rsi+0x4],0x2
> 124: 0f 94 c0 sete al
> 127: c3 ret
>
> It has the following nice properties:
>
> - It eliminated the indirection D[k[i]], because k[i] is known at
> compile time. Saving us a comparison E[i] == -1 and conditionally
> loading k[i].
>
> - It eliminated the trivial condition 3 == 3.
>
> The result is code that only loads the required values and performs
> exactly the number of comparisons needed by the algorithm. It also
> results in smaller object files. Therefore, this seems like a sensible
> change. We've check several other examples, including fully statically
> determined cases and high-rank examples. The example given above
> illustrates the other cases well.
>
> The constexpr condition:
>
> if constexpr (!_S_is_compatible_extents<...>)
> return false;
>
> is no longer needed, because the optimizer correctly handles this case.
> However, it's retained for clarity/certainty.
>
> libstdc++-v3/ChangeLog:
>
> * include/std/mdspan (extents::operator==): Replace loop with
> pack expansion.
>
> Signed-off-by: Luc Grosheintz <[email protected]>
> ---
>
LGTM.
> libstdc++-v3/include/std/mdspan | 9 +++++----
> 1 file changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/libstdc++-v3/include/std/mdspan
> b/libstdc++-v3/include/std/mdspan
> index ee0c2b65c7f..b6bed0b2845 100644
> --- a/libstdc++-v3/include/std/mdspan
> +++ b/libstdc++-v3/include/std/mdspan
> @@ -409,10 +409,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> return false;
> else
> {
> - for (size_t __i = 0; __i < __self.rank(); ++__i)
> - if (!cmp_equal(__self.extent(__i), __other.extent(__i)))
> - return false;
> - return true;
> + auto __impl = [&__self, &__other]<size_t... _Counts>(
> + index_sequence<_Counts...>)
> + { return (cmp_equal(__self.extent(_Counts),
> + __other.extent(_Counts)) && ...); };
> + return __impl(make_index_sequence<__self.rank()>());
> }
> }
>
> --
> 2.50.0
>
>