An interesting case to consider is:

  bool same11(const std::extents<int, dyn,   2, 3>& e1,
              const std::extents<int, dyn, dyn, 3>& e2)
  { return e1 == e2; }

Which has the following properties:

  - There's no mismatching static extents, preventing any
    short-circuiting.

  - There's a comparison between dynamic and static extents.

  - There's one trivial comparison: ... && 3 == 3.

Let E[i] denote the array of static extents, D[k] denote the array of
dynamic extents and k[i] be the index of the i-th extent in D.
(Naturally, k[i] is only meaningful if i is a dynamic extent).

The previous implementation results in assembly that's more or less a
literal translation of:

  for (i = 0; i < 3; ++i)
    e1 = E1[i] == -1 ? D1[k1[i]] : E1[i];
    e2 = E2[i] == -1 ? D2[k2[i]] : E2[i];
    if e1 != e2:
      return false
  return true;

While the proposed method results in assembly for

  if(D1[0] == D2[0]) return false;
  return 2 == D2[1];

i.e.

  110:  8b 17                  mov    edx,DWORD PTR [rdi]
  112:  31 c0                  xor    eax,eax
  114:  39 16                  cmp    DWORD PTR [rsi],edx
  116:  74 08                  je     120 <same11+0x10>
  118:  c3                     ret
  119:  0f 1f 80 00 00 00 00   nop    DWORD PTR [rax+0x0]
  120:  83 7e 04 02            cmp    DWORD PTR [rsi+0x4],0x2
  124:  0f 94 c0               sete   al
  127:  c3                     ret

It has the following nice properties:

  - It eliminated the indirection D[k[i]], because k[i] is known at
    compile time. Saving us a comparison E[i] == -1 and conditionally
    loading k[i].

  - It eliminated the trivial condition 3 == 3.

The result is code that only loads the required values and performs
exactly the number of comparisons needed by the algorithm. It also
results in smaller object files. Therefore, this seems like a sensible
change. We've check several other examples, including fully statically
determined cases and high-rank examples. The example given above
illustrates the other cases well.

The constexpr condition:

  if constexpr (!_S_is_compatible_extents<...>)
    return false;

is no longer needed, because the optimizer correctly handles this case.
However, it's retained for clarity/certainty.

libstdc++-v3/ChangeLog:

        * include/std/mdspan (extents::operator==): Replace loop with
        pack expansion.

Signed-off-by: Luc Grosheintz <luc.groshei...@gmail.com>
---
 libstdc++-v3/include/std/mdspan | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/libstdc++-v3/include/std/mdspan b/libstdc++-v3/include/std/mdspan
index f288af96cdb..5b47c95d2c6 100644
--- a/libstdc++-v3/include/std/mdspan
+++ b/libstdc++-v3/include/std/mdspan
@@ -349,10 +349,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
            return false;
          else
            {
-             for (size_t __i = 0; __i < __self.rank(); ++__i)
-               if (!cmp_equal(__self.extent(__i), __other.extent(__i)))
-                 return false;
-             return true;
+             auto __impl = [&__self, &__other]<size_t... _Counts>(
+                 index_sequence<_Counts...>)
+               { return (cmp_equal(__self.extent(_Counts),
+                                   __other.extent(_Counts)) && ...); };
+             return __impl(make_index_sequence<__self.rank()>());
            }
        }
 
-- 
2.50.0

Reply via email to