https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111970

--- Comment #14 from JuzheZhong <juzhe.zhong at rivai dot ai> ---
I just confirmed on aarch64 QEMU that ARM SVE seems to have the same issue as
RVV.

This is the test:

#include <stdint-gcc.h>

#define TEST_LOOP(DATA_TYPE, INDEX_TYPE)                                       \
  void __attribute__ ((noinline, noclone))                                     \
  f_##DATA_TYPE##_##INDEX_TYPE (DATA_TYPE *restrict y, DATA_TYPE *restrict x,  \
                                INDEX_TYPE *restrict index)                    \
  {                                                                            \
    for (int i = 0; i < 100; ++i)                                              \
      {                                                                        \
        y[i * 2] = x[index[i * 2]] + 1;                                        \
        y[i * 2 + 1] = x[index[i * 2 + 1]] + 2;                                \
      }                                                                        \
  }

TEST_LOOP (int16_t, int8_t)
#include <assert.h>

int
main (void)
{
#define RUN_LOOP(DATA_TYPE, INDEX_TYPE)                                        \
  DATA_TYPE dest_##DATA_TYPE##_##INDEX_TYPE[202] = {0};                        \
  DATA_TYPE src_##DATA_TYPE##_##INDEX_TYPE[202] = {0};                         \
  INDEX_TYPE index_##DATA_TYPE##_##INDEX_TYPE[202] = {0};                      \
  for (int i = 0; i < 202; i++)                                                \
    {                                                                          \
      src_##DATA_TYPE##_##INDEX_TYPE[i]                                        \
        = (DATA_TYPE) ((i * 19 + 735) & (sizeof (DATA_TYPE) * 7 - 1));         \
      index_##DATA_TYPE##_##INDEX_TYPE[i] = (i * 7) % (55);                    \
    }                                                                          \
  f_##DATA_TYPE##_##INDEX_TYPE (dest_##DATA_TYPE##_##INDEX_TYPE,               \
                                src_##DATA_TYPE##_##INDEX_TYPE,                \
                                index_##DATA_TYPE##_##INDEX_TYPE);             \
  for (int i = 0; i < 100; i++)                                                \
    {                                                                          \
      assert (dest_##DATA_TYPE##_##INDEX_TYPE[i * 2]                           \
              == (src_##DATA_TYPE##_##INDEX_TYPE                               \
                    [index_##DATA_TYPE##_##INDEX_TYPE[i * 2]]                  \
                  + 1));                                                       \
      assert (dest_##DATA_TYPE##_##INDEX_TYPE[i * 2 + 1]                       \
              == (src_##DATA_TYPE##_##INDEX_TYPE                               \
                    [index_##DATA_TYPE##_##INDEX_TYPE[i * 2 + 1]]              \
                  + 2));                                                       \
    }

  RUN_LOOP (int16_t, int8_t)

  return 0;
}


compile: -march=armv8-a+sve -O3 -msve-vector-bits=256 -specs=rdimon.specs
QEMU:sve-default-vector-length=256 

The configuration above passed.

However, I tried -march=armv8-a+sve -O3 -msve-vector-bits=512
-fno-vect-cost-model -specs=rdimon.specs
QEMU:sve-default-vector-length=512

This configuration failed like RVV:
assertion "dest_int16_t_int8_t[i * 2] == (src_int16_t_int8_t
[index_int16_t_int8_t[i * 2]] + 1)" failed: file "tmp.c", line 52, function:
main

I experimented on ARM SVE with a 512-bit vector length because I checked the
dumped IR on ARM SVE, and it is similar to the RVV one:
https://godbolt.org/z/x74z7obYT

Hi, @Tamar. Could you double-check whether my analysis (that this bug happens
not only on RVV but also on ARM SVE) is correct?
