On 6/14/21 8:09 AM, Peter Maydell wrote:
     - pass only ESIZE, not H, to macros in mve_helper.c

I've been thinking about the H* macros again while reading this.

While H##ESIZE is an improvement over passing in HN, I think we can do better and force the adjustment to match the type -- completely avoiding bugs you've caught at least twice during SVE review.

The form I'm playing with today is

#ifdef HOST_WORDS_BIGENDIAN
#define HTADJ(p) (7 >> __builtin_ctz(sizeof(*(p))))
#define HBADJ(p) (7 & (7 << __builtin_ctz(sizeof(*(p)))))
#else
#define HTADJ(p) 0
#define HBADJ(p) 0
#endif

/**
 * HT: adjust Host addressing by Type
 * @p: data pointer
 * @i: array index
 *
 * Adjust p[i] for host-endian addressing of sub-quadword values.
 */
#define HT(p, i)  ((p)[(i) ^ HADJ(p)])

/**
 * HB: adjust Host addressing by Bype
 * @p: data pointer
 * @i: byte offset
 *
 * Adjust p[i / sizeof(*p)] for host-endian addressing
 * of sub-quadword values.  Unlike HT, @i is not an array
 * index but a byte offset.
 */
#define HB(p, i) (*(__typeof(p))((uintptr_t)(p) + ((i) ^ H1ADJ(p))))

void bt(unsigned char  *x, long i) { H(x, i) = 0; }
void ht(unsigned short *x, long i) { H(x, i) = 0; }
void wt(unsigned int   *x, long i) { H(x, i) = 0; }
void qt(unsigned long  *x, long i) { H(x, i) = 0; }

void bb(unsigned char  *x, long i) { H1(x, i) = 0; }
void hb(unsigned short *x, long i) { H1(x, i) = 0; }
void wb(unsigned int   *x, long i) { H1(x, i) = 0; }
void qb(unsigned long  *x, long i) { H1(x, i) = 0; }

This gives us

#define DO_1OP(OP, TYPE, FN)                                            \
    void HELPER(mve_##OP)(CPUARMState *env, void *vd, void *vm)         \
    {                                                                   \
        TYPE *d = vd, *m = vm;                                          \
        uint16_t mask = mve_element_mask(env);                          \
        unsigned e;                                                     \
        unsigned const esize = sizeof(TYPE);                            \
        for (e = 0; e < 16 / esize; e++, mask >>= esize) {              \
            mergemask(&HT(d, e), FN(HT(m, e)]), mask);                  \
        }                                                               \
        mve_advance_vpt(env);                                           \
    }

Thoughts?  Especially on the naming of the macros?
If the idea appeals, I'll do a pass over the existing code.


r~

Reply via email to