More bugs: addl/v should sign-extend the result, as addl does. As it is, we have

    uint64_t helper_addlv(CPUAlphaState *env, uint64_t op1, uint64_t op2)
    {
        uint64_t tmp = op1;
        op1 = (uint32_t)(op1 + op2);
        if (unlikely((tmp ^ op2 ^ (-1UL)) & (tmp ^ op1) & (1UL << 31))) {
            arith_excp(env, GETPC(), EXC_M_IOV, 0);
        }
        return op1;
    }
IOW,

    #include <stdio.h>
    long r;
    void __attribute__((noinline)) f(void)
    {
        asm __volatile(
            "subl $31, 1, $0\n\t"
            "addl $0, $0, $1\n\t"
            "addl/v $0, $0, $0\n\t"
            "subq $0, $1, $0\n\t"
            "stq $0, %0\n\t"
            : "=m"(r): :"$0", "$1");
    }
    int main(void)
    {
        f();
        printf("%ld\n", r);
        return 0;
    }

ends up printing 0 on actual hardware (all variants) and 4294967296 on qemu.

Similar problem with subl/v:

    #include <stdio.h>
    long r;
    void __attribute__((noinline)) f(void)
    {
        asm __volatile(
            "subl $31, 1, $0\n\t"
            "subl/v $31, 1, $1\n\t"
            "subq $0, $1, $0\n\t"
            "stq $0, %0\n\t"
            : "=m"(r): :"$0", "$1");
    }
    int main(void)
    {
        f();
        printf("%ld\n", r);
        return 0;
    }

prints 0 on actual hw and -4294967296 on qemu.

What constraints do we have on the qemu host, anyway? Two's-complement, (int32_t)(uint32_t)x == x for any int32_t x? helper_mullv() seems to assume that...

Oh, crap - our mull/v is sensitive to the upper 32 bits of the multiplicands. If you put 1UL<<32 into one register, 1 into another and say mull/v, the result should be 0 with no overflow. qemu does

    int64_t res = (int64_t)op1 * (int64_t)op2;
    if (unlikely((int32_t)res != res)) {
        arith_excp(env, GETPC(), EXC_M_IOV, 0);
    }
    return (int64_t)((int32_t)res);

which leads to an overflow trap being triggered for no good reason...

Incidentally, all those guys ({add,sub,mul}[lq]/v) *do* assign the result (same as the variant without /v would) before entering the trap. So arith_excp() is wrong here.

FWIW, why not just generate

    trunc_i64_i32 tmp, va
    trunc_i64_i32 tmp2, vb
    muls2_i32 tmp2, tmp, tmp, tmp2
    ext32s_i64 vc, tmp2
    maybe_overflow_32 tmp

where maybe_overflow throws IOV unless tmp is 0 or -1? That would appear to suffice for mull/v.

mulq/v would be

    muls2_i64 vc, tmp, va, vb
    maybe_overflow_64 tmp

addl/v:

    trunc_i64_i32 tmp, va
    trunc_i64_i32 tmp2, vb
    add2_i32 tmp2, tmp, tmp, zero, tmp2, zero
    ext32s_i64 vc, tmp2
    maybe_overflow_32 tmp

etc. We'd need two helpers, differing only in argument type. A simple

    if (unlikely(arg && ~arg))
        arith_excp(env, GETPC(), EXC_M_IOV, 0);

would do.
Not sure what flags would be needed in DEFINE_HELPER_... for those, though. Comments?