On 12/10/2015 06:15 AM, Chen Gang wrote:
> +#define TILEGX_F_MAN_HBIT   (1ULL << 59)
...
> +static uint64_t fr_to_man(float64 d)
> +{
> +    uint64_t val = get_f64_man(d) << 7;
> +
> +    if (get_f64_exp(d)) {
> +        val |= TILEGX_F_MAN_HBIT;
> +    }
> +
> +    return val;
> +}

One presumes that "HBIT" is the IEEE implicit one bit. A better name or
better comments would help there.

Do we know for sure that "7" is the correct number of guard bits? From the
gcc implementation of floatsidf, I might guess that the correct number is "4".

> +static uint32_t get_fdouble_vexp(uint64_t n)
> +{
> +    return extract32(n, 7, 13);
> +}

What's a "vexp"?

> +uint64_t helper_fdouble_unpack_min(CPUTLGState *env,
> +                                   uint64_t srca, uint64_t srcb)
> +{
> +    uint64_t v = 0;
> +    uint32_t expa = get_f64_exp(srca);
> +    uint32_t expb = get_f64_exp(srcb);
> +
> +    if (float64_is_any_nan(srca) || float64_is_any_nan(srcb)
> +        || float64_is_infinity(srca) || float64_is_infinity(srcb)) {
> +        return 0;
> +    } else if (expa > expb) {
> +        if (expa - expb < 64) {
> +            set_fdouble_man(&v, fr_to_man(srcb) >> (expa - expb));
> +        } else {
> +            return 0;
> +        }
> +    } else if (expa < expb) {
> +        if (expb - expa < 64) {
> +            set_fdouble_man(&v, fr_to_man(srca) >> (expb - expa));

I very sincerely doubt that a simple right-shift is correct. In order to
obtain proper rounding for real computation, a sticky bit is required. That
is, set bit 0 of the result if any nonzero bits are shifted out. See the
implementation of shift64RightJamming in fpu/softfloat-macros.h.

> +uint64_t helper_fdouble_addsub(CPUTLGState *env,
> +                               uint64_t dest, uint64_t srca, uint64_t srcb)
> +{
> +    if (get_fdouble_calc(srcb) == TILEGX_F_CALC_ADD) {
> +        return dest + srca; /* maybe set addsub overflow bit */

Definitely not. That would be part of packing.
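
To make the sticky-bit point about fdouble_unpack_min concrete, here is a
minimal standalone sketch of a "jamming" right shift. It follows the idea
behind shift64RightJamming but is not the softfloat code; the name
shift_right_sticky is hypothetical and used only for illustration.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch or from softfloat):
 * shift A right by COUNT bits, and set bit 0 of the result if any
 * 1 bits were shifted out, so that later rounding can still tell
 * the value was inexact.
 */
static uint64_t shift_right_sticky(uint64_t a, unsigned count)
{
    if (count == 0) {
        return a;                       /* nothing shifted out */
    }
    if (count < 64) {
        /* The bits that fall off the bottom, moved to the top. */
        uint64_t lost = a << (64 - count);
        return (a >> count) | (lost != 0);
    }
    /* Everything is shifted out; only the sticky bit can remain. */
    return a != 0;
}
```

With something like this, the two mantissa alignments in the quoted code
would pass fr_to_man(...) and the exponent difference to the helper instead
of using a bare ">>", and the "< 64" special case would no longer need to
return 0.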

> +/* absolute-add/mul may cause add/mul carry or overflow */
> +static bool proc_oflow(uint64_t *flags, uint64_t *v, uint64_t *srcb)
> +{
> +    if (get_fdouble_man_of(*v)) {
> +        set_fdouble_vexp(flags, get_fdouble_vexp(*flags) + 1);
> +        *srcb >>= 1;
> +        *srcb |= *v << 63;
> +        *v >>= 1;
> +        clear_fdouble_man_of(v);
> +    }
> +    return get_fdouble_vexp(*flags) > TILEGX_F_EXP_DMAX;
> +}
> +
> +uint64_t helper_fdouble_pack2(CPUTLGState *env, uint64_t flags /* dest */,
> +                              uint64_t srca, uint64_t srcb)
> +{
> +    uint64_t v = srca;
> +    float64 d = float64_set_sign(float64_zero, get_fdouble_sign(flags));
> +
> +    /*
> +     * fdouble_add_flags, fdouble_sub_flags, or fdouble_mul_flags have
> +     * processed exceptions. So need not process fp_status, again.
> +     */

No need to process fp_status at all, actually. Tile-GX (and pro) do not
support exception flags, so everything we do with fp_status is discarded.

Indeed, we should probably not store fp_status in env at all, but create it
on the stack in any function that actually needs one.

> +
> +    if (get_fdouble_nan(flags)) {
> +        return float64_val(float64_default_nan);
> +    } else if (get_fdouble_inf(flags)) {
> +        return float64_val(d |= float64_infinity);

s/|=/|/

> +    /* absolute-mul needs left shift 4 + 1 bytes to match the real mantissa */
> +    if (get_fdouble_calc(flags) == TILEGX_F_CALC_MUL) {
> +        v <<= 5;
> +        v |= srcb >> 59;
> +        srcb <<= 5;
> +    }

As with single, I don't like this calc thing. We can infer what's required
from principles.

We're given two words containing mantissa, and a "flags" word containing
sign, exponent, and other flags. For add, sub, and floatsidf, the compiler
passes us 0 as the low word; for mul the compiler passes us the result of a
64x64->128 bit multiply.

The first step would be to normalize the 128-bit value so that the highest
bit set is TILEGX_F_MAN_HBIT in the high word, adjusting the exponent in the
process. Fold the low word into the sticky bit of the high word
(high |= (low != 0)) for rounding purposes.

The second step would be to round and pack, similar to roundAndPackFloat64,
except that your HBIT is at a different place than softfloat.c.

> +    d = calc(fsrca, fsrcb, fp_status); /* also check exceptions */

There are no exceptions to check.


r~
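
For reference, a rough standalone sketch of the first step described above
(normalize the 128-bit value, then fold the low word into the sticky bit).
The function names, the plain "int" exponent, and the treatment of an
all-zero input are assumptions made for illustration; only the HBIT position
(bit 59) is taken from the quoted patch. It says nothing about how the
exponent is actually encoded in the flags word.

```c
#include <stdint.h>

#define HBIT_POS 59   /* TILEGX_F_MAN_HBIT is (1ULL << 59) in the patch */

/* Index of the highest set bit of X, or -1 if X is zero. */
static int top_bit(uint64_t x)
{
    int n = -1;
    while (x) {
        x >>= 1;
        n++;
    }
    return n;
}

/*
 * Hypothetical helper: normalize HI:LO so that the highest set bit
 * lands on bit HBIT_POS of the high word, adjust *EXP to compensate,
 * and return the high word with the low word folded into bit 0.
 */
static uint64_t normalize_and_fold(uint64_t hi, uint64_t lo, int *exp)
{
    if (hi == 0 && lo == 0) {
        return 0;                        /* assumed: zero stays zero */
    }

    int top = hi ? 64 + top_bit(hi) : top_bit(lo);
    int shift = (64 + HBIT_POS) - top;   /* > 0 means shift left */

    if (shift >= 64) {
        hi = lo << (shift - 64);
        lo = 0;
    } else if (shift > 0) {
        hi = (hi << shift) | (lo >> (64 - shift));
        lo <<= shift;
    } else if (shift < 0) {
        int n = -shift;                  /* at most 4 here */
        uint64_t lost = lo & ((1ULL << n) - 1);
        lo = (lo >> n) | (hi << (64 - n)) | (lost != 0);
        hi >>= n;
    }
    *exp -= shift;                       /* left shift lowers the exponent */

    return hi | (lo != 0);               /* sticky bit for rounding */
}
```

The returned word would then feed the round-and-pack step, which this sketch
deliberately leaves out since the number of guard bits is still an open
question.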