On Wed, Oct 26, 2016 at 10:22:10PM -0700, Richard Henderson wrote:
> On 10/26/2016 08:47 PM, David Gibson wrote:
> > > > +void helper_vprtybq(ppc_avr_t *r, ppc_avr_t *b)
> > > > +{
> > > > +    int i;
> > > > +    uint8_t s = 0;
> > > > +    for (i = 0; i < 16; i++) {
> > > > +        s ^= (b->u8[i] & 1);
> > > > +    }
> > > > +    r->u64[LO_IDX] = (!s) ? 0 : 1;
> > > > +    r->u64[HI_IDX] = 0;
> > > > +}
> > > > +
> > I think you can implement these better.  First mask with 0x01010101
> > (of the appropriate length) to extract the LSB bits of each byte.
> > Then XOR the two halves together, then quarters and so forth,
> > ln2(size) times to arrive at the parity.  This is similar to the usual
> > Hamming weight implementation.
> >
>
> You don't even have to mask with 0x01010101 to start.  Just fold halves til
> you get to the byte level and then mask with 1.
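For reference, a minimal sketch of the folding idea in standalone C. This is
not the QEMU helper itself: the two uint64_t arguments stand in for the high
and low halves of the vector, and the names parity_lsb_loop, parity_lsb_fold
and parity_selfcheck are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Reference version: XOR the least significant bit of each of the 16 bytes,
 * as the loop in the quoted patch does.  hi and lo are the two 64-bit
 * halves of the 128-bit value (an assumption of this sketch). */
unsigned parity_lsb_loop(uint64_t hi, uint64_t lo)
{
    unsigned s = 0;
    int i;

    for (i = 0; i < 8; i++) {
        s ^= (unsigned)(hi >> (8 * i)) & 1;
        s ^= (unsigned)(lo >> (8 * i)) & 1;
    }
    return s;
}

/* Folding version: XOR halves together until a single byte is left, then
 * mask with 1.  No 0x0101... pre-mask is needed, because bit 0 of the
 * final byte only ever receives bit 0 of the other bytes. */
unsigned parity_lsb_fold(uint64_t hi, uint64_t lo)
{
    uint64_t x = hi ^ lo;   /* 128 -> 64 bits */

    x ^= x >> 32;           /* 64 -> 32 bits */
    x ^= x >> 16;           /* 32 -> 16 bits */
    x ^= x >> 8;            /* 16 -> 8 bits */
    return (unsigned)(x & 1);
}

/* Hypothetical self-check: both versions should agree on any input. */
void parity_selfcheck(void)
{
    assert(parity_lsb_fold(0, 0) == parity_lsb_loop(0, 0));
    assert(parity_lsb_fold(0, 1) == parity_lsb_loop(0, 1));
    assert(parity_lsb_fold(0x0123456789abcdefULL, 0xfedcba9876543210ULL) ==
           parity_lsb_loop(0x0123456789abcdefULL, 0xfedcba9876543210ULL));
}
```

The shifts are by whole bytes, so the higher bits of each byte never leak
into bit 0, and the garbage left in the upper bits is dropped by the final
mask.
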
Good point.

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson