Re: [PATCH v3 51/81] target/arm: Pass separate addend to {U, S}DOT helpers

2020-10-09 Thread Richard Henderson
On 9/23/20 6:48 AM, LIU Zhiwei wrote:
>> +    for (i = 0; i < opr_sz_8; i += 2) {
>>          uint64_t d0, d1;
>> -    d0  = n[i * 4 + 0] * (uint64_t)m_indexed[i * 4 + 0];
>> +    d0  = a[i + 0];
> Add once.
>> +    d0 += n[i * 4 + 0] * (uint64_t)m_indexed[i * 4 + 0];
>>   d0

Re: [PATCH v3 51/81] target/arm: Pass separate addend to {U, S}DOT helpers

2020-09-23 Thread LIU Zhiwei
On 2020/9/23 22:46, Richard Henderson wrote:
On 9/23/20 3:01 AM, LIU Zhiwei wrote:
On 2020/9/19 2:37, Richard Henderson wrote:
For SVE, we potentially have a 4th argument coming from the
movprfx instruction.  Currently we do not optimize movprfx,
so the problem is not visible.
Hi Richard,

Re: [PATCH v3 51/81] target/arm: Pass separate addend to {U, S}DOT helpers

2020-09-23 Thread Richard Henderson
On 9/23/20 3:01 AM, LIU Zhiwei wrote:
>
>
> On 2020/9/19 2:37, Richard Henderson wrote:
>> For SVE, we potentially have a 4th argument coming from the
>> movprfx instruction.  Currently we do not optimize movprfx,
>> so the problem is not visible.
> Hi Richard,
>
> I am a little confused.  If it

Re: [PATCH v3 51/81] target/arm: Pass separate addend to {U, S}DOT helpers

2020-09-23 Thread LIU Zhiwei
On 2020/9/19 2:37, Richard Henderson wrote:
For SVE, we potentially have a 4th argument coming from the
movprfx instruction. Currently we do not optimize movprfx,
so the problem is not visible.
Signed-off-by: Richard Henderson
---
 target/arm/helper.h | 20 +++---
 target/arm/

Re: [PATCH v3 51/81] target/arm: Pass separate addend to {U, S}DOT helpers

2020-09-23 Thread LIU Zhiwei
On 2020/9/19 2:37, Richard Henderson wrote:
For SVE, we potentially have a 4th argument coming from the
movprfx instruction. Currently we do not optimize movprfx,
so the problem is not visible.
Hi Richard,
I am a little confused.  If it is not immediately preceded by a MOVPRFX instruction,

[PATCH v3 51/81] target/arm: Pass separate addend to {U, S}DOT helpers

2020-09-18 Thread Richard Henderson
For SVE, we potentially have a 4th argument coming from the
movprfx instruction. Currently we do not optimize movprfx,
so the problem is not visible.
Signed-off-by: Richard Henderson
---
 target/arm/helper.h | 20 +++---
 target/arm/sve.decode | 7 +-
 target/arm/translate
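To illustrate what passing a separate addend means, here is a minimal sketch in C of a signed byte dot-product helper. It is not the code from this patch: the name sdot_b_sketch, its signature, and the plain-pointer interface are assumptions made for illustration only. The sketch shows the accumulator being read from its own operand a rather than from the destination d, which is what would let a prefixed form supply a different register for the addend.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a signed byte dot-product helper; this is
 * an illustration, not QEMU's implementation.
 * Each 32-bit lane computes d[i] = a[i] + sum of four int8 products.
 * opr_sz is the vector size in bytes and must be a multiple of 4.
 */
static void sdot_b_sketch(uint32_t *d, const int8_t *n, const int8_t *m,
                          const uint32_t *a, size_t opr_sz)
{
    assert(opr_sz % 4 == 0);
    for (size_t i = 0; i < opr_sz / 4; i++) {
        /* Read the addend before writing d[i], so d may alias a. */
        uint32_t acc = a[i];
        for (size_t j = 0; j < 4; j++) {
            /* Widen to 32 bits before multiplying; accumulate modulo 2^32. */
            acc += (uint32_t)((int32_t)n[i * 4 + j] * (int32_t)m[i * 4 + j]);
        }
        d[i] = acc;
    }
}
```

Calling sdot_b_sketch(d, n, m, d, opr_sz) reproduces the accumulate-in-place behaviour, while passing a distinct a models the case where the addend comes from another register.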