On 9/22/15 22:45, Richard Henderson wrote:
> On 09/21/2015 10:54 PM, Chen Gang wrote:
>> On 2015-09-19 10:34, Richard Henderson wrote:
>>>
>>> There's a trick for this that's more efficient for 4 or more elements
>>> per vector (i.e. good for v2 and v1, but not v4):
>>>
>>>    a + b = ((a & 0x7f7f7f7f) + (b & 0x7f7f7f7f)) ^ ((a ^ b) & 0x80808080)
>>>
>>>    a - b = ((a | 0x80808080) - (b & 0x7f7f7f7f)) ^ ((a ^ ~b) & 0x80808080)
>>>
>>
>> I think we need to use "(a ^ b) & 0x80..." instead of "(a ^ ~b) & 0x80...".
> 
> No.  What you did wrong was not using (a | 0x80808080).
> 

Oh, sorry. I shall send patch v3 for it. :-)
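
For reference, a minimal C sketch of the two formulas quoted above, with
the parentheses balanced, for the byte-wise (v1) case on a 32-bit word.
The names v1_add32, v1_sub32, ref32 and swar_selftest are hypothetical
and not taken from the patch; the real helpers would presumably work on
64-bit registers with the masks widened accordingly. The self-test only
compares the trick against a plain per-byte loop on a few sample values.

```c
#include <assert.h>
#include <stdint.h>

#define MSB 0x80808080u  /* top bit of every byte lane */
#define LOW 0x7f7f7f7fu  /* low seven bits of every byte lane */

/* Byte-wise add: summing only the low 7 bits of each lane cannot carry
 * into the next lane; the XOR term then supplies the correct top bit. */
static uint32_t v1_add32(uint32_t a, uint32_t b)
{
    return ((a & LOW) + (b & LOW)) ^ ((a ^ b) & MSB);
}

/* Byte-wise subtract: forcing the top bit of each lane of a to 1 keeps
 * any borrow inside the lane; the XOR term restores the true top bit.
 * Without (a | MSB) a borrow would leak into the neighbouring lane. */
static uint32_t v1_sub32(uint32_t a, uint32_t b)
{
    return ((a | MSB) - (b & LOW)) ^ ((a ^ ~b) & MSB);
}

/* Reference implementation: handle one byte lane at a time. */
static uint32_t ref32(uint32_t a, uint32_t b, int subtract)
{
    uint32_t r = 0;
    for (int i = 0; i < 32; i += 8) {
        uint8_t x = (uint8_t)(a >> i);
        uint8_t y = (uint8_t)(b >> i);
        uint8_t z = subtract ? (uint8_t)(x - y) : (uint8_t)(x + y);
        r |= (uint32_t)z << i;
    }
    return r;
}

/* Compare the trick against the reference on a few sample words. */
static void swar_selftest(void)
{
    static const uint32_t s[] = {
        0x00000000u, 0x01010101u, 0x7f7f7f7fu, 0x80808080u,
        0xffffffffu, 0x01ff7f80u, 0x12345678u
    };
    const int n = (int)(sizeof(s) / sizeof(s[0]));

    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            assert(v1_add32(s[i], s[j]) == ref32(s[i], s[j], 0));
            assert(v1_sub32(s[i], s[j]) == ref32(s[i], s[j], 1));
        }
    }
}
```

For example, v1_sub32(0x00000000, 0x01010101) gives 0xffffffff: every
byte wraps to 0xff on its own, with no borrow crossing a byte boundary.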


Thanks.
-- 
Chen Gang (陈刚)

Open, share, and attitude like air, water, and life which God blessed
