On 23 January 2015 at 12:42, Christophe Lyon <christophe.l...@linaro.org> wrote: > On 23 January 2015 at 11:18, Tejas Belagod <tejas.bela...@arm.com> wrote: >> On 22/01/15 21:31, Christophe Lyon wrote: >>> >>> On 22 January 2015 at 16:22, Tejas Belagod <tejas.bela...@arm.com> wrote: >>>> >>>> On 22/01/15 14:28, Christophe Lyon wrote: >>>>> >>>>> >>>>> On 22 January 2015 at 12:19, Tejas Belagod <tejas.bela...@arm.com> >>>>> wrote: >>>>>> >>>>>> >>>>>> On 21/01/15 15:07, Christophe Lyon wrote: >>>>>>> >>>>>>> >>>>>>> >>>>>>> On 19 January 2015 at 17:54, Marcus Shawcroft >>>>>>> <marcus.shawcr...@gmail.com> wrote: >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On 19 January 2015 at 15:43, Christophe Lyon >>>>>>>> <christophe.l...@linaro.org> >>>>>>>> wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> On 19 January 2015 at 14:29, Marcus Shawcroft >>>>>>>>> <marcus.shawcr...@gmail.com> wrote: >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On 16 January 2015 at 17:52, Christophe Lyon >>>>>>>>>> <christophe.l...@linaro.org> wrote: >>>>>>>>>> >>>>>>>>>>>> OK provided, as per the previous couple, that we don;t regression >>>>>>>>>>>> or >>>>>>>>>>>> introduce new fails on aarch64[_be] or aarch32. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> This patch shows failures on aarch64 and aarch64_be for vmax and >>>>>>>>>>> vmin >>>>>>>>>>> when the input is -NaN. >>>>>>>>>>> It's a corner case, and my reading of the ARM ARM is that the >>>>>>>>>>> result >>>>>>>>>>> should the same as on aarch32. >>>>>>>>>>> I haven't had time to look at it in more details though. >>>>>>>>>>> So, not OK? >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> They should have the same behaviour in aarch32 and aarch64. Did you >>>>>>>>>> test on HW or a model? >>>>>>>>>> >>>>>>>>> I ran the tests on qemu for aarch32 and aarch64-linux, and on the >>>>>>>>> foundation model for aarch64*-elf. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Leave this one out until we understand why it fails. /Marcus >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> I've looked at this a bit more. >>>>>>> We have >>>>>>> fmax v0.4s, v0.4s, v1.4s >>>>>>> where v0 is a vector of -NaN (0xffc00000) and v1 is a vector of 1. >>>>>>> >>>>>>> The output is still -NaN (0xffc00000), while the test expects >>>>>>> defaultNaN (0x7fc00000). >>>>>>> >>>>>> >>>>>> In the AArch32 execution state, Advanced SIMD FP arithmetic always uses >>>>>> the >>>>>> DefaultNaN setting regardless of the DN-bit value in the FPSCR. In >>>>>> AArch64 >>>>>> execution state, result of Advanced SIMD FP arithmetic operations >>>>>> depend >>>>>> on >>>>>> the value of the DN-bit i.e. either propagate the input NaN or generate >>>>>> DefaultNaN depending on the value of DN. >>>>> >>>>> >>>>> >>>>> Maybe I'm using an outdated doc. On page 2282 of ARMv8 ARM rev C, I >>>>> can see only the latter (no diff between aarch32 and aarch64 in >>>>> FPProcessNan pseudo-code) >>>>> >>>> >>>> If you see pg. 4005 in the same doc(rev C), you'll see the FPSCR spec - >>>> under DN: >>>> >>>> "The value of this bit only controls scalar floating-point arithmetic. >>>> Advanced SIMD arithmetic always uses the Default NaN setting, regardless >>>> of >>>> the value of the DN bit." >>>> >>>> Also on page 3180 for the description of VMAX(vector FP), it says: >>>> " >>>> * max(+0.0, -0.0) = +0.0 >>>> * If any input is a NaN, the corresponding result element is the default >>>> NaN. >>>> " >>>> >>> Oops I was looking at FMAX (vector) pg 936. >>> >>>> The pseudocode for FPMax () on pg. 3180 passes StandardFPSCRValue() to >>>> FPMax() which is on pg. 2285 >>>> >>>> // StandardFPSCRValue() >>>> // ==================== >>>> FPCRType StandardFPSCRValue() >>>> return ‘00000’ : FPSCR.AHP : ‘11000000000000000000000000’ >>>> >>>> Here bit-25(FPSCR.DN) is set to 1. >>>> >>> >>> So, we should get defaultNaN too on aarch64, and no need to try to >>> force DN to 1 in gdb? >>> >>> What can be wrong? >>> >> >> On pg 3180, I see VMAX(FPSIMD) for A32/T32, not A64. I hope we're reading >> the same document. >> >> Regardless of the page number, if you see the pseudocode for VMAX(FPSIMD) >> for AArch32, StandardFPSCRValue() (i.e. DN = 1) is passed to FPMax() which >> means generate DefaultNaN() regardless. >> >> OTOH, on pg 936, you have FMAX(vector) for A64 where FPMax() in the >> pseudocode gets just FPCR. >> >> > Ok, that was my initial understanding but our discussion confused me. > > And that's why I tried to force DN = 1 in gdb before single-stepping over > fmax v0.4s, v0.4s, v1.4s > > but it changed nothing :-( > Hence my question about a gdb possible bug or misuse.
Hmm... user error, I missed one bit set $fpcr=0x2000000 works under gdb. > I'll try modifying the test to have it force DN=1. > Forcing DN=1 in the test makes it pass. I am going to look at adding that cleanly to my test, and resubmit it. Thanks, and sorry for the noise. >> Thanks, >> Tejas. >> >> >>>> Thanks, >>>> Tejas. >>>> >>>> >>>>>> If you're running your test in the AArch64 execution state, you'd want >>>>>> to >>>>>> define the DN bit and modify the expected results accordingly or have >>>>>> the >>>>>> test poll at runtime what the DN-bit is set to and check expected >>>>>> results >>>>>> dynamically. >>>>> >>>>> >>>>> Makes sense, I hadn't noticed the different aarch64 spec here. >>>>> >>>>>> I think the test already has expected behaviour for AArch32 execution >>>>>> state >>>>>> by expecting DefaultNaN regardless. >>>>> >>>>> >>>>> Yes. >>>>> >>>>>>> I have executed the test under GDB on AArch64 HW, and noticed that >>>>>>> fpcr >>>>>>> was 0. >>>>>>> I forced it to have DN==1: >>>>>>> set $fpcr=0x1000000 >>>>>>> but this didn't change the result. >>>>>>> >>>>>>> Does setting fpcr.dn under gdb actually work? >>>>>>> >>>>>> >>>>>> It should. Possibly a bug, patches welcome :-). >>>>>> >>>>> :-) >>>>> >>>> >>>> >>> >> >>