On 23 January 2015 at 11:18, Tejas Belagod <tejas.bela...@arm.com> wrote: > On 22/01/15 21:31, Christophe Lyon wrote: >> >> On 22 January 2015 at 16:22, Tejas Belagod <tejas.bela...@arm.com> wrote: >>> >>> On 22/01/15 14:28, Christophe Lyon wrote: >>>> >>>> >>>> On 22 January 2015 at 12:19, Tejas Belagod <tejas.bela...@arm.com> >>>> wrote: >>>>> >>>>> >>>>> On 21/01/15 15:07, Christophe Lyon wrote: >>>>>> >>>>>> >>>>>> >>>>>> On 19 January 2015 at 17:54, Marcus Shawcroft >>>>>> <marcus.shawcr...@gmail.com> wrote: >>>>>>> >>>>>>> >>>>>>> >>>>>>> On 19 January 2015 at 15:43, Christophe Lyon >>>>>>> <christophe.l...@linaro.org> >>>>>>> wrote: >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On 19 January 2015 at 14:29, Marcus Shawcroft >>>>>>>> <marcus.shawcr...@gmail.com> wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> On 16 January 2015 at 17:52, Christophe Lyon >>>>>>>>> <christophe.l...@linaro.org> wrote: >>>>>>>>> >>>>>>>>>>> OK provided, as per the previous couple, that we don;t regression >>>>>>>>>>> or >>>>>>>>>>> introduce new fails on aarch64[_be] or aarch32. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> This patch shows failures on aarch64 and aarch64_be for vmax and >>>>>>>>>> vmin >>>>>>>>>> when the input is -NaN. >>>>>>>>>> It's a corner case, and my reading of the ARM ARM is that the >>>>>>>>>> result >>>>>>>>>> should the same as on aarch32. >>>>>>>>>> I haven't had time to look at it in more details though. >>>>>>>>>> So, not OK? >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> They should have the same behaviour in aarch32 and aarch64. Did you >>>>>>>>> test on HW or a model? >>>>>>>>> >>>>>>>> I ran the tests on qemu for aarch32 and aarch64-linux, and on the >>>>>>>> foundation model for aarch64*-elf. >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> Leave this one out until we understand why it fails. /Marcus >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> I've looked at this a bit more. >>>>>> We have >>>>>> fmax v0.4s, v0.4s, v1.4s >>>>>> where v0 is a vector of -NaN (0xffc00000) and v1 is a vector of 1. >>>>>> >>>>>> The output is still -NaN (0xffc00000), while the test expects >>>>>> defaultNaN (0x7fc00000). >>>>>> >>>>> >>>>> In the AArch32 execution state, Advanced SIMD FP arithmetic always uses >>>>> the >>>>> DefaultNaN setting regardless of the DN-bit value in the FPSCR. In >>>>> AArch64 >>>>> execution state, result of Advanced SIMD FP arithmetic operations >>>>> depend >>>>> on >>>>> the value of the DN-bit i.e. either propagate the input NaN or generate >>>>> DefaultNaN depending on the value of DN. >>>> >>>> >>>> >>>> Maybe I'm using an outdated doc. On page 2282 of ARMv8 ARM rev C, I >>>> can see only the latter (no diff between aarch32 and aarch64 in >>>> FPProcessNan pseudo-code) >>>> >>> >>> If you see pg. 4005 in the same doc(rev C), you'll see the FPSCR spec - >>> under DN: >>> >>> "The value of this bit only controls scalar floating-point arithmetic. >>> Advanced SIMD arithmetic always uses the Default NaN setting, regardless >>> of >>> the value of the DN bit." >>> >>> Also on page 3180 for the description of VMAX(vector FP), it says: >>> " >>> * max(+0.0, -0.0) = +0.0 >>> * If any input is a NaN, the corresponding result element is the default >>> NaN. >>> " >>> >> Oops I was looking at FMAX (vector) pg 936. >> >>> The pseudocode for FPMax () on pg. 3180 passes StandardFPSCRValue() to >>> FPMax() which is on pg. 2285 >>> >>> // StandardFPSCRValue() >>> // ==================== >>> FPCRType StandardFPSCRValue() >>> return ‘00000’ : FPSCR.AHP : ‘11000000000000000000000000’ >>> >>> Here bit-25(FPSCR.DN) is set to 1. >>> >> >> So, we should get defaultNaN too on aarch64, and no need to try to >> force DN to 1 in gdb? >> >> What can be wrong? >> > > On pg 3180, I see VMAX(FPSIMD) for A32/T32, not A64. I hope we're reading > the same document. > > Regardless of the page number, if you see the pseudocode for VMAX(FPSIMD) > for AArch32, StandardFPSCRValue() (i.e. DN = 1) is passed to FPMax() which > means generate DefaultNaN() regardless. > > OTOH, on pg 936, you have FMAX(vector) for A64 where FPMax() in the > pseudocode gets just FPCR. > > Ok, that was my initial understanding but our discussion confused me.
And that's why I tried to force DN = 1 in gdb before single-stepping over fmax v0.4s, v0.4s, v1.4s but it changed nothing :-( Hence my question about a gdb possible bug or misuse. I'll try modifying the test to have it force DN=1. > Thanks, > Tejas. > > >>> Thanks, >>> Tejas. >>> >>> >>>>> If you're running your test in the AArch64 execution state, you'd want >>>>> to >>>>> define the DN bit and modify the expected results accordingly or have >>>>> the >>>>> test poll at runtime what the DN-bit is set to and check expected >>>>> results >>>>> dynamically. >>>> >>>> >>>> Makes sense, I hadn't noticed the different aarch64 spec here. >>>> >>>>> I think the test already has expected behaviour for AArch32 execution >>>>> state >>>>> by expecting DefaultNaN regardless. >>>> >>>> >>>> Yes. >>>> >>>>>> I have executed the test under GDB on AArch64 HW, and noticed that >>>>>> fpcr >>>>>> was 0. >>>>>> I forced it to have DN==1: >>>>>> set $fpcr=0x1000000 >>>>>> but this didn't change the result. >>>>>> >>>>>> Does setting fpcr.dn under gdb actually work? >>>>>> >>>>> >>>>> It should. Possibly a bug, patches welcome :-). >>>>> >>>> :-) >>>> >>> >>> >> > >