On 25/01/15 21:05, Christophe Lyon wrote:
On 23 January 2015 at 14:44, Christophe Lyon <christophe.l...@linaro.org> wrote:
On 23 January 2015 at 12:42, Christophe Lyon <christophe.l...@linaro.org> wrote:
On 23 January 2015 at 11:18, Tejas Belagod <tejas.bela...@arm.com> wrote:
On 22/01/15 21:31, Christophe Lyon wrote:

On 22 January 2015 at 16:22, Tejas Belagod <tejas.bela...@arm.com> wrote:

On 22/01/15 14:28, Christophe Lyon wrote:


On 22 January 2015 at 12:19, Tejas Belagod <tejas.bela...@arm.com>
wrote:


On 21/01/15 15:07, Christophe Lyon wrote:



On 19 January 2015 at 17:54, Marcus Shawcroft
<marcus.shawcr...@gmail.com> wrote:



On 19 January 2015 at 15:43, Christophe Lyon
<christophe.l...@linaro.org>
wrote:



On 19 January 2015 at 14:29, Marcus Shawcroft
<marcus.shawcr...@gmail.com> wrote:



On 16 January 2015 at 17:52, Christophe Lyon
<christophe.l...@linaro.org> wrote:

OK provided, as per the previous couple, that we don't regress or
introduce new fails on aarch64[_be] or aarch32.




This patch shows failures on aarch64 and aarch64_be for vmax and vmin
when the input is -NaN.
It's a corner case, and my reading of the ARM ARM is that the result
should be the same as on aarch32.
I haven't had time to look at it in more detail though.
So, not OK?




They should have the same behaviour in aarch32 and aarch64. Did you
test on HW or a model?

I ran the tests on qemu for aarch32 and aarch64-linux, and on the
foundation model for aarch64*-elf.




Leave this one out until we understand why it fails. /Marcus




I've looked at this a bit more.
We have
fmax    v0.4s, v0.4s, v1.4s
where v0 is a vector of -NaN (0xffc00000) and v1 is a vector of 1.

The output is still -NaN (0xffc00000), while the test expects
defaultNaN (0x7fc00000).
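
For reference, the two bit patterns above differ only in the sign bit. A minimal sketch in portable C, assuming IEEE 754 single precision; the helper and macro names are illustrative and do not come from the patch:

```c
#include <stdint.h>
#include <string.h>

#define NEG_QNAN_BITS    0xffc00000u  /* the "-NaN" input from the thread */
#define DEFAULT_NAN_BITS 0x7fc00000u  /* the Default NaN the test expects */
#define SIGN_BIT         0x80000000u

/* A single-precision value is a NaN when the exponent field is all ones
   and the fraction field is non-zero. */
static int is_nan_bits(uint32_t bits)
{
    return (bits & 0x7f800000u) == 0x7f800000u && (bits & 0x007fffffu) != 0;
}

/* Return the raw 32-bit pattern of a float (assumes a 32-bit float type). */
static uint32_t bits_from_float(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

Both constants satisfy is_nan_bits(), and NEG_QNAN_BITS ^ DEFAULT_NAN_BITS equals SIGN_BIT, so a bitwise comparison of results tells the two apart even though both are quiet NaNs.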


In the AArch32 execution state, Advanced SIMD FP arithmetic always uses the
DefaultNaN setting regardless of the DN-bit value in the FPSCR. In the AArch64
execution state, the result of Advanced SIMD FP arithmetic operations depends
on the value of the DN bit, i.e. it either propagates the input NaN or
generates DefaultNaN depending on the value of DN.



Maybe I'm using an outdated doc. On page 2282 of ARMv8 ARM rev C, I
can see only the latter (no diff between aarch32 and aarch64 in
FPProcessNan pseudo-code)


If you see pg. 4005 in the same doc(rev C), you'll see the FPSCR spec -
under DN:

"The value of this bit only controls scalar floating-point arithmetic.
Advanced SIMD arithmetic always uses the Default NaN setting, regardless of
the value of the DN bit."

Also on page 3180 for the description of VMAX(vector FP), it says:
"
*  max(+0.0, -0.0) = +0.0
* If any input is a NaN, the corresponding result element is the default
NaN.
"

Oops I was looking at FMAX (vector) pg 936.

The pseudocode for VMAX on pg. 3180 passes StandardFPSCRValue() to
FPMax() which is on pg. 2285

// StandardFPSCRValue()
// ====================
FPCRType StandardFPSCRValue()
    return '00000' : FPSCR.AHP : '11000000000000000000000000'

Here bit-25(FPSCR.DN) is set to 1.
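
The concatenation in that pseudocode can be checked with a few lines of C. This is only a sketch of the bit layout as quoted above (five zero bits, then AHP at bit 26, then '11' followed by 24 zero bits); the function and macro names are made up for illustration:

```c
#include <stdint.h>

#define FPSCR_AHP_BIT 26
#define FPSCR_DN_BIT  25
#define FPSCR_FZ_BIT  24

/* Model of '00000' : FPSCR.AHP : '11' followed by 24 zero bits.
   Only the AHP bit is taken from the current FPSCR value. */
static uint32_t standard_fpscr_value(uint32_t fpscr)
{
    uint32_t ahp = (fpscr >> FPSCR_AHP_BIT) & 1u;
    return (ahp << FPSCR_AHP_BIT)
         | (1u << FPSCR_DN_BIT)
         | (1u << FPSCR_FZ_BIT);
}
```

With AHP clear this gives 0x03000000, so DN (bit 25) and FZ (bit 24) are both set whatever the other bits of the live FPSCR are.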


So, we should get defaultNaN too on aarch64, and no need to try to
force DN to 1 in gdb?

What can be wrong?


On pg 3180, I see VMAX(FPSIMD) for A32/T32, not A64. I hope we're reading
the same document.

Regardless of the page number, if you see the pseudocode for VMAX(FPSIMD)
for AArch32, StandardFPSCRValue() (i.e. DN = 1) is passed to FPMax() which
means generate DefaultNaN() regardless.

OTOH, on pg 936, you have FMAX(vector) for A64 where FPMax() in the
pseudocode gets just FPCR.
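
To make the difference concrete, here is a simplified software model of one lane of a floating-point max, where the dn argument stands for either FPCR.DN (A64) or the always-set DN of the standard FPSCR value (A32/T32 Advanced SIMD). It is an illustration only, not the architectural pseudocode: the name fmax_lane_model is hypothetical, and NaN selection is reduced to "first NaN operand wins" instead of the full signalling/quiet priority rules.

```c
#include <stdint.h>
#include <string.h>

#define DEFAULT_NAN_BITS 0x7fc00000u

static int is_nan_bits(uint32_t b)
{
    return (b & 0x7f800000u) == 0x7f800000u && (b & 0x007fffffu) != 0;
}

static float as_float(uint32_t b)
{
    float f;
    memcpy(&f, &b, sizeof f);
    return f;
}

/* Simplified model: a and b are raw single-precision bit patterns. */
static uint32_t fmax_lane_model(uint32_t a, uint32_t b, int dn)
{
    if (is_nan_bits(a) || is_nan_bits(b)) {
        if (dn)
            return DEFAULT_NAN_BITS;      /* DN=1: always the Default NaN */
        /* DN=0: propagate an input NaN, made quiet (simplified choice). */
        uint32_t nan = is_nan_bits(a) ? a : b;
        return nan | 0x00400000u;
    }
    if (((a | b) & 0x7fffffffu) == 0)     /* both zeros: max(+0.0, -0.0) = +0.0 */
        return a & b;
    return as_float(a) >= as_float(b) ? a : b;
}
```

In this model fmax_lane_model(0xffc00000u, 0x3f800000u, 0) keeps the input -NaN, while the same call with dn set to 1 returns 0x7fc00000, which matches the two behaviours discussed above.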


Ok, that was my initial understanding but our discussion confused me.

And that's why I tried to force DN = 1 in gdb before single-stepping over
fmax    v0.4s, v0.4s, v1.4s

but it changed nothing :-(
Hence my question about a possible gdb bug or misuse.

Hmm... user error, I missed one bit
set $fpcr=0x2000000
works under gdb.

I'll try modifying the test to have it force DN=1.

Forcing DN=1 in the test makes it pass.

I am going to look at adding that cleanly to my test, and resubmit it.

Thanks, and sorry for the noise.

Here is the updated version:
- Now I set DN=1 on AArch64 in clean_results, as it is the main
initialization function.
- I removed the double negative :-)
- I removed the useless [u]int64 and poly variants

Christophe.

2015-01-25  Christophe Lyon  <christophe.l...@linaro.org>

* gcc.target/aarch64/advsimd-intrinsics/arm-neon-ref.h
(_ARM_FPSCR): Add DN and AHP fields.
(clean_results): Force DN=1 on AArch64.
* gcc.target/aarch64/advsimd-intrinsics/binary_op_no64.inc: New file.
* gcc.target/aarch64/advsimd-intrinsics/vhadd.c: New file.
* gcc.target/aarch64/advsimd-intrinsics/vhsub.c: New file.
* gcc.target/aarch64/advsimd-intrinsics/vmax.c: New file.
* gcc.target/aarch64/advsimd-intrinsics/vmin.c: New file.
* gcc.target/aarch64/advsimd-intrinsics/vrhadd.c: New file.


I guess you don't need the fake dependency fix for this as this is mostly called only once?

+  _ARM_FPSCR _afpscr_for_dn;
+  asm volatile ("mrs %0,fpcr" : "=r" (_afpscr_for_dn));
+  _afpscr_for_dn.b.DN = 1;
+  asm volatile ("msr fpcr,%0" : : "r" (_afpscr_for_dn));
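
For readers without arm-neon-ref.h at hand, the same read-modify-write can be sketched with a plain mask instead of the _ARM_FPSCR bitfield. This is an assumption-laden illustration, not the patch itself: the names FPCR_DN_MASK, with_dn_set and force_default_nan are hypothetical, and the register access is compiled only when targeting AArch64.

```c
#include <stdint.h>

/* DN is bit 25 of FPCR, i.e. the 0x2000000 value used under gdb above. */
#define FPCR_DN_MASK (UINT64_C(1) << 25)

/* Pure helper: return the given FPCR value with DN set. */
static uint64_t with_dn_set(uint64_t fpcr)
{
    return fpcr | FPCR_DN_MASK;
}

/* Read FPCR, set DN, write it back. A no-op on other targets. */
static void force_default_nan(void)
{
#if defined(__aarch64__)
    uint64_t fpcr;
    __asm__ volatile ("mrs %0, fpcr" : "=r" (fpcr));
    fpcr = with_dn_set(fpcr);
    __asm__ volatile ("msr fpcr, %0" : : "r" (fpcr));
#endif
}
```

Calling force_default_nan() once from an initialization function mirrors what the patch does in clean_results.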

Otherwise, your patch looks OK to me (but I can't approve it).

Thanks,
Tejas.

