On 24/10/13 20:03, Kugan wrote:
Hi Kyrill,
It happens for armv5te arm-none-linux-gnueabi, configured with --with-mode=arm
--with-arch=armv5te --with-float=soft
Ah ok, I can reproduce it now. So, while I agree that we should add a scan for
vbit and vbif to these testcases, there seems to be something dodgy going on
with the register allocation.
With -march=armv5te I'm getting the following snippet of code in the ltgt case:
.L12:
        ldr     r4, [ip]
        ldr     r5, [ip, #4]
        ldr     r6, [ip, #8]
        ldr     r7, [ip, #12]
        vmov    d20, r4, r5  @ v4sf
        vmov    d21, r6, r7
        vcgt.f32        q8, q10, q9
        vcgt.f32        q10, q9, q10
        vorr    q8, q8, q10
        vmov    d22, r4, r5  @ v4sf
        vmov    d23, r6, r7
        vbit    q11, q9, q8
        vmov    r4, r5, d22  @ v4sf
        vmov    r6, r7, d23
The second vcgt.f32 trashes q10, and the original value is then recreated in q11 with:
        vmov    d22, r4, r5  @ v4sf
        vmov    d23, r6, r7
so it can do the vbit. Surely there's something better that can be done?
In contrast, with -march=armv7-a we get:
.L12:
        vld1.32 {q9}, [r4]!
        vcgt.f32        q8, q9, q10
        vcgt.f32        q11, q10, q9
        vorr    q8, q8, q11
        vbsl    q8, q10, q9
        vst1.32 {q8}, [lr]!
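For reference, both listings build the ltgt mask the same way: two vcgt.f32 with swapped operands, then a vorr. Below is a rough scalar sketch in C of what one 32-bit lane computes. The helper names (lane_gt, lane_ltgt, check_lane) are made up for illustration and are not taken from the testcases.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>

/* One lane of vcgt.f32: all ones if a > b, all zeros otherwise.
   A NaN operand makes the comparison false, so the lane is zero.  */
static uint32_t lane_gt(float a, float b)
{
  return a > b ? 0xFFFFFFFFu : 0u;
}

/* vcgt a,b ; vcgt b,a ; vorr: the lane is set when a < b or a > b.  */
static uint32_t lane_ltgt(float a, float b)
{
  return lane_gt(a, b) | lane_gt(b, a);
}

/* Sanity check against the C99 islessgreater macro.  */
static void check_lane(float a, float b)
{
  assert((lane_ltgt(a, b) != 0u) == (islessgreater(a, b) != 0));
}
```

The resulting mask is what vbsl, vbit or vbif then uses to pick between the two input vectors.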
So, while I agree with the patch, there seems to be some funny business with the
register allocation that could be worth looking into.
Thanks,
Kyrill
You can also find the logs here:
http://cbuild.validation.linaro.org/build/gcc-linaro-4.8-2013.10/logs/armv7l-precise-cbuild461-calxeda02_21_00_precise_armel-armv5r2/
I changed neon-vcond-gt.c too.
Thanks,
Kugan
2013-10-23  Kugan Vivekanandarajah  <kug...@linaro.org>

        * gcc.target/arm/neon-vcond-gt.c: Scan for vbsl or vbit or vbif.
        * gcc.target/arm/neon-vcond-ltgt.c: Scan for vbsl or vbit or vbif.
        * gcc.target/arm/neon-vcond-unordered.c: Scan for vbsl or vbit or vbif.
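As background on why a scan for any of the three instructions seems reasonable: vbsl, vbit and vbif all perform the same bitwise select and differ only in which operand lives in the destination register. A minimal sketch in C, modelling one 32-bit lane with the (destination, first operand, second operand) order of the assembly syntax. The function names mirror the mnemonics but are just illustrative helpers, not intrinsics.

```c
#include <assert.h>
#include <stdint.h>

/* vbsl d, n, m: d holds the mask; take n where d is 1, m where d is 0.  */
static uint32_t vbsl(uint32_t d, uint32_t n, uint32_t m)
{
  return (d & n) | (~d & m);
}

/* vbit d, n, m: insert bits of n into d where m is 1.  */
static uint32_t vbit(uint32_t d, uint32_t n, uint32_t m)
{
  return (n & m) | (d & ~m);
}

/* vbif d, n, m: insert bits of n into d where m is 0.  */
static uint32_t vbif(uint32_t d, uint32_t n, uint32_t m)
{
  return (d & m) | (n & ~m);
}

/* "mask ? a : b" expressed with each instruction.  */
static void check_select_equivalence(uint32_t mask, uint32_t a, uint32_t b)
{
  uint32_t expected = (mask & a) | (~mask & b);
  assert(vbsl(mask, a, b) == expected);  /* mask is the destination */
  assert(vbit(b, a, mask) == expected);  /* b is the destination */
  assert(vbif(a, b, mask) == expected);  /* a is the destination */
}
```

Which form gets emitted then depends on which of the three values ends up in the destination register, so it can vary with the register allocation.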