[Bug middle-end/32396] New: [PPC/Altivec, regression?] gcc uses 0 as altivec load/store index

2007-06-18 Thread sparky at pld-linux dot org
In altivec load/store instructions (lvx, stvx, ...) and lvsl/lvsr, when the
address is supplied as pointer + a known constant, gcc always calculates the
actual address in the scalar unit and does not let the instruction do the
addition (it puts 0 as the index). This slows down some simple altivec loops.

Sample code:
vector unsigned char *vDst = dst;
vector unsigned char vSetTo = {}; /* zero */

do {
vec_st( vSetTo,  0, vDst );
vec_st( vSetTo, 16, vDst );
vDst += 2;
} while (--len);

gcc 4.1.2, 4.2.0, 4.3-20070615 produces:

.L3:
addi %r11,%r9,16
stvx %v0,0,%r9
addi %r9,%r9,32
stvx %v0,0,%r11
bdnz .L3

while, ideally, it should be:
li %r11,16
.L3:
stvx %v0,0,%r9
stvx %v0,%r11,%r9
addi %r9,%r9,32
bdnz .L3

gcc 3.3, with -O2, behaves quite well in this case (should use 0 instead of
r10):
li %r10,0
li %r11,16
.L13:
stvx %v0,%r10,%r9
stvx %v0,%r11,%r9
addi %r9,%r9,32
bdnz .L13
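
The difference between the two loop shapes can be modelled in portable C. The
sketch below is not altivec code: stvx_ea, clear_scalar_add and clear_indexed
are hypothetical helpers that mimic the effective-address rule of stvx (sum of
the two address operands, a 0 in the first slot meaning literal zero, low four
bits ignored) on byte offsets into a plain buffer, and count the addi
instructions each loop shape would execute.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* stvx vS,rA,rB stores 16 bytes at ((rA|0) + rB) & ~15.
   Addresses are modelled here as byte offsets into a buffer. */
static size_t stvx_ea(size_t ra_or_zero, size_t rb)
{
    return (ra_or_zero + rb) & ~(size_t)15;
}

/* Loop shape from gcc 4.x: second address computed with an extra addi. */
static unsigned clear_scalar_add(unsigned char *mem, size_t r9, unsigned len)
{
    unsigned addi_count = 0;
    assert(len > 0);
    do {
        size_t r11 = r9 + 16;                 /* addi r11,r9,16 */
        addi_count++;
        memset(mem + stvx_ea(0, r9), 0, 16);  /* stvx v0,0,r9   */
        r9 += 32;                             /* addi r9,r9,32  */
        addi_count++;
        memset(mem + stvx_ea(0, r11), 0, 16); /* stvx v0,0,r11  */
    } while (--len);
    return addi_count;
}

/* Desired shape: the constant 16 is loaded into r11 once, outside the loop. */
static unsigned clear_indexed(unsigned char *mem, size_t r9, unsigned len)
{
    const size_t r11 = 16;                     /* li r11,16      */
    unsigned addi_count = 0;
    assert(len > 0);
    do {
        memset(mem + stvx_ea(0, r9), 0, 16);   /* stvx v0,0,r9   */
        memset(mem + stvx_ea(r11, r9), 0, 16); /* stvx v0,r11,r9 */
        r9 += 32;                              /* addi r9,r9,32  */
        addi_count++;
    } while (--len);
    return addi_count;
}
```

Both helpers zero the same 32 bytes per iteration; the second one executes one
addi per iteration instead of two, which is the saving this report asks for.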


-- 
   Summary: [PPC/Altivec, regression?] gcc uses 0 as altivec load/store index
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: enhancement
  Priority: P3
 Component: middle-end
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: sparky at pld-linux dot org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32396



[Bug middle-end/32396] [PPC/Altivec, regression?] gcc uses 0 as altivec load/store index

2007-06-18 Thread sparky at pld-linux dot org


--- Comment #1 from sparky at pld-linux dot org  2007-06-18 21:11 ---
Created an attachment (id=13732)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=13732&action=view)
simple testcase and benchmark

On a 1.3GHz iBook, the benchmark built without USE_ASM runs in 2.335s; built
with USE_ASM it runs in 1.815s.





[Bug middle-end/32401] New: [PPC/Altivec] Non optimal code structure with -mabi=altivec

2007-06-19 Thread sparky at pld-linux dot org
With altivec enabled gcc prepares additional space on the stack. Unlike earlier
versions, gcc 4.3 removes the stack modification instructions if that space
isn't used. With just -maltivec, or with -mabi=altivec when altivec isn't used,
it works very well. But with -mabi=altivec and altivec used, gcc produces code
with a structure similar to the one produced by earlier gcc versions, with just
the stack modification instructions removed. It seems the stack isn't optimized
early enough.


This simple code:

void
test ( int len )
{
if (len) {
vector unsigned char vSetTo = {};
asm volatile ("" : : "v" (vSetTo) ); /* do something */
}
}


"gcc-4.3 -O2 -maltivec -mregnames test.c -S" produces:

test:
cmpwi %cr7,%r3,0
beqlr- %cr7
vxor %v0,%v0,%v0
blr


while "gcc-4.3 -O2 -maltivec -mabi=altivec -mregnames test.c -S" produces:

test:
cmpwi %cr7,%r3,0
beq- %cr7,.L3  # <-- should be beqlr
vxor %v0,%v0,%v0
.L3:
blr


The latter has the same structure as the code produced by earlier gcc versions,
but without the stack modification instructions:

gcc 4.1.3 produces:

test:
cmpwi %cr7,%r3,0
stwu %r1,-16(%r1)
vxor %v0,%v0,%v0
beq- %cr7,.L4
.L4:
addi %r1,%r1,16
blr


-- 
   Summary: [PPC/Altivec] Non optimal code structure with -mabi=altivec
   Product: gcc
   Version: 4.3.0
Status: UNCONFIRMED
  Severity: enhancement
  Priority: P3
 Component: middle-end
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: sparky at pld-linux dot org
 GCC build triplet: powerpc*-linux
  GCC host triplet: powerpc*-linux
GCC target triplet: powerpc*-linux


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32401



[Bug middle-end/32429] New: [PPC, missing optimization] stack space not optimized when stack not used

2007-06-20 Thread sparky at pld-linux dot org
A variety of options prepare some additional space on the stack. It isn't
optimized away when the stack isn't used. Those options are:

version  options
3.3      -fpic -fPIC -mabi=altivec
4.1.2    -fpic -fPIC -fpie -fPIE -maltivec
4.2.0    -fpic -fPIC -fpie -fPIE -maltivec
4.3-pre  -fpic -fPIC -fpie -fPIE

The problem with -maltivec has been partially fixed in 4.3 (see bug 32401).
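
The source of empty.c is not included in this report. Given the label "empty:"
and the bare "blr" in the plain -O2 output, it was presumably a function with
an empty body, along these lines (an assumption, not the reporter's file):

```c
/* empty.c: hypothetical reconstruction, not taken from the report. */
void empty(void)
{
}
```

With such a file, any stwu/addi pair around the blr in the listings is pure
frame setup and teardown, since the function body itself needs no stack.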



$ gcc-3.3 --version
gcc-3.3 (GCC) 3.3.6 (Debian 1:3.3.6-15)

$ gcc-3.3 -S -O2 empty.c && grep -A3 "empty:" empty.s
empty:
blr
.size   empty, .-empty
.section .note.GNU-stack,"",@progbits
$ gcc-3.3 -S -O2 -fpic empty.c && grep -A3 "empty:" empty.s
empty:
stwu 1,-32(1)
addi 1,1,32
blr
$ gcc-3.3 -S -O2 -fPIC empty.c && grep -A3 "empty:" empty.s
empty:
stwu 1,-32(1)
addi 1,1,32
blr
$ gcc-3.3 -S -O2 -maltivec empty.c && grep -A3 "empty:" empty.s
empty:
blr
.size   empty, .-empty
.section .note.GNU-stack,"",@progbits
$ gcc-3.3 -S -O2 -maltivec -mabi=altivec empty.c && grep -A3 "empty:" empty.s
empty:
stwu 1,-16(1)
addi 1,1,16
blr
$ gcc-3.3 -S -O2 -fpic -maltivec -mabi=altivec empty.c && grep -A3 "empty:" empty.s
empty:
stwu 1,-32(1)
addi 1,1,32
blr
$ gcc-3.3 -S -O2 -fPIC -maltivec -mabi=altivec empty.c && grep -A3 "empty:" empty.s
empty:
stwu 1,-32(1)
addi 1,1,32
blr



$ gcc --version
gcc (GCC) 4.1.2 (PLD-Linux)

$ gcc -S -O2 empty.c && grep -A3 "empty:" empty.s 
empty:
blr
.size   empty, .-empty
.ident  "GCC: (GNU) 4.1.2 (PLD-Linux)"
$ gcc -S -O2 -fpic empty.c && grep -A3 "empty:" empty.s 
empty:
stwu 1,-16(1)
addi 1,1,16
blr
$ gcc -S -O2 -fPIC empty.c && grep -A3 "empty:" empty.s 
empty:
stwu 1,-16(1)
addi 1,1,16
blr
$ gcc -S -O2 -fpie empty.c && grep -A3 "empty:" empty.s 
empty:
stwu 1,-16(1)
addi 1,1,16
blr
$ gcc -S -O2 -fPIE empty.c && grep -A3 "empty:" empty.s 
empty:
stwu 1,-16(1)
addi 1,1,16
blr
$ gcc -S -O2 -maltivec empty.c && grep -A3 "empty:" empty.s 
empty:
stwu 1,-16(1)
addi 1,1,16
blr
$ gcc -S -O2 -maltivec -mabi=altivec empty.c && grep -A3 "empty:" empty.s 
empty:
stwu 1,-16(1)
addi 1,1,16
blr
$ gcc -S -O2 -fpic -maltivec -mabi=altivec empty.c && grep -A3 "empty:" empty.s 
empty:
stwu 1,-32(1)
addi 1,1,32
blr
$ gcc -S -O2 -fpie -maltivec -mabi=altivec empty.c && grep -A3 "empty:" empty.s 
empty:
stwu 1,-32(1)
addi 1,1,32
blr



$ gcc-4.3 --version
gcc-4.3 (GCC) 4.3.0 20070615 (experimental)

$ gcc-4.3 -S -O2 empty.c && grep -A3 "empty:" empty.s 
empty:
blr
.size   empty, .-empty
.ident  "GCC: (GNU) 4.3.0 20070615 (experimental)"
$ gcc-4.3 -S -O2 -fpic empty.c && grep -A3 "empty:" empty.s 
empty:
stwu 1,-16(1)
addi 1,1,16
blr
$ gcc-4.3 -S -O2 -fPIC empty.c && grep -A3 "empty:" empty.s 
empty:
stwu 1,-16(1)
addi 1,1,16
blr
$ gcc-4.3 -S -O2 -fpie empty.c && grep -A3 "empty:" empty.s 
empty:
stwu 1,-16(1)
addi 1,1,16
blr
$ gcc-4.3 -S -O2 -fPIE empty.c && grep -A3 "empty:" empty.s 
empty:
stwu 1,-16(1)
addi 1,1,16
blr
$ gcc-4.3 -S -O2 -maltivec empty.c && grep -A3 "empty:" empty.s 
empty:
blr
.size   empty, .-empty
.ident  "GCC: (GNU) 4.3.0 20070615 (experimental)"
$ gcc-4.3 -S -O2 -maltivec -mabi=altivec empty.c && grep -A3 "empty:" empty.s 
empty:
blr
.size   empty, .-empty
.ident  "GCC: (GNU) 4.3.0 20070615 (experimental)"
$ gcc-4.3 -S -O2 -fpic -maltivec -mabi=altivec empty.c && grep -A3 "empty:" empty.s
empty:
stwu 1,-32(1)
addi 1,1,32
blr
$ gcc-4.3 -S -O2 -fPIC -maltivec -mabi=altivec empty.c && grep -A3 "empty:" empty.s
empty:
stwu 1,-32(1)
addi 1,1,32
blr
$ gcc-4.3 -S -O2 -fpie -maltivec -mabi=altivec empty.c && grep -A3 "empty:" empty.s
empty:
stwu 1,-32(1)
addi 1,1,32
blr
$ gcc-4.3 -S -O2 -fPIE -maltivec -mabi=altivec empty.c && grep -A3 "empty:" empty.s
empty:
stwu 1,-32(1)
addi 1,1,32
blr


-- 
   Summary: [PPC, missing optimization] stack space not optimized when stack not used
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: sparky at pld-linux dot org
GCC target triplet: powerpc-linux


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32429



[Bug middle-end/37154] static inline function problem

2008-09-29 Thread sparky at pld-linux dot org


--- Comment #6 from sparky at pld-linux dot org  2008-09-29 21:36 ---
I was trying to isolate the code which triggers this bug, but it seems the
code must be very complex to do so. Nevertheless I found exactly how the
resulting assembler code is broken.
Note: files gsignal.s and gsignal.s-non-inline are switched.

In the .s file with inlining, at line 15045, there's the conditional jump
corresponding to `if (!accumulator)' from the original code, but the actual
comparison of the value with zero is nowhere to be found.

15041 mr 7,31
15042 bl [EMAIL PROTECTED]
15043 .LBB1531:
15044 .LBB1532:
15045 .loc 1 2282 0
15046 beq- 3,.L1255 <--- missing: cmpwi 3,9,0
15047 .LVL1697:
15048 .LBE1532:
15049 .loc 1 2285 0
15050 lwz 9,116(1)


It looks like gcc thinks the comparison at line 14364 is enough. The code does
not do anything with cr3 along the path, but several external functions are
called, which AFAIR are allowed to change the value of cr3.

14360 .loc 1 2333 0
14361 lwz 9,28(22)
14362 .LVL1636:
14363 .loc 1 2334 0
14364 cmpwi 3,9,0
14365 .loc 1 2333 0
14366 stw 9,116(1)
14367 .LVL1637:
14368 .loc 1 2334 0
14369 beq- 3,.L1414


I was playing with newer glib2, so I'm not really sure about this file, but in
my case adding appropriate cmpwi 3,,0 instruction was enough to fix the
code.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37154



[Bug middle-end/37154] static inline function problem

2008-10-01 Thread sparky at pld-linux dot org


--- Comment #10 from sparky at pld-linux dot org  2008-10-01 15:29 ---
In that case the bug report is incorrect.

The problem lies in glibc, in the function lroundl, which does not save cr3.
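
The ABI rule behind this conclusion: in the PowerPC SysV ABI, condition
register fields cr2-cr4 are nonvolatile, so a callee that modifies cr3 must
restore it before returning, and gcc may keep a comparison result in cr3
across a call. A toy C model (all names hypothetical, one EQ bit per CR field)
shows how a callee that fails to do so makes the later beq- see the wrong
result:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model: only the EQ bit of each condition-register field cr0..cr7. */
struct cpu { bool cr_eq[8]; };

/* cmpwi crf,value,0: set the EQ bit of field crf when value is zero. */
static void cmpwi_zero(struct cpu *c, int crf, int value)
{
    assert(crf >= 0 && crf < 8);
    c->cr_eq[crf] = (value == 0);
}

/* Callee that uses cr3 as scratch without saving it (the ABI violation). */
static void callee_clobbers_cr3(struct cpu *c)
{
    cmpwi_zero(c, 3, 0);          /* leaves EQ set in cr3 */
}

/* Conforming callee: saves cr3 on entry and restores it on exit. */
static void callee_preserves_cr3(struct cpu *c)
{
    bool saved = c->cr_eq[3];     /* e.g. mfcr in a real prologue  */
    cmpwi_zero(c, 3, 0);
    c->cr_eq[3] = saved;          /* e.g. mtcrf in a real epilogue */
}

/* Caller pattern from the listing: compare once, call, branch on cr3 later.
   Returns true if the beq- would be taken. */
static bool branch_taken_after_call(int accumulator,
                                    void (*callee)(struct cpu *))
{
    struct cpu c = {{false}};
    cmpwi_zero(&c, 3, accumulator);   /* cmpwi 3,9,0   */
    callee(&c);                       /* bl ...        */
    return c.cr_eq[3];                /* beq- 3,.L...  */
}
```

With a nonzero accumulator the branch should not be taken; in this model it is
taken only when the callee clobbers cr3, which matches the observed
misbehaviour without any error on gcc's side.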





[Bug middle-end/37154] static inline function problem

2008-10-01 Thread sparky at pld-linux dot org


--- Comment #11 from sparky at pld-linux dot org  2008-10-01 16:47 ---
Note: the glibc problem has already been detected and fixed:
- bug: https://bugzilla.redhat.com/show_bug.cgi?id=450790
- fix: http://www.sourceware.org/ml/libc-hacker/2008-06/msg1.html

