[Bug target/27308] Compiler generates incorrect code when calling a function with the result of an inline function as parameter

2006-04-26 Thread Eric dot Doenges at betty-tv dot com


--- Comment #6 from Eric dot Doenges at betty-tv dot com  2006-04-26 06:26 ---
Unfortunately, removing the __asm__ ("r0") from __r0 does not circumvent the
problem.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27308



[Bug c/27305] New: Compiler generates incorrect code when calling functions

2006-04-25 Thread Eric dot Doenges at betty-tv dot com
Consider the following code:

typedef unsigned int  UINT32;
typedef unsigned char BOOL;
#define __SWI_BIOS_ContainerUsage  1234

#define __swicall1(type,name,type1,arg1) \
  static inline type name(type1 arg1) { \
    register long __r0 __asm__ ("r0") = (long)arg1; \
    register long __res __asm__ ("r0"); \
    __asm__ __volatile__ ("swi\t%2\n\t" \
      : "=r" (__res) \
      : "0" (__r0), "i" (__SWI_##name) \
      : "r1", "r2", "r3", "ip", "lr", "cc", \
        "memory"); \
    return((type)__res); \
  }
__swicall1(UINT32,BIOS_ContainerUsage,BOOL,verbose);
int sprintf(char *p, const char *frmt, ...);

void testme(char *tmp)
{
  sprintf(tmp, "%d%% Containers\n", BIOS_ContainerUsage(1));
  sprintf(tmp, "%d%% Containers\n", 2 * BIOS_ContainerUsage(1));
}


-- 
   Summary: Compiler generates incorrect code when calling functions
   Product: gcc
   Version: 4.1.0
Status: UNCONFIRMED
  Severity: blocker
  Priority: P3
 Component: c
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: Eric dot Doenges at betty-tv dot com
  GCC host triplet: powerpc-apple-darwin8.5.0
GCC target triplet: arm-elf-unknown


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27305



[Bug c/27308] New: Compiler generates incorrect code when calling a function with the result of an inline function as parameter

2006-04-25 Thread Eric dot Doenges at betty-tv dot com
Consider the following code:


typedef unsigned int  UINT32;
typedef unsigned char BOOL;
#define __SWI_BIOS_ContainerUsage  1234

#define __swicall1(type,name,type1,arg1) \
  static inline type name(type1 arg1) { \
    register long __r0 __asm__ ("r0") = (long)arg1; \
    register long __res __asm__ ("r0"); \
    __asm__ __volatile__ ("swi\t%2\n\t" \
      : "=r" (__res) \
      : "0" (__r0), "i" (__SWI_##name) \
      : "r1", "r2", "r3", "ip", "lr", "cc", \
        "memory"); \
    return((type)__res); \
  }
__swicall1(UINT32,BIOS_ContainerUsage,BOOL,verbose);
int sprintf(char *p, const char *frmt, ...);

void testme(char *tmp)
{
  sprintf(tmp, "%d%% Containers\n", BIOS_ContainerUsage(1));
  sprintf(tmp, "%d%% Containers\n", 2 * BIOS_ContainerUsage(1));
}

For the first call to sprintf, gcc generates the following assembler code:

mov r0, #1
swi #1234

ldr r5, .L3
mov r0, r4
mov r1, r5
mov r2, r4
bl  sprintf

This is clearly wrong, since r2 should hold the result of the swi (which is
returned in r0); instead, r2 is loaded from r4. For the second call to sprintf,
gcc generates correct code:

mov r0, #1
swi #1234

mov r2, r0, asl #1
mov r1, r5
mov r0, r4
ldmfd   sp!, {r4, r5, lr}
b   sprintf


-- 
   Summary: Compiler generates incorrect code when calling a
function with the result of an inline function as
parameter
   Product: gcc
   Version: 4.1.0
Status: UNCONFIRMED
  Severity: blocker
  Priority: P3
 Component: c
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: Eric dot Doenges at betty-tv dot com
  GCC host triplet: powerpc-apple-darwin8.5.0
GCC target triplet: arm-elf-unknown


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27308



[Bug c/27308] Compiler generates incorrect code when calling a function with the result of an inline function as parameter

2006-04-25 Thread Eric dot Doenges at betty-tv dot com


--- Comment #3 from Eric dot Doenges at betty-tv dot com  2006-04-25 14:37 ---
Storing the result to memory generates correct code.
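
A rough sketch of what "storing the result to memory" could look like is given
here. It is only an illustration under stated assumptions: the function name
format_usage is made up, and BIOS_ContainerUsage is replaced by a hypothetical
host-side stand-in (returning a dummy value) so that the snippet builds without
the swi. On the real target the wrapper generated by __swicall1 would be used
instead. The format uses %u because UINT32 is unsigned.

```c
#include <assert.h>
#include <stdio.h>

typedef unsigned int  UINT32;
typedef unsigned char BOOL;

/* Hypothetical stand-in for the wrapper generated by __swicall1; on the
 * real target this would be the inline function containing the swi. */
static UINT32 BIOS_ContainerUsage(BOOL verbose)
{
    return verbose ? 42u : 0u;
}

/* Workaround sketch: route the result through a volatile local so that it
 * is written to memory and read back before being passed to sprintf. */
static int format_usage(char *tmp)
{
    volatile UINT32 usage;

    assert(tmp != NULL);
    usage = BIOS_ContainerUsage(1);
    return sprintf(tmp, "%u%% Containers\n", usage);
}
```

Whether a volatile temporary is needed, or a plain store to a global is enough,
has not been checked against gcc 4.1.0 here.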


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27308



[Bug c/27308] Compiler generates incorrect code when calling a function with the result of an inline function as parameter

2006-04-25 Thread Eric dot Doenges at betty-tv dot com


--- Comment #4 from Eric dot Doenges at betty-tv dot com  2006-04-25 14:43 ---
Removing the __asm__ ("r0") from __res works around the bug - but then I cannot
depend on gcc always allocating r0 for __res, can I? I found no other way to
tell gcc which registers it must use. I'm assuming this is a bug in gcc, not
the asm constraint, because the same code works flawlessly with gcc-3.4.3.

As to simplifying the testcase - storing the result of BIOS_ContainerUsage to
memory generates correct code regardless of whether __res is forced to r0 or
not, making it worthless as a test case.
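
For comparison, one commonly seen way to express "argument in r0, result in r0"
is a single register variable used as a read-write ("+r") operand, so that only
one variable is bound to r0. The following is a sketch under assumptions, not a
verified fix: it has not been tested against the gcc 4.1.0 behaviour reported
here, the macro HAVE_BIOS_SWI is a hypothetical switch that enables the real
swi, and without it a dummy stub is compiled so the snippet builds on any host.

```c
#include <assert.h>

typedef unsigned int  UINT32;
typedef unsigned char BOOL;

#define SWI_BIOS_ContainerUsage 1234

static inline UINT32 BIOS_ContainerUsage(BOOL verbose)
{
#if defined(__arm__) && defined(HAVE_BIOS_SWI)
    /* One variable pinned to r0 carries the argument in and the result out. */
    register long r0 __asm__ ("r0") = (long)verbose;

    __asm__ __volatile__ ("swi\t%1"
        : "+r" (r0)
        : "i" (SWI_BIOS_ContainerUsage)
        : "r1", "r2", "r3", "ip", "lr", "cc", "memory");
    return (UINT32)r0;
#else
    /* Hypothetical stand-in used when the real swi is not available. */
    return verbose ? 42u : 0u;
#endif
}
```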


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27308



[Bug c/27016] New: ARM optimizer produces severely suboptimal code

2006-04-04 Thread Eric dot Doenges at betty-tv dot com
The compiler creates extremely bad code for the ARM target.
Consider the following source file:

--- SNIP ---
unsigned int code_in_ram[100];

void testme(void)
{
  unsigned int *p_rom, *p_ram, *p_end, len;

  extern unsigned int _ram_erase_sector_start;
  extern unsigned int _ram_erase_sector_end;


  p_ram = code_in_ram;
  p_rom = &_ram_erase_sector_start;
  len = ((unsigned int)&_ram_erase_sector_end
         - (unsigned int)&_ram_erase_sector_start) / sizeof(unsigned int);

  for (p_rom = &_ram_erase_sector_start, p_end = &_ram_erase_sector_end;
       p_rom < p_end;) {
    *p_ram++ = *p_rom++;
  }
}
--- SNIP ---

Compiled with arm-elf-gcc -mcpu=arm7tdmi -S -Os testme.c, we get the following
code:

--- SNIP ---
.file   "testme.c"
.text
.align  2
.global testme
.type   testme, %function
testme:
@ args = 0, pretend = 0, frame = 0
@ frame_needed = 0, uses_anonymous_args = 0
@ link register save eliminated.
ldr r1, .L6
ldr r2, .L6+4
@ lr needed for prologue
b   .L2
.L3:
ldr r3, [r1], #4
str r3, [r2, #-4]
.L2:
ldr r3, .L6+8
cmp r1, r3
add r2, r2, #4
bcc .L3
bx  lr
.L7:
.align  2
.L6:
.word   _ram_erase_sector_start
.word   code_in_ram
.word   _ram_erase_sector_end
.size   testme, .-testme
.comm   code_in_ram,400,4
.ident  "GCC: (GNU) 4.1.0"
--- SNIP ---

Even a cursory examination reveals that it would be a lot better to write:

ldr r1, .L6
ldr r2, .L6+4
ldr r0, .L6+8
b   .L2

.L3:
ldr r3, [r1], #4
str r3, [r2], #4
.L2:
cmp r1, r0
bcc .L3
bx  lr

This code would be one instruction shorter overall, and two instructions
shorter in the loop. The way gcc-4.1.0 refuses to use post-indexed addressing
for the store is especially bizarre, since it does use post-indexed addressing
for the preceding load. Gcc 3.4.3 does not exhibit this behaviour; it compiles
the above code to:

   ldr r2, .L6
   ldr r0, .L6+4
   cmp r2,r0
   ldr r1, .L6
   movcs pc,lr

.L4:
   ldr r2,[r2],#4
   cmp r2, r0
   str r3,[r1],#4
   bcc .L4
   mov pc,lr

While not perfect either, this also only has 4 instructions in the loop.
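
For reference, the copy loop can be isolated into a small helper that takes the
bounds as parameters. The shape *dst++ = *src++ with one pointer compare per
iteration is the source-level counterpart of the four-instruction loop
discussed above. The name copy_words and its parameters are illustrative and
not part of the original test case, and no claim is made about what any
particular gcc version emits for it.

```c
#include <assert.h>
#include <stddef.h>

/* Copy the words in [src, src_end) to dst; returns one past the last
 * word written. */
static unsigned int *copy_words(unsigned int *dst,
                                const unsigned int *src,
                                const unsigned int *src_end)
{
    assert(dst != NULL && src != NULL && src_end != NULL);
    while (src < src_end) {
        *dst++ = *src++;
    }
    return dst;
}
```

Assuming the two linker symbols were declared as arrays (for example
extern unsigned int _ram_erase_sector_start[];), the call would read
copy_words(code_in_ram, _ram_erase_sector_start, _ram_erase_sector_end).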


-- 
   Summary: ARM optimizer produces severely suboptimal code
   Product: gcc
   Version: 4.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: Eric dot Doenges at betty-tv dot com
  GCC host triplet: powerpc-apple-darwin8.5.0
GCC target triplet: arm-elf-unknown


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27016



[Bug middle-end/27016] ARM optimizer produces severely suboptimal code

2006-04-04 Thread Eric dot Doenges at betty-tv dot com


--- Comment #2 from Eric dot Doenges at betty-tv dot com  2006-04-04 09:13 ---
(In reply to comment #1)
> This code is undefined:
>   len = ((unsigned int)&_ram_erase_sector_end
>          - (unsigned int)&_ram_erase_sector_start) / sizeof(unsigned int);
>
> That is obviously undefined as taking the difference between two pointers
> which are not in the same array is undefined code.
>
> Even the comparison:
> p_rom < p_end;
> is undefined.
 
In the code I took this snippet from, _ram_erase_sector_start and
_ram_erase_sector_end are symbols generated by the linker at the start and the
end of a special segment which I need to copy to RAM, so I would argue that
these pointers do in fact refer to the same array (in this case, the array is
the entire flash memory).
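
One way to sidestep the pointer-arithmetic question altogether is to compute
the length on integer values rather than on pointers. A minimal sketch,
assuming uintptr_t is available (it is optional in C99) and that both addresses
lie in one flat address space; words_between is a hypothetical helper name. The
pointer-to-integer conversion is implementation-defined, so this is
target-specific rather than strictly portable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of unsigned int sized words between two addresses, computed on
 * integers so that no pointer subtraction or comparison takes place. */
static size_t words_between(const void *start, const void *end)
{
    uintptr_t lo = (uintptr_t)start;
    uintptr_t hi = (uintptr_t)end;

    assert(hi >= lo);
    return (size_t)(hi - lo) / sizeof(unsigned int);
}
```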

However, none of this should affect the decision to use (or not to use) the
post-indexed addressing mode. If I replace the for loop with a
for (len = 100; len > 0; --len), the quality of the generated code actually
degrades even further:

ldr r2, .L7
ldr r1, .L7+4
@ lr needed for prologue
.L2:
ldr r3, [r1, #-4]
str r3, [r2, #-4]
ldr r3, .L7+8
add r2, r2, #4
cmp r2, r3
add r1, r1, #4
bne .L2
bx  lr

While I think it's nifty that gcc recognizes that it doesn't need to keep the
len variable, but instead uses p_ram to determine when the loop is finished, I
also think it's pretty brain-dead that it won't use post-indexed addressing for
either the ldr or str in the loop. And why it thinks it needs to load the
constant end address to compare against every time inside the loop instead of
once into a scratch register outside the loop is anyone's guess.
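
At the source level, the transformation described above (dropping the counter
and comparing the destination pointer against an end address computed once
before the loop) could be written by hand roughly as follows. This is a sketch
with made-up names (copy_n_words, dst_end); it only illustrates the loop shape
and says nothing about the code a given gcc version generates for it.

```c
#include <assert.h>
#include <stddef.h>

/* Copy n words from src to dst. The end address is computed once, outside
 * the loop, and the loop itself only post-increments the two pointers. */
static void copy_n_words(unsigned int *dst, const unsigned int *src, size_t n)
{
    unsigned int *dst_end;

    assert(dst != NULL && src != NULL);
    dst_end = dst + n;
    while (dst != dst_end) {
        *dst++ = *src++;
    }
}
```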


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27016