https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94538

--- Comment #11 from Christophe Lyon <clyon at gcc dot gnu.org> ---
(In reply to Wilco from comment #10)
> 
> For example:
> 
> int x;
> int f1 (void) { return x; }
> 
> with eg. -O2 -mcpu=cortex-m0 -mpure-code I get:
> 
>         movs    r3, #:upper8_15:#.LC1
>         lsls    r3, #8
>         adds    r3, #:upper0_7:#.LC1
>         lsls    r3, #8
>         adds    r3, #:lower8_15:#.LC1
>         lsls    r3, #8
>         adds    r3, #:lower0_7:#.LC1
>         @ sp needed
>         ldr     r3, [r3]
>         ldr     r0, [r3, #40]
>         bx      lr
> 
> That's an extra indirection through a literal... There should only be
> one ldr to read x.

Right, but the code is functional. I mentioned that problem when I submitted
the patch; I thought it was better to provide the functionality first and
improve the generated code later.
I wrote: "I haven't found yet how to make code for cortex-m0 apply upper/lower
relocations to "p" instead of .LC2. The current code looks functional, but
could be improved."

> 
> Big switch tables are produced for any Thumb-1 core, however I would expect
> Cortex-m0/m23 versions to look almost identical to the Cortex-m3 one, and
> use a sequence of comparisons instead of tables.
> 
> int f2 (int x, int y)
> {
>   switch (x)
>   {
>     case 0: return y + 0;
>     case 1: return y + 1;
>     case 2: return y + 2;
>     case 3: return y + 3;
>     case 4: return y + 4;
>     case 5: return y + 5;
>   }
>   return y;
> }
> 

I believe this is expected: as I wrote in my commit message
"CASE_VECTOR_PC_RELATIVE is now false with -mpure-code, to avoid generating
invalid assembly code with differences from symbols from two different sections
(the difference cannot be computed by the assembler)."

Maybe this can be tuned to detect cases where we can do better?


> Immediate generation for common cases seems to be screwed up:
> 
> int f3 (void) { return 0x11000000; }
> 
> -O2 -mcpu=cortex-m0 -mpure-code:
> 
>         movs    r0, #17
>         lsls    r0, r0, #8
>         lsls    r0, r0, #8
>         lsls    r0, r0, #8
>         bx      lr

This is not optimal, but functional, right?


> This also regressed Cortex-m23 which previously generated:
> 
>         movs    r0, #136
>         lsls    r0, r0, #21
>         bx      lr
> Similar regressions happen with other immediates:
> 
> int f3 (void) { return 0x12345678; }
> 
> -O2 -mcpu=cortex-m23 -mpure-code:
> 
>         movs    r0, #86
>         lsls    r0, r0, #8
>         adds    r0, r0, #120
>         movt    r0, 4660
>         bx      lr
> 
> Previously it was:
> 
>         movw    r0, #22136
>         movt    r0, 4660
>         bx      lr

OK, I'll check how to fix that.


> Also relocations with a small offset should be handled within the
> relocation. I'd expect this to never generate an extra addition, let alone
> an extra literal pool entry:
> 
> int arr[10];
> int *f4 (void) { return &arr[1]; }
> 
> -O2 -mcpu=cortex-m3 -mpure-code generates the expected:
> 
>         movw    r0, #:lower16:.LANCHOR0+4
>         movt    r0, #:upper16:.LANCHOR0+4
>         bx      lr
> 
> -O2 -mcpu=cortex-m23 -mpure-code generates this:
> 
>         movw    r0, #:lower16:.LANCHOR0
>         movt    r0, #:upper16:.LANCHOR0
>         adds    r0, r0, #4
>         bx      lr

For cortex-m23, I get the same code with and without -mpure-code.

> 
> And cortex-m0 again inserts an extra literal load:
> 
>         movs    r3, #:upper8_15:#.LC0
>         lsls    r3, #8
>         adds    r3, #:upper0_7:#.LC0
>         lsls    r3, #8
>         adds    r3, #:lower8_15:#.LC0
>         lsls    r3, #8
>         adds    r3, #:lower0_7:#.LC0
>         ldr     r0, [r3]
>         adds    r0, r0, #4
>         bx      lr

Yes, the same problem as in f1().


So I think -mpure-code for v6m is not broken, though the generated code can
certainly be improved. Maybe this is not really relevant to this PR?
