https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94538
--- Comment #11 from Christophe Lyon <clyon at gcc dot gnu.org> ---
(In reply to Wilco from comment #10)
>
> For example:
>
> int x;
> int f1 (void) { return x; }
>
> with eg. -O2 -mcpu=cortex-m0 -mpure-code I get:
>
>         movs    r3, #:upper8_15:#.LC1
>         lsls    r3, #8
>         adds    r3, #:upper0_7:#.LC1
>         lsls    r3, #8
>         adds    r3, #:lower8_15:#.LC1
>         lsls    r3, #8
>         adds    r3, #:lower0_7:#.LC1
>         @ sp needed
>         ldr     r3, [r3]
>         ldr     r0, [r3, #40]
>         bx      lr
>
> That's an extra indirection through a literal... There should only be
> one ldr to read x.

Right, but the code is functional. I mentioned that problem when I submitted
the patch; I thought it was good to provide the functionality first and
improve the generated code later. I wrote:
"I haven't found yet how to make code for cortex-m0 apply upper/lower
relocations to "p" instead of .LC2. The current code looks functional, but
could be improved."

> Big switch tables are produced for any Thumb-1 core, however I would expect
> Cortex-m0/m23 versions to look almost identical to the Cortex-m3 one, and
> use a sequence of comparisons instead of tables.
>
> int f2 (int x, int y)
> {
>   switch (x)
>     {
>     case 0: return y + 0;
>     case 1: return y + 1;
>     case 2: return y + 2;
>     case 3: return y + 3;
>     case 4: return y + 4;
>     case 5: return y + 5;
>     }
>   return y;
> }

I believe this is expected: as I wrote in my commit message,
"CASE_VECTOR_PC_RELATIVE is now false with -mpure-code, to avoid generating
invalid assembly code with differences from symbols from two different
sections (the difference cannot be computed by the assembler)."
Maybe there's a possibility to tune this to detect cases where we can do
better?

> Immediate generation for common cases seems to be screwed up:
>
> int f3 (void) { return 0x11000000; }
>
> -O2 -mcpu=cortex-m0 -mpure-code:
>
>         movs    r0, #17
>         lsls    r0, r0, #8
>         lsls    r0, r0, #8
>         lsls    r0, r0, #8
>         bx      lr

This is not optimal, but it is functional, right?
> This also regressed Cortex-m23 which previously generated:
>
>         movs    r0, #136
>         lsls    r0, r0, #21
>         bx      lr
>
> Similar regressions happen with other immediates:
>
> int f3 (void) { return 0x12345678; }
>
> -O2 -mcpu=cortex-m23 -mpure-code:
>
>         movs    r0, #86
>         lsls    r0, r0, #8
>         adds    r0, r0, #120
>         movt    r0, 4660
>         bx      lr
>
> Previously it was:
>
>         movw    r0, #22136
>         movt    r0, 4660
>         bx      lr

OK, I'll check how to fix that.

> Also relocations with a small offset should be handled within the
> relocation. I'd expect this to never generate an extra addition, let alone
> an extra literal pool entry:
>
> int arr[10];
> int *f4 (void) { return &arr[1]; }
>
> -O2 -mcpu=cortex-m3 -mpure-code generates the expected:
>
>         movw    r0, #:lower16:.LANCHOR0+4
>         movt    r0, #:upper16:.LANCHOR0+4
>         bx      lr
>
> -O2 -mcpu=cortex-m23 -mpure-code generates this:
>
>         movw    r0, #:lower16:.LANCHOR0
>         movt    r0, #:upper16:.LANCHOR0
>         adds    r0, r0, #4
>         bx      lr

For cortex-m23, I get the same code with and without -mpure-code.

> And cortex-m0 again inserts an extra literal load:
>
>         movs    r3, #:upper8_15:#.LC0
>         lsls    r3, #8
>         adds    r3, #:upper0_7:#.LC0
>         lsls    r3, #8
>         adds    r3, #:lower8_15:#.LC0
>         lsls    r3, #8
>         adds    r3, #:lower0_7:#.LC0
>         ldr     r0, [r3]
>         adds    r0, r0, #4
>         bx      lr

Yes, same problem as in f1().

So I think -mpure-code for v6m is not broken, but yes, the generated code can
be improved. So this may not be relevant to this PR?