[Bug target/82150] Produces a branch prefetch which causes a hang
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82150

david.welch at netronome dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |UNCONFIRMED
         Resolution|INVALID                     |---

--- Comment #10 from david.welch at netronome dot com ---
How do I get some feedback on this? Do I need to create a new ticket?

This is not about a system hang; it is about GCC output that causes data to be executed as code in the pipeline. It was detected through a hang, but perfectly valid address spaces are affected. Quite clearly a gcc bug: the root cause is that GCC is feeding data into the pipeline to be executed. Just because ARM didn't publish it doesn't mean their core is without other undocumented problems. The MMU is too late, the data has already started to execute, so that is at best a hack, not a solution.
[Bug target/82150] New: Produces a branch prefetch which causes a hang
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82150

            Bug ID: 82150
           Summary: Produces a branch prefetch which causes a hang
           Product: gcc
           Version: 7.2.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: david.welch at netronome dot com
  Target Milestone: ---

export TARGET=arm-none-eabi
../gcc-$GCCVER/configure --target=$TARGET --prefix=$PREFIX --without-headers --with-newlib --with-gnu-as --with-gnu-ld --enable-languages='c'

but we first found this on a 4.8.3; I don't have a reason to assume it applies to all versions.

Take something like this:

unsigned int more_fun ( unsigned int );
unsigned int fun ( void )
{
    return(more_fun(0x12344700)+1);
}

arm-none-eabi-gcc -mthumb -march=armv6 -O2 -c so.c -o so.o

:
   0: b510       push {r4, lr}
   2: 4802       ldr r0, [pc, #8] ; (c )
   4: f7ff fffe  bl 0
   8: 3001       adds r0, #1
   a: bd10       pop {r4, pc}
   c: 12344700   eorsne r4, r4, #0, 14

And there is the problem. This is not limited to thumb mode:

unsigned int more_fun ( unsigned int );
unsigned int fun ( void )
{
    return(more_fun(0xe12fff10)+1);
}

:
   0: e92d4010   push {r4, lr}
   4: e59f0008   ldr r0, [pc, #8] ; 14
   8: ebfe       bl 0
   c: e281       add r0, r0, #1
  10: e8bd8010   pop {r4, pc}
  14: e12fff10   bx r0

Same problem.

Found on an arm11 mpcore, but I assume that the older arm11s and possibly even armv7s have this issue; will see when I get there.

The core does not see pop pc as an unconditional branch. It continues to process the instructions in the pipe while the pop is finishing, and it prefetches the address in r0 in both of the above cases, because the DATA that follows the pop happens to resemble an instruction, specifically bx, but I wonder if other instructions are a problem as well. The prefetch reads the fetch line at whatever address is in the register that happens to be encoded. This can cause a read of peripherals which are clear on read, or pull a byte out of a uart, or in our case touch an address that doesn't answer on the axi bus and hang the processor.

Now, because the armv4t didn't support mode switching with a pop, using -march=armv4t produces code that doesn't cause the processor to fail.

:
   0: b510       push {r4, lr}
   2: 4803       ldr r0, [pc, #12] ; (10 )
   4: f7ff fffe  bl 0
   8: 3001       adds r0, #1
   a: bc10       pop {r4}
   c: bc02       pop {r1}
   e: 4708       bx r1
  10: 12344700   eorsne r4, r4, #0, 14

I can't possibly be the first person to see this after all of these years (and although I can't think offhand of another instruction set where the pc is also treated like a GPR, there are other targets that are affected), so I am hoping there is already a command line switch other than downgrading wholesale to armv4t. If not, can we add a command line switch to avoid this problem?

I would think a branch to self instruction following the pop would work. Or, like armv4t, don't pop into the pc: in arm, pop to lr and then bx lr; in thumb, as you do in armv4t, pop to r0-r3 and bx to that.
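To make the coincidence in the report above concrete, here is a minimal sketch that checks whether a literal-pool value decodes as a BX. It assumes the standard ARMv6 encodings (ARM: cond 0001 0010 1111 1111 1111 0001 Rm; Thumb: 0100 0111 0 Rm 000) and a little-endian target, so the low 16 bits of a pool word are the first halfword fetched in Thumb state. The function names are hypothetical and not part of GCC or binutils.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* ARM state: BX Rm is cond 0001 0010 1111 1111 1111 0001 Rm.
   The condition field (bits 31..28) is ignored here. */
bool is_arm_bx(uint32_t word)
{
    return (word & 0x0FFFFFF0u) == 0x012FFF10u;
}

/* Thumb state: BX Rm is 0100 0111 0 Rm(4 bits) 000. */
bool is_thumb_bx(uint16_t halfword)
{
    return (halfword & 0xFF87u) == 0x4700u;
}

/* Assumption: little-endian memory, so the low half of a 32-bit
   literal sits at the lower address and is fetched first. */
uint16_t first_halfword_le(uint32_t literal)
{
    return (uint16_t)(literal & 0xFFFFu);
}

/* The two constants from the report both look like "bx r0". */
void check_report_constants(void)
{
    assert(is_thumb_bx(first_halfword_le(0x12344700u)));
    assert(is_arm_bx(0xe12fff10u));
}
```

Calling `check_report_constants()` passes for both constants, while ordinary instructions from the listings such as `0x3001` (adds r0, #1) or `0xe8bd8010` (pop {r4, pc}) are not classified as BX.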
[Bug target/82150] Produces a branch prefetch which causes a hang
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82150

--- Comment #2 from david.welch at netronome dot com ---
ARM does not have an erratum on this for this core, from what I was given. I don't know why they would; at best it would fall into the "unpredictable results" category. Errata or not, I was hoping there could be an option, if there is not one already. The armv4t one is an option, but I would assume it affects more than just this one thing (I don't know gcc internals), so it is too big of a hammer.
[Bug target/82150] Produces a branch prefetch which causes a hang
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82150

--- Comment #3 from david.welch at netronome dot com ---
The problem exists as well with ldr pc,[something]. I have not dug through gcc, but I did some compilation experiments, not nearly enough to be 100% sure. For switch statements the code generated always appears to do a comparison (perhaps after a subtract or other modification), then an ldrls pc,[], then an unconditional branch to deal with the last item (or a default). If that is always the rule, that is safe. And for a function table, an array of function pointers, it did the math using gprs and then a mov lr,pc ; bx rn. An ldr pc,[] followed by literal pool data will cause this undesired prefetch.
[Bug target/82150] Produces a branch prefetch which causes a hang
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82150

--- Comment #5 from david.welch at netronome dot com ---
It is definitely doing prefetching by not realizing those instructions are unconditional branches. We are most likely going with strongly ordered rather than the XN bit, but noted as a workaround.

Since the armv4t does not support the pop pc and there are runtime flags, I wanted to first know what options are there, or whether they would have to be added. What other cores have been reported as having this issue, and were there any compiler additions made for them?
[Bug target/82150] Produces a branch prefetch which causes a hang
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82150

--- Comment #7 from david.welch at netronome dot com ---
This is an armv6, not an armv7. So far I have not seen that the mmu or cache or branch prediction is required for proper operation of the core. I have so far not seen this on other cores, but I am still working on that; it is very much present on this core. I would rather not have to use the mmu as a kludge.
[Bug target/82150] Produces a branch prefetch which causes a hang
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82150

--- Comment #8 from david.welch at netronome dot com ---
gcc is treating these instructions as unconditional branches, but the core does NOT treat these instructions as unconditional branches. The disconnect between the code produced and the core behavior is quite clear. Kludges and workarounds are interesting, but given the volume of other similar situations that gcc has responded to in its code generation, the response here is confusing. Why generate code that works for the core in one case but not in another? Can you please elaborate?
[Bug target/82150] Produces a branch prefetch which causes a hang
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82150

--- Comment #9 from david.welch at netronome dot com ---
Basically gcc is generating a sequence where data starts to execute in the pipe. I can't imagine it is a good idea to let the processor execute data when you can avoid it. Instead of a pop {...pc} ; some data, a pop {...lr} ; bx lr creates a data hazard: the bx doesn't execute until the register change has resolved. Other cores might not execute the words after a pop in the pipeline if pc is one of the popped values, but this core does. Patching this instruction sequence after the execution has started is just a kludge.
[Bug target/82150] Produces a branch prefetch which causes a hang
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82150

--- Comment #11 from david.welch at netronome dot com ---
I wish I had known this when I filed this ticket: there is an ARM erratum for this issue that was issued in or before 2009.

720247: Speculative Instruction fetches can be made anywhere in the memory map

I have researched this bug on this core and provided a workaround that ARM was not able or willing to provide (put a nop after unconditional branch instructions that modify the pc like pop {r4,pc}, but not bx lr..., anything other than another branch instruction that causes a speculative fetch).

So if you require an ARM erratum in order to fix something, there you go, it exists.

It is still present in gcc 10 (and has been present all this time). I have not examined gcc 11 yet as it has not been formally released.

unsigned int more_fun ( unsigned int );
unsigned int fun ( void )
{
    return(more_fun(0x12344700)+1);
}

Disassembly of section .text:

:
   0: b510       push {r4, lr}
   2: 4802       ldr r0, [pc, #8] ; (c )
   4: f7ff fffe  bl 0
   8: 3001       adds r0, #1
   a: bd10       pop {r4, pc}
   c: 12344700   .word 0x12344700

.thumb
.inst.n 0x4700

Disassembly of section .text:

<.text>:
   0: 4700       bx r0

And there is the speculative execution that causes a read (which can be anywhere in the address space).

arm-none-eabi-gcc --version
arm-none-eabi-gcc (GCC) 10.2.0
Copyright (C) 2020 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

One could examine everything after a branch like this for another branch, either as a real instruction or embedded in the top of the pool; a nop after each of the at-risk instructions may be simpler.
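The "examine everything after a branch like this" idea could be sketched roughly as follows for Thumb code. This is only an illustration under assumptions, not an existing GCC or binutils feature: the function names and the `window` parameter are hypothetical, the input is assumed to be a flat array of 16-bit Thumb halfwords (ARMv6, no Thumb-2), and how far ahead the core actually looks is not stated in this thread, so the look-ahead distance is left to the caller. A `b .` (0xe7fe) is treated as a barrier, since a branch to self is the workaround suggested in the original report.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Thumb POP with pc in the register list: 1011 1101 rrrr rrrr. */
int is_thumb_pop_pc(uint16_t hw) { return (hw & 0xFF00u) == 0xBD00u; }

/* Thumb BX Rm: 0100 0111 0 Rm(4 bits) 000. */
int is_thumb_bx(uint16_t hw) { return (hw & 0xFF87u) == 0x4700u; }

/* Thumb "b ." (branch to self), assumed here to stop the prefetch. */
int is_thumb_b_self(uint16_t hw) { return hw == 0xE7FEu; }

/* Count pop {..., pc} halfwords that are followed, within `window`
   halfwords, by something that decodes as BX (instruction or pool data).
   The scan after each pop stops early at a branch to self. */
size_t count_at_risk(const uint16_t *code, size_t n, size_t window)
{
    assert(code != NULL || n == 0);
    size_t hits = 0;
    for (size_t i = 0; i < n; i++) {
        if (!is_thumb_pop_pc(code[i]))
            continue;
        for (size_t j = i + 1; j < n && j - i <= window; j++) {
            if (is_thumb_b_self(code[j]))
                break;          /* barrier: nothing beyond it is counted */
            if (is_thumb_bx(code[j])) {
                hits++;
                break;
            }
        }
    }
    return hits;
}
```

Fed the halfwords of the -march=armv6 listing (b510 4802 f7ff fffe 3001 bd10 4700 1234), `count_at_risk` with a window of 1 reports one hit; the -march=armv4t listing contains no pop into pc and reports none. Because the scan does not distinguish code from data, a pool word that happens to look like 0xBDxx would also be treated as a pop, which a real tool would need to handle.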
[Bug target/82150] Produces a branch prefetch which causes a hang
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82150

--- Comment #12 from david.welch at netronome dot com ---
In my case this was found with a hang, but the problem exists as a read, which means it can cause a read to a read-sensitive peripheral, causing adverse effects.
[Bug target/82150] Produces a branch prefetch which causes a hang
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82150

--- Comment #13 from david.welch at netronome dot com ---
Very sorry, it has been years since I did this research: a simple nop won't fix it, but a branch to self will.

bad

TEST:
    push {r4,lr}
    pop {r4,pc}
    bx r0 /*.hword 0x4700*/
    nop
    nop

bad

TEST:
    push {r4,lr}
    pop {r4,pc}
    nop
    bx r0 /*.hword 0x4700*/
    nop
    nop

good

TEST:
    push {r4,lr}
    pop {r4,pc}
    b .
    bx r0 /*.hword 0x4700*/
    nop
    nop
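Until the compiler offers something, the "good" case above could in principle be applied mechanically to compiler output. The following is a rough sketch only, assuming the assembly comes from something like `gcc -S`, that the epilogue is spelled `pop {..., pc}` (it does not recognize `ldm`/`ldmfd` forms or `ldr pc, [...]`), and that lines are shorter than the buffer. The names `line_pops_pc` and `add_barriers` are hypothetical, and the text match is naive (a comment containing `pop {pc}` would also match).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Nonzero if the line looks like "pop {..., pc}": a "pop" mnemonic
   followed by a brace-delimited register list that mentions pc. */
int line_pops_pc(const char *line)
{
    const char *p = strstr(line, "pop");
    if (p == NULL)
        return 0;
    const char *open = strchr(p, '{');
    const char *close = (open != NULL) ? strchr(open, '}') : NULL;
    if (close == NULL)
        return 0;
    const char *pc = strstr(open, "pc");
    return pc != NULL && pc < close;
}

/* Copy assembly text from `in` to `out`, emitting a branch to self
   ("b .") after every pop-into-pc line. Returns the number added. */
int add_barriers(FILE *in, FILE *out)
{
    assert(in != NULL && out != NULL);
    char line[512];
    int added = 0;
    while (fgets(line, sizeof line, in) != NULL) {
        fputs(line, out);
        if (line_pops_pc(line)) {
            size_t len = strlen(line);
            if (len > 0 && line[len - 1] != '\n')
                fputc('\n', out);   /* last line had no newline */
            fputs("\tb\t.\n", out);
            added++;
        }
    }
    return added;
}
```

The inserted `b .` is never reached in normal execution because the pop has already loaded pc; it only costs two bytes per epilogue in Thumb state (four in ARM state).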
[Bug target/82150] Produces a branch prefetch which causes a hang
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82150

david.welch at netronome dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |UNCONFIRMED
         Resolution|INVALID                     |---

--- Comment #15 from david.welch at netronome dot com ---
Please read the errata and do not blow off this ticket. The MMU is not being used. This is a verified problem, acknowledged by ARM as well as being independently discovered. The problem has been present and known by ARM for years, and it was reported a while ago to gnu/gcc. "Use the MMU" is not a valid solution to fix a known, demonstrable bug in the compiler.