[Bug c++/66172] -fno-threadsafe-statics suppresses guard functions but not guard variables
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66172 --- Comment #3 from Marc Singer eleventen at gmail dot com --- I've come to the same conclusion. My hope was that I could eliminate the guard and force the compiler to initialize block scoped statics at the start of execution. It looks like the standard stands in the way of this simplification.
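The point about the standard can be illustrated with a small sketch. When the initializer of a block-scoped static has observable side effects, the object has to be constructed the first time control passes through its declaration, so moving the initialization to program start would change behaviour. The names `event_log` and `Tracer` are hypothetical and exist only for this illustration.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: records events so the order of initialization is visible.
std::vector<std::string>& event_log() {
    static std::vector<std::string> log;
    return log;
}

// Hypothetical type whose constructor has an observable side effect.
struct Tracer {
    explicit Tracer(const char* name) {
        event_log().push_back(std::string("construct ") + name);
    }
};

void f() {
    // Constructed on the first call to f(), not before main() starts.
    static Tracer a("f()::a");
    event_log().push_back("f body");
}
```

If `f()` is never called, `a` is never constructed, which is why some per-variable "already initialized" state is still needed even in a single-threaded program.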
[Bug c++/66173] -fno-threadsafe-statics suppresses guard functions but not guard variables
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66173

--- Comment #1 from Marc Singer eleventen at gmail dot com ---
I neglected to include information about the version of the compiler. This is a 64 bit compiler on amd64.

    # g++ --version
    g++ (Debian 4.9.2-10) 4.9.2
    Copyright (C) 2014 Free Software Foundation, Inc.
    This is free software; see the source for copying conditions. There is NO
    warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
[Bug c++/66173] New: -fno-threadsafe-statics suppresses guard functions but not guard variables
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66173

            Bug ID: 66173
           Summary: -fno-threadsafe-statics suppresses guard functions but not guard variables
           Product: gcc
           Version: 4.9.2
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: eleventen at gmail dot com
  Target Milestone: ---

Created attachment 35553
  -- https://gcc.gnu.org/bugzilla/attachment.cgi?id=35553&action=edit
Source file demonstrating that the guard variable isn't suppressed.

The use of -fno-threadsafe-statics eliminates the references to the guard functions, __cxa_guard_acquire and __cxa_guard_release, but it doesn't eliminate the variables used to guard the initialization.

A compiled version of the attached file, using the g++ command line therein, generated no references to the guard functions, but the guard variable remains.

    # nm -C guard | grep guard
    00600a88 b guard variable for f()::a
[Bug c++/66172] New: -fno-threadsafe-statics suppresses guard functions but not guard variables
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66172

            Bug ID: 66172
           Summary: -fno-threadsafe-statics suppresses guard functions but not guard variables
           Product: gcc
           Version: 4.9.2
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: eleventen at gmail dot com
  Target Milestone: ---

The use of -fno-threadsafe-statics eliminates the references to the guard functions, __cxa_guard_acquire and __cxa_guard_release, but it doesn't eliminate the variables used to guard the initialization.

A compiled version of the attached file, using the g++ command line therein, generated no references to the guard functions, but the guard variable remains.

    # nm -C guard | grep guard
    00600a88 b guard variable for f()::a

--- Comment #1 from Marc Singer eleventen at gmail dot com ---
I neglected to include the version information:

    # g++ --version
    g++ (Debian 4.9.2-10) 4.9.2
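As a rough illustration of why the guard variable survives, the sketch below places a function-local static next to a hand-written equivalent. It is not the code GCC actually generates; `compute`, `f_lowered`, `a_lowered` and `a_guard` are hypothetical names. It only shows the shape of the non-thread-safe case: the calls to __cxa_guard_acquire and __cxa_guard_release disappear, but a flag is still tested on each call.

```cpp
int init_calls = 0;  // counts how often the initializer runs

// Hypothetical initializer with a side effect, so it cannot be folded away.
int compute() {
    ++init_calls;
    return 42;
}

// What the programmer writes.
int& f() {
    static int a = compute();
    return a;
}

// Hand-written approximation of the single-threaded lowering.
static int a_lowered;   // storage for the object
static bool a_guard;    // plays the role of "guard variable for f()::a"

int& f_lowered() {
    if (!a_guard) {          // plain load and test, no locking
        a_lowered = compute();
        a_guard = true;      // set only after initialization completes
    }
    return a_lowered;
}
```

Both functions run the initializer exactly once no matter how often they are called. The only difference with thread-safe statics enabled is that the test-and-set is wrapped in the guard function calls.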
[Bug inline-asm/56884] New: ARM thumb16 mnemonic lsls not recognized for CPU cortex-m0.
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56884

             Bug #: 56884
           Summary: ARM thumb16 mnemonic lsls not recognized for CPU cortex-m0.
    Classification: Unclassified
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: minor
          Priority: P3
         Component: inline-asm
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: eleven...@gmail.com

It looks like the inline assembler doesn't recognize the lsls mnemonic when -mcpu=cortex-m0 is selected. It assembles lsl and emits the instruction that the disassembler identifies as lsls. The hitch is that this mismatch requires different inline assembler when the CPU changes between M0 and M3/M4. On M3, the lsl instruction will not set the condition flags.

---

Given a source file:

    void test () {
      __asm volatile ("lsls r0, #1");
    }

For Cortex-M0,

    elf@cerise lsl-bug /opt/gcc/bin/arm-none-eabi-gcc -g -Os -c -mcpu=cortex-m0 -mthumb lsl.c -o lsl.o
    /tmp/ccuEyZSU.s: Assembler messages:
    /tmp/ccuEyZSU.s:29: Error: instruction not supported in Thumb16 mode -- `lsls r0,#1'

If the instruction is changed to lsl r0, #1 the compiler is happy and the emitted machine code is correct. The disassembler accurately identifies the instruction as lsls because that's the only form of lsl that the M0 supports.

    void test () {
      __asm volatile ("lsl r0, #1");
       0:   0040    lsls    r0, r0, #1
    }
       2:   4770    bx      lr

So, I think that the issue is only in the inline assembler, in that it doesn't accept the lsls opcode. Or is there another explanation?

---

For reference:

    elf@cerise lsl-bug /opt/gcc/bin/arm-none-eabi-gcc -v -mcpu=cortex-m0 -mthumb
    Using built-in specs.
    COLLECT_GCC=/opt/gcc/bin/arm-none-eabi-gcc
    COLLECT_LTO_WRAPPER=/opt/gcc/libexec/gcc/arm-none-eabi/4.7.2/lto-wrapper
    Target: arm-none-eabi
    Configured with: ../gcc-4.7.2/configure --target=arm-none-eabi --prefix=/opt/gcc --enable-multilib --enable-languages=c,c++ --with-newlib --with-gnu-as --with-gnu-ld --disable-nls --disable-shared --disable-threads --with-headers=newlib/libc/include --disable-libssp --disable-libstdcxx-pch --disable-libmudflap --disable-libgomp --disable-werror --with-system-zlib --disable-newlib-supplied-syscalls
    Thread model: single
    gcc version 4.7.2 (GCC)
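One possible explanation, offered here as an assumption rather than something confirmed in this thread, is that the assembler is in the older "divided" syntax when it processes Thumb-1 inline assembly. In that syntax the flag-setting behaviour is implicit and the explicit `s` suffix is rejected. A workaround often used for this is to switch to unified syntax inside the asm string. The sketch below follows that idea; the macro names and `shift_left_one` are hypothetical, and the plain C++ fallback exists only so the example builds on a non-ARM host.

```cpp
#include <cstdint>

// Assumption: only Thumb-1 targets (e.g. Cortex-M0) need the syntax switch.
// Thumb-2 targets (e.g. Cortex-M3/M4) are assumed to be in unified syntax already.
#if defined(__thumb__) && !defined(__thumb2__)
#define ASM_UNIFIED_BEGIN ".syntax unified\n\t"
#define ASM_UNIFIED_END   "\n\t.syntax divided"
#else
#define ASM_UNIFIED_BEGIN ""
#define ASM_UNIFIED_END   ""
#endif

// Shifts x left by one bit; on Thumb targets uses the flag-setting lsls.
inline std::uint32_t shift_left_one(std::uint32_t x) {
#if defined(__thumb__)
    __asm volatile (
        ASM_UNIFIED_BEGIN
        "lsls %0, %0, #1"
        ASM_UNIFIED_END
        : "+l" (x)   // low register, read and written
        :
        : "cc");     // condition flags are modified
#else
    x <<= 1;         // host fallback, no inline assembly
#endif
    return x;
}
```

With this arrangement the same source line is intended to serve both M0 and M3/M4 builds, which is the portability problem described in the report. It has not been verified against the 4.7.2 toolchain used here.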
[Bug c/56620] New: Memcpy optimization may lead to unaligned access on ARM Thumb
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56620

             Bug #: 56620
           Summary: Memcpy optimization may lead to unaligned access on ARM Thumb
    Classification: Unclassified
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: eleven...@gmail.com

Created attachment 29668
  -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=29668
Sample source file

Whereas #53016 resolved to an alignment problem with the underlying structures, this is a case where the builtin memcpy optimization emits instructions that may access words on a non-word boundary.

The target is an ARM Cortex-M4. The compiler was generated by the summon-gcc project in order to gain access to the hard-float feature of GCC 4.7.2. The problem didn't appear until I implemented faults on unaligned access per the ARM recommendations.

The example was compiled with both -O2 and -Os. In both instances the compiler emitted unaligned ldr instructions. Below is pasted the disassembled code at fault.

    // Here we load the base address of the data array of bytes.
       a:   4e0c        ldr     r6, [pc, #48]   ; (3c copy_unaligned+0x3c)
       c:   f44f 74ff   mov.w   r4, #510        ; 0x1fe
      10:   19a3        adds    r3, r4, r6
    // r3 has 510 - 16*i, a word-unaligned offset, assuming the data array is aligned.
      12:   466d        mov     r5, sp
      14:   f103 0710   add.w   r7, r3, #16
    // This next instruction faults.
      18:   6818        ldr     r0, [r3, #0]
      1a:   6859        ldr     r1, [r3, #4]
      1c:   462a        mov     r2, r5
      1e:   c203        stmia   r2!, {r0, r1}
      20:   3308        adds    r3, #8
      22:   42bb        cmp     r3, r7
      24:   4615        mov     r5, r2
      26:   d1f7        bne.n   18 copy_unaligned+0x18
      3c:   .word 0x

I recall that a few versions back, GCC started putting stack-allocated arrays of bytes on odd or non-word address boundaries. If this is the case, I don't see how memcpy could ever legally emit the code above, even if it knew that the array offset was word aligned.

While producing the sample, I noticed that it was necessary to have both the unaligned offset, like 510, and the indexed offset i*16 to trigger the errant code.
Cheers
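The attached sample is not reproduced in this archive, so the following is only a guess at its shape, based on the function name and the 510 - 16*i offset visible in the disassembly. The array `data` and the parameter names are hypothetical. The source itself is valid for any byte offset; whether the compiler expands the copy into word loads depends on whether it believes the target permits unaligned access. GCC's ARM back end exposes this as -munaligned-access and -mno-unaligned-access.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical byte array standing in for the one in the attachment.
static std::uint8_t data[1024];

// Copies 16 bytes starting at byte offset 510 - 16*i.
// 510 is not a multiple of 4, so the source address is not word-aligned
// even when data itself is. Callers must keep i <= 31 to stay in bounds.
void copy_unaligned(std::uint8_t (&out)[16], unsigned i) {
    std::memcpy(out, data + 510 - 16 * i, sizeof out);
}
```

Building something like this with -mno-unaligned-access would be one way to check whether the word loads at offset 0x18 go away on a part configured to trap unaligned accesses.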
[Bug c/56620] Memcpy optimization may lead to unaligned access on ARM Thumb
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56620 --- Comment #1 from eleventen at gmail dot com 2013-03-14 18:13:03 UTC --- Created attachment 29669 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=29669 Sample source object
[Bug target/56620] Memcpy optimization may lead to unaligned access on ARM Thumb
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56620

--- Comment #3 from Marc Singer eleventen at gmail dot com 2013-03-14 18:26:02 UTC ---
The compiler was built as follows:

    elf@cerise ~/memcpy-bug /opt/gcc/arm-none-eabi/bin/gcc -v
    Using built-in specs.
    COLLECT_GCC=/opt/gcc/arm-none-eabi/bin/gcc
    Target: arm-none-eabi
    Configured with: ../gcc-4.7.2/configure --target=arm-none-eabi --prefix=/opt/gcc --enable-multilib --enable-languages=c,c++ --with-newlib --with-gnu-as --with-gnu-ld --disable-nls --disable-shared --disable-threads --with-headers=newlib/libc/include --disable-libssp --disable-libstdcxx-pch --disable-libmudflap --disable-libgomp --disable-werror --with-system-zlib --disable-newlib-supplied-syscalls
    Thread model: single
    gcc version 4.7.2 (GCC)

The invoking command line, available at the top of the sample source file, is reproduced here for clarity.

    arm-none-eabi-gcc -std=c99 -g -Os -c -mcpu=cortex-m3 -mthumb memcpy-test.c -o memcpy-test.o
[Bug target/56620] Memcpy optimization may lead to unaligned access on ARM Thumb
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56620

Marc Singer eleventen at gmail dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |INVALID

--- Comment #5 from Marc Singer eleventen at gmail dot com 2013-03-14 21:05:42 UTC ---
Indeed the compiler documentation states that the ARMv6M and older default to no unaligned accesses, but that changed in v7. Thanks.
[Bug target/56620] Memcpy optimization may lead to unaligned access on ARM Thumb
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56620

--- Comment #6 from Marc Singer eleventen at gmail dot com 2013-03-14 21:35:54 UTC ---
For the sake of posterity, the Cortex-M3 and M4 do handle unaligned accesses properly in hardware, though with the expected performance penalty. It is the fact that I enforced alignment by making configuration changes to the MCU that caused the issue. And I did so on the recommendation of ARM, which isn't universally justified, in the Cortex TRM:

  To ensure a smooth transition, ARM recommends that code designed to operate on other Cortex-M profile processor architectures obey the following rules and configure the Configuration and Control Register (CCR) appropriately:

  • use word transfers only to access registers in the NVIC and System Control Space (SCS).
  • treat all unused SCS registers and register fields on the processor as Do-Not-Modify.
  • configure the following fields in the CCR:
      - STKALIGN bit to 1
      - UNALIGN_TRP bit to 1
      - Leave all other bits in the CCR register as their original value.
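For readers who want to see what the quoted CCR recommendation amounts to in code, here is a minimal sketch. The bit positions (UNALIGN_TRP at bit 3, STKALIGN at bit 9) and the register address 0xE000ED14 are taken from the ARMv7-M System Control Block layout as I understand it, and should be checked against the reference manual for the part in use. The function name is hypothetical, and the register is passed in by reference so the logic can also be exercised on a host.

```cpp
#include <cstdint>

// Assumed ARMv7-M CCR layout; verify against the architecture manual.
constexpr std::uint32_t CCR_UNALIGN_TRP = 1u << 3;   // trap unaligned word/halfword access
constexpr std::uint32_t CCR_STKALIGN    = 1u << 9;   // 8-byte stack alignment on exception entry
constexpr std::uintptr_t SCB_CCR_ADDRESS = 0xE000ED14u;

// Sets STKALIGN and UNALIGN_TRP and leaves every other bit unchanged.
inline void apply_recommended_ccr(volatile std::uint32_t& ccr) {
    // Single word read followed by a single word write.
    ccr = ccr | CCR_STKALIGN | CCR_UNALIGN_TRP;
}
```

On the target this would be called as `apply_recommended_ccr(*reinterpret_cast<volatile std::uint32_t*>(SCB_CCR_ADDRESS));`. Once UNALIGN_TRP is set, compiler-generated unaligned ldr instructions such as the one in the disassembly above raise a fault, which is the behaviour this bug report ran into.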