https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94485
--- Comment #8 from Dimitri Gorokhovik <dimitri.gorokhovik at free dot fr> ---
I was able to reduce the same code (see the attached file bug-6.cpp).

-- when compiled correctly, running it produces the following (expected)
output:

cube: ({ 0, 0, 0 }, { 1, 1, 1 })
cube: ({ 0, 0, 1 }, { 1, 1, 2 })
cube: ({ 0, 0, 2 }, { 1, 1, 3 })
cube: ({ 0, 1, 0 }, { 1, 2, 1 })
cube: ({ 0, 1, 1 }, { 1, 2, 2 })
cube: ({ 0, 1, 2 }, { 1, 2, 3 })
cube: ({ 0, 2, 0 }, { 1, 3, 1 })
cube: ({ 0, 2, 1 }, { 1, 3, 2 })
cube: ({ 0, 2, 2 }, { 1, 3, 3 })
cube: ({ 1, 0, 0 }, { 2, 1, 1 })
cube: ({ 1, 0, 1 }, { 2, 1, 2 })
cube: ({ 1, 0, 2 }, { 2, 1, 3 })
cube: ({ 1, 1, 0 }, { 2, 2, 1 })
cube: ({ 1, 1, 1 }, { 2, 2, 2 })
cube: ({ 1, 1, 2 }, { 2, 2, 3 })
cube: ({ 1, 2, 0 }, { 2, 3, 1 })
cube: ({ 1, 2, 1 }, { 2, 3, 2 })
cube: ({ 1, 2, 2 }, { 2, 3, 3 })
cube: ({ 2, 0, 0 }, { 3, 1, 1 })
cube: ({ 2, 0, 1 }, { 3, 1, 2 })
cube: ({ 2, 0, 2 }, { 3, 1, 3 })
cube: ({ 2, 1, 0 }, { 3, 2, 1 })
cube: ({ 2, 1, 1 }, { 3, 2, 2 })
cube: ({ 2, 1, 2 }, { 3, 2, 3 })
cube: ({ 2, 2, 0 }, { 3, 3, 1 })
cube: ({ 2, 2, 1 }, { 3, 3, 2 })
cube: ({ 2, 2, 2 }, { 3, 3, 3 })
count = 27

-- when compiled incorrectly, it prints out:

count = 0

Tested with build g++ (GCC) 11.0.0 20200924 (experimental).

In order to compile and run:

g++ -std=c++17 -O3 -o bug-6 bug-6.cpp && ./bug-6

This builds for implicit '-m64' (x86_64) and produces invalid output. To get
valid output, compile with any one of the following:

  -m32
  -O0 (instead of -O3)
  -fno-tree-sra
  one of: -DFIX_0, -DFIX_1, -DFIX_2, -DFIX_3, -DFIX_4

From my limited understanding of tree dumps, here is roughly what happens:

-- the routine 'begin()', line 183, returns 'struct iterator' by value. The
latter has a size of 14 bytes, so it is returned "in registers". Forcing it to
be returned via memory ==> issue goes away. (Methods to force: make it bigger
than 16 bytes, make some fields volatile, use -m32.) Note also that, when the
routine is evaluated as constexpr (in static_assert), the issue is not
reproduced.

-- (pretty much) all called routines are inlined into a single function,
'count_them ()'. Prevent the inlining of the routine 'can_be_incremented ()'
==> issue goes away. (Method to prevent: define FIX_1.)

-- SRA replaces several fields of the 'struct iterator' (line 150), most
importantly 'idx_' (line 153). Disable SRA ==> issue goes away (-fno-tree-sra
or -O0).

This replacement by tree-SRA somehow doesn't propagate the writes made to the
replacement vars of 'idx_' back to the original parts of the structure living
"in the return registers". When the return value lives in memory, the writes
are propagated correctly.

The compiler then eliminates the loop in 'can_be_incremented ()' and evaluates
the call to that routine to 'false' (line 163). Forcibly keeping the loop
(-DFIX_2) or replacing it with non-loop code (-DFIX_0) ==> issue goes away.
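
For readers without the attachment, the code shape described above can be sketched roughly as follows. This is NOT bug-6.cpp and is not expected to reproduce the miscompile. The names 'iterator', 'idx_', 'begin', 'can_be_incremented' and 'count_them' are taken from the description, while the field layout, the 'end_' member and 'increment' are assumptions made purely for illustration. It only shows a small trivially copyable struct (12 bytes here, versus 14 in the report) that the x86_64 System V ABI returns in registers, plus a loop-based predicate on 'idx_' of the kind that is inlined into the caller and scalarized by SRA.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in for 'struct iterator'; the real layout is in the
// attachment and is not reproduced here.
struct iterator {
    std::uint16_t idx_[3];  // current position along each axis
    std::uint16_t end_[3];  // exclusive upper bound along each axis (assumed)

    // Loop-based predicate: true while at least one axis can still advance.
    bool can_be_incremented() const {
        for (std::size_t i = 0; i < 3; ++i)
            if (idx_[i] + 1 < end_[i])
                return true;
        return false;
    }

    // Odometer-style step: bump the last axis, carrying to the left.
    void increment() {
        for (std::size_t i = 3; i-- > 0;) {
            if (idx_[i] + 1 < end_[i]) {
                ++idx_[i];
                return;
            }
            idx_[i] = 0;
        }
    }
};

// Small enough (<= 16 bytes) and trivially copyable, so on x86_64 System V
// it is returned in registers rather than through a hidden memory slot.
static_assert(sizeof(iterator) <= 16, "expected a register-sized return");
static_assert(std::is_trivially_copyable<iterator>::value,
              "expected a trivially copyable aggregate");

// Returns the iterator by value, as 'begin()' does in the description.
iterator begin(std::uint16_t n) {
    return iterator{{0, 0, 0}, {n, n, n}};
}

// Visits every cell of an n*n*n grid and counts the visits.
int count_them(std::uint16_t n) {
    if (n == 0)
        return 0;
    int count = 0;
    iterator it = begin(n);
    for (;;) {
        ++count;  // one visit per position
        if (!it.can_be_incremented())
            break;
        it.increment();
    }
    return count;
}
```

With a correctly working compiler, count_them(3) returns 27, matching the 'count = 27' line of the expected output above.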