[Bug tree-optimization/110311] [14 Regression] regression in tree-optimizer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110311
Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |FIXED

--- Comment #58 from Richard Biener ---
Thus fixed.
[Bug tree-optimization/110311] [14 Regression] regression in tree-optimizer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110311 --- Comment #57 from Richard Biener --- It might not be ideal, but unless somebody finds the time to analyze what the "fix" changed, and thereby identifies the problem itself, closing the bug seems the most efficient way of dealing with it :/
[Bug tree-optimization/110311] [14 Regression] regression in tree-optimizer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110311 --- Comment #56 from Jürgen Reuter --- What do we do now? We know the offending commit and the commit that fixed (or "fixed") it. Do we close the bug? Do we understand what happened here, i.e. why it went wrong and why it got fixed?
[Bug tree-optimization/110311] [14 Regression] regression in tree-optimizer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110311 --- Comment #55 from Jürgen Reuter --- Actually, according to my testing, the last commit for which gfortran produced failing code is https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=c496d15954cdeab7f9039328f94a6f62cf893d5f (Aldy Hernandez, A singleton irange etc.), and the first one working again is https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=1f7e5a7b91862b999aab88ee0319052aaf00f0f1 (Vladimir Makarov), which seems to have fixed it. The commit from Vladimir fixed an issue in RTL, but I am not sure what to conclude from this.
[Bug tree-optimization/110311] [14 Regression] regression in tree-optimizer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110311 --- Comment #54 from Jürgen Reuter ---
(In reply to Jürgen Reuter from comment #53)
> Additional comment: the commit which fixed/"fixed" this offending commit
> came between July 3 and July 10.

Wildly speculating, it might be this commit: https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=bdf2737cda53a83332db1a1a021653447b05a7e7 ???
[Bug tree-optimization/110311] [14 Regression] regression in tree-optimizer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110311 --- Comment #53 from Jürgen Reuter --- Additional comment: the commit which fixed/"fixed" this offending commit came between July 3 and July 10.
[Bug tree-optimization/110311] [14 Regression] regression in tree-optimizer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110311 --- Comment #52 from Jürgen Reuter ---
(In reply to Jakub Jelinek from comment #51)
> The easiest would be to bisect gcc in the suspected ranges, that way you'd
> know for sure which git commit introduced the problem and which
> fixed/"fixed" it.
> If it is about what the compiler emits, one doesn't have to build whole gcc
> from scratch each time, but can just --disable-bootstrap build it and during
> bisection
> whenever git is updated just ./config.status --recheck; ./config.status;
> make -jN in libcpp, libiberty and gcc subdirectories and use f951/gfortran
> binaries from that instead of the ones from the initial build to build your
> project.

This was the offending commit by Roger Sayle, on Saturday June 17:
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=96c3539f2a38134cb76d8ab2e924e0dc70b2ccbd

===
i386: Two minor tweaks to ix86_expand_move.

This patch splits out two (independent) minor changes to i386-expand.cc's ix86_expand_move from a larger patch, given that it's better to review and commit these independent pieces separately from a more complex patch.

The first change is to test for CONST_WIDE_INT_P before calling ix86_convert_const_wide_int_to_broadcast. Whilst stepping through this function in gdb, I was surprised that the code was continually jumping into this function with operands that obviously weren't appropriate.

The second change is to generalize the optimization for efficiently moving a TImode value to V1TImode (via V2DImode), to cover all 128-bit vector modes.

Hence for the test case:

typedef unsigned long uv2di __attribute__ ((__vector_size__ (16)));
uv2di foo2(__int128 x) { return (uv2di)x; }

we'd previously move via memory with:

foo2:   movq    %rdi, -24(%rsp)
        movq    %rsi, -16(%rsp)
        movdqa  -24(%rsp), %xmm0
        ret

with this patch we now generate with -O2 (the same as V1TImode):

foo2:   movq    %rdi, %xmm0
        movq    %rsi, %xmm1
        punpcklqdq      %xmm1, %xmm0
        ret

and with -O2 -msse4 the even better:

foo2:   movq    %rdi, %xmm0
        pinsrq  $1, %rsi, %xmm0
        ret

The new test case is unimaginatively called sse2-v1ti-mov-2.c given the original test case just for V1TI mode was called sse2-v1ti-mov-1.c.

2023-06-17  Roger Sayle

gcc/ChangeLog
        * config/i386/i386-expand.cc (ix86_expand_move): Check that OP1 is
        CONST_WIDE_INT_P before calling ix86_convert_wide_int_to_broadcast.
        Generalize special case for converting TImode to V1TImode to handle
        all 128-bit vector conversions.

gcc/testsuite/ChangeLog
        * gcc.target/i386/sse2-v1ti-mov-2.c: New test case.
===

Now the question is, was this commit later reverted? Or changed in a different manner?
[Bug tree-optimization/110311] [14 Regression] regression in tree-optimizer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110311 --- Comment #51 from Jakub Jelinek ---
The easiest would be to bisect gcc in the suspected ranges, that way you'd know for sure which git commit introduced the problem and which fixed/"fixed" it.
If it is about what the compiler emits, one doesn't have to build whole gcc from scratch each time, but can just --disable-bootstrap build it and during bisection whenever git is updated just
  ./config.status --recheck; ./config.status; make -jN
in libcpp, libiberty and gcc subdirectories and use f951/gfortran binaries from that instead of the ones from the initial build to build your project.
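The procedure described above could be automated roughly as follows. This is only a sketch of a step script for `git bisect run`, not something taken from the thread: the variables GCC_OBJ, REPRO_DIR, JOBS and BISECT_STEP_RUN, the `make clean` target and the `./reproducer` binary name are all assumptions for illustration, and so is running the uninstalled driver with -B instead of copying f951/gfortran over the binaries of the initial build.

```sh
#!/bin/sh
# Sketch of a step script for `git bisect run`. Assumed environment:
#   GCC_OBJ   : existing --disable-bootstrap build directory of GCC
#   REPRO_DIR : directory with the reproducer and a Makefile honoring FC
#   JOBS      : parallel make jobs (defaults to 4)

# Map the reproducer's exit status to what `git bisect run` expects.
# A process killed by SIGFPE is reported by the shell as 128+8 = 136, and
# git bisect aborts on codes above 127, so every failure is folded into 1.
to_bisect_exit() {
    if [ "$1" -eq 0 ]; then echo 0; else echo 1; fi
}

# Rebuild only the subdirectories that affect what the compiler emits.
rebuild_compiler() {
    for dir in libcpp libiberty gcc; do
        ( cd "$GCC_OBJ/$dir" &&
          ./config.status --recheck &&
          ./config.status &&
          make -j"${JOBS:-4}" ) || return 1
    done
}

bisect_step() {
    # 125 asks git bisect to skip a revision whose compiler does not build.
    rebuild_compiler || return 125
    (
        cd "$REPRO_DIR" &&
        make clean &&
        make FC="$GCC_OBJ/gcc/gfortran -B$GCC_OBJ/gcc/" &&
        ./reproducer    # hypothetical name of the test binary
    )
    status=$?
    return "$(to_bisect_exit "$status")"
}

# Run the step only when explicitly requested, so the file can be sourced.
if [ "${BISECT_STEP_RUN:-}" = 1 ]; then
    bisect_step
    exit $?
fi
```

With the script saved outside the GCC checkout, a run would look like `git bisect start <bad> <good>` followed by `BISECT_STEP_RUN=1 GCC_OBJ=... REPRO_DIR=... git bisect run /path/to/bisect-step.sh`. Note that in this sketch a reproducer that fails to compile is also counted as "bad".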
[Bug tree-optimization/110311] [14 Regression] regression in tree-optimizer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110311 --- Comment #50 from Jürgen Reuter --- How to proceed here? For almost exactly a month the current gcc git master has not shown this problem anymore. From our CI I can deduce that the version of July 3rd still failed, while the version of July 10th worked again. Since then the problem has not shown up again. My guess is that something has changed in the optimizer again (maybe because of a different problem/regression). Is it worth finding the offending commit and seeing when and how it was fixed (maybe even accidentally), or shall we add a test to the gcc testsuite for regression testing and close this issue?
[Bug tree-optimization/110311] [14 Regression] regression in tree-optimizer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110311 --- Comment #49 from Jürgen Reuter ---
(In reply to anlauf from comment #48)
> (In reply to anlauf from comment #47)
> > However, when I use -O2 together with an -march= flag, the code works.
> > I've tested -march=sandybridge, -march=haswell, -march=skylake,
> > -march=native.
> > It FPEs without.
>
> And it FPEs with core2,nehalem,westmere!
>
> Next I tried:
>
> -march=sandybridge -mno-avx # FPE!
> -march=sandybridge          # OK.

Yes, I can fully confirm your findings, also the ones from comment #47. I was looking at the commits in the period June 12-18 which could have caused this; some that seem potential candidates are:

2023-06-18  Honza
        PR tree-optimization/109849

2023-06-16  Jakub Jelinek
        PR tree-optimization/110271
        * tree-ssa-math-opts.cc (math_opts_dom_walker::after_dom_children)
        : Ignore return value from match_arith_overflow,
        instead call match_uaddc_usubc only if gsi_stmt (gsi) is still stmt.
(This one sounds pretty suspicious to me)

2023-06-16  Richard Biener
        PR tree-optimization/110269
        * fold-const.cc (fold_binary_loc): Merge x != 0 folding

2023-06-13  Alexandre Oliva
        * range-op-float.cc (frange_nextafter): Drop inline.
        (frelop_early_resolve): Add static.
        (frange_float): Likewise

2023-06-12  Andrew MacLeod
        PR tree-optimization/110205
        * range-op-float.cc (range_operator::fold_range): Add default FII
        fold routine.
        * range-op-mixed.h (class operator_gt): Add missing final overrides.
        * range-op.cc (range_op_handler::fold_range): Add RO_FII case.

2023-06-12  Andrew MacLeod
        * gimple-range-gori.cc (gori_compute::condexpr_adjust): Do not
        pass type.
[...] (there is a long list of commits by Andrew on June 12)

2023-06-12  Andre Vieira
        PR middle-end/110142
        * tree-vect-patterns.cc (vect_recog_widen_op_pattern): Don't pass
        subtype to vect_widened_op_tree and remove subtype parameter, also
        remove superfluous overloaded function definition.
        (vect_recog_widen_plus_pattern): Remove subtype parameter and dont
        pass to call to vect_recog_widen_op_pattern.
        (vect_recog_widen_minus_pattern): Likewise.
(^^^ this one also looks suspicious to me)

Any ideas which could have caused the changes?
[Bug tree-optimization/110311] [14 Regression] regression in tree-optimizer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110311 --- Comment #48 from anlauf at gcc dot gnu.org ---
(In reply to anlauf from comment #47)
> However, when I use -O2 together with an -march= flag, the code works.
> I've tested -march=sandybridge, -march=haswell, -march=skylake,
> -march=native.
> It FPEs without.

And it FPEs with core2,nehalem,westmere!

Next I tried:

-march=sandybridge -mno-avx # FPE!
-march=sandybridge          # OK.
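Trying flag combinations like the ones above by hand gets tedious, so a small loop can help. The following is a minimal sketch under stated assumptions: `flag_matrix` and the `./build-and-run.sh` script mentioned in the comment are hypothetical names, and the check command is assumed to rebuild the reproducer with the given flags, run it, and return non-zero when the run traps.

```sh
#!/bin/sh
# flag_matrix CHECK
# Reads one set of compiler flags per line from stdin, calls CHECK with
# those flags and reports whether CHECK succeeded. CHECK is a user-supplied
# command (hypothetical here) that rebuilds the reproducer with the flags,
# runs it, and returns non-zero on an FPE or any other failure.
flag_matrix() {
    check=$1
    while IFS= read -r flags; do
        # $flags is unquoted on purpose: each line is split into options.
        # stdin is redirected so CHECK cannot swallow the remaining lines.
        if "$check" $flags </dev/null >/dev/null 2>&1; then
            printf 'OK   %s\n' "$flags"
        else
            printf 'FAIL %s\n' "$flags"
        fi
    done
}

# Example call (not executed here):
#   flag_matrix ./build-and-run.sh <<'EOF'
#   -O2
#   -O2 -march=core2
#   -O2 -march=sandybridge
#   -O2 -march=sandybridge -mno-avx
#   EOF
```

The function only reports success or failure of the check command; telling an FPE apart from a build error is left to that command.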
[Bug tree-optimization/110311] [14 Regression] regression in tree-optimizer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110311 --- Comment #47 from anlauf at gcc dot gnu.org ---
(In reply to Jürgen Reuter from comment #46)
> The issue goes away with -O0, with -O1 and with -O2 -fno-tree-vectorize.
> I might want to find the offending commit in the week of June 12-19 in the
> tree-optimizer, but I don't know whether I have time to do so. Hopefully,
> with this
> smaller reproducer you can figure out what happens (and help solving it)

I recommend adding -ffpe-trap=zero,overflow,invalid to the flags.

It is code2.f90 that is sensitive to -ftree-vectorize; the two other files can be compiled even with -O3.

However, when I use -O2 together with an -march= flag, the code works. I've tested -march=sandybridge, -march=haswell, -march=skylake, -march=native.
It FPEs without.

Do you see the same?
[Bug tree-optimization/110311] [14 Regression] regression in tree-optimizer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110311 --- Comment #46 from Jürgen Reuter ---
(In reply to Jürgen Reuter from comment #45)
> Created attachment 55492 [details]
> Smaller stand-alone reproducer
>
> I will give more information in a comment, this contains 3 files and a
> Makefile.

This is a standalone reproducer with a total of 8k lines. It needs to be in three different files, as fusing the 2nd and 3rd file eliminates the optimizer problem of this issue, while fusing the 1st and the 2nd leads to an ICE in trans-array.c (reported separately), which is independent of the problem here.
The issue goes away with -O0, with -O1 and with -O2 -fno-tree-vectorize.
I might want to find the offending commit in the week of June 12-19 in the tree-optimizer, but I don't know whether I have time to do so. Hopefully, with this smaller reproducer you can figure out what happens (and help solving it)
[Bug tree-optimization/110311] [14 Regression] regression in tree-optimizer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110311 --- Comment #45 from Jürgen Reuter ---
Created attachment 55492
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55492&action=edit
Smaller stand-alone reproducer

I will give more information in a comment; this contains 3 files and a Makefile.
[Bug tree-optimization/110311] [14 Regression] regression in tree-optimizer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110311 --- Comment #44 from Jürgen Reuter ---
(In reply to anlauf from comment #43)
> Maybe the fprem issue was a red herring from the beginning, pointing to a
> problem in a different place.
>
> I recompiled each module in a loop with -O0 until the FPE went away.
>
> instances_sub.f90 seems the file someone wants to look at.
>
> Works at -O0, -O1, -Os, -O2 -fno-tree-vectorize
> Fails at -O2, -O3
>
> on x86_64-pc-linux-gnu.
>
> Jürgen: can you reduce this even more with this information?

Thanks, this info is helpful. So it is the setting up of the full process via the instances module, which is in agreement with the fact that the simple test with only the RNG did not fail. I will be busy for several days, but hopefully in a week from now, I'll know more.
[Bug tree-optimization/110311] [14 Regression] regression in tree-optimizer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110311 --- Comment #43 from anlauf at gcc dot gnu.org ---
Maybe the fprem issue was a red herring from the beginning, pointing to a problem in a different place.

I recompiled each module in a loop with -O0 until the FPE went away.

instances_sub.f90 seems the file someone wants to look at.

Works at -O0, -O1, -Os, -O2 -fno-tree-vectorize
Fails at -O2, -O3

on x86_64-pc-linux-gnu.

Jürgen: can you reduce this even more with this information?
[Bug tree-optimization/110311] [14 Regression] regression in tree-optimizer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110311 --- Comment #42 from Jürgen Reuter ---
(In reply to Jakub Jelinek from comment #41)
> 0x04f5dc90 is pseudo NaN:
> Pseudo Not a Number. The sign bit is meaningless. The 8087 and 80287 treat
> this as a Signaling Not a Number. The 80387 and later treat this as an
> invalid operand.
> So, if that comes from some random number generator, I'd say that random
> number generator should be fixed not to create the erroneous cases for
> https://en.wikipedia.org/wiki/Extended_precision

Hm, the example provided does not use extended precision.
[Bug tree-optimization/110311] [14 Regression] regression in tree-optimizer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110311 --- Comment #41 from Jakub Jelinek ---
(In reply to Uroš Bizjak from comment #39)
> (In reply to anlauf from comment #36)
> > Breakpoint 2, rng_stream.rng_stream_s::mmm_mod (x1=330289839997,
> > x2=4294967087) at rng_stream_sub.f90:336
> > 336         res = mod (x1, x2)
> > (gdb) info float
> >   R7: Valid   0x401be51fb578 +480507567
> >   R6: Valid   0x401be51fb578 +480507567
> >   R5: Zero    0x +0
> >   R4: Zero    0x +0
> >   R3: Zero    0x +0
> >   R2: Zero    0x +0
> >   R1: Zero    0x +0
> > =>R0: Special 0x04f5dc90 Unsupported

0x04f5dc90 is pseudo NaN:
Pseudo Not a Number. The sign bit is meaningless. The 8087 and 80287 treat this as a Signaling Not a Number. The 80387 and later treat this as an invalid operand.
So, if that comes from some random number generator, I'd say that random number generator should be fixed not to create the erroneous cases for https://en.wikipedia.org/wiki/Extended_precision
[Bug tree-optimization/110311] [14 Regression] regression in tree-optimizer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110311 --- Comment #40 from anlauf at gcc dot gnu.org ---
(In reply to Jürgen Reuter from comment #38)
> At the moment unfortunately too busy to provide a smaller reproducer (which
> also still has a small dependency on a dynamic library),

I have just commented out the references to dlopen, dlclose, dlsym, dlerror in os_interface_sub.f90, removed the -ldl and can still reproduce the failure.
[Bug tree-optimization/110311] [14 Regression] regression in tree-optimizer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110311 --- Comment #39 from Uroš Bizjak ---
(In reply to anlauf from comment #36)
> Breakpoint 2, rng_stream.rng_stream_s::mmm_mod (x1=330289839997,
> x2=4294967087) at rng_stream_sub.f90:336
> 336         res = mod (x1, x2)
> (gdb) info float
>   R7: Valid   0x401be51fb578 +480507567
>   R6: Valid   0x401be51fb578 +480507567
>   R5: Zero    0x +0
>   R4: Zero    0x +0
>   R3: Zero    0x +0
>   R2: Zero    0x +0
>   R1: Zero    0x +0
> =>R0: Special 0x04f5dc90 Unsupported

Here is the problem. FPREM chokes on invalid input in R0.

[1] says that an IA (invalid arithmetic) exception is generated for an unsupported format, and this is what happened above:

#IA Source operand is an SNaN value, modulus is 0, dividend is ∞, or unsupported format.

[1] https://www.felixcloutier.com/x86/fprem
[Bug tree-optimization/110311] [14 Regression] regression in tree-optimizer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110311 --- Comment #38 from Jürgen Reuter --- At the moment unfortunately too busy to provide a smaller reproducer (which also still has a small dependency on a dynamic library), but one more info: inserting the explicit operations instead of the intrinsic mod function leads to no more NaNs with the gfortran 14, but still is numerically different from the one with previous gfortran versions: so it looks like it leads to a different random number sequence which is really disturbing.
[Bug tree-optimization/110311] [14 Regression] regression in tree-optimizer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110311 --- Comment #37 from anlauf at gcc dot gnu.org ---
After the FPE:

(gdb) info float
  R7: Valid   0x401be51fb578 +480507567
  R6: Valid   0x401be51fb578 +480507567
  R5: Zero    0x +0
  R4: Zero    0x +0
  R3: Zero    0x +0
  R2: Zero    0x +0
  R1: Zero    0x +0
=>R0: Special 0x04f5dc90 Unsupported

Status Word:         0x82c1   IE ES SF C1
                       TOP: 0
Control Word:        0x0372   DM UM PM
                       PC: Extended Precision (64-bits)
                       RC: Round to nearest
Tag Word:            0x0556
Instruction Pointer: 0x00:0x004031d2
Operand Pointer:     0x00:0x011b5708
Opcode:              0xdd45
[Bug tree-optimization/110311] [14 Regression] regression in tree-optimizer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110311 --- Comment #36 from anlauf at gcc dot gnu.org ---
Breakpoint 2, rng_stream.rng_stream_s::mmm_mod (x1=330289839997, x2=4294967087) at rng_stream_sub.f90:336
336         res = mod (x1, x2)
(gdb) info float
  R7: Valid   0x401be51fb578 +480507567
  R6: Valid   0x401be51fb578 +480507567
  R5: Zero    0x +0
  R4: Zero    0x +0
  R3: Zero    0x +0
  R2: Zero    0x +0
  R1: Zero    0x +0
=>R0: Special 0x04f5dc90 Unsupported

Status Word:         0x
                       TOP: 0
Control Word:        0x0372   DM UM PM
                       PC: Extended Precision (64-bits)
                       RC: Round to nearest
Tag Word:            0x0556
Instruction Pointer: 0x00:0x
Operand Pointer:     0x00:0x
Opcode:              0x
(gdb) n

Program received signal SIGFPE, Arithmetic exception.
0x00678e6a in rng_stream.rng_stream_s::mmm_mod (x1=330289839997, x2=4294967087) at rng_stream_sub.f90:336
336         res = mod (x1, x2)
[Bug tree-optimization/110311] [14 Regression] regression in tree-optimizer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110311 --- Comment #35 from Uroš Bizjak ---
(In reply to anlauf from comment #33)
> (In reply to Jakub Jelinek from comment #32)
> > Then maybe r13-6361-g8020c9c42349f51f75239b
> > is the commit that changed it?
> > Would be good to put a breakpoint at that instruction and see in which
> > iteration it results in NaN and what operands it had...
>
> Program received signal SIGFPE, Arithmetic exception.
> 0x00678f1a in rng_stream.rng_stream_s::mmm_mod (x1=330289839997,
> x2=4294967087) at rng_stream_sub.f90:336
> 336         res = mod (x1, x2)
> (gdb) p x1
> $1 = 330289839997
> (gdb) p x2
> $2 = 4294967087
>
> Strangely enough, a small testcase with these arguments does not fail...

Please show the FP registers (and coprocessor state, 'info float') just before the FPREM instruction. These two values (as shown) are nothing special, but perhaps the FP register value contains something that FPREM does not like.

Also, please show the state after FPREM is executed. Please note that FPREM is performed in a loop, so perhaps a couple of trips through the loop will be needed.
[Bug tree-optimization/110311] [14 Regression] regression in tree-optimizer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110311 --- Comment #34 from anlauf at gcc dot gnu.org --- A few more data points: reverting r13-6361-g8020c9c42349f51f75239b on 13-branch fixes the issue: no fprem generated, no FPE. Adding -ffinite-math-only to the modified 13-branch restores the FPE. Compiling the affected module (only) with 12-branch and linking everything with 14-mainline shows the same: fprem is used only with -ffinite-math-only, and I get an FPE even with 12-branch in that case. Same with 11-branch. I am still not sure why it cannot be reproduced with a smaller example, thus I hope that Jürgen can provide a significantly smaller reproducer.
[Bug tree-optimization/110311] [14 Regression] regression in tree-optimizer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110311 --- Comment #33 from anlauf at gcc dot gnu.org ---
(In reply to Jakub Jelinek from comment #32)
> Then maybe r13-6361-g8020c9c42349f51f75239b
> is the commit that changed it?
> Would be good to put a breakpoint at that instruction and see in which
> iteration it results in NaN and what operands it had...

Program received signal SIGFPE, Arithmetic exception.
0x00678f1a in rng_stream.rng_stream_s::mmm_mod (x1=330289839997, x2=4294967087) at rng_stream_sub.f90:336
336         res = mod (x1, x2)
(gdb) p x1
$1 = 330289839997
(gdb) p x2
$2 = 4294967087

Strangely enough, a small testcase with these arguments does not fail...
[Bug tree-optimization/110311] [14 Regression] regression in tree-optimizer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110311
Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |uros at gcc dot gnu.org

--- Comment #32 from Jakub Jelinek ---
Then maybe r13-6361-g8020c9c42349f51f75239b is the commit that changed it?
Would be good to put a breakpoint at that instruction and see in which iteration it results in NaN and what operands it had...
[Bug tree-optimization/110311] [14 Regression] regression in tree-optimizer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110311 --- Comment #31 from anlauf at gcc dot gnu.org --- Looking at rng_stream_sub.o with objdump, I see fprem generated for 13 & 14, but not for 12. I haven't yet found an option to suppress its generation and fall back to the behavior of 12-branch.
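A check like the one described above can be scripted. The sketch below assumes GNU binutils `objdump -d` output is piped in; the helper name `count_fprem` is made up for this example.

```sh
#!/bin/sh
# count_fprem: read a disassembly (e.g. from `objdump -d`) on stdin and
# print the number of lines that contain the x87 `fprem` instruction.
# -w matches whole words only, so `fprem1` is not counted.
count_fprem() {
    # grep exits with status 1 when nothing matches; it still prints 0.
    grep -c -w 'fprem' || true
}

# Example call (not executed here):
#   objdump -d rng_stream_sub.o | count_fprem
```

Comparing the count for objects built by different compiler branches shows quickly whether a given build emits fprem at all.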
[Bug tree-optimization/110311] [14 Regression] regression in tree-optimizer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110311 --- Comment #30 from anlauf at gcc dot gnu.org --- BTW: you can get a traceback on FP exceptions by adding to the linker options: -ffpe-trap=zero,overflow,invalid
[Bug tree-optimization/110311] [14 Regression] regression in tree-optimizer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110311 --- Comment #29 from Jürgen Reuter ---
(In reply to anlauf from comment #28)
> Update: recompiling that file with 13-branch fails for me, too.
> Playing with the one-line patch for pr86277 makes no difference, fortunately.
>
> Compiling the file with gfortran-12 seems to work ok.
>
> So is this really a 14-only regression, or is 13-branch already suspicious?

We have gcc 13.1 in our CI, everything works fine there. I am still working on a smaller test, but have very bad connection rn.
[Bug tree-optimization/110311] [14 Regression] regression in tree-optimizer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110311 --- Comment #28 from anlauf at gcc dot gnu.org --- Update: recompiling that file with 13-branch fails for me, too. Playing with the one-line patch for pr86277 makes no difference, fortunately. Compiling the file with gfortran-12 seems to work ok. So is this really a 14-only regression, or is 13-branch already suspicious?
[Bug tree-optimization/110311] [14 Regression] regression in tree-optimizer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110311 --- Comment #27 from anlauf at gcc dot gnu.org ---
(In reply to Jürgen Reuter from comment #26)
> It is included here:
> https://www.desy.de/~reuter/downloads/repro002.tar.xz
> I am working on a smaller example right now.

Good. I can reproduce the failure, but here's what others need to know:

- I have to rm -f nlo_9_p2.i1.phs nlo_9_p2.m1.vg2 each time *before* running the test. ???
- I am using the modification to rng_stream_sub.f90 from comment#24 with the printout added
- I am switching between
    res = mod (x1, x2)
  and
    res = x1 - int(x1/x2) * x2
- I am disabling optimization completely for this file and added to Makefile:

rng_stream_sub.o: rng_stream_sub.f90
        $(FC) $(FCFLAGS) -c $< -O0 -fdump-tree-original -fdump-tree-optimized

which gives (v1 is with intrinsic mod, v2 is with explicitly coded mod):

--- rng_stream_sub.f90.005t.original.v1 2023-06-29 20:44:58.148284991 +0200
+++ rng_stream_sub.f90.005t.original.v2 2023-06-29 20:45:45.408160849 +0200
@@ -3,7 +3,7 @@
 {
   real(kind=8) res;

-  res = __builtin_fmod (*x1, *x2);
+  res = *x1 - (real(kind=8)) (integer(kind=4)) (*x1 / *x2) * *x2;
   return res;
 }

as expected. The dump-tree-optimized looks unsuspicious to me.
[Bug tree-optimization/110311] [14 Regression] regression in tree-optimizer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110311 --- Comment #26 from Jürgen Reuter ---
(In reply to anlauf from comment #25)
> Unfortunately, there is no main.f90, which is needed to build whizard.

Indeed, sorry, cf. below

> The Makefile needs to be modified to take into account that pythia.f
> needs preprocessing, e.g.:
>
> %.o: %.f
>         $(FC) $(FCFLAGS) -c $< -cpp
>
> Furthermore, one needs to compile serially; parallel make does not seem to
> be supported.

I changed the pythia.f to make the preprocessing unnecessary.

> Can you please provide the missing file?

It is included here: https://www.desy.de/~reuter/downloads/repro002.tar.xz
I am working on a smaller example right now.
[Bug tree-optimization/110311] [14 Regression] regression in tree-optimizer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110311 --- Comment #25 from anlauf at gcc dot gnu.org ---
(In reply to Jürgen Reuter from comment #24)
> Here is a first reproducer without the need for OCaml, unfortunately a bit
> too big to be uploaded, here is the link:
> https://www.desy.de/~reuter/downloads/repro001.tar.xz
> the tarball contains Fortran files that compile to two binaries, ./whizard
> and ./whizard_check.

Unfortunately, there is no main.f90, which is needed to build whizard.

The Makefile needs to be modified to take into account that pythia.f needs preprocessing, e.g.:

%.o: %.f
        $(FC) $(FCFLAGS) -c $< -cpp

Furthermore, one needs to compile serially; parallel make does not seem to be supported.

Can you please provide the missing file?
[Bug tree-optimization/110311] [14 Regression] regression in tree-optimizer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110311 --- Comment #24 from Jürgen Reuter ---
Here is a first reproducer without the need for OCaml, unfortunately a bit too big to be uploaded, here is the link:
https://www.desy.de/~reuter/downloads/repro001.tar.xz
The tarball contains Fortran files that compile to two binaries, ./whizard and ./whizard_check.
After compilation, perform ./whizard r1.sin to run the program. There will be NaNs generated in our RNG stream random number generator. They originate from an erroneous optimization by the gcc/gfortran tree-optimizer. This code resides in rng_stream_sub.f90, in the function mult_mod. Eliminating the intrinsic function mod and explicitly doing the calculation makes the problem go away.

  function mult_mod (a, b, c, m) result (v)
    real(default), intent(in) :: a
    real(default), intent(in) :: b
    real(default), intent(in) :: c
    real(default), intent(in) :: m
    real(default) :: v
    integer :: a1
    real(default) :: a2
    v = a * b + c
    if (v >= two53 .or. v <= -two53) then
       a1 = int (a / two17)
       a2 = a - a1 * two17
       v = mmm_mod (a1 * b, m)
       v = v * two17 + a2 * b + c
    end if
    v = mmm_mod (v, m)
    if (v < 0.0_default) v = v + m
  contains
    elemental function mmm_mod (x1, x2) result (res)
      real(default), intent(in) :: x1, x2
      real(default) :: res
      res = x1 - int(x1/x2) * x2
    end function mmm_mod
  end function mult_mod
[Bug tree-optimization/110311] [14 Regression] regression in tree-optimizer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110311 --- Comment #23 from anlauf at gcc dot gnu.org ---
You could check the input arguments for validity, e.g. using ieee_is_finite from the intrinsic ieee_arithmetic module.

  use, intrinsic :: ieee_arithmetic, only: ieee_is_finite
  ...
  if (.not. ieee_is_finite (a)) then
     print *, "bad: a=", a
     stop 1
  end if

As last resort I still recommend what I wrote in comment#15: build (=link) your executable from *.o from your project build tree with known-good objects but replacing one candidate.o by the one from the build tree showing the problem. And I really mean: link only and run.
[Bug tree-optimization/110311] [14 Regression] regression in tree-optimizer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110311 --- Comment #22 from Jürgen Reuter ---
(In reply to anlauf from comment #21)
> I forgot to mention that you need to check that the location where a symptom
> is seen sometimes may not be the location of the cause.

Indeed, I think you are right and the problem is elsewhere. I don't really know where to continue.
[Bug tree-optimization/110311] [14 Regression] regression in tree-optimizer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110311 --- Comment #21 from anlauf at gcc dot gnu.org --- I forgot to mention that you need to check that the location where a symptom is seen sometimes may not be the location of the cause.
[Bug tree-optimization/110311] [14 Regression] regression in tree-optimizer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110311 --- Comment #20 from anlauf at gcc dot gnu.org --- If that doesn't help: there appear to be recent optimizations for divmod. Try declaring a1, a2 as volatile.
[Bug tree-optimization/110311] [14 Regression] regression in tree-optimizer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110311 --- Comment #19 from Jürgen Reuter ---
(In reply to anlauf from comment #18)
> (In reply to Jürgen Reuter from comment #17)
> > How would I set up such a bisection for the n git commits between June 12 to
> > June 19? Unfortunately, I cannot really get a small reproducer
>
> I didn't mean that. I meant doing a bisection on the .o files of your code.
>
> But given that you have isolated a procedure, that is not necessary.
>
> You could try to defeat optimization by using a temporary v0 for v and
> declare it as volatile. Would be interesting to see if that makes a
> difference.

I tried both things, or at least partially; it didn't help. It also is a problem only when called in a very complicated setup in our program; in simple setups, it works. I fear we will have to change the functionality in our program, sadly, if we are not to be stuck forever with gcc versions < 14.
[Bug tree-optimization/110311] [14 Regression] regression in tree-optimizer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110311 --- Comment #18 from anlauf at gcc dot gnu.org ---
(In reply to Jürgen Reuter from comment #17)
> How would I set up such a bisection for the n git commits between June 12 to
> June 19? Unfortunately, I cannot really get a small reproducer

I didn't mean that. I meant doing a bisection on the .o files of your code.

But given that you have isolated a procedure, that is not necessary.

You could try to defeat optimization by using a temporary v0 for v and declare it as volatile. Would be interesting to see if that makes a difference.
[Bug tree-optimization/110311] [14 Regression] regression in tree-optimizer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110311 --- Comment #17 from Jürgen Reuter --- How would I set up such a bisection for the n git commits between June 12 to June 19? Unfortunately, I cannot really get a small reproducer
[Bug tree-optimization/110311] [14 Regression] regression in tree-optimizer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110311 --- Comment #16 from Jürgen Reuter ---
It seems that it is this function where the NaNs appear:

  function mult_mod (a, b, c, m) result (v)
    real(default), intent(in) :: a
    real(default), intent(in) :: b
    real(default), intent(in) :: c
    real(default), intent(in) :: m
    real(default) :: v
    integer :: a1
    real(default) :: a2
    v = a * b + c
    if (v >= two53 .or. v <= -two53) then
       a1 = int (a / two17)
       a2 = a - a1 * two17
       v = mod (a1 * b, m)
       v = v * two17 + a2 * b + c
    end if
    v = mod (v, m)
    if (v < 0.0_default) v = v + m
  end function mult_mod

Particularly mod (v, m) gets evaluated to NaN, even if I replace it with v = mod (v0, m) to avoid potential aliasing problems. It appears only in a very complex setup, not in a 100 line program.
[Bug tree-optimization/110311] [14 Regression] regression in tree-optimizer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110311
anlauf at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|anlauf at gmx dot de        |

--- Comment #15 from anlauf at gcc dot gnu.org ---
(In reply to Jürgen Reuter from comment #14)
> Did anybody manage to reproduce this?
> Download https://whizard.hepforge.org/downloads/?f=whizard-3.1.2.tar.gz
> You need OCaml as a prerequisite, though.
> Then configure, make,
> cd tests/functional_tests
> make check TESTS=nlo_9.run
> This will fail, as there are NaNs produced in our RNG module which are
> presumably caused by this regression in the tree-optimizer. At the moment I
> am deeply struggling with generating a reproducer but I don't know how tbh.

I may be telling you the obvious, but here's what I do in cases where changes in optimization in new compilers cause failures and recompiling is expensive:

- create standalone-version of Fortran code and testcase
- have two build trees in parallel, (a) working and (b) failing
- relink by successively replacing objects in (a) by those from (b)
- run each binary until the failure occurs

In your case you are lucky in that you get a crash.

If testing is expensive, it may be worth doing a bisection on sets of objects.

I avoid building shared libs for the project to ease testing.

Note: there might be multiple bad objects.

This works for me even with compilers on platforms, even if that takes a day or two.
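The relinking steps listed above can be sketched as a small loop. This is an illustration only, under several assumptions: `find_bad_objects` is a made-up helper name, both build trees are assumed to keep their objects as *.o files in one flat directory without spaces in the paths, and the check command (a script you would have to write yourself) is assumed to link the objects it is given, run the resulting binary, and return non-zero on failure. The sketch swaps in one object at a time rather than bisecting on sets of objects.

```sh
#!/bin/sh
# find_bad_objects GOOD_DIR BAD_DIR CHECK
# For every object in GOOD_DIR that also exists in BAD_DIR, call CHECK with
# the full list of good objects in which only that one object is taken from
# BAD_DIR. Prints the name of each object for which CHECK fails.
find_bad_objects() {
    good=$1
    bad=$2
    check=$3
    for obj in "$good"/*.o; do
        name=$(basename "$obj")
        [ -f "$bad/$name" ] || continue
        list=""
        for o in "$good"/*.o; do
            if [ "$(basename "$o")" = "$name" ]; then
                list="$list $bad/$name"
            else
                list="$list $o"
            fi
        done
        # $list is unquoted on purpose so that it splits into file names.
        if ! "$check" $list; then
            echo "$name"
        fi
    done
}

# Example call (not executed here), with a hypothetical link-and-run script:
#   find_bad_objects build-good build-bad ./link-and-run.sh
```

Because each candidate is tested independently against an otherwise good set, the loop also reports the case of multiple bad objects, at the cost of one link and one run per object.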
[Bug tree-optimization/110311] [14 Regression] regression in tree-optimizer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110311 --- Comment #14 from Jürgen Reuter ---
Did anybody manage to reproduce this?
Download https://whizard.hepforge.org/downloads/?f=whizard-3.1.2.tar.gz
You need OCaml as a prerequisite, though.
Then configure, make,
cd tests/functional_tests
make check TESTS=nlo_9.run
This will fail, as there are NaNs produced in our RNG module which are presumably caused by this regression in the tree-optimizer. At the moment I am deeply struggling with generating a reproducer but I don't know how tbh.
[Bug tree-optimization/110311] [14 Regression] regression in tree-optimizer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110311 --- Comment #13 from Jürgen Reuter --- I changed the component from fortran to tree-optimization, as the problematic commit during that week was in that component. The only commit in the fortran component turns out to be unproblematic.