https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53957
prop_design at protonmail dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |prop_design at protonmail dot com

--- Comment #19 from prop_design at protonmail dot com ---
Hi everyone,

I'm not sure if this is the right place to ask this, but it relates to the topic. I can't find the other thread about graphite auto-parallelization that I made a long time ago.

I tried gfortran 10.1.0 via MSYS2. It seems to work very well on the latest version of PROP_DESIGN. MP_PROP_DESIGN had some extra loops for benchmarking. I found they made things harder for the optimizer, so I deleted that code and now just use the 'real' version of the code it was based on, called PROP_DESIGN_MAPS. That is the actual propeller design code, with no additional looping for benchmarking purposes.

I've found no Fortran compiler that can do the auto-parallelization the way I would like. The only compiler that did any parallelization at run time actually slowed the code way down instead of speeding it up.

I still have my original problem with gfortran. That is, at run time no actual parallelization occurs. The code runs exactly the same as if the options were not present. Oddly though, gfortran does report that it auto-parallelized many loops. They are not the loops that would really help, but at least it shows it's doing something. That's an improvement from when I started these threads.

The problem is that if I compile with the following:

    gfortran PROP_DESIGN_MAPS.f -o PROP_DESIGN_MAPS.exe -O3 -ffixed-form -static -march=x86-64 -mtune=generic -mfpmath=sse -mieee-fp -pthread -ftree-parallelize-loops=2 -floop-parallelize-all -fopt-info-loop

it runs exactly the same way as if I compile with:

    gfortran PROP_DESIGN_MAPS.f -o PROP_DESIGN_MAPS.exe -O3 -ffixed-form -static -march=x86-64 -mtune=generic -mfpmath=sse -mieee-fp

Again, gfortran does say it auto-parallelized some loops, so it's very odd. I have searched the net and can't find anything that has helped.
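For anyone who wants to reproduce the comparison, a rough sketch of the two builds follows. It reuses the flags quoted above unchanged. The output name `PROP_DESIGN_MAPS_par.exe`, the log name `parloops.log`, and the helper `count_parallel_msgs` are hypothetical, and the script assumes that the relevant `-fopt-info-loop` messages contain the word "parallel", which may not hold for every GCC version.

```shell
#!/bin/sh
# Sketch: build PROP_DESIGN_MAPS with and without the auto-parallelization
# flags and count how many loops gfortran reports as parallelized.
# PROP_DESIGN_MAPS_par.exe and parloops.log are hypothetical names.

COMMON="-O3 -ffixed-form -static -march=x86-64 -mtune=generic -mfpmath=sse -mieee-fp"
PAR="-pthread -ftree-parallelize-loops=2 -floop-parallelize-all -fopt-info-loop"

# Count the lines of an optimization log that mention parallelization.
# Assumption: the relevant -fopt-info-loop messages contain "parallel".
count_parallel_msgs() {
    grep -ci 'parallel' "$1"
}

# Only attempt the builds when the compiler and the source file are present.
if command -v gfortran >/dev/null 2>&1 && [ -f PROP_DESIGN_MAPS.f ]; then
    # Baseline build, without the parallelization flags.
    gfortran PROP_DESIGN_MAPS.f -o PROP_DESIGN_MAPS.exe $COMMON

    # Parallel build; the -fopt-info messages go to stderr, so capture them.
    gfortran PROP_DESIGN_MAPS.f -o PROP_DESIGN_MAPS_par.exe $COMMON $PAR 2> parloops.log

    echo "loops reported as parallelized: $(count_parallel_msgs parloops.log)"
fi
```

Keeping the two executables under different names makes it possible to run them back to back, and the saved log shows which source lines the compiler claims to have parallelized.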
I'm wondering whether, for Linux users, the code actually will run in parallel. That would at least narrow the problem down some. I'm using Windows 10 and the code will only run on one core. Compiled either way, it shows 2 threads in use for a while and then drops to 1 thread.

The good news since this was originally posted is that gfortran now runs the code at the same speed as the PGI Community Edition compiler did. Since PGI just stopped developing that compiler, I switched back to gfortran. I no longer have Intel Fortran to test. That was the compiler that actually did run the code in parallel, but it ran twice as slow instead of twice as fast. That was a year or two ago, and I don't know if it's any better now.

I'm wondering if there is some sort of issue with -pthread not being able to use more than one core on Windows 10.

You can download PROP_DESIGN at https://propdesign.jimdofree.com

Inside the download are all the *.f files. I also have c.bat files in there with the compiler options I used. The auto-parallelization options are not present, since they still don't seem to be working, at least on Windows 10.

The code now runs much faster than it used to, due to many bug fixes and improvements I've made over the years. However, you can get it to run really slowly for testing purposes. In the settings file for the program, change the defaults like this:

    1     ALLOW VORTEX INTERACTIONS (1) FOR YES (2) FOR NO (INTEGER, NON-DIM, DEFAULT = 2)
    2     ALLOW BLADE-TO-BLADE INTERACTIONS (1) FOR YES (2) FOR NO (INTEGER, NON-DIM, DEFAULT = 2)

or like this:

    1     ALLOW VORTEX INTERACTIONS (1) FOR YES (2) FOR NO (INTEGER, NON-DIM, DEFAULT = 2)
    1     ALLOW BLADE-TO-BLADE INTERACTIONS (1) FOR YES (2) FOR NO (INTEGER, NON-DIM, DEFAULT = 2)

The first runs very slowly, the second incredibly slowly. I just close the command window once I've seen whether the code is running in parallel or not. With both of those values left at the default of 2, the code runs so fast you can't really get a sense of what's going on.
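A rough way to check the "runs exactly the same" observation on either Windows (MSYS2) or Linux is to time two builds back to back. The sketch below is only an illustration: the default executable names and the helper `elapsed_seconds` are hypothetical, it assumes the program can run unattended to completion, and it measures whole seconds of wall-clock time only.

```shell
#!/bin/sh
# Sketch: compare the wall-clock run time of a serial and a parallel build.
# Both executable names are placeholders; pass your own as arguments.
SERIAL="${1:-./PROP_DESIGN_MAPS.exe}"
PARALLEL="${2:-./PROP_DESIGN_MAPS_par.exe}"   # hypothetical name

# Run a command, discard its output, and print the elapsed whole seconds.
elapsed_seconds() {
    start=$(date +%s)
    "$@" >/dev/null 2>&1
    end=$(date +%s)
    echo $((end - start))
}

# Report how many logical processors the shell can see, when nproc exists.
if command -v nproc >/dev/null 2>&1; then
    echo "processors visible: $(nproc)"
fi

# Time each build only if both executables are actually present.
if [ -x "$SERIAL" ] && [ -x "$PARALLEL" ]; then
    echo "serial build:   $(elapsed_seconds "$SERIAL") s"
    echo "parallel build: $(elapsed_seconds "$PARALLEL") s"
fi
```

With the slow settings described above a full run may take a very long time, so a case that finishes in a minute or two is probably a better choice for this kind of comparison. Similar times for both builds would be consistent with no parallel execution happening at run time.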
Thanks for any help,

Anthony