https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53957

prop_design at protonmail dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |prop_design at protonmail dot com

--- Comment #19 from prop_design at protonmail dot com ---
hi everyone,

I'm not sure if this is the right place to ask this or not, but it relates to
the topic. I can't find the other thread about graphite auto-parallelization
that I made a long time ago.

I tried gfortran 10.1.0 via MSYS2. It seems to work very well on the latest
version of PROP_DESIGN. MP_PROP_DESIGN had some extra loops for benchmarking. I
found they made things harder for the optimizer, so I deleted that code and now
just use the 'real' version of the code it was based on, called
PROP_DESIGN_MAPS. That is the actual propeller design code, with no additional
looping for benchmarking purposes.

I've found no Fortran compiler that can do the auto-parallelization the way I
would like. The only compiler that did any at run time actually slowed the
code way down instead of speeding it up.

I still have my original problem with gfortran: at runtime, no actual
parallelization occurs. The code runs exactly the same as if the options were
not present. Oddly though, the compiler does report that it auto-parallelized
many loops. They are not the loops that would really help, but at least it
shows it's doing something. That's an improvement from when I started these
threads.

The problem is if I compile with the following:

gfortran PROP_DESIGN_MAPS.f -o PROP_DESIGN_MAPS.exe -O3 -ffixed-form -static
-march=x86-64 -mtune=generic -mfpmath=sse -mieee-fp -pthread
-ftree-parallelize-loops=2 -floop-parallelize-all -fopt-info-loop


It runs the exact same way as if I compile with:

gfortran PROP_DESIGN_MAPS.f -o PROP_DESIGN_MAPS.exe -O3 -ffixed-form -static
-march=x86-64 -mtune=generic -mfpmath=sse -mieee-fp


Again, gfortran does say it auto-parallelized some loops, so it's very odd. I
have searched the net and can't find anything that has helped.
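
One way to compare the two builds side by side might look like the sketch
below. It assembles both command lines from the flags quoted above and, only if
gfortran and the source file are present, compiles them and saves the
-fopt-info-loop report (which GCC writes to stderr by default). The output
names maps_serial.exe, maps_par.exe and par_report.txt are hypothetical, and
the grep pattern assumes the report lines mention parallelization in their
text.

```shell
#!/bin/sh
# Sketch: build PROP_DESIGN_MAPS with and without the auto-parallelization
# options, keeping the -fopt-info-loop report of the parallel build.
# Output file names below are hypothetical, chosen for illustration only.

SRC=PROP_DESIGN_MAPS.f
BASE_FLAGS="-O3 -ffixed-form -static -march=x86-64 -mtune=generic -mfpmath=sse -mieee-fp"
PAR_FLAGS="-pthread -ftree-parallelize-loops=2 -floop-parallelize-all -fopt-info-loop"

# Print the compile command for a given output name and optional extra flags.
build_cmd() {
    out=$1
    extra=$2
    if [ -n "$extra" ]; then
        echo "gfortran $SRC -o $out $BASE_FLAGS $extra"
    else
        echo "gfortran $SRC -o $out $BASE_FLAGS"
    fi
}

SERIAL_CMD=$(build_cmd maps_serial.exe "")
PAR_CMD=$(build_cmd maps_par.exe "$PAR_FLAGS")

echo "$SERIAL_CMD"
echo "$PAR_CMD"

# Only compile when gfortran and the source file are actually available.
if command -v gfortran >/dev/null 2>&1 && [ -f "$SRC" ]; then
    $SERIAL_CMD
    # -fopt-info messages go to stderr; save them for inspection.
    $PAR_CMD 2> par_report.txt
    # Count report lines that mention parallelization (assumed wording).
    grep -ci "paralleliz" par_report.txt || true
fi
```

Timing each executable on the same input (for example with the shell's time
command) would then show whether the parallel build differs in wall-clock
time at all.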

I'm wondering whether the code actually runs in parallel for Linux users. That
would at least narrow the problem down some. I'm using Windows 10 and the code
will only run on one core. Compiled either way, it shows 2 threads in use for
a while and then drops to 1 thread.
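
As a first sanity check, it may help to confirm how many logical processors
the shell that launches the program can see. The sketch below assumes either
nproc (GNU coreutils) or the Windows NUMBER_OF_PROCESSORS environment variable
is available; the fallback value of 1 is an assumption for when neither is.

```shell
#!/bin/sh
# Sketch: report how many logical processors this shell can see.
# nproc comes from GNU coreutils; NUMBER_OF_PROCESSORS is set by Windows.
if command -v nproc >/dev/null 2>&1; then
    CORES=$(nproc)
elif [ -n "$NUMBER_OF_PROCESSORS" ]; then
    CORES=$NUMBER_OF_PROCESSORS
else
    CORES=1   # fallback assumption when neither source is available
fi
echo "logical processors visible: $CORES"
```

If this reports 2 or more, the launching shell is not restricted to a single
core, which would point back at the build or the runtime rather than the
environment.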

The good news from when this was posted is that gfortran ran the code at the
same speed as the PGI Community Edition Compiler. Since they just stopped
developing that, I switched back to gfortran. I no longer have Intel Fortran to
test. That was the compiler that actually did run the code in parallel, but it
ran twice as slow instead of twice as fast. That was a year or two ago. I don't
know if it's any better now.

I'm wondering if there is some sort of issue with -pthread not being able to
use more than one core on Windows 10.

You can download PROP_DESIGN at https://propdesign.jimdofree.com

Inside the download are all the *.f files. I also have c.bat files in there
with the compiler options I used. The auto-parallelization options are not
present, since they still don't seem to be working, at least on Windows 10.

The code now runs much faster than it used to, due to many bug fixes and
improvements I've made over the years. However, you can get it to run really
slowly for testing purposes. In the settings file for the program, change the
defaults like this:

1           ALLOW VORTEX INTERACTIONS (1) FOR YES (2) FOR NO (INTEGER, NON-DIM,
DEFAULT = 2)
2           ALLOW BLADE-TO-BLADE INTERACTIONS (1) FOR YES (2) FOR NO (INTEGER,
NON-DIM, DEFAULT = 2)

or like this

1           ALLOW VORTEX INTERACTIONS (1) FOR YES (2) FOR NO (INTEGER, NON-DIM,
DEFAULT = 2)
1           ALLOW BLADE-TO-BLADE INTERACTIONS (1) FOR YES (2) FOR NO (INTEGER,
NON-DIM, DEFAULT = 2)

The first runs very slowly, the second incredibly slowly. I just close the
command window once I've seen whether the code is running in parallel or not.
With the defaults set at 2 for each of those values, the code runs so fast you
can't really get a sense of what's going on.

Thanks for any help,

Anthony
