Re: Tests of gcc development beyond its testsuite (in this case, for gfortran)
On 5/7/24 20:44, Toon Moene wrote: On 5/7/24 20:35, Andrew Pinski wrote: On Tue, May 7, 2024 at 11:31 AM Toon Moene wrote: On 5/7/24 00:02, Toon Moene wrote: OK, perhaps on the aarch64 I need the following option to make the comparison fair: ‘rdma’ Enable Round Double Multiply Accumulate instructions. This is on by default for -march=armv8.1-a. I.e., -mno-rdma (I hope that's correct - I'll will try that when the Sun rises again and I have some power to run the AArch64 machine ...). Well, I did two independent runs with gfortran-13.2 and the following options: -O3 -march=armv8.1-a+rdma and -O3 -march=armv8.1-a+nordma No difference in the number of error runs exceeding the prescribed thresholds. So, unless I made a mistake in the option specification (or the compiler silently ignored them because they were not applicable to my machine - ugh), the cause of the problem lies elsewhere. AARCH64 armv8-a has FMA as part of its base ISA. So you want to try with `-ffp-contract=off` instead. RDMA turns on/off instructions which are not used by the auto-vectorizer (yet) and used by intrinsics for them (If I read the code correctly). Ah, thanks - I'll try that tomorrow. Yep, that did it: --> LAPACK TESTING SUMMARY <-- Processing LAPACK Testing output found in the TESTING directory SUMMARY nb test run numerical error other error === = REAL1327023 0 (0.000%)0 (0.000%) DOUBLE PRECISION1327845 0 (0.000%)0 (0.000%) COMPLEX 786775 0 (0.000%)0 (0.000%) COMPLEX16 787842 0 (0.000%)0 (0.000%) --> ALL PRECISIONS 4229485 0 (0.000%)0 (0.000%) So, obviously, the threshold values for these tests were derived on a machine without fused-multiply-add, or without using them if present. This is perhaps not surprising, as the default build-and-test setup (make.inc.example) of the LAPACK package as distributed from netlib.org lists as the compiler choice: FC = gfortran FFLAGS = -O2 -frecursive FFLAGS_DRV = $(FFLAGS) FFLAGS_NOOPT = -O0 -frecursive which means that the choice of architecture on x86-64 would be "generic" and wouldn't include FMA instructions. If the authors had used that setup in deriving the thresholds, it is not surprising that you need -ffp-contract=off on architectures that include FMA instructions by default. Thanks for helping me out with this ! -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands
Re: Tests of gcc development beyond its testsuite (in this case, for gfortran)
On 5/7/24 20:35, Andrew Pinski wrote: On Tue, May 7, 2024 at 11:31 AM Toon Moene wrote: On 5/7/24 00:02, Toon Moene wrote: OK, perhaps on the aarch64 I need the following option to make the comparison fair: ‘rdma’ Enable Round Double Multiply Accumulate instructions. This is on by default for -march=armv8.1-a. I.e., -mno-rdma (I hope that's correct - I'll will try that when the Sun rises again and I have some power to run the AArch64 machine ...). Well, I did two independent runs with gfortran-13.2 and the following options: -O3 -march=armv8.1-a+rdma and -O3 -march=armv8.1-a+nordma No difference in the number of error runs exceeding the prescribed thresholds. So, unless I made a mistake in the option specification (or the compiler silently ignored them because they were not applicable to my machine - ugh), the cause of the problem lies elsewhere. AARCH64 armv8-a has FMA as part of its base ISA. So you want to try with `-ffp-contract=off` instead. RDMA turns on/off instructions which are not used by the auto-vectorizer (yet) and used by intrinsics for them (If I read the code correctly). Ah, thanks - I'll try that tomorrow. -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands
Re: Tests of gcc development beyond its testsuite (in this case, for gfortran)
On 5/7/24 00:02, Toon Moene wrote: OK, perhaps on the aarch64 I need the following option to make the comparison fair: ‘rdma’ Enable Round Double Multiply Accumulate instructions. This is on by default for -march=armv8.1-a. I.e., -mno-rdma (I hope that's correct - I'll will try that when the Sun rises again and I have some power to run the AArch64 machine ...). Well, I did two independent runs with gfortran-13.2 and the following options: -O3 -march=armv8.1-a+rdma and -O3 -march=armv8.1-a+nordma No difference in the number of error runs exceeding the prescribed thresholds. So, unless I made a mistake in the option specification (or the compiler silently ignored them because they were not applicable to my machine - ugh), the cause of the problem lies elsewhere. Kind regards, -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands
Re: Tests of gcc development beyond its testsuite (in this case, for gfortran)
On 5/6/24 23:35, Toon Moene wrote: On 5/6/24 23:32, Andrew Pinski wrote: Did you test x86_64 with -march=native (or with -mfma) or just -O3? The reason why I am asking is aarch64 includes FMA by default while x86_64 does not. Most recent x86_64 includes an FMA instruction but since the base ISA does not include it, it is not enabled by default. I am suspect the aarch64 "excessive exceeding the threshold for errors" are all caused by the more use of FMA rather than anything else. Aah, I forgot to include that tidbit, because its readily apparent from the full logs - I compiled with *just* -O3. Thanks, OK, perhaps on the aarch64 I need the following option to make the comparison fair: ‘rdma’ Enable Round Double Multiply Accumulate instructions. This is on by default for -march=armv8.1-a. I.e., -mno-rdma (I hope that's correct - I'll will try that when the Sun rises again and I have some power to run the AArch64 machine ...). I must say I didn't expected this - the discussion on the "Intel" side was always that the fact that fused multiply-add instruction didn't express the "real computations" expressed by the program meant that they were evil and therefore had to be hidden behind some special compiler option that made it very clear that those instruction were evil. Again, thanks to point me to the difference (in philosophy, if not math) between to the two continents (i.e., the Americas and Europe's - before Brexit - England :-) Kind regards, -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands
Re: Tests of gcc development beyond its testsuite (in this case, for gfortran)
On 5/6/24 23:32, Andrew Pinski wrote: Did you test x86_64 with -march=native (or with -mfma) or just -O3? The reason why I am asking is aarch64 includes FMA by default while x86_64 does not. Most recent x86_64 includes an FMA instruction but since the base ISA does not include it, it is not enabled by default. I am suspect the aarch64 "excessive exceeding the threshold for errors" are all caused by the more use of FMA rather than anything else. Aah, I forgot to include that tidbit, because its readily apparent from the full logs - I compiled with *just* -O3. Thanks, -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands
Tests of gcc development beyond its testsuite (in this case, for gfortran)
I have now, for some time, ran LAPACK's test programs on my gcc/gfortran builds on both on the x86_64-linux-gnu architecture, as well as the aarch64-linux-gnu one (see, e.g., http://moene.org/~toon/lapack-amd64-gfortran13-O3). The results are rather alarming - this is r15-202 for aarch64 vs r15-204 for x86_64 (compiled with -O3): diff lapack-amd64-gfortran15-O3 lapack-aarch64-gfortran15-O3 3892,3895c3928,3931 < REAL 1327023 0 (0.000%)0 (0.000%) < DOUBLE PRECISION 1300917 6 (0.000%)0 (0.000%) < COMPLEX786775 0 (0.000%)0 (0.000%) < COMPLEX16 787842 0 (0.000%)0 (0.000%) --- > REAL 1317063 71 (0.005%)0 (0.000%) > DOUBLE PRECISION 1318331 54 (0.004%)4 (0.000%) > COMPLEX767023 390 (0.051%)0 (0.000%) > COMPLEX16 772338 305 (0.039%)0 (0.000%) 3897c3933 < --> ALL PRECISIONS 4202557 6 (0.000%)0 (0.000%) --- > --> ALL PRECISIONS 4174755 820 (0.020%)4 (0.000%) Note the excessive exceeding the threshold for errors on the aarch64 side (>). Of course, this is only an excerpt of the full log file - there is more information in it to zoom in on the errors on the aarch64 side (note that the x86_64 side is not faultless). Is there a way to pass this information to our websites, so that we do not "forget" this - or in the alternative, follow the progress in solving this ? Kind regards, -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands
Re: Sourceware mitigating and preventing the next xz-backdoor
On 4/3/24 20:25, Ian Lance Taylor wrote: Note that the attack really didn't have anything to do with compressing data. The library used an IFUNC to change the PLT of a different function, so it effectively took control of the code that verified the cryptographic key. The only part of the attack that involved compression was the fact that it happened to live in a compression library. And it wouldn't matter whether the code that verified the cryptographic key was run as root either; the effect of the attack was to say that the key was OK, and that sshd should execute the command, and of course that execution must be done on behalf of the requesting user, which (as I understand it) could be root. Ah, OK - that's what I missed. Does your explanation mean that - if, as I do in my sshd config file - you *forbid* root access via sshd in *any* way, you wouldn't be vulnerable ? Thanks, -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands
Re: Sourceware mitigating and preventing the next xz-backdoor
On 4/1/24 17:06, Mark Wielaard wrote: A big thanks to everybody working this long Easter weekend who helped analyze the xz-backdoor and making sure the impact on Sourceware and the hosted projects was minimal. Thanks for those efforts ! Now, I have seen two more days of thinking about this vulnerability ... but no one seem to address the following issues: A hack was made in liblzma, which, when the code was executed by a daemon that by virtue of its function, *has* to be run as root, was effective. Two questions arise (as far as I am concerned): 1. Do daemons like sshd *have* to be linked with shared libraries ? Or could it be left to the security minded of the downstream (binary) distributions to link it statically with known & proven correct libraries ? 2. Is it a limitation of the Unix / Linux daemon concept that, once such a process needs root access, it has to have root access *always* - even when performing trivial tasks like compressing data ? I recall quite well (vis-a-vis question 2) that the VMS equivalent would drop all privileges at the start of the code, and request only those relevant when actually needed (e.g., to open a file for reading that was owned by [the equivalent on VMS] of root - or perform other functions that only root could do), and then drop them immediately afterwards again. Kind regards, -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands
aarch64 testresults of r14-7229 fails math/cmplx in the libgo tests.
Hi to the AARCH64 team, I managed to do a (bog-standard normal) bootstrap on my Aarch64 system. See: https://gcc.gnu.org/pipermail/gcc-testresults/2024-January/805347.html It got this for the libgo tests: === libgo tests === Running target unix FAIL: math/cmplx === libgo Summary === # of expected passes195 # of unexpected failures1 /dev/shm/bld2283/./gcc/gccgo version 14.0.1 20240114 (experimental) [master r14-7229-ged5bf2080c5] (GCC) Subsequently, I ran LAPACK's test programs (using lapack-3.11.0) with optimization options -O3 -march=native -mtune=native, and I got the following differences with respect to the same run using gfortran 13 as installed on Debian testing (updated last Saturday): diff thunderx-gfortran13.txt thunderx-gfortran14.txt ... 201,204c341,344 < REAL 1306803 30 (0.002%)0 (0.000%) < DOUBLE PRECISION 1318331 56 (0.004%)4 (0.000%) < COMPLEX770551 298 (0.039%)0 (0.000%) < COMPLEX16 779394 98 (0.013%)0 (0.000%) --- > REAL 1317063 71 (0.005%)0 (0.000%) > DOUBLE PRECISION 1318331 54 (0.004%)4 (0.000%) > COMPLEX767023 390 (0.051%)0 (0.000%) > COMPLEX16 174950 41371 (23.647%) 64 (0.037%) 206c346 < --> ALL PRECISIONS 4175079 482 (0.012%)4 (0.000%) --- > --> ALL PRECISIONS 3577367 41886 (1.171%)68 (0.002%) Note that the only differences are in COMPLEX16 (i.e., complex computations using 64 bit REAL and IMAGINARY parts). Perhaps the failing libgo test case is sufficient to track this down ... I suppose you can reproduce that one more easily. Kind regards, -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands
Re: Complex numbers support: discussions summary
Sylvain, Is this on a branch in your github repository https://github.com/kalray/gcc somewhere ? That would make it easier to test it for me (and probably others). See for instance my mail here (d.d. Thu Oct 5 14:45:05 GMT 2023): https://gcc.gnu.org/pipermail/gcc/2023-October/242643.html Thanks in advance. Kind regards, Toon Moene. On 10/16/23 11:14, Sylvain Noiry via Gcc wrote: Hi, We are trying to update our patches on complex numbers to take into account what has been discussed. The main change from our previous patches consists of replacing vectors of complex types with classical vectors of real types (ex V4SF instead of V2SC) associated with existing complex opcodes (like .COMPLEX_MUL) when vectorizing. Non vectored complex modes are also replaced by vectors of two reals at the end of the middle-end (ex SC to V2SF), so that it can reuse already existing patterns. Indeed, non complex specific operations like an addition does not require an specific pattern anymore, and already implementing patterns like cmul, cmul_conj, cadd90,... can be used. To do so, the cplxlower pass has been cut into two passes: - The first one replace complex specific opcodes with dedicated opcodes (like .COMPLEX_MUL replacing MUL_EXPR with SC mode), but complex modes are kept at this point. Unsupported native operations are also lowered, because we assume that it's better to lower and hope for standard optimizations in the middle-end than trying to vectorize with near-zero chance, and then lower only after. - The second one almost only remaps non vectored complex modes into vector of two reals (like SC to V2SF). So the vectorizer takes complex modes as input but vectorize with vectors of real modes (ex V4SF vector mode for SC). Because complex specific opcodes have been set before, no confusion with real operations is possible. We also may use vectors of two reals as inputs, but vectorizing small vector modes into bigger ones (like V2SF to V4SF) is not possible. Here are some advantages of this new approach: - No more vectors of complex modes - The vectorization of complex operations is improved, because split and unified vectored statements can easely be mixed as it uses the same vector type. We can also imagine to test multiple options (First: native vectored, second: split vectored, third: unified scalar,...). - It reuses patterns for vectors of two reals for non complex specific operations, and also already existing complex patterns like cmul implemented on aarch64, which could mean almost free performance gains on many targets. On the performance side, we can still exploit the full potential of complex instructions on KVX. To illustrate the gains on aarch64 without rewriting any patterns (except a mov), here is the assembly generated for a vector complex mul mul add with -O2 -mcpu=neoverse-v1 (and without ffast-math like with SLP): void vfmma (_Complex float a[restrict N], _Complex float b[restrict N], _Complex float c[restrict N], _Complex float d[restrict N]) { for (int i = 0; i < N; i++) c[i] += a[i] * b[i] * d[i]; } vfmma: movi v3.4s, 0 mov x4, 0 .align 5 .L2: ldr q2, [x1, x4] mov v1.16b, v3.16b ldr q0, [x0, x4] fcmla v1.4s, v0.4s, v2.4s, #0 fcmla v1.4s, v0.4s, v2.4s, #90 ldr q0, [x2, x4] ldr q2, [x3, x4] fcmla v0.4s, v2.4s, v1.4s, #0 fcmla v0.4s, v2.4s, v1.4s, #90 str q0, [x2, x4] add x4, x4, 16 cmp x4, 256 bne .L2 ret We have only done some experimentation with this approach. If you think that it could be interesting we will try to develop it more. Thanks, Sylvain -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands
Re: Complex numbers support: discussions summary
On 9/26/23 20:40, Toon Moene wrote: On 9/26/23 09:30, Richard Biener via Gcc wrote: On Mon, Sep 25, 2023 at 5:17 PM Sylvain Noiry via Gcc wrote: As I said at the end of the presentation, we have written a paper which explains our implementation in details. You can find it on the wiki page of the Cauldron (https://gcc.gnu.org/wiki/cauldron2023talks?action=AttachFile&do=view&target=Exposing+Complex+Numbers+to+Target+Back-ends+%28paper%29.pdf). Thanks for the detailed presentation at the Cauldron. My personal summary is that I'm less convinced delaying lowering is the way to go. Thanks Sylvain for the quick summary of the discussion - it helps a great deal now that the discussion is still fresh in our memory. I found time today to run some tests. First of all, the result of the gcc test harness as applied to the top of the complex/kvx branch in the https://github.com/kalray/gcc repository: https://gcc.gnu.org/pipermail/gcc-testresults/2023-October/797627.html I think there are several complex failures here that are not in "standard" 12.2 release (for x86_64-linux-gnu). I also compiled all of lapack-3.11.0 with that compiler and obtained the same results as with gcc/gfortran 13.2: --> LAPACK TESTING SUMMARY <-- Processing LAPACK Testing output found in the TESTING directory SUMMARY nb test run numerical error other error === = REAL1327023 0 (0.000%)0 (0.000%) DOUBLE PRECISION1300917 6 (0.000%)0 (0.000%) COMPLEX 786775 0 (0.000%)0 (0.000%) COMPLEX16 787842 0 (0.000%)0 (0.000%) --> ALL PRECISIONS 4202557 6 (0.000%)0 (0.000%) Kind regards, -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands
Re: Test with an lto-build of libgfortran.
On 9/28/23 21:26, Jakub Jelinek wrote: It is worse than that, usually the LTO format changes e.g. any time any option or parameter is added on a release branch (several times a year) and at other times as well. Though, admittedly GCC is the single package that actually could get away with LTO in lib*.a libraries, at least in some packagings (if the static libraries are in gcc specific subdirectories rather than say /usr/lib{,64} or similar and if the packaging of gcc updates both the compiler and corresponding static libraries in a lock-step. Because in that case LTO in there will be always used only by the same snapshot from the release branch and so should be compatible with the LTO in it. This might be an argument to make it a configure option, e.g. --enable-lto-runtime. Kind regards, -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands
Re: Test with an lto-build of libgfortran.
On 9/28/23 07:33, Thomas Koenig wrote: Hi Toon, [ I wrote: ] The full question of "lto-ing" run time libraries is more complicated than just "whether it works" as those who attended the BoF will recall. I didn't attend the Cauldron (but that discussion would have been very interesting). I think for libgfortran, a first step would be additional work to get declarations on both sides to agree (which is worth doing anyway). Best regards Thomas The big problem in *distributing* GCC (i.e., the collection) with lto'd run-time libraries is that the format of the lto structure changes with releases. If a compiler (by accident) picks up a run time library with non-matching lto objects, it might crash (or "introduce subtle errors in a once working program"). I.e., like the problem the gfortran community had with the changing format of our .mod files. But it would be a big win for Fortran ... Kind regards, -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands
Test with an lto-build of libgfortran.
Hi all, During the GNU Tools Cauldron we discussed (at the BoF: IPA & LTO) the possibility (and hazards) of building the run time libraries for various compilers with -flto, enabling an -flto -static linking of programs with the run time library available during link time optimizations. Today I tried that on my (AMD Ryzen 7 5800U) laptop with gcc version 14.0.0 20230926 (experimental) [master r14-4282-g53daf67fd55] (GCC) with the following "quick hack": diff --git a/libgfortran/configure b/libgfortran/configure index cd176b04a14..69a2b4a8881 100755 --- a/libgfortran/configure +++ b/libgfortran/configure @@ -5959,11 +5959,11 @@ fi # Add -Wall -fno-repack-arrays -fno-underscoring if we are using GCC. have_real_17=no if test "x$GCC" = "xyes"; then - AM_FCFLAGS="-I . -Wall -Werror -fimplicit-none -fno-repack-arrays -fno-underscoring" + AM_FCFLAGS="-I . -Wall -Werror -fimplicit-none -fno-repack-arrays -fno-underscoring -flto" ## We like to use C11 and C99 routines when available. This makes ## sure that ## __STDC_VERSION__ is set such that libc includes make them available. - AM_CFLAGS="-std=gnu11 -Wall -Wstrict-prototypes -Wmissing-prototypes -Wold-style-definition -Wextra -Wwrite-strings -Werror=implicit-function-declaration -Werror=vla" + AM_CFLAGS="-std=gnu11 -Wall -Wstrict-prototypes -Wmissing-prototypes -Wold-style-definition -Wextra -Wwrite-strings -Werror=implicit-function-declaration -Werror=vla -flto" ## Compile the following tests with the same system header contents ## that we'll encounter when compiling our own source files. CFLAGS="-std=gnu11 $CFLAGS" The build of this compiler (languages=fortran) completed without problems (no test results - not enough time). I then proceeded to build LAPACK with the following build options: CFLAGS = -O3 -flto -flto-partition=none -static and FFLAGS = -O3 -flto -flto-partition=none -static This gave the same test results of the LAPACK test suite as the build with the same compiler, but without an lto'd libgfortran. The lto-ing of libgfortran did succeed, because I did get a new warning: gfortran -O3 -flto -flto-partition=none -static -o xlintstrfz zchkrfp.o zdrvrfp.o zdrvrf1.o zdrvrf2.o zdrvrf3.o zdrvrf4.o zerrrfp.o zlatb4.o zlaipd.o zlarhs.o zsbmv.o zget04.o zpot01.o zpot03.o zpot02.o chkxer.o xerbla.o alaerh.o aladhd.o alahd.o alasvm.o ../../libtmglib.a ../../liblapack.a ../../librefblas.a In function 'xtoa_big', inlined from 'write_z' at /home/toon/compilers/gcc/libgfortran/io/write.c:1296:11, inlined from 'formatted_transfer_scalar_write' at /home/toon/compilers/gcc/libgfortran/io/transfer.c:2136:4: /home/toon/compilers/gcc/libgfortran/io/write.c:1222:6: warning: writing 1 byte into a region of size 0 [-Wstringop-overflow=] 1222 | *q = '\0'; | ^ /home/toon/compilers/gcc/libgfortran/io/write.c: In function 'formatted_transfer_scalar_write': /home/toon/compilers/gcc/libgfortran/io/write.c:1291:8: note: at offset [34, 4294967294] into destination object 'itoa_buf' of size 33 1291 | char itoa_buf[GFC_XTOA_BUF_SIZE]; |^ which was (of course) not given with a non-lto libgfortran. The full question of "lto-ing" run time libraries is more complicated than just "whether it works" as those who attended the BoF will recall. Hope this helps, -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands
Re: Complex numbers support: discussions summary
On 9/26/23 09:30, Richard Biener via Gcc wrote: On Mon, Sep 25, 2023 at 5:17 PM Sylvain Noiry via Gcc wrote: As I said at the end of the presentation, we have written a paper which explains our implementation in details. You can find it on the wiki page of the Cauldron (https://gcc.gnu.org/wiki/cauldron2023talks?action=AttachFile&do=view&target=Exposing+Complex+Numbers+to+Target+Back-ends+%28paper%29.pdf). Thanks for the detailed presentation at the Cauldron. My personal summary is that I'm less convinced delaying lowering is the way to go. Thanks Sylvain for the quick summary of the discussion - it helps a great deal now that the discussion is still fresh in our memory. Some thought I came up with (of course, only after the end of the conference): In what way is the handling of the complex type different from that of the 128 bit real (i.e., float) type ? Both are not implemented on most architectures; on most they require two registers (or possibly two memory location that do not necessarily have to be adjacent) to be implemented. Yet both are supported by the middle end - consider the clear equivalence of the handling of variables a and b when looking at the result of -fdump-tree-ssa (on x86_64) for: cat 128.f90 parameter (iq=kind(1q0)) real(kind=iq) :: a, b read*, a, b print*, a / b end and: cat complex.f90 complex a,b read*,a,b print*,a/b end Hope this helps for a continuing fruitful discussion. Kind regards, -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands
Re: Complex numbers in compilers - upcoming GNU Tools Cauldron.
On 9/12/23 11:25, Richard Biener wrote: On Tue, Sep 5, 2023 at 10:44 PM Toon Moene wrote: This is even obvious in weather forecasting software I have to deal with *today* (all written in Fortran). Some models use complex variables to encode the "spectral" (wave-decomposed) computations in parts where that is useful - others just "degrade" those algorithms to explicitly use reals. Lack of applications / benchmarks using complex numbers is also a problem for any work on this. Yes, a certain amount of circular "reasoning" is at work here. However, there is sufficient Fortran programming out there to study how to compile complex number arithmetic ... LAPACK and its test programs. I realize that they are not "benchmarks" in the sense that they do not give you a measure how to speed up the code the compiler generates; but they are real-life complex number algorithms coded in a programming language that had complex number support from the beginning. Hope this helps, -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands
Complex numbers in compilers - upcoming GNU Tools Cauldron.
This is going to be an interesting discussion. In the upcoming GNU Tools Cauldron meeting the representation of complex numbers in GCC will be discussed from the following "starting point": "Complex numbers are used to describe many physical phenomenons and are of prime importance in data signal processing. Nevertheless, despite being part of the C and C++ standards since C99, they are still not completely first class citizens in mainstream compilers." *This* is from the Fortran 66 Standard (http://moene.org/~toon/f66.pdf - a photocopy of the 1966 Standard): - - - - - Chapter 4. Data Types: ... 4.2.4 Complex Type. A complex datum is processor approximation to the value of a complex number. ... - - - - - I can recall people complaining about the way complex arithmetic was handled by compilers since the late 70s. This is even obvious in weather forecasting software I have to deal with *today* (all written in Fortran). Some models use complex variables to encode the "spectral" (wave-decomposed) computations in parts where that is useful - others just "degrade" those algorithms to explicitly use reals. Kind regards, -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands
/home/toon/compilers/gcc/libgfortran/generated/matmul_i1.c:1781:1: internal compiler error: RTL check: expected elt 0 type 'i' or 'n', have 'w' (rtx const_int) in vpternlog_redundant_operand_mask, at
Wonder if I am the only one to see this: https://gcc.gnu.org/pipermail/gcc-testresults/2023-August/792616.html To quote: during RTL pass: split1 /home/toon/compilers/gcc/libgfortran/generated/matmul_i1.c: In function 'matmul_i1_avx512f': /home/toon/compilers/gcc/libgfortran/generated/matmul_i1.c:1781:1: internal compiler error: RTL check: expected elt 0 type 'i' or 'n', have 'w' (rtx const_int) in vpternlog_redundant_operand_mask, at config/i386/i386.cc:19460 1781 | } | ^ during RTL pass: split1 /home/toon/compilers/gcc/libgfortran/generated/matmul_i2.c: In function 'matmul_i2_avx512f': /home/toon/compilers/gcc/libgfortran/generated/matmul_i2.c:1781:1: internal compiler error: RTL check: expected elt 0 type 'i' or 'n', have 'w' (rtx const_int) in vpternlog_redundant_operand_mask, at config/i386/i386.cc:19460 1781 | } | ^ 0x7a5cc7 rtl_check_failed_type2(rtx_def const*, int, int, int, char const*, int, char const*) /home/toon/compilers/gcc/gcc/rtl.cc:761 0x82bf8d vpternlog_redundant_operand_mask(rtx_def**) /home/toon/compilers/gcc/gcc/config/i386/i386.cc:19460 0x1f1295b split_44 /home/toon/compilers/gcc/gcc/config/i386/sse.md:12730 0x1f1295b split_63 /home/toon/compilers/gcc/gcc/config/i386/sse.md:28428 0xe7663b try_split(rtx_def*, rtx_insn*, int) /home/toon/compilers/gcc/gcc/emit-rtl.cc:3800 0xe76cff try_split(rtx_def*, rtx_insn*, int) /home/toon/compilers/gcc/gcc/emit-rtl.cc:3972 0x11b2938 split_insn /home/toon/compilers/gcc/gcc/recog.cc:3385 0x11b2eff split_all_insns() /home/toon/compilers/gcc/gcc/recog.cc:3489 0x11dd9c8 execute /home/toon/compilers/gcc/gcc/recog.cc:4413 Please submit a full bug report, with preprocessed source (by using -freport-bug). Please include the complete backtrace with any bug report. See <https://gcc.gnu.org/bugs/> for instructions. make[3]: *** [Makefile:4584: matmul_i1.lo] Error 1 make[3]: *** Waiting for unfinished jobs -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands
Re: [RFC] Exposing complex numbers to target backends
On 7/5/23 17:12, Sylvain Noiry via Gcc wrote: Hi, My name is Sylvain, I am an intern at Kalray and I work on improving the GCC backend for the KVX target. The KVX ISA has dedicated instructions for the handling of complex numbers, which cannot be selected by GCC due to how complex numbers are handled internally. My goal is to make GCC able to expose to machine description files new patterns dealing with complex numbers. I already have a proof of concept which can increase performance even on other backends like x86 if the new patterns are implemented. I do not have the expertise to evaluate if your approach is the way we want to go forward with the handling of complex numbers in the middle-/back-end(s) of GCC. However, I *do* have a sizable amount of (Fortran) code over here that uses complex numbers in a day to day operation that I am willing to test (obviously, "day to day operation" is to be interpreted loosely here - what I do is following the *real* operation at my employer (KNMI) with my own weather forecasting computations at home, compiled with GCC). I suppose that your patch is against the master branch of GCC's repository - so I'll have to test that first, cleanly. Kind regards, -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands
Re: gcd_1.c:188:13: runtime error: shift exponent 64 is too large for 64-bit type 'long unsigned int'
On 2/2/22 10:11, Marc Glisse wrote: On Wed, 2 Feb 2022, Toon Moene wrote: Fascinating. My gcc directory had both gmp-6.2.1 and -6.1.0, but the symbolic link 'gmp' pointed to the old one. A similar problem for mpc, mpfr and isl ... You need to pass --force to contrib/download_prerequisites if you want them to be updated. Ah, that's useful, didn't know that. -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands
Re: gcd_1.c:188:13: runtime error: shift exponent 64 is too large for 64-bit type 'long unsigned int'
On 2/1/22 22:44, Marc Glisse wrote: On Tue, 1 Feb 2022, Toon Moene wrote: I just ran a "ubsan" build on my x86_64-linux-gnu system. Maybe try with a more recent version of GMP first? gcd_1.c has only 103 lines in release 6.2.1. A stack trace (UBSAN_OPTIONS=print_stacktrace=1) would make it easier to guess where this is coming from. Fascinating. My gcc directory had both gmp-6.2.1 and -6.1.0, but the symbolic link 'gmp' pointed to the old one. A similar problem for mpc, mpfr and isl ... I will retest. Thanks ! -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands
gcd_1.c:188:13: runtime error: shift exponent 64 is too large for 64-bit type 'long unsigned int'
I just ran a "ubsan" build on my x86_64-linux-gnu system. See: https://gcc.gnu.org/pipermail/gcc-testresults/2022-February/754454.html This is an interesting failure: Executing on host: /home/toon/scratch/bld1142336/gcc/testsuite/gfortran29/../../gfortran -B/home/toon/scratch/bld1142336/gcc/testsuite/gfortran29/../../ -B/home/toon/scratch/bld1142336/x86_64-pc-linux-gnu/./libgfortran/ /home/toon/compilers/gcc/gcc/testsuite/gfortran.dg/graphite/pr39516.f -fdiagnostics-plain-output -fdiagnostics-plain-output-O -O2 -ftree-loop-linear -S -o pr39516.s(timeout = 300) spawn -ignore SIGHUP /home/toon/scratch/bld1142336/gcc/testsuite/gfortran29/../../gfortran -B/home/toon/scratch/bld1142336/gcc/testsuite/gfortran29/../../ -B/home/toon/scratch/bld1142336/x86_64-pc-linux-gnu/./libgfortran/ /home/toon/compilers/gcc/gcc/testsuite/gfortran.dg/graphite/pr39516.f -fdiagnostics-plain-output -fdiagnostics-plain-output -O -O2 -ftree-loop-linear -S -o pr39516.s gcd_1.c:188:13: runtime error: shift exponent 64 is too large for 64-bit type 'long unsigned int' FAIL: gfortran.dg/graphite/pr39516.f -O (test for excess errors) Excess errors: gcd_1.c:188:13: runtime error: shift exponent 64 is too large for 64-bit type 'long unsigned int' Note that the test case is pure Fortran source. The undefined error seems to come from a function inside the graphite library ... Kind regards, -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands
Re: [PATCH] Mass rename of C++ .c files to .cc suffix
On 1/11/22 13:56, Martin Liška wrote: Patch can bootstrap on x86_64-linux-gnu and survives regression tests. Plus it survives build of all FEs (--enable-languages=all) on x86_64-linux-gnu and I've built all cross compilers. Does this also rename .c files in the fortran and libgfortran directories ? I would recommend to send this message to the fort...@gcc.gnu.org list too, then. Not everyone reads the gcc and gcc-patches lists ... Kind regards, -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands
Re: [gen-14164] Invitation to the CERT Vendor Meeting 2022
On 1/7/22 21:14, cert+donotreply--- via Gcc wrote: Topic 2: There's no such thing as free software, or, how to invest in OSS security. Wasn't this Cygnus motto: "We make free software affordable ?" Kind regards, -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands
Re: Can gcc itself be tested with ubsan? If so, how?
On 9/28/21 8:35 AM, Erick Ochoa via Gcc wrote: Can ubsan be used on the compiler itself? I regularly build the compiler(s) natively with ubsan enabled, see for instance: https://gcc.gnu.org/pipermail/gcc-testresults/2021-September/719448.html The configure line tells you how to do it (towards the end of the mail): configure flags: --prefix=/home/toon/compilers/install/gcc --with-gnu-as --with-gnu-ld --enable-languages=all,ada --disable-multilib --disable-nls --with-build-config=bootstrap-ubsan --enable-checking=all (the enable-checking part is not relevant, and can be omitted). Kind regards, -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands
Re: GCC association with the FSF
On 4/14/21 6:18 PM, Jeff Law via Gcc wrote: On 4/14/2021 6:08 AM, Richard Biener via Gcc wrote: On April 14, 2021 12:19:16 PM GMT+02:00, Jonathan Wakely via Gcc wrote: N.B. Jeff is no longer @redhat.com so I've changed the CC On Wed, 14 Apr 2021 at 11:03, Thomas Koenig wrote: - All gfortran developers move to the new branch. This will not happen, I can guarantee you that. This is the part I'm curious about (the rest is obvious, it follows >from there being finite resources and the nature of any fork). But I'm not going to press for reasons. Note the only viable fork will be on the current hosting (which isn't FSF controlled) with the downside of eventually losing the gcc.gnu.org DNS and thus a need to "switch" to a sourceware.org name. I strongly suspect you're right here. Ultimately if one fork reaches critical mass, then it survives and the other dies. That's a function of the developer community. Right now I don't see the nightmare scenario of both forks being viable playing out -- however I'm more concerned now than I was before due Thomas's comments. When plans for the EGCS were underway, and the (then) Fortran supporters were into the plans, it scared the hell out of me, because it was completely unclear to me where it would end. But in the end: I am a supporter of Free Software, not a organization, or a person, but *developers* who support Free Software. That's what got me to go for the fork of EGCS - and I have not been disappointed. -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands
Re: [patch, doc] Update people who can review gfortran patches
On 9/25/20 8:02 AM, Thomas Koenig via Fortran wrote: Hello world, for review of its patches, gfortran relies on a group of people who can approve patches. Unfortuntately, many of them are not active. Others, who have the capability and who have acted as de facto approvers (without anybody minding) are missing. This (somewhat overdue) patch rectifies that. It adds Tobias Burnus, Jakub Jelinek and Dominique d'Humieres to the list of people who can approve other people's patches in gfortran and libgfortran. Among the people who are currently acitive reviewers, there was unanimous consent that this should be done. I'm not 100% sure we need steering committee approval for this (Toon?), if so, I'd like to request it with this mail. Well, I would say, given the procedure you followed in asking relevant people for their consent and the fact that these are not names "out of the blue", I am convinced the steering committee would approve of this. If questions arise, I will take care of them. Thanks for your effort ! Kind regards, -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands
Re: GCC / GFortran (9.3.0; Cygwin 64) Internal Compiler Error on NINT() Function
ave-temps to the complete compilation command, or, in the case of a bug report for the GNAT front end, a complete set of source files (see below). -> gfortran -save-temps -o nint_error.e *nint_error.f90* nint_error.f90:17:0: 17 | m=nint(y,i16) | internal compiler error: in build_round_expr, at fortran/trans-intrinsic.c:396 Please submit a full bug report, with preprocessed source if appropriate. See <https://gcc.gnu.org/bugs/> for instructions. (and with IDNINT()): -> gfortran -save-temps -o nint_error.e nint_error.f90 nint_error.f90:17:7: 17 | m=idnint(y,i16) | 1 Error: Too many arguments in call to ‘idnint’ at (1) Thanks for looking into this, Bernd (Eggen) PS Here a part of the output if omitting the "KIND" optional argument in NINT(): -> ./nint_error.e | & more i16= 16 1 1 1. 0 2 2 2. 1 3 4 4. 3 4 8 8. 7 [...] 31 1073741824 1073741824.000 1073741823 32 2147483648 2147483648.000 2147483647 33 4294967296 4294967296.000 *-1* As you can see, after 2^31-1 = 2147483647 it goes wrong and yields -1 If increasing the integer by 1, it goes wrong thus: [...] 2147483647 2147483647.000 2147483647 2147483648 2147483648.000 -2147483648 [...] -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands
Re: gcc math functions for OpenMP vectoization
On 6/5/20 6:10 PM, Tobias Burnus wrote: On 6/5/20 4:11 PM, Jakub Jelinek via Gcc wrote: It is glibc that provides them, not GCC. See https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/x86/fpu/bits/math-vector.h;h=0801905da7b85e2f43fb6b682a7b84c5eec469d4;hb=HEAD Minor addition: That header file is included in math.h, i.e. automatically available. For Fortran/gfortran there is math-vector-fortran.h (also provided by glibc) which has the same functions and a similar effect. I wonder if there are Linux distributions where this actually effected already. I know for sure that it is not in Debian Testing (as of two weeks ago) and Red Hat Fedora 30 (similarly). Do you know of any ? Kind regards, -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands
Re: Fixing gcc git logs
On 1/2/20 11:49 AM, Richard Earnshaw wrote: On 02/01/2020 02:00, Jerry wrote: In the following git log entry, I made a typo on the PR number in the libgfortran ChangeLog file. I noticed this right after the git commit, while editing the git log. ... If you've pushed the branch to a public tree then it might be too late, depending on the push policy for branches (in gcc trunk we will not permit rewriting history in that way because it breaks other users trying to pull). Some more private branches might permit such rewriting of history. I am glad you explained it. Previously, the only reaction I got from git-cognoscenti was "Don't do that - it will ruin history for everybody!". Thanks ! Might 2020 be the year of git for GCC. -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
Re: Test GCC conversion with reposurgeon available
On 12/26/19 10:30 PM, Eric S. Raymond wrote: Me, I don't undertstand why version-control systems designed for distributed use don't ignore timezones entirely and display all times in UTC - relative time is surely more imoortant than the commit time's relationship to solar noon wherever the keyboard happened to be. But I don't make these decisions. So we are going to base this world wide free software endeavor on a source code system that doesn't keep time by UTC ? My God - imagine if weather forecasting was done this way. -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
Re: GCC Spec2017 optimization Wiki
Over a month ago, I wrote , about SPEC2017: Of the 7 benchmarks that are (partly) written in Fortran, Cactus is free software (LGPL'd) and the 3 geological ones (wrf, cam4 and roms) are "obtainable" (need to register to get the source code). Of course, that means you get "a" version of the code, not necessarily what is in the SPEC benchmark, but at least it enables us to join in the analysis. exchange2 was written by Michael Metcalf, of countless Fortran books, whom I met once (when I was on the Fortran Standardization Committee). He might be persuaded to give us a copy for analysis if this really is an outlier in performance. Although I still can't completely vouch for it's correctness, I have written a Sudoku solver (exchange2 is a form of a Sudoku solver) in Fortran 2018. In principle - if you can find the initial clue arrangement - it can solve 3x3, 4x4, 5x5, and 6x6 Sudoku's. Up til now, I have only been able to test it on 3x3 and 4x4 examples. You'll find it on my web page (indicated below). -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
Re: GCC Spec2017 optimization Wiki
On 10/6/19 5:14 PM, Thomas Koenig wrote: Hi Tamar, In general our approach is to identify areas for improvement in a benchmark and provide a testcase that's independent of the benchmark when reporting it in a PR upstream. Sounds like a good approach, in principle. If the people who are doing the identfying know Fortran well, that would work even better (do they?), and if they could be persuaded to work on gfortran directly, that would probably be best. Of the 7 benchmarks that are (partly) written in Fortran, Cactus is free software (LGPL'd) and the 3 geological ones (wrf, cam4 and roms) are "obtainable" (need to register to get the source code). Of course, that means you get "a" version of the code, not necessarily what is in the SPEC benchmark, but at least it enables us to join in the analysis. exchange2 was written by Michael Metcalf, of countless Fortran books, whom I met once (when I was on the Fortran Standardization Committee). He might be persuaded to give us a copy for analysis if this really is an outlier in performance. Kind regards, -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
Re: gcc-10 Bug report
On 10/3/19 10:01 PM, Thomas Koenig wrote: Hi, I am using gcc version 10.0.0 20190825 (experimental) . During the Lapack compilation I got the following errors: I had the nagging feeling that this error analysis might not be correct. If you use a slightly newer version, the error message will become clearer: sbdsvdx.f:777:39: 420 | CALL SCOPY( N, D, 1, WORK( IETGK ), 2 ) | 2 .. 777 | CALL SCOPY( N*2, Z( 1,I ), 1, WORK, 1 ) | 1 Fehler: Rank mismatch between actual argument at (1) and actual argument at (2) (scalar and rank-1) This is a violation of the Fortran standard by the Lapack code. To allow this idiom, you can add the -fallow-argument-mismatch argument to the OPTS variable in make.inc before building. For simplicity I use the wording in the Fortran 77 Standard - I don't think subsequent Standards changed this or made it obsolescent. 15.9.3.3 Arrays as Dummy Arguments Within a program unit, the array declarator given for an array provides all array declarator information needed for the array in an execution of the program unit. The number and size of dimensions in an actual argument array declarator may be different from the number and size of the dimensions in an associated dummy argument array declarator. A dummy argument that is an array may be associated with an actual argument that is an array, array element, or array element substring. Of course, you need to read much more of the F77 Standard to find the definitions of all these terms, but I think the last line quoted actually *allows* passing WORK( IETGK ) as an actual argument associated with an array dummy argument. Shoot me. -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
Re: Bug in closed-source, proprietary software that I do not have access to
On 5/25/19 7:43 PM, Toon Moene wrote: On 5/25/19 7:31 PM, Thomas Koenig wrote: Hi Toon, On 5/25/19 7:01 PM, Steve Kargl wrote: For WRF, I suppose you or Martin could be a good citizen and contact the project to report a bug. I have thought about this. As a person with experience building and running weather forecasting codes, I would be first in line to try this. But the SPEC code could be modified and not resemble a fresh WRF setup ... The current version of netcdf has, in module_netcdf_nf_interfaces.F90, !- nf_put_vara_double -- Interface Function nf_put_vara_double(ncid, varid, start, counts, dvals) & RESULT(status) USE netcdf_nf_data, ONLY: RK8 Integer, Intent(IN) :: ncid, varid Integer, Intent(IN) :: start(*), counts(*) Real(RK8), Intent(IN) :: dvals(*) Integer :: status End Function nf_put_vara_double End Interface which looks good (well, it's not BIND(C)). In another file, it has it as a simple INTEGER declaration with EXTERNAL. I don't know which of the two is actually used. OK. One thing I can do (tomorrow) is to download WRF, build it and see how it uses netcdf. The trunk compiler I have at hand is revision 271618, so it includes your update that's the subject of PR90539. What I started with is some modern versions of the libraries needed: hdf5-1.10.5, netcdf-c-4.6.3, and netcdf-fortran-4.4.5, built with trunk revision 271618. netcdf-c-4.6.3 (the C netcdf library) passed all its own tests. netcdf-fortran-4.4.5 is the library of Fortran "glue" routines to the C library. These are the relevant C interoperability comments during configure, indicating that it will use BIND(C) declarations for interfacing with the C library: checking for Fortran flag to compile .f90 files... none checking fortran 90 modules inclusion flag... -I checking if Fortran compiler supports Fortran 2003 ISO_C_BINDING... yes checking if Fortran compiler supports Fortran 2008 ISO_FORTRAN_ENV additions... yes checking if Fortran compiler supports TS29113 standard extension... yes checking whether F03 native code is desired... yes However, there *is* a Segmentation Fault when running one of its tests: cat nf_test/tst_f90.log Program received signal SIGSEGV: Segmentation fault - invalid memory reference. Backtrace for this error: #0 0x7f915ef0d8b0 in ??? #1 0x7f915ef0cae3 in ??? #2 0x7f915eb6483f in ??? #3 0x7f915fa151cd in __netcdf_MOD_nf90_put_var_1d_fourbytereal at /home/toon/netcdf/netcdf-fortran-4.4.5/fortran/netcdf_expanded.f90:940 #4 0x402cfc in netcdftest at /home/toon/netcdf/netcdf-fortran-4.4.5/nf_test/tst_f90.f90:95 #5 0x40226c in main at /home/toon/netcdf/netcdf-fortran-4.4.5/nf_test/tst_f90.f90:7 FAIL tst_f90 (exit status: 139) This is the error location in /home/toon/netcdf/netcdf-fortran-4.4.5/fortran/netcdf_expanded.f90: 939nf90_put_var_1D_FourByteReal = & 940 nf_put_vara_real(ncid, varid, localStart, localCount, values) 941 end if 942end function nf90_put_var_1D_FourByteReal So at least *a* Segmentation Fault is reproducible with these freely available libraries .... Kind regards, -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
Re: Bug in closed-source, proprietary software that I do not have access to
On 5/25/19 7:31 PM, Thomas Koenig wrote: Hi Toon, On 5/25/19 7:01 PM, Steve Kargl wrote: For WRF, I suppose you or Martin could be a good citizen and contact the project to report a bug. I have thought about this. As a person with experience building and running weather forecasting codes, I would be first in line to try this. But the SPEC code could be modified and not resemble a fresh WRF setup ... The current version of netcdf has, in module_netcdf_nf_interfaces.F90, !- nf_put_vara_double -- Interface Function nf_put_vara_double(ncid, varid, start, counts, dvals) & RESULT(status) USE netcdf_nf_data, ONLY: RK8 Integer, Intent(IN) :: ncid, varid Integer, Intent(IN) :: start(*), counts(*) Real(RK8), Intent(IN) :: dvals(*) Integer :: status End Function nf_put_vara_double End Interface which looks good (well, it's not BIND(C)). In another file, it has it as a simple INTEGER declaration with EXTERNAL. I don't know which of the two is actually used. OK. One thing I can do (tomorrow) is to download WRF, build it and see how it uses netcdf. The trunk compiler I have at hand is revision 271618, so it includes your update that's the subject of PR90539. -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
Re: Bug in closed-source, proprietary software that I do not have access to
On 5/25/19 7:01 PM, Steve Kargl wrote: For WRF, I suppose you or Martin could be a good citizen and contact the project to report a bug. I have thought about this. As a person with experience building and running weather forecasting codes, I would be first in line to try this. But the SPEC code could be modified and not resemble a fresh WRF setup ... -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
Re: Bug in closed-source, proprietary software that I do not have access to
On 5/25/19 2:52 PM, Thomas Koenig wrote: Hi, consider this: There is a bug, confirmed by several people. This occurs in closed-source, proprietary software, and appears to be due to one of my commits. Despite considerable help from somebody who has access to the source, and putting in quite a few (volunteer) hours myself, there is no test case. So, what to do? Close the PR as INVALID? This would be our standard policy, correct? FYI, the proprietary, closed-source software is SPEC, the corresponding PR is 90539, and the friendly helper is Martin Liska. But the problem seems to be related to netcdf (your comment #22), which is freely available (don't know off-hand which license). Does the problem trigger with netcdf's own test programs ? That would open a way to investigate. -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
Re: On-Demand range technology [3/5] - The Prototype
On 5/24/19 12:27 AM, Eric Botcazou wrote: There are a couple of testcases in the testsuite that, I believe, require a minimal form of support for symbolic ranges: gcc.dg/tree-ssa/vrp94.c and gnat.dg/opt40.adb. They deal with the following pattern in C: if (x >= y) return 1; z = y - x; if (z <= 0) abort (); return z; where we want to eliminate the abort. Of course the C version doesn't really make sense on its own, but it's the translation of the Ada version where the if (z <= 0) abort (); is generated by the compiler (it's a range check in Ada parlance). I bet compiling anything Fortran-y with array bound checking on (-fbounds-check) would generate ginormous numbers of opportunities for symbolic range checking ... -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
Large Fortran program succesfully LTO'd.
Honza, At the Cauldron meeting last week I mentioned that I wasn't able to compile our "small" weather forecasting program with LTO. In the mean time I have read some bug reports with "multiple prevailing ..." errors, which made me try linking with the 'gold' linker - that worked. So the only things building the software that I changed were adding -flto and -fuse-ld=gold. Some statistics of the source code: 3902 files totaling 1081944 lines. The result works flawlessly. Over the weekend I will study the LTO diagnostics to see what decisions were made with respect to inlining and other optimizations. Thanks ! -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
Interesting statistics on vectorization for Skylake avx512 (i9-7900) - 8.1 vs. 7.3.
Consider the attached Fortran code (the most expensive routine, computation-wise, in our weather forecasting model). verint.s.7.3 is the result of: gfortran -g -O3 -S -march=native -mtune=native verint.f using release 7.3. verint.s.8.1 is the result of: gfortran -g -O3 -S -march=native -mtune=native verint.f using the recently released GCC 8.1. $ wc -l verint.s.7.3 verint.s.8.1 7818 verint.s.7.3 6087 verint.s.8.1 $ grep vfma verint.s.7.3 | wc -l 381 $ grep vfma verint.s.8.1 | wc -l 254 but: $ grep vfma verint.s.7.3 | grep -v ss | wc -l 127 $ grep vfma verint.s.8.1 | grep -v ss | wc -l 127 and: $ grep movaps verint.s.7.3 | wc -l 306 $ grep movaps verint.s.8.3 | wc -l 270 Finally: $ grep zmm verint.s.7.3 | wc -l 1494 $ grep zmm verint.s.8.1 | wc -l 0 $ grep ymm verint.s.7.3 | wc -l 379 $ grep ymm verint.s.8.1 | wc -l 1464 I haven't had the opportunity to test this for speed (is quite complicated, as I have to build several support libraries with 8.1, like openmpi, netcdf, hdf{4|5}, fftw ...) -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news # 1 "/scratch/hirlam/hl_home/MPI/lib/src/grdy/verint.F" # 1 "" # 1 "" # 1 "/scratch/hirlam/hl_home/MPI/lib/src/grdy/verint.F" c Library:grdy $RCSfile$, $Revision: 7536 $ c checked in by $Author: ovignes $ at $Date: 2009-12-18 14:23:36 +0100 (Fri, 18 Dec 2009) $ c $State$, $Locker$ c $Log$ c Revision 1.3 1999/04/22 09:30:45 DagBjoerge c MPP code c c Revision 1.2 1999/03/09 10:23:13 GerardCats c Add SGI paralllellisation directives DOACROSS c c Revision 1.1 1996/09/06 13:12:18 GCats c Created from grdy.apl, 1 version 2.6.1, by Gerard Cats c SUBROUTINE VERINT ( I KLON , KLAT , KLEV , KINT , KHALO I , KLON1 , KLON2 , KLAT1 , KLAT2 I , KP , KQ , KR R , PARG , PRES R , PALFH , PBETH R , PALFA , PBETA , PGAMA ) C C*** C C VERINT - THREE DIMENSIONAL INTERPOLATION C C PURPOSE: C C THREE DIMENSIONAL INTERPOLATION C C INPUT PARAMETERS: C C KLON NUMBER OF GRIDPOINTS IN X-DIRECTION C KLAT NUMBER OF GRIDPOINTS IN Y-DIRECTION C KLEV NUMBER OF VERTICAL LEVELS C KINT TYPE OF INTERPOLATION C= 1 - LINEAR C= 2 - QUADRATIC C= 3 - CUBIC C= 4 - MIXED CUBIC/LINEAR C KLON1 FIRST GRIDPOINT IN X-DIRECTION C KLON2 LAST GRIDPOINT IN X-DIRECTION C KLAT1 FIRST GRIDPOINT IN Y-DIRECTION C KLAT2 LAST GRIDPOINT IN Y-DIRECTION C KPARRAY OF INDEXES FOR HORIZONTAL DISPLACEMENTS C KQARRAY OF INDEXES FOR HORIZONTAL DISPLACEMENTS C KRARRAY OF INDEXES FOR VERTICAL DISPLACEMENTS C PARG ARRAY OF ARGUMENTS C PALFH ALFA HAT C PBETH BETA HAT C PALFA ARRAY OF WEIGHTS IN X-DIRECTION C PBETA ARRAY OF WEIGHTS IN Y-DIRECTION C PGAMA ARRAY OF WEIGHTS IN VERTICAL DIRECTION C C OUTPUT PARAMETERS: C C PRES INTERPOLATED FIELD C C HISTORY: C C J.E. HAUGEN 1 1992 C C*** C IMPLICIT NONE C INTEGER KLON , KLAT , KLEV , KINT , KHALO, IKLON1 , KLON2 , KLAT1 , KLAT2 C INTEGER KP(KLON,KLAT), KQ(KLON,KLAT), KR(KLON,KLAT) REALPARG(2-KHALO:KLON+KHALO-1,2-KHALO:KLAT+KHALO-1,KLEV) , RPRES(KLON,KLAT) , R PALFH(KLON,KLAT) , PBETH(KLON,KLAT) , R PALFA(KLON,KLAT,4) , PBETA(KLON,KLAT,4), R PGAMA(KLON,KLAT,4) C INTEGER JX, JY, IDX, IDY, ILEV REAL Z1MAH, Z1MBH C IF (KINT.EQ.1) THEN C LINEAR INTERPOLATION C DO JY = KLAT1,KLAT2 DO JX = KLON1,KLON2 IDX = KP(JX,JY) IDY = KQ(JX,JY) ILEV = KR(JX,JY) C PRES(JX,JY) = PGAMA(JX,JY,1)*( C + PBETA(JX,JY,1)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY-1,ILEV-1) + + PALFA(JX,JY,2)*PARG(IDX ,IDY-1,ILEV-1) ) + + PBETA(JX,JY,2)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY ,ILEV-1) + + PALFA(JX,JY,2)*PARG(IDX ,IDY ,ILEV-1) ) ) C+ + + PGAMA(JX,JY,2)*( C+ + PBETA(JX,JY,1)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY-1,ILEV ) + + PALFA(JX,JY,2)*PARG(IDX ,IDY-1,ILEV ) ) + + PBETA(JX,JY,2)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY ,ILEV ) + + PALFA(JX,JY,2)*PARG(IDX ,IDY ,ILEV ) ) ) ENDDO ENDDO C ELSE +IF (KINT.EQ.2) THEN C QUADRATIC INTERPOLATION C DO JY = KLAT1,KLAT2 DO JX = KLON1,KLON2 IDX = KP(JX,JY) IDY = KQ(JX,JY) ILEV = KR(JX,JY) C PRES(JX,JY) = PGAMA(JX,JY,1)*( C + PBETA(JX,JY,1)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY-1,IL
Re: Loop fusion.
On 04/24/2018 09:18 AM, Richard Biener wrote: On Mon, Apr 23, 2018 at 8:35 PM, Toon Moene wrote: On 04/23/2018 01:00 PM, Richard Biener wrote: Note that while it looks "obvious" in the above source fragment the IL that is presented to optimizers may make things a lot less "low-hanging". Well, the loops are generated by the front end, so I *assume* they are basically the same ... The issue will be boiler-plate code like duplicated loop header checks. That said, it's perfectly understandable that other Fortran compilers have high-level loop opts deeply embedded within their frontends... I agree that this would be more easily handled in the Fortran front end. However, for that it would first have to get a (high level) basic block finder, because it has to be established that consecutive array expressions are part of the same basic block. I discussed high (i.e., Fortran-) level basic blocks briefly in my 2007 GCC Summit talk (http://moene.org/~toon/GCCSummit-2007.pdf, paragraph 7), but I do not think anyone really worked on it. -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
Re: Loop fusion.
On 04/23/2018 01:00 PM, Richard Biener wrote: On Sun, Apr 22, 2018 at 4:27 PM, Toon Moene wrote: A few days ago there was a rant on the Fortran Standardization Committee's e-mail list about Fortran's "whole array arithmetic" being unoptimizable. An example picked at random from our weather forecasting code: ZQICE(1:NPROMA,1:NFLEVG) = PGFL(1:NPROMA,1:NFLEVG,YI%MP) ZQLI(1:NPROMA,1:NFLEVG) = PGFL(1:NPROMA,1:NFLEVG,YL%MP) ZQRAIN(1:NPROMA,1:NFLEVG) = PGFL(1:NPROMA,1:NFLEVG,YR%MP) ZQSNOW(1:NPROMA,1:NFLEVG) = PGFL(1:NPROMA,1:NFLEVG,YS%MP) The reaction from one of the members of the committee (about "their" compiler): 'And multiple consecutive array statements with the same shape are “fused” exactly so that the compiler can generate good cache use. This sort of optimization is pretty low hanging fruit.' As far as I can see loop fusion as a stand-alone optimization is not supported as yet, although some mention is made in the context of graphite. Is this something that should be pursued ? In principle GRAPHITE can handle loop fusion but yes, standalone fusion is sth useful. Note that while it looks "obvious" in the above source fragment the IL that is presented to optimizers may make things a lot less "low-hanging". Well, the loops are generated by the front end, so I *assume* they are basically the same ... Probably the largest problem to address is the heuristic for preventing register pressure going through the roof ... -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
Loop fusion.
A few days ago there was a rant on the Fortran Standardization Committee's e-mail list about Fortran's "whole array arithmetic" being unoptimizable. An example picked at random from our weather forecasting code: ZQICE(1:NPROMA,1:NFLEVG) = PGFL(1:NPROMA,1:NFLEVG,YI%MP) ZQLI(1:NPROMA,1:NFLEVG) = PGFL(1:NPROMA,1:NFLEVG,YL%MP) ZQRAIN(1:NPROMA,1:NFLEVG) = PGFL(1:NPROMA,1:NFLEVG,YR%MP) ZQSNOW(1:NPROMA,1:NFLEVG) = PGFL(1:NPROMA,1:NFLEVG,YS%MP) The reaction from one of the members of the committee (about "their" compiler): 'And multiple consecutive array statements with the same shape are “fused” exactly so that the compiler can generate good cache use. This sort of optimization is pretty low hanging fruit.' As far as I can see loop fusion as a stand-alone optimization is not supported as yet, although some mention is made in the context of graphite. Is this something that should be pursued ? Kind regards, -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
Re: libmvec simd math functions in fortran
On 11/01/2017 05:26 PM, Jakub Jelinek wrote: On Wed, Nov 01, 2017 at 04:23:11PM +, Szabolcs Nagy wrote: is there a way to get vectorized math functions in fortran? in c code there is attribute simd declarations or openmp declare simd pragma to tell the compiler which functions have simd variant, but i see no such thing in fortran. !$omp declare simd should work fine in fortran (with -fopenmp or -fopenmp-simd). Note that - if you don't want to change the Fortran code, this - almost two years old - proposal would work: https://gcc.gnu.org/ml/gcc/2016-01/msg00025.html Obviously, I'll only be able to implement this once retirement comes around (i.e., after 2023). Kind regards, -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
Re: Warning annoyances in list_read.c
On 03/27/2017 08:29 PM, Marek Polacek wrote: On Mon, Mar 27, 2017 at 11:16:32AM -0700, Steve Kargl wrote: On Mon, Mar 27, 2017 at 07:41:12PM +0200, Marek Polacek wrote: On Mon, Mar 27, 2017 at 07:33:05PM +0200, Toon Moene wrote: The person developing the warning could *at least* have bootstrapped all languages and detected, warned and helped the Fortran/Ada/whatever side to cope with it. Of course "the person" had bootstrapped and tested all the languages before adding the warning. If only any of you bothered to check the fortran/ ChangeLogs: fortran/ != libgfortran/ I'm aware. But it's unfair to say that I hadn't tested Fortran when I, actually, had. OK - I see that building libgfortran without -Werror didn't help you here (having to manually go through all the logs), so I retract my comment. My excuses. -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
Re: Warning annoyances in list_read.c
On 03/27/2017 06:45 PM, Marek Polacek wrote: On Mon, Mar 27, 2017 at 09:27:34AM -0700, Steve Kargl wrote: But that's okay. I now understand that it is acceptable for a developer to commit a change that causes issues for other developers, and said developer can turn a blind eye. Nonsense. The person developing the warning could *at least* have bootstrapped all languages and detected, warned and helped the Fortran/Ada/whatever side to cope with it. [ Man, I'm glad we don't have this problem in Fortran-the-language ]. -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
Fwd: Test announcement
I guessed correctly - .al is Albania. What's my prize ? Forwarded Message Subject: Test announcement Date: Mon, 27 Feb 2017 16:21:04 + From: Noreply via gcc Reply-To: Noreply , Noreply To: gcc@gcc.gnu.org test http://tracking.desktop.al/tracking/unsubscribe?msgid=xuZXM1pE-Fzjv0SRTEY2-Q2
Re: ICE on using -floop-nest-optimize
On 01/06/2017 03:28 PM, Kyrill Tkachov wrote: On 06/01/17 14:22, Toon Moene wrote: On the attached (Fortran) source, the following version of gfortran draws an ICE: $ gfortran -v Using built-in specs. COLLECT_GCC=gfortran COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/6/lto-wrapper Target: x86_64-linux-gnu Configured with: ../src/configure -v --with-pkgversion='Debian 6.2.1-5' --with-bugurl=file:///usr/share/doc/gcc-6/README.Bugs --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-6 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-libmpx --enable-plugin --enable-default-pie --with-system-zlib --disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-6-amd64/jre --enable-java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-6-amd64 --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-6-amd64 --with-arch-directory=amd64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --enable-objc-gc=auto --enable-multiarch --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu Thread model: posix gcc version 6.2.1 20161124 (Debian 6.2.1-5) using the following command line arguments: gfortran -S -g -Ofast -fprotect-parens -fbacktrace -march=native -mtune=native -floop-nest-optimize corr_to_spec_2D.F The error message is: corr_to_spec_2D.F:3:0: subroutine corr_to_spec_2D(nx_local,ny_local, internal compiler error: in create_pw_aff_from_tree, at graphite-sese-to-poly.c:445 Please submit a full bug report, with preprocessed source if appropriate. See for instructions. I will retry this with trunk gfortran as soon as my automatic builds have constructed that compiler. In the mean time - anyone has a clue ? Looks like PR 69823 ? Yep - thanks. So I don't have to put it into Bugzilla - even if the trunk still fails. Saves some work - thanks again ! -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
ICE on using -floop-nest-optimize
On the attached (Fortran) source, the following version of gfortran draws an ICE: $ gfortran -v Using built-in specs. COLLECT_GCC=gfortran COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/6/lto-wrapper Target: x86_64-linux-gnu Configured with: ../src/configure -v --with-pkgversion='Debian 6.2.1-5' --with-bugurl=file:///usr/share/doc/gcc-6/README.Bugs --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-6 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-libmpx --enable-plugin --enable-default-pie --with-system-zlib --disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-6-amd64/jre --enable-java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-6-amd64 --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-6-amd64 --with-arch-directory=amd64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --enable-objc-gc=auto --enable-multiarch --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu Thread model: posix gcc version 6.2.1 20161124 (Debian 6.2.1-5) using the following command line arguments: gfortran -S -g -Ofast -fprotect-parens -fbacktrace -march=native -mtune=native -floop-nest-optimize corr_to_spec_2D.F The error message is: corr_to_spec_2D.F:3:0: subroutine corr_to_spec_2D(nx_local,ny_local, internal compiler error: in create_pw_aff_from_tree, at graphite-sese-to-poly.c:445 Please submit a full bug report, with preprocessed source if appropriate. See for instructions. I will retry this with trunk gfortran as soon as my automatic builds have constructed that compiler. In the mean time - anyone has a clue ? Thanks, -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news c Library: hljb $Id: corr_to_spec_2D.F 8416 2010-09-08 08:52:33Z ovignes $ c subroutine corr_to_spec_2D(nx_local,ny_local, xny_global, xnxl_global,nyl_global, xkmax_local, xkmax_global,lmax_global, xmype,nproc,nydim,kdim, xlmaxe, xjmin_list,ny_list, xkmin_list,kmax_list, xlevmin_list,levmax_list, xgridsize,lscale, xspec_dens_2D, corr_index) c implicit none c c--- c integer nx_local,ny_local, xny_global, xnxl_global,nyl_global, xkmax_local, xkmax_global,lmax_global, xmype,nproc,nydim,kdim, xlmaxe(0:kmax_local), xjmin_list(nproc),ny_list(nproc), xkmin_list(nproc),kmax_list(nproc), xlevmin_list(0:kdim,nproc), xlevmax_list(0:kdim,nproc) real gridsize,lscale real x spec_dens_2D(-lmax_global:lmax_global,0:kmax_local) integer corr_index c c--- c c Local work space c integer nextended,kextended,i,j,j_global,jlev,k,l,kwave real dist,dx,dy, sum_spec real, allocatable ::corr_extended(:,:,:) complex, allocatable :: spec_corr(:,:) c complex spec_dens_cmpx(-lmax_global:lmax_global,0:kmax_local) real phys_corr_appr(nx_local,ny_local) real phys_corr_orig(nx_local,ny_local) c real spec_eps parameter (spec_eps = 0.01) c c--- c c Allocate space for correlation in physical space c including extension zone and in spectral space c nextended = nyl_global/ny_global+1 if (mype.eq.1) write(6,*)' nextended=',nextended allocate(corr_extended(nxl_global,ny_local,nextended)) allocate(spec_corr(-lmax_global:lmax_global,0:kmax_local)) c c--- c c Construct correlation in physical space call corr_ext( nextended, nxl_global, xny_local,ny_global,nyl_global, x gridsize, lscale, xcorr_extended, corr_index, xmype,nproc,jmin_list ) c
Re: GCC 7.0.0 Status Report (2016-10-21)
On 10/26/2016 11:24 AM, Richard Biener wrote: On Tue, Oct 25, 2016 at 9:41 PM, Toon Moene wrote: But that is for code that read math function prototypes in C style .h files - so not for Fortran or Ada. That was the purpose of my proposal: to treat glibc vectorized log/exp/sin/cos/tan functions like the vendor specific once (-mveclibabi=svml and -mveclibabi=acml), which is front end language agnostic. Ah, indeed. Not sure if I would go the -mveclibabi route though as the glibc implementation is basically OpenMP-SIMD. The list of "vectorized" functions may also depend on glibc version and target so maybe glibc can ship something like a fortran math module which gfortran can include transparently? Like we have the predefs header in C? That is, glibc needs a way to communicate its configuration to fortran... (not sure if a GCC configure time config would be good enough) OK, I still went with the apparent "go-ahead" as worded in https://gcc.gnu.org/ml/gcc/2016-01/msg00045.html but I agree I should test this more vis-a-vis the OpenMP-SIMD approach and whether it can be done as a -mveclibabi adaptation. Anyway - far too late for the GCC-7 effort ... Kind regards, -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
Re: GCC 7.0.0 Status Report (2016-10-21)
On 10/25/2016 10:16 AM, Richard Biener wrote: On Mon, Oct 24, 2016 at 10:20 PM, Toon Moene wrote: Note that I haven't found the time to implement the vectorization of log/exp/sin/cos/tan functions that I described here: https://gcc.gnu.org/ml/gcc/2016-01/msg00039.html It works transparently already if you have recent glibc which adds the appropriate attribute to the math function prototypes (basically one release after the release that first implemented the routines though the required patch is trivial to backport as well). But that is for code that read math function prototypes in C style .h files - so not for Fortran or Ada. That was the purpose of my proposal: to treat glibc vectorized log/exp/sin/cos/tan functions like the vendor specific once (-mveclibabi=svml and -mveclibabi=acml), which is front end language agnostic. -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
Re: GCC 7.0.0 Status Report (2016-10-21)
On 10/21/2016 03:46 PM, Jakub Jelinek wrote: Status == Trunk which will eventually become GCC 7 is still in Stage 1 but its end is near and we are planning to transition into Stage 3 starting Nov 13th end of day time zone of your choice. Note that I haven't found the time to implement the vectorization of log/exp/sin/cos/tan functions that I described here: https://gcc.gnu.org/ml/gcc/2016-01/msg00039.html Kind regards, -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
Re: Deprecating basic asm in a function - What now?
On 07/04/2016 03:43 PM, Jonathan Wakely wrote: On 22 June 2016 at 10:28, David Wohlferd wrote: And I *get* that it takes time to re-write this, and people have schedules, lives, a need for sleep. But even under the most insanely aggressive schedule I can imagine (if gcc continue to release ~1/year), it will be at least a year before there's a release that has the (disable-able) warning, and another year before we could even think about actually removing this. So someone who plans to use v8.0 in their production code on the day it is released still has a minimum of *two years* to get ready. It doesn't matter how much warning people have to fix such things, most of them won't do it. Then at the last minute some poor person has to spend days or weeks going through other people's code fixing all the problems. If the benefit isn't clear then it's just a pain and causes wailing and gnashing of teeth. We had at least 15 years of warning ahead of the Y2K problem. Nevertheless, it was only fixed in our code during March-September 1999. Or, as one of my colleagues quipped: The next time, they can ask someone else for this job. -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
Re: [isocpp-parallel] Proposal for new memory_order_consume definition
On 02/28/2016 05:13 PM, Linus Torvalds wrote: Yeah, let's just say that the original C designers were better at their job than a gaggle of standards people who were making bad crap up to make some Fortran-style programs go faster. The original C designers were defining a language that would make it easy to write operating systems in (and not having to rely on assembler). I misled the quote where they said they first tried Fortran (and concluded it didn't fit their purpose). BTW, Fortran was designed around floating point arithmetic (and its non-relation to the mathematical concept of the field of the reals). It used integers only for counting and indexing arrays, so it had no purpose for "signed integers that overflowed". Therefore, to the Fortran standard, this was "undefined". It was literally "undefined" - as it was not described by the standard's text. -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
Re: GNU C library's libmvec and the GNU Compiler *Collection*.
On 01/06/2016 07:46 PM, Toon Moene wrote: Would it be possible to add an option -mveclibabi=glibc to cater for this *for all languages*; or is this too low level (after all, the glibc libmvec has code for multiple architectures). If so, at what level should this be implemented ? It doesn't look hard to implement this as a variant of -mveclibabi={svml|acml}. The implementation of these options is in config/i386/i386.c: 5272 /* Use external vectorized library in vectorizing intrinsics. */ 5273 if (opts_set->x_ix86_veclibabi_type) 5274 switch (opts->x_ix86_veclibabi_type) 5275 { 5276 case ix86_veclibabi_type_svml: 5277 ix86_veclib_handler = ix86_veclibabi_svml; 5278 break; 5279 5280 case ix86_veclibabi_type_acml: 5281 ix86_veclib_handler = ix86_veclibabi_acml; 5282 break; 5283 5284 default: 5285 gcc_unreachable (); 5286 } so I could just write a third "handler": case ix86_veclibabi_type_glibc: ix86_veclib_handler = ix86_veclibabi_glibc; and clone, say, ix86_veclibabi_svml to write ix86_veclibabi_glibc to do the right thing for generating calls to the glibc libmvec routines. I hope to have time for this during the summer. As this is stage 1 material anyway, that looks like the right point in time anyway. Of course, a more general, architecture independent approach might be preferable, but this at least would be a start. Kind regards, -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
Re: GNU C library's libmvec and the GNU Compiler *Collection*.
On 01/06/2016 07:46 PM, Toon Moene wrote: [ This is relevant for our code, because just the switch to *actual* single precision exp/log/sin/cos implementations in glibc's libm resulted in a decrease of the running time of our weather forecasting code by 25 % (this was in glibc 2.16, IIRC). ] The reference for this is (on an Ivy Bridge system): https://gcc.gnu.org/ml/gcc-help/2013-01/msg00175.html "I have made a home-build glibc-2.17 (on a core-avx system). It works great - linking against it (instead of using the current Debian-Testing's eglibc-2.13) brought the wall-clock time of my weather forecasting job down from 3:35 hours to 2:45 (mostly due to a more efficient implementation of powf, expf and logf)." So, in minutes of compute time: This is (215-165) / 215 = 0.23 (23 %). However, that number included a part that ran for an hour (60 minutes) in double precision. Excluding that we get (155 - 105) / 155 = 0.32 (32 %) improvement in performance for the single precision part of our weather forecasting code. Kind regards, -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
GNU C library's libmvec and the GNU Compiler *Collection*.
All, I noticed, around half a year ago, that the incredible team around glibc found the time to implement vector math (libm) routines. Previously, free software adherents like me were dependent on vendor libraries via the -mveclibabi={svml|acml} (on Intel/AMD) for instance. However, the examples given on the glibc wiki (https://sourceware.org/glibc/wiki/libmvec, Example 1/Example 2) suggest that this is a C-only thing (this might make sense given that glibc is an implementation of the *C* library), but the above vendor-level options at least work for every front-end language, as far as I know. Would it be possible to add an option -mveclibabi=glibc to cater for this *for all languages*; or is this too low level (after all, the glibc libmvec has code for multiple architectures). If so, at what level should this be implemented ? [ This is relevant for our code, because just the switch to *actual* single precision exp/log/sin/cos implementations in glibc's libm resulted in a decrease of the running time of our weather forecasting code by 25 % (this was in glibc 2.16, IIRC). ] Thanks in advance for your suggestions. -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
Re: LLVM to get massive GPU support with Fortran
On 11/16/2015 11:02 PM, Jack Howarth wrote: FYI, this posting has a bit more detail on the actual implementation... http://lists.llvm.org/pipermail/llvm-dev/2015-November/092438.html That surely helps - thanks. -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
Re: LLVM to get massive GPU support with Fortran
On 11/16/2015 10:33 PM, Jack Howarth wrote: Of course one unknown is whether PGI had already done any work internally with the llvm middle-/back-end. If so, they might not be starting from scratch. Perhaps it helps if I repost the following from 12 years ago: https://gcc.gnu.org/ml/fortran/2003-11/msg00052.html Kind regards, -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
Re: LLVM to get massive GPU support with Fortran
On 11/16/2015 10:11 PM, Jack Howarth wrote: On Mon, Nov 16, 2015 at 2:14 PM, Toon Moene wrote: To put this in a (timeline) perspective: On the 18th of March, 2000, I announced Andy Vaught's work on the g95 front-end to the gcc-patches mailing list. In 2004 (!) we merged the resulting compiler and run-time library into the gcc (cvs) repository (obviously, after the tree-ssa infrastructure went in - 2004-05-17, but before the creation of the 4.0 release branch - 2005-02-25). Then it took another 2 months for 4.0 to be released. Unless PGI manages to summon massively large (parallel) working groups to accomplish this, it might take a few years to fruition. On the other hand, the llvm-dev posting implies that PGI will be starting from an existing fortran front-end. If they only need to code the middle-/back-end integration of llvm into a pre-existing mature fortran front-end, the promised late 2016 release date might not be so unlikely. The g95 front-end I mentioned in my 2000-03-18 post to the gcc-patches mailing list was "an existing front-end" by virtue of the fact that Andy Vaught mailed it to me and it did the work. Between 2000 and 2004, this front-end was coupled to the rest of the infrastructure of the GNU Compiler Collection. This was not trivial (just as it will not be trivial to couple the PGI front-end to the LLVM infrastructure). We'll see how many years it'll take, but don't count me in on holding my breath. -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
Re: LLVM to get massive GPU support with Fortran
On 11/16/2015 12:58 AM, Steve Kargl wrote: On Mon, Nov 16, 2015 at 12:04:06AM +0100, Thomas Koenig wrote: See http://arstechnica.com/information-technology/2015/11/llvm-to-get-fortran-compiler-that-targets-parallel-gpus-in-clusters/ It is not entirely clear on what they plan to do. Use gfortran via dragonegg? The 3 DOE labs in the USA have contracted PGI to port (some of) there Fortran FE to LLVM and open source the result. http://lists.llvm.org/pipermail/llvm-dev/2015-November/092404.html To put this in a (timeline) perspective: On the 18th of March, 2000, I announced Andy Vaught's work on the g95 front-end to the gcc-patches mailing list. In 2004 (!) we merged the resulting compiler and run-time library into the gcc (cvs) repository (obviously, after the tree-ssa infrastructure went in - 2004-05-17, but before the creation of the 4.0 release branch - 2005-02-25). Then it took another 2 months for 4.0 to be released. Unless PGI manages to summon massively large (parallel) working groups to accomplish this, it might take a few years to fruition. -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
What is guaranteed with the new numbering scheme of GCC releases ?
All, One of my colleagues on the Fortran Standardization Committee asked me the following question: "People are still not too familiar with the new GCC numbering scheme. My impression is that 5.2 is just a maintenance update of 5.1. However, they still want assurance that there are no call interface or module format changes between 5.1 and 5.2 so that libraries and modules built with 5.1 (MPI, for example) will still work with 5.2. Is that the case?" I went to https://gcc.gnu.org/develop.html#num_scheme for answering that question, but it is by far not explicit enough to answer it. Shouldn't we be documenting the shift in numbering schematics on a far more obvious location on our web site, and with more complete semantics ? Kind regards, -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
Ubsan build of GCC 6.0 fails with: cp/search.c:1011:41: error: 'otype' may be used uninitialized in this function
See: https://gcc.gnu.org/ml/gcc-testresults/2015-09/msg00699.html Full error message: /home/toon/compilers/trunk/gcc/cp/search.c: In function 'int accessible_p(tree, tree, bool)': /home/toon/compilers/trunk/gcc/cp/search.c:1011:41: error: 'otype' may be used uninitialized in this function [-Werror=maybe-uninitialized] dfs_accessible_data d = { decl, otype }; ^ The host compiler is: toon@moene:~$ gcc -v Using built-in specs. COLLECT_GCC=gcc COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/5/lto-wrapper Target: x86_64-linux-gnu Configured with: ../src/configure -v --with-pkgversion='Debian 5.2.1-16' --with-bugurl=file:///usr/share/doc/gcc-5/README.Bugs --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-5 --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-libmpx --enable-plugin --with-system-zlib --disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-5-amd64/jre --enable-java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-5-amd64 --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-5-amd64 --with-arch-directory=amd64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --enable-objc-gc --enable-multiarch --with-arch-32=i586 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu Thread model: posix gcc version 5.2.1 20150903 (Debian 5.2.1-16) Any ideas ? Kind regards, -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
Re: Reduction Pattern ( Vectorization or Parallelization)
On 07/16/2015 12:53 PM, Richard Biener wrote: On Sun, Jul 5, 2015 at 1:57 PM, Ajit Kumar Agarwal For the following code For(j = 0; j <= N;j++) { y = d[j]; For( I = 0 ; I <8 ; i++) X(a[i]) = X(a[i]) + c[i] * y; } Fig(1). I think the issue here is dependences of X(A[i]) as A[i] might be the same for different i. In Fortran this is not allowed on the left-hand side of an assignment. Does C have any restrictions here ? -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
Re: Loop fusion.
On 04/22/2015 09:10 PM, Steven Bosscher wrote: On Wed, Apr 22, 2015 at 6:59 PM, Toon Moene wrote: Why is loop fusion important, especially in Fortran 90 and later programs ? Because without it, every array assignment is a single loop nest, isolated from related, same-shape assignments. Why is this a bad thing? When you're talking about single-node machines, separate loops is probably faster if your arrays are large enough: better cache locality and easier to vectorize. Loop fusion is only a win if you iterate through the same array variables. Writing such a pass is not so hard for the simple, most common cases. The front end could do some of the rewriting from F90-style array assignments to fused loops if it notices consecutive array assignments/operations on the same variables. It could well be that my artificial example was not what my colleague measured ... Indeed, I thought about the front end doing this, but that would limit it to those that the front end could recognize; on the other hand, that might be the right limitation. Thanks ! -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
Loop fusion.
L.S., Last week, a colleague of mine from Meteo France held a talk at the yearly meeting of all researchers working on HARMONIE (see http://hirlam.org) discussing the performance of our code when compiled with each of the supported compilers on the Cray XC30 at ECMWF (http://www.ecmwf.int/en/computing/our-facilities). In the context of GCC this is relevant, because one of the three compilers is gfortran (version 4.9.2). One of his slides discussed the differences in optimizations that the three compilers offer; I was surprised to learn that GCC/gfortran doesn't do loop fusion *at all*. Note, I discussed loop fusion (among other optimizations) at LinuxExpo 99 (http://moene.org/~toon/nwp.ps) which, unsurprisingly, was held 16 years ago :-) Why is loop fusion important, especially in Fortran 90 and later programs ? Because without it, every array assignment is a single loop nest, isolated from related, same-shape assignments. Consider this (artificial, but typical) example [updating atmospheric quantities after the computation of the rate of change during a time step of the integration]: SUBROUTINE UPDATE_DT(T, U, V, Q, DTDT, DUDT, DVDT, DQDT, & & NLON, NLAT, NLEV, TSTEP) ... REAL, DIMENSION(NLON, NLAT, NLEV) :: T, U, V, Q, DTDT, DUDT, DVDT, DQDT ... T = T + TSTEP*DTDT ! Update temperature U = U + TSTEP*DUDT ! Update east-west wind component V = V + TSTEP*DVDT ! Update north-south wind component Q = Q + TSTEP*DQDT ! Update specific humidity ... END This generates four consecutive 3 deep loop nests over NLEV, NLAT, NLON. Of course, it would be much more efficient if this were just one loop nest, as Fortran 77 programmers would write it: DO JLEV = 1, NLEV DO JLAT = 1, NLAT DO JLON = 1, NLON T(JLON, JLAT, JLEV) = T(JLON, JLAT, JLEV) + TSTEP*DTDT(JLON, JLAT, JLEV) U(JLON, JLAT, JLEV) = U(JLON, JLAT, JLEV) + TSTEP*DUDT(JLON, JLAT, JLEV) V(JLON, JLAT, JLEV) = V(JLON, JLAT, JLEV) + TSTEP*DVDT(JLON, JLAT, JLEV) Q(JLON, JLAT, JLEV) = Q(JLON, JLAT, JLEV) + TSTEP*DQDT(JLON, JLAT, JLEV) ENDDO ENDDO ENDDO After a loop fusion optimization pass the Fortran 90 and the Fortran 77 code should result in the same assembler output. Is this something the Graphite infrastructure could help with ? From the wiki documentation I get the impression that it only works on single loop nests, but I must confess that I am not familiar with the nomenclature in its description ... Would it be hard to write a loop fusion pass otherwise ? Kind regards, -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
Re: LTO bootstrap failure for GCC-5 prerelease.
On 04/17/2015 10:49 AM, Richard Biener wrote: On Fri, Apr 17, 2015 at 10:16 AM, Toon Moene wrote: See: https://gcc.gnu.org/ml/gcc-testresults/2015-04/msg01975.html Comparing stages 2 and 3 warning: gcc/cc1-checksum.o differs warning: gcc/cc1obj-checksum.o differs warning: gcc/cc1plus-checksum.o differs Bootstrap comparison failure! gcc/tree-ssa-uninit.o differs gcc/tree-switch-conversion.o differs gcc/tree-ssa-loop-ivcanon.o differs With LTO bootstrap you run into PR62077, can you try the workaround, --enable-stage1-checking=release? Richard. Yep, that worked: https://gcc.gnu.org/ml/gcc-testresults/2015-04/msg02034.html Thanks, -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
LTO bootstrap failure for GCC-5 prerelease.
See: https://gcc.gnu.org/ml/gcc-testresults/2015-04/msg01975.html Comparing stages 2 and 3 warning: gcc/cc1-checksum.o differs warning: gcc/cc1obj-checksum.o differs warning: gcc/cc1plus-checksum.o differs Bootstrap comparison failure! gcc/tree-ssa-uninit.o differs gcc/tree-switch-conversion.o differs gcc/tree-ssa-loop-ivcanon.o differs ... -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
Re: lto bootstrap fails.
On 04/13/2015 06:00 PM, Trevor Saunders wrote: On Mon, Apr 13, 2015 at 05:46:35PM +0200, Toon Moene wrote: On 04/11/2015 01:33 AM, Jan Hubicka wrote: On Fri, Apr 10, 2015 at 11:18:39AM -0400, Trevor Saunders wrote: On Fri, Apr 10, 2015 at 03:59:19PM +0200, Toon Moene wrote: Like this: https://gcc.gnu.org/ml/gcc-testresults/2015-04/msg01086.html ODR rears its head again ... huh, why is c/c-lang.h getting included in files linked into cc1plus? that seems strange. readelf -wl cc1plus | grep c-lang.h doesn't show anything. I tried to reproduce it and my bootstrap passes with same options as Toon's The following patch ought to be able to tell the particular translation unit causing the conflict. [ Patch elided ] The patch applied cleanly - this is what I got as a result: https://gcc.gnu.org/ml/gcc-testresults/2015-04/msg01450.html I hope this is useful. ok, so the problem would seem to be graphite-scop-detection.c is including front end specific headers. Can you put a #error in cp-tree.h and recompile graphite-scop-detection.o to see what the path to including it is? Trev I get this: In file included from /home/toon/compilers/trunk/gcc/graphite-scop-detection.c:73:0: /home/toon/compilers/trunk/gcc/cp/cp-tree.h:52:2: error: #error #error ^ (See https://gcc.gnu.org/ml/gcc-testresults/2015-04/msg01493.html) -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
Re: lto bootstrap fails.
On 04/11/2015 01:33 AM, Jan Hubicka wrote: On Fri, Apr 10, 2015 at 11:18:39AM -0400, Trevor Saunders wrote: On Fri, Apr 10, 2015 at 03:59:19PM +0200, Toon Moene wrote: Like this: https://gcc.gnu.org/ml/gcc-testresults/2015-04/msg01086.html ODR rears its head again ... huh, why is c/c-lang.h getting included in files linked into cc1plus? that seems strange. readelf -wl cc1plus | grep c-lang.h doesn't show anything. I tried to reproduce it and my bootstrap passes with same options as Toon's The following patch ought to be able to tell the particular translation unit causing the conflict. [ Patch elided ] The patch applied cleanly - this is what I got as a result: https://gcc.gnu.org/ml/gcc-testresults/2015-04/msg01450.html I hope this is useful. Kind regards, -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
trunk revision 221714 gets segfault during lto bootstrap.
As is shown here: https://gcc.gnu.org/ml/gcc-testresults/2015-03/msg03014.html. -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
Re: Named parameters
On 03/16/2015 05:06 PM, David Brown wrote: Basically, the idea is this: int foo(int a, int b, int c); void bar(void) { foo(1, 2, 3); // Normal call foo(.a = 1, .b = 2, .c = 3) // Same as foo(1, 2, 3) foo(.c = 3, .b = 2, .a = 1) // Same as foo(1, 2, 3) } If only the first variant is allowed (with the named parameters in the order declared in the prototype), then this would not affect code generation at all - the designators could only be used for static error checking. If the second variant is allowed, then the parameters could be re-ordered. This is indeed very useful - Fortran has this since the Fortran 90 standard, albeit without the dots (it's unambiguous in Fortran). -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
More explicit what's wrong with this: FAILED: Bootstrap (build config: lto; languages: all; trunk revision 217898) on x86_64-unknown-linux-gnu
See: https://gcc.gnu.org/ml/gcc-testresults/2014-11/msg02259.html What's not in the log file sent to gcc-results: /usr/bin/ld: /dev/shm/wd4296/ccFei5Gg.ltrans0.ltrans.o: relocation R_X86_64_32S against `prime_tab.lto_priv.2' can not be used when making a shared o bject; recompile with -fPIC /dev/shm/wd4296/ccFei5Gg.ltrans0.ltrans.o: error adding symbols: Bad value collect2: error: ld returned 1 exit status Makefile:409: recipe for target 'libcc1.la' failed make[3]: *** [libcc1.la] Error 1 Kind regards, -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
Hmm, /sbin/ldconfig.real: /home/toon/compilers/install/lib/../lib64/libstdc++.so.6.0.21-gdb.py is not an ELF file - it has the wrong magic bytes at the start.
On the gcc-results archive, you'll see this: https://gcc.gnu.org/ml/gcc-testresults/2014-10/msg02983.html but that doesn't show you the real problem: mkdir -p -- /home/toon/compilers/install/share/gcc-5.0.0/python/libjava libtool: install: /usr/bin/install -c .libs/libgcj-tools.so.15.0.0T /home/toon/compilers/install/lib/../lib64/libgcj-tools.so.15.0.0 libtool: install: (cd /home/toon/compilers/install/lib/../lib64 && { ln -s -f libgcj-tools.so.15.0.0 libgcj-tools.so.15 || { rm -f libgcj-tools.so.15 && ln -s libgcj-tools.so.15.0.0 libgcj-tools.so.15; }; }) libtool: install: (cd /home/toon/compilers/install/lib/../lib64 && { ln -s -f libgcj-tools.so.15.0.0 libgcj-tools.so || { rm -f libgcj-tools.so && ln -s libgcj-tools.so.15.0.0 libgcj-tools.so; }; }) libtool: install: /usr/bin/install -c .libs/libgcj-tools.lai /home/toon/compilers/install/lib/../lib64/libgcj-tools.la libtool: install: /usr/bin/install -c install/.libs/libgcj_bc.so.1.0.0 /home/toon/compilers/install/lib/../lib64/libgcj_bc.so.1.0.0 libtool: install: (cd /home/toon/compilers/install/lib/../lib64 && { ln -s -f libgcj_bc.so.1.0.0 libgcj_bc.so.1 || { rm -f libgcj_bc.so.1 && ln -s libgcj_bc.so.1.0.0 libgcj_bc.so.1; }; }) libtool: install: (cd /home/toon/compilers/install/lib/../lib64 && { ln -s -f libgcj_bc.so.1.0.0 libgcj_bc.so || { rm -f libgcj_bc.so && ln -s libgcj_bc.so.1.0.0 libgcj_bc.so; }; }) libtool: install: /usr/bin/install -c install/.libs/libgcj_bc.lai /home/toon/compilers/install/lib/../lib64/libgcj_bc.la libtool: finish: PATH="/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games:/sbin" ldconfig -n /home/toon/compilers/install/lib/../lib64 /sbin/ldconfig.real: /home/toon/compilers/install/lib/../lib64/libstdc++.so.6.0.21-gdb.py is not an ELF file - it has the wrong magic bytes at the start. Well, you betcha a python script is not an ELF file - but why does the build process think so ? -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
Bootstrap with Ada and bootstrap-config=ubsan fails because gnatlink is not linked against ubsan.
This: https://gcc.gnu.org/ml/gcc-testresults/2014-10/msg01552.html fails because: /home/toon/compilers/trunk/libiberty/xstrdup.c:33: undefined reference to `__ubsan_handle_nonnull_arg' /home/toon/compilers/trunk/libiberty/xstrdup.c:35: undefined reference to `__ubsan_handle_nonnull_arg' /home/toon/compilers/trunk/libiberty/xstrdup.c:35: undefined reference to `__ubsan_handle_nonnull_arg' /home/toon/compilers/trunk/libiberty/xstrdup.c:35: undefined reference to `__ubsan_handle_nonnull_return' /home/toon/compilers/trunk/libiberty/xstrdup.c:35: undefined reference to `__ubsan_handle_nonnull_arg' collect2: error: ld returned 1 exit status ../gcc-interface/Makefile:2585: recipe for target '../../gnatlink' failed make[3]: *** [../../gnatlink] Error 1 Kind regards, -- Toon Moene, KNMI (Weer/Onderzoek), The Netherlands Phone: +31 30 2206443; e-mail: mo...@knmi.nl
Re: msan and gcc ?
On 10/01/2014 08:00 PM, Kostya Serebryany wrote: -gcc folks. Why not use clang then? It offers many more nice features. What's the Fortran front-end called for clang (or do you really think we are going to write Weather Forecasting codes in C :-) ) Kind regards, On Wed, Oct 1, 2014 at 10:58 AM, Toon Moene wrote: On 10/01/2014 06:21 PM, VandeVondele Joost wrote: it was certainly worth it. since I see msan as a kind of valgrind replacement (similar functionality, but ~10x the speed, partially at the cost of more difficult deployment), I did a quick search in gcc bugzilla. 982 PRs mention valgrind, so such functionality is clearly heavily used. That would be interesting - valgrind is certainly impossible to use on our Weather Forecasting code (far too large, as valgrind helpfully points out). -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
Re: msan and gcc ?
On 10/01/2014 06:21 PM, VandeVondele Joost wrote: it was certainly worth it. since I see msan as a kind of valgrind replacement (similar functionality, but ~10x the speed, partially at the cost of more difficult deployment), I did a quick search in gcc bugzilla. 982 PRs mention valgrind, so such functionality is clearly heavily used. That would be interesting - valgrind is certainly impossible to use on our Weather Forecasting code (far too large, as valgrind helpfully points out). -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
LTO detects violations of one-definition-rule ?
Like this (https://gcc.gnu.org/ml/gcc-testresults/2014-09/msg01374.html): /home/toon/compilers/trunk/gcc/tlink.c:62:16: error: type 'struct file_hash_entry' violates one definition rule [-Werror=odr] typedef struct file_hash_entry ^ /home/toon/compilers/trunk/libcpp/files.c:143:8: note: a different type is defined in another translation unit struct file_hash_entry ^ /home/toon/compilers/trunk/gcc/tlink.c:64:15: note: the first difference of corresponding definitions is field 'key' const char *key; ^ /home/toon/compilers/trunk/libcpp/files.c:145:27: note: a field with different name is defined in another translation unit struct file_hash_entry *next; ^ -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
Re: Got this one back (too large: 6.4 Mb) from gcc-results:
On 07/03/2014 07:11 PM, Marek Polacek wrote: On Thu, Jul 03, 2014 at 07:06:29PM +0200, Toon Moene wrote: Compiler version: 4.10.0 20140702 (experimental) (GCC) Platform: x86_64-unknown-linux-gnu configure flags: --prefix=/home/toon/compilers/install --with-gnu-as --with-gnu-ld --with-build-config=bootstrap-ubsan --enable-languages=all --disable-multilib --disable-nls --with-arch=core-avx2 --with-tune=core-avx2 Note: --with-build-config=bootstrap-ubsan Apparently, the bugs went wild ... I suspect that's because: https://gcc.gnu.org/ml/gcc-patches/2014-06/msg01549.html which will go away if/when we fix: https://gcc.gnu.org/ml/gcc-patches/2014-06/msg01624.html But I'm only guessing. Thanks - we will see. This is the schedule of test runs here at home: "branch"boot-config languages $ crontab -l 45 00 * * 0 $HOME/BootstrapGCC trunk lto all 45 00 * * 1 $HOME/BootstrapGCC fortran-dev lto fortran 45 00 * * 2 $HOME/BootstrapGCC trunk "asan --disable-werror" all 45 00 * * 3 $HOME/BootstrapGCC fortran-dev "asan --disable-werror" fortran 45 00 * * 4 $HOME/BootstrapGCC trunk ubsan all 45 00 * * 5 $HOME/BootstrapGCC fortran-dev ubsan fortran 45 00 * * 6 $HOME/BootstrapGCC trunk ubsan fortran So the next test with bootstrap-config=ubsan is on Saturday, at 00:45 CEST - see if your fix can beat that :-) Kind regards, and thanks for the explanation ... -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
Got this one back (too large: 6.4 Mb) from gcc-results:
Compiler version: 4.10.0 20140702 (experimental) (GCC) Platform: x86_64-unknown-linux-gnu configure flags: --prefix=/home/toon/compilers/install --with-gnu-as --with-gnu-ld --with-build-config=bootstrap-ubsan --enable-languages=all --disable-multilib --disable-nls --with-arch=core-avx2 --with-tune=core-avx2 Note: --with-build-config=bootstrap-ubsan Apparently, the bugs went wild ... Kind regards, -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
Re: Reducing Register Pressure through Live range Shrinking through Loops!!
On 05/22/2014 10:16 PM, Vladimir Makarov wrote: It also permits to rematerialize not only on loop borders (although it is the most important points). That would certainly be interesting for the following hot subroutine in our weather forecasting model (attached). Note the loop from (line 157): +IF (KINT.EQ.3) THEN C CUBIC INTERPOLATION to (line 242): + + PALFA(JX,JY,4)*PARG(IDX+1,IDY+1,ILEV+1) ) ) ENDDO ENDDO Kind regards, -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news # 1 "/scratch/hirlam/hl_home/MPI/lib/src/grdy/verint.F" # 1 "" # 1 "" # 1 "/scratch/hirlam/hl_home/MPI/lib/src/grdy/verint.F" c Library:grdy $RCSfile$, $Revision: 7536 $ c checked in by $Author: ovignes $ at $Date: 2009-12-18 14:23:36 +0100 (Fri, 18 Dec 2009) $ c $State$, $Locker$ c $Log$ c Revision 1.3 1999/04/22 09:30:45 DagBjoerge c MPP code c c Revision 1.2 1999/03/09 10:23:13 GerardCats c Add SGI paralllellisation directives DOACROSS c c Revision 1.1 1996/09/06 13:12:18 GCats c Created from grdy.apl, 1 version 2.6.1, by Gerard Cats c SUBROUTINE VERINT ( I KLON , KLAT , KLEV , KINT , KHALO I , KLON1 , KLON2 , KLAT1 , KLAT2 I , KP , KQ , KR R , PARG , PRES R , PALFH , PBETH R , PALFA , PBETA , PGAMA ) C C*** C C VERINT - THREE DIMENSIONAL INTERPOLATION C C PURPOSE: C C THREE DIMENSIONAL INTERPOLATION C C INPUT PARAMETERS: C C KLON NUMBER OF GRIDPOINTS IN X-DIRECTION C KLAT NUMBER OF GRIDPOINTS IN Y-DIRECTION C KLEV NUMBER OF VERTICAL LEVELS C KINT TYPE OF INTERPOLATION C= 1 - LINEAR C= 2 - QUADRATIC C= 3 - CUBIC C= 4 - MIXED CUBIC/LINEAR C KLON1 FIRST GRIDPOINT IN X-DIRECTION C KLON2 LAST GRIDPOINT IN X-DIRECTION C KLAT1 FIRST GRIDPOINT IN Y-DIRECTION C KLAT2 LAST GRIDPOINT IN Y-DIRECTION C KPARRAY OF INDEXES FOR HORIZONTAL DISPLACEMENTS C KQARRAY OF INDEXES FOR HORIZONTAL DISPLACEMENTS C KRARRAY OF INDEXES FOR VERTICAL DISPLACEMENTS C PARG ARRAY OF ARGUMENTS C PALFH ALFA HAT C PBETH BETA HAT C PALFA ARRAY OF WEIGHTS IN X-DIRECTION C PBETA ARRAY OF WEIGHTS IN Y-DIRECTION C PGAMA ARRAY OF WEIGHTS IN VERTICAL DIRECTION C C OUTPUT PARAMETERS: C C PRES INTERPOLATED FIELD C C HISTORY: C C J.E. HAUGEN 1 1992 C C*** C IMPLICIT NONE C INTEGER KLON , KLAT , KLEV , KINT , KHALO, IKLON1 , KLON2 , KLAT1 , KLAT2 C INTEGER KP(KLON,KLAT), KQ(KLON,KLAT), KR(KLON,KLAT) REALPARG(2-KHALO:KLON+KHALO-1,2-KHALO:KLAT+KHALO-1,KLEV) , RPRES(KLON,KLAT) , R PALFH(KLON,KLAT) , PBETH(KLON,KLAT) , R PALFA(KLON,KLAT,4) , PBETA(KLON,KLAT,4), R PGAMA(KLON,KLAT,4) C INTEGER JX, JY, IDX, IDY, ILEV REAL Z1MAH, Z1MBH C IF (KINT.EQ.1) THEN C LINEAR INTERPOLATION C DO JY = KLAT1,KLAT2 DO JX = KLON1,KLON2 IDX = KP(JX,JY) IDY = KQ(JX,JY) ILEV = KR(JX,JY) C PRES(JX,JY) = PGAMA(JX,JY,1)*( C + PBETA(JX,JY,1)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY-1,ILEV-1) + + PALFA(JX,JY,2)*PARG(IDX ,IDY-1,ILEV-1) ) + + PBETA(JX,JY,2)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY ,ILEV-1) + + PALFA(JX,JY,2)*PARG(IDX ,IDY ,ILEV-1) ) ) C+ + + PGAMA(JX,JY,2)*( C+ + PBETA(JX,JY,1)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY-1,ILEV ) + + PALFA(JX,JY,2)*PARG(IDX ,IDY-1,ILEV ) ) + + PBETA(JX,JY,2)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY ,ILEV ) + + PALFA(JX,JY,2)*PARG(IDX ,IDY ,ILEV ) ) ) ENDDO ENDDO C ELSE +IF (KINT.EQ.2) THEN C QUADRATIC INTERPOLATION C DO JY = KLAT1,KLAT2 DO JX = KLON1,KLON2 IDX = KP(JX,JY) IDY = KQ(JX,JY) ILEV = KR(JX,JY) C PRES(JX,JY) = PGAMA(JX,JY,1)*( C + PBETA(JX,JY,1)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY-1,ILEV-1) + + PALFA(JX,JY,2)*PARG(IDX ,IDY-1,ILEV-1) + + PALFA(JX,JY,3)*PARG(IDX+1,IDY-1,ILEV-1) ) + + PBETA(JX,JY,2)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY ,ILEV-1) + + PALFA(JX,JY,2)*PARG(IDX ,IDY ,ILEV-1) + + PALFA(JX,JY,3)*PARG(IDX+1,IDY ,ILEV-1) ) + + PBETA(JX,JY,3)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY+1,ILEV-1) + + PALFA(JX,JY,2)*PARG(IDX ,IDY+1,ILEV-1) + + PALFA(JX,JY,3)*PARG(IDX+1,IDY+1,ILEV-1) ) ) C+
ubsan bootstrap on x86_64 trunk rev. 210310: summary of results (BRRR):
This was the configure command: configure flags: --prefix=/home/toon/compilers/install --with-gnu-as --with-gnu-ld --with-build-config=bootstrap-ubsan --enable-languages=all --disable-multilib --disable-nls --with-arch=core-avx2 --with-tune=core-avx2 Here are the numbers: === gcc Summary === # of expected passes108884 # of unexpected failures442 === gfortran Summary === # of expected passes45221 # of unexpected failures565 === g++ Summary === # of expected passes86975 # of unexpected failures1012 See: http://gcc.gnu.org/ml/gcc-testresults/2014-05/msg00845.html -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
Ah HAH - this SANITY stuff is finally going to show its value for GCC development itself.
Look at this: http://gcc.gnu.org/ml/gcc-testresults/2014-03/msg00450.html /home/toon/compilers/gcc/gcc/ira-color.c:1508:29: runtime error: signed integer overflow: -8847224 * 270 cannot be represented in type 'int' ... Configured by: configure --prefix=/home/toon/compilers/install --with-gnu-as --with-gnu-ld --with-build-config=bootstrap-ubsan --enable-languages=fortran,objc --disable-multilib --disable-nls --with-arch=core-avx2 --with-tune=core-avx2 -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
Re: clang vs free software
On 01/24/2014 12:12 AM, Jonathan Wakely wrote: On 23 January 2014 22:56, Chris Lattner wrote: Unrelated to this thread, it would be great for this web page to get updated. You may find it to be "a better-supported point of view", but it is also comparing against clang 3.2, which is from the end of 2012, and a lot has changed since then. Like a lot has changed since the GCC 4.2 version used in http://clang.llvm.org/diagnostics.html :-) (I'm glad the page acknowledges it uses that version now, thanks to whoever did that!) Ask them about the Fortran performance. Cray, Inc. doesn't have any problem to include gfortran with their latest supercomputing offerings as one of the three supported compilers (their own, Intel's, and GNU). ECMWF [1] just bought one and is installing it. :-) [1] http://blogs.wsj.com/metropolis/2012/10/24/weather-journal-storm-could-make-next-week-a-mess/ -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
Re: How to generate AVX512 instructions now (just to look at them).
On 01/03/2014 07:04 PM, Jakub Jelinek wrote: On Fri, Jan 03, 2014 at 05:04:55PM +0100, Toon Moene wrote: I am trying to figure out how the top-consuming routines in our weather models will be compiled when using AVX512 instructions (and their 32 512 bit registers). I thought an up-to-date trunk version of gcc, using the command line: <...>/gfortran -Ofast -S -mavx2 -mavx512f would do that. Unfortunately, I do not see any use of the new zmm.. registers, which might mean that AVX512 isn't used yet. This is how the nightly build job builds the trunk gfortran compiler: configure --prefix=/home/toon/compilers/install --with-gnu-as --with-gnu-ld --enable-languages=fortran<,other-language> --disable-multilib --disable-nls --with-arch=core-avx2 --with-tune=core-avx2 Is it the --with-arch=core-avx2 ? Or perhaps the --with-gnu-as --with-gnu-ld (because the installed ones do not support AVX512 yet ?). You shouldn't need assembler with AVX512 support just for -S, if I try say simple: void f1 (int *__restrict e, int *__restrict f) { int i; for (i = 0; i < 1024; i++) e[i] = f[i] * 7; } I don't doubt that would work, what I'm interested in, is (cat verintlin.f): SUBROUTINE VERINT ( I KLON , KLAT , KLEV , KINT , KHALO I , KLON1 , KLON2 , KLAT1 , KLAT2 I , KP , KQ , KR R , PARG , PRES R , PALFH , PBETH R , PALFA , PBETA , PGAMA ) C C*** C C VERINT - THREE DIMENSIONAL INTERPOLATION C C PURPOSE: C C THREE DIMENSIONAL INTERPOLATION C C INPUT PARAMETERS: C C KLON NUMBER OF GRIDPOINTS IN X-DIRECTION C KLAT NUMBER OF GRIDPOINTS IN Y-DIRECTION C KLEV NUMBER OF VERTICAL LEVELS C KINT TYPE OF INTERPOLATION C= 1 - LINEAR C= 2 - QUADRATIC C= 3 - CUBIC C= 4 - MIXED CUBIC/LINEAR C KLON1 FIRST GRIDPOINT IN X-DIRECTION C KLON2 LAST GRIDPOINT IN X-DIRECTION C KLAT1 FIRST GRIDPOINT IN Y-DIRECTION C KLAT2 LAST GRIDPOINT IN Y-DIRECTION C KPARRAY OF INDEXES FOR HORIZONTAL DISPLACEMENTS C KQARRAY OF INDEXES FOR HORIZONTAL DISPLACEMENTS C KRARRAY OF INDEXES FOR VERTICAL DISPLACEMENTS C PARG ARRAY OF ARGUMENTS C PALFH ALFA HAT C PBETH BETA HAT C PALFA ARRAY OF WEIGHTS IN X-DIRECTION C PBETA ARRAY OF WEIGHTS IN Y-DIRECTION C PGAMA ARRAY OF WEIGHTS IN VERTICAL DIRECTION C C OUTPUT PARAMETERS: C C PRES INTERPOLATED FIELD C C HISTORY: C C J.E. HAUGEN 1 1992 C C*** C IMPLICIT NONE C INTEGER KLON , KLAT , KLEV , KINT , KHALO, IKLON1 , KLON2 , KLAT1 , KLAT2 C INTEGER KP(KLON,KLAT), KQ(KLON,KLAT), KR(KLON,KLAT) REALPARG(2-KHALO:KLON+KHALO-1,2-KHALO:KLAT+KHALO-1,KLEV) , RPRES(KLON,KLAT) , R PALFH(KLON,KLAT) , PBETH(KLON,KLAT) , R PALFA(KLON,KLAT,4) , PBETA(KLON,KLAT,4), R PGAMA(KLON,KLAT,4) C INTEGER JX, JY, IDX, IDY, ILEV REAL Z1MAH, Z1MBH C C LINEAR INTERPOLATION C DO JY = KLAT1,KLAT2 DO JX = KLON1,KLON2 IDX = KP(JX,JY) IDY = KQ(JX,JY) ILEV = KR(JX,JY) C PRES(JX,JY) = PGAMA(JX,JY,1)*( C + PBETA(JX,JY,1)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY-1,ILEV-1) + + PALFA(JX,JY,2)*PARG(IDX ,IDY-1,ILEV-1) ) + + PBETA(JX,JY,2)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY ,ILEV-1) + + PALFA(JX,JY,2)*PARG(IDX ,IDY ,ILEV-1) ) ) C+ + + PGAMA(JX,JY,2)*( C+ + PBETA(JX,JY,1)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY-1,ILEV ) + + PALFA(JX,JY,2)*PARG(IDX ,IDY-1,ILEV ) ) + + PBETA(JX,JY,2)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY ,ILEV ) + + PALFA(JX,JY,2)*PARG(IDX ,IDY ,ILEV ) ) ) ENDDO ENDDO C RETURN END i.e., real Fortran code, not just intrinsics :-) Thanks, -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
How to generate AVX512 instructions now (just to look at them).
I am trying to figure out how the top-consuming routines in our weather models will be compiled when using AVX512 instructions (and their 32 512 bit registers). I thought an up-to-date trunk version of gcc, using the command line: <...>/gfortran -Ofast -S -mavx2 -mavx512f would do that. Unfortunately, I do not see any use of the new zmm.. registers, which might mean that AVX512 isn't used yet. This is how the nightly build job builds the trunk gfortran compiler: configure --prefix=/home/toon/compilers/install --with-gnu-as --with-gnu-ld --enable-languages=fortran<,other-language> --disable-multilib --disable-nls --with-arch=core-avx2 --with-tune=core-avx2 Is it the --with-arch=core-avx2 ? Or perhaps the --with-gnu-as --with-gnu-ld (because the installed ones do not support AVX512 yet ?). Thanks in advance, -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
Hmmm, I think we've seen this problem before (lto build):
http://gcc.gnu.org/ml/gcc-testresults/2013-12/msg1.html FAILED: Bootstrap (build config: lto; languages: fortran; trunk revision 205557) on x86_64-unknown-linux-gnu In function 'release', inlined from 'release' at /home/toon/compilers/gcc/gcc/vec.h:1428:3, inlined from '__base_dtor ' at /home/toon/compilers/gcc/gcc/vec.h:1195:0, inlined from 'compute_antic_aux' at /home/toon/compilers/gcc/gcc/tree-ssa-pre.c:2212:0, inlined from 'compute_antic' at /home/toon/compilers/gcc/gcc/tree-ssa-pre.c:2493:0, inlined from 'do_pre' at /home/toon/compilers/gcc/gcc/tree-ssa-pre.c:4738:23, inlined from 'execute' at /home/toon/compilers/gcc/gcc/tree-ssa-pre.c:4818:0: /home/toon/compilers/gcc/gcc/vec.h:312:3: error: attempt to free a non-heap object 'worklist' [-Werror=free-nonheap-object] ::free (v); ^ lto1: all warnings being treated as errors make[4]: *** [/dev/shm/wd26755/cczzGuTZ.ltrans13.ltrans.o] Error 1 make[4]: *** Waiting for unfinished jobs lto-wrapper: make returned 2 exit status /usr/bin/ld: lto-wrapper failed collect2: error: ld returned 1 exit status -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
Re: Vectorization: Loop peeling with misaligned support.
On 11/16/2013 04:25 AM, Tim Prince wrote: Many decisions on compiler defaults still are based on an unscientific choice of benchmarks, with gcc evidently more responsive to input from the community. I'm also quite convinced that we are hampered by the fact that there is no IPA on alignment in GCC. I bet that in the average Fortran program, most arrays are suitably aligned (after all, they're either a - by definition - SAVEd array in a module, or an ALLOCATEd array), and code that does this: CALL AAP(..., A(2), ...) is relatively sparse. -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
Spamming the gcc-testresults mailing list (or not).
I have now got two of these: - - - - - - - - - 8< - - - - - - - - - 8< - - - - - - - - - Hi. This is the qmail-send program at sourceware.org. I'm afraid I wasn't able to deliver your message to the following addresses. This is a permanent error; I've given up. Sorry it didn't work out. : In an effort to cut down on our spam intake, we block email that is detected as spam by the SpamAssassin program. Your email was flagged as spam by that program. See: http://spamassassin.apache.org/ for more details. See http://sourceware.org/lists.html#sourceware-list-info for more information. If you are not a "spammer", we apologize for the inconvenience. You can add yourself to the gcc.gnu.org "global allow list" by sending email *from*the*blocked*email*address* to: global-allow-subscribe-toon=moene@gcc.gnu.org For certain types of blocks, this will enable you to send email without being subjected to further spam blocking. This will not allow you to post to a list if you have been explicitly blocked, if you are posting an off-topic message, if you are sending an attachment that looks like a virus, etc. Contact gcc-testresults-ow...@gcc.gnu.org if you have questions about this. (#5.7.2) --- Below this line is a copy of the message. Return-Path: Received: (qmail 30906 invoked by uid 89); 14 Nov 2013 13:20:43 - Authentication-Results: sourceware.org; auth=none X-Virus-Checked: by ClamAV 0.98 on sourceware.org X-Virus-Found: No X-Spam-Flag: YES X-Spam-SWARE-Status: Yes, score=5.7 required=5.0 tests=AWL,BAYES_99,KAM_STOCKTIP,RDNS_NONE,URIBL_BLOCKED autolearn=no version=3.3.2 X-Spam-Status: Yes, score=5.7 required=5.0 tests=AWL,BAYES_99,KAM_STOCKTIP,RDNS_NONE,URIBL_BLOCKED autolearn=no version=3.3.2 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on sourceware.org X-Spam-Level: * X-HELO: moene.org Received: from Unknown (HELO moene.org) (80.101.130.238) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES128-SHA encrypted) ESMTPS; Thu, 14 Nov 2013 13:20:16 + Received: from toon by moene.org with local (Exim 4.80) (envelope-from ) id 1Vgwq6-00047o-5E for gcc-testresu...@gcc.gnu.org; Thu, 14 Nov 2013 14:20:06 +0100 To: gcc-testresu...@gcc.gnu.org Subject: FAILED: Bootstrap (build config: ubsan; languages: fortran; trunk revision 204790) on x86_64-unknown-linux-gnu Message-Id: From: Toon Moene Date: Thu, 14 Nov 2013 14:20:06 +0100 none needed if [ x"-fpic" != x ]; then \ /scratch/toon/bd4979/./prev-gcc/xgcc -B/scratch/toon/bd4979/./prev-gcc/ -B/home/toon/compilers/install/x86_64-unknown-linux- - - - - - - - - - 8< - - - - - - - - - 8< - - - - - - - - - whereas the one of today succeeded in getting through: http://gcc.gnu.org/ml/gcc-testresults/2013-11/msg01098.html Do I need to worry ? -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
Something wrong with bootstrap-lto, or lto itself:
Consider this: http://gcc.gnu.org/ml/gcc-testresults/2013-10/msg02329.html and http://gcc.gnu.org/ml/gcc-testresults/2013-10/msg02258.html /scratch/toon/bd5894/./prev-gcc/xg++ -B/scratch/toon/bd5894/./prev-gcc/ -B/home/toon/compilers/install/x86_64-unknown-linux-gnu/bin/ -nostdinc++ -B/scratch/toon/bd5894/prev-x86_64-unknown-linux-gnu/libstdc++-v3/src/.libs -B/scratch/toon/bd5894/prev-x86_64-unknown-linux-gnu/libstdc++-v3/libsupc++/.libs -I/scratch/toon/bd5894/prev-x86_64-unknown-linux-gnu/libstdc++-v3/include/x86_64-unknown-linux-gnu -I/scratch/toon/bd5894/prev-x86_64-unknown-linux-gnu/libstdc++-v3/include -I/home/toon/compilers/gcc/libstdc++-v3/libsupc++ -L/scratch/toon/bd5894/prev-x86_64-unknown-linux-gnu/libstdc++-v3/src/.libs -L/scratch/toon/bd5894/prev-x86_64-unknown-linux-gnu/libstdc++-v3/libsupc++/.libs -g -O2 -flto=jobserver -frandom-seed=1 -DIN_GCC-fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wmissing-format-attribute -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -Werror -fno-common -DHAVE_CONFIG_H -static-libstdc++ -static-libgcc -o cc1plus \ cp/cp-lang.o c-family/stub-objc.o cp/call.o cp/decl.o cp/expr.o cp/pt.o cp/typeck2.o cp/class.o cp/decl2.o cp/error.o cp/lex.o cp/parser.o cp/ptree.o cp/rtti.o cp/typeck.o cp/cvt.o cp/except.o cp/friend.o cp/init.o cp/method.o cp/search.o cp/semantics.o cp/tree.o cp/repo.o cp/dump.o cp/optimize.o cp/mangle.o cp/cp-objcp-common.o cp/name-lookup.o cp/cxx-pretty-print.o cp/cp-gimplify.o cp/cp-array-notation.o cp/lambda.o cp/vtable-class-hierarchy.o attribs.o incpath.o c-family/c-common.o c-family/c-cppbuiltin.o c-family/c-dump.o c-family/c-format.o c-family/c-gimplify.o c-family/c-lex.o c-family/c-omp.o c-family/c-opts.o c-family/c-pch.o c-family/c-ppoutput.o c-family/c-pragma.o c-family/c-pretty-print.o c-family/c-semantics.o c-family/c-ada-spec.o c-family/array-notation-common.o c-family/c-ubsan.o i386-c.o glibc-c.o cc1plus-checksum.o libbackend.a main.o tree-browser.o libcommon-target.a libcommon.a ../libcpp/libcpp.a ../libdecnumber/libdecnumber.a libcommon.a ../libcpp/libcpp.a ../libbacktrace/.libs/libbacktrace.a ../libiberty/libiberty.a ../libdecnumber/libdecnumber.a -lcloog-isl -lisl -lmpc -lmpfr -lgmp -rdynamic -ldl -L../zlib -lz In function 'release', inlined from '_ZN7va_heap7reserveIbEEvRP3vecIT_S_8vl_embedEjb.part.95' at /home/toon/compilers/gcc/gcc/vec.h:288:7, inlined from 'reserve', inlined from '_ZN3vecIb7va_heap6vl_ptrE7reserveEjb.part.96' at /home/toon/compilers/gcc/gcc/vec.h:1367:3, inlined from 'reserve.constprop', inlined from 'reserve_exact' at /home/toon/compilers/gcc/gcc/vec.h:1387:45, inlined from 'safe_grow' at /home/toon/compilers/gcc/gcc/vec.h:1515:3, inlined from 'safe_grow_cleared' at /home/toon/compilers/gcc/gcc/vec.h:1529:3, inlined from 'vect_bb_vectorization_profitable_p' at /home/toon/compilers/gcc/gcc/tree-vect-slp.c:2027:0, inlined from 'vect_slp_analyze_bb_1' at /home/toon/compilers/gcc/gcc/tree-vect-slp.c:2174:55, inlined from 'vect_slp_analyze_bb' at /home/toon/compilers/gcc/gcc/tree-vect-slp.c:2229:44: /home/toon/compilers/gcc/gcc/vec.h:316:3: error: attempt to free a non-heap object 'life' [-Werror=free-nonheap-object] ::free (v); ^ lto1: all warnings being treated as errors -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
Sorry, couldn't resist this one:
https://twitter.com/ToonMoene/status/392392928493973504 -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
Re: Compilation flags in libgfortran
On 10/15/2013 03:58 PM, Igor Zamyatin wrote: Hi All! Is there any particular reason that matmul* modules from libgfortran are compiled with -O2 -ftree-vectorize? I see some regressions on Atom processor after r202980 (http://gcc.gnu.org/ml/gcc-cvs/2013-09/msg00846.html) Why not just use O3 for those modules? Igor, It helps (:-) to send questions about gfortran and its run time library libgfortran cc'd to fort...@gcc.gnu.org, because not every GNU Fortran maintainer reads gcc@gcc.gnu.org Kind regards, -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
Re: Question about vectorization limit
On 05/31/2013 03:54 PM, Jakub Jelinek wrote: > I wrote: But this "inner loop" has at least 3 basic blocks - so what does the "loop->num_nodes != 2" test exactly codify ? With the above testcase it has just 2. Before ifcvt pass it still has 4: Ah, I missed that subtle part. So my example is just no complex enough to deal with "loop->num_nodes != 2" ... Thanks, -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
Re: Question about vectorization limit
On 05/31/2013 03:41 PM, Jakub Jelinek wrote: On Fri, May 31, 2013 at 03:21:51PM +0200, Toon Moene wrote: SUBROUTINE XYZ(A, B, N) DIMENSION A(N), B(N) DO I = 1, N IF (A(I)> 0.0) THEN A(I) = B(I) / A(I) ELSE A(I) = B(I) ENDIF ENDDO END Well, in this case (with -Ofast) it is just the case that ifcvt or earlier passes did a poor job at moving the load from B(I) before the conditional, which, if we ignore exceptions, should be possible, as both branches read from the same memory. The store to A(I) is already hoisted by cselim out of the conditional. If you rewrite the above into: SUBROUTINE XYZ(A, B, N) DIMENSION A(N), B(N) DO I = 1, N C = B(I) IF (A(I)> 0.0) THEN A(I) = C / A(I) ELSE A(I) = C ENDIF ENDDO END then it is vectorized just fine. But this "inner loop" has at least 3 basic blocks - so what does the "loop->num_nodes != 2" test exactly codify ? Is Dehao just looking at the wrong test ? And why is this test there ? -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
Re: Question about vectorization limit
On 05/31/2013 10:20 AM, Richard Biener wrote: So - I doubt that you both do not get any ICEs and more performance. I added the second suggested patch: Index: tree-vect-loop-manip.c === --- tree-vect-loop-manip.c (revision 199454) +++ tree-vect-loop-manip.c (working copy) @@ -985,7 +985,7 @@ /* All loops have an outer scope; the only case loop->outer is NULL is for the function itself. */ || !loop_outer (loop) - || loop->num_nodes != 2 +/*|| loop->num_nodes != 2 */ || !empty_block_p (loop->latch) || !single_exit (loop) /* Verify that new loop exit condition can be trivially modified. */ And I still get no ICE sicking 3.5 million lines of Fortran on this compiler. This is pretty suspicious - it might well be that checks further down will prohibit vectorization to do anything on loops like SUBROUTINE XYZ(A, B, N) DIMENSION A(N), B(N) DO I = 1, N IF (A(I) > 0.0) THEN A(I) = B(I) / A(I) ELSE A(I) = B(I) ENDIF ENDDO END -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
Re: Question about vectorization limit
On 05/30/2013 02:46 AM, Dehao Chen wrote: In tree-vect-loop.c, it limits the vectorization only to loops that have 2 BBs: /* Inner-most loop. We currently require that the number of BBs is exactly 2 (the header and latch). Vectorizable inner-most loops look like this: (pre-header) | header<+ | || | +--> latch --+ | (exit-bb) */ if (loop->num_nodes != 2) { if (dump_enabled_p ()) dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, "not vectorized: control flow in loop."); return NULL; } Any insights why the limit is set to 2? We found that removing this limit actually improve performance for many applications. It might have been just "safety first" - we know how to do single basic block inner loops, let's stick with them for the moment (this development was started around a decade ago). Our 3.5 million lines of Fortran 90 code (mostly array expressions) and 125,000 lines of arbitrary C code is currently normally compiled with: $ gfortran -v Using built-in specs. COLLECT_GCC=gfortran COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/4.7/lto-wrapper Target: x86_64-linux-gnu Configured with: ../src/configure -v --with-pkgversion='Debian 4.7.3-4' --with-bugurl=file:///usr/share/doc/gcc-4.7/README.Bugs --enable-languages=c,c++,go,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-4.7 --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.7 --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-gnu-unique-object --enable-plugin --with-system-zlib --enable-objc-gc --with-cloog --enable-cloog-backend=ppl --disable-cloog-version-check --disable-ppl-version-check --enable-multiarch --with-arch-32=i586 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu Thread model: posix gcc version 4.7.3 (Debian 4.7.3-4) So I tried it with: $ /usr/snp/bin/gfortran -v Using built-in specs. COLLECT_GCC=/usr/snp/bin/gfortran COLLECT_LTO_WRAPPER=/usr/snp/libexec/gcc/x86_64-unknown-linux-gnu/4.7.4/lto-wrapper Target: x86_64-unknown-linux-gnu Configured with: ../gcc-4_7-branch/configure --prefix=/usr/snp --with-gnu-as --with-gnu-ld --enable-languages=fortran --disable-libmudflap --disable-multilib --disable-nls --with-arch=native --with-tune=native Thread model: posix gcc version 4.7.4 20130530 (prerelease) (GCC) augmented by this single change: toon@super:~/compilers/gcc-4_7-branch/gcc$ svn diff Index: tree-vect-loop.c === --- tree-vect-loop.c(revision 199454) +++ tree-vect-loop.c(working copy) @@ -1002,6 +1002,8 @@ | (exit-bb) */ + /* Disabled check + if (loop->num_nodes != 2) { if (vect_print_dump_info (REPORT_BAD_FORM_LOOPS)) @@ -1009,6 +1011,8 @@ return NULL; } + */ + if (empty_block_p (loop->header)) { if (vect_print_dump_info (REPORT_BAD_FORM_LOOPS)) Amazingly enough, I didn't hit *any* ICE. Also, running the generated executables produced reasonable results (you have to trust me that it is *very hard* to fake correct meteorological results if you blow up the generated code). Unfortunately, the relative importance of conditional code in inner loops is not sufficient to show any speedup on our code. Nevertheless, it would be a huge improvement on *other* codes if we could lift this restriction. -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
undefined reference's in bootstrap-asan.
See: http://gcc.gnu.org/ml/gcc-testresults/2013-05/msg00841.html It seems that: /scratch/toon/bd24983/./gcc/xgcc -B/scratch/toon/bd24983/./gcc/ -B/home/toon/compilers/install/x86_64-unknown-linux-gnu/bin/ -B/home/toon/compilers/install/x86_64-unknown-linux-gnu/lib/ -isystem /home/toon/compilers/install/x86_64-unknown-linux-gnu/include -isystem /home/toon/compilers/install/x86_64-unknown-linux-gnu/sys-include-g -O2 -static-libstdc++ -static-libgcc -o fixincl fixincl.o fixtests.o fixfixes.o server.o procopen.o fixlib.o fixopts.o ../libiberty/libiberty.a should have a "libasan.a" included ... -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
Re: gcc : c++11 : full support : eta?
On 01/23/2013 08:43 PM, Richard Biener wrote: Ah, well - the old issue that LLVM has just become a very good marketing machinery (and we've stayed at being a compiler - heh). The problem of being on a compiler-only list is that this is becoming a self-evident truth. However, as a meteorologist, I know better (HAH :-)) WRF (http://www.wrf-model.org/index.php) is our eternal Nemesis. It's a free weather forecasting model and data assimilation system developed by the best US academic institutions. "WRF has a rapidly growing community of users, and workshops and tutorials are held each year at NCAR. WRF is currently in operational use at NCEP, AFWA and other centers." So it is hopeless to fight it. It is free (our European-community-developed model is not) and it has a *huge* backing from the US academic community. Unfortunately for this virtual reality, it doesn't match the real one. We do still exist, and there is no indication (on the basis of meteorological verification) that we will be dethroned shortly. And yes, we understand why academia is mad with WRF - we *do* understand why they like to have their students play with it (it is much *easier*) - that doesn't mean that a focused course can't make students resident at KNMI (the Dutch Meteorological Institute) familiar enough with *our* weather model to work on it. Don't get your nickers in a twist (as we would say in the seventies). -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
Re: contrib/test_summary mail error
On 01/04/2013 04:14 PM, Cynthia Rempel wrote: Hi, I tried to run the following command: ../gcc/contrib/test_summary -p my_commentary.txt -m gcc-testresu...@gcc.gnu.org | sh I added "| sed -e 's/Mail/mailx/'" before the "| sh" and it worked for Debian Tesing (several months ago). Apparently I had picked the wrong mail package too. Perhaps test_summary needs some sort of configure-type approach to figure out what the mail sender is on the host system ... -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news