https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121570
Bug ID: 121570
Summary: Very high cost of ieee_next_after function, gfortran optimization failure?
Product: gcc
Version: 14.2.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: fortran
Assignee: unassigned at gcc dot gnu.org
Reporter: b.j.braams at cwi dot nl
Target Milestone: ---
To clarify the subject heading: the appended code contains three timed loops
that each iterate over close to 2^32 values. The first loop contains only very
simple arithmetic, the second loop invokes the Fortran 'nearest' intrinsic, and
the third loop invokes ieee_next_after.
Using gfortran 14.2.1 with the -O5 option (which GCC treats the same as -O3),
the execution times for the three loops are 4.0 seconds, 12 seconds, and 920
seconds respectively on my Intel i7-1165G7 processor.
I think that Andi Kleen of the gcc-bugs mailing list may add some comments
here. He identified that, under the hood, ieee_next_after invokes routines to
save the FPU state before the actual computation and to restore it afterwards,
and that this is where the bulk of the time is spent. He comments further that
these invocations of _gfortran_ieee_procedure_{entry,exit} could perhaps be
hoisted out of the loop.
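To illustrate the save/restore pattern described above, here is a minimal sketch that brackets each call to 'nearest' with the standard ieee_get_status and ieee_set_status routines. It is only an emulation of the idea, not the libgfortran implementation, and I have not checked what _gfortran_ieee_procedure_{entry,exit} actually do internally. The names status_sketch, saved, s32 and k are made up for this example, and the 1000-step count is arbitrary.

```fortran
program status_sketch
  ! Illustrative only: emulates saving and restoring the floating-point
  ! status around each step. This is NOT the libgfortran implementation.
  use, intrinsic :: iso_fortran_env, only : real32
  use, intrinsic :: ieee_arithmetic
  implicit none
  type (ieee_status_type) :: saved   ! hypothetical name
  real (kind=real32) :: r32, s32, t32
  integer :: k
  t32 = huge(0.0_real32)
  r32 = 1.0_real32
  s32 = r32
  do k = 1, 1000
    call ieee_get_status (saved)     ! save the floating-point status
    r32 = nearest(r32,1.0_real32)    ! the actual computation
    call ieee_set_status (saved)     ! restore the status afterwards
    s32 = ieee_next_after(s32,t32)   ! library routine, for comparison
    if (r32.ne.s32) error stop 'nearest and ieee_next_after disagree'
  end do
  write (*,'(a,1pg16.9)') 'value after 1000 steps: ', r32
end program status_sketch
```

If the save and restore dominate the cost, as suggested, then hoisting those two calls outside the loop in a sketch like this should bring the timing close to that of the plain 'nearest' loop. That is a guess to be measured, not a result.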
(I note that with 'ifx -O5' the ieee_next_after loop takes only 17 seconds on
my system.)
Test program follows.
program main
  !..use and access
  use iso_fortran_env, only : int32, real32, real64
  use ieee_arithmetic
  implicit none
  !..data
  integer (kind=int32) :: i32, j32
  real (kind=real32) :: t32, r32
  real (kind=real64) :: tim0, tim1
  !..executable part
  j32 = huge(0_int32)
  i32 = -j32-1
  r32 = 0
  call cpu_time (tim0)
  ! 2^32 iterations of a simple arithmetic statement
  do while (i32.ne.j32)
    r32 = -r32+i32
    i32 = i32+1
  end do
  call cpu_time (tim1)
  write (*,'(a25,1pg9.2,1pg16.9)') &
    '2^32 simple arithmetic:', tim1-tim0, r32
  t32 = huge(0.0_real32)
  r32 = -t32
  call cpu_time (tim0)
  ! close to 2^32 iterations of fortran nearest
  do while (r32.ne.t32)
    r32 = nearest(r32,1.0_real32)
  end do
  call cpu_time (tim1)
  write (*,'(a25,1pg9.2,1pg16.9)') &
    '2^32 intrinsic nearest:', tim1-tim0, r32
  t32 = huge(0.0_real32)
  r32 = -t32
  call cpu_time (tim0)
  ! close to 2^32 iterations of ieee_next_after
  do while (r32.ne.t32)
    r32 = ieee_next_after(r32,t32)
  end do
  call cpu_time (tim1)
  write (*,'(a25,1pg9.2,1pg16.9)') &
    '2^32 ieee_next_after:', tim1-tim0, r32
  stop
end program main