Re: Help with fortran pointer ans OpenACC
Hi Tobias, many thanks for all these informations! - I confirm, this clause solve the problem with GNU Fortran - This workaround is fully compatible with Nvidia Fortran - The problem with Cray Fortran remains (with this compiler, the test-case runs but the result is wrong) I've tested the suggested options for debugging but I must admit that these intermediates files are a little bit out of my knowledge level to easily identify the problem. In my work, portability is important and I have access to different compilers and GPU architectures and may be my best contribution to GNU software is limited to these tests and reports with small test-cases from a large scientific code that is expected to run on most GPU with various compilers. Thanks again for your help and detailed explanation Patrick Le 23/08/2023 à 13:41, Tobias Burnus a écrit : Hi, On 23.08.23 12:19, Patrick Begou via Fortran wrote: For several days I have some trouble with OpenACC offloading and fortran pointers. I'm testing with a very small peace of code to investigate but I do not progress for several days and I need your help. Could someone give me advices or a small explanation on what I have not understood there ? First, for debugging, using -fdump-tree-original -fdump-tree-gimple -fdump-tree-omplower and looking at the file .* might help - grep for '#pragma'; this shows the internal representation (incomplete) in a C like output. That's how I spotted the issue below. * * * For gfortran, the issue is: !$acc parallel loop collapse(2) default(present) if(runongpu) It works if you add a "present(current)" → see also newly filed https://gcc.gnu.org/PR16 gfortran seems to follow - but not completely - the spec: Quoting from OpenACC 3.2: "2.6.2 Variables with Implicitly Determined Data Attributes": "A scalar variable will be treated as if it appears either: * In a copy clause if the compute construct is a kernels construct. * In a firstprivate clause otherwise. Note: Any default(none) clause visible at the compute construct applies to both aggregate and scalar variables. However, any default(present) clause visible at the compute construct applies only to aggregate variables." However, the glossary defines: "Aggregate variables – a variable of any non-scalar datatype, including array or composite variables. In Fortran, this includes any variable with allocatable or pointer attribute and character variables." And, as you have a (scalar) Fortran pointer, the following should have applied instead: "An aggregate variable will be treated as if it appears either: • In a present clause if there is a default(present) clause visible at the compute construct. • In a copy clause otherwise." * * * Thus: It looks as if your program is valid, adding 'present(current)' will fix it for gfortran (and Cray?), 'default(...)' and implicit mapping is confusing, and there is now a gfortran PR to track this issue: https://gcc.gnu.org/PR16 Tobias - Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
Re: Help with fortran pointer ans OpenACC
Hi, On 23.08.23 12:19, Patrick Begou via Fortran wrote: For several days I have some trouble with OpenACC offloading and fortran pointers. I'm testing with a very small peace of code to investigate but I do not progress for several days and I need your help. Could someone give me advices or a small explanation on what I have not understood there ? First, for debugging, using -fdump-tree-original -fdump-tree-gimple -fdump-tree-omplower and looking at the file .* might help - grep for '#pragma'; this shows the internal representation (incomplete) in a C like output. That's how I spotted the issue below. * * * For gfortran, the issue is: !$acc parallel loop collapse(2) default(present) if(runongpu) It works if you add a "present(current)" → see also newly filed https://gcc.gnu.org/PR16 gfortran seems to follow - but not completely - the spec: Quoting from OpenACC 3.2: "2.6.2 Variables with Implicitly Determined Data Attributes": "A scalar variable will be treated as if it appears either: * In a copy clause if the compute construct is a kernels construct. * In a firstprivate clause otherwise. Note: Any default(none) clause visible at the compute construct applies to both aggregate and scalar variables. However, any default(present) clause visible at the compute construct applies only to aggregate variables." However, the glossary defines: "Aggregate variables – a variable of any non-scalar datatype, including array or composite variables. In Fortran, this includes any variable with allocatable or pointer attribute and character variables." And, as you have a (scalar) Fortran pointer, the following should have applied instead: "An aggregate variable will be treated as if it appears either: • In a present clause if there is a default(present) clause visible at the compute construct. • In a copy clause otherwise." * * * Thus: It looks as if your program is valid, adding 'present(current)' will fix it for gfortran (and Cray?), 'default(...)' and implicit mapping is confusing, and there is now a gfortran PR to track this issue: https://gcc.gnu.org/PR16 Tobias - Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
Help with fortran pointer ans OpenACC
Hi everyone! For several days I have some trouble with OpenACC offloading and fortran pointers. I'm testing with a very small peace of code to investigate but I do not progress for several days and I need your help. The attached code goal is just to initialize some data on the GPU and is representative of my problem on a very large code. - It works fine with nvfortran (22.11) - it do not work with Gnu fortran (14.0.0 20230822 - experimental) => invalid memory - it do not work with Cray Fortran (15.1) => wrong results so I think the problem is the code, not the compiler. It is also difficult to find some openACC offloading examples using Fortran pointers and I'm stuck with this problem. Could someone give me advices or a small explanation on what I have not understood there ? Thanks for your help Patrick Code details: - all my fortran modules are grouped in the same file for simplification of the provided test-case. - compilation with GNU Firtran is: "gfortran -cpp -g -fopenacc grouped.f90" - setting "runongpu=.false." line 7 (no GPU) the result is: Default init OK Default value OK - setting "runongpu=.true." line 7 (no GPU) the result is: Default init OK libgomp: cuStreamSynchronize error: an illegal memory access was encountered - with nvhpc/22.11 and "runongpu=.true.", built with "nvfortran -acc=gpu,noautopar -gpu=cc80 -Minfo=accel grouped.f90" Default init OK Default value OK != ! Just to say run on the device or not. != module openacc_defs implicit none logical, save :: runongpu=.true. end module openacc_defs != ! Data structure for r2_tab and r2_ptr to manage pointers. ! r2_ptr is used in an allocatable array for a dynamivc number of r2_tab variables ! but could be used later in chained lists ! Memory is allocated on GPU each time. != module tab_m implicit none type r2_tab double precision, dimension(:,:), allocatable :: val integer :: dim1 integer :: dim2 end type r2_tab type r2_ptr type(r2_tab), pointer :: ptr type(r2_ptr), pointer :: next end type r2_ptr contains subroutine new_r2_tab(tab,n,m) implicit none integer, intent(in) ::n,m type(r2_tab), pointer, intent(inout) ::tab ! if (.not. associated(tab)) allocate(tab) if (allocated(tab%val)) deallocate(tab%val) allocate (tab%val(n,m)) tab%dim1=n tab%dim2=m tab%val(:,:)=1.0D0 !$acc enter data create(tab) !$acc enter data create(tab%val) end subroutine new_r2_tab end module tab_m != ! This module implements data processing (just an initialization here) ! if runongpu is .true. initialization is run on the GPU and then host is updated. != module manage_data use openacc_defs use tab_m implicit none contains subroutine set_default_val_gpu(liste, defval, nitems) implicit none integer, intent(in) :: nitems double precision, intent(in) :: defval type(r2_ptr), dimension(nitems) :: liste type(r2_tab), pointer :: current=>null() integer:: item,j,k do item=1, nitems current=>liste(item)%ptr !print*,current%dim1, current%dim2,size(current%val) !$acc parallel loop collapse(2) default(present) if(runongpu) do k=1, current%dim2 do j=1, current%dim1 current%val(j,k)=defval end do end do !$acc update if(runongpu) host(current%val) end do end subroutine set_default_val_gpu end module manage_data != ! main program. ! != program main use tab_m use manage_data implicit none integer, parameter:: N=5 type(r2_ptr), dimension(N) :: liste integer:: i,j,k type(r2_tab), pointer :: current=>null() double precision :: total ! Initialize do i=1,N nullify(liste(i)%ptr) nullify(liste(i)%next) end do ! Allocate (do not manage "next" pointer, all elements are set to 1.0) do i=1,N call new_r2_tab(liste(i)%ptr,N,i*N) end do ! Check all is correct on host side do i=1,N if (sum(liste(i)%ptr%val) .NE. N*i*N) then write(6,*)"Something goes wrong",sum(liste(i)%ptr%val)," != ",N*i*N STOP (1) end if end do write(6,*) "Default init OK" ! Update on host (runongpu is false) call