Re: Aw: Re: Add: [Bug fortran/121043] [16 Regression] Tests of OpenCoarray fail to pass, works on 15.1.1 20250712

Mikael Morin Sat, 26 Jul 2025 13:33:07 -0700

On 26/07/2025 at 19:03, Harald Anlauf wrote:

Sent: Saturday, 26 July 2025 at 15:17
From: "Mikael Morin" <morin-mik...@orange.fr>
To: "Harald Anlauf" <anl...@gmx.de>, ve...@gmx.de
CC: fortran@gcc.gnu.org
Subject: Re: Aw: Re: Add: [Bug fortran/121043] [16 Regression] Tests of 
OpenCoarray fail to pass, works on 15.1.1 20250712


On 24/07/2025 at 22:01, Harald Anlauf wrote:

Hi Andre,

Sent: Thursday, 24 July 2025 at 10:25
From: "Andre Vehreschild" <ve...@gmx.de>
To: "Harald Anlauf" <anl...@gmx.de>
CC: fortran@gcc.gnu.org
Subject: Re: Add: [Bug fortran/121043] [16 Regression] Tests of OpenCoarray 
fail to pass, works on 15.1.1 20250712

Hi Harald,

<snip>

Did that help?


Actually this discussion is quite helpful to me, so I (and maybe
others) understand more of the underlying stuff.

I now spent some time looking through portions of the F2023 standard,
and I think that it answers many questions in that respect:

- 10.2 Assignment, esp. 10.2.1.3 Interpretation of intrinsic
assignments

- 11.1.3 ASSOCIATE construct

- Transformational intrinsics: this_image, team_number, ...

It seems to be clear in most cases on which image something is
evaluated, and in which order.

I mean, we are way off the original question, which was whether it
is OK to always compute a function result on the image initiating a
communication instead of in the caf_accessor.


I am still confused what you mean by "initiating a communication".


When you use OpenCoarrays and a coindexed reference gets executed, a
communication is triggered.

OK.

The function you are talking about takes an argument, interpreted
in the way defined by the standard, and each image evaluates its
portion.


That is where I was confused. I had the dumb idea to evaluate certain
functions not on the calling image, but on the remote one. (Again,
OpenCoarrays triggers a communication, when the coindex points to an
image different from this_image()). My last patch remedies this.
Function calls in an expression having a coindex are now always
evaluated on the calling image.


Can you elaborate what you mean here?  It is very unclear to me.
Function evaluation, or subroutine calls require their arguments
to be evaluated before actually invoking the procedures.

See F2023:15.5.3 Function reference and 15.5.4 Subroutine reference,
where there is nothing special about coarrays or coindexed objects.
No need to even remotely think about doing anything on
different images.  This is also consistent with the text on
assignments, associate, etc.

So do I understand your comment correctly that the coarray implementation
does (did?) not respect the standard here?  Does it satisfy the
standard now?

In code such as

if (this_image() == 1) caf(:, team_number(row_team))[1, team_number = -1] = row


team_number is a transformational function, which I expect to get
evaluated on each image where the condition is fulfilled.  I don't
see any communication involved.


Well, the expression caf(...)[1, team_number=-1] when this_image() /= 1
triggers a communication. The program is writing into "remote" memory
here. I.e. memory that belongs to image 1 in the initial team. When
this code is executed by image 1 of the row_team, which maps to image 4
in the initial team (just for simplicity; it may map to a different one,
but let's assume it is mapped linearly here), then a portion of the caf
array in the initial team of image 1 is updated.


Well, this does not look right to me.

The "if (this_image() == 1)" prevents the assignment from being
executed for this_image() /= 1, no matter what you like.
If that is your proposal, then I am out.

When using
OpenCoarrays, this means that a message is composed, sent to the remote
image's communication thread, executed there, and a result is returned
indicating completion. This is where the communication is involved.
GFortran creates an accessor routine for writing data into `caf(:,
add_data%team_number_row_team) = data`. This routine is executed by the
communication thread on the remote image. My latest patch now ensures
that `add_data%team_number_row_team` is used instead of
`current_team(add_data%row_team)`. The latter cannot be executed in
the communication thread, because `row_team` is a pointer into memory
of the calling image.
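
The accessor idea described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code GFortran actually generates: the type name `add_data_t`, the routine name `accessor`, and the argument `payload` are hypothetical names introduced here. The point is that the calling image stores the already evaluated column index in `add_data`, so the routine running on the remote side never has to dereference anything that lives in the caller's memory.

```fortran
module accessor_sketch
    implicit none
    ! Hypothetical container for values precomputed on the calling image.
    type :: add_data_t
        integer :: team_number_row_team
    end type
contains
    ! Hypothetical accessor: it uses only the precomputed index and the
    ! transferred payload, never a pointer into the caller's memory.
    subroutine accessor(caf, add_data, payload)
        integer, intent(inout) :: caf(:, :)
        type(add_data_t), intent(in) :: add_data
        integer, intent(in) :: payload(:)
        caf(:, add_data%team_number_row_team) = payload
    end subroutine
end module

program accessor_demo
    use accessor_sketch
    implicit none
    integer :: caf(2, 3)
    type(add_data_t) :: add_data
    caf = -1
    ! On the calling image: evaluate the index once and store it.
    add_data%team_number_row_team = 2
    ! Stand-in for the call made by the remote communication thread.
    call accessor(caf, add_data, [7, 8])
    print *, caf
end program
```

The demo runs on a single image and only mimics the data flow; the real transfer between images is done by the coarray runtime.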

Yes, I know. All of this is confusing, and it also took me a long time to
understand it and figure out a way to do this fast and efficiently.

Then there is the assignment, which is difficult.  I haven't thought
long enough about the consistency between the condition which refers
to the current team, and coindex 1 of the initial team.
(This is why I asked about communicators and the like, as this assignment
might be correct only under very special conditions, or I just don't
understand it.)


To my understanding that assignment is allowed by the standard. Any
concerns?

So can you clarify that your code evaluates in the standard-defined
way?


I hope the above did it.


No, unfortunately I either do not agree with your reasoning because
I do not understand it, or I simply do not understand coarrays.

MPI is way simpler to use.

Hopefully somebody else can help here.  I am lost...

(For reference, the discussion started in another thread:
https://gcc.gnu.org/pipermail/fortran/2025-July/062451.html)

Let's see if I understand the problem.  Consider this example:

program p
    implicit none
    integer :: img, data
    integer, allocatable :: res(:)[:]
    img = this_image()
    data = img * img + 10  ! Something different on each image
    allocate(res(num_images())[*], source=-1)
    res(get_val())[1] = data
    sync all  ! ensure all images have written before image 1 prints
    if (this_image() == 1) print *, res
contains
    pure function get_val()
      integer :: get_val
      get_val = img
    end function
end program

The function get_val() returns the current image, so when assigning to
res(get_val())[1], it should be evaluated on the local image, otherwise
only the first element of res is populated (and with conflicting values).
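
The local evaluation described above can be made explicit with a small variant of the example, given here as a sketch only. The temporary `idx` and the `sync all` statement are additions introduced for illustration: the index is computed on the executing image first, and only then used in the coindexed assignment. The synchronization makes sure image 1 prints after every image has written its element.

```fortran
program p_hoisted
    implicit none
    integer :: img, data, idx
    integer, allocatable :: res(:)[:]
    img = this_image()
    data = img * img + 10  ! Something different on each image
    allocate(res(num_images())[*], source=-1)
    idx = get_val()        ! evaluated on the executing image
    res(idx)[1] = data     ! only the store targets image 1
    sync all               ! wait until every image has written
    if (this_image() == 1) print *, res
contains
    pure function get_val()
      integer :: get_val
      get_val = img
    end function
end program
```

With this ordering, element i of res on image 1 should hold i*i + 10 for each image i, which is the behaviour the direct form res(get_val())[1] = data is expected to have as well.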

What is important is what the standard says.

Of course, but it is also important that we first agree on the question
we want to answer or debate. I have the impression that in this thread
every message jumps to a different topic.

In the above code,
the array index is to be evaluated before the assignment takes place.
The following thus should be equivalent:

   res(img)[1] = data
   res(get_val())[1] = data
   res(this_image())[1] = data

That's my interpretation as well, but Andre's messages mentioned
evaluation on the remote image, which would mean something different.

Which are all the caf equivalent of mpi_gather with communicator mpi_comm_world
and root=0 (= image 1).  Confirmed with NAG and ifx.

Please let's not bring MPI to the party.

Andre, Harald, is this the original topic of this thread?
Is my reasoning correct?


Yes and no.  Your example has only the initial team with the associated
communicator (this is my layman's interpretation).

Andre's testcase in addition uses teams in ways I do not yet understand.


I dropped teams because it appeared to me as a complicating nuisance.
Is it central to the problem we are trying to debate here?
