Aw: Re: Add: [Bug fortran/121043] [16 Regression] Tests of OpenCoarray fail to pass, works on 15.1.1 20250712

Harald Anlauf Wed, 23 Jul 2025 14:21:04 -0700

Hi Andre,

> Sent: Wednesday, 23 July 2025 at 10:06
> From: "Andre Vehreschild" <ve...@gmx.de>
> To: "Harald Anlauf" <anl...@gmx.de>
> CC: fortran@gcc.gnu.org
> Subject: Re: Add: [Bug fortran/121043] [16 Regression] Tests of OpenCoarray fail to pass, works on 15.1.1 20250712
>
> Hi Harald,
> 
> > I spent some time understanding why Intel fails on your example.
> > Playing around with your code below, adding print statements,
> > rearranging code, etc., it appears that ifx has a problem with:
> 
> I came to that conclusion, too. But we both agree that independent of
> rewriting the change team to an associate, the mapping is valid, right? I mean
> `row` just is an alias for a slice of the `caf` array. This should be like an
> array with a pointer or allocatable attribute, that must not be reallocated,
> because it is sharing memory.
> 
> > >             associate(row => caf(:, team_number(row_team)))  
> > ...
> > >             col_t: change team(column_team)
> > >                 associate(cell => row(this_image(row_team)))
> > >                 cell = team_number(row_team)
> > >                 if (this_image(row_team) /= 1) row(this_image(row_team))[1, team=row_team] = cell
> > >                 end associate
> > >             end team col_t  
> > 
> > If I understand correctly, there is a very complicated dependency
> > of lhs and rhs for the assignment
> > 
> >   if (this_image(row_team) /= 1) row(this_image(row_team))[1, team=row_team] = cell
> > 
> > due to the associates.  Trying to rewrite it gives weird runtime errors.
> 
> Well, parallel program development usually does not make things easier.


But faster programs, if done right!  ;-)

> > This is beyond my paygrade.
> > 
> > Regarding the previous version:
> > 
> > > Anyhow to answer your initial question: The line:
> > > 
> > > if (this_image() == 1) caf(:, team_number(row_team))[1, team_number = -1] = row
> > 
> > Looking at it with my MPI knowledge, this does feel unusual.  What would be
> > the communicator here?  Is there an equivalent to MPI communicators with
> > Coarrays? (and does form_team create that communicator, change_team remap
> > coarray descriptors suitably, etc.?)  Or am I completely off?
> 
> MPI-Communicator wise: Every form team creates a new MPI-communicator that
> has only the images/processes participating in it. Every change team changes
> to that communicator on the current image, limiting the "world" view when the
> current team is addressed. But the parent communicators are not unavailable!
> By using for example team_number = -1 one can address the initial
> communicator/team. In MPI terms: MPI_COMM_WORLD.
> 
> A coarray itself does not know in which communicator/team it lives, because
> coarrays created in parent teams are available in their children. Both
> libraries (OpenCoarrays as well as caf_shmem) just make sure that coarrays
> that were allocated in a child team are deallocated when that team ends.
> 
> I totally understand that it feels unusual. Thinking about a Red-Black-SSOR
> over MPI one would be writing into the "wrong" communicator here. But well,
> this is mixing up algorithms with techniques. Just because one algorithm makes
> this feel wrong, it does not mean that another algorithm cannot do a reduce
> over all processes.
> 
> Did that help?

Actually this discussion is quite helpful to me, so I (and maybe others)
understand more of the underlying stuff.

I have now spent some time looking through portions of the F2023 standard,
and I think that it answers many questions in that respect:

- 10.2 Assignment, esp. 10.2.1.3 Interpretation of intrinsic assignments

- 11.1.3 ASSOCIATE construct

- Transformational intrinsics: this_image, team_number, ...

It seems to be clear in most cases on which image something is evaluated,
and in which order.

> I mean, we are way off of the original question, which was if it
> is ok to always compute a function result on the image initiating a
> communication instead of in the caf_accessor.

I am still confused what you mean by "initiating a communication".

The function you are talking about takes an argument, interpreted
in the way defined by the standard, and each image evaluates its
portion.

In code such as

> if (this_image() == 1) caf(:, team_number(row_team))[1, team_number = -1] = row

team_number is a transformational function, which I expect to get
evaluated on each image where the condition is fulfilled.  I don't
see any communication involved.

Then there is the assignment, which is difficult.  I haven't thought
long enough about the consistency between the condition which refers
to the current team, and coindex 1 of the initial team.
(This is why I asked about communicators and the like, as this assignment
might be correct only under very special conditions, or I just don't
understand it.)

> That patch has still to be merged
> or joined to the pr88076 series.

So can you clarify that your code evaluates in the standard-defined
way?

Cheers,
Harald

> Thanks for coming back to this.
> 
> Regards,
>       Andre
> -- 
> Andre Vehreschild * Email: vehre ad gmx dot de 
>
