Re: Test with an lto-build of libgfortran.

2023-09-28 Thread Richard Biener via Fortran
On Wed, Sep 27, 2023 at 11:48 PM Jeff Law via Fortran
 wrote:
>
>
>
> On 9/27/23 12:21, Toon Moene wrote:
>
> >
> > The lto-ing of libgfortran did succeed, because I did get a new warning:
> >
> > gfortran -O3 -flto -flto-partition=none -static  -o xlintstrfz zchkrfp.o
> > zdrvrfp.o zdrvrf1.o zdrvrf2.o zdrvrf3.o zdrvrf4.o zerrrfp.o zlatb4.o
> > zlaipd.o zlarhs.o zsbmv.o zget04.o zpot01.o zpot03.o zpot02.o chkxer.o
> > xerbla.o alaerh.o aladhd.o alahd.o alasvm.o ../../libtmglib.a
> > ../../liblapack.a ../../librefblas.a
> > In function 'xtoa_big',
> >  inlined from 'write_z' at
> > /home/toon/compilers/gcc/libgfortran/io/write.c:1296:11,
> >  inlined from 'formatted_transfer_scalar_write' at
> > /home/toon/compilers/gcc/libgfortran/io/transfer.c:2136:4:
> > /home/toon/compilers/gcc/libgfortran/io/write.c:1222:6: warning: writing
> > 1 byte into a region of size 0 [-Wstringop-overflow=]
> >   1222 |   *q = '\0';
> >|  ^
> > /home/toon/compilers/gcc/libgfortran/io/write.c: In function
> > 'formatted_transfer_scalar_write':
> > /home/toon/compilers/gcc/libgfortran/io/write.c:1291:8: note: at offset
> > [34, 4294967294] into destination object 'itoa_buf' of size 33
> >   1291 |   char itoa_buf[GFC_XTOA_BUF_SIZE];
> >|^
> >
> > which was (of course) not given with a non-lto libgfortran.
> Yea.  This certainly can happen with LTO.  These warnings would
> definitely be something worth investigating.
>
> Essentially the inlining enabled by LTO can expose a different set of
> diagnostics.

This particular place in libgfortran has

  /* write_z, which calls xtoa_big, is called from transfer.c,
 formatted_transfer_scalar_write.  There it is passed the kind as
 argument, which means a maximum of 16.  The buffer is large
 enough, but the compiler does not know that, so shut up the
 warning here.  */
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wstringop-overflow"
  *q = '\0';
#pragma GCC diagnostic pop

so obviously the #pragma doesn't survive through LTO.  Somehow I think
this is a known bug, but maybe I misremember (I think we are not streaming
any of the ad-hoc location parts).

Richard.

>
> Jeff


Re: gfortran: error: unrecognized argument in option '-mcmodel=medium'

2023-09-27 Thread Richard Biener via Fortran
On Tue, Sep 26, 2023 at 4:44 PM Lingadahally, Vishakha (2023)
 wrote:
>
> Dear GCC Team,
>
> I'm running Ubuntu 22 on my Mac virtually and my gfortran version is 11.4.0. 
> When I try to install a certain software package, I encounter the following 
> error:
>
> gfortran: error: unrecognized argument in option '-mcmodel=medium'
> gfortran: note: valid arguments to '-mcmodel=' are: large small tiny
>
> Is this due to attempting to run gfortran on arm64 architecture? Could you 
> please let me know how I could resolve the issue?

You have to turn to Ubuntu here, -mcmodel=medium is certainly
supported in GCC 11, maybe Ubuntu patches out
the support?

>
> Thank you
>
> Warm regards,
> Vishakha
>
> This email, its contents and any attachments are intended solely for the 
> addressee and may contain confidential information. In certain circumstances, 
> it may also be subject to legal privilege. Any unauthorised use, disclosure, 
> or copying is not permitted. If you have received this email in error, please 
> notify us and immediately and permanently delete it. Any views or opinions 
> expressed in personal emails are solely those of the author and do not 
> necessarily represent those of Royal Holloway, University of London. It is 
> your responsibility to ensure that this email and any attachments are virus 
> free.


GCC 12.2.1 Status Report (2023-05-02), branch frozen for release

2023-05-02 Thread Richard Biener via Fortran


The GCC 12 branch is now frozen in preparation for a GCC 12.3 release
candidate and the release of GCC 12.3 next week.

All changes require release manager approval.

Quality Data


Priority  #   Change from last report
---   ---
P10   -   1
P2  502   -  28
P3   49   -   7
P4  232   -   4
P5   24 
---   ---
Total P1-P3 551   -  36
Total   807   -  40


Previous Report
===

https://gcc.gnu.org/pipermail/gcc/2023-April/241138.html


Re: replacing hardware?

2023-04-14 Thread Richard Biener via Fortran
On Thu, Apr 13, 2023 at 6:43 PM Steve Kargl via Fortran
 wrote:
>
> All,
>
> The systems that I've used while hacking on gfortran
> bugs and features are starting to show their age.  I'm
> in the early stage of put together the wishlist for
> a budget friendly replacement.  While I'll likely go
> with a Ryzen7 cpu, NVME M2 drives, and as much memory
> as I can afford, I'm looking for recommendations for
> a budget friendly video/gpu card that will allow me to
> take a deeper dive into openacc/openmp and offloading.
> Anyone have a suggestion?

I am sofar doing "functional level" testing of offloading with
a GTX 1650 which is I think the most affordable thing you
can still get.  Not sure how future proof this will be though
with future support from NVIDIA (you still need CUDA for
offloading).  I have not yet had success with any consumer
AMD graphics product, the gcn backend requires
the radeon instinct compute cards.

So I suppose any "cheap" nvidia will do, of course
performance wise it will suck (esp. when doing FP64 - also
watch out if they eventually disable that completely on
some consumer card generations)

Richard.

> Note, I've mostly used FreeBSD over the last 25+ years.
> I suspect that pursuing offloading may  be easier with
> a flavor linux.  Any recommendation would also be
> appreciated.
>
> --
> Steve


Re: [Patch, fortran] PR37336 finalization

2023-03-15 Thread Richard Biener via Fortran
On Wed, 15 Mar 2023, Paul Richard Thomas wrote:

> Hi All,
> 
> I am awaiting a green light to commit this patch or not.

I'd say go ahead.

Richard.

> Cheers
> 
> Paul
> 
> 
> On Fri, 10 Mar 2023 at 16:49, Paul Richard Thomas <
> paul.richard.tho...@gmail.com> wrote:
> 
> > Hi Thomas,
> >
> > Before answering that, I thought that I had better try the polyhedron
> > suite with -fwrapv and -std=legacy together. As far as I can see, all is
> > well and so, yes, I think that is a good idea.
> >
> > Cheers
> >
> > Paul
> >
> >
> > 
> > Date & Time :  9 Mar 2023 18:13:04
> > Test Name   : gfor_13
> > Compile Command : gfc13 -ffast-math -funroll-loops -O3 -fwrapv -std=legacy
> > %n.f90 -o %n
> > Benchmarks  : ac aermod air capacita channel2 doduc gas_dyn2 fatigue2
> > induct2 linpk mp_prop_design nf protein rnflow test_fpu2 tfft2
> > Maximum Times   :1.0
> > Target Error %  :  0.100
> > Minimum Repeats : 1
> > Maximum Repeats : 2
> >
> >Benchmark   Compile  Executable   Ave Run  Number   Estim
> > Name(secs) (bytes)(secs) Repeats   Err %
> >-   ---  --   --- ---  --
> >   ac  0.00   55904  7.17   2  0.0628
> >   aermod  0.00 1280104 10.77   2  1.2769
> >  air  0.00  136392  4.09   2  0.4276
> > capacita  0.00  102680 23.79   2  0.7587
> > channel2  0.00   43928108.59   2  0.5834
> >doduc  0.00  194224 13.98   2  0.2468
> > gas_dyn2  0.00  104080105.46   2  0.1659
> > fatigue2  0.00   90752 62.86   2  1.3092
> >  induct2  0.00  224920 58.08   2  0.6594
> >linpk  0.00   51768  7.15   2  0.4892
> > mp_prop_desi  0.00   48432 94.28   2  0.0164
> >   nf  0.00   64480 11.16   2  0.0134
> >  protein  0.00  140592 24.22   2 10.0347
> >   rnflow  0.00  197704 20.74   2  0.1904
> >test_fpu2  0.00  147232 53.54   2  0.1093
> >tfft2  0.00   43896 55.09   2  3.8688
> >
> > Geometric Mean Execution Time =  26.19 seconds
> >
> >
> > 
> >
> > On Thu, 9 Mar 2023 at 17:30, Thomas Koenig  wrote:
> >
> >> Hi Paul,
> >>
> >>
> >> > -fdefault-integer-8 does indeed fix the problem with
> >> > rnflow.f90 but breaks tfft2.f90, with a type mismatch at lines 36 and
> >> 44.
> >> >
> >> >integer(8), parameter   :: jmul =  843314861  ! multiplicateur
> >> >integer(8), parameter   :: jadd =  453816693  ! constante
> >> additive
> >> > Does the job and is portable.
> >> >
> >>
> >> I think -frwapv (as suggested by Jakub) would be the better choice.
> >> The problem is the linear congruential pseudo-random number generators
> >> which were much used in earlier times (and are still present in
> >> legacy code), which violate the Fortran standards by assuming silent
> >> truncation.
> >>
> >> If a new optimization breaks this (widespread, but illegal) idiom,
> >> maybe the best way to deal with it is to add -frwapv to -std=legacy.
> >>
> >> What do you think?
> >>
> >> Best regards
> >>
> >> Thomas
> >>
> >
> >
> > --
> > "If you can't explain it simply, you don't understand it well enough" -
> > Albert Einstein
> >
> 
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)


Re: [patch, Fortran] Enable -fwrapv for -std=legacy

2023-03-10 Thread Richard Biener via Fortran



> Am 10.03.2023 um 18:54 schrieb Thomas Koenig via Fortran 
> :
> 
> Hello world, here's the patch that was discussed.
> 
> Regression-tested. OK for trunk?
> 
> Since this appeared only in gcc13, I see no need for a backport.
> I will also document this in the changes file.

The „problem“ is latent forever, I’m not sure it’s good to amend the 
kitchen-sink std=legacy option with -fwrapv since that has quite some negative 
effects on optimization.

Richard 

> Best regards
> 
>Thomas
> 
> Set -frapv if -std=legacy is set.
> 
> Fortran legacy codes sometimes contain linear congruential
> seudorandom number generators.  These generators implicitly depend
> on wrapping behavior on integer overflow, which is illegal Fortran,
> but the best they could to at the time.
> 
> A gcc13 change exposed this in rnflow, part of the Polyhedron
> benchmark, with -O3.  Rather than "regress" on such code, this patch
> enables -fwrapv if -std=legacy is enabled.  This allows the benchmark
> to run successfully, and presumably lots of other code as well.
> 
> gcc/fortran/ChangeLog:
> 
>PR fortran/109075
>* options.cc (gfc_handle_option):  If -std=legacy is set,
>also set -frwapv.
>* invoke.texi: Document the change.
> 


Re: [Patch, fortran] PR37336 finalization

2023-03-09 Thread Richard Biener via Fortran
On Thu, 9 Mar 2023, Thomas Koenig wrote:

> On 08.03.23 22:35, I wrote:
> > On 08.03.23 15:55, Paul Richard Thomas via Fortran wrote:
> >> As noted below, rnflow.f90 hangs with the unpatched mainline at -O3 but
> >> runs successfully at -O2.
> > 
> > I can confirm that.
> > 
> >> I presume that this is a serious regression since it involves optimization?
> >> Which component should I post it against?
> > 
> > Probably against tree-optimization.  If later analysis determines that
> > it is something else, people will reassign it.
> > 
> > This one probably calls for bisection.
> 
> I have submitted this as
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109075 .
> 
> Paul, thanks for catching this!
> 
> @Richard: Is there a possibility of doing regular Polyhedron runs
> at Suse in addition to the SPEC runs?  This could also be interesting.

We do run Polyhedron on a variety of machines already, you can visit
https://lnt.opensuse.org/db_default/v4/CPP/latest_runs_report
which is the benchmark set that includes polyhedron.

Richard.


Re: [Patch, fortran] PR37336 finalization

2023-03-09 Thread Richard Biener via Fortran
On Wed, 8 Mar 2023, Thomas Koenig wrote:

> On 08.03.23 15:55, Paul Richard Thomas via Fortran wrote:
> > As noted below, rnflow.f90 hangs with the unpatched mainline at -O3 but
> > runs successfully at -O2.
> 
> I can confirm that.
> 
> > I presume that this is a serious regression since it involves optimization?
> > Which component should I post it against?
> 
> Probably against tree-optimization.  If later analysis determines that
> it is something else, people will reassign it.
> 
> This one probably calls for bisection.

I have the following local change to rnflow with that I can't reproduce:

--- rnflow.f90.orig 2016-06-01 14:50:16.922376347 +0200
+++ rnflow.f90  2016-06-01 14:56:54.739045162 +0200
@@ -884,6 +884,7 @@
 ! generation maison d'une valeur pseudo-aleatoire uniforme sur (0,1)
 !
   integer, intent (inout) :: jsee   ! germe courant
+  integer(kind=8) :: jsee_long
   integer, parameter   :: jmul =  843314861 ! multiplicateur
   integer, parameter   :: jadd =  453816693 ! constante additive
   integer, parameter   :: j230 = 1073741824 ! 2 puissance 30
@@ -899,7 +900,9 @@
 !CRAY - The following multiply must be done with 64 bits (not 46 bits)
 !   The algoritm depends on the overflow characteristics of 
 !   a 32 or 64 bit multiply.
-  jsee = jsee * jmul + jadd
+  jsee_long = jsee;
+  jsee_long = jsee_long * jmul + jadd
+  jsee = jsee_long;
 !CRAY - Change to avoid 32 bit integer dependency
   !
   !  The original line is needlessly dependent on the

the existing comment already explains the issue.  I suppose
-fdefault-integer-8 would also work around the issue?

ISTR there is an old (closed invalid) bugreport about this as well.

Richard.


Re: [Patch, fortran] PR37336 finalization

2023-03-08 Thread Richard Biener via Fortran
On Wed, 8 Mar 2023, Paul Richard Thomas wrote:

> Hi All,
> 
> I ran the polyhedron testsuite with the patched gfortran-13.0.1 and 7.4(as
> used in the posted Linux test). The timings are comparable except for
> rnflow.f90.
> 
> As noted below, rnflow.f90 hangs with the unpatched mainline at -O3 but
> runs successfully at -O2.
> 
> I presume that this is a serious regression since it involves optimization?

I believe it has undefined behavior (accessing array out of bounds).

> Which component should I post it against?
> 
> Regards
> 
> Paul
> 
> gfortran 13.0.1 with patch
>Benchmark   Compile  Executable   Ave Run  Number   Estim
> Name(secs) (bytes)(secs) Repeats   Err %
>-   ---  --   --- ---  --
>   ac  0.00   55904  7.28   2  0.0550
>   aermod  0.00 1149032 10.32   2  0.0242
>  air  0.00  120224  3.57   2  0.3083
> capacita  0.00  110872 20.27   2  0.0765
> channel2  0.00   43928 98.23   2  0.2703
>doduc  0.00  190296 13.86   2  0.2453
> gas_dyn2  0.00  108176 96.77   2  0.1364
> fatigue2  0.00   90752 61.44   2  0.0618
>  induct2  0.00  224992 57.71   2  0.0572
>linpk  0.00   47672  5.54   2  0.1806
> mp_prop_desi  0.00   52640 94.50   2  0.0079
>   nf  0.00   64480  9.25   2  0.4053
>  protein  0.00  136496 20.83   2  0.9096
>   rnflow  0.00  181320   3417.15   2 99.8270
>test_fpu2  0.00  126752 52.35   2  0.1691
>tfft2  0.00   60280 37.61   2  0.9387
> 
> Geometric Mean Execution Time =  32.72 seconds
> rnflow hangs without patch as well. Seems to be a rather serious
> regression.
> gets stuck with -O3 in the loop starting at line 3566 in subroutine cptrf2
> 
> 
> 
> gfortran7.4
>Benchmark   Compile  Executable   Ave Run  Number   Estim
> Name(secs) (bytes)(secs) Repeats   Err %
>-   ---  --   --- ---  --
>   ac  0.00 3612576  7.30   2  0.0205
>   aermod  0.00 5204760 10.21   2  0.0784
>  air  0.00 3829736  4.05   2  0.0988
> capacita  0.00 3672512 22.25   2  0.1506
> channel2  0.00 3663368 87.22   2  0.5767
>doduc  0.00 3840336 13.60   2  0.0221
> gas_dyn2  0.00 3673920 89.54   2  0.1106
> fatigue2  0.00 3691256 74.34   2  0.0921
>  induct2  0.00 4062312 57.87   2  0.1348
>linpk  0.00 3591984  5.59   2  0.0358
> mp_prop_desi  0.00 3966920 93.99   2  0.0654
>   nf  0.00 3622112  9.27   2  0.0324
>  protein  0.00 3832280 22.10   2  0.1289
>   rnflow  0.00 4129984 23.49   2  0.7449
>test_fpu2  0.00 3940944 53.29   2  0.2561
>tfft2  0.00 3622040 36.56   2  0.1026
> 
> Geometric Mean Execution Time =  24.33 seconds
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)


Re: [Patch, fortran] PR37336 finalization

2023-03-08 Thread Richard Biener via Fortran
On Wed, 8 Mar 2023, Paul Richard Thomas wrote:

> The alternative is that the patch be reviewed and committed as it is. I
> have been neglecting my daytime job to get to this point and must spend
> some time catching up.

That works for me as well - I understand the work to be done is on
the testsuite side, so we should get more time to test the actual
code changes on the real world.

Richard.


Re: [Patch, fortran] PR37336 finalization

2023-03-08 Thread Richard Biener via Fortran
On Wed, 8 Mar 2023, Thomas Koenig wrote:

> On 08.03.23 09:14, Richard Biener wrote:
> > While Fortran is not considered release critical it would be bad to
> > break say the build of SPEC CPU 2017 or Polyhedron very late in the
> > cycle.  I'd lean towards postponing this to early stage1 and eventually
> > backport it for GCC 13.2 if you would like this feature to be implemented
> > for GCC 13.
> 
> And now comes the problem - no Fortran maintanier has access to SPEC
> 2017, as far as I know.  The curse of closed-source benchmarks...

:/

But at least you can watch https://lnt.opensuse.org after-the-fact

> How extensive is SPEC using finalization?  My personal guess would be
> that it is not used extensively, since gfortran's implementation is
> pretty broken at the moment.

I'd say it probably doesn't use it.  But I was more worrying about
unwanted side-effects on code _not_ using finalization.

Do you know of any medium to large size Fortran code base that
uses finalization?

Richard.


Re: [Patch, fortran] PR37336 finalization

2023-03-08 Thread Richard Biener via Fortran
On Wed, 8 Mar 2023, Thomas Koenig wrote:

> Hi Paul,
> 
> > Last night, I scoped out the work required to get the patch ready to commit.
> > Sorting out the testcases will be the main load since they have grown
> > "organically". I propose to change over to one test for each paragraph of
> > F2018 7.5.6.2/7.5.6.3  and to verify them against
> > the other brands. I suspect that this will allow a weeding out of the
> > existing tests. It will take me a couple of weeks.
> 
> I am a little bit concerned that this might bring us too close to the
> release date. We are down to 22 P1 regressions as of now.
> 
> Richard, what do you think?

While Fortran is not considered release critical it would be bad to
break say the build of SPEC CPU 2017 or Polyhedron very late in the
cycle.  I'd lean towards postponing this to early stage1 and eventually
backport it for GCC 13.2 if you would like this feature to be implemented
for GCC 13.

But it's of course the Fortran maintainers decision.

Even though Fortran isn't release critical we might ask you to revert
if any such severe problems show up.

Thanks,
Richard.


Re: [Patch] Fortran: Avoid SAVE_EXPR for deferred-len char types

2023-02-20 Thread Richard Biener via Fortran
On Mon, Feb 20, 2023 at 5:23 PM Tobias Burnus  wrote:
>
> Hi Richard, hi all,
>
> On 20.02.23 13:46, Richard Biener wrote:
> > +  /* TODO: A more middle-end friendly alternative would be to use 
> > NULL_TREE
> > +as upper bound and store the value, e.g. as GFC_DECL_STRING_LEN.
> > +Caveat: this requires some cleanup throughout the code to 
> > consistently
> > +use some wrapper function.  */
> > +  gcc_assert (TREE_CODE (TYPE_SIZE_UNIT (type)) == SAVE_EXPR);
> > +  tree tmp = TREE_TYPE (TYPE_SIZE (eltype));
> >
> > ...
> >
> > you are probably breaking type sharing here.  You could use
> > build_array_type_1 and pass false for 'shared' to get around that.  Note
> > that there's also canonical type building done in case 'eltype' is not
> > canonical itself.
>
> My feeling is that this is already somewhat broken. Currently, there
> is one type per decl as each has its own artificial length variable.
> I have no idea how this will be handled in the ME in terms of alias
> analysis. And whether shared=false makes sense here and what effect
> is has. (Probably yes.)
>
> In principle,
>integer(kind=8) .str., .str2;
>character(kind=1)[1:.str] * str;
>character(kind=1)[1:.str2] * str2;
> have the same type and iff .str == .str at runtime, they can alias.
> Example:
>str2 = str;
>.str2 = .str;
>
> I have no idea how the type analysis currently works (with or without 
> SAVE_EXPR)
> nor what effect shared=false has in this case.

alias analysis for array types looks only at the element type

> > The solution to the actual problem is a hack - you are relying on
> > re-evaluation of TYPE_SIZE, and for that, only from within accesses
> > from inside the frontend?
>
> I think this mostly helps with access inside the FE of the type 'size =
> TYPE_SIZE_UNIT(type)', which is used surprisingly often and is often
> directly evaluated (i.e. assigned to a temporary).

that's what I thought

> > Since gimplification will produce the result into a single temporary again, 
> > re-storing the "breakage".
> > So, does it_really_  fix things?
>
> It does seem to fix cases which do  'size = TYPE_SIZE_UNIT (type);' in
> the front end and then uses this size expression. Thus, there are fixed.
> However, there are many cases where things go wrong - with and without
> the patch. I keep discovering more and more :-(

I guess test coverage isn't too great with this feature then ;)

> * * *
>
> I still think that the proper way is to have NULL_TREE as upper value
> would be better in several ways, except that there is (too) much code

Yep.

> which relies on TYPE_UNIT_SIZE to work. (There are 117 occurrences).
> Additionally, there is more code doing assumptions in this area.
>
> Thus, the question is whether it makes sense as hackish partial solution
> or whether it should remain in the current broken stage until it is
> fixed properly.

I wonder if it makes more sense to individually fix the places using
TYPE_UNIT_SIZE in a wrong way?  You'd also get only "partial"
fixes, but at least those will be true and good?

Otherwise I defer to frontend maintainers if they agree to put in
a (partially working) hack like this.

Richard.

> Tobias,
>
> who would like to have more time for fixing such issues.
>
> -
> Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
> München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
> Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
> München, HRB 106955


Re: [Patch] Fortran: Avoid SAVE_EXPR for deferred-len char types

2023-02-20 Thread Richard Biener via Fortran
On Mon, Feb 20, 2023 at 12:57 PM Jakub Jelinek  wrote:
>
> On Mon, Feb 20, 2023 at 12:48:38PM +0100, Tobias Burnus wrote:
> > On 20.02.23 12:15, Jakub Jelinek wrote:
> > > On Mon, Feb 20, 2023 at 12:07:43PM +0100, Tobias Burnus wrote:
> > > > As mentioned in the TODO for 'deferred', I think we really want
> > > > to have NULL as upper value for the domain for the type, but that
> > > > requires literally hundred of changes to the compiler, which
> > > > I do not want to due during Stage 4, but that are eventually
> > > > required.* — In any case, this patch fixes some of the issues
> > > > in the meanwhile.
> > > Yeah, the actual len can be in some type's lang_specific member.
> >
> > Actually, I think it should be bound to the DECL and not to the TYPE,
> > i.e. lang_decl not type_lang.
> >
> > I just see that, the latter already has a 'tree stringlen' (for I/O)
> > which probably could be reused for this purpose.
>
> I'd drop the
>  && TREE_CODE (TYPE_SIZE (type)) == SAVE_EXPR
> and assert == SAVE_EXPR part, with SAVE_EXPRs one never knows if they
> are added around the whole expression or say some subexpression has
> it and then some trivial arithmetics happens on the SAVE_EXPR tree.
>
> > > Anyway, for the patch for now, I'd probably instead of stripping
> > > SAVE_EXPR overwrite the 2 sizes with newly built expressions.
> >
> > What I now did. (Unchanged otherwise, except that I now also mention
> > GFC_DECL_STRING_LEN in the TODO.)
> >
> > OK for mainline?
>
> If Richard doesn't object.

 tree
-gfc_get_character_type_len_for_eltype (tree eltype, tree len)
+gfc_get_character_type_len_for_eltype (tree eltype, tree len, bool deferred)
 {
   tree bounds, type;

   bounds = build_range_type (gfc_charlen_type_node, gfc_index_one_node, len);
   type = build_array_type (eltype, bounds);
   TYPE_STRING_FLAG (type) = 1;
-
+  if (len && deferred && TREE_CODE (TYPE_SIZE (type)) == SAVE_EXPR)
+{
+  /* TODO: A more middle-end friendly alternative would be to use NULL_TREE
+as upper bound and store the value, e.g. as GFC_DECL_STRING_LEN.
+Caveat: this requires some cleanup throughout the code to consistently
+use some wrapper function.  */
+  gcc_assert (TREE_CODE (TYPE_SIZE_UNIT (type)) == SAVE_EXPR);
+  tree tmp = TREE_TYPE (TYPE_SIZE (eltype));

...

you are probably breaking type sharing here.  You could use
build_array_type_1 and pass false for 'shared' to get around that.  Note
that there's also canonical type building done in case 'eltype' is not
canonical itself.

The solution to the actual problem is a hack - you are relying on
re-evaluation of TYPE_SIZE, and for that, only from within accesses
from inside the frontend?  Since gimplification will produce the result
into a single temporary again, re-storing the "breakage".

So, does it _really_ fix things?

Richard.


>
> Jakub
>


Re: [Patch] Fortran: Avoid SAVE_EXPR for deferred-len char types

2023-02-20 Thread Richard Biener via Fortran
On Mon, Feb 20, 2023 at 11:05 AM Tobias Burnus  wrote:
>
> On 17.02.23 17:27, Steve Kargl wrote:
> > On Fri, Feb 17, 2023 at 12:13:52PM +0100, Tobias Burnus wrote:
> >> OK for mainline?
> > Short version: no.
>
> Would you mind to write a reasoning beyond only a single word?
>
> >> subroutine foo(n)
> >>integer :: n
> >>integer :: array(n*5)
> >>integer :: my_len
> >>...
> >>my_len = 5
> >>block
> >>  character(len=my_len, kind=4) :: str
> >>
> >>  my_len = 99
> >>  print *, len(str)  ! still shows 5 - not 99
> >>end block
> >> end
> > Are you sure about the above comment?
>
> Yes - for three reasons:
> * On the what-feels-right side: It does not make any sense to print
>any other value than 5 given that 'str' has been declared with len = 5.
> * On the GCC side, the SAVE_EXPR ensures that the length is evaluated
>early and then "saved" to ensure its original value is available

Generally SAVE_EXPR is used to make sure an expression is only evaluated
once.  It's DECL_EXPR that ensures something is evaluated early
and available.  So generally "unwrapping" a SAVE_EXPR looks dangerous
to me unless the SAVE_EXPR is really never necessary.

Richard.

> * The quoted text from the standard implies that this is what
>should happen.
>
> Why do you think that printing "5" is wrong? GCC does so since
> years; it still does so with my patch.
>
> Hence, can you elaborate? And also state which value you did expect instead?
>
> * * *
>
> The patch itself is about *deferred* length parameters, i.e.
> 'len=:', and thus for code like:
>
> character(len=:), pointer :: str
> ...
> allocate(character(len=4) :: str)
> print *, len(str)  ! should print 4
> ...
> allocate(character(len=99) :: str)
> print *, len(str)  ! should now print 99
> ...
>
> Currently, the SAVE_EXPR causes that the original value might
> get used, which is often 0 (by chance 0 initialized) or some
> random value like 57385973, depending what on what was on the
> stack before. - There are more issues with deferred strings,
> but at least one is solved by not having a SAVE_EXPR for
> deferred-length character strings.
>
> Tobias
>
> -
> Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
> München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
> Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
> München, HRB 106955


Re: [patch, fortran] Fix common subexpression elimination with IEEE rounding (PR108329)

2023-01-09 Thread Richard Biener via Fortran
On Sun, Jan 8, 2023 at 5:21 PM Thomas Koenig  wrote:
>
> Hi Richard,
>
> >> Am 08.01.2023 um 14:31 schrieb Paul Richard Thomas via Fortran 
> >> :
> >>
> >> Hi Thomas,
> >>
> >> Following your off-line explanation that the seemingly empty looking
> >> assembly line forces an effective reload from memory, all is now clear.
> >
> > It’s not a full fix (for register vars) and it’s ‚superior‘ to the call 
> > itself only because asm handling is implemented in a rather stupid way in 
> > the Alias oracle.  So I don’t think this is a „fix“ at all.
>
> There are no register variables in Fortran, this is Fortran FE only,
> and it is a fix in the sense that correct code is no longer miscompiled.

It's a quite big hammer and the fact that it "works" is just luck and
the fact that the memory barrier implied by the ieee_set_rouding_mode
does not is because by-reference passed arguments are marked by
the frontend so they can be CSEd since memory barriers may not
affect them.

As said, the fact that this "works" is just because we're lazy on GIMPLE:

/* If the statement STMT may clobber the memory reference REF return true,
   otherwise return false.  */

bool
stmt_may_clobber_ref_p_1 (gimple *stmt, ao_ref *ref, bool tbaa_p)
{
...
  else if (gimple_code (stmt) == GIMPLE_ASM)
return true;

> There's a FIXME in the code pointing to the relevant PR precisely
> because I think that this is less than elegant (as do you, obviously).
> Do you have other suggestions how to implement this?  If PR 34678
> is solved, this would probably provide a mechanism that we could
> simply re-use.

There is no reliable way to get this correct at the moment and if there
were good and easy ways to get this working they'd be implemented already.

Richard.

> Best regards
>
> Thomas


Re: [patch, fortran] Fix common subexpression elimination with IEEE rounding (PR108329)

2023-01-08 Thread Richard Biener via Fortran



> Am 08.01.2023 um 14:31 schrieb Paul Richard Thomas via Fortran 
> :
> 
> Hi Thomas,
> 
> Following your off-line explanation that the seemingly empty looking
> assembly line forces an effective reload from memory, all is now clear.

It’s not a full fix (for register vars) and it’s ‚superior‘ to the call itself 
only because asm handling is implemented in a rather stupid way in the Alias 
oracle.  So I don’t think this is a „fix“ at all.

Richard 

> OK for mainline and for backporting as you see fit.
> 
> Thanks for the patch.
> 
> Paul
> 
> 
>> On Sat, 7 Jan 2023 at 15:46, Thomas Koenig via Fortran 
>> wrote:
>> 
>> Hello world,
>> 
>> this patch fixes Fortran's handling of common subexpression elimination
>> across ieee_set_rouding_mode calls.  It does so using a rather big
>> hammer, by issuing a memory barrier to force reload from memory
>> (and thus a recomputation).
>> 
>> This is a rather big hammer, so if there are more elegant ways
>> to fix it, I am very much open to suggestions.
>> 
>> If PR 34678 is fixed, then this solution can also be applied here.
>> 
>> OK for trunk?  How do you feel about a backport?
>> 
>> Best regards
>> 
>>Thomas
>> 
>> Add memory barrier for calls to ieee_set_rounding_mode.
>> 
>> gcc/fortran/ChangeLog:
>> 
>> PR fortran/108329
>> * trans-expr.cc (trans_memory_barrier): New functions.
>> (gfc_conv_procedure_call): Insert memory barrier for
>> ieee_set_rounding_mode.
>> 
>> gcc/testsuite/ChangeLog:
>> 
>> PR fortran/108329
>> * gfortran.dg/rounding_4.f90: New test.
> 
> 
> 
> -- 
> "If you can't explain it simply, you don't understand it well enough" -
> Albert Einstein


Re: [Patch, Fortran] libgfortran's ISO_Fortran_binding.c: Use GCC11 version for backward-only code [PR108056]

2022-12-14 Thread Richard Biener via Fortran
On Tue, Dec 13, 2022 at 5:29 PM Tobias Burnus  wrote:
>
> This is a 12/13 regression as come changes to fix the GFC/CFI descriptor
> that went into GCC 12 fail with the (bogus) descriptor passed via by a
> GCC-11-compiled program.
>
> As later GCC 12 changes moved the descriptor to the front end, those
> functions are only in libgomp.so to cater for old program. Richard
> suggested in the PR that the best way is to move to the GCC 11 version,
> such that libgfortran.so won't regress.
>
> I now did so - except for three fixes (cf. changelog). See also
> PR: https://gcc.gnu.org/PR108056
>
> There is no testcase as it needs to be compiled by GCC <= 11 and then
> run with linking (dynamically) to a GCC 12 or 13 libgfortran.
>
> OK for mainline and GCC 12?

OK for the branch if approved for trunk.

Richard.

>   * * *
>
> Note: It is strongly recommended to use GCC 12 (or 13) with array-descriptor
> C interop as many issues were fixed. Like for the testcase in the PR; in GCC 
> 11
> the type arriving in libgomp is BT_ASSUME ('type(*)'). But as the effective
> argument is passed as array descriptor through out, the 'float' (real(4)) type
> info is actually preservable (as GCC 12 cf. testcase of comment 0 and my 
> comment
> in the PR for the C part of the testcase).(*)
>
> Tobias
>
> ((*) This is not possible if using a scalar 'type(*)' or a 
> non-array-descriptor
> array in between. I think GCC 12 uses 'CFI_other' in the information-is-lost 
> case.)
> -
> Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
> München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
> Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
> München, HRB 106955


Re: [PATCH] tree-optimization/99919 - bogus uninit diagnostic with bitfield guards

2022-12-12 Thread Richard Biener via Fortran
On Thu, Dec 8, 2022 at 12:07 PM Richard Biener via Fortran
 wrote:
>
> For the testcase in this PR what fold-const.cc optimize_bit_field_compare
> does to bitfield against constant compares is confusing the uninit
> predicate analysis and it also makes SRA obfuscate instead of optimize
> the code.  We've long had the opinion that those optimizations are
> premature but we do not have any replacement for the more complicated
> ones combining multiple bitfield tests.  The following disables mangling
> the case of a single bitfield test against constants but preserving
> the existing diagnostic and optimization to a compile-time determined
> value.
>
> This requires silencing a bogus uninit diagnostic in the Fortran
> frontend which I've done in a minimal way, avoiding initializing
> the 40 byte symbol_attribute structure.  There's several issues,
> one is the flag_coarrays is a global variable likely not CSEd
> to help the uninit predicate analysis, the other is us short-circuiting
> the flag_coarray == GFC_FCOARRAY_LIB && lhs_caf_attr.codimension
> accesses as both have no side-effects so the guard isn't effective.
> If the frontend folks are happy with this I can localize both
> lhs_caf_attr and rhs_caf_attr and copy out the two only flags
> tested by the code instead of the somewhat incomplete approach in
> the patch.  Any opinions here?
>
> Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Testing revealed this regresses

FAIL: c-c++-common/fold-masked-cmp-1.c  -Wc++-compat
scan-assembler-times testn?b 2
FAIL: c-c++-common/fold-masked-cmp-2.c  -Wc++-compat
scan-assembler-times testn?b 2

I filed PR108070 for the failure to optimize this on the RTL level
and put this change on hold :/

Richard.

> OK for the fortran parts?
>
> Thanks,
> Richard.
>
> PR tree-optimization/99919
> * fold-const.cc (optimize_bit_field_compare): Disable
> transforming the bitfield against constant compare optimization
> if the result is not statically determinable.
>
> gcc/fortran/
> * trans-expr.cc (gfc_trans_assignment_1): Split out
> lhs_codimension from lhs_caf_attr to avoid bogus uninit
> diagnostics.
>
> * gcc.dg/uninit-pr99919.c: New testcase.
> ---
>  gcc/fold-const.cc | 37 +++
>  gcc/fortran/trans-expr.cc |  6 +++--
>  gcc/testsuite/gcc.dg/uninit-pr99919.c | 22 
>  3 files changed, 30 insertions(+), 35 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/uninit-pr99919.c
>
> diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
> index cdfe3f50ae3..b72cc0a1d51 100644
> --- a/gcc/fold-const.cc
> +++ b/gcc/fold-const.cc
> @@ -4559,7 +4559,6 @@ optimize_bit_field_compare (location_t loc, enum 
> tree_code code,
>  {
>poly_int64 plbitpos, plbitsize, rbitpos, rbitsize;
>HOST_WIDE_INT lbitpos, lbitsize, nbitpos, nbitsize;
> -  tree type = TREE_TYPE (lhs);
>tree unsigned_type;
>int const_p = TREE_CODE (rhs) == INTEGER_CST;
>machine_mode lmode, rmode;
> @@ -4667,13 +4666,7 @@ optimize_bit_field_compare (location_t loc, enum 
> tree_code code,
>  }
>
>/* Otherwise, we are handling the constant case.  See if the constant is 
> too
> - big for the field.  Warn and return a tree for 0 (false) if so.  We do
> - this not only for its own sake, but to avoid having to test for this
> - error case below.  If we didn't, we might generate wrong code.
> -
> - For unsigned fields, the constant shifted right by the field length 
> should
> - be all zero.  For signed fields, the high-order bits should agree with
> - the sign bit.  */
> + big for the field.  Warn and return a tree for 0 (false) if so.  */
>
>if (lunsignedp)
>  {
> @@ -4695,31 +4688,9 @@ optimize_bit_field_compare (location_t loc, enum 
> tree_code code,
> }
>  }
>
> -  if (nbitpos < 0)
> -return 0;
> -
> -  /* Single-bit compares should always be against zero.  */
> -  if (lbitsize == 1 && ! integer_zerop (rhs))
> -{
> -  code = code == EQ_EXPR ? NE_EXPR : EQ_EXPR;
> -  rhs = build_int_cst (type, 0);
> -}
> -
> -  /* Make a new bitfield reference, shift the constant over the
> - appropriate number of bits and mask it with the computed mask
> - (in case this was a signed field).  If we changed it, make a new one.  
> */
> -  lhs = make_bit_field_ref (loc, linner, lhs, unsigned_type,
> -   nbitsize, nbitpos, 1, lreversep);
> -
> -  rhs = const_binop (BIT_AND_EXPR,
> -const_binop (LSHIFT_EXPR,
> - fold_convert_loc (loc, 

[PATCH] tree-optimization/99919 - bogus uninit diagnostic with bitfield guards

2022-12-08 Thread Richard Biener via Fortran
For the testcase in this PR what fold-const.cc optimize_bit_field_compare
does to bitfield against constant compares is confusing the uninit
predicate analysis and it also makes SRA obfuscate instead of optimize
the code.  We've long had the opinion that those optimizations are
premature but we do not have any replacement for the more complicated
ones combining multiple bitfield tests.  The following disables mangling
the case of a single bitfield test against constants but preserving
the existing diagnostic and optimization to a compile-time determined
value.

This requires silencing a bogus uninit diagnostic in the Fortran
frontend which I've done in a minimal way, avoiding initializing
the 40 byte symbol_attribute structure.  There's several issues,
one is the flag_coarrays is a global variable likely not CSEd
to help the uninit predicate analysis, the other is us short-circuiting
the flag_coarray == GFC_FCOARRAY_LIB && lhs_caf_attr.codimension
accesses as both have no side-effects so the guard isn't effective.
If the frontend folks are happy with this I can localize both
lhs_caf_attr and rhs_caf_attr and copy out the two only flags
tested by the code instead of the somewhat incomplete approach in
the patch.  Any opinions here?

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

OK for the fortran parts?

Thanks,
Richard.

PR tree-optimization/99919
* fold-const.cc (optimize_bit_field_compare): Disable
transforming the bitfield against constant compare optimization
if the result is not statically determinable.

gcc/fortran/
* trans-expr.cc (gfc_trans_assignment_1): Split out
lhs_codimension from lhs_caf_attr to avoid bogus uninit
diagnostics.

* gcc.dg/uninit-pr99919.c: New testcase.
---
 gcc/fold-const.cc | 37 +++
 gcc/fortran/trans-expr.cc |  6 +++--
 gcc/testsuite/gcc.dg/uninit-pr99919.c | 22 
 3 files changed, 30 insertions(+), 35 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/uninit-pr99919.c

diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
index cdfe3f50ae3..b72cc0a1d51 100644
--- a/gcc/fold-const.cc
+++ b/gcc/fold-const.cc
@@ -4559,7 +4559,6 @@ optimize_bit_field_compare (location_t loc, enum 
tree_code code,
 {
   poly_int64 plbitpos, plbitsize, rbitpos, rbitsize;
   HOST_WIDE_INT lbitpos, lbitsize, nbitpos, nbitsize;
-  tree type = TREE_TYPE (lhs);
   tree unsigned_type;
   int const_p = TREE_CODE (rhs) == INTEGER_CST;
   machine_mode lmode, rmode;
@@ -4667,13 +4666,7 @@ optimize_bit_field_compare (location_t loc, enum 
tree_code code,
 }
 
   /* Otherwise, we are handling the constant case.  See if the constant is too
- big for the field.  Warn and return a tree for 0 (false) if so.  We do
- this not only for its own sake, but to avoid having to test for this
- error case below.  If we didn't, we might generate wrong code.
-
- For unsigned fields, the constant shifted right by the field length should
- be all zero.  For signed fields, the high-order bits should agree with
- the sign bit.  */
+ big for the field.  Warn and return a tree for 0 (false) if so.  */
 
   if (lunsignedp)
 {
@@ -4695,31 +4688,9 @@ optimize_bit_field_compare (location_t loc, enum 
tree_code code,
}
 }
 
-  if (nbitpos < 0)
-return 0;
-
-  /* Single-bit compares should always be against zero.  */
-  if (lbitsize == 1 && ! integer_zerop (rhs))
-{
-  code = code == EQ_EXPR ? NE_EXPR : EQ_EXPR;
-  rhs = build_int_cst (type, 0);
-}
-
-  /* Make a new bitfield reference, shift the constant over the
- appropriate number of bits and mask it with the computed mask
- (in case this was a signed field).  If we changed it, make a new one.  */
-  lhs = make_bit_field_ref (loc, linner, lhs, unsigned_type,
-   nbitsize, nbitpos, 1, lreversep);
-
-  rhs = const_binop (BIT_AND_EXPR,
-const_binop (LSHIFT_EXPR,
- fold_convert_loc (loc, unsigned_type, rhs),
- size_int (lbitpos)),
-mask);
-
-  lhs = build2_loc (loc, code, compare_type,
-   build2 (BIT_AND_EXPR, unsigned_type, lhs, mask), rhs);
-  return lhs;
+  /* Otherwise do not prematurely optimize compares of bitfield members
+ to constants.  */
+  return 0;
 }
 
 /* Subroutine for fold_truth_andor_1: decode a field reference.
diff --git a/gcc/fortran/trans-expr.cc b/gcc/fortran/trans-expr.cc
index b95c5cf2f96..12c7dd7f26a 100644
--- a/gcc/fortran/trans-expr.cc
+++ b/gcc/fortran/trans-expr.cc
@@ -11654,9 +11654,11 @@ gfc_trans_assignment_1 (gfc_expr * expr1, gfc_expr * 
expr2, bool init_flag,
 
   /* Only analyze the expressions for coarray properties, when in coarray-lib
  mode.  */
+  bool lhs_codimension = false;
   if (flag_coarray == GFC_FCOARRAY_LIB)
 {
   lhs_caf_attr = gfc_caf_attr (expr1, 

GCC 13.0.0 Status Report (2022-11-14), Stage 3 in effect now

2022-11-14 Thread Richard Biener via Fortran
Status
==

The GCC development branch which will become GCC 13 is now in
bugfixing mode (Stage 3) until the end of Jan 15th.

As usual the first weeks of Stage 3 are used to feature patches
posted late during Stage 1.  At some point unreviewed features
need to be postponed for the next Stage 1.


Quality Data


Priority  #   Change from last report
---   ---
P1  33
P2  473 
P3  113   +  29
P4  253   +   6
P5  25   
---   ---
Total P1-P3 619   +  29
Total   897   +  35


Previous Report
===

https://gcc.gnu.org/pipermail/gcc/2022-October/239690.html


Re: PING Re: [PATCH] Fortran: Remove double spaces in open() warning [PR99884]

2022-11-14 Thread Richard Biener via Fortran
On Mon, Nov 14, 2022 at 10:10 AM Bernhard Reutner-Fischer via
Gcc-patches  wrote:
>
> yearly ping. Ok for trunk after re-regtesting?

OK.

> thanks,
>
> On Sun, 31 Oct 2021 13:57:46 +0100
> Bernhard Reutner-Fischer  wrote:
>
> > From: Bernhard Reutner-Fischer 
> >
> > gcc/fortran/ChangeLog:
> >
> >   PR fortran/99884
> >   * io.c (check_open_constraints): Remove double spaces.
> > ---
> >  gcc/fortran/io.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/gcc/fortran/io.c b/gcc/fortran/io.c
> > index fc97df79eca..9506f35008e 100644
> > --- a/gcc/fortran/io.c
> > +++ b/gcc/fortran/io.c
> > @@ -2513,7 +2513,7 @@ check_open_constraints (gfc_open *open, locus *where)
> > spec = "";
> >   }
> >
> > -  warn_or_error (G_("%s specifier at %L not allowed in OPEN statement 
> > for "
> > +  warn_or_error (G_("%sspecifier at %L not allowed in OPEN statement 
> > for "
> >"unformatted I/O"), spec, loc);
> >  }
> >
>


Re: Handling of large stack objects in GPU code generation -- maybe transform into heap allocation?

2022-11-11 Thread Richard Biener via Fortran
On Fri, Nov 11, 2022 at 3:13 PM Thomas Schwinge  wrote:
>
> Hi!
>
> For example, for Fortran code like:
>
> write (*,*) "Hello world"
>
> ..., 'gfortran' creates:
>
> struct __st_parameter_dt dt_parm.0;
>
> try
>   {
> dt_parm.0.common.filename = 
> &"source-gcc/libgomp/testsuite/libgomp.oacc-fortran/print-1_.f90"[1]{lb: 1 
> sz: 1};
> dt_parm.0.common.line = 29;
> dt_parm.0.common.flags = 128;
> dt_parm.0.common.unit = 6;
> _gfortran_st_write (_parm.0);
> _gfortran_transfer_character_write (_parm.0, &"Hello world"[1]{lb: 
> 1 sz: 1}, 11);
> _gfortran_st_write_done (_parm.0);
>   }
> finally
>   {
> dt_parm.0 = {CLOBBER(eol)};
>   }
>
> The issue: the stack object 'dt_parm.0' is a half-KiB in size (yes,
> really! -- there's a lot of state in Fortran I/O apparently).  That's a
> problem for GPU execution -- here: OpenACC/nvptx -- where typically you
> have small stacks.  (For example, GCC/OpenACC/nvptx: 1 KiB per thread;
> GCC/OpenMP/nvptx is an exception, because of its use of '-msoft-stack'
> "Use custom stacks instead of local memory for automatic storage".)
>
> Now, the Nvidia Driver tries to accomodate for such largish stack usage,
> and dynamically increases the per-thread stack as necessary (thereby
> potentially reducing parallelism) -- if it manages to understand the call
> graph.  In case of libgfortran I/O, it evidently doesn't.  Not being able
> to disprove existance of recursion is the common problem, as I've read.
> At run time, via 'CU_JIT_INFO_LOG_BUFFER' you then get, for example:
>
> warning : Stack size for entry function 'MAIN__$_omp_fn$0' cannot be 
> statically determined
>
> That's still not an actual problem: if the GPU kernel's stack usage still
> fits into 1 KiB.  Very often it does, but if, as happens in libgfortran
> I/O handling, there is another such 'dt_parm' put onto the stack, the
> stack then overflows; device-side SIGSEGV.
>
> (There is, by the way, some similar analysis by Tom de Vries in
>  "[nvptx, openacc, openmp, testsuite]
> Recursive tests may fail due to thread stack limit".)
>
> Of course, you shouldn't really be doing I/O in GPU kernels, but people
> do like their occasional "'printf' debugging", so we ought to make that
> work (... without pessimizing any "normal" code).
>
> I assume that generally reducing the size of 'dt_parm' etc. is out of
> scope.
>
> There is a way to manually set a per-thread stack size, but it's not
> obvious which size to set: that sizes needs to work for the whole GPU
> kernel, and should be as low as possible (to maximize parallelism).
> I assume that even if GCC did an accurate call graph analysis of the GPU
> kernel's maximum stack usage, that still wouldn't help: that's before the
> PTX JIT does its own code transformations, including stack spilling.
>
> There exists a 'CU_JIT_LTO' flag to "Enable link-time optimization
> (-dlto) for device code".  This might help, assuming that it manages to
> simplify the libgfortran I/O code such that the PTX JIT then understands
> the call graph.  But: that's available only starting with recent
> CUDA 11.4, so not a general solution -- if it works at all, which I've
> not tested.
>
> Similarly, we could enable GCC's LTO for device code generation -- but
> that's a big project, out of scope at this time.  And again, we don't
> know if that at all helps this case.
>
> I see a few options:
>
> (a) Figure out what it is in the libgfortran I/O implementation that
> causes "Stack size [...] cannot be statically determined", and re-work
> that code to avoid that, or even disable certain things for nvptx, if
> feasible.
>
> (b) Also for GCC/OpenACC/nvptx use the GCC/OpenMP/nvptx '-msoft-stack'.
> I don't really want to do that however: it does introduce a bit of
> complexity in all the generated device code and run-time overhead that we
> generally would like to avoid.
>
> (c) I'm contemplating a tweak/compiler pass for transforming such large
> stack objects into heap allocation (during nvptx offloading compilation).
> 'malloc'/'free' do exist; they're slow, but that's not a problem for the
> code paths this is to affect.  (Might also add some compile-time
> diagnostic, of course.)  Could maybe even limit this to only be used
> during libgfortran compilation?  This is then conceptually a bit similar
> to (b), but localized to relevant parts only.  Has such a thing been done
> before in GCC, that I could build upon?
>
> Any other clever ideas?

Shrink st_parameter_dt (it's part of the ABI though, kind of).  Lots of the
bloat is from things that are unused for simpler I/O cases (so some
"inheritance" could help), and lots of the bloat is from using
string/length pairs using char * + size_t for what looks like could be
encoded a lot more efficiently.

There's probably not much low-hanging fruit.

Converting to heap allocation is difficult outside of the frontend and you
have 

Re: [PATCH 09/10] fortran: Support clobbering of variable subreferences [PR88364]

2022-09-19 Thread Richard Biener via Fortran
On Mon, Sep 19, 2022 at 9:31 AM Mikael Morin  wrote:
>
> Le 18/09/2022 à 12:48, Richard Biener a écrit :
> >
> >> Does *([1]) count as a pointer dereference?
> >
> > Yes, technically.
> >
> >>   Even in the original dump it is already simplified to a straight a[1].
> >
> > But this not anymore.  The check can probably be relaxed, it stems from the 
> > dual purpose of CLOBBER.
> >
> So the following makes the frontend-emitted IL valid, by handing the
> simplification over to the middle-end, but I can't help thinking that
> behavior won't be more reliable.

I think that will just delay the folding.  We are basically relying on
the frontends to
restrict what they produce, the *ptr case was specifically for C++
this in CTOR/DTORs
and during inlining we throw away all clobbers that end up "wrong" but
I guess we
might have latent issues when we obfuscate the address in the caller enough and
get undesired simplification ...

>
> diff --git a/gcc/fortran/trans-expr.cc b/gcc/fortran/trans-expr.cc
> index f8fcd2d97d9..5fb9a3a536d 100644
> --- a/gcc/fortran/trans-expr.cc
> +++ b/gcc/fortran/trans-expr.cc
> @@ -6544,8 +6544,9 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol *
> sym,
>&& !sym->attr.elemental)
>  {
>tree var;
> - var = build_fold_indirect_ref_loc (input_location,
> -parmse.expr);
> + var = build1_loc (input_location, INDIRECT_REF,
> +   TREE_TYPE (TREE_TYPE
> (parmse.expr)),
> +   parmse.expr);
>tree clobber = build_clobber (TREE_TYPE (var));
>gfc_add_modify (, var, clobber);
>  }
>


Re: [PATCH 09/10] fortran: Support clobbering of variable subreferences [PR88364]

2022-09-18 Thread Richard Biener via Fortran



> Am 18.09.2022 um 11:10 schrieb Mikael Morin :
> 
> Le 18/09/2022 à 08:12, Richard Biener a écrit :
>>> On Sat, Sep 17, 2022 at 9:33 PM Mikael Morin  wrote:
>>> 
>>> Le 17/09/2022 à 19:03, Thomas Koenig via Fortran a écrit :
 
 I have a concern about this part, though.  My understanding at the
 time was that it is not possible to clobber an individual array
 element, but that this clobbers anything off the pointer that this
 is based on.
 
>>> Well, we need the middle-end guys to give a definitive answer on this
>>> topic, but I think it would be a very penalizing limitation if that was
>>> the case.  I have assumed that the clobber spanned the value it was
>>> applied on, neither more nor less, so just the array element in case of
>>> array elements.
>> There is IL verification that the LHS of a CLOBBER is either
>> a declaration or a pointer dereference, no array or component
>> selection is allowed there.  Now, nothing should go wrong here,
>> but we might eventually just drop those CLOBBERs or ICE if
>> we frontend hands us an "invalid" one.
> 
> Obviously I have assumed too much here; it's probably best to drop this patch.
> It is unfortunate as there is some desirable behavior within reach here.  The 
> test shows that the patch permits the elimination of a useless store.  And IL 
> verification doesn't seem that upset with it by the way.
> Does *([1]) count as a pointer dereference?

Yes, technically.

>  Even in the original dump it is already simplified to a straight a[1].

But this not anymore.  The check can probably be relaxed, it stems from the 
dual purpose of CLOBBER.

Richard 

Re: [PATCH 09/10] fortran: Support clobbering of variable subreferences [PR88364]

2022-09-18 Thread Richard Biener via Fortran
On Sat, Sep 17, 2022 at 9:33 PM Mikael Morin  wrote:
>
> Le 17/09/2022 à 19:03, Thomas Koenig via Fortran a écrit :
> >
> > Hi Mikael,
> >
> >> This adds support for clobbering of partial variable references, when
> >> they are passed as actual argument and the associated dummy has the
> >> INTENT(OUT) attribute.
> >> Support includes array elements, derived type component references,
> >> and complex real or imaginary parts.
> >>
> >> This is done by removing the check for lack of subreferences, which is
> >> basically a revert of r9-4911-gbd810d637041dba49a5aca3d085504575374ac6f.
> >> This removal allows more expressions than just array elements,
> >> components and complex parts, but the other expressions are excluded by
> >> other conditions: substrings are excluded by the check on expression
> >> type (CHARACTER is excluded), KIND and LEN references are rejected by
> >> the compiler as not valid in a variable definition context.
> >>
> >> The check for scalarness is also updated as it was only valid when there
> >> was no subreference.
> >
> > First, thanks a lot for digging into this subject. I have looked through
> > the patch series, and it looks very good so far.
> >
> > I have a concern about this part, though.  My understanding at the
> > time was that it is not possible to clobber an individual array
> > element, but that this clobbers anything off the pointer that this
> > is based on.
> >
> Well, we need the middle-end guys to give a definitive answer on this
> topic, but I think it would be a very penalizing limitation if that was
> the case.  I have assumed that the clobber spanned the value it was
> applied on, neither more nor less, so just the array element in case of
> array elements.

There is IL verification that the LHS of a CLOBBER is either
a declaration or a pointer dereference, no array or component
selection is allowed there.  Now, nothing should go wrong here,
but we might eventually just drop those CLOBBERs or ICE if
we frontend hands us an "invalid" one.

Richard.

> > So,
> >
> >integer, dimension(3) :: a
> >
> >a(1) = 1
> >a(3) = 3
> >call foo(a(1))
> >
> > would also invalidate the store to a(3).  Is my understanding correct?
>
> I think it was the case before patch 2 in in the series, because the
> clobber was applied to the symbol decl, so in the case of the expression
> A(1), it was applied to A which is the full array.  After patch 2, the
> clobber is applied to the expression A(1), so the element alone.
>
> > If so, I think this we cannot revert that patch (which was introduced
> > because of a regression).
> >
> The testcase from the patch was not specifically checking lack of
> side-effect clobbers, so I have double-checked with the following
> testcase, which should lift your concerns.
> I propose to keep the patch with the testcase added to it.  What do you
> think?
>
> Mikael
>


Re: [patch, fortran, doc] Mention new CONVERT options for POWER

2022-05-02 Thread Richard Biener via Fortran
On Sat, Apr 30, 2022 at 1:26 AM Bernhard Reutner-Fischer
 wrote:
>
> On Fri, 29 Apr 2022 20:03:55 +0200
> Thomas Koenig  wrote:
>
> > On 28.04.22 19:17, Bernhard Reutner-Fischer wrote:
> > > ISTM that this should be s/mod.e/mode./ ?
> >
> > Yep, fixed as obvious and simple on trunk with
> > r13-49-g4a8b45e8bc8246bd141dad65f571a3e0cc499c6b .
>
> thanks!
> >
> > OK for backport to gcc-12? (This is both extremely safe and
> > not particularly important :-)
>
> I'd say yes because it's 1) doc 2) trivial 3) obvious :)
>
> But formally, i think RM approval is needed ATM.
> Richard?

OK.


Re: [patch, fortran, doc] Mention new CONVERT options for POWER

2022-04-28 Thread Richard Biener via Fortran
On Wed, Apr 27, 2022 at 10:34 PM Thomas Koenig via Fortran
 wrote:
>
> Hi,
>
> as we say in German "Nicht documentiert ist nicht gemacht", if
> it is not documented, it wasn't done.
>
> So I thought it would be time to document the changes to the various
> ways of specifying CONVERT before gcc12 went out of the door, so
> here is a patch for that.
>
> If that goes in, I will also write up something for gcc-12/changes.html.
>
> OK for trunk?  Suggestions for improvements?

OK for trunk.

Richard.

> Best regards
>
> Thomas
>
> Document changes to CONVERT for -mabi-ieeelongdouble for POWER
>
> gcc/fortran/ChangeLog:
>
> * gfortran.texi: Mention r16_ieee and r16_ibm.
> * invoke.texi: Likewise.


Re: [PATCH 0/4] Use pointer arithmetic for array references [PR102043]

2022-04-19 Thread Richard Biener via Fortran
On Sat, Apr 16, 2022 at 6:57 PM Mikael Morin via Gcc-patches
 wrote:
>
> Hello,
>
> this is a fix for PR102043, which is a wrong code bug caused by the
> middle-end concluding from array indexing that the array index is
> non-negative.  This is a wrong assumption for "reversed arrays",
> that is arrays whose descriptor makes accesses to the array from
> last element to first element.  More generally the wrong cases are
> arrays with a descriptor having a negative stride for at least one
> dimension.
>
> I have been trying to fix this by stopping the front-end from generating
> bogus code, by either stripping array-ness from descriptor data
> pointers, or by changing the initialization of data pointers to point
> to the first element in memory order instead of the first element in
> access order (which is the last in memory order for reversed arrays).
> Both ways are very impacting changes to the frontend and I haven’t
> been able to eliminate all the regressions in time using either way.
>
> However, Richi showed with a patch attached to the PR that array
> references are crucial for the problem to appear, and everything works
> if array indexing is replaced with pointer arithmetic.  This is
> much simpler and doesn’t imply invasive changes to the frontend.
>
> I have built on top of his patch to keep the array indexing in cases
> where the change to pointer arithmetic is not necessary, either because
> the array is not a fortran array with a descriptor, or because it’s
> known to be contiguous.  This has the benefit of reducing the churn
> in the dumped code patterns used in the testsuite.  It also avoids
> ICE regression such as interface_12.f90 or result_in_spec.f90, but
> I can’t exclude that those could be a real problem made latent.
>
> Patches 1 to 3 are preliminary changes to avoid regressions.  The main
> change is number 4, the last in the series.
>
> Regression tested on x86_64-pc-linux-gnu.  OK for master?

I've also tested the patch and built SPEC CPU 2017 successfully
on x86_64 with -Ofast -flto -march=znver2.  For 548.exchange2_r
I see a ~3% runtime regression caused by the change, the
other 6 Fortran using benchmarks show no runtime behavior change.
I have not analyzed the 548.exchange2_r regression (but confirmed
it with a 3-run).

That said, I believe this is the only reasonable thing to do for GCC 12,
all other options require invasive changes in the middle-end.

So OK from my side, I'm not familiar with the GFortran frontend enough
to review the changes besides the gfc_build_array_ref chage though.

Thanks,
Richard.


>
> Mikael Morin (4):
>   fortran: Pre-evaluate string pointers. [PR102043]
>   fortran: Update index extraction code. [PR102043]
>   fortran: Generate an array temporary reference [PR102043]
>   fortran: Use pointer arithmetic to index arrays [PR102043]
>
>  gcc/fortran/trans-array.cc|  60 +-
>  gcc/fortran/trans-expr.cc |   9 +-
>  gcc/fortran/trans-io.cc   |  48 -
>  gcc/fortran/trans.cc  |  42 +++-
>  gcc/fortran/trans.h   |   4 +-
>  .../gfortran.dg/array_reference_3.f90 | 195 ++
>  gcc/testsuite/gfortran.dg/c_loc_test_22.f90   |   4 +-
>  gcc/testsuite/gfortran.dg/dependency_49.f90   |   3 +-
>  gcc/testsuite/gfortran.dg/finalize_10.f90 |   2 +-
>  .../gfortran.dg/negative_stride_1.f90 |  25 +++
>  .../gfortran.dg/vector_subscript_8.f90|  16 ++
>  .../gfortran.dg/vector_subscript_9.f90|  21 ++
>  12 files changed, 401 insertions(+), 28 deletions(-)
>  create mode 100644 gcc/testsuite/gfortran.dg/array_reference_3.f90
>  create mode 100644 gcc/testsuite/gfortran.dg/negative_stride_1.f90
>  create mode 100644 gcc/testsuite/gfortran.dg/vector_subscript_8.f90
>  create mode 100644 gcc/testsuite/gfortran.dg/vector_subscript_9.f90
>
> --
> 2.35.1
>


Re: [PATCH] Fortran: Add location info to OpenMP tree nodes

2022-04-05 Thread Richard Biener via Fortran
On Tue, Apr 5, 2022 at 6:12 AM Sandra Loosemore  wrote:
>
> On 3/25/22 20:03, Sandra Loosemore wrote:
> > I've got another patch forthcoming (stage 1 material) that adds some new
> > diagnostics for non-rectangular loops during gimplification of OMP
> > nodes.  When I was working on that, I discovered that the Fortran front
> > end wasn't attaching location information to the tree nodes
> > corresponding to the various OMP directives, so the new errors weren't
> > coming out with location info either.  I went through trans-openmp.cc
> > and fixed all the places where make_node was being called to explicitly
> > set the location.
> >
> > I don't have a test case specifically for this change, but my test cases
> > for the new diagnostics in the non-rectangular loops patch do exercise
> > it.  Is this OK for trunk now, or for stage 1 when we get there?
>
> Ping!  Even a quick review and "this isn't suitable for GCC 12" answer
> would be helpful.
>
> https://gcc.gnu.org/pipermail/fortran/2022-March/057706.html

OK if nobody objects in 24h.

Richard.

> The definitely-stage-1 patch that exercises this is here:
>
> https://gcc.gnu.org/pipermail/fortran/2022-March/057707.html
>
> -Sandra


Re: [PATCH] fortran: Fix up initializers of param(0) PARAMETERs [PR103691]

2022-03-26 Thread Richard Biener via Fortran



> Am 26.03.2022 um 12:28 schrieb Thomas Koenig :
> 
> On 25.03.22 12:34, Jakub Jelinek via Fortran wrote:
>> What is the behavior with a RANGE_EXPR when one has { [0..10] = ++i;
>> }, is that applying the side-effects 11 times or once ?
> 
> For side effects during the evaluation of expression, Fortran has a
> clear "if you depend on it, it's your fault" rule.  In F 2018, it says
> 
> 10.1.7  Evaluation of operands
> 
> 1 It is not necessary for a processor to evaluate all of the operands of
> an expression, or to evaluate entirely each operand, if the value of the
> expression can be determined otherwise.
> 
> Also, the semantics of
> 
> a(a:b) = expr
> 
> say that the expression on the LHS is evaluated only once before
> assignment.  So, anything that looks like that should be translated
> to
> 
> tmp = ++i;
> [0..10] = tmp;

Note I was not trying to question middle-end semantic here but gfortran se_expr 
(?) one. Is there a Fortan input where Jakob’s patch would cause a side-effect 
to be dropped and is that valid?

Richard.

> 
> 


Re: [PATCH] fortran: Fix up initializers of param(0) PARAMETERs [PR103691]

2022-03-25 Thread Richard Biener via Fortran
On Fri, Mar 25, 2022 at 12:34 PM Jakub Jelinek  wrote:
>
> On Fri, Mar 25, 2022 at 12:16:40PM +0100, Richard Biener wrote:
> > On Fri, Mar 25, 2022 at 11:13 AM Tobias Burnus  
> > wrote:
> > >
> > > On 25.03.22 09:57, Jakub Jelinek via Fortran wrote:
> > > > On the gfortran.dg/pr103691.f90 testcase the Fortran ICE emits
> > > >static real(kind=4) a[0] = {[0 ... -1]=2.0e+0};
> > > > That is an invalid RANGE_EXPR where the maximum is smaller than the 
> > > > minimum.
> > > >
> > > > The following patch fixes that.  If TYPE_MAX_VALUE is smaller than
> > > > TYPE_MIN_VALUE, the array is empty and so doesn't need any initializer,
> > > > if the two are equal, we don't need to bother with a RANGE_EXPR and
> > > > can just use that INTEGER_CST as the index and finally for the 2+ values
> > > > in the range it uses a RANGE_EXPR as before.
> > > >
> > > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> > >
> > > LGTM – thanks for taking care of Fortran patches and regressions.
> > >
> > > > 2022-03-25  Jakub Jelinek  
> > > >
> > > >   PR fortran/103691
> > > >   * trans-array.cc (gfc_conv_array_initializer): If TYPE_MAX_VALUE 
> > > > is
> > > >   smaller than TYPE_MIN_VALUE (i.e. empty array), throw the 
> > > > initializer
> > > >   on the floor, if TYPE_MIN_VALUE is equal to TYPE_MAX_VALUE, use 
> > > > just
> > > >   the TYPE_MIN_VALUE as index instead of RANGE_EXPR.
> > >
> > > I am not sure whether "throw the initializer on the floor" is the best 
> > > wording
> > > for a changelog. I think I prefer a wording like "ignore the initializer" 
> > > or
> > > another less idiomatic expression. And I think a ';' before the second 
> > > 'if'
> > > also increases readability.
> >
> > Can there be side-effects in those initializer elements in Fortran?
>
> For PARAMETERs certainly not, those need to be constant.
> Even otherwise, this is in a routine that does
>   /* Create a constructor from the list of elements.  */
>   tmp = build_constructor (type, v);
>   TREE_CONSTANT (tmp) = 1;
>   return tmp;
> at the end so I wouldn't expect side-effects anywhere.

Ah, didn't see that.

> Also, I think typically in the Fortran FE side-effects would go into
> se.pre and se.post sequences, not into se.expr, and this routine
> doesn't emit those se.pre/se.post sequences anywhere, so presumably it
> assumes they don't exist.
>
> What is the behavior with a RANGE_EXPR when one has { [0..10] = ++i; },
> is that applying the side-effects 11 times or once ?

11 times is what is documented.

Richard.

>
>
> Jakub
>


Re: [PATCH] fortran: Fix up initializers of param(0) PARAMETERs [PR103691]

2022-03-25 Thread Richard Biener via Fortran
On Fri, Mar 25, 2022 at 11:13 AM Tobias Burnus  wrote:
>
> On 25.03.22 09:57, Jakub Jelinek via Fortran wrote:
> > On the gfortran.dg/pr103691.f90 testcase the Fortran ICE emits
> >static real(kind=4) a[0] = {[0 ... -1]=2.0e+0};
> > That is an invalid RANGE_EXPR where the maximum is smaller than the minimum.
> >
> > The following patch fixes that.  If TYPE_MAX_VALUE is smaller than
> > TYPE_MIN_VALUE, the array is empty and so doesn't need any initializer,
> > if the two are equal, we don't need to bother with a RANGE_EXPR and
> > can just use that INTEGER_CST as the index and finally for the 2+ values
> > in the range it uses a RANGE_EXPR as before.
> >
> > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> LGTM – thanks for taking care of Fortran patches and regressions.
>
> > 2022-03-25  Jakub Jelinek  
> >
> >   PR fortran/103691
> >   * trans-array.cc (gfc_conv_array_initializer): If TYPE_MAX_VALUE is
> >   smaller than TYPE_MIN_VALUE (i.e. empty array), throw the initializer
> >   on the floor, if TYPE_MIN_VALUE is equal to TYPE_MAX_VALUE, use just
> >   the TYPE_MIN_VALUE as index instead of RANGE_EXPR.
>
> I am not sure whether "throw the initializer on the floor" is the best wording
> for a changelog. I think I prefer a wording like "ignore the initializer" or
> another less idiomatic expression. And I think a ';' before the second 'if'
> also increases readability.

Can there be side-effects in those initializer elements in Fortran?

Richard.

> Tobias
>
> > --- gcc/fortran/trans-array.cc.jj 2022-02-04 14:36:55.113603791 +0100
> > +++ gcc/fortran/trans-array.cc2022-03-24 16:14:58.334498775 +0100
> > @@ -6267,10 +6267,17 @@ gfc_conv_array_initializer (tree type, g
> > else
> >   gfc_conv_structure (, expr, 1);
> >
> > -  CONSTRUCTOR_APPEND_ELT (v, build2 (RANGE_EXPR, gfc_array_index_type,
> > -  TYPE_MIN_VALUE (TYPE_DOMAIN (type)),
> > -  TYPE_MAX_VALUE (TYPE_DOMAIN (type))),
> > -   se.expr);
> > +  if (tree_int_cst_lt (TYPE_MAX_VALUE (TYPE_DOMAIN (type)),
> > +TYPE_MIN_VALUE (TYPE_DOMAIN (type
> > + break;
> > +  else if (tree_int_cst_equal (TYPE_MIN_VALUE (TYPE_DOMAIN (type)),
> > +TYPE_MAX_VALUE (TYPE_DOMAIN (type
> > + range = TYPE_MIN_VALUE (TYPE_DOMAIN (type));
> > +  else
> > + range = build2 (RANGE_EXPR, gfc_array_index_type,
> > + TYPE_MIN_VALUE (TYPE_DOMAIN (type)),
> > + TYPE_MAX_VALUE (TYPE_DOMAIN (type)));
> > +  CONSTRUCTOR_APPEND_ELT (v, range, se.expr);
> > break;
> >
> >   case EXPR_ARRAY:
> >
> >   Jakub
> >
> -
> Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
> München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
> Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
> München, HRB 106955


GCC 12.0.1 Status Report (2022-01-17), Stage 4 starts now

2022-01-17 Thread Richard Biener via Fortran


Status
==

The GCC master branch is now in regression and documentation fixing
mode (Stage 4) in preparation for the release of GCC 13.  Re-opening
of general development will happen once we reach zero P1 regressions
which is when we branch for the release.  Time wise history projects
that to happen around mid to end of April 2022.

Take the quality data below with a big grain of salt - most of the
new P3 classified bugs will become P1 or P2 (generally every
regression against GCC 11 is to be considered P1 if it concerns
primary or secondary platforms).


Quality Data


Priority  #   Change from last report
---   ---
P1  38+  8
P2  310   +  3
P3  286   +  7
P4  221   +  1
P5  25
---   ---
Total P1-P3 634   + 18
Total   880   + 19


Previous Report
===

https://gcc.gnu.org/pipermail/gcc/2022-January/238060.html


GCC 12.0.0 Status Report (2022-01-10), Stage 3 ends Jan 16th

2022-01-10 Thread Richard Biener via Fortran
Status
==

The GCC development branch is open for general bugfixing (Stage 3)
and will transition to regression and documentation fixing only
(Stage 4) on the end of Jan 16th.

Take the quality data below with a big grain of salt - most of the
new P3 classified bugs will become P1 or P2 (generally every
regression against GCC 11 is to be considered P1 if it concerns
primary or secondary platforms).


Quality Data


Priority  #   Change from last report
---   ---
P1   30   -  4
P2  307   +  1
P3  279   + 42
P4  220   + 13
P5   25
---   ---
Total P1-P3 616   + 39
Total   861   + 52


Previous Report
===

https://gcc.gnu.org/pipermail/gcc/2021-November/237741.html


Re: [PATCH] Avoid some -Wunreachable-code-ctrl

2021-11-30 Thread Richard Biener via Fortran
On Tue, 30 Nov 2021, Mikael Morin wrote:

> On 30/11/2021 14:25, Richard Biener wrote:
> > On Tue, 30 Nov 2021, Mikael Morin wrote:
> > 
> >> Le 29/11/2021 ? 16:03, Richard Biener via Gcc-patches a ?crit :
> >>> diff --git a/gcc/fortran/frontend-passes.c b/gcc/fortran/frontend-passes.c
> >>> index f5ba7cecd54..16ee2afc9c0 100644
> >>> --- a/gcc/fortran/frontend-passes.c
> >>> +++ b/gcc/fortran/frontend-passes.c
> >>> @@ -5229,7 +5229,6 @@ gfc_expr_walker (gfc_expr **e, walk_expr_fn_t
> >>> exprfn,
> >>> void *data)
> >>>   case EXPR_OP:
> >>> WALK_SUBEXPR ((*e)->value.op.op1);
> >>> WALK_SUBEXPR_TAIL ((*e)->value.op.op2);
> >>> - break;
> >>>   case EXPR_FUNCTION:
> >>> for (a = (*e)->value.function.actual; a; a = a->next)
> >>>   WALK_SUBEXPR (a->expr);
> >>
> >> I?m uncomfortable with the above change.
> >> It makes it look like there is a fall through, but there is not.
> >> Maybe inline the macro to make the continue explicit, or use WALK_SUBEXPR
> >> instead of WALK_SUBEXPR_TAIL and hope the compiler will do the tail call
> >> optimization.
> > 
> > Ah, it follows the style in tree.c:walk_tree_1 where break was used
> > inconsistently after WALK_SUBTREE_TAIL which was then more obvious
> > to me to clean up.  I didn't realize the fortran FE only had a
> > single WALK_SUBEXPR_TAIL.
> > 
> > I'm not sure inlining will make the situation more clear, for
> > sure using WALK_SUBEXPR would but it might loose the tailcall.
> > 
> > Would you accept an additional comment after WALK_SUBEXPR_TAIL like
> > 
> >case EXPR_OP:
> >  WALK_SUBEXPR ((*e)->value.op.op1);
> >  WALK_SUBEXPR_TAIL ((*e)->value.op.op2);
> >  /* tail-recurse  */
> > 
> My preference would be a gcc_unreachable() or something similar, but I
> understand it would get a warning as well?
> 
> Without better idea, I?m fine with an even more explicit comment:
> 
> /* No fallthru because of the tail recursion above.  */
> 
> > ?  Btw, a fallthru would be diagnosed by GCC unless we put
> > 
> >  /* Fallthru  */
> > 
> > here.
> Sure, but my main concern was misreading from programmers (including me),
> which is not diagnosed by compilers.
> 
> > Maybe renaming WALK_SUBEXPR_TAIL to WALK_SUBEXPR_WITH_CONTINUE
> > or WALK_SUBEXPR_BY_TAIL_RECURSING or WALK_SUBEXPR_TAILRECURSE would
> > be more obvious?
> > 
> I think the comment above would be enough.

Installed as follows.

Richard.

>From e5c2a436ef7596d254ffefd279742382b1ff546b Mon Sep 17 00:00:00 2001
From: Richard Biener 
Date: Tue, 30 Nov 2021 15:25:17 +0100
Subject: [PATCH] Add comment to indicate tail recursion
To: gcc-patc...@gcc.gnu.org

My previous change removed an unreachable break; there (an
unreachable continue; would have been more to the point).  The
following re-adds a comment explaining that WALK_SUBEXPR_TAIL
does not fall through but tail recurses.

2021-11-30  Richard Biener  

gcc/fortran/
* frontend-passes.c (gfc_expr_walker): Add comment to
indicate tail recursion.
---
 gcc/fortran/frontend-passes.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/fortran/frontend-passes.c b/gcc/fortran/frontend-passes.c
index 16ee2afc9c0..4764c834f4f 100644
--- a/gcc/fortran/frontend-passes.c
+++ b/gcc/fortran/frontend-passes.c
@@ -5229,6 +5229,7 @@ gfc_expr_walker (gfc_expr **e, walk_expr_fn_t exprfn, 
void *data)
  case EXPR_OP:
WALK_SUBEXPR ((*e)->value.op.op1);
WALK_SUBEXPR_TAIL ((*e)->value.op.op2);
+   /* No fallthru because of the tail recursion above.  */
  case EXPR_FUNCTION:
for (a = (*e)->value.function.actual; a; a = a->next)
  WALK_SUBEXPR (a->expr);
-- 
2.31.1


Re: [PATCH] Avoid some -Wunreachable-code-ctrl

2021-11-30 Thread Richard Biener via Fortran
On Tue, 30 Nov 2021, Mikael Morin wrote:

> Le 29/11/2021 à 16:03, Richard Biener via Gcc-patches a écrit :
> > diff --git a/gcc/fortran/frontend-passes.c b/gcc/fortran/frontend-passes.c
> > index f5ba7cecd54..16ee2afc9c0 100644
> > --- a/gcc/fortran/frontend-passes.c
> > +++ b/gcc/fortran/frontend-passes.c
> > @@ -5229,7 +5229,6 @@ gfc_expr_walker (gfc_expr **e, walk_expr_fn_t exprfn,
> > void *data)
> >  case EXPR_OP:
> >WALK_SUBEXPR ((*e)->value.op.op1);
> >WALK_SUBEXPR_TAIL ((*e)->value.op.op2);
> > -   break;
> >  case EXPR_FUNCTION:
> >for (a = (*e)->value.function.actual; a; a = a->next)
> >  WALK_SUBEXPR (a->expr);
> 
> I’m uncomfortable with the above change.
> It makes it look like there is a fall through, but there is not.
> Maybe inline the macro to make the continue explicit, or use WALK_SUBEXPR
> instead of WALK_SUBEXPR_TAIL and hope the compiler will do the tail call
> optimization.

Ah, it follows the style in tree.c:walk_tree_1 where break was used
inconsistently after WALK_SUBTREE_TAIL which was then more obvious
to me to clean up.  I didn't realize the fortran FE only had a 
single WALK_SUBEXPR_TAIL.

I'm not sure inlining will make the situation more clear, for
sure using WALK_SUBEXPR would but it might loose the tailcall.

Would you accept an additional comment after WALK_SUBEXPR_TAIL like

  case EXPR_OP:
WALK_SUBEXPR ((*e)->value.op.op1);
WALK_SUBEXPR_TAIL ((*e)->value.op.op2);
/* tail-recurse  */

?  Btw, a fallthru would be diagnosed by GCC unless we put

/* Fallthru  */

here.  Maybe renaming WALK_SUBEXPR_TAIL to WALK_SUBEXPR_WITH_CONTINUE
or WALK_SUBEXPR_BY_TAIL_RECURSING or WALK_SUBEXPR_TAILRECURSE would
be more obvious?

Thanks,
Richard.


[PATCH] Only return after resetting type_param_spec_list

2021-11-29 Thread Richard Biener via Fortran
This fixes an appearant mistake in gfc_insert_parameter_exprs.

Bootstrap / regtest pending on x86_64-unknown-linux-gnu.

OK?

Thanks,
Richard.

2021-11-29  Richard Biener  

gcc/fortran/
* decl.c (gfc_insert_parameter_exprs): Only return after
resetting type_param_spec_list.
---
 gcc/fortran/decl.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/fortran/decl.c b/gcc/fortran/decl.c
index c0fec90e3e0..4971638f9b6 100644
--- a/gcc/fortran/decl.c
+++ b/gcc/fortran/decl.c
@@ -3733,9 +3733,9 @@ gfc_insert_parameter_exprs (gfc_expr *e, 
gfc_actual_arglist *param_list)
 {
   gfc_actual_arglist *old_param_spec_list = type_param_spec_list;
   type_param_spec_list = param_list;
-  return gfc_traverse_expr (e, NULL, _parameter_exprs, 1);
-  type_param_spec_list = NULL;
+  bool res = gfc_traverse_expr (e, NULL, _parameter_exprs, 1);
   type_param_spec_list = old_param_spec_list;
+  return res;
 }
 
 /* Determines the instance of a parameterized derived type to be used by
-- 
2.31.1


GCC 12.0.0 Status Report (2021-11-15), Stage 3 in effect NOW

2021-11-15 Thread Richard Biener via Fortran


Status
==

The GCC development branch now is open for general bugfixing (Stage 3).

Take the quality data below with a big grain of salt - most of the
new P3 classified bugs will become P1 or P2 (generally every
regression against GCC 11 is to be considered P1 if it concerns
primary or secondary platforms).


Quality Data


Priority  #   Change from last report
---   ---
P1   34   +  19
P2  306   +  24 
P3  237   +  44
P4  207   +   5
P5   25
---   ---
Total P1-P3 577   +  87
Total   809   +  92


Previous Report
===

https://gcc.gnu.org/pipermail/gcc/2021-October/237464.html


Re: [PATCH] vect: Remove vec_outside/inside_cost fields

2021-11-11 Thread Richard Biener via Fortran
On Thu, Nov 11, 2021 at 10:45 AM Jan Hubicka via Gcc-patches
 wrote:
>
> > > >
> > > > I think the patch causes the following on x86_64-linux-gnu:
> > > > FAIL: gfortran.dg/inline_matmul_17.f90   -O   scan-tree-dump-times 
> > > > optimized "matmul_r4" 2
> > >
> > > I get that failure even with d70ef65692f (from before the patches
> > > I committed today).
> >
> > Sorry, you are right, it's one revision before:
> > d70ef65692fced7ab72e0aceeff7407e5a34d96d
> >
> > Honza, can you please take a look?
> The test looks for matmul_r4 calls which we now optimize out in fre1.
> This is because alias info is better now
>
>   afunc (&__var_5_mma);
>   _188 = __var_5_mma.dim[0].ubound;
>   _189 = __var_5_mma.dim[0].lbound;
>   _190 = _188 - _189;
>   _191 = _190 + 1;
>   _192 = MAX_EXPR <_191, 0>;
>   _193 = (real(kind=4)) _192;
>   _194 = __var_5_mma.dim[1].ubound;
>   _195 = __var_5_mma.dim[1].lbound;
>   _196 = _194 - _195;
>   _197 = _196 + 1;
>   _198 = MAX_EXPR <_197, 0>;
>   _199 = (real(kind=4)) _198;
>   _200 = _193 * _199;
>   _201 = _200 * 3.0e+0;
>   if (_201 <= 1.0e+9)
> goto ; [INV]
>   else
> goto ; [INV]
>:
>   c = {};
>
>
>   afunc (&__var_5_mma);
>   c = {};
>
> Now afunc writes to __var_5_mma only indirectly so I think it is correct that
> we optimize the conditional out.
>
> Easy fix would be to add -fno-ipa-modref, but perhaps someone with
> better understanding of Fortran would help me to improve the testcase so
> the calls to matmul_r4 remains reachable?

I think the two matmul_r4 cases were missed optimizations before so just
changing the expected number of calls to zero is the correct fix here.  Indeed
we can now statically determine the matrices are not large and so only
keep the inline copy.

Richard.

>
> Honza


Re: Silence additional warning in gfortran.dg/do_subscript_3.f90

2021-11-10 Thread Richard Biener via Fortran
On Wed, Nov 10, 2021 at 4:00 PM Jan Hubicka via Gcc-patches
 wrote:
>
> Hi,
> the testcase tests for out of bound accesses warnings and with ipa-modref 
> improvements
> it now triggers a new warning:
>
> /aux/hubicka/trunk-git/gcc/testsuite/gfortran.dg/do_subscript_3.f90:11:9: 
> Warning: (1)
> /aux/hubicka/trunk-git/gcc/testsuite/gfortran.dg/do_subscript_3.f90:10:47: 
> Warning: Array reference at (1) out of bounds (0 < 1) in loop beginning at (2)
> /aux/hubicka/trunk-git/gcc/testsuite/gfortran.dg/do_subscript_3.f90:19:9: 
> Warning: (1)
> /aux/hubicka/trunk-git/gcc/testsuite/gfortran.dg/do_subscript_3.f90:18:45: 
> Warning: Array reference at (1) out of bounds (6 > 5) in loop beginning at (2)
> /aux/hubicka/trunk-git/gcc/testsuite/gfortran.dg/do_subscript_3.f90:19:50: 
> Warning: iteration 5 invokes undefined behavior 
> [-Waggressive-loop-optimizations]
> /aux/hubicka/trunk-git/gcc/testsuite/gfortran.dg/do_subscript_3.f90:18:9: 
> note: within this loop
>
> I suppose we now are able to propagate array bounds better into the
> nested function.
>
> The last warning is new and correct even though little bit redundant.  I think
> we may just silence it?  I wonder why we do not get same fact on the first 
> loop
> (which hits out of bound access already at iteration 0).
>
> Looks OK?

I guess so - but it looks like the testcase exercises diagnostics in
the frontend so I wonder
whether simply using { dg-options "-O0" } might be more appropriate?

> Honza
>
> gcc/testsuite/ChangeLog:
>
> 2021-11-10  Jan Hubicka  
>
> * gfortran.dg/do_subscript_3.f90: Add 
> -Wno-aggressive-loop-optimizations.
>
> diff --git a/gcc/testsuite/gfortran.dg/do_subscript_3.f90 
> b/gcc/testsuite/gfortran.dg/do_subscript_3.f90
> index 2f62f58142b..18ed9a2f0c9 100644
> --- a/gcc/testsuite/gfortran.dg/do_subscript_3.f90
> +++ b/gcc/testsuite/gfortran.dg/do_subscript_3.f90
> @@ -1,4 +1,5 @@
>  ! { dg-do compile }
> +! { dg-additional-options "-Wno-aggressive-loop-optimizations" }
>  ! PR fortran/91424
>  ! Check that only one warning is issued inside blocks, and that
>  ! warnings are also issued for contained subroutines.


Re: [PATCH 0/5] Fortran manual updates

2021-11-04 Thread Richard Biener via Fortran
On Thu, Nov 4, 2021 at 11:05 AM Martin Liška  wrote:
>
> On 11/2/21 16:56, Sandra Loosemore wrote:
> > On 11/2/21 9:20 AM, Martin Liška wrote:
> >> On 11/2/21 15:48, Sandra Loosemore wrote:
> >>> On 11/2/21 2:51 AM, Martin Liška wrote:
>  On 11/2/21 00:56, Sandra Loosemore wrote:
> > I'll wait a couple days before committing these patches, in case
> > anybody wants to give some feedback, especially on technical issues.
> 
>  Hello.
> 
>  Appreciate the work you did, but the patchset will cause quite some 
>  conflicts
>  in the prepared Sphinx migration patch I've sent to the mailing list :/
>  Anyway, I will rebase my patches. For the future, are you planning doing 
>  similar
>  documentation reorganization for a manual? Based on discussion with 
>  Gerald, I hope
>  we can finish the transition before the end of this year.
> >>>
> >>> My understanding was that, if this conversion is indeed going to happen, 
> >>> it's going to be automated by scripts?
> >>
> >> Exactly, but the conversion needs some manual post-processing that I've 
> >> already done.
> >>
> >>>   I hadn't seen any discussion of it on the list for months and thought 
> >>> the whole idea was on hold or scrapped, since it hasn't happened yet.
> >>
> >> There was almost no response, so that's why I contacted Gerald about help.
> >
> > I have to admit that I was buried in technical work at the time of the 
> > previous discussion (in fact, the Fortran things I am now trying to 
> > document), and didn't have time to look at the proposed changes in any 
> > detail.  I have wondered, though, why it's necessary to do this change  
> > if people don't like the way Texinfo formats output, can't we fix Texinfo?
>
> That's a reasonable question. Well, I believe the technical dept (feature 
> set) of Texinfo (compared to more modern tools) is significant and I don't 
> want
> to spend my time hacking a HTML, Javascipt and so on. Moreover, Sphinx is 
> massively used: https://www.sphinx-doc.org/en/master/examples.html
> and the tool is actively developed.
>
>
> > Or hack it to translate the sources to something like DocBook instead, and 
> > then adopt that as our source format?  I can write documentation in any 
> > markup format, but it seems to me that structured XML-based formats are a 
> > lot more amenable to scripted manipulation than either Texinfo or 
> > restructured text.  If the rest of the community is set on Sphinx, I'm fine 
> > with that, but I kind of don't see the point, myself.  :-S
>
> We think with David that DocBook is too complicated and a markup is a better 
> choice (from that perspective, Texinfo is fine).
>
> >
> >>> In any case it does not seem reasonable to freeze the current Texinfo 
> >>> docs for months while waiting for it to happen, especially as we are 
> >>> heading into the end of the release cycle and people are finishing up 
> >>> changes and new features they need to document.
> >>
> >> Sure, I can easily rebase normal changes, but you are suggesting a 
> >> complete redesign/renaming. It's going to take me some time,
> >> but I'll rebase my patches.
> >
> > Well, what I've done is hardly a "complete" redesign/renaming of the 
> > Fortran manual -- I've barely scratched the surface on it.  My main goal 
> > was just to update the bit-rotten standards conformance sections, which 
> > were unfortunately spread among multiple places in the document.  I did 
> > consolidate those few sections, but I did not make any big-picture changes 
> > to the organization of the manual, and I have not even reviewed any other 
> > parts of it for accuracy or relevance.  I'd been thinking about making a 
> > pass to do some copy-editing things, like making sure all chapter/section 
> > titles use consistent title case capitalization, but I will hold off on 
> > that if it's going to cause problems.
>
> I see, thanks for doing that.

Sandra - please go forward with improving the manual with the current
texinfo setup.  There's zero reason to hold
off on improving user level documentation.

Thanks,
Richard.

> Martin
>
> >
> > -Sandra
>


Re: [r12-4457 Regression] FAIL: gfortran.dg/deferred_type_param_6.f90 -Os execution test on Linux/x86_64

2021-10-18 Thread Richard Biener via Fortran
On Sat, Oct 16, 2021 at 8:24 PM Jan Hubicka via Gcc-patches
 wrote:
>
> Hi,
> >
> > FAIL: gfortran.dg/deferred_type_param_6.f90   -O1  execution test
> > FAIL: gfortran.dg/deferred_type_param_6.f90   -Os  execution test
> Sorry for the breakage.  This time it seems like bug in Fortran FE
> which was previously latent:
>
> __attribute__((fn spec (". . R ")))
> void subfunc (character(kind=1)[1:..__result] * & __result, integer(kind=8) * 
> .__result)
> {
>   # PT = nonlocal
>   character(kind=1)[1:..__result] * & __result_3(D) = __result;
>   # PT = nonlocal null
>   integer(kind=8) * .__result_5(D) = .__result;
>   integer(kind=4) _1;
>
>[local count: 1073741824]:
>   *__result_3(D) = 
>   # USE = nonlocal escaped { D.4230 } (nonlocal, escaped)
>   _1 = _gfortran_compare_string (5, , 5, &"FIVEC"[1]{lb: 1 sz: 1});
>   if (_1 != 0)
> goto ; [0.04%]
>   else
> goto ; [99.96%]
>
>[local count: 429496]:
>   # USE = nonlocal escaped null
>   # CLB = nonlocal escaped null
>   _gfortran_stop_numeric (10, 0);
>
>[local count: 1073312329]:
>   *.__result_5(D) = 5;
>   return;
> }
>
> The fnspec ". . R " specifies that .__result is readonly however we
> have:
>   *.__result_5(D) = 5;
>
> I am not sure I understand fortran FE well enough to figure out why
> it is set so.  The function is declared as:
>
>   function subfunc() result(res)
> character(len=:), pointer :: res
> res => fifec
> if (len(res) /= 5) STOP 9
> if (res /= "FIVEC") STOP 10
>   end function subfunc
>
> and we indeed optimize load of the result:
> -  # USE = nonlocal escaped { D.4252 D.4254 } (nonlocal, escaped)
> -  # CLB = nonlocal escaped { D.4254 } (escaped)
> +  # USE = nonlocal escaped
> +  # CLB = { D.4254 }
>subfunc (, );
> -  .s2_34 = slen.4;
> -  # PT = nonlocal escaped null { D.4254 } (escaped)
> -  s2_35 = pstr.5;
>pstr.5 ={v} {CLOBBER};
>
> and I think tat is what breaks the testcase (I also verified that
> ignoring the fnspec 'R' fixes it).

The FE code adding the fnspec probably fails to consider the
separately passed length of the string result?

Can you open a bugreport please?

Richard.

> Honza
> >
> > with GCC configured with
> >
> > ../../gcc/configure 
> > --prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r12-4457/usr
> >  --enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
> > --with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet 
> > --without-isl --enable-libmpx x86_64-linux --disable-bootstrap
> >
> > To reproduce:
> >
> > $ cd {build_dir}/gcc && make check 
> > RUNTESTFLAGS="dg.exp=gfortran.dg/deferred_type_param_6.f90 
> > --target_board='unix{-m32}'"
> > $ cd {build_dir}/gcc && make check 
> > RUNTESTFLAGS="dg.exp=gfortran.dg/deferred_type_param_6.f90 
> > --target_board='unix{-m32\ -march=cascadelake}'"
> > $ cd {build_dir}/gcc && make check 
> > RUNTESTFLAGS="dg.exp=gfortran.dg/deferred_type_param_6.f90 
> > --target_board='unix{-m64}'"
> > $ cd {build_dir}/gcc && make check 
> > RUNTESTFLAGS="dg.exp=gfortran.dg/deferred_type_param_6.f90 
> > --target_board='unix{-m64\ -march=cascadelake}'"
> >
> > (Please do not reply to this email, for question about this report, contact 
> > me at skpgkp2 at gmail dot com)


Re: libgfortran.so SONAME and powerpc64le-linux ABI changes

2021-10-04 Thread Richard Biener via Fortran
On Mon, Oct 4, 2021 at 12:08 PM Jakub Jelinek via Fortran
 wrote:
>
> Hi!
>
> On powerpc64le-linux target, one can select between two incompatible
> long double formats (both of them are 16-byte), __ibm128 which is
> a sum of two doubles, and __float128 (note, not implemented through
> libquadmath), which is IEEE754 quad format.  The default for
> long double can be selected with --with-long-double-format={ieee,ibm}
> configure options.
> AFAIK no distributions switched to --with-long-double-format=ieee
> yet (correct me if I'm wrong), but the goal is that eventually all
> distros switch to that (like they've switched from the --with-long-double-64
> default to --with-long-double-128 on powerpc64-linux, s390*-linux etc. years
> ago).
>
> libstdc++ has been changed already last year, so that the same
> libstdc++.so.6 is ABI compatible with both configurations, in C++ the
> IEEE quad long double mangles differently from IBM double double long
> double and so it is possible (with quite some work) to achieve that.
> Other C++ libraries not shipped as part of gcc are either lucky and don't
> use long double on any of its public APIs, then they are compatible with
> both, or developers can go the same painful way and support both ABIs in
> the same shared library, or they are simply ABI incompatible.
>
> But, I believe --with-long-double-format={ieee,ibm} configure time choice
> doesn't change just the meaning of long double, but also of real(kind=16)
> and complex(kind=32), but Fortran name mangling just appends _ and doesn't
> encode types.
> So, the choices for libgfortran.so.5 are either don't do anything, then
> we have from GCC the same SONAME but based on what 
> --with-long-double-format={ieee,ibm}
> distributions choose ABI incompatible libraries, or bump libgfortran
> SONAME in GCC 12 on all targets (the problem is that it unnecessarily
> changes the SONAME even on targets that don't really need it - unless
> there are important ABI changes in the queue for GCC 12 already), but
> that to be effective would basically require that all distros change to
> --with-long-double-format=ieee together with GCC 12, or change the
> SONAME only on powerpc64le somehow (still the problem that all distros
> have to change at once), or add some kind=16 suffix letter into the SONAME
> if configured --with-long-double-format=ieee (so we'd have
> libgfortran.so.5ieee or whatever, and when we'd bump to libgfortran.so.6
> on all arches libgfortran.so.6ieee etc.).
>
> Or the last option would be to try to make libgfortran.so.5 ABI compatible
> with both choices on powerpc64le-linux.  From quick skimming of libgfortran,
> we have lots of generated functions, which use HAVE_GFC_REAL_16 and
> GFC_REAL_16 etc. macros.  So, we could perhaps arrange for the compiler
> to use r16i or r17 instead of r16 in the names when real(kind=16) is the
> IEEE quad on powerpc64le and keep using r16 for the IBM double double.
> For the *.F90 generated files, one could achieve it by making sure
> the *r16* files are compiled with -mabi=ibmlongdouble, for
> *r16i* or *r17* with -mabi=ieeelongdouble and otherwise use kind=16 in
> those, for *.c generated files the *GFC_* macros could just ensure that
> it doesn't use long double but __ibm128 or __float128 depending on which one
> is needed.
> But then I see e.g. the io routines to just pass in kind and so
> switch (kind) // or len
>   {
>   case ...:
> *(GFC_REAL_*) = ...;
>   }
> etc.  Could we just pretend in the compiler to libgfortran ABI that
> powerpc64le-linux real(kind=16) is kind 17 and make sure that if anything
> would actually think it is 17 bytes it uses 16 instead (though, kind=10
> on x86-64 or i686 also isn't 10 bytes but 16 or 12, right?).
>
> Your thoughts on this?

How does glibc deal with this?  There's a load of long double ABI in there.

Richard.

>
> Jakub
>


Re: [RFH] ME optimizes variable assignment away / Fortran bind(C) descriptor conversion

2021-08-30 Thread Richard Biener via Fortran
On Sun, Aug 29, 2021 at 10:07 AM Tobias Burnus  wrote:
>
> Hi all, hi Richard,
>
> On 27.08.21 21:48, Richard Biener wrote:
> >> It looks really nice with "-O1 -fno-inline"   :-)
> >>The callee 'rank_p()' is mostly optimized and in the
> >>caller only those struct elements are set, which are used:
> >>
> >> integer(kind=4) rank_p (struct CFI_cdesc_t & _this)
> >> {
> >>_1 = _this_11(D)->base_addr;
> >>_2 = _this_11(D)->rank;
> >> ...
> >>rnk_13 = (integer(kind=4)) _2;
> >>return rnk_13;
> >> }
> >>
> >> void selr_p ()
> >> {
> >> ...
> >>struct CFI_cdesc_t01 cfi.7;
> >> ...
> >> [local count: 537730764]:
> >>cfi.7.rank = 1;
> >>cfi.7.base_addr = 0B;
> > You need to do the accesses above using the generic 't' type as well, 
> > otherwise they are non-conflicting TBAA wise.
>
> First, I wonder why the following works with C:
>
>   struct gen_t { int n; int dim[]; }
>
>   int f (struct gen_t *x) {
> if (x->n > 1) x->dim[0] = 5;
> return x->n;
>   }
>
>   void test() {
> struct { int n; int dim[2]; } y;
> y.n = 2;
> f ((struct gen_t*) );
>   }
>
> Or doesn't it?

It probably doesn't and suffers from the same issue as your
original fortran code.

>In any case, that's how it is suggested
> and 'y.n' is not accessed using 'gen_t' – there is only
> a cast in the function call. (Which is not sufficient
> in the Fortran FE-code generated code – as tried)
>
>   * * *
>
> Otherwise, I can confirm that the following works.
> Does this original dump now looks fine?
>
>  struct CFI_cdesc_t01 cfi.2;
> ...
>  ((struct CFI_cdesc_t *) )->version = 1;
>  ((struct CFI_cdesc_t *) )->rank = 1;
>  ((struct CFI_cdesc_t *) )->type = 1025;
>  ((struct CFI_cdesc_t *) )->attribute = 0;
>  ((struct CFI_cdesc_t *) )->base_addr = intp.data;
>  ((struct CFI_cdesc_t *) )->elem_len = 4;
> ...
>  irnk = rank_p ((struct CFI_cdesc_t *) );

Yes, that looks OK now.  The idea is you can use the complete
types for storage allocation but you _always_ have to use the
incomplete (with flexarray member) type for all accesses.

Richard.

> Thanks,
>
> Tobias
>
> -
> Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
> München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
> Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
> München, HRB 106955


Re: Failing tests after applying patches?

2021-08-24 Thread Richard Biener via Fortran
On Tue, Aug 24, 2021 at 8:47 AM Arjen Markus via Fortran
 wrote:
>
> Hi Tobias,
>
> thanks for these tips - this should come in handy indeed.
>
> One thing though: when I tried to run my freshly built gfortran compiler on
> one of the test programs, I got the message that it could not find the file
> libgfortran.spec. Is there some environment variable that muist be set?

If you are running gfortran from inside the build directory you'll need quite
some -B arguments, like

 objdir/gcc> ./gfortran -B ..//libgfortran -B
..//libgfortran/.libs -B ..//libquadmath/.libs

where the latter two are when you are also doing linking.   is
x86_64-pc-linux-gnu for me but likely sth with cygwin for you.

Richard.

> Regards,
>
> Arjen
>
> Op ma 23 aug. 2021 om 21:36 schreef Tobias Burnus :
>
> > Hi Arjen,
> >
> > On 23.08.21 20:59, Arjen Markus via Fortran wrote:
> > > as promised, here is an overview of the unexpectedly failing tests. I got
> > > these after applying the patches by Steve Kargl for bug ID 101951 and
> > > 101967. The platform I used to build it is Cygwin on WIndows 10.
> > >
> > > FAIL: gfortran.dg/analyzer/pr96949.f90   -O  (internal compiler error)
> > > FAIL: gfortran.dg/analyzer/pr96949.f90   -O  (test for excess errors)
> >
> > I recommend: https://gcc.gnu.org/pipermail/gcc-testresults/current – it
> > shows what others are getting.
> >
> > In particular, it helps: to ensure to look at the right branch (12.0
> > mainline), to look at  x86-64 Linux (as others tend to have some
> > additional issues) — and to make sure that that build actual does test
> > Fortran.
> >
> > But the simplest test is to undo your patches - recompile GCC and then
> > run (in the build directory):
> >
> > cd gcc; make check-fortran RUNTESTFLAGS="analyzer.exp=pr96949.f90"
> >
> > If the error still occurs, it is probably unrelated to the patch; if it
> > is gone, the patch probably caused it.
> >
> > I also do note that many analyzer commits have been committed today,
> > hence, it is a moving target. (It does work for me – with the current
> > checkout. But this does not tell anything about when you did your tests,
> > given that several commits were done this evening.)
> >
> > Tobias
> >
> >


Re: Build problems: mpfr, mpc

2021-08-20 Thread Richard Biener via Fortran
On Fri, Aug 20, 2021 at 12:55 PM Arjen Markus  wrote:
>
> Okay, that solved that error, now I get:
>
>  -Wall -Wextra -Wwrite-strings -Wcast-qual -Wabi=2 
> -fdiagnostics-show-location=once -ffunction-sections -fdata-sections 
> -frandom-seed=fs_ops.lo -fimplicit-templates -g -O2 -c 
> ../../../../../../libstdc++-v3/src/c++17/fs_ops.cc -o fs_ops.o
> In file included from ../../../../../../libstdc++-v3/src/c++17/fs_ops.cc:58:
> ../../../../../../libstdc++-v3/src/c++17/../filesystem/ops-common.h: In 
> function 'const char* 
> std::filesystem::get_temp_directory_from_env(std::error_code&)':
> ../../../../../../libstdc++-v3/src/c++17/../filesystem/ops-common.h:601:25: 
> error: '::secure_getenv' has not been declared
>   601 | auto tmpdir = ::secure_getenv(env);
>   | ^
>
> and there is a weird subdirectory under the build directory:  gccd: - the 
> colon cannot be part of a directory name under Windows. I guess this is my 
> mistake, as I set the install directory to "d:/gfortran/work" - that should 
> become "/cygdrive/d/gfortran/work". Easily fixed, but I do not know what to 
> do about the first error.

Possibly secure_getenv is not available in cygwin and for some reason
AC_CHECK_FUNCS (secure_getenv) as done by
libstdc++ configury misdetects it and/or the use site has a different
set of includes.  Jonathan might know.

Note using cygwin (or mingw for that) might make GCC development more
painful than necessary. It might be
that using a WSL2 based setup is easier if you're stuck to a Windows
host.  Using Linux in a virtual machine
might be another option of course.

Richard.

> Regards,
>
> Arjen
>
> Op vr 20 aug. 2021 om 12:10 schreef Arjen Markus :
>>
>>
>>
>> Op vr 20 aug. 2021 om 11:54 schreef Richard Biener:
>>>
>>>
>>>
>>> The easiest is probably to build them in-tree by means of
>>> executing ./contrib/download_prerequesites which will download
>>> and unpack them into your source tree.
>>>
>>
>> Well, I do have the libraries (source and all) and I copied the built 
>> libraries (that is the content of mpfr/,libs and mpc/.libs to the locations 
>> that are referred to in the link statement, so I assumed that would enable 
>> the linker to find them.
>>
>>> Other than that you are likely either missing some of the requirements
>>> or lack the appropriate --with-{gmp,mpc,mpfr}[-lib] configury.
>>>
>>
>> The odd thing is that the link statement as presented in the command window 
>> knows where to find the libraries. Here is the (hopefully) relevant part:
>>
>> g++ -std=c++11 -no-pie   -g -DIN_GCC -fno-exceptions -fno-rtti 
>> -fasynchronous-unwind-tables -W -Wall
>>   (...)  ../libcpp/libcpp.a ../libdecnumber/libdecnumber.a  
>> -L/cygdrive/d/gfortran/gcc/build-gcc/./gmp/.libs 
>> -L/cygdrive/d/gfortran/gcc/build-gcc/./mpfr/src/.libs 
>> -L/cygdrive/d/gfortran/gcc/build-gcc/./mpc/src/.libs -lmpc -lmpfr -lgmp   
>> -L./../zlib -lz  libcommon.a ../libcpp/libcpp.a  -liconv 
>> ../libbacktrace/.libs/libbacktrace.a ../libiberty/libiberty.a 
>> ../libdecnumber/libdecnumber.a
>> /usr/lib/gcc/x86_64-pc-cygwin/10/../../../../x86_64-pc-cygwin/bin/ld: cannot 
>> find -lmpc
>> /usr/lib/gcc/x86_64-pc-cygwin/10/../../../../x86_64-pc-cygwin/bin/ld: cannot 
>> find -lmpc
>> /usr/lib/gcc/x86_64-pc-cygwin/10/../../../../x86_64-pc-cygwin/bin/ld: cannot 
>> find -lmpfr
>> /usr/lib/gcc/x86_64-pc-cygwin/10/../../../../x86_64-pc-cygwin/bin/ld: cannot 
>> find -lmpfr
>> /usr/lib/gcc/x86_64-pc-cygwin/10/../../../../x86_64-pc-cygwin/bin/ld: cannot 
>> find -lmpc
>> /usr/lib/gcc/x86_64-pc-cygwin/10/../../../../x86_64-pc-cygwin/bin/ld: cannot 
>> find -lmpfr
>>
>>
>>> What host operating system are you using?
>>>
>> Cygwin on Windows 10
>>
>> Oops, now that I copied the relevant bit of the link command, I see what I 
>> did wrong: the libraries are under mpfr/./libs, not mpfr/src/.libs (and 
>> ditto for mpc).
>>
>> Okay, trying again!
>>
>> Thanks for the reply. This ought to do it.
>>
>> Regards,
>>
>> Arjen


Re: Build problems: mpfr, mpc

2021-08-20 Thread Richard Biener via Fortran
On Fri, Aug 20, 2021 at 9:59 AM Arjen Markus via Fortran
 wrote:
>
> I am trying to build the compiler suite to test the two patches Steve Kargl
> posted. But I run into a problem with the mpfr and mpc libraries: the
> linker claims it cannot find them.
>
> I checked this, in fist instance they were not present in the location they
> were assumed to be, but I had copies from an earlier build, so I copied
> them into the location indicated by the -L option. That does not help
> though: same error messages.
>
> How do I solve this?

The easiest is probably to build them in-tree by means of
executing ./contrib/download_prerequesites which will download
and unpack them into your source tree.

Other than that you are likely either missing some of the requirements
or lack the appropriate --with-{gmp,mpc,mpfr}[-lib] configury.

What host operating system are you using?

Richard.

> Regards,
>
> Arjen


Re: [PATCH] libgfortran : Use the libtool macro to determine libm availability.

2021-08-20 Thread Richard Biener via Fortran
On Thu, Aug 19, 2021 at 10:10 PM Iain Sandoe  wrote:
>
> Hi,
>
> A while ago had a report of build failure against a Darwin branch on
> the latest OS release.  This was because (temporarily) the symlink
> from libm.dylib => libSystem.dylib had been removed/omitted.
>
> libm is not needed on Darwin, and should not be added unconditionally
> even if that is (mostly) harmless since it is a symlink to libc.
>
> There could be cases where the addition was not completely harmless
> because the presentation of the symlink would cause the symbols exposed
> in libSystem to be considered ahead of ones presented in convenience
> libraries.
>
> tested on x86_64, i686-darwin, x86_64-linux,
> OK for master?

OK.

Richard.

> thanks
> Iain
>
> libgfortran/ChangeLog:
>
> * Makefile.am: Use configured libm availability.
> * Makefile.in: Regenerate.
> * configure: Regenerate.
> * configure.ac: Use libtool macro to find libm availability.
> * libgfortran.spec.in: Use configured libm availability.
> ---
>  libgfortran/Makefile.am |   2 +-
>  libgfortran/Makefile.in |   3 +-
>  libgfortran/configure   | 142 
>  libgfortran/configure.ac|   1 +
>  libgfortran/libgfortran.spec.in |   2 +-
>  5 files changed, 147 insertions(+), 3 deletions(-)
>
> diff --git a/libgfortran/Makefile.am b/libgfortran/Makefile.am
> index 8d104321567..6fc4b465c6e 100644
> --- a/libgfortran/Makefile.am
> +++ b/libgfortran/Makefile.am
> @@ -42,7 +42,7 @@ libgfortran_la_LINK = $(LINK) $(libgfortran_la_LDFLAGS)
>  libgfortran_la_LDFLAGS = -version-info `grep -v '^\#' 
> $(srcdir)/libtool-version` \
> $(LTLDFLAGS) $(LIBQUADLIB) ../libbacktrace/libbacktrace.la \
> $(HWCAP_LDFLAGS) \
> -   -lm $(extra_ldflags_libgfortran) \
> +   $(LIBM) $(extra_ldflags_libgfortran) \
> $(version_arg) -Wc,-shared-libgcc
>  libgfortran_la_DEPENDENCIES = $(version_dep) libgfortran.spec 
> $(LIBQUADLIB_DEP)
>
> diff --git a/libgfortran/configure.ac b/libgfortran/configure.ac
> index 523eb24bca1..513fd80b936 100644
> --- a/libgfortran/configure.ac
> +++ b/libgfortran/configure.ac
> @@ -260,6 +260,7 @@ AC_PROG_INSTALL
>  #AC_MSG_NOTICE([== Starting libtool configuration])
>  AC_LIBTOOL_DLOPEN
>  AM_PROG_LIBTOOL
> +AC_CHECK_LIBM
>  ACX_LT_HOST_FLAGS
>  AC_SUBST(enable_shared)
>  AC_SUBST(enable_static)
> diff --git a/libgfortran/libgfortran.spec.in b/libgfortran/libgfortran.spec.in
> index 95aa3f842a3..b870e78c151 100644
> --- a/libgfortran/libgfortran.spec.in
> +++ b/libgfortran/libgfortran.spec.in
> @@ -5,4 +5,4 @@
>  #
>
>  %rename lib liborig
> -*lib: @LIBQUADSPEC@ -lm %(libgcc) %(liborig)
> +*lib: @LIBQUADSPEC@  @LIBM@ %(libgcc) %(liborig)
> --
> 2.24.3 (Apple Git-128)
>
>


Re: [PATCH][v2] Adjust volatile handling of the operand scanner

2021-08-11 Thread Richard Biener via Fortran
On Wed, 11 Aug 2021, Eric Botcazou wrote:

> > So I'm leaning towards leaving build3 alone and fixing up frontends
> > as issues pop up.
> 
> FWIW fine with me.

OK, so I pushed the original change (reposted below).

Bootstrapped / tested on x86_64-unknown-linux-gnu.

Richard.

>From e5a23d54d189f3d160c82f770683288a15c3645e Mon Sep 17 00:00:00 2001
From: Richard Biener 
Date: Mon, 9 Aug 2021 13:12:08 +0200
Subject: [PATCH] Adjust volatile handling of the operand scanner
To: gcc-patc...@gcc.gnu.org

The GIMPLE SSA operand scanner handles COMPONENT_REFs that are
not marked TREE_THIS_VOLATILE but have a TREE_THIS_VOLATILE
FIELD_DECL as volatile.  That's inconsistent in how TREE_THIS_VOLATILE
testing on GENERIC refs works which requires operand zero of
component references to mirror TREE_THIS_VOLATILE to the ref
so that testing TREE_THIS_VOLATILE on the outermost reference
is enough to determine the volatileness.

The following patch thus removes FIELD_DECL scanning from
the GIMPLE SSA operand scanner, possibly leaving fewer stmts
marked as gimple_has_volatile_ops.

It shows we miss at least one case in the fortran frontend, though
there's a suspicious amount of COMPONENT_REF creation compared
to little setting of TREE_THIS_VOLATILE.  This fixes the FAIL
of gfortran.dg/volatile11.f90 that would otherwise occur.

Visually inspecting fortran/ reveals a bunch of likely to fix
cases but I don't know the constraints of 'volatile' uses in
the fortran language to assess whether some of these are not
necessary.

2021-08-09  Richard Biener  

gcc/
* tree-ssa-operands.c (operands_scanner::get_expr_operands):
Do not look at COMPONENT_REF FIELD_DECLs TREE_THIS_VOLATILE
to determine has_volatile_ops.

gcc/fortran/
* trans-common.c (create_common): Set TREE_THIS_VOLATILE on the
COMPONENT_REF if the field is volatile.
---
 gcc/fortran/trans-common.c | 9 +
 gcc/tree-ssa-operands.c| 7 +--
 2 files changed, 6 insertions(+), 10 deletions(-)

diff --git a/gcc/fortran/trans-common.c b/gcc/fortran/trans-common.c
index a11cf4c839e..7bcf18dc475 100644
--- a/gcc/fortran/trans-common.c
+++ b/gcc/fortran/trans-common.c
@@ -759,10 +759,11 @@ create_common (gfc_common_head *com, segment_info *head, 
bool saw_equiv)
   else
gfc_add_decl_to_function (var_decl);
 
-  SET_DECL_VALUE_EXPR (var_decl,
-  fold_build3_loc (input_location, COMPONENT_REF,
-   TREE_TYPE (s->field),
-   decl, s->field, NULL_TREE));
+  tree comp = build3_loc (input_location, COMPONENT_REF,
+ TREE_TYPE (s->field), decl, s->field, NULL_TREE);
+  if (TREE_THIS_VOLATILE (s->field))
+   TREE_THIS_VOLATILE (comp) = 1;
+  SET_DECL_VALUE_EXPR (var_decl, comp);
   DECL_HAS_VALUE_EXPR_P (var_decl) = 1;
   GFC_DECL_COMMON_OR_EQUIV (var_decl) = 1;
 
diff --git a/gcc/tree-ssa-operands.c b/gcc/tree-ssa-operands.c
index c15575416dd..ebf7eea3b04 100644
--- a/gcc/tree-ssa-operands.c
+++ b/gcc/tree-ssa-operands.c
@@ -834,12 +834,7 @@ operands_scanner::get_expr_operands (tree *expr_p, int 
flags)
get_expr_operands (_OPERAND (expr, 0), flags);
 
if (code == COMPONENT_REF)
- {
-   if (!(flags & opf_no_vops)
-   && TREE_THIS_VOLATILE (TREE_OPERAND (expr, 1)))
- gimple_set_has_volatile_ops (stmt, true);
-   get_expr_operands (_OPERAND (expr, 2), uflags);
- }
+ get_expr_operands (_OPERAND (expr, 2), uflags);
else if (code == ARRAY_REF || code == ARRAY_RANGE_REF)
  {
get_expr_operands (_OPERAND (expr, 1), uflags);
-- 
2.31.1



Re: [PATCH][v2] Adjust volatile handling of the operand scanner

2021-08-11 Thread Richard Biener via Fortran
On Wed, 11 Aug 2021, Richard Biener wrote:

> On Tue, 10 Aug 2021, Eric Botcazou wrote:
> 
> > > The question is whether we instead want to amend build3 to
> > > set TREE_THIS_VOLATILE automatically when the FIELD_DECL has
> > > it set.  At least for the Fortran FE cases the gimplifier
> > > fails to see some volatile references and thus can generate
> > > wrong code which is a latent issue.
> > 
> > What do we do for other similar flags, e.g. TREE_READONLY?
> 
> build3 currently does no special processing for the FIELD_DECL operand,
> it just sets TREE_THIS_VOLATILE from operand zero for tcc_references.
> 
> The C and C++ frontends have repeated patterns like
> 
>   ref = build3 (COMPONENT_REF, subtype, datum, subdatum,
> NULL_TREE);
>   SET_EXPR_LOCATION (ref, loc);
>   if (TREE_READONLY (subdatum)
>   || (use_datum_quals && TREE_READONLY (datum)))
> TREE_READONLY (ref) = 1;
>   if (TREE_THIS_VOLATILE (subdatum)
>   || (use_datum_quals && TREE_THIS_VOLATILE (datum)))
> TREE_THIS_VOLATILE (ref) = 1;
> 
> Leaving out TREE_READONLY shouldn't have any correctness issue.  It's
> just that when adjusting the SSA operand scanner to correctly interpret
> GENERIC that this uncovers pre-existing issues in the Fortran frontend
> (one manifests in a testsuite FAIL - otherwise I wouldn't have noticed).
> 
> I'm fine with just plugging the Fortran FE holes as we discover them
> but I did not check other frontends and testsuite coverage is weak.
> 
> Now - I wonder if there's a reason a frontend might _not_ want to
> set TREE_THIS_VOLATILE on a COMPONENT_REF when the FIELD_DECL has
> TREE_THIS_VOLATILE set.
> 
> I guess I'll do one more experiment and add verification that
> TREE_THIS_VOLATILE on COMPONENT_REFs and FIELD_DECLs is consistent
> and see where that trips.

It trips for

struct X { volatile int i; };

void foo ()
{
  struct X x = (struct X){ .i = 0 };
}

where the gimplifier in gimplify_init_ctor_eval does

  gcc_assert (TREE_CODE (purpose) == FIELD_DECL);
  cref = build3 (COMPONENT_REF, TREE_TYPE (purpose),
 unshare_expr (object), purpose, NULL_TREE);

producing

  x.i = 0;

that is not volatile qualified.  This manifests itself during the
build of libasan.  I'm not sure whether the gimplifiers doing is
correct or not.  Changing build3 would alter the behavior here.

Then there's a case where the COMPONENT_REF is TREE_THIS_VOLATILE
but neither the FIELD_DECL nor the base reference is.  This
trips during libtsan build and again is from gimplification/folding,
this time gimplify_modify_expr_rhs doing

case INDIRECT_REF:
  {
/* If we have code like

 *(const A*)(A*)

 where the type of "x" is a (possibly cv-qualified variant
 of "A"), treat the entire expression as identical to "x".
 This kind of code arises in C++ when an object is bound
 to a const reference, and if "x" is a TARGET_EXPR we want
 to take advantage of the optimization below.  */
bool volatile_p = TREE_THIS_VOLATILE (*from_p);
tree t = gimple_fold_indirect_ref_rhs (TREE_OPERAND (*from_p, 
0));
if (t)
  {
if (TREE_THIS_VOLATILE (t) != volatile_p)
  {
if (DECL_P (t))
  t = build_simple_mem_ref_loc (EXPR_LOCATION 
(*from_p),
build_fold_addr_expr 
(t));
if (REFERENCE_CLASS_P (t))
  TREE_THIS_VOLATILE (t) = volatile_p;

I suppose that's OK, it's folding volatile
*(void (*__sanitizer_sighandler_ptr) (int) *) >D.5368.handler
to act->D.5368.handler which wouldn't be volatile.

The opposite could happen, too, of course - casting away volatileness
for an access but letting that slip through verification would make
it moot.  So ...

With those cases fixed bootstrap runs through and testing reveals
no additional issues apart from the already known
gfortran.dg/volatile11.f90

So I'm leaning towards leaving build3 alone and fixing up frontends
as issues pop up.

Ricahrd.


Re: [PATCH][v2] Adjust volatile handling of the operand scanner

2021-08-11 Thread Richard Biener via Fortran
On Tue, 10 Aug 2021, Eric Botcazou wrote:

> > The question is whether we instead want to amend build3 to
> > set TREE_THIS_VOLATILE automatically when the FIELD_DECL has
> > it set.  At least for the Fortran FE cases the gimplifier
> > fails to see some volatile references and thus can generate
> > wrong code which is a latent issue.
> 
> What do we do for other similar flags, e.g. TREE_READONLY?

build3 currently does no special processing for the FIELD_DECL operand,
it just sets TREE_THIS_VOLATILE from operand zero for tcc_references.

The C and C++ frontends have repeated patterns like

  ref = build3 (COMPONENT_REF, subtype, datum, subdatum,
NULL_TREE);
  SET_EXPR_LOCATION (ref, loc);
  if (TREE_READONLY (subdatum)
  || (use_datum_quals && TREE_READONLY (datum)))
TREE_READONLY (ref) = 1;
  if (TREE_THIS_VOLATILE (subdatum)
  || (use_datum_quals && TREE_THIS_VOLATILE (datum)))
TREE_THIS_VOLATILE (ref) = 1;

Leaving out TREE_READONLY shouldn't have any correctness issue.  It's
just that when adjusting the SSA operand scanner to correctly interpret
GENERIC that this uncovers pre-existing issues in the Fortran frontend
(one manifests in a testsuite FAIL - otherwise I wouldn't have noticed).

I'm fine with just plugging the Fortran FE holes as we discover them
but I did not check other frontends and testsuite coverage is weak.

Now - I wonder if there's a reason a frontend might _not_ want to
set TREE_THIS_VOLATILE on a COMPONENT_REF when the FIELD_DECL has
TREE_THIS_VOLATILE set.

I guess I'll do one more experiment and add verification that
TREE_THIS_VOLATILE on COMPONENT_REFs and FIELD_DECLs is consistent
and see where that trips.

Richard.


Re: [PATCH][v2] Adjust volatile handling of the operand scanner

2021-08-10 Thread Richard Biener via Fortran
On Tue, Aug 10, 2021 at 1:41 PM Richard Biener via Gcc-patches
 wrote:
>
> The GIMPLE SSA operand scanner handles COMPONENT_REFs that are
> not marked TREE_THIS_VOLATILE but have a TREE_THIS_VOLATILE
> FIELD_DECL as volatile.  That's inconsistent in how TREE_THIS_VOLATILE
> testing on GENERIC refs works which requires operand zero of
> component references to mirror TREE_THIS_VOLATILE to the ref
> so that testing TREE_THIS_VOLATILE on the outermost reference
> is enough to determine the volatileness.
>
> The following patch thus removes FIELD_DECL scanning from
> the GIMPLE SSA operand scanner, possibly leaving fewer stmts
> marked as gimple_has_volatile_ops.
[...]

> The question is whether we instead want to amend build3 to
> set TREE_THIS_VOLATILE automatically when the FIELD_DECL has
> it set.

A change along that line also passes bootstrap and regtest.

Any preference?

Thanks,
Richard.

>  At least for the Fortran FE cases the gimplifier
> fails to see some volatile references and thus can generate
> wrong code which is a latent issue.
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu.
>
> OK?
>
> Thanks,
> Richard.
>
> 2021-08-09  Richard Biener  
>
> gcc/
> * tree-ssa-operands.c (operands_scanner::get_expr_operands):
> Do not look at COMPONENT_REF FIELD_DECLs TREE_THIS_VOLATILE
> to determine has_volatile_ops.
>
> gcc/fortran/
> * trans-common.c (create_common): Set TREE_THIS_VOLATILE on the
> COMPONENT_REF if the field is volatile.
> ---
>  gcc/fortran/trans-common.c | 9 +
>  gcc/tree-ssa-operands.c| 7 +--
>  2 files changed, 6 insertions(+), 10 deletions(-)
>
> diff --git a/gcc/fortran/trans-common.c b/gcc/fortran/trans-common.c
> index a11cf4c839e..7bcf18dc475 100644
> --- a/gcc/fortran/trans-common.c
> +++ b/gcc/fortran/trans-common.c
> @@ -759,10 +759,11 @@ create_common (gfc_common_head *com, segment_info 
> *head, bool saw_equiv)
>else
> gfc_add_decl_to_function (var_decl);
>
> -  SET_DECL_VALUE_EXPR (var_decl,
> -  fold_build3_loc (input_location, COMPONENT_REF,
> -   TREE_TYPE (s->field),
> -   decl, s->field, NULL_TREE));
> +  tree comp = build3_loc (input_location, COMPONENT_REF,
> + TREE_TYPE (s->field), decl, s->field, 
> NULL_TREE);
> +  if (TREE_THIS_VOLATILE (s->field))
> +   TREE_THIS_VOLATILE (comp) = 1;
> +  SET_DECL_VALUE_EXPR (var_decl, comp);
>DECL_HAS_VALUE_EXPR_P (var_decl) = 1;
>GFC_DECL_COMMON_OR_EQUIV (var_decl) = 1;
>
> diff --git a/gcc/tree-ssa-operands.c b/gcc/tree-ssa-operands.c
> index c15575416dd..ebf7eea3b04 100644
> --- a/gcc/tree-ssa-operands.c
> +++ b/gcc/tree-ssa-operands.c
> @@ -834,12 +834,7 @@ operands_scanner::get_expr_operands (tree *expr_p, int 
> flags)
> get_expr_operands (_OPERAND (expr, 0), flags);
>
> if (code == COMPONENT_REF)
> - {
> -   if (!(flags & opf_no_vops)
> -   && TREE_THIS_VOLATILE (TREE_OPERAND (expr, 1)))
> - gimple_set_has_volatile_ops (stmt, true);
> -   get_expr_operands (_OPERAND (expr, 2), uflags);
> - }
> + get_expr_operands (_OPERAND (expr, 2), uflags);
> else if (code == ARRAY_REF || code == ARRAY_RANGE_REF)
>   {
> get_expr_operands (_OPERAND (expr, 1), uflags);
> --
> 2.31.1


[PATCH][v2] Adjust volatile handling of the operand scanner

2021-08-10 Thread Richard Biener via Fortran
The GIMPLE SSA operand scanner handles COMPONENT_REFs that are
not marked TREE_THIS_VOLATILE but have a TREE_THIS_VOLATILE
FIELD_DECL as volatile.  That's inconsistent in how TREE_THIS_VOLATILE
testing on GENERIC refs works which requires operand zero of
component references to mirror TREE_THIS_VOLATILE to the ref
so that testing TREE_THIS_VOLATILE on the outermost reference
is enough to determine the volatileness.

The following patch thus removes FIELD_DECL scanning from
the GIMPLE SSA operand scanner, possibly leaving fewer stmts
marked as gimple_has_volatile_ops.

It shows we miss at least one case in the fortran frontend, though
there's a suspicious amount of COMPONENT_REF creation compared
to little setting of TREE_THIS_VOLATILE.  This fixes the FAIL
of gfortran.dg/volatile11.f90 that would otherwise occur.

Visually inspecting fortran/ reveals a bunch of likely to fix
cases but I don't know the constraints of 'volatile' uses in
the fortran language to assess whether some of these are not
necessary.  The cases would be fixed with

diff --git a/gcc/fortran/trans-array.c b/gcc/fortran/trans-array.c
index 0d013defdbb..5d708fe90aa 100644
--- a/gcc/fortran/trans-array.c
+++ b/gcc/fortran/trans-array.c
@@ -6983,9 +6983,10 @@ gfc_get_dataptr_offset (stmtblock_t *block, tree parm, 
tree desc, tree offset,
case REF_COMPONENT:
  field = ref->u.c.component->backend_decl;
  gcc_assert (field && TREE_CODE (field) == FIELD_DECL);
- tmp = fold_build3_loc (input_location, COMPONENT_REF,
-TREE_TYPE (field),
-tmp, field, NULL_TREE);
+ tmp = build3_loc (input_location, COMPONENT_REF,
+   TREE_TYPE (field), tmp, field, NULL_TREE);
+ if (TREE_THIS_VOLATILE (field))
+   TREE_THIS_VOLATILE (tmp) = 1;
  break;

case REF_SUBSTRING:
diff --git a/gcc/fortran/trans-expr.c b/gcc/fortran/trans-expr.c
index c4291cce079..e6dc79f8c1e 100644
--- a/gcc/fortran/trans-expr.c
+++ b/gcc/fortran/trans-expr.c
@@ -2244,9 +2244,11 @@ gfc_get_tree_for_caf_expr (gfc_expr *expr)

if (POINTER_TYPE_P (TREE_TYPE (caf_decl)))
  caf_decl = build_fold_indirect_ref_loc (input_location, caf_decl);
-   caf_decl = fold_build3_loc (input_location, COMPONENT_REF,
-   TREE_TYPE (comp->backend_decl), caf_decl,
-   comp->backend_decl, NULL_TREE);
+   caf_decl = build3_loc (input_location, COMPONENT_REF,
+  TREE_TYPE (comp->backend_decl), caf_decl,
+  comp->backend_decl, NULL_TREE);
+   if (TREE_THIS_VOLATILE (comp->backend_decl))
+ TREE_THIS_VOLATILE (caf_decl) = 1;
if (comp->ts.type == BT_CLASS)
  {
caf_decl = gfc_class_data_get (caf_decl);
@@ -2755,8 +2757,10 @@ gfc_conv_component_ref (gfc_se * se, gfc_ref * ref)
   else
 se->class_vptr = NULL_TREE;

-  tmp = fold_build3_loc (input_location, COMPONENT_REF, TREE_TYPE (field),
-decl, field, NULL_TREE);
+  tmp =build3_loc (input_location, COMPONENT_REF, TREE_TYPE (field),
+  decl, field, NULL_TREE);
+  if (TREE_THIS_VOLATILE (field))
+TREE_THIS_VOLATILE (tmp) = 1;

   se->expr = tmp;

@@ -8792,8 +8796,10 @@ gfc_trans_structure_assign (tree dest, gfc_expr * expr, 
bool init, bool coarray)
}
   field = cm->backend_decl;
   gcc_assert(field);
-  tmp = fold_build3_loc (input_location, COMPONENT_REF, TREE_TYPE (field),
-dest, field, NULL_TREE);
+  tmp = build3_loc (input_location, COMPONENT_REF, TREE_TYPE (field),
+   dest, field, NULL_TREE);
+  if (TREE_THIS_VOLATILE (field))
+   TREE_THIS_VOLATILE (tmp) = 1;
   if (!c->expr)
{
  gfc_expr *e = gfc_get_null_expr (NULL);

but I did not include them as they have no effect on the testsuite.

The question is whether we instead want to amend build3 to
set TREE_THIS_VOLATILE automatically when the FIELD_DECL has
it set.  At least for the Fortran FE cases the gimplifier
fails to see some volatile references and thus can generate
wrong code which is a latent issue.

Bootstrapped and tested on x86_64-unknown-linux-gnu.

OK?

Thanks,
Richard.

2021-08-09  Richard Biener  

gcc/
* tree-ssa-operands.c (operands_scanner::get_expr_operands):
Do not look at COMPONENT_REF FIELD_DECLs TREE_THIS_VOLATILE
to determine has_volatile_ops.

gcc/fortran/
* trans-common.c (create_common): Set TREE_THIS_VOLATILE on the
COMPONENT_REF if the field is volatile.
---
 gcc/fortran/trans-common.c | 9 +
 gcc/tree-ssa-operands.c| 7 +--
 2 files changed, 6 insertions(+), 10 deletions(-)

diff --git a/gcc/fortran/trans-common.c b/gcc/fortran/trans-common.c
index a11cf4c839e..7bcf18dc475 100644
--- 

Re: Fix Fortran rounding issues, PR fortran/96983.

2021-04-22 Thread Richard Biener via Fortran
On April 22, 2021 9:09:28 PM GMT+02:00, Michael Meissner 
 wrote:
>On Wed, Apr 21, 2021 at 10:10:07AM +0200, Tobias Burnus wrote:
>> On 20.04.21 08:58, Richard Biener via Fortran wrote:
>> >On Mon, Apr 19, 2021 at 9:40 PM Michael Meissner via Fortran
>> >  wrote:
>> Is there any reason to not only send the email to fortran@ _and_
>> gcc-patches@ but sending it to 13 Fortran maintainers explicitly?
>(Now
>> removed)
>
>Sorry about that.  With PowerPC backend changes, I generally do
>explicitly add
>the maintainers so things don't get lost.
>
>
>> >>Fix Fortran rounding issues, PR fortran/96983.
>> >>
>> >>Can I check this change into the GCC trunk?
>> >The patch looks fine technically and is definitely an improvement
>since the
>> >intermediate conversion looks odd.  But it might be that the
>standard
>> >requires such dance though the preceeding cases handled don't seem
>to
>> >care.  I'm thinking of a FP format where round(1.6) == 3 because of
>lack
>> >of precision but using an intermediate FP format with higher
>precision
>> >would "correctly" compute 2.
>> 
>> The patched build_round_expr is only called by ANINT / NINT;
>> NINT is real → integer; ANINT is real → real
>> [And the modified code is only called for NINT, reason: see comment
>far below.]
>> 
>> NINT (A[, KIND]) is described (F2018) as "Nearest integer":
>> * Result Characteristics. Integer. If KIND is present, the kind type
>parameter
>>   is that specified by the value of KIND;
>>   otherwise, the kind type parameter is that of default integer type.
>> * The result is the integer nearest A, or if there are two
>>   integers equally near A, the result is whichever such integer has
>the greater
>>   magnitude.
>> * Example. NINT (2.783) has the value 3.
>> 
>> ANINT (A[, KIND]) as "Nearest whole number":
>> * The result is of type real. If KIND is present, the kind type
>parameter is that
>>   specified by the value of KIND; otherwise, the kind type parameter
>is that of A.
>> * The result is the integer nearest A, or if there are two integers
>equally near A,
>>   the result is whichever such integer has the greater magnitude.
>> * Examples. ANINT (2.783) has the value 3.0. ANINT (−2.783) has the
>value −3.0.
>> 
>> >Of course the current code doesn't handle this correctly for the
>> >case if llroundf either.
>> >>I've not contributed to the Fortran front end before.  If the
>maintainers like
>> >>the patch, can somebody point out if I need to do additional things
>to commit
>> >>the patch?
>> Nothing special: a testcase already exists, committing is done as
>usual
>> and a PR to update you have as well.
>
>Given GCC 11 has branched, is it ok to backport the patch to the GCC 11
>branch
>as well?  I assume it is, since it fixes a regression in the compiler.

Please wait until after 11.1 is released. 

Thanks, 
Richard. 



Re: Fix Fortran rounding issues, PR fortran/96983.

2021-04-20 Thread Richard Biener via Fortran
On Mon, Apr 19, 2021 at 9:40 PM Michael Meissner via Fortran
 wrote:
>
> Fix Fortran rounding issues, PR fortran/96983.
>
> I was looking at Fortran PR 96983, which fails on the PowerPC when trying to
> run the test PR96711.F90.  The compiler ICEs because the PowerPC does not have
> a floating point type with a type precision of 128.  The reason is that the
> PowerPC has 3 different 128 bit floating point types (__float128/_Float128,
> __ibm128, and long double).  Currently long double uses the IBM extended 
> double
> type, but we would like to switch to using IEEE 128-bit long doubles in the
> future.
>
> In order to prevent the compiler from converting explicit __ibm128 types to
> long double when long double uses the IEEE 128-bit representation, we have set
> up the precision for __ibm128 to be 128, long double to be 127, and
> __float128/_Float128 to be 126.
>
> Originally, I was trying to see if for Fortran, I could change the precision 
> of
> long double to be 128 (Fortran doesn't access __ibm128), but it quickly became
> hard to get the changes to work.
>
> I looked at the Fortran code in build_round_expr, and I came to the conclusion
> that there is no reason to promote the floating point type.  If you just do a
> normal round of the value using the current floating point format and then
> convert it to the integer type.  We don't have an appropriate built-in 
> function
> that provides the equivalent of llround for 128-bit integer types.
>
> This patch fixes the compiler crash.
>
> However, while with this patch, the PowerPC compiler will not crash when
> building the test case, it will not run on the current default installation.
> The failure is because the test is explicitly expecting 128-bit floating point
> to handle 10384593717069655257060992658440192_16 (i.e. 2**113).
>
> By default, the PowerPC uses IBM extended double used for 128-bit floating
> point.  The IBM extended double format is a pair of doubles that provides more
> mantissa bits but does not grow the expoenent range.  The value in the test is
> fine for IEEE 128-bit floating point, but it is too large for the PowerPC
> extended double setup.
>
> I have built the following tests with this patch:
>
>* I have built a bootstrap compiler on a little endian power9 Linux system
>  with the default long double format (IBM extended double).  The
>  pr96711.f90 test builds, but it does not run due to the range of the
>  real*16 exponent.  There were no other regressions in the C/C++/Fortran
>  tests.
>
>* I have built a bootstrap compiler on a little endian power9 Linux system
>  with the default long double format set to IEEE 128-bit. I used the
>  Advance Toolchain 14.0-2 to provide the IEEE 128-bits.  The compiler was
>  configured to build power9 code by default, so the test generated native
>  power9 IEEE 128-bit instructions.  The pr96711.f90 test builds and runs
>  correctly in this setup.
>
>* I have built a bootstrap compiler on a big endian power8 Linux system 
> with
>  the default long double format (IBM extended double).  Like the first
>  case, the pr96711.f90 test does not crash the compiler, but the test 
> fails
>  due to the range of the real*16 exponent.There were no other
>  regressions in the C/C++/Fortran tests.
>
>* I built a bootstrap compiler on my x86_64 laptop.  There were no
>  regressions in the tests.
>
>
> Can I check this change into the GCC trunk?

The patch looks fine technically and is definitely an improvement since the
intermediate conversion looks odd.  But it might be that the standard
requires such dance though the preceeding cases handled don't seem to
care.  I'm thinking of a FP format where round(1.6) == 3 because of lack
of precision but using an intermediate FP format with higher precision
would "correctly" compute 2.

Of course the current code doesn't handle this correctly for the
case if llroundf either.

Richard.

> I've not contributed to the Fortran front end before.  If the maintainers like
> the patch, can somebody point out if I need to do additional things to commit
> the patch?
>
> gcc/fortran/
> 2021-04-19  Michael Meissner  
>
> PR gfortran/96983
> * trans-intrinsic.c (build_round_expr): If int type is larger than
> long long, do the round and convert to the integer type.  Do not
> try to find a floating point type the exact size of the integer
> type.
> ---
>  gcc/fortran/trans-intrinsic.c | 26 --
>  1 file changed, 8 insertions(+), 18 deletions(-)
>
> diff --git a/gcc/fortran/trans-intrinsic.c b/gcc/fortran/trans-intrinsic.c
> index 5e53d1162fa..cceef8f34ac 100644
> --- a/gcc/fortran/trans-intrinsic.c
> +++ b/gcc/fortran/trans-intrinsic.c
> @@ -386,30 +386,20 @@ build_round_expr (tree arg, tree restype)
>argprec = TYPE_PRECISION (argtype);
>resprec = TYPE_PRECISION (restype);
>
> -  /* Depending on the type of the result, choose 

Re: MATMUL broken with frontend optimization.

2021-03-18 Thread Richard Biener via Fortran
On Thu, Mar 18, 2021 at 3:48 PM Tobias Burnus  wrote:
>
> Richard,
>
> On 18.03.21 13:35, Richard Biener via Fortran wrote:
> > [...]
> > Since the libgfortran MATMUL should be vectorized
> > I think it's not reasonable to inline any but _very_ small
> > MATMUL at optimization levels that do not enable vectorization.
>
> Besides the obvious if (!flag_external_blas) which should always prevent
> inlining (possibly except for tiny N like N=1), your idea is 'if (N
> small || flag_tree_loop_vectorize)'?
>
> Or are you thinking of a different or additional flag_... than
> flag_tree_loop_vectorize for making this choice?

Yes, I was thinking of flag_tree_loop_vectorize.  Of course libgfortran
is far from having micro-optimized matmul for various architectures
but IIRC it uses attribute(target) to provide several overloads.  So
maybe only ever inlining tiny matmul makes sense as well (does the
runtime have specializations for small sizes?)

Richard.

> Tobias
>
> -
> Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München 
> Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Frank 
> Thürauf


Re: MATMUL broken with frontend optimization.

2021-03-18 Thread Richard Biener via Fortran
On Thu, Mar 18, 2021 at 8:49 AM Steve Kargl via Fortran
 wrote:
>
> It seems that gfortran will inline MATMUL with optimization.
> This  produce very poor performance.  In fact, gfortran will
> inline MATMUL even if one specifies -fexternal-blas.  This is
> very bad.
>
> % cat a.f90
> program main
>
>implicit none
>
>integer, parameter :: imax = 2, jmax = 1
>real, allocatable :: inVect(:), matrix(:,:), outVect(:)
>real :: start, finish
>
>allocate(invect(imax), matrix(imax,jmax), outvect(jmax))
>
>call random_number(inVect)
>call random_number(matrix)
>
>call cpu_time(start)
>outVect = matmul(inVect, matrix)
>call cpu_time(finish)
>
>print '("Time = ",f10.7," seconds. – First Value = 
> ",f10.4)',finish-start,outVect(1)
> end program main
>
> % gfcx -o z -O0 a.f90 && ./z
> Time =  0.2234111 seconds. – First Value =  4982.6362
> % nm z | grep matmul
>  U _gfortran_matmul_r4@@GFORTRAN_8
> % gfcx -o z -O1 a.f90 && ./z
> Time =  0.3295890 seconds. – First Value =  4971.0962
> % nm z | grep matmul
> % gfcx -o z -O2 a.f90 && ./z
> Time =  0.3299561 seconds. – First Value =  5025.4902
> % nm z | grep matmul
> % gfcx -o z -O2 -fexternal-blas a.f90 && ./z
> Time =  0.3295580 seconds. – First Value =  5022.8291
>
> This last one is definitely broken.  I did not link with
> an external BLAS library.  Please fix before 11.1 is
> released.

Since the libgfortran MATMUL should be vectorized
I think it's not reasonable to inline any but _very_ small
MATMUL at optimization levels that do not enable vectorization.

Richard.

>
> --
> Steve


Re:

2021-03-10 Thread Richard Biener via Fortran
On Wed, Mar 10, 2021 at 8:39 PM mscfd via Fortran  wrote:
>
> > which version of gfortran, and which operating system?
> I have seen this on two different Linux distros on x86 with a recently 
> compiled version, but also some time ago with an older gfortran 10 version.
>
> Using helgrind on a simple omp do loop with write to a character variable, I 
> get some possible data races in Libgfortran/io/unit.c. There a newunits array 
> is allocated and possibly reallocated in "newunit_alloc". According to the 
> lock outputs from helgrind I see that this routine is called even if output 
> into character variable is done. Now "newunit_alloc" uses a lock to avoid 
> having several thread all over the place. But newunit_free also writes to 
> newunits array. And this routine does not obtain a lock itself (see comment 
> in unit.c) So in theory it can happen that newunit_alloc reallocated 
> newunits, and newunit_free writes to it just at this time. As I also use 18 
> threads the initial size of 16 does not suffice and reallocation does 
> probably indeed happen.
> Also acces to newunit_lwi is not protected as well (and complained about by 
> helgrind).
>
> Could it be that the corresponding write routine in transfer.c which calls 
> newunit_free does not obtain the necessary lock. I cannot find it (which does 
> not count much).
>
> Any thoughts?

try obtaining the locks around the places you figured and see if your
problem goes away?

> Martin