from:"Toon Moene"

Re: Zen5 tuning part 2: disable gather and scatter

2024-09-04 Thread Toon Moene


On 9/4/24 12:55, Jan Hubicka wrote:


On 9/3/24 15:07, Jan Hubicka wrote:


Hi,
We disable gathers for zen4.  It seems that gather has improved a bit compared
to zen4, and the Zen5 optimization manual suggests "Avoid GATHER instructions
when the indices are known ahead of time. Vector loads followed by shuffles
result in a higher load bandwidth."  However, the situation seems to be more
complicated.


A small bit of "real world" experience (but for zen3):

Recently I switched to gfortran 14.2 for my weather forecasting.
A year ago I had changed "-march=native -mtune=native" (on my zen3 system)
to "-march=native -mtune=znver2" while using gfortran 13 - it had only a
small effect (but positive).

Last Monday I switched back to "-march=native -mtune=native", but that
consistently made a 12 hour computation around 6 minutes slower (i.e., about
1/120th, or 0.8 %). The most computationally intensive part of the code needs
gather (either instructions or inline expansions of them).


It would be nice to know what is causing this. Gathers can be enabled
using -mtune-ctrl=use_gather and I would be happy to know about real
world situations where they help.


Ah - one detail that I forgot to mention: our code is "special" in the 
sense that it uses 32-bit floats while it runs on 64-bit address space.


So its use of gather instructions is rather suboptimal, needing 2 gather 
instructions for each actual "gather operation".
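
A rough sketch of the arithmetic behind the numbers in this message. It assumes 256-bit vectors (the message does not say which vector width the build uses) and reads "2 gather instructions" as a consequence of 64-bit indices filling only half the lanes of a vector of 32-bit floats. Both are assumptions, and the helper names are hypothetical.

```python
# Back-of-the-envelope numbers for the slowdown and the gather count.
# Assumption (not stated in the message): 256-bit vector registers.

VECTOR_BITS = 256   # assumed vector register width
FLOAT_BITS = 32     # the code uses 32-bit floats
INDEX_BITS = 64     # indices/addresses are 64 bits wide


def gathers_per_vector(vector_bits, data_bits, index_bits):
    """Gather instructions needed to fill one vector of data elements
    when each gather takes its indices from one vector of that width."""
    data_lanes = vector_bits // data_bits     # floats per vector
    index_lanes = vector_bits // index_bits   # indices per vector
    # Ceiling division: one gather fills at most index_lanes lanes.
    return -(-data_lanes // index_lanes)


def slowdown_percent(extra_minutes, total_hours):
    """Relative slowdown in percent."""
    return 100.0 * extra_minutes / (total_hours * 60.0)


print(gathers_per_vector(VECTOR_BITS, FLOAT_BITS, INDEX_BITS))  # 2
print(round(slowdown_percent(6, 12), 2))                        # 0.83
```

With 32-bit indices the same helper returns 1, which is the case a gather instruction is best suited to.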


Hope this helps,

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands

Re: Zen5 tuning part 2: disable gather and scatter

2024-09-04 Thread Toon Moene


On 9/3/24 15:07, Jan Hubicka wrote:


Hi,
We disable gathers for zen4.  It seems that gather has improved a bit compared
to zen4, and the Zen5 optimization manual suggests "Avoid GATHER instructions
when the indices are known ahead of time. Vector loads followed by shuffles
result in a higher load bandwidth."  However, the situation seems to be more
complicated.


A small bit of "real world" experience (but for zen3):

Recently I switched to gfortran 14.2 for my weather forecasting.
A year ago I had changed "-march=native -mtune=native" (on my zen3 
system) to "-march=native -mtune=znver2" while using gfortran 13 - it 
had only a small effect (but positive).


Last Monday I switched back to "-march=native -mtune=native", but that 
consistently made a 12 hour computation around 6 minutes slower (i.e., 
about 1/120th, or 0.8 %). The most computationally intensive part of the 
code needs gather (either instructions or inline expansions of them).


Hope this helps,

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands

Re: [PATCH][v2] Enhance if-conversion for automatic arrays

2024-06-19 Thread Toon Moene


On 6/19/24 21:06, Richard Biener wrote:




On 19.06.2024 at 20:25, Toon Moene wrote:

On 6/17/24 16:05, Richard Biener wrote:


Automatic arrays that are not address-taken should not be subject to
store data races.  This applies to OMP SIMD in-branch lowered
functions result array which for the testcase otherwise prevents
vectorization with SSE and for AVX and AVX512 ends up with spurious
.MASK_STORE to the stack surviving.


Does this also apply to "automatic arrays" as defined by the Fortran Standard 
(see https://j3-fortran.org/doc/year/23/23-007r1.pdf, page 104), i.e., outside of the 
OMP_SIMD construct?

In gfortran, when using the option -fstack-arrays, they are assigned memory 
space on the stack.


I’d say yes though the likelihood those are address taken and thus not 
considered is high.  The main target were the arrays created as part of the 
SIMD lowering.


Isn't there a "not" missing before "high" ?

So it mostly helps after the call to subroutine 'sub' in the following:

SUBROUTINE AAP(A, B, N)
INTEGER N
REAL A(N), B(N), R(N)
CALL SUB(R, N) ! Address of R passed to SUB
R = ABS(A)
B = R
B = SQRT(A)
END

?
--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands

Re: [PATCH][v2] Enhance if-conversion for automatic arrays

2024-06-19 Thread Toon Moene


On 6/17/24 16:05, Richard Biener wrote:


Automatic arrays that are not address-taken should not be subject to
store data races.  This applies to OMP SIMD in-branch lowered
functions result array which for the testcase otherwise prevents
vectorization with SSE and for AVX and AVX512 ends up with spurious
.MASK_STORE to the stack surviving.


Does this also apply to "automatic arrays" as defined by the Fortran 
Standard (see https://j3-fortran.org/doc/year/23/23-007r1.pdf, page 
104), i.e., outside of the OMP_SIMD construct?


In gfortran, when using the option -fstack-arrays, they are assigned 
memory space on the stack.


Kind regards,

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands

Re: [PATCH] rtl-optimization/113597 - recover base term for argument pointers

2024-02-10 Thread Toon Moene


I managed to try this patch on aarch64-linux-gnu:

This is the test run without your patch:

https://gcc.gnu.org/pipermail/gcc-testresults/2024-February/807637.html

And this is the "result" with your patch:

https://gcc.gnu.org/pipermail/gcc-testresults/2024-February/807645.html

For me, as for you, it works for x86_64-linux-gnu:

https://gcc.gnu.org/pipermail/gcc-testresults/2024-February/807609.html

I hope this helps.

Kind regards,
Toon Moene.

On 2/9/24 11:26, Richard Biener wrote:

The following allows a base term to be derived from an existing
MEM_EXPR, notably the points-to set of a MEM_REF base.  For the
testcase in the PR this helps RTL DSE elide stores to a stack
temporary.  This covers pointers to NONLOCAL which can be mapped
to arg_base_value, helping to disambiguate against other special
bases (ADDRESS) as well as PARM_DECL accesses.

Bootstrapped and tested on x86_64-unknown-linux-gnu.

This is an attempt to recover some of the losses from dumbing down
find_base_{term,value}.  I have made a start on my ideas for how to do
this properly during stage1; I will post a short, incomplete RFC series
later today.

OK for trunk?

I've included all languages in testing and also tested with -m32 but
details of RTL alias analysis might escape me ...

Thanks,
Richard.

PR rtl-optimization/113597
* alias.cc (find_base_term): Add argument for the whole mem
and derive a base term from its MEM_EXPR.
(true_dependence_1): Pass down the MEMs to find_base_term.
(write_dependence_p): Likewise.
(may_alias_p): Likewise.
---
  gcc/alias.cc | 43 ++++++++++++++++++++++++++++++++++++-------
  1 file changed, 36 insertions(+), 7 deletions(-)

diff --git a/gcc/alias.cc b/gcc/alias.cc
index 6fad4b29d31..e33c56b0e80 100644
--- a/gcc/alias.cc
+++ b/gcc/alias.cc
@@ -40,6 +40,9 @@ along with GCC; see the file COPYING3.  If not see
  #include "rtl-iter.h"
  #include "cgraph.h"
  #include "ipa-utils.h"
+#include "stringpool.h"
+#include "value-range.h"
+#include "tree-ssanames.h"
  
  /* The aliasing API provided here solves related but different problems:
  
@@ -190,6 +193,10 @@ static struct {

arguments, since we do not know at this level whether accesses
based on different arguments can alias.  The ADDRESS has id 0.
  
+	This is solely useful to disambiguate against other ADDRESS

+   bases as we know incoming pointers cannot point to local
+   stack, frame or argument space.
+
   2. stack_pointer_rtx, frame_pointer_rtx, hard_frame_pointer_rtx
(if distinct from frame_pointer_rtx) and arg_pointer_rtx.
Each of these rtxes has a separate ADDRESS associated with it,
@@ -2113,12 +2120,34 @@ find_base_term (rtx x, vec  
  static rtx
-find_base_term (rtx x)
+find_base_term (rtx x, const_rtx mem = NULL_RTX)
  {
auto_vec<std::pair<cselib_val *, struct elt_loc_list *>, 32> visited_vals;
rtx res = find_base_term (x, visited_vals);
for (unsigned i = 0; i < visited_vals.length (); ++i)
  visited_vals[i].first->locs = visited_vals[i].second;
+  if (!res && mem && MEM_EXPR (mem))
+{
+  tree base = get_base_address (MEM_EXPR (mem));
+  if (TREE_CODE (base) == PARM_DECL
+ && DECL_RTL_SET_P (base))
+   /* We need to look at how we expanded a PARM_DECL.  It might be in
+  the argument space (UNIQUE_BASE_VALUE_ARGP) or it might
+  be spilled (UNIQUE_BASE_VALUE_FP/UNIQUE_BASE_VALUE_HFP).  */
+   res = find_base_term (DECL_RTL (base));
+  else if (TREE_CODE (base) == MEM_REF
+  && TREE_CODE (TREE_OPERAND (base, 0)) == SSA_NAME
+  && SSA_NAME_PTR_INFO (TREE_OPERAND (base, 0)))
+   {
+ auto pt = &SSA_NAME_PTR_INFO (TREE_OPERAND (base, 0))->pt;
+ if (pt->nonlocal
+ && !pt->anything
+ && !pt->escaped
+ && !pt->ipa_escaped
+ && bitmap_empty_p (pt->vars))
+   res = arg_base_value;
+   }
+}
return res;
  }
  
@@ -3035,13 +3064,13 @@ true_dependence_1 (const_rtx mem, machine_mode mem_mode, rtx mem_addr,
if (MEM_ADDR_SPACE (mem) != MEM_ADDR_SPACE (x))
  return true;
  
-  base = find_base_term (x_addr);
+  base = find_base_term (x_addr, x);
if (base && (GET_CODE (base) == LABEL_REF
   || (GET_CODE (base) == SYMBOL_REF
   && CONSTANT_POOL_ADDRESS_P (base))))
  return false;
  
-  rtx mem_base = find_base_term (true_mem_addr);
+  rtx mem_base = find_base_term (true_mem_addr, mem);
if (! base_alias_check (x_addr, base, true_mem_addr, mem_base,
  GET_MODE (x), mem_mode))
  return false;
@@ -3142,7 +3171,7 @@ write_dependence_p (const_rtx mem,
if (MEM_ADDR_SPACE (mem) != MEM_ADDR_SPACE (x))
  return true;
  
-  base = find_base_term

Re: [PATCH]middle-end: check memory accesses in the destination block [PR113588].

2024-02-02 Thread Toon Moene


On 2/1/24 22:33, Tamar Christina wrote:


Bootstrapped and regtested on aarch64-none-linux-gnu and x86_64-pc-linux-gnu with no 
issues.
Also checked both with --enable-lto --with-build-config='bootstrap-O3 
bootstrap-lto' --enable-multilib
and --enable-lto --with-build-config=bootstrap-O3 
--enable-checking=release,yes,rtl,extra;
and checked the libcrypt testsuite as reported on PR113467.


Note that I still run into problems when bootstrapping with 
--with-build-config=bootstrap-O3 
(https://gcc.gnu.org/pipermail/gcc-testresults/2024-February/806840.html), 
but they are not visible in that report.


That is because it happens in the test suite of gmp, which I build 
locally as part of the build.


It *is* visible in the full log of the bootstrap:

toon@moene:~/compilers$ grep ^FAIL log-thunderx-r14-8681
FAIL: libphobos.exceptions/rt_trap_exceptions.d output pattern test
FAIL: tacos
FAIL: tacosh
FAIL: tadd_si
FAIL: tadd_ui
FAIL: targ
FAIL: tasin
FAIL: tasinh
FAIL: tatan
FAIL: tatanh
FAIL: tcos
FAIL: tcosh


Quite a few of these "t" routines I had to kill by hand because they hang.

With a standard bootstrap I do not have that problem.

Kind regards,

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands

Re: [PATCH]middle-end: thread through existing LCSSA variable for alternative exits too [PR113237]

2024-01-07 Thread Toon Moene


On 1/7/24 18:29, Tamar Christina wrote:


gcc/ChangeLog:

PR tree-optimization/113237
* tree-vect-loop-manip.cc (slpeel_tree_duplicate_loop_to_edge_cfg): Use
existing LCSSA variable for exit when all exits are early break.


Might that be the same error as I got here when building with 
bootstrap-lto and bootstrap-O3:


https://gcc.gnu.org/pipermail/gcc-testresults/2024-January/804807.html

?

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands

Re: [PATCH V3] RISC-V: Remove mem-to-mem VLS move pattern[PR111566]

2023-09-27 Thread Toon Moene


On 9/27/23 19:31, Jeff Law wrote:



On 9/27/23 04:14, juzhe.zh...@rivai.ai wrote:

Since after removing mem-to-mem pattern.

program main
   integer, dimension(:,:), allocatable :: a, b
   integer, dimension(:), allocatable :: sh
   allocate (a(2,2))
   allocate (b(2,2))
   allocate (sh(3))
   a = 1
   b = cshift(a,sh)
end program main

This case will fail if we don't change the mov pattern.

Can you expand on this?  You didn't indicate the failure mode or any 
analysis behind the failure.


jeff


Note that this Fortran code has no defined behavior, because the sh 
array isn't given any values ...


Kind regards,

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands

Re: [RFC] GCC Security policy

2023-08-16 Thread Toon Moene


On 8/16/23 01:07, Alexander Monakov wrote:


On Tue, 15 Aug 2023, Siddhesh Poyarekar wrote:


Thanks, this is nicer (see notes below). My main concern is that we
shouldn't pretend there's some method of verifying that arbitrary source
code is "safe" to pass to an unsandboxed compiler, nor should we push
the responsibility of doing that on users.


But responsibility would be pushed to users, wouldn't it?


Making users responsible for verifying that sources are "safe" is not okay
(we cannot teach them how to do that since there's no general method).


While there is no "general method" for this, there exists a whole 
Working Group under ISO whose responsibility is to identify and list 
vulnerabilities in programming languages - Working Group 23.


Its web page is: https://www.open-std.org/jtc1/sc22/wg23/

Kind regards,

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands

Re: Should -ffp-contract=off the default on GCC?

2023-03-21 Thread Toon Moene


On 3/21/23 19:03, Paul Koning via Gcc-patches wrote:


Failure to understand the language is a common problem and we do try to emit 
various diagnostics to help developers avoid writing non-conformant code.  But 
ultimately if a developer fails to understand the language standard, then 
they're going to be surprised by the behavior of their code.


Conversely, of course, the problem is that C and other languages have evolved 
to the point that you have to be a language lawyer to write valid code.  In 
other words, a substantial fraction of programmers are by definition writing 
unreliable code.  This is not a good situation, and it may be part of the 
reason why modern software has such a high rate of defects.


Fortran compilers that I use regularly (I mean, aside from gfortran) 
have already given up on this battle, at least as far as floating point 
issues are concerned.


So many people want to have "repeatable floating point computations" 
that if someone writes:


READ*, X, Y, Z
PRINT*, X + Y + Z
END

they will get (if they know the compiler option that guarantees this - 
but they will) the following code:


READ*, X, Y, Z
PRINT*, (X + Y) + Z
END

even though there's no way in hell the Fortran Language Standard (any of 
them) guarantees this.
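
A minimal sketch of why the grouping matters, using values picked purely for illustration (Python floats are IEEE doubles; the Fortran programs above would show the same effect with suitable inputs):

```python
# Floating-point addition is not associative, so the grouping a compiler
# picks for X + Y + Z can change the printed result.

x, y, z = 0.1, 0.2, 0.3

left_to_right = (x + y) + z   # the grouping the option described above fixes
right_to_left = x + (y + z)   # another mathematically equivalent grouping

print(left_to_right == right_to_left)   # False
print(left_to_right, right_to_left)
```

Pinning one grouping makes the output repeatable from build to build; it does not make either result more "correct" than the other.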


--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands

Re: [PATCH] range-op-float: Fix up ICE in lower_bound [PR107975]

2022-12-06 Thread Toon Moene


On 12/5/22 19:35, Jakub Jelinek via Gcc-patches wrote:


Hi!

According to 
https://gcc.gnu.org/pipermail/gcc-regression/2022-December/077258.html


Seen in the wild too - compiling one of the two weather forecasting 
programs I use:


during GIMPLE pass: dom
/home/toon/scratch/hm_home/my_CY46h1/lib/src/surfex/ASSIM/assim_nature_isba_ekf.F90:5:32:

    5 | SUBROUTINE ASSIM_NATURE_ISBA_EKF (KMYPROC, IO, S, K, NP, NPE, HPROGRAM, KI, PT2M, PHU2M, HTEST)
      |                                ^
internal compiler error: in lower_bound, at value-range.h:350
[ 72%] Building Fortran object surfex/CMakeFiles/surfex-core-static.dir/OFFLIN/open_close_bin_asc_forc.F90.o
[ 72%] Building Fortran object surfex/CMakeFiles/surfex-core-static.dir/OFFLIN/open_filein_ol.F90.o
[ 72%] Building Fortran object surfex/CMakeFiles/surfex-core-static.dir/OFFLIN/sfx_oasis_def_ol.F90.o
[ 72%] Building Fortran object surfex/CMakeFiles/surfex-core-static.dir/SURFEX/abor1_sfx.F90.o
[ 72%] Building Fortran object surfex/CMakeFiles/surfex-core-static.dir/SURFEX/albedo.F90.o
[ 72%] Building Fortran object surfex/CMakeFiles/surfex-core-static.dir/SURFEX/allocate_physio.F90.o
[ 72%] Building Fortran object surfex/CMakeFiles/surfex-core-static.dir/SURFEX/allocate_teb_veg.F90.o
0x7db1c4 frange::lower_bound() const [clone .part.0] [clone .lto_priv.0] [clone .lto_priv.0]
        /home/toon/compilers/gcc/gcc/value-range.h:350
[ 72%] Building Fortran object surfex/CMakeFiles/surfex-core-static.dir/SURFEX/allocate_teb_veg_pgd.F90.o
[ 72%] Building Fortran object surfex/CMakeFiles/surfex-core-static.dir/SURFEX/average2_cover.F90.o
0x83f204 frange::lower_bound() const
        /home/toon/compilers/gcc/gcc/value-range.h:1127
0x83f204 foperator_mult::op1_range(frange&, tree_node*, frange const&, frange const&, relation_trio) const
        /home/toon/compilers/gcc/gcc/range-op-float.cc:2149
[ 72%] Building Fortran object surfex/CMakeFiles/surfex-core-static.dir/SURFEX/average2_mesh.F90.o
0x1ab62f8 gori_compute::compute_operand1_range(vrange&, gimple_range_op_handler&, vrange const&, tree_node*, fur_source&, value_relation*)
        /home/toon/compilers/gcc/gcc/gimple-range-gori.cc:1095
0x1ab4f93 gori_compute::compute_operand_range(vrange&, gimple*, vrange const&, tree_node*, fur_source&, value_relation*)
        /home/toon/compilers/gcc/gcc/gimple-range-gori.cc:692
0x1ab6378 gori_compute::compute_operand1_range(vrange&, gimple_range_op_handler&, vrange const&, tree_node*, fur_source&, value_relation*)
        /home/toon/compilers/gcc/gcc/gimple-range-gori.cc:1150
0x1ab4f93 gori_compute::compute_operand_range(vrange&, gimple*, vrange const&, tree_node*, fur_source&, value_relation*)
        /home/toon/compilers/gcc/gcc/gimple-range-gori.cc:692
0x1ac5861 gori_compute::outgoing_edge_range_p(vrange&, edge_def*, tree_node*, range_query&)
        /home/toon/compilers/gcc/gcc/gimple-range-gori.cc:1373
0x1ac668e ranger_cache::edge_range(vrange&, edge_def*, tree_node*, ranger_cache::rfd_mode)
        /home/toon/compilers/gcc/gcc/gimple-range-cache.cc:964
0x1acef14 gimple_ranger::range_on_edge(vrange&, edge_def*, tree_node*)
        /home/toon/compilers/gcc/gcc/gimple-range.cc:241
0x1ab9902 fold_using_range::range_of_phi(vrange&, gphi*, fur_source&)
        /home/toon/compilers/gcc/gcc/gimple-range-fold.cc:759
0x1ac5240 fold_using_range::fold_stmt(vrange&, gimple*, fur_source&, tree_node*)
        /home/toon/compilers/gcc/gcc/gimple-range-fold.cc:491
0x1ac813e gimple_ranger::fold_range_internal(vrange&, gimple*, tree_node*)
        /home/toon/compilers/gcc/gcc/gimple-range.cc:257
0x1ac813e gimple_ranger::prefill_stmt_dependencies(tree_node*)
        /home/toon/compilers/gcc/gcc/gimple-range.cc:392
0x1ac88ba gimple_ranger::range_of_stmt(vrange&, gimple*, tree_node*)
        /home/toon/compilers/gcc/gcc/gimple-range.cc:314
0x1ace076 gimple_ranger::range_on_entry(vrange&, basic_block_def*, tree_node*)
        /home/toon/compilers/gcc/gcc/gimple-range.cc:153
0x115d524 path_range_query::internal_range_of_expr(vrange&, tree_node*, gimple*)
        /home/toon/compilers/gcc/gcc/gimple-range-path.cc:176
0x115d6b0 path_range_query::range_of_expr(vrange&, tree_node*, gimple*)
        /home/toon/compilers/gcc/gcc/gimple-range-path.cc:202
0x1ac3f4a fold_using_range::range_of_range_op(vrange&, gimple_range_op_handler&, fur_source&)
        /home/toon/compilers/gcc/gcc/gimple-range-fold.cc:558
0x1ac50ba fold_using_range::fold_stmt(vrange&, gimple*, fur_source&, tree_node*)
        /home/toon/compilers/gcc/gcc/gimple-range-fold.cc:489
Please submit a full bug report, with preprocessed source (by using -freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.


--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands

Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.

2022-10-13 Thread Toon Moene


It was just a comment on the code of the PR ...

Toon.

On 10/13/22 15:44, Aldy Hernandez wrote:


I'm not following.  My patch doesn't affect this behavior.

What am I missing?

Aldy

On Thu, Oct 13, 2022 at 3:04 PM Toon Moene  wrote:


On 10/13/22 14:36, Aldy Hernandez via Gcc-patches wrote:


   PR tree-optimization/24021


Ah - Verboten in Fortran:

$ cat d.f
      DOUBLE PRECISION A, X
      A = 0.0
      DO X = 0.1, 1.0
         A = A + X
      ENDDO
      END
$ gfortran d.f
d.f:3:9:

  3 |   DO X = 0.1, 1.0
| 1
Warning: Deleted feature: Loop variable at (1) must be integer
d.f:3:12:

  3 |   DO X = 0.1, 1.0
|1
Warning: Deleted feature: Start expression in DO loop at (1) must be integer
d.f:3:17:

  3 |   DO X = 0.1, 1.0
| 1
Warning: Deleted feature: End expression in DO loop at (1) must be integer

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands





--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands

Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.

2022-10-13 Thread Toon Moene


On 10/13/22 14:36, Aldy Hernandez via Gcc-patches wrote:


PR tree-optimization/24021


Ah - Verboten in Fortran:

$ cat d.f
      DOUBLE PRECISION A, X
      A = 0.0
      DO X = 0.1, 1.0
         A = A + X
      ENDDO
      END
$ gfortran d.f
d.f:3:9:

3 |   DO X = 0.1, 1.0
  | 1
Warning: Deleted feature: Loop variable at (1) must be integer
d.f:3:12:

3 |   DO X = 0.1, 1.0
  |1
Warning: Deleted feature: Start expression in DO loop at (1) must be integer
d.f:3:17:

3 |   DO X = 0.1, 1.0
  | 1
Warning: Deleted feature: End expression in DO loop at (1) must be integer
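
For readers wondering why a real-valued loop variable is frowned upon, here is a small sketch of the underlying problem. It is illustrative only: it uses a step of 0.1, which the Fortran example above does not, and it says nothing about what the PR's testcase does.

```python
# Why a floating-point loop counter is fragile: repeated addition of a
# step that is not exactly representable drifts away from the exact value.
import math

step = 0.1
x = 0.0
for _ in range(10):
    x += step            # ten additions of 0.1

print(x == 1.0)                       # False: x has drifted below 1.0
print(math.fsum([step] * 10) == 1.0)  # True: correctly rounded sum

# An integer counter with a derived real value avoids the drift.
values = [i * step for i in range(1, 11)]
print(len(values))                    # 10
```

A termination test such as "x <= 1.0" on the drifting counter can therefore run one iteration more or fewer than intended, whereas the integer-counted version always runs ten.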

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands

Re: [PATCH] frange: dump hex values when dumping FP numbers.

2022-09-22 Thread Toon Moene


If it's not too cumbersome, I suggest dumping both.

In my neck of the woods (meteorology) I have seen this done just to 
ensure that algorithms that are supposed to be bit-reproducible actually 
are - and that this can be checked visually.
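
A small sketch of what "dumping both" buys, written in Python rather than in GCC's pretty-printer. The helper name is hypothetical; the two hex strings checked in the middle are the ones from the patch description quoted below.

```python
# Print a floating-point value in both decimal and hexadecimal form.
# The hex form is exact, so two runs can be compared bit for bit by eye.

def both_forms(value):
    """Return 'decimal (hex)' for a Python float (an IEEE double)."""
    return f"{value!r} ({value.hex()})"


print(both_forms(15.0))
print(both_forms(0.1))

# The hex strings in the patch description denote the expected values.
print(float.fromhex("0x0.fp+4") == 15.0)   # True
print(float.fromhex("0x0.ap+5") == 20.0)   # True

# Two values that print alike at low precision still differ in hex.
a = 0.1 + 0.2
b = 0.3
print(f"{a:.6f} {b:.6f}")        # both print as 0.300000
print(a.hex() == b.hex())        # False
```

The decimal form stays readable for a human, while the hex form is the one to diff when checking bit reproducibility.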


Kind regards,
Toon.

On 9/22/22 18:49, Aldy Hernandez via Gcc-patches wrote:


It has been suggested that if we start bumping numbers by an ULP when
calculating open ranges (for example the numbers less than 3.0) that
dumping these will become increasingly harder to read, and instead we
should opt for the hex representation.  I still find the floating
point representation easier to read for most numbers, but perhaps we
could have both?

With this patch this is the representation for [15.0, 20.0]:

  [frange] float [1.5e+1 (0x0.fp+4), 2.0e+1 (0x0.ap+5)]

Would you find this useful, or should we stick to the hex
representation only (or something altogether different)?

Tested on x86-64 Linux.

gcc/ChangeLog:

* value-range-pretty-print.cc (vrange_printer::print_real_value): New.
(vrange_printer::visit): Call print_real_value.
* value-range-pretty-print.h: New print_real_value.
---
  gcc/value-range-pretty-print.cc | 16 ++++++++++++----
  gcc/value-range-pretty-print.h  |  1 +
  2 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/gcc/value-range-pretty-print.cc b/gcc/value-range-pretty-print.cc
index eb7442229ba..51be037c254 100644
--- a/gcc/value-range-pretty-print.cc
+++ b/gcc/value-range-pretty-print.cc
@@ -117,6 +117,16 @@ vrange_printer::print_irange_bitmasks (const irange &r) const
pp_string (pp, buf);
  }
  
+void
+vrange_printer::print_real_value (tree type, const REAL_VALUE_TYPE &r) const
+{
+  char s[60];
+  tree t = build_real (type, r);
+  dump_generic_node (pp, t, 0, TDF_NONE, false);
+  real_to_hexadecimal (s, &r, sizeof (s), 0, 1);
+  pp_printf (pp, " (%s)", s);
+}
+
  // Print an frange.
  
  void
@@ -141,11 +151,9 @@ vrange_printer::visit (const frange &r) const
bool has_endpoints = !r.known_isnan ();
if (has_endpoints)
  {
-  dump_generic_node (pp,
-build_real (type, r.lower_bound ()), 0, TDF_NONE, false);
+  print_real_value (type, r.lower_bound ());
pp_string (pp, ", ");
-  dump_generic_node (pp,
-build_real (type, r.upper_bound ()), 0, TDF_NONE, false);
+  print_real_value (type, r.upper_bound ());
  }
pp_character (pp, ']');
print_frange_nan (r);
diff --git a/gcc/value-range-pretty-print.h b/gcc/value-range-pretty-print.h
index 20c26598fe7..a9ae5a7b4cc 100644
--- a/gcc/value-range-pretty-print.h
+++ b/gcc/value-range-pretty-print.h
@@ -32,6 +32,7 @@ private:
void print_irange_bound (const wide_int &w, tree type) const;
void print_irange_bitmasks (const irange &) const;
void print_frange_nan (const frange &) const;
+  void print_real_value (tree type, const REAL_VALUE_TYPE &r) const;
  
pretty_printer *pp;
  };


--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands

Re: [PATCH] [ranger] x == -0.0 does not mean we can replace x with -0.0

2022-08-29 Thread Toon Moene


On 8/29/22 19:07, Jeff Law via Gcc-patches wrote:

One of the more interesting ones is to try to limit the range of the 
input to the trigonometric functions - that way you could use ones 
without any argument reduction phase ...


The difficult part is that most of the trig stuff is in libraries, so we 
don't have visibility into the full range.


What we do sometimes have is knowledge that the special values are 
already handled which allows us to do things like automatically 
transform a division into estimation + NR correction steps (atan2).


I guess we could do specialization based on the input range.  So rather 
than calling "sin" we could call a special one that didn't have the 
reduction step when we know the input value is in a sensible range.


Exactly. It's probably not that hard to have sin/cos/tan with a special 
entry point that foregoes the whole argument reduction step.


In every weather forecast, you have to compute the local solar height 
(to get the effects of solar radiation correct) every time step, in 
every grid point.


You *know* that angle is between 0 and 90 degrees, as are all the angles 
that go into that computation (latitude, longitude, and time [hour of 
the day, day of the year]).
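
To make the idea concrete, here is a sketch of a sine that skips argument reduction because the caller promises an argument in [0, pi/2]. The function name is hypothetical and the Taylor polynomial is chosen for brevity; a real math library would more likely use a minimax polynomial and worry about the last bit.

```python
# A sine for arguments already known to lie in [0, pi/2]: no argument
# reduction, just a fixed odd polynomial (Taylor series up to x**15).
import math

# Taylor coefficients 1/1!, -1/3!, 1/5!, ... for the odd powers.
_COEFFS = [(-1) ** k / math.factorial(2 * k + 1) for k in range(8)]


def sin_no_reduction(x):
    """Approximate sin(x) for 0 <= x <= pi/2 (hypothetical helper).

    The caller guarantees the range; nothing is checked or reduced here.
    """
    x2 = x * x
    acc = 0.0
    # Horner evaluation in x**2, highest-order coefficient first.
    for c in reversed(_COEFFS):
        acc = acc * x2 + c
    return acc * x


def _error(i):
    """Absolute difference from math.sin at the i-th of 1001 grid points."""
    x = i * (math.pi / 2) / 1000
    return abs(sin_no_reduction(x) - math.sin(x))


worst = max(_error(i) for i in range(1001))
print(worst < 1e-10)   # True
```

The saving comes from what is absent: no range check, no reduction modulo pi/2, no quadrant selection. That is only valid if the compiler (or the programmer) can prove the input range.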


--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands

Re: [PATCH] [ranger] x == -0.0 does not mean we can replace x with -0.0

2022-08-29 Thread Toon Moene


On 8/29/22 17:08, Jeff Law via Gcc-patches wrote:


However, I'm hoping to forget as many floating point details, as fast
as possible, as soon as I can ;-).


Actually FP isn't that bad -- I'd largely avoided it for decades, but 
didn't have a choice earlier this year.  And there's a lot more headroom 
for improvements in the FP space than the integer space IMHO.


One of the more interesting ones is to try to limit the range of the 
input to the trigonometric functions - that way you could use ones 
without any argument reduction phase ...


--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands

Re: [PATCH] Add support for floating point endpoints to frange.

2022-08-29 Thread Toon Moene


On 8/29/22 16:36, Aldy Hernandez wrote:


On Mon, Aug 29, 2022 at 4:30 PM Toon Moene  wrote:


On 8/29/22 16:15, Aldy Hernandez wrote:


But even with -ffinite-math-only, is there any benefit to propagating
a known NAN?  For example:


The original intent (in 2002) for the option -ffinite-math-only was for
the optimizers to ignore all the various exceptions to common
optimizations because they might not work correctly when presented with
a NaN or an Inf.

I do not know what the effect for floating point range information would
be - offhand.

But the *spirit* of this option would be to ignore that the range
[5.0, 5.0] would "also" contain NaN, for instance.


Hmm, this is somewhat similar to what Jakub suggested.  Perhaps we
could categorically set !NAN for !HONOR_NANS at frange construction
time?

For reference:
bool
HONOR_NANS (machine_mode m)
{
   return MODE_HAS_NANS (m) && !flag_finite_math_only;
}

Thanks.
Aldy



Yep, I think that would do it.

Thanks,

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands

Re: [PATCH] Add support for floating point endpoints to frange.

2022-08-29 Thread Toon Moene


On 8/29/22 16:15, Aldy Hernandez wrote:


But even with -ffinite-math-only, is there any benefit to propagating
a known NAN?  For example:


The original intent (in 2002) for the option -ffinite-math-only was for 
the optimizers to ignore all the various exceptions to common 
optimizations because they might not work correctly when presented with 
a NaN or an Inf.


I do not know what the effect for floating point range information would 
be - offhand.


But the *spirit* of this option would be to ignore that the range 
[5.0, 5.0] would "also" contain NaN, for instance.
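
A short sketch of the "exceptions to common optimizations" mentioned above, shown with plain IEEE doubles in Python. It only illustrates NaN semantics; it does not model how frange or any GCC pass is implemented, and the helper name is hypothetical.

```python
# Why NaN gets in the way of "obvious" simplifications, and what a
# finite-math assumption buys.
import math

nan = math.nan

print(nan == nan)               # False: x == x cannot be folded to True
print(nan < 5.0, nan >= 5.0)    # False False: !(x < y) is not x >= y


def in_range(x, lo, hi):
    """True if lo <= x <= hi; always False for NaN."""
    return lo <= x <= hi


# A range [5.0, 5.0] describes exactly one finite value ...
print(in_range(5.0, 5.0, 5.0))   # True
# ... but a NaN lies outside every [lo, hi], so a range analysis has to
# track "may be NaN" separately unless NaNs are assumed away.
print(in_range(nan, 5.0, 5.0))   # False
```

Under a finite-math assumption the separate "may be NaN" bookkeeping can be dropped, and [5.0, 5.0] really does mean the single value 5.0.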


Kind regards,

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands

Re: [PATCH] Add support for floating point endpoints to frange.

2022-08-29 Thread Toon Moene


On 8/29/22 15:54, Jakub Jelinek via Gcc-patches wrote:


On Mon, Aug 29, 2022 at 03:45:33PM +0200, Aldy Hernandez wrote:



For convenience, singleton_p() returns false for a NAN.  IMO, it makes
the implementation cleaner, but I'm not wed to the idea if someone
objects.


If singleton_p() is used to decide whether one can just replace a variable
with singleton range with a constant, then certainly.
If MODE_HAS_SIGNED_ZEROS, zero has 2 representations (-0.0 and 0.0) and
NaNs have lots of different representations (the sign bit is ignored
except for stuff like copysign/signbit, there are qNaNs and sNaNs and
except for the single case how Inf is represented, all other values of the
mantissa mean different representations of NaN).  So, unless we track which
exact form of NaN can appear, NaN or any [x, x] range with NaN property
set can't be a singleton.  There could be programs that propagate something
important in NaN mantissa and would be upset if frange kills that.
Of course, one needs to take into account that when a FPU creates NaN, it
will create the canonical qNaN.

Jakub



But the NaNs are irrelevant with -ffinite-math-only, as are the various 
Infs (I do not know offhand which MODE_ that is) ...
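
For reference, a small sketch of the representation issue raised in the quoted text: values that compare equal, or that are all "NaN", can still differ in their bits. The helper name is hypothetical and the bit patterns are standard IEEE-754 double encodings.

```python
# Values that compare equal (or are all "NaN") can still differ in bits.
import math
import struct


def bits(x):
    """IEEE-754 double bit pattern of x as a 16-digit hex string."""
    return struct.pack(">d", x).hex()


# Two zeros: equal under ==, different representations.
print(0.0 == -0.0)                 # True
print(bits(0.0), bits(-0.0))       # 0000000000000000 8000000000000000
print(math.copysign(1.0, -0.0))    # -1.0: the sign is observable

# Many NaNs: every pattern with an all-ones exponent and a non-zero
# mantissa is a NaN; an all-zero mantissa is infinity.
for pattern in ("7ff8000000000000", "7ff8000000000001", "7ff0000000000000"):
    value = struct.unpack(">d", bytes.fromhex(pattern))[0]
    print(pattern, math.isnan(value), math.isinf(value))
```

This is why a range [x, x] cannot simply be replaced by a constant when signed zeros or NaNs are honored, and why the question disappears for NaN and Inf once they are assumed not to occur.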


Kind regards,

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands

Re: [PATCH, v2] Fortran: fix invalid rank error in ASSOCIATED when rank is remapped [PR77652]

2022-07-27 Thread Toon Moene


On 7/27/22 21:45, Harald Anlauf via Fortran wrote:


Hi Mikael,



On 26.07.22 at 21:25, Mikael Morin wrote:



On 25/07/2022 at 22:18, Harald Anlauf wrote:


I would normally trust NAG more than Intel and Cray. 

… and yourself, it seems.  Too bad.

May I suggest that, if well-known Fortran compilers differ in their 
treatment of this case, there might be reason to ask the Fortran 
Standard Committee for an Interpretation of the Standard?


Kind regards,

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands

Re: [COMMITTED] Export global ranges during the VRP block walk.

2022-05-14 Thread Toon Moene


On 5/14/22 10:00, Iain Sandoe via Gcc-patches wrote:


Hi Andrew


On 13 May 2022, at 14:58, Andrew MacLeod via Gcc-patches 
 wrote:

VRP currently searches the ssa_name list for globals to export after it 
finishes running.  This change simply exports globals as they are calculated 
for the final time during the DOM walk.

This avoids the occasional awkwardness of determining which ssa-names in the list 
are important, as well as allowing forthcoming side-effect code to adjust what 
is currently known as a global value during the walk without affecting the 
values exported for the entire function.

Bootstrapped on x86_64-pc-linux-gnu with no regressions.  Pushed.


This (r13-436-gaf34279921f4) appears to cause or expose a problem which breaks 
bootstrap with in-tree MPFR on at least x86_64-linux/darwin.

thanks
Iain

during GIMPLE pass: threadfull
../../../src/mpfr/src/sin_cos.c: In function ‘mpfr_sin_cos’:
../../../src/mpfr/src/sin_cos.c:29:1: internal compiler error: in type, at value-range.h:225
29 | mpfr_sin_cos (mpfr_ptr y, mpfr_ptr z, mpfr_srcptr x, mpfr_rnd_t rnd_mode)
   | ^~~~
0x107d316 irange::type() const
 ../../src/gcc/value-range.h:225



Seems to have been fixed in r13-449.

Compare:

https://gcc.gnu.org/pipermail/gcc-testresults/2022-May/761443.html

with (r13-448):

https://gcc.gnu.org/pipermail/gcc-testresults/2022-May/761431.html

Kind regards,

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands

Re: [PATCH] Mass rename of C++ .c files to .cc suffix

2022-01-11 Thread Toon Moene


On 1/11/22 13:56, Martin Liška wrote:


Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
Plus it survives build of all FEs (--enable-languages=all) on x86_64-linux-gnu
and I've built all cross compilers.


Does this also rename .c files in the fortran and libgfortran directories ?

I would recommend sending this message to the fort...@gcc.gnu.org list 
too, then.


Not everyone reads the gcc and gcc-patches lists ...

Kind regards,

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands

Re: [PATCH, Fortran] Fix setting of array lower bound for named arrays

2021-11-30 Thread Toon Moene


On 11/30/21 8:54 PM, Harald Anlauf via Fortran wrote:


Hi Tobias,



You seem to be quite convinced with your interpretation,
while I am simply confused.


If both compiler developers are confused, and actual compiler 
implementations differ in their outcomes of the test case, IMNSHO it is 
time to ask the Fortran Standardization Committee for an interpretation 
(of the standard's text).


Kind regards,

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands

Re: [PATCH 00/31] VAX: Bring the port up to date (yes, MODE_CC conversion is included)

2020-11-21 Thread Toon Moene

[ cc'd to the fortran mailing list to hopefully get some more knowledgeable 
input ... ]


On 11/20/20 4:38 AM, Maciej W. Rozycki wrote:


2. libgfortran -- oddly enough for Fortran a piece requires IEEE 754
floating-point arithmetic (possibly a porting problem too).


gcc/libgfortran/config.h.in does have:

/* Define to 1 if you have the <ieeefp.h> header file. */
#undef HAVE_IEEEFP_H

So perhaps it does do the "right thing" if you do not have this header 
file on your VAX operating system.


The Fortran Standard allows an implementation *not* to have IEEE 
floating point support ...


Kind regards,

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands

Re: [PATCH 1/2] analyzer: gfortran testsuite support

2020-02-09 Thread Toon Moene


On 2/9/20 9:55 PM, Steve Kargl wrote:


On Sun, Feb 09, 2020 at 09:15:46PM +0100, Toon Moene wrote:



On 2/6/20 9:01 PM, David Malcolm wrote:


PR analyzer/93405 reports an ICE when attempting to use -fanalyzer on
certain gfortran code.  The second patch in this kit fixes that, but
in the meantime I need somewhere to put regression tests for -fanalyzer
with gfortran.

This patch adds a gfortran.dg/analyzer subdirectory with an analyzer.exp,
setting DEFAULT_FFLAGS on the tests run within it.


I have seen no objections against this proposal, so please go ahead.



Perhaps, there are no objections because the people who contribute
patches and provide reviews for gfortran have dwindled to 1 or 2 people
with sporadic available time.  Did you actually review the proposed
changes?


You are right. I did test the fix for PR93405, and thought this update 
was included in that test, but it was not - my fault.


I will be more careful in the future about what I test, and show the 
results (otherwise, why test ...).


Suggestion withdrawn.

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news

Re: [PATCH 1/2] analyzer: gfortran testsuite support

2020-02-09 Thread Toon Moene


On 2/6/20 9:01 PM, David Malcolm wrote:


PR analyzer/93405 reports an ICE when attempting to use -fanalyzer on
certain gfortran code.  The second patch in this kit fixes that, but
in the meantime I need somewhere to put regression tests for -fanalyzer
with gfortran.

This patch adds a gfortran.dg/analyzer subdirectory with an analyzer.exp,
setting DEFAULT_FFLAGS on the tests run within it.


I have seen no objections against this proposal, so please go ahead.

Kind regards,

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news

Re: [PATCH v3] Extend the simd function attribute

2019-11-15 Thread Toon Moene


On 11/15/19 4:05 PM, Francesco Petrogalli wrote:


Thank you Szabolcs for working on this.



OpenMP 5 has a declare variant feature that
allows declaring more specific simd variants, but it is complicated and
still requires gcc or vendor extension for unambiguous declarations.)



It is not just that it is complicated, it is also a good idea to make math 
function vectorization orthogonal to OpenMP.


Definitely agree. I always found it a strained relationship, and only 
supported it being done that way because it worked.


--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news

Re: [PATCH 2/2][GCC][RFC][middle-end]: Add complex number arithmetic patterns to SLP pattern matcher.

2019-10-01 Thread Toon Moene


On 10/1/19 1:39 PM, Tamar Christina wrote:


The patterns work by looking at the sequence produced after GCC lowers complex
numbers.  As such they would match any normal operation that does the same
computations.


Thanks - I didn't understand Ramana's comments during the GNU Tools 
Cauldron about this feature, but now I do.


Can't wait to put my (upcoming) ThunderX hardware to work on this (plus 
that I have to teach *a lot* of 30-year+ Fortran programmers that you do 
not have to lower COMPLEX arithmetic yourself, because the compiler will 
do this optimally for you ...).
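
To make the point about hand-lowered COMPLEX arithmetic concrete, here is a minimal sketch of the real/imaginary form that people tend to write out themselves, next to the built-in complex product. It is in Python rather than Fortran purely for illustration, the name lowered_mul is hypothetical, and nothing in it is taken from the patch itself; it only shows the kind of expression pair that computes the same thing.

```python
def lowered_mul(ar, ai, br, bi):
    """Complex product written out in real arithmetic, the way it is
    often lowered by hand: (ar + i*ai) * (br + i*bi)."""
    real = ar * br - ai * bi
    imag = ar * bi + ai * br
    return real, imag


a = complex(1.5, -2.0)
b = complex(0.25, 4.0)

# Built-in complex multiply versus the hand-lowered version.
direct = a * b
by_hand = lowered_mul(a.real, a.imag, b.real, b.imag)
print(direct, by_hand)
```

Both forms give the same real and imaginary parts, so writing the lowered version by hand buys nothing in terms of results.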


Kind regards,

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news

Re: Avoid adding impossible copies in ira-conflicts.c:process_reg_shuffles

2019-09-17 Thread Toon Moene


On 9/17/19 6:50 PM, Richard Sandiford wrote:

[ ... ]


This patch tries to avoid the problem by not adding register shuffle
copies if there appears to be no chance that the two operands could be
allocated to the same register.



The table below summarises
the tests that had more or fewer assembly lines after the patch
(always a bad metric, but it's just to get a flavour):

Target            Tests  Delta   Best  Worst  Median
======            =====  =====   ====  =====  ======
x86_64-linux-gnu     39   -577   -164     23      -1


Hmmm, this sounds certainly interesting enough to try on its own merits, 
even if it's not committed by tomorrow morning ...


Fascinating analysis - thanks !

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news

Re: OpenCoarrays integration with gfortran

2018-09-23 Thread Toon Moene

On 09/22/2018 01:23 AM, Jerry DeLisle wrote:

On 9/21/18 1:16 PM, Damian Rouson wrote:
> On Fri, Sep 21, 2018 at 9:25 AM Jerry DeLisle wrote:
>> 1) Focus on distribution packages such as Fedora, Debian, Ubuntu,
>> Windows, etc. Building of these packages needs to be automated into the
>> distributions.
>
> This is the option that the OpenCoarrays documentation recommends as
> easiest for most users.
Agree.

I just installed opencoarrays on my system at home (Debian Testing):

root@moene:~# apt-get install libcoarrays-openmpi-dev
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following additional packages will be installed:
  libcaf-openmpi-3
The following NEW packages will be installed:
  libcaf-openmpi-3 libcoarrays-openmpi-dev
0 upgraded, 2 newly installed, 0 to remove and 0 not upgraded.
Need to get 107 kB of archives.
After this operation, 317 kB of additional disk space will be used.
Do you want to continue? [Y/n]
Get:1 http://ftp.nl.debian.org/debian testing/main amd64 libcaf-openmpi-3 amd64 2.2.0-3 [38.2 kB]
Get:2 http://ftp.nl.debian.org/debian testing/main amd64 libcoarrays-openmpi-dev amd64 2.2.0-3 [68.9 kB]
Fetched 107 kB in 0s (634 kB/s)
Selecting previously unselected package libcaf-openmpi-3:amd64.
(Reading database ... 212249 files and directories currently installed.)
Preparing to unpack .../libcaf-openmpi-3_2.2.0-3_amd64.deb ...
Unpacking libcaf-openmpi-3:amd64 (2.2.0-3) ...
Selecting previously unselected package libcoarrays-openmpi-dev:amd64.
Preparing to unpack .../libcoarrays-openmpi-dev_2.2.0-3_amd64.deb ...
Unpacking libcoarrays-openmpi-dev:amd64 (2.2.0-3) ...
Setting up libcaf-openmpi-3:amd64 (2.2.0-3) ...
Setting up libcoarrays-openmpi-dev:amd64 (2.2.0-3) ...
Processing triggers for libc-bin (2.27-6) ...

[ previously this led to apt errors, but not now. ]

and moved my own installation of the OpenCoarrays-2.2.0.tar.gz out of 
the way:

toon@moene:~$ ls -ld *pen*
drwxr-xr-x 6 toon toon 4096 Aug 10 16:01 OpenCoarrays-2.2.0.opzij
drwxr-xr-x 8 toon toon 4096 Sep 15 11:26 opencoarrays-build.opzij
drwxr-xr-x 6 toon toon 4096 Sep 15 11:26 opencoarrays.opzij

and recompiled my stuff:

gfortran -g -fbacktrace -fcoarray=lib random-weather.f90 \
  -L/usr/lib/x86_64-linux-gnu/open-coarrays/openmpi/lib -lcaf_mpi

[ Yes, the location of the libs is quite experimental, but OK for the 
"Testing" variant of Debian ... ]

I couldn't find cafrun, but mpirun works just fine:

toon@moene:~/src$ echo ' &config /' | mpirun --oversubscribe --bind-to none -np 20 ./a.out
Decomposition information on image  7 is 4 * 5 slabs with 23 * 18 grid cells on this image.
Decomposition information on image  6 is 4 * 5 slabs with 23 * 18 grid cells on this image.
Decomposition information on image 11 is 4 * 5 slabs with 23 * 18 grid cells on this image.
Decomposition information on image 15 is 4 * 5 slabs with 23 * 18 grid cells on this image.
Decomposition information on image  1 is 4 * 5 slabs with 23 * 18 grid cells on this image.
Decomposition information on image 13 is 4 * 5 slabs with 23 * 18 grid cells on this image.
Decomposition information on image 12 is 4 * 5 slabs with 21 * 18 grid cells on this image.
Decomposition information on image 20 is 4 * 5 slabs with 21 * 18 grid cells on this image.
Decomposition information on image  9 is 4 * 5 slabs with 23 * 18 grid cells on this image.
Decomposition information on image 14 is 4 * 5 slabs with 23 * 18 grid cells on this image.
Decomposition information on image 16 is 4 * 5 slabs with 21 * 18 grid cells on this image.
Decomposition information on image 17 is 4 * 5 slabs with 23 * 18 grid cells on this image.
Decomposition information on image 18 is 4 * 5 slabs with 23 * 18 grid cells on this image.
Decomposition information on image  2 is 4 * 5 slabs with 23 * 18 grid cells on this image.
Decomposition information on image  4 is 4 * 5 slabs with 21 * 18 grid cells on this image.
Decomposition information on image  5 is 4 * 5 slabs with 23 * 18 grid cells on this image.
Decomposition information on image  3 is 4 * 5 slabs with 23 * 18 grid cells on this image.
Decomposition information on image  8 is 4 * 5 slabs with 21 * 18 grid cells on this image.
Decomposition information on image 10 is 4 * 5 slabs with 23 * 18 grid cells on this image.
Decomposition information on image 19 is 4 * 5 slabs with 23 * 18 grid cells on this image.

... etc. (see http://moene.org/~toon/random-weather.f90).
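
For readers wondering where the 23 and 21 come from: the sketch below is one block decomposition that happens to reproduce the numbers printed above. It is a guess, not the logic of random-weather.f90; the 90 x 90 grid size, the image numbering (x slab index varying fastest) and the helper name block_sizes are all my assumptions, and the sketch is in Python rather than Fortran only to keep it short.

```python
def block_sizes(n_cells, n_blocks):
    """Split n_cells into n_blocks chunks of ceil(n_cells / n_blocks)
    cells, with the last chunk taking whatever is left."""
    chunk = -(-n_cells // n_blocks)          # ceiling division
    sizes = [chunk] * (n_blocks - 1)
    sizes.append(n_cells - chunk * (n_blocks - 1))
    return sizes


# Assumed, not taken from random-weather.f90: a 90 x 90 grid and
# images numbered 1..20 with the x slab index varying fastest.
NX, NY, PX, PY = 90, 90, 4, 5
x_sizes = block_sizes(NX, PX)
y_sizes = block_sizes(NY, PY)

cells = {}
for image in range(1, PX * PY + 1):
    ix = (image - 1) % PX
    iy = (image - 1) // PX
    cells[image] = (x_sizes[ix], y_sizes[iy])

print(cells[1], cells[4], cells[20])
```

Under these assumptions images 4, 8, 12, 16 and 20 end up with the narrower slab, which is consistent with the output shown.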

I presume other Linux distributors will follow shortly (this *is* Debian 
Testing, which can be a bit testy at times - but I have trusted my main 
business at home to it for over 15 years now).

Ki

Re: [PATCH 08/25] Fix co-array allocation

2018-09-21 Thread Toon Moene


On 09/20/2018 10:01 PM, Thomas Koenig wrote:


Hi Damian,

On a related note, two Sourcery Institute developers have attempted to edit
the GCC build system to make the downloading and building of OpenCoarrays
automatically part of the gfortran build process.  Neither developer
succeeded.


We addressed integrating OpenCoarray into the gcc source tree at the
recent Gcc summit during the gfortran BoF session.

Feedback from people working for big Linux distributions was that they
would prefer to package OpenCoarrays as a separate library.
(They also mentioned it was quite hard to build.)


Well, Linux distributors have to fit the build of OpenCoarrays into 
*their* build system, which might be just as complicated as our trying 
to force it into *gcc's* build system ...


For an individual, OpenCoarrays is not hard to build, and the web page 
www.opencoarrays.org offers multiple solutions:


"Installation via package management is generally the easiest and most 
reliable option.   See below for the package-management installation 
options for Linux, macOS, and FreeBSD.  Alternatively, download and 
build the latest OpenCoarrays release  via the contained installation 
scripts or with CMake."


I chose the CMake-based one, because I already had cmake installed to 
be able to build ECMWF's (ecmwf.int) eccodes package. It probably helped 
that I also already had openmpi installed. From my command history:


 1754  tar zxvf ~/Downloads/OpenCoarrays-2.2.0.tar.gz
 1755  cd OpenCoarrays-2.2.0/
 1756  ls
 1757  less README.md
 1758  cd ..
 1759  mkdir opencoarrays-build
 1760  cd opencoarrays-build
 1761  (export FC=gfortran; export CC=gcc; cmake ../OpenCoarrays-2.2.0/ -DCMAKE_INSTALL_PREFIX=$HOME/opencoarrays)
 1762  make
 1763  make test
 1764  make install

After that, it was a breeze to test my mock weather program 
(moene.org/~toon/random-weather.f90), that I had built until then only 
with -fcoarray=single.


--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news

Re: [PATCH] combine: Allow combining two insns to two insns

2018-08-01 Thread Toon Moene


On 07/24/2018 07:18 PM, Segher Boessenkool wrote:


This patch allows combine to combine two insns into two.  This helps
in many cases, by reducing instruction path length, and also allowing
further combinations to happen.  PR85160 is a typical example of code
that it can improve.


I cannot state with certainty that the improvements to our most 
notorious routine between 8.2 and current trunk are solely due to this 
change, but the differences are telling (see attached Fortran code - the 
analysis is about the third loop).


Number of instructions for this loop (Skylake i9-7900).

gfortran82 -S -Ofast -march=native -mtune=native:

  458 verint.s.82.loop3

gfortran90 -S -Ofast -march=native -mtune=native:

  396 verint.s.90.loop3

But the most stunning difference is the use of the stack [ nn(rsp) ] - 
see the attached files ...
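
For scale, the two counts above work out as follows. This is only a trivial sketch (in Python, with variable names of my own choosing), and instruction count is of course just a rough proxy for run time.

```python
insns_gcc82 = 458   # verint.s.82.loop3
insns_trunk = 396   # verint.s.90.loop3

saved = insns_gcc82 - insns_trunk
relative = saved / insns_gcc82
print(f"{saved} fewer instructions, {relative:.1%} of the 8.2 count")
```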


--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
# 1 "/scratch/hirlam/hl_home/MPI/lib/src/grdy/verint.F"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "/scratch/hirlam/hl_home/MPI/lib/src/grdy/verint.F"
c Library:grdy $RCSfile$, $Revision: 7536 $
c checked in by $Author: ovignes $ at $Date: 2009-12-18 14:23:36 +0100 (Fri, 18 Dec 2009) $
c $State$, $Locker$
c $Log$
c Revision 1.3  1999/04/22 09:30:45  DagBjoerge
c MPP code
c
c Revision 1.2  1999/03/09 10:23:13  GerardCats
c Add SGI parallelisation directives DOACROSS
c
c Revision 1.1  1996/09/06 13:12:18  GCats
c Created from grdy.apl, 1 version 2.6.1, by Gerard Cats
c
      SUBROUTINE VERINT (
     I   KLON   , KLAT   , KLEV   , KINT  , KHALO
     I , KLON1  , KLON2  , KLAT1  , KLAT2
     I , KP     , KQ     , KR
     R , PARG   , PRES
     R , PALFH  , PBETH
     R , PALFA  , PBETA  , PGAMA   )
C
C***
C
C  VERINT - THREE DIMENSIONAL INTERPOLATION
C
C  PURPOSE:
C
C  THREE DIMENSIONAL INTERPOLATION
C
C  INPUT PARAMETERS:
C
C  KLON      NUMBER OF GRIDPOINTS IN X-DIRECTION
C  KLAT      NUMBER OF GRIDPOINTS IN Y-DIRECTION
C  KLEV      NUMBER OF VERTICAL LEVELS
C  KINT      TYPE OF INTERPOLATION
C            = 1 - LINEAR
C            = 2 - QUADRATIC
C            = 3 - CUBIC
C            = 4 - MIXED CUBIC/LINEAR
C  KLON1     FIRST GRIDPOINT IN X-DIRECTION
C  KLON2     LAST  GRIDPOINT IN X-DIRECTION
C  KLAT1     FIRST GRIDPOINT IN Y-DIRECTION
C  KLAT2     LAST  GRIDPOINT IN Y-DIRECTION
C  KP        ARRAY OF INDEXES FOR HORIZONTAL DISPLACEMENTS
C  KQ        ARRAY OF INDEXES FOR HORIZONTAL DISPLACEMENTS
C  KR        ARRAY OF INDEXES FOR VERTICAL   DISPLACEMENTS
C  PARG      ARRAY OF ARGUMENTS
C  PALFH     ALFA HAT
C  PBETH     BETA HAT
C  PALFA     ARRAY OF WEIGHTS IN X-DIRECTION
C  PBETA     ARRAY OF WEIGHTS IN Y-DIRECTION
C  PGAMA     ARRAY OF WEIGHTS IN VERTICAL DIRECTION
C
C  OUTPUT PARAMETERS:
C
C  PRES      INTERPOLATED FIELD
C
C  HISTORY:
C
C  J.E. HAUGEN   1  1992
C
C***
C
      IMPLICIT NONE
C
      INTEGER KLON   , KLAT   , KLEV   , KINT   , KHALO,
     I        KLON1  , KLON2  , KLAT1  , KLAT2
C
      INTEGER   KP(KLON,KLAT), KQ(KLON,KLAT), KR(KLON,KLAT)
      REAL    PARG(2-KHALO:KLON+KHALO-1,2-KHALO:KLAT+KHALO-1,KLEV)  ,
     R        PRES(KLON,KLAT) ,
     R   PALFH(KLON,KLAT) ,  PBETH(KLON,KLAT)  ,
     R   PALFA(KLON,KLAT,4)   ,  PBETA(KLON,KLAT,4),
     R   PGAMA(KLON,KLAT,4)
C
      INTEGER JX, JY, IDX, IDY, ILEV
      REAL Z1MAH, Z1MBH
C
      IF (KINT.EQ.1) THEN
C  LINEAR INTERPOLATION
C
      DO JY = KLAT1,KLAT2
      DO JX = KLON1,KLON2
         IDX  = KP(JX,JY)
         IDY  = KQ(JX,JY)
         ILEV = KR(JX,JY)
C
         PRES(JX,JY) = PGAMA(JX,JY,1)*(
C
     +   PBETA(JX,JY,1)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY-1,ILEV-1)
     +                  + PALFA(JX,JY,2)*PARG(IDX  ,IDY-1,ILEV-1) )
     + + PBETA(JX,JY,2)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY  ,ILEV-1)
     +                  + PALFA(JX,JY,2)*PARG(IDX  ,IDY  ,ILEV-1) ) )
C    +
     +   + PGAMA(JX,JY,2)*(
C    +
     +   PBETA(JX,JY,1)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY-1,ILEV  )
     +                  + PALFA(JX,JY,2)*PARG(IDX  ,IDY-1,ILEV  ) )
     + + PBETA(JX,JY,2)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY  ,ILEV  )
     +                  + PALFA(JX,JY,2)*PARG(IDX  ,IDY  ,ILEV  ) ) )
      ENDDO
      ENDDO
C
      ELSE
     +IF (KINT.EQ.2) THEN
C  QUADRATIC INTERPOLATION
C
      DO JY = KLAT1,KLAT2
      DO JX = KLON1,KLON2
         IDX  = KP(JX,JY)
         IDY  = KQ(JX,JY)
         ILEV = KR(JX,JY)
C
         PRES(JX,JY) = PGAMA(JX,JY,1)*(
C
     +   PBETA(JX,JY,1)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY-1,ILEV-1)
     +                  + PALFA(JX,JY,2)*PARG(IDX  ,IDY-1,ILEV-1)
     +                  + PALFA(JX,JY,3)*PARG(IDX+1,IDY-1,ILEV-1) )
     + + PBETA(JX,JY,2)*

Re: [patch, fortran] Handling of .and. and .or. expressions

2018-06-28 Thread Toon Moene


On 06/28/2018 06:22 PM, Steve Kargl wrote:


You continue to miss my point or conveniently ignore it.
You want to special case the forced evaluation of the
operands in two specific logical expressions; namely, .and.
and .or.  If you want to force evaluation of operands, then
do it for all binary operators.


And - most interesting - that's how Fortran 77 formulated it (way before 
PURE/IMPURE functions entered the language):


"6.6.1 Evaluation of Operands

It is not necessary for a processor to evaluate all of the operands of 
an expression if the value of the expression can be determined 
otherwise. This principle is most often applicable to logical 
expressions, but it applies to all expressions. For example, in 
evaluating the logical expression

   X .GT. Y .OR. L(Z)
where X, Y, and Z are real, and L is a logical function, the function 
reference L(Z) need not be evaluated if X is greater than Y. If a 
statement contains a function reference in a part of an expression that 
need not be evaluated, all entities that would have become defined in 
the execution of that reference become undefined at the completion of
evaluation of the expression containing the function reference. In the 
example above, evaluation of the expression causes Z to become undefined 
if L defines its argument."
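
The two behaviours that the quoted text allows can be modelled as below. This is a sketch only, written in Python because it makes the side effect easy to observe; or_lazy and or_eager are hypothetical names standing in for two different processors, and note that Python's own `or` always short-circuits, whereas in Fortran the choice is left to the processor.

```python
calls = []


def L(z):
    """Logical function with a side effect: it records that it was called."""
    calls.append(z)
    return z > 0.0


def or_lazy(x, y, z):
    # Processor that skips L(Z) when X .GT. Y already decides the result.
    return x > y or L(z)


def or_eager(x, y, z):
    # Processor that evaluates both operands before combining them.
    left = x > y
    right = L(z)
    return left or right


lazy_result = or_lazy(2.0, 1.0, -5.0)
calls_after_lazy = len(calls)
eager_result = or_eager(2.0, 1.0, -5.0)
calls_after_eager = len(calls)
print(lazy_result, eager_result, calls_after_lazy, calls_after_eager)
```

The value of the expression is the same either way; only whether L was called differs, which is exactly why code that relies on the side effects of L is not portable.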


Kind regards,

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news

Documentation patch for -floop-interchange and -floop-unroll-and-jam.

2018-05-17 Thread Toon Moene

The documentation of both options is still inconsistent, in both the 
trunk and the gcc-8 branch.


The following is my suggestion to clear this up (and move 
-floop-unroll-and-jam close to -floop-interchange).


ChangeLog:

2018-05-17  Toon Moene  

* doc/invoke.texi: Move -floop-unroll-and-jam documentation
directly after that of -floop-interchange. Indicate that both
options are enabled by default when specifying -O3.

OK for trunk and gcc-8 ?

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
Index: invoke.texi
===
--- invoke.texi	(revision 260287)
+++ invoke.texi	(working copy)
@@ -8866,7 +8866,14 @@
 for (int j = 0; j < N; j++)
   c[i][j] = c[i][j] + a[i][k]*b[k][j];
 @end smallexample
+This flag is enabled by default at @option{-O3}.
 
+@item -floop-unroll-and-jam
+@opindex floop-unroll-and-jam
+Apply unroll and jam transformations on feasible loops.  In a loop
+nest this unrolls the outer loop by some factor and fuses the resulting
+multiple inner loops.  This flag is enabled by default at @option{-O3}.
+
 @item -ftree-loop-im
 @opindex ftree-loop-im
 Perform loop invariant motion on trees.  This pass moves only invariants that
@@ -10038,12 +10045,6 @@
 Move branches with loop invariant conditions out of the loop, with duplicates
 of the loop on both branches (modified according to result of the condition).
 
-@item -floop-unroll-and-jam
-@opindex floop-unroll-and-jam
-Apply unroll and jam transformations on feasible loops.  In a loop
-nest this unrolls the outer loop by some factor and fuses the resulting
-multiple inner loops.
-
 @item -ffunction-sections
 @itemx -fdata-sections
 @opindex ffunction-sections

New option -floop-unroll-and-jam.

2018-04-19 Thread Toon Moene

According to the Changes page for GCC 8, -floop-unroll-and-jam is 
enabled by default for -O3 optimization:


"Two new classical loop nest optimization passes have been added. 
-floop-unroll-and-jam performs outer loop unrolling and fusing of the 
inner loop copies. -floop-interchange exchanges loops in a loop nest to 
improve data locality. Both passes are enabled by default at -O3 and above."


However, the documentation of optimization options does not reflect this.
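
For those who have not met the transformation: "outer loop unrolling and fusing of the inner loop copies" looks roughly like the sketch below when done by hand. It is written in Python only for brevity, the function names are hypothetical, the unroll factor of 2 is my choice, and it says nothing about how the GCC pass decides which loops are feasible.

```python
def outer_product_add(a, x, y):
    """Reference loop nest: y[i][j] += a[i] * x[j]."""
    n, m = len(y), len(x)
    for i in range(n):
        for j in range(m):
            y[i][j] += a[i] * x[j]
    return y


def outer_product_add_jammed(a, x, y):
    """Same computation with the outer loop unrolled by 2 and the two
    resulting inner loops fused ("jammed") into one."""
    n, m = len(y), len(x)
    i = 0
    while i + 1 < n:
        for j in range(m):          # one fused inner loop, two rows
            y[i][j] += a[i] * x[j]
            y[i + 1][j] += a[i + 1] * x[j]
        i += 2
    if i < n:                       # remainder row when n is odd
        for j in range(m):
            y[i][j] += a[i] * x[j]
    return y


a = [1.0, 2.0, 3.0]
x = [10.0, 20.0]
ref = outer_product_add(a, x, [[0.0, 0.0] for _ in a])
jam = outer_product_add_jammed(a, x, [[0.0, 0.0] for _ in a])
print(ref == jam)
```

The jammed version loads each x[j] once for two rows, which is where the benefit is expected to come from.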

Is the following change to the documentation acceptable ?

ChangeLog

2018-04-19  Toon Moene  

* doc/invoke.texi: Add -floop-unroll-and-jam to options enabled
by -O3.

Index: invoke.texi
===
--- invoke.texi (revision 259471)
+++ invoke.texi (working copy)
@@ -7652,6 +7652,7 @@
 -ftree-loop-distribution @gol
 -ftree-loop-distribute-patterns @gol
 -floop-interchange @gol
+-floop-unroll-and-jam @gol
 -fsplit-paths @gol
 -ftree-slp-vectorize @gol
 -fvect-cost-model @gol

Thanks,

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news

Re: Disable autogeneration of gather instructions on Ryzen and generic

2018-01-11 Thread Toon Moene


On 01/09/2018 11:28 AM, Richard Biener wrote:


Note that the vectorizer gives up on loops with gathers with no target
support for gathers.  It could simply open-code the gather though (and
properly cost that open-coded variant), that's probably the way to go here.


Man, I wish I had made this comment a few years ago, when Jakub Jelinek 
implemented support for gather in vectorization *of our loops*.


Before that, I had only seen programmers do it for our code by rewriting 
the Fortran into something that was completely unreadable.


--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news

Re: [RFA] Zen tuning part 9: Add support for scatter/gather in vectorizer costmodel

2017-10-21 Thread Toon Moene


On 10/17/2017 07:22 PM, Jan Hubicka wrote:


According to Agner's tables, gathers range from 12 ops (vgatherdpd)
to 66 ops (vpgatherdd).  I assume that CPU needs to do following:


In our code, it is basically a "don't care" how much work a gather 
instruction has to do.


Without gather the most expensive loop in our code couldn't be 
vectorized (there are only a handful of gather instructions in that loop 
and dozens of other vector instructions).
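
As an illustration of the kind of loop meant here, the sketch below is a scalar model of an interpolation whose loads go through an index array. It is Python, not our Fortran, all names are hypothetical, and it models only the access pattern: a real gather instruction fetches several such elements at once.

```python
def gather(table, indices):
    """Scalar model of a vector gather: one load per index."""
    return [table[i] for i in indices]


def interpolate(table, indices, weights):
    """Loop shaped like an indexed interpolation: each result is a
    weighted sum of two neighbouring table entries selected by index."""
    lo = gather(table, [i - 1 for i in indices])
    hi = gather(table, indices)
    return [w * a + (1.0 - w) * b for w, a, b in zip(weights, lo, hi)]


table = [0.0, 10.0, 20.0, 30.0, 40.0]
indices = [3, 1, 4, 2]            # not contiguous, known only at run time
weights = [0.5, 0.25, 1.0, 0.0]
result = interpolate(table, indices, weights)
print(result)
```

Only the two gather calls need indexed loads; the multiplies and adds around them are ordinary vector work, which is why the loop is worth vectorizing even if the gathers themselves are slow.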


Kind regards,

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news

Re: [PATCH doc] use "cannot" consistently

2017-03-15 Thread Toon Moene


On 03/14/2017 09:53 PM, Richard Kenner wrote:


The GCC manual uses "cannot" in most places (280 lines) but there
are a few instances of "can't" (33 lines).

The attached patch replaces the informal "can't" with the former
for consistency.


In my opinion, this is the wrong direction.  Contractions are becoming
more acceptable in even more formal writing and there's even a current
US Supreme Court justice who uses them in her opinions.

I certainly don't think it's worth a change to use "can't" throughout,
but I'm not in favor of eliminating it either.


I think for non-native speakers of English, using the full word is 
easier to read (you can't take my experience as an example, as I was 
exposed to written English 48 years ago).


I think replacing the few instances of can't with cannot is worth the 
clarity.


Kind regards,

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news

Re: [PATCH, Fortran] Extension: Support legacy PARAMETER statements with -std=legacy (or -fdec?)

2016-11-02 Thread Toon Moene


On 11/02/2016 04:47 PM, Fritz Reese wrote:


All,

Another quirk of legacy compilers is their syntax for PARAMETER
statements. Such statements are similar to standard PARAMETER
statements but lack parentheses following the PARAMETER keyword. There
is a good reason the standard doesn't support this - because the
statement becomes ambiguous with assignment statements in fixed form.
Consider the following:

parameter pi = 3.14d0

In fixed form, spaces are ignored so in standard Fortran the above
looks like an assignment to the variable "parameterpi". In legacy
compilers, the above is instead interpreted as

parameter (pi = 3.14d0)

which of course declares the variable 'pi' to be a parameter with the
value 3.14.


Please test this with a program that has a *variable* that's named 
PARAMETER.


I have encountered Fortran compilers in the past that couldn't cope with 
it (I won't name names).
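
The ambiguity, and the test case I am asking for, can be shown with a toy model of fixed-form blank squeezing. This is a Python sketch for illustration only; the function names are hypothetical and it is in no way how gfortran (or the patch) parses the statement.

```python
def squeeze_fixed_form(statement):
    """Fixed-form Fortran ignores blanks in statements; this sketch
    simply drops all of them and upper-cases the rest."""
    return "".join(statement.split()).upper()


def possible_readings(statement):
    """Return the readings of a 'PARAMETER name = value' line that are
    under discussion; purely illustrative."""
    text = squeeze_fixed_form(statement)
    readings = []
    if "=" in text:
        target, value = text.split("=", 1)
        readings.append(("assignment", target, value))
        if target.startswith("PARAMETER") and len(target) > len("PARAMETER"):
            name = target[len("PARAMETER"):]
            readings.append(("legacy parameter", name, value))
    return readings


readings = possible_readings("parameter pi = 3.14d0")
for reading in readings:
    print(reading)
```

A statement such as `parameter = 1.0`, with a variable actually named PARAMETER, has only the assignment reading in this model, and that is the case a compiler must not trip over.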


Thanks and kind regards,

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news

Re: [PATCH, Fortran] Extension: COTAN and degree-valued trig intrinsics with -fdec-math

2016-09-26 Thread Toon Moene


On 09/26/2016 04:01 PM, Fritz Reese wrote:


All,

Attached is a patch extending the GNU Fortran front-end to support
some additional math intrinsics, enabled with a new compile flag
-fdec-math. The flag adds the COTAN intrinsic (cotangent), as well as
degree versions of all trigonometric intrinsics (SIND, TAND, ACOSD,
etc...). This extension allows for further compatibility with legacy
code that depends on the compiler to support such intrinsic functions.


Don't you want an option name like -fdegree-trigon-math ? Note that 
option name lengths are hardly a problem, as most end up in scripts and 
make files ... As for whether support for COTAN (in radians) should be 
part of this option - I would rather it were not.
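
To show why I see these as two separate things: the degree variants are plain unit conversions around the existing intrinsics, while COTAN is a new function in radians. A sketch in Python (the definitions are mine, written from the usual mathematical meaning of the names, not from the patch):

```python
import math


def sind(x):
    """Sine of an angle given in degrees."""
    return math.sin(math.radians(x))


def tand(x):
    """Tangent of an angle given in degrees."""
    return math.tan(math.radians(x))


def acosd(x):
    """Arc cosine, with the result in degrees."""
    return math.degrees(math.acos(x))


def cotan(x):
    """Cotangent of an angle given in radians."""
    return math.cos(x) / math.sin(x)


print(sind(30.0), tand(45.0), acosd(0.5), cotan(math.pi / 4))
```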


Kind regards,

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news

Re: Ping : [Patch, fortran] PR48298 - [F03] User-Defined Derived-Type IO (DTIO)

2016-08-29 Thread Toon Moene


On 08/27/2016 10:15 PM, Janne Blomqvist wrote:


On Sat, Aug 27, 2016 at 9:50 PM, Paul Richard Thomas wrote:

Although we have said that we would commit on Monday if no review is
forthcoming, we would very much prefer that somebody takes a look. We
understand perfectly that a 4052 line patch is rather daunting.
However, even a cursory scan of the patch would be helpful.


To be honest, I had a nagging suspicion that DTIO would remain forever
on the TODO list, but as you and Jerry have pulled it off, my hat is
off to you!


At the last Fortran Standardization Committee meeting (Boulder, June) I 
opined that UDDTIO might be an unsolved F2003 issue in gfortran 
eternally, because there is no real *pressure* for a free compiler to be 
Standard Conformant.


Now the last major item on the F2003 list is "parameterized derived 
types". Compiler writers on the committee tell me that you need to 
overhaul most of the front end (and parts of the run-time library) to 
get that correct ...


Thanks Paul et al. for working on this daunting task !

Kind regards,

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news

Re: [patch, fortran] Inline MATMUL(A,TRANSPOSE(B)), PR 66094

2016-01-24 Thread Toon Moene


On 01/23/2016 01:26 PM, Thomas Koenig wrote:


Hi Toon,


However, today I *did* run the test harness with your modification:


...

Thanks for the testing!

So, what do people think?  Is the patch OK for trunk?


As far as I am able to determine, this is working. We still have 3 
months (until mid-April) to fix it, if necessary.


Thanks !

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news

Re: [patch, fortran] Inline MATMUL(A,TRANSPOSE(B)), PR 66094

2016-01-19 Thread Toon Moene


On 01/18/2016 08:55 PM, Toon Moene wrote:


On 01/17/2016 01:44 PM, Thomas Koenig wrote:


So... comments?  Toon, would this help you?  Could you maybe give this
a spin?


Thanks, the nightly test at my home computer will build with your patch.


That was the plan; unfortunately, the system crashed while doing this 
(due to an unrelated problem).


However, today I *did* run the test harness with your modification:

https://gcc.gnu.org/ml/gcc-testresults/2016-01/msg01795.html

Looks good.  These are the messages related to your new test cases:

/home/toon/compilers/trunk/gcc/testsuite/gfortran.dg/inline_matmul_13.f90:34:2: 
Warning: Code for reallocating the allocatable array at (1) will be 
added [-Wrealloc-lhs]
/home/toon/compilers/trunk/gcc/testsuite/gfortran.dg/inline_matmul_13.f90:41:2: 
Warning: Code for reallocating the allocatable array at (1) will be 
added [-Wrealloc-lhs]


PASS: gfortran.dg/inline_matmul_13.f90   -O0   (test for warnings, line 34)
PASS: gfortran.dg/inline_matmul_13.f90   -O0   (test for warnings, line 41)
PASS: gfortran.dg/inline_matmul_13.f90   -O0  (test for excess errors)
...
PASS: gfortran.dg/inline_matmul_13.f90   -O0  execution test
PASS: gfortran.dg/inline_matmul_13.f90   -O0   scan-tree-dump-times 
original "_gfortran_matmul" 0


and ditto for higher optimization levels.

The bounds tests added also completed correctly.

Thanks !

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news

Re: [patch, fortran] Inline MATMUL(A,TRANSPOSE(B)), PR 66094

2016-01-18 Thread Toon Moene


On 01/18/2016 11:14 PM, Thomas Koenig wrote:


Hi Toon,


It will also perform the following tests (minus the
"inline_matmul_13.f90" one, which wasn't included in the attachments :-)


Well, here it is.


Included, thanks,

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news

Re: [patch, fortran] Inline MATMUL(A,TRANSPOSE(B)), PR 66094

2016-01-18 Thread Toon Moene


On 01/17/2016 01:44 PM, Thomas Koenig wrote:


So... comments?  Toon, would this help you?  Could you maybe give this
a spin?


Thanks, the nightly test at my home computer will build with your patch.


2016-01-17  Thomas Koenig  

 PR fortran/66094
 * frontend-passes.c (enum matrix_case):  Add case A2B2T for
 MATMUL(A,TRANSPOSE(B)) where A and B are rank 2.
 (inline_limit_check):  Also add A2B2T.
 (matmul_lhs_realloc):  Handle A2B2T.
 (check_conjg_variable):  Rename to
 (check_conjg_transpose_variable):  and also count TRANSPOSE.
 (inline_matmul_assign):  Handle A2B2T.


It will also perform the following tests (minus the 
"inline_matmul_13.f90" one, which wasn't included in the attachments :-)



2016-01-17  Thomas Koenig  

 PR fortran/66094
 * gfortran.dg/inline_matmul_13.f90:  New test.
 * gfortran.dg/matmul_bounds_8.f90:  New test.
 * gfortran.dg/matmul_bounds_9.f90:  New test.
 * gfortran.dg/matmul_bounds_10.f90:  New test.
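
For readers less familiar with the A2B2T case: C = MATMUL(A,TRANSPOSE(B)) can be computed by reading B with its two indices swapped, so no temporary holding the transpose is needed. The following is only a minimal C sketch of that idea, not the code the front end generates; the function name, the row-major layout and the flat indexing are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* c (n x m) = a (n x k) times the transpose of b (m x k), row-major.
   The transpose is never materialised: b is read as b[j][l] where a
   plain matrix product would read b[l][j].  */
void
matmul_a_bt (size_t n, size_t m, size_t k,
             const double *a, const double *b, double *c)
{
  assert (a != NULL && b != NULL && c != NULL);
  for (size_t i = 0; i < n; i++)
    for (size_t j = 0; j < m; j++)
      {
        double sum = 0.0;
        for (size_t l = 0; l < k; l++)
          sum += a[i * k + l] * b[j * k + l];
        c[i * m + j] = sum;
      }
}
```

Both inner accesses run with unit stride here, which is one reason handling the transposed case directly can be attractive.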


Unfortunately, running the whole of our weather forecasting system with 
gcc-6 will be *a lot of work*, because I have to build all kinds of 
support libraries (for which I now depend on Debian Testing) by hand.


But I hope just testing your examples will at least give you an idea (on 
-march=haswell).


Thanks, and kind regards,

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news

Re: [wwwdocs] Document how to add a new SSH key

2015-12-23 Thread Toon Moene


On 12/23/2015 08:57 PM, Marc Glisse wrote:


On Wed, 23 Dec 2015, Toon Moene wrote:


On 12/22/2015 02:59 PM, Gerald Pfeifer wrote:



+ssh username@gcc.gnu.org append-key < KEYFILE


Hmm, I get:

toon@moene:~$ ssh t...@gcc.gnu.org append-key < .ssh/id_dsa.pub
/home/toon/.ssh/config line 3: Bad protocol spec '1'.

toon@moene:~$ cat .ssh/config
Host gcc.gnu.org
ForwardX11 no
Protocol 1

Any hints ?


Wild guess (that you are using debian testing or unstable):

apt-get install openssh-client-ssh1
ssh1 t...@gcc.gnu.org ...


You are probably right. Instead of going backwards, I sent a new key to 
the overseers ...


Thanks,

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news

Re: [wwwdocs] Document how to add a new SSH key

2015-12-23 Thread Toon Moene


On 12/22/2015 02:59 PM, Gerald Pfeifer wrote:


Jan (Beulich) ran into this, and indeed I could not find it
documented.  So I added it. ;-)



+ssh username@gcc.gnu.org append-key < KEYFILE


Hmm, I get:

toon@moene:~$ ssh t...@gcc.gnu.org append-key < .ssh/id_dsa.pub
/home/toon/.ssh/config line 3: Bad protocol spec '1'.

toon@moene:~$ cat .ssh/config
Host gcc.gnu.org
ForwardX11 no
Protocol 1

Any hints ?

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news

Re: [C/C++ PATCH] Fix -Wshift-overflow with sign bit

2015-08-12 Thread Toon Moene


On 08/12/2015 05:39 PM, Marek Polacek wrote:


This patch fixes a defect in -Wshift-overflow.  We should only warn
about left-shifting 1 into the sign bit when -Wshift-overflow=2.  But
this doesn't apply only for 1 << 31, but also for 2 << 30, etc.
In C++14, never warn about this.


And then there's this:

https://gcc.gnu.org/ml/gcc-testresults/2015-08/msg01036.html

[ Yes, that's at run time, not compile time ... ]
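
To spell out the distinction drawn in the quoted patch description (1 << 31 and 2 << 30 both land in the sign bit, while larger shifts overflow outright), here is a small C sketch. It is a hypothetical helper, not GCC code, and it assumes a 32-bit signed type and a positive left operand.

```c
#include <assert.h>
#include <stdint.h>

enum shift_class { SHIFT_OK, SHIFT_INTO_SIGN_BIT, SHIFT_OVERFLOW };

/* Classify "value << count" for a 32-bit signed int, with value > 0
   and 0 <= count < 32, without performing the signed shift itself.  */
enum shift_class
classify_shift (int32_t value, unsigned count)
{
  assert (value > 0 && count < 32);
  /* Do the shift in a wider unsigned type so no bits are lost.  */
  uint64_t wide = (uint64_t) value << count;
  if (wide <= (uint64_t) INT32_MAX)
    return SHIFT_OK;
  if (wide <= (uint64_t) UINT32_MAX)
    return SHIFT_INTO_SIGN_BIT;   /* e.g. 1 << 31, 2 << 30 */
  return SHIFT_OVERFLOW;          /* e.g. 2 << 31 */
}
```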

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news

Re: [Bug fortran/52846] [F2008] Support submodules - part 3/3

2015-08-10 Thread Toon Moene


On 08/03/2015 02:36 PM, Paul Richard Thomas wrote:

Dear Mikael,

Thanks for your green light!

I have been mulling over the trans-decl part of the patch and have
been wondering if it is necessary. Without optimization, private
entities can be linked to. Given the discussion concerning the
combination of submodules and private entities, I wonder if this is
not sufficient? Within submodule scope, an advisory could be given for
undefined references to suggest recompiling the module without
optimization or making the entities public.

Cheers

Paul

On 3 August 2015 at 12:44, Mikael Morin  wrote:

Le 29/07/2015 17:08, Paul Richard Thomas a écrit :


Dear All,

On 24 July 2015 at 10:08, Damian Rouson 
wrote:


I love this idea and had similar thoughts as well.

:D

Sent from my iPhone


On Jul 24, 2015, at 1:06 AM, Paul Richard Thomas
 wrote:

Dear Mikael,

It had crossed my mind also that a .mod and a .smod file could be
written. Normally, the .smod files are produced by the submodules
themselves, so that their descendants can pick up the symbols that
they generate. There is no reason at all why this could not be
implemented; early on in the development I did just this, although I
think that it would now be easier to modify this patch.

One huge advantage of proceeding in this way is that any resulting
library can be distributed with the .mod file alone so that the
private entities are never exposed. The penalty is that a second file
is output.

With best regards

Paul



Please find attached the implementation of this suggestion.

Bootstraps and regtests on FC21/x86_64 - OK for trunk or is the
original preferred?


There haven't been a lot of voices about this among the other active and less
active team members.
I prefer this "private members to separate smod" variant.
It's OK for trunk as far as I'm concerned.
Thanks.

Mikael

PS: Regarding redundant initializations: rather have too many than too few.
;-)


Although I do not immediately know if this is relevant for *this* 
debate, J3 passed the following (attached) interpretation on submodules 
the past week (it still has to go to several mail ballots, but still), 
overwhelmingly preferring option 3:


[attached]

Kind regards,

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
 J3/15-208
To: J3
From: Malcolm Cohen
Subject: Interp of USE and submodules
Date: 2015 August 06


1. Introduction

Three options are provided for the answer to this interp:

Option 1: Basically what the actual text of the standard says now.
  Accessing a PROTECTED item by a USE statement will hide the
  host association, and therefore the item will be protected.

Option 2: Continue to allow a submodule to USE its ancestor, but say
  that PROTECTED has no effect in this case.

Option 3: Decide that it was a mistake to allow a submodule to access
  its ancestor module by use association, and forbid it.

2. The interpretation request

--

NUMBER: F08/0128
TITLE: Is recursive USE within a submodule permitted?
KEYWORDS: SUBMODULE, USE
DEFECT TYPE: Erratum
STATUS: J3 consideration in progress

QUESTION:

Consider
  Module m1
    Real x
  End Module
  Submodule(m1) subm1
    Use m1
  End Submodule

Q1. The module m1 is referenced from within one of its own
submodules.  Is this standard-conforming?

Note that the "submodule TR", Technical Report 19767, contains an edit
with the normative requirement:
  "A submodule shall not reference its ancestor module by use
   association, either directly or indirectly."
along with a note which says
  "It is possible for submodules with different ancestor modules to
   access each other's ancestor modules by use association."
It also contains an edit to insert the direct reference prohibition
as a constraint.

However, none of this text appears in ISO/IEC 1539-1:2010.

The Introduction simply comments that submodules are available, but
not that they have been extended beyond the Technical Report that
created them.

Also, consider

  Module m2
    Real, Private :: a
    Real, Protected :: b
    ...
  End Module
  Submodule(m2) subm2
  Contains
    Subroutine s
      Use m2
      Implicit None
      a = 3
      b = 4
    End Subroutine
  End Submodule

In submodule SUBM2, procedure S references M2 by use association.
Use association does not make "A" accessible.

Q2. Is "A" still accessible by host association?

Also, procedure S attempts to assign a value to B, which is accessed
by use association, but has the PROTECTED attribute.  Normally, this
attribute prevents assignment to variables accessed by use
association.

Q3. Is

Re: [wwwdocs] Skeleton for GCC 6 release notes

2015-05-06 Thread Toon Moene


On 05/06/2015 12:06 AM, Gerald Pfeifer wrote:


I started working on this over the weekend, and then Jason
wondered about it yesterday, so I completed and committed the
following skeleton for the GCC 6 release notes yesterday.



+GCC 6 Release Series
+Changes, New Features, and Fixes
+
+
+Caveats
+
+
+
+
+
+
+
+New Languages and Language specific improvements


Aarghh - you forgot Fortran !

/snark.

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news

Re: [PATCH] Make IPA-CP propagate alignment information of pointers

2014-11-16 Thread Toon Moene


On 11/15/2014 02:04 AM, Martin Jambor wrote:


Hi,

this patch adds very simple propagation of alignment of pointers to
IPA-CP.  Because I have not attempted to estimate profitability of
such propagation in any way, it does not do any cloning, just
propagation when the alignment is known and the same in all contexts.


Thanks for this improvement !

From the Fortran side, arrays can be "created" in the following ways:

1. Statically in the main program.

2. As a subroutine-temporary "automatic array".

3. By allocating an allocatable array.

Arrays under 1. are aligned properly by the compiler.

Arrays under 2. are aligned properly because of the proper alignment of 
the stack nowadays.


Arrays under 3. are aligned properly because Fortran "ALLOCATE" 
ultimately calls malloc.


So Fortran arrays are always suitably aligned (the exception being an 
array actual argument passed as "CALL SUB(.., A(2:), ..)", which is 
extremely rare).


So this propagation of alignment information will result in basically 
removing all alignment peeling for Fortran code.
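
The A(2:) exception can be illustrated with a small C sketch: an array whose base is 32-byte aligned stops being 32-byte aligned as soon as you start one 4-byte element further on. The helper name and the 32-byte figure are assumptions for illustration, not something taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return nonzero if p is aligned to `alignment` bytes.
   `alignment` must be a power of two.  */
int
is_aligned (const void *p, size_t alignment)
{
  assert (alignment != 0 && (alignment & (alignment - 1)) == 0);
  return ((uintptr_t) p & (alignment - 1)) == 0;
}
```

With a float array `a` whose first element is 32-byte aligned, `is_aligned (a, 32)` holds but `is_aligned (a + 1, 32)` does not, which is the situation a callee sees when handed A(2:).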


Thanks !

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news

Re: [PATCH] Add support for -fno-sanitize-recover and -fsanitize-undefined-trap-on-error (PR sanitizer/60275)

2014-05-15 Thread Toon Moene


On 05/15/2014 09:10 PM, Jakub Jelinek wrote:


On Thu, May 15, 2014 at 08:37:07PM +0200, Marek Polacek wrote:



Sure, -fdiagnostics-color=auto "means to use color only when the
standard error is a terminal", or -fdiagnostics-color=never to turn it
off completely (testsuite uses the latter).


I guess Toon meant that there is no easy way? to get rid of the color
stuff in libsanitizer messages.


Sure. The point is that the testsuite runs only write to a *file*, so 
why should I get color-coded error messages like this:


ESC[1m/home/toon/compilers/trunk/gcc/config/i386/i386.c:6577:60:ESC[1mESC[31m 
runtime error: ESC[1mESC[0mESC[1mload of value 32763, which is not a 
valid value for type 'x86_64_reg_class'ESC[1mESC[0m


?

It makes precise grepping needlessly hard ...
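
As a stopgap, a log can be post-processed to drop the colour codes. Below is a minimal C sketch that removes sequences of the form ESC [ digits-and-semicolons m; the function name is made up, and it deliberately handles only this one kind of escape sequence.

```c
#include <assert.h>
#include <stddef.h>

/* Copy src to dst, dropping colour sequences of the form
   ESC '[' digits-and-semicolons 'm'.  dst must have room for
   strlen (src) + 1 bytes.  Returns the length of the result.  */
size_t
strip_sgr (const char *src, char *dst)
{
  assert (src != NULL && dst != NULL);
  size_t n = 0;
  while (*src != '\0')
    {
      if (src[0] == '\033' && src[1] == '[')
        {
          const char *p = src + 2;
          while ((*p >= '0' && *p <= '9') || *p == ';')
            p++;
          if (*p == 'm')
            {
              src = p + 1;   /* Skip the whole sequence.  */
              continue;
            }
        }
      dst[n++] = *src++;     /* Ordinary byte: keep it.  */
    }
  dst[n] = '\0';
  return n;
}
```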

Otherwise, thanks very much for this work - definitely appreciated !

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news

Re: [PATCH] Add support for -fno-sanitize-recover and -fsanitize-undefined-trap-on-error (PR sanitizer/60275)

2014-05-15 Thread Toon Moene


On 05/15/2014 05:08 PM, Marek Polacek wrote:


On Thu, May 15, 2014 at 11:39:26AM +0100, Ramana Radhakrishnan wrote:

What's the overhead with bootstrap-ubsan ?


I timed normal bootstrap and bootstrap-ubsan on x86_64, 24 cores,
Intel(R) Xeon(R) CPU X5670  @ 2.93GHz (aka cfarm 20) and the results:

--enable-languages=all

real    35m10.419s
user    204m5.613s
sys     6m15.615s

--enable-languages=all --with-build-config=bootstrap-ubsan

real    71m39.338s
user    347m53.409s
sys     7m44.281s
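
Put as ratios, these timings come to roughly 2.0 times the wall-clock time and 1.7 times the user CPU time. The arithmetic can be checked with a few lines of C; the helper names are made up for this sketch.

```c
#include <assert.h>

/* Convert a `time`-style "MmS.SSSs" reading to seconds.  */
double
to_seconds (int minutes, double seconds)
{
  assert (minutes >= 0 && seconds >= 0.0 && seconds < 60.0);
  return 60.0 * minutes + seconds;
}

/* Slowdown factor of the instrumented build over the plain one.  */
double
slowdown (double plain_s, double instrumented_s)
{
  assert (plain_s > 0.0);
  return instrumented_s / plain_s;
}
```

For example, `slowdown (to_seconds (35, 10.419), to_seconds (71, 39.338))` gives the wall-clock factor.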


And don't underestimate the *usefulness* of this - if you don't have the 
resources to do a ubsan bootstrap, download mine from last night 
(x86_64-linux-gnu): http://moene.org/~toon/gcc-tests.log.gz


[ I hope there is a way to discard color codings when writing error 
messages to a file, ugh ]


Kind regards,

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news

Re: Patch ping

2014-04-16 Thread Toon Moene


On 04/14/2014 01:02 PM, Jakub Jelinek wrote:


On Thu, Apr 10, 2014 at 12:01:31PM -0400, DJ Delorie wrote:



So, now that 4.9 has branched, are both patches ok for trunk, or just the
first one?  The first one fixes --with-build-config=bootstrap-ubsan
fully and --with-build-config=bootstrap-asan partially, the second one
--with-build-config=bootstrap-asan fully.


Now that the 4.9 branch happened, I sincerely hope this goes in (both 
parts of it) - my bootstrap-asan run this morning still failed.


I'm quite sure regular asan/ubsan bootstraps on various platforms (mine 
is only the most common x86-64 one) would be helpful to find bugs in the 
compilers' frontends, middle end and libraries ...


Kind regards,

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news

Re: How to generate AVX512 instructions now (just to look at them).

2014-01-05 Thread Toon Moene


On 01/03/2014 10:11 PM, Jakub Jelinek wrote:


Hi!

On Fri, Jan 03, 2014 at 08:58:30PM +0100, Toon Moene wrote:

I don't doubt that would work, what I'm interested in, is (cat verintlin.f):


Well, you need gather loads for that and there you hit PR target/59617.


I tried your patch, and the effect on the most heavily used loop in the 
full routine (not the part that I quoted before):


160   DO JY = KLAT1,KLAT2
161   DO JX = KLON1,KLON2
162  IDX  = KP(JX,JY)
163  IDY  = KQ(JX,JY)
164  ILEV = KR(JX,JY)
...
237  + + PBETA(JX,JY,4)*( PALFA(JX,JY,1)*PARG(IDX-2,IDY+1,ILEV+1)
238  +  + PALFA(JX,JY,2)*PARG(IDX-1,IDY+1,ILEV+1)
239  +  + PALFA(JX,JY,3)*PARG(IDX  ,IDY+1,ILEV+1)
240  +  + PALFA(JX,JY,4)*PARG(IDX+1,IDY+1,ILEV+1) ) )
241   ENDDO
242   ENDDO

is (just counting assembler lines, i.e., instructions):

-Ofast -mavx2 -mfma:   627 lines in the .s file.

-Ofast -mavx2 -mfma -mavx512f: 588 lines in the .s file.

However, this routine is clearly memory bound (as the vectorization with 
the gather instruction, needed for the indirect addressing via IDX  = 
KP(JX,JY), etc., didn't bring any speed improvement).


The number of instructions accessing memory:

-Ofast -mavx2 -mfma:   364 lines in the .s file.

-Ofast -mavx2 -mfma -mavx512f: 221 lines in the .s file.

So there might be a clear improvement here ...

Thanks !

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news

Re: Ubsan merged into trunk

2013-08-30 Thread Toon Moene


On 08/30/2013 06:14 PM, Marek Polacek wrote:


I've just merged ubsan into trunk.  Please send complaints my way.
Thanks,

Marek


Just watch the equivalent of this one:

http://gcc.gnu.org/ml/gcc-testresults/2013-08/msg02869.html

tomorrow morning (substitute "java" for "go" and "ubsan" for "asan").

Kind regards,

--
Toon Moene, Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news

Re: [gomp4] C++ OpenMP 4.0 atomics support

2013-03-15 Thread Toon Moene


On 03/15/2013 05:27 PM, Jakub Jelinek wrote:


Queued for gomp-4_0-branch (to be created next week).  Comments?


I heard from colleagues on the Fortran Standardization Committee 
(http://j3-fortran.org) that 4.0 doubled in size w.r.t. the 3.x standard.


I wish you lots of success implementing this - it is really hard to get 
a cross-language standard like this one correct, let alone its 
implementation.


The reports I receive on the OpenMP implementation in GCC (from the 
gfortran users' side) are without exception positive.


Thanks !

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news

Re: r196201 - in /trunk: gcc/ChangeLog gcc/config/i...

2013-02-26 Thread Toon Moene


On 02/24/2013 07:20 PM, Jakub Jelinek wrote:


On Fri, Feb 22, 2013 at 02:42:44PM -0800, H.J. Lu wrote:

2013-02-22  H.J. Lu

* bootstrap-asan.mk (POSTSTAGE1_LDFLAGS): Add
-B$$r/prev-$(TARGET_SUBDIR)/libsanitizer/asan/.
diff --git a/config/bootstrap-asan.mk b/config/bootstrap-asan.mk
index d37a9da..e3f34f5 100644
--- a/config/bootstrap-asan.mk
+++ b/config/bootstrap-asan.mk
@@ -3,4 +3,5 @@
  STAGE2_CFLAGS += -fsanitize=address
  STAGE3_CFLAGS += -fsanitize=address
  POSTSTAGE1_LDFLAGS += -fsanitize=address -static-libasan \
+ -B$$r/prev-$(TARGET_SUBDIR)/libsanitizer/asan/ \
  -B$$r/prev-$(TARGET_SUBDIR)/libsanitizer/asan/.libs


Ok.

Jakub


After this went in, I got further now:

http://gcc.gnu.org/ml/gcc-testresults/2013-02/msg02974.html

than before:

http://gcc.gnu.org/ml/gcc-testresults/2013-02/msg02749.html

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news

Re: [PATCH] Check headers in verify_loop_structure

2013-02-08 Thread Toon Moene


On 02/08/2013 12:20 PM, Richard Biener wrote:


On Thu, 7 Feb 2013, Marek Polacek wrote:


On Thu, Feb 07, 2013 at 02:56:48PM +0100, Richard Biener wrote:

+  /* Check the headers.  */
+  FOR_EACH_BB (bb)
+{
+  /* Skip BBs in the root tree.  */
+  if (bb->loop_father == current_loops->tree_root)
+   continue;


You shouldn't need this ... it will miss missing toplevel loops


Done.


+  if (bb_loop_header_p (bb))
+   if (bb->loop_father->header != bb)


   &&  bb->loop_father->header != bb)


Fixed.  Regtested/bootstrapped on x86_64 again (with tailc fix:
http://gcc.gnu.org/ml/gcc-patches/2013-02/msg00310.html), ok now?


Ok.  Let's watch for fallout...

Thanks,
Richard.


Maybe here:

http://gcc.gnu.org/ml/gcc-testresults/2013-02/msg00835.html

?

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news

Re: ASAN merge...

2012-11-17 Thread Toon Moene


On 11/16/2012 04:35 PM, Ian Lance Taylor wrote:


I expect it's pronounced with a sort of throat-clearing noise that is
hard to write.  Sort of like the gargling sound represented by "argh."
  argh32 and argh64.


I thought it was a nice way to enshrine a Monty Python joke into silicon.

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news

Re: Backtrace library [1/3]

2012-09-12 Thread Toon Moene


On 09/12/2012 01:08 AM, Ian Lance Taylor wrote:


The interface is somewhat constrained in that, on systems that support
anonymous mmap, it does not call malloc.  That makes it possible to do
a symbolic backtrace from a signal handler.


It would also make it possible to have a traceback of a segmentation 
fault caused by corruption of the malloc arena ...


--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news

Re: Merge C++ conversion into trunk (0/6 - Overview)

2012-08-15 Thread Toon Moene


On 08/15/2012 06:00 PM, Diego Novillo wrote:

> On the switch to C++ as the build language for GCC ...

Here are my results:

0:30 UTC - using C as the initial build language:

http://gcc.gnu.org/ml/gcc-testresults/2012-08/msg01329.html

and:

18:40 UTC - using C++ as the initial build language:

http://gcc.gnu.org/ml/gcc-testresults/2012-08/msg01408.html

both for x86_64-unknown-linux-gnu native (note: 
--with-build-config=bootstrap-lto).


As far as I can tell, there is little difference.

Congratulations, Diego and all the others who worked on this transition.

Kind regards,

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news

Re: [PATCH] Fix PR53295

2012-05-12 Thread Toon Moene


On 05/12/2012 12:36 PM, Richard Guenther wrote:


On Sat, May 12, 2012 at 9:53 AM, Toon Moene  wrote:



On 05/11/2012 01:59 PM, Richard Guenther wrote:


This fixes the dependency of vectorization of strided loads on
gather support.  For that to work we need to lift the restriction
in data-ref analysis that requires a constant DR_STEP.  Fortunately
fallout is small.



Would this also vectorize strided loops when the architecture doesn't have a
gather instruction ?


gather is different from strided loops.  Gather is a[b[i]] while strided loops
are for (i=0;; i+=stride) ...= a[i] with stride being non-constant.

Your testcase requires gather support.
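
The two access patterns named above can be written out as plain scalar C loops. This is only an illustrative sketch with made-up function names; it says nothing about how the vectorizer represents either case internally.

```c
#include <assert.h>
#include <stddef.h>

/* Gather: each element is selected through an index array, a[b[i]].  */
void
load_gather (const float *a, const int *b, size_t n, float *out)
{
  assert (a != NULL && b != NULL && out != NULL);
  for (size_t i = 0; i < n; i++)
    out[i] = a[b[i]];
}

/* Strided: elements are a fixed distance apart, but that distance
   (stride) is only known at run time.  */
void
load_strided (const float *a, size_t stride, size_t n, float *out)
{
  assert (a != NULL && out != NULL && stride > 0);
  for (size_t i = 0; i < n; i++)
    out[i] = a[i * stride];
}
```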


Yep, apparently I didn't read your explanation correctly.

On the other hand, I'm wondering if - in the absence of a gather 
*instruction* - one could do a gather-by-hand, i.e., load 8 32-bit 
floating point values in a (temporary) consecutive buffer, then load it 
into a vector register ...
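
A minimal C sketch of such a gather-by-hand, under the assumption of eight 32-bit floats per vector: the indexed loads are done as scalars into a small contiguous buffer, and the arithmetic then runs over that buffer with unit stride. The function name and the scaling operation are made up for illustration, and whether this pays off on a given target is not something the sketch can tell.

```c
#include <assert.h>
#include <stddef.h>

#define VLEN 8  /* eight 32-bit floats, i.e. one 256-bit vector */

/* out[i] = scale * a[idx[i]] for i in [0, n), n a multiple of VLEN.  */
void
scale_gathered (const float *a, const int *idx, size_t n,
                float scale, float *out)
{
  assert (n % VLEN == 0);
  for (size_t i = 0; i < n; i += VLEN)
    {
      float tmp[VLEN];
      for (size_t j = 0; j < VLEN; j++)   /* the "gather by hand" */
        tmp[j] = a[idx[i + j]];
      for (size_t j = 0; j < VLEN; j++)   /* contiguous, unit stride */
        out[i + j] = scale * tmp[j];
    }
}
```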


--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news

Re: [PATCH] Fix PR53295

2012-05-12 Thread Toon Moene


On 05/11/2012 01:59 PM, Richard Guenther wrote:


This fixes the dependency of vectorization of strided loads on
gather support.  For that to work we need to lift the restriction
in data-ref analysis that requires a constant DR_STEP.  Fortunately
fallout is small.


Would this also vectorize strided loops when the architecture doesn't 
have a gather instruction ?


If so, it doesn't work for the attached case, which *does* vectorize 
with a gather instruction:


$ /tmp/c/bin/gfortran -g -O3 -ftree-vectorizer-verbose=2 -mavx2 -S 
verintlin.f


Analyzing loop at verintlin.f:68

Analyzing loop at verintlin.f:69


Vectorizing loop at verintlin.f:69

69: LOOP VECTORIZED.
verintlin.f:1: note: vectorized 1 loops in function.

whereas:

$ /tmp/c/bin/gfortran -g -O3 -ftree-vectorizer-verbose=2 -mavx -S 
verintlin.f


Analyzing loop at verintlin.f:68

Analyzing loop at verintlin.f:69

69: not vectorized: not suitable for gather load D.2051_74 = 
*parg_73(D)[D.2050_72];


69: not vectorized: not suitable for gather load D.2051_74 = 
*parg_73(D)[D.2050_72];


verintlin.f:1: note: vectorized 0 loops in function.

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
      SUBROUTINE VERINT (
     I   KLON   , KLAT   , KLEV   , KINT  , KHALO
     I , KLON1  , KLON2  , KLAT1  , KLAT2
     I , KP     , KQ     , KR
     R , PARG   , PRES
     R , PALFH  , PBETH
     R , PALFA  , PBETA  , PGAMA   )
C
C***
C
C  VERINT - THREE DIMENSIONAL INTERPOLATION
C
C  PURPOSE:
C
C  THREE DIMENSIONAL INTERPOLATION
C
C  INPUT PARAMETERS:
C
C  KLON  NUMBER OF GRIDPOINTS IN X-DIRECTION
C  KLAT  NUMBER OF GRIDPOINTS IN Y-DIRECTION
C  KLEV  NUMBER OF VERTICAL LEVELS
C  KINT  TYPE OF INTERPOLATION
C        = 1 - LINEAR
C        = 2 - QUADRATIC
C        = 3 - CUBIC
C        = 4 - MIXED CUBIC/LINEAR
C  KLON1 FIRST GRIDPOINT IN X-DIRECTION
C  KLON2 LAST  GRIDPOINT IN X-DIRECTION
C  KLAT1 FIRST GRIDPOINT IN Y-DIRECTION
C  KLAT2 LAST  GRIDPOINT IN Y-DIRECTION
C  KP    ARRAY OF INDEXES FOR HORIZONTAL DISPLACEMENTS
C  KQ    ARRAY OF INDEXES FOR HORIZONTAL DISPLACEMENTS
C  KR    ARRAY OF INDEXES FOR VERTICAL   DISPLACEMENTS
C  PARG  ARRAY OF ARGUMENTS
C  PALFH ALFA HAT
C  PBETH BETA HAT
C  PALFA ARRAY OF WEIGHTS IN X-DIRECTION
C  PBETA ARRAY OF WEIGHTS IN Y-DIRECTION
C  PGAMA ARRAY OF WEIGHTS IN VERTICAL DIRECTION
C
C  OUTPUT PARAMETERS:
C
C  PRES  INTERPOLATED FIELD
C
C  HISTORY:
C
C  J.E. HAUGEN   1  1992
C
C***
C
      IMPLICIT NONE
C
      INTEGER KLON   , KLAT   , KLEV   , KINT   , KHALO,
     I        KLON1  , KLON2  , KLAT1  , KLAT2
C
      INTEGER   KP(KLON,KLAT), KQ(KLON,KLAT), KR(KLON,KLAT)
      REAL    PARG(2-KHALO:KLON+KHALO-1,2-KHALO:KLAT+KHALO-1,KLEV)  ,
     R        PRES(KLON,KLAT) ,
     R       PALFH(KLON,KLAT) ,  PBETH(KLON,KLAT)  ,
     R       PALFA(KLON,KLAT,4)   ,  PBETA(KLON,KLAT,4),
     R       PGAMA(KLON,KLAT,4)
C
      INTEGER JX, JY, IDX, IDY, ILEV
      REAL Z1MAH, Z1MBH
C
C  LINEAR INTERPOLATION
C
      DO JY = KLAT1,KLAT2
      DO JX = KLON1,KLON2
         IDX  = KP(JX,JY)
         IDY  = KQ(JX,JY)
         ILEV = KR(JX,JY)
C
         PRES(JX,JY) = PGAMA(JX,JY,1)*(
C
     +   PBETA(JX,JY,1)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY-1,ILEV-1)
     +                  + PALFA(JX,JY,2)*PARG(IDX  ,IDY-1,ILEV-1) )
     + + PBETA(JX,JY,2)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY  ,ILEV-1)
     +                  + PALFA(JX,JY,2)*PARG(IDX  ,IDY  ,ILEV-1) ) )
C    +
     +   + PGAMA(JX,JY,2)*(
C    +
     +   PBETA(JX,JY,1)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY-1,ILEV  )
     +                  + PALFA(JX,JY,2)*PARG(IDX  ,IDY-1,ILEV  ) )
     + + PBETA(JX,JY,2)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY  ,ILEV  )
     +                  + PALFA(JX,JY,2)*PARG(IDX  ,IDY  ,ILEV  ) ) )
      ENDDO
      ENDDO
C
      RETURN
      END

Re: [Patch, Fortran] PR 51800 - Fix -finit-local-zero with automatic arrays and -fno-automatic

2012-01-11 Thread Toon Moene


On 01/11/2012 09:01 PM, Tobias Burnus wrote:


Before your patch: No initialization of automatic data objects
With your patch: Initialization for automatic arrays, but fails with
-fno-automatic.
With my patch: Initialization also for nonconst-length character
strings, no failures with -fno-automatic.


BTW, does this mean -finit-real=snan would work via an "initialization 
expression" at run time after allocation for allocatable arrays (note 
that this isn't covered, still), i.e., that I could try to fix this in 
4.8 in the same way as it works in resolve.c, but then applied to 
trans-array.c ?


Or do I have to apply a different method ?
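
For context, the effect being asked for (every element of a freshly allocated array holding a signalling NaN) can be mimicked in C as below. This is an analogy only, not how gfortran implements -finit-real=snan; it assumes `float` is a 32-bit IEEE-754 type and uses a bit pattern with the quiet bit clear, which is the signalling convention on common targets such as x86.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fill x[0..n) with a single-precision signalling NaN bit pattern
   (exponent all ones, quiet bit clear, non-zero payload).  memcpy
   is used so the value is stored as raw bits rather than going
   through a floating-point conversion.  */
void
fill_snan (float *x, size_t n)
{
  assert (sizeof (float) == sizeof (uint32_t));
  assert (x != NULL || n == 0);
  const uint32_t snan_bits = UINT32_C (0x7fa00000);
  for (size_t i = 0; i < n; i++)
    memcpy (&x[i], &snan_bits, sizeof snan_bits);
}
```

Any later arithmetic on an element that was never assigned then involves a NaN, which is what makes this kind of initialization useful as a debugging aid.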

Thanks !

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news

Re: [Patch, Fortran] PR 51800 - Fix -finit-local-zero with automatic arrays and -fno-automatic

2012-01-11 Thread Toon Moene


On 01/11/2012 06:03 PM, Tobias Burnus wrote:


-finit-* creates an initialization for local variables - either as
static initializer or by "initializing" at run time.

The latter also works with automatic variables, but was breaking with
-fno-automatic, which causes all nonautomatic local variables to be
placed in static memory. However, combining -finit-* -fno-automatic with
automatic arrays is failing at resolution time. The fix turned out to be
rather simple.


Good, I wondered how this would work (the reason I thought it would 
always work with automatic arrays was that it apparently (assembler 
source and output of trial program) worked for arrays smaller than the 
limit to place them on the stack).


Unfortunately, I forgot to test this against the combination of -finit-* 
-fno-automatic, which just proves you cannot have too many test cases.



I wondered about character strings where the length is a nonconstant
specification expression (thus: they are also automatic data objects). It
turned out that only the "initialization" was missing - no code
generation (trans*.c) change and no other resolution change were required.

The first part fixes a regression as "-finit-* -fno-automatic" could be
combined before (albeit without initializing the automatic arrays - but
there was no compile error).


Perhaps we can issue a warning.


Build and regtested on x86-64-linux.
OK for the trunk?


Note that I backported this change (noted in PR/51310) to the 4.6 
branch, so it's needed there too.


Thanks for fixing this !

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news

Re: [Patch, fortran] Would this patch - applied to trunk - be OK for the 4.6 branch ?

2011-12-22 Thread Toon Moene


On 12/21/2011 10:00 PM, Steve Kargl wrote:


On Mon, Dec 19, 2011 at 06:54:08PM +0100, Toon Moene wrote:



The attached patch makes -finit-=  generate default
initialization for automatic arrays.

It was OK for the trunk - is it also OK for the 4.6 branch ?

Strictly speaking, it doesn't fix a regression, it is a fix for a
(non-default) debugging option.



Speaking for myself, it seems that the general rule
of thumb among the gfortran committers is that it
is up to the discretion of the committer if he/she
wants to backport a non-regression patch.  The 2
measures I use are: 1) is the patch fairly local
(i.e., only a few lines in at most a couple of files);
2) what is the likelihood for causing a problem
(i.e., a regression or breaking bootstrap).

In your particular case, I think the patch is a
useful debugging aid, and the likelihood of causing
a problem is very low.


Yep, I thought so too - however, it has been over 6 years since I 
contributed a code patch, so I wanted to be careful.


It is committed as revision 182634.

Thanks for your comment.

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news

[Patch, fortran] Would this patch - applied to trunk - be OK for the 4.6 branch ?

2011-12-19 Thread Toon Moene

The attached patch makes -finit-= generate default 
initialization for automatic arrays.


It was OK for the trunk - is it also OK for the 4.6 branch ?

Strictly speaking, it doesn't fix a regression, it is a fix for a 
(non-default) debugging option.


2011-12-19  Toon Moene  

PR fortran/51310
* resolve.c (build_default_init_expr): Allow non-allocatable,
non-compile-time-constant-shape arrays to have a default
initializer.
* invoke.texi: Delete the restriction on automatic arrays not
being initialized by -finit-=.

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
*** resolve.c.orig	2011-12-13 13:23:59.488029519 +
--- resolve.c	2011-12-13 13:24:37.098239361 +
*************** build_default_init_expr (gfc_symbol *sym
*** 9899,9905 ****
    int i;
  
    /* These symbols should never have a default initialization.  */
!   if ((sym->attr.dimension && !gfc_is_compile_time_shape (sym->as))
        || sym->attr.external
        || sym->attr.dummy
        || sym->attr.pointer
--- 9899,9905 ----
    int i;
  
    /* These symbols should never have a default initialization.  */
!   if (sym->attr.allocatable
        || sym->attr.external
        || sym->attr.dummy
        || sym->attr.pointer
Index: invoke.texi
===
--- invoke.texi	(revision 182127)
+++ invoke.texi	(working copy)
@@ -1474,8 +1474,6 @@
 value) options.  These options do not initialize
 @itemize @bullet
 @item
-automatic arrays
-@item
 allocatable arrays
 @item
 components of derived type variables

Re: [Tentative patch] -finit-real=snan - would it really be so simple for automatic arrays.

2011-12-13 Thread Toon Moene


On 12/13/2011 08:41 PM, Thomas Koenig wrote:


Hi Toon,

(For gcc-patches: Patch at
http://gcc.gnu.org/ml/fortran/2011-12/msg00080.html )

I would appreciate a review and a regression test by someone who can.


Regression-test passed on trunk.

This one really looks obvious. Unless somebody objects who knows this
field better than I do, OK for trunk.


Thanks - I'll wait two days for further comment and then apply.

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news

Re: Update to Fortran "invoke" documentation about the features -finit- really provides.

2011-12-08 Thread Toon Moene


On 12/07/2011 07:58 PM, Toon Moene wrote:


On 12/06/2011 08:32 PM, Steve Kargl wrote:



Looks good to me. You can apply it to the 4.6 branch
if you have time.


And then  shortly before applying it, I realized that the proper
documentation of the limitations might be dependent on the
-fno-automatic, -fstack-arrays and -fmax-stack-var-size=n compiler flags
used.

So I'll come back tomorrow with version 2.0 of this patch, after
checking out all of the above (the documentation of 4.6 and 4.7 will be
different if using -fstack-arrays makes a difference, because that
option only exists in 4.7).


The flags mentioned turned out not to make a difference.  I committed 
the patch as-is to the trunk as revision 182127.  On the 4.6 branch it 
is revision 182138.


I will now change the type of the bug report to "Enhancement" - after 
all, all of these ease-debugging flags are enhancements over standard 
compiler behavior (quality-of-implementation issue, as we call it on 
http://j3-fortran.org).


Thanks for the review.

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news

Re: Update to Fortran "invoke" documentation about the features -finit- really provides.

2011-12-07 Thread Toon Moene


On 12/06/2011 08:32 PM, Steve Kargl wrote:


On Mon, Dec 05, 2011 at 07:21:59PM +0100, Toon Moene wrote:


2011-12-05  Toon Moene

PR/51310
invoke.texi: Itemize the cases for which -finit-  doesn't
work.

OK for trunk ? (and perhaps later for the 4.6 branch ?)



Looks good to me.  You can apply it to the 4.6 branch
if you have time.


And then  shortly before applying it, I realized that the proper 
documentation of the limitations might be dependent on the 
-fno-automatic, -fstack-arrays and -fmax-stack-var-size=n compiler flags 
used.


So I'll come back tomorrow with version 2.0 of this patch, after 
checking out all of the above (the documentation of 4.6 and 4.7 will be 
different if using -fstack-arrays makes a difference, because that 
option only exists in 4.7).


--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news

Update to Fortran "invoke" documentation about the features -finit- really provides.

2011-12-05 Thread Toon Moene


And, as indicated, the list might change in the future.

ChangeLog:

2011-12-05  Toon Moene  

PR/51310
invoke.texi: Itemize the cases for which -finit- doesn't
work.

OK for trunk ? (and perhaps later for the 4.6 branch ?)

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
Index: invoke.texi
===
--- invoke.texi	(revision 182001)
+++ invoke.texi	(working copy)
@@ -1471,10 +1471,18 @@
 the real and imaginary parts of local @code{COMPLEX} variables),
 @option{-finit-logical=@var{}}, and
 @option{-finit-character=@var{n}} (where @var{n} is an ASCII character
-value) options.  These options do not initialize components of derived
-type variables, nor do they initialize variables that appear in an
-@code{EQUIVALENCE} statement.  (This limitation may be removed in
-future releases).
+value) options.  These options do not initialize
+@itemize @bullet
+@item
+automatic arrays
+@item
+allocatable arrays
+@item
+components of derived type variables
+@item
+variables that appear in an @code{EQUIVALENCE} statement.
+@end itemize
+(These limitations may be removed in future releases).
 
 Note that the @option{-finit-real=nan} option initializes @code{REAL}
 and @code{COMPLEX} variables with a quiet NaN. For a signalling NaN

Re: [RFC PATCH] Gather vectorization (PR tree-optimization/50789)

2011-10-31 Thread Toon Moene


On 10/31/2011 03:23 PM, Jakub Jelinek wrote:


On Sat, Oct 29, 2011 at 03:53:37PM +0200, Toon Moene wrote:



I wonder whether it will work with the attached Fortran routine - it
sure would mean a boost to the 18%+ heaviest CPU user in our code.



Would be nice to cut down slightly this testcase into just one or two loops
that are vectorized and turn it into a runtime testcase which verifies
the vectorization was correct.


This is not a verifiable routine yet, but as the linear interpolation 
part already has all the juicy indirection necessary to test this 
vectorization, most of the routine can be thrown away, to leave the 
attached as essential.
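
For readers who want something checkable at run time, here is a minimal sketch in C of the same access pattern, reduced to one dimension. It is not the attached routine and not one of the testcases from the patch; the names interp_gather and interp_gather_selfcheck, the array sizes and the index pattern are all made up for illustration. Each result blends two neighbours of src selected through idx[], so the loads from src are the indirect ("gather") loads the vectorizer has to handle.

```c
#include <assert.h>
#include <stddef.h>

/* One-dimensional cut-down of the linear-interpolation loop: every
   output element blends two neighbours of src chosen through idx[].
   All names here are illustrative, not taken from VERINT.  */
void interp_gather(float *restrict res, const float *restrict src,
                   const int *restrict idx, const float *restrict alfa1,
                   const float *restrict alfa2, size_t n)
{
    for (size_t i = 0; i < n; i++)
        res[i] = alfa1[i] * src[idx[i] - 1] + alfa2[i] * src[idx[i]];
}

/* Runtime check: with both weights 0.5 and src[k] = 2k, the result for
   index k must be the midpoint 2k - 1, which is exact in float.  */
int interp_gather_selfcheck(void)
{
    enum { N = 64 };
    float src[N + 1], res[N], a1[N], a2[N];
    int idx[N];

    for (int k = 0; k <= N; k++)
        src[k] = 2.0f * (float)k;
    for (int i = 0; i < N; i++) {
        idx[i] = 1 + (i * 7) % N;   /* scattered indices in 1..N */
        a1[i] = 0.5f;
        a2[i] = 0.5f;
    }
    interp_gather(res, src, idx, a1, a2, N);
    for (int i = 0; i < N; i++)
        assert(res[i] == 2.0f * (float)idx[i] - 1.0f);
    return 1;
}
```

Whether a given compiler actually emits gather instructions for this loop depends on the target flags and tuning; the check only verifies the results, whichever code is generated.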


--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
      SUBROUTINE VERINT (
     I   KLON   , KLAT   , KLEV   , KINT  , KHALO
     I , KLON1  , KLON2  , KLAT1  , KLAT2
     I , KP     , KQ     , KR
     R , PARG   , PRES
     R , PALFH  , PBETH
     R , PALFA  , PBETA  , PGAMA   )
C
C***
C
C  VERINT - THREE DIMENSIONAL INTERPOLATION
C
C  PURPOSE:
C
C  THREE DIMENSIONAL INTERPOLATION
C
C  INPUT PARAMETERS:
C
C  KLON      NUMBER OF GRIDPOINTS IN X-DIRECTION
C  KLAT      NUMBER OF GRIDPOINTS IN Y-DIRECTION
C  KLEV      NUMBER OF VERTICAL LEVELS
C  KINT      TYPE OF INTERPOLATION
C            = 1 - LINEAR
C            = 2 - QUADRATIC
C            = 3 - CUBIC
C            = 4 - MIXED CUBIC/LINEAR
C  KLON1     FIRST GRIDPOINT IN X-DIRECTION
C  KLON2     LAST  GRIDPOINT IN X-DIRECTION
C  KLAT1     FIRST GRIDPOINT IN Y-DIRECTION
C  KLAT2     LAST  GRIDPOINT IN Y-DIRECTION
C  KP        ARRAY OF INDEXES FOR HORIZONTAL DISPLACEMENTS
C  KQ        ARRAY OF INDEXES FOR HORIZONTAL DISPLACEMENTS
C  KR        ARRAY OF INDEXES FOR VERTICAL   DISPLACEMENTS
C  PARG      ARRAY OF ARGUMENTS
C  PALFH     ALFA HAT
C  PBETH     BETA HAT
C  PALFA     ARRAY OF WEIGHTS IN X-DIRECTION
C  PBETA     ARRAY OF WEIGHTS IN Y-DIRECTION
C  PGAMA     ARRAY OF WEIGHTS IN VERTICAL DIRECTION
C
C  OUTPUT PARAMETERS:
C
C  PRES      INTERPOLATED FIELD
C
C  HISTORY:
C
C  J.E. HAUGEN   1  1992
C
C***
C
      IMPLICIT NONE
C
      INTEGER KLON   , KLAT   , KLEV   , KINT   , KHALO,
     I        KLON1  , KLON2  , KLAT1  , KLAT2
C
      INTEGER   KP(KLON,KLAT), KQ(KLON,KLAT), KR(KLON,KLAT)
      REAL    PARG(2-KHALO:KLON+KHALO-1,2-KHALO:KLAT+KHALO-1,KLEV)  ,
     R        PRES(KLON,KLAT)     ,
     R   PALFH(KLON,KLAT)     ,  PBETH(KLON,KLAT)      ,
     R   PALFA(KLON,KLAT,4)   ,  PBETA(KLON,KLAT,4)    ,
     R   PGAMA(KLON,KLAT,4)
C
      INTEGER JX, JY, IDX, IDY, ILEV
      REAL Z1MAH, Z1MBH
C
C  LINEAR INTERPOLATION
C
      DO JY = KLAT1,KLAT2
      DO JX = KLON1,KLON2
         IDX  = KP(JX,JY)
         IDY  = KQ(JX,JY)
         ILEV = KR(JX,JY)
C
         PRES(JX,JY) = PGAMA(JX,JY,1)*(
C
     +   PBETA(JX,JY,1)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY-1,ILEV-1)
     +                  + PALFA(JX,JY,2)*PARG(IDX  ,IDY-1,ILEV-1) )
     + + PBETA(JX,JY,2)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY  ,ILEV-1)
     +                  + PALFA(JX,JY,2)*PARG(IDX  ,IDY  ,ILEV-1) ) )
C    +
     +   + PGAMA(JX,JY,2)*(
C    +
     +   PBETA(JX,JY,1)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY-1,ILEV  )
     +                  + PALFA(JX,JY,2)*PARG(IDX  ,IDY-1,ILEV  ) )
     + + PBETA(JX,JY,2)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY  ,ILEV  )
     +                  + PALFA(JX,JY,2)*PARG(IDX  ,IDY  ,ILEV  ) ) )
      ENDDO
      ENDDO
C
      RETURN
      END

Re: [RFC PATCH] Gather vectorization (PR tree-optimization/50789)

2011-10-29 Thread Toon Moene


On 10/26/2011 11:56 PM, Jakub Jelinek wrote:


Hi!

This patch implements gather vectorization with -mavx2, if
dr_may_alias (which apparently doesn't use tbaa :(( ) can figure out
there is no overlap with stores in the loop (if any).
The testcases show what is possible to get vectorized.


Hmmm,

I wonder whether it will work with the attached Fortran routine - it 
sure would mean a boost to the 18%+ heaviest CPU user in our code.


What follows is the single CPU breakdown of the most demanding codes in 
our weather forecasting code (from my 2006 GCC Summit "contribution", 
which wasn't approved):


Flat profile:
% time   calls  name
 18.34   85684  verint_  <-- That's the one attached
  9.34    1380  invlo4_
  7.84   85684  bixint_
  6.76     133  sl2tim_
  5.30   14950  condcv_
  4.74   14950  radia_
  4.65   14950  vcbr_
  3.25     133  sldyn_
  2.98   14950  phtask_
  2.42     133  sldynm_
  2.29   14950  phys_
  2.19   14950  prevap_

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
# 1 "/scratch/hirlam/hl_home/MPI/lib/src/grdy/verint.F"
# 1 ""
# 1 ""
# 1 "/scratch/hirlam/hl_home/MPI/lib/src/grdy/verint.F"
c Library:grdy $RCSfile$, $Revision: 7536 $
c checked in by $Author: ovignes $ at $Date: 2009-12-18 14:23:36 +0100 (Fri, 18 Dec 2009) $
c $State$, $Locker$
c $Log$
c Revision 1.3  1999/04/22 09:30:45  DagBjoerge
c MPP code
c
c Revision 1.2  1999/03/09 10:23:13  GerardCats
c Add SGI paralllellisation directives DOACROSS
c
c Revision 1.1  1996/09/06 13:12:18  GCats
c Created from grdy.apl, 1 version 2.6.1, by Gerard Cats
c
      SUBROUTINE VERINT (
     I   KLON   , KLAT   , KLEV   , KINT  , KHALO
     I , KLON1  , KLON2  , KLAT1  , KLAT2
     I , KP     , KQ     , KR
     R , PARG   , PRES
     R , PALFH  , PBETH
     R , PALFA  , PBETA  , PGAMA   )
C
C***
C
C  VERINT - THREE DIMENSIONAL INTERPOLATION
C
C  PURPOSE:
C
C  THREE DIMENSIONAL INTERPOLATION
C
C  INPUT PARAMETERS:
C
C  KLON      NUMBER OF GRIDPOINTS IN X-DIRECTION
C  KLAT      NUMBER OF GRIDPOINTS IN Y-DIRECTION
C  KLEV      NUMBER OF VERTICAL LEVELS
C  KINT      TYPE OF INTERPOLATION
C            = 1 - LINEAR
C            = 2 - QUADRATIC
C            = 3 - CUBIC
C            = 4 - MIXED CUBIC/LINEAR
C  KLON1     FIRST GRIDPOINT IN X-DIRECTION
C  KLON2     LAST  GRIDPOINT IN X-DIRECTION
C  KLAT1     FIRST GRIDPOINT IN Y-DIRECTION
C  KLAT2     LAST  GRIDPOINT IN Y-DIRECTION
C  KP        ARRAY OF INDEXES FOR HORIZONTAL DISPLACEMENTS
C  KQ        ARRAY OF INDEXES FOR HORIZONTAL DISPLACEMENTS
C  KR        ARRAY OF INDEXES FOR VERTICAL   DISPLACEMENTS
C  PARG      ARRAY OF ARGUMENTS
C  PALFH     ALFA HAT
C  PBETH     BETA HAT
C  PALFA     ARRAY OF WEIGHTS IN X-DIRECTION
C  PBETA     ARRAY OF WEIGHTS IN Y-DIRECTION
C  PGAMA     ARRAY OF WEIGHTS IN VERTICAL DIRECTION
C
C  OUTPUT PARAMETERS:
C
C  PRES      INTERPOLATED FIELD
C
C  HISTORY:
C
C  J.E. HAUGEN   1  1992
C
C***
C
      IMPLICIT NONE
C
      INTEGER KLON   , KLAT   , KLEV   , KINT   , KHALO,
     I        KLON1  , KLON2  , KLAT1  , KLAT2
C
      INTEGER   KP(KLON,KLAT), KQ(KLON,KLAT), KR(KLON,KLAT)
      REAL    PARG(2-KHALO:KLON+KHALO-1,2-KHALO:KLAT+KHALO-1,KLEV)  ,
     R        PRES(KLON,KLAT)     ,
     R   PALFH(KLON,KLAT)     ,  PBETH(KLON,KLAT)      ,
     R   PALFA(KLON,KLAT,4)   ,  PBETA(KLON,KLAT,4)    ,
     R   PGAMA(KLON,KLAT,4)
C
      INTEGER JX, JY, IDX, IDY, ILEV
      REAL Z1MAH, Z1MBH
C
      IF (KINT.EQ.1) THEN
C  LINEAR INTERPOLATION
C
      DO JY = KLAT1,KLAT2
      DO JX = KLON1,KLON2
         IDX  = KP(JX,JY)
         IDY  = KQ(JX,JY)
         ILEV = KR(JX,JY)
C
         PRES(JX,JY) = PGAMA(JX,JY,1)*(
C
     +   PBETA(JX,JY,1)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY-1,ILEV-1)
     +                  + PALFA(JX,JY,2)*PARG(IDX  ,IDY-1,ILEV-1) )
     + + PBETA(JX,JY,2)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY  ,ILEV-1)
     +                  + PALFA(JX,JY,2)*PARG(IDX  ,IDY  ,ILEV-1) ) )
C    +
     +   + PGAMA(JX,JY,2)*(
C    +
     +   PBETA(JX,JY,1)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY-1,ILEV  )
     +                  + PALFA(JX,JY,2)*PARG(IDX  ,IDY-1,ILEV  ) )
     + + PBETA(JX,JY,2)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY  ,ILEV  )
     +                  + PALFA(JX,JY,2)*PARG(IDX  ,IDY  ,ILEV  ) ) )
      ENDDO
      ENDDO
C
      ELSE
     +IF (KINT.EQ.2) THEN
C  QUADRATIC INTERPOLATION
C
      DO JY = KLAT1,KLAT2
      DO JX = KLON1,KLON2
         IDX  = KP(JX,JY)
         IDY  = KQ(JX,JY)
         ILEV = KR(JX,JY)
C
         PRES(JX,JY) = PGAMA(JX,JY,1)*(
C
     +   PBETA(JX,JY,1)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY-1,ILEV-1)
     +                  + PALFA(JX,JY,2)*PARG(IDX

Re: [RFC PATCH] Gather vectorization (PR tree-optimization/50789)

2011-10-29 Thread Toon Moene


On 10/26/2011 11:56 PM, Jakub Jelinek wrote:

Hi!

This patch implements gather vectorization with -mavx2, if
dr_may_alias (which apparently doesn't use tbaa :(( ) can figure out
there is no overlap with stores in the loop (if any).
The testcases show what is possible to get vectorized.

I chose to add 4 extra (internal only) gather builtins in addition to the
16 ones needed for the intrinsics, because the builtins using different
sizes of the index vs. src/mask/ret vectors would complicate the generic
code way too much (we don't have a VEC_SELECT_EXPR nor VEC_CONCAT_EXPR
and interleaving/extract even/odd is undesirable here).
With these 4 extra builtins the generic code always sees same sized
src/mask/ret vs. index vectors, either they have same number of units,
then just one vgather* insn is needed, or the index has more elements
(int index and double/long long load) - then for one loaded index vector
there is one vgather* insn using the first half of the index vector and
one using the second half of that vector, or long index with float/int
load, then two index vectors are processed by two vgather* insns and
the result gets concatenated first halves of both results.

All this is so far unconditional only, we'd need some tree representation
of conditional loads resp. conditional stores (and could already with AVX
use vmaskmov* insns for that).

Bootstrapped/regtested on x86_64-linux and i686-linux, testcases tested
also under sde.  Ok for trunk?

2011-10-26  Jakub Jelinek

PR tree-optimization/50789
* tree-vect-stmts.c (process_use): Add force argument, avoid
exist_non_indexing_operands_for_use_p check if true.
(vect_mark_stmts_to_be_vectorized): Adjust callers.  Handle
STMT_VINFO_GATHER_P.
(gen_perm_mask): New function.
(perm_mask_for_reverse): Use it.
(reverse_vec_element): Rename to...
(permute_vec_elements): ... this.  Add Y and MASK_VEC arguments,
generalize for any permutations.
(vectorizable_load): Adjust caller.  Handle STMT_VINFO_GATHER_P.
* target.def (TARGET_VECTORIZE_BUILTIN_GATHER): New hook.
* doc/tm.texi.in (TARGET_VECTORIZE_BUILTIN_GATHER): Document it.
* doc/tm.texi: Regenerate.
* tree-data-ref.c (initialize_data_dependence_relation,
compute_self_dependence): No longer static.
* tree-data-ref.h (initialize_data_dependence_relation,
compute_self_dependence): New prototypes.
* tree-vect-data-refs.c (vect_check_gather): New function.
(vect_analyze_data_refs): Detect possible gather load data
refs.
* tree-vectorizer.h (struct _stmt_vec_info): Add gather_p field.
(STMT_VINFO_GATHER_P): Define.
(vect_check_gather): New prototype.
* config/i386/i386-builtin-types.def: Add types for alternate
gather builtins.
* config/i386/sse.md (AVXMODE48P_DI): Remove.
(VEC_GATHER_MODE): Rename mode_attr to...
(VEC_GATHER_IDXSI): ... this.
(VEC_GATHER_IDXDI, VEC_GATHER_SRCDI): New mode_attrs.
(avx2_gathersi, *avx2_gathersi): Use
instead of.
(avx2_gatherdi): Use  instead of
<  and  instead of VEC_GATHER_MODE
on src and mask operands.
(*avx2_gatherdi): Likewise.  Use VEC_GATHER_MODE iterator
instead of AVXMODE48P_DI.
(avx2_gatherdi256, *avx2_gatherdi256): Removed.
* config/i386/i386.c (enum ix86_builtins): Add
IX86_BUILTIN_GATHERALTSIV4DF, IX86_BUILTIN_GATHERALTDIV8SF,
IX86_BUILTIN_GATHERALTSIV4DI and IX86_BUILTIN_GATHERALTDIV8SI.
(ix86_init_mmx_sse_builtins): Create those builtins.
(ix86_expand_builtin): Handle those builtins and adjust expansions
of other gather builtins.
(ix86_vectorize_builtin_gather): New function.
(TARGET_VECTORIZE_BUILTIN_GATHER): Define.

* gcc.target/i386/avx2-gather-1.c: New test.
* gcc.target/i386/avx2-gather-2.c: New test.
* gcc.target/i386/avx2-gather-3.c: New test.

--- gcc/tree-vect-stmts.c.jj   2011-10-26 14:19:11.0 +0200
+++ gcc/tree-vect-stmts.c   2011-10-26 16:54:23.0 +0200
@@ -332,6 +332,8 @@ exist_non_indexing_operands_for_use_p (t
 - LIVE_P, RELEVANT - enum values to be set in the STMT_VINFO of the stmt
   that defined USE.  This is done by calling mark_relevant and passing it
   the WORKLIST (to add DEF_STMT to the WORKLIST in case it is relevant).
+   - FORCE is true if exist_non_indexing_operands_for_use_p check shouldn't
+ be performed.

 Outputs:
 Generally, LIVE_P and RELEVANT are used to define the liveness and
@@ -351,7 +353,8 @@ exist_non_indexing_operands_for_use_p (t

  static bool
  process_use (gimple stmt, tree use, loop_vec_info loop_vinfo, bool live_p,
-enum vect_relevant relevant, VEC(gimple,heap) **worklist)
+enum vect_relevant relevant, VEC(gimple,heap) **worklist,
+bool force)
  {
stru

Re: [Patch, Fortran, committed] PR 50547 & 50553

2011-09-29 Thread Toon Moene


On 09/29/2011 02:09 PM, Janus Weil wrote:


Hi all,

I just committed as obvious another pair of accepts-invalid fixes:

http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=179345

Also I would like to thank Vittorio for the large number of bug
reports he has been delivering lately. (I could make a comment about
his great talent for writing invalid Fortran code, but that would
probably sound more negative than intended ;)


In fact, I *do* consider this a virtue - in most cases it will catch 
errors in the parser or resolver, but there are also those that will lead 
to (hidden) bugs in later stages of the compiler.


In the days of g77 we had one person who made a program to insert random 
one-character changes into valid programs until he hit an ICE.


Of course, that was all automated ... it led to a *lot* of bug fixes.

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news

Re: [Patch, Fortran, committed] PR 50515 & 50517

2011-09-26 Thread Toon Moene


On 09/26/2011 10:10 PM, Janus Weil wrote:


Hi all,

I just committed as obvious two small patches for two recently
reported accepts-invalid PRs (after regtesting):

http://gcc.gnu.org/viewcvs?view=revision&revision=179213


Thanks !

As far as the fortran/interface.c change is concerned, why isn't a full 
TKR (Type/Kind/Rank) comparison in order ?


[ I might easily be missing something from context here, but TKR is my
  initial reaction to this kind of check :-) ]

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news

Re: [patch] support for multiarch systems

2011-08-22 Thread Toon Moene


On 08/21/2011 02:14 AM, Matthias Klose wrote:


On 08/20/2011 09:51 PM, Matthias Klose wrote:



Multiarch [1] is the term being used to refer to the capability of a system to
install and run applications of multiple different binary targets on the same
system.  The idea and name of multiarch dates back to 2004/2005 [2] (not to be
confused with multiarch in glibc).


attached is an updated patch which includes feedback from Jakub and Joseph.


Perhaps I could "like" this patch ?  It probably solves

http://gcc.gnu.org/ml/gcc-testresults/2011-08/msg02398.html

[ My system is Debian Testing, updated 20110821 at 12:15 UTC ]

(h/t Marc Glisse).

Thanks in advance,

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news

Re: PATCH RFA: Build stages 2 and 3 with C++

2011-07-20 Thread Toon Moene


On 07/20/2011 04:34 PM, Toon Moene wrote:


So I changed my "lto" bootstrap script to do the following:

language=fortran
if [ $RANDOM -lt 16384 ]
then
language=ada
fi
...
../gcc/configure \
...
--enable-languages=c++,$language \

Still have to see if this will fit in the 2:20 hour gap between two
weather forecasting runs.


Well, that took *way* too long, so I am changing it back to

language=fortran
if [ $RANDOM -lt 16384 ]
then
   language=ada
fi
language=c++
...
../gcc/configure \
...
--enable-languages=$language \

Next run 21/07/2011 at 18:00 UTC.

Cheers,

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news

Re: PATCH RFA: Build stages 2 and 3 with C++

2011-07-20 Thread Toon Moene


On 07/19/2011 08:33 PM, Ian Lance Taylor wrote:


2011-07-15  Ian Lance Taylor

* configure.ac: Add --enable-build-poststage1-with-cxx.  If set,
make C++ a boot_language.  Set and substitute
POSTSTAGE1_CONFIGURE_FLAGS.
* Makefile.tpl (POSTSTAGE1_CONFIGURE_FLAGS): New variable.
(STAGE[+id+]_CONFIGURE_FLAGS): Add $(POSTSTAGE1_CONFIGURE_FLAGS).
* configure, Makefile.in: Rebuild.



I got agreement from two global reviewers and no objections.

I have committed this patch.

Please let me know about any problems.


This was probably the reason for:

http://gcc.gnu.org/ml/gcc-testresults/2011-07/msg02302.html

I subsequently performed the following bootstrap 
(--enable-languages=c++), which succeeded:


http://gcc.gnu.org/ml/gcc-testresults/2011-07/msg02352.html

So I changed my "lto" bootstrap script to do the following:

language=fortran
if [ $RANDOM -lt 16384 ]
then
   language=ada
fi
...
../gcc/configure \
...
--enable-languages=c++,$language \

Still have to see if this will fit in the 2:20 hour gap between two 
weather forecasting runs.


Cheers,

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news

Re: PATCH RFA: Build stages 2 and 3 with C++

2011-07-16 Thread Toon Moene


On 07/16/2011 08:52 AM, Ian Lance Taylor wrote:


I would like to propose this patch as a step toward building gcc using a
C++ compiler.  This patch builds stage1 with the C compiler as usual,
and defaults to building stages 2 and 3 with a C++ compiler built during
stage 1.


I just completed a run using the following language configure options:

../gcc/configure \
...
--enable-build-with-cxx \
--enable-languages=c,c++,fortran,ada \

on x86-64-unknown-linux-gnu.  As far as I can see it was successful:

http://gcc.gnu.org/ml/gcc-testresults/2011-07/msg01852.html

For extra fun, the 0:20 UTC run at home using:

../gcc/configure \
...
--with-build-config=bootstrap-lto \

is going to use Ada as a language *of choice* instead of Fortran based 
on the value of $RANDOM in the bash shell :-)


language=fortran
if [ $RANDOM -lt 16384 ]
then
   language=ada
fi

../gcc/configure \
--prefix=/tmp/lto \
--enable-languages=$language \
--with-build-config=bootstrap-lto \

I'll see you tomorrow (evil cackle) :-)

All this is using:

$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/x86_64-linux-gnu/gcc/x86_64-linux-gnu/4.6.1/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian 4.6.1-1' 
--with-bugurl=file:///usr/share/doc/gcc-4.6/README.Bugs 
--enable-languages=c,c++,fortran,objc,obj-c++,go --prefix=/usr 
--program-suffix=-4.6 --enable-shared --enable-multiarch 
--with-multiarch-defaults=x86_64-linux-gnu --enable-linker-build-id 
--with-system-zlib --libexecdir=/usr/lib/x86_64-linux-gnu 
--without-included-gettext --enable-threads=posix 
--with-gxx-include-dir=/usr/include/c++/4.6 
--libdir=/usr/lib/x86_64-linux-gnu --enable-nls --enable-clocale=gnu 
--enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-plugin 
--enable-objc-gc --with-arch-32=i586 --with-tune=generic 
--enable-checking=release --build=x86_64-linux-gnu 
--host=x86_64-linux-gnu --target=x86_64-linux-gnu

Thread model: posix
gcc version 4.6.1 (Debian 4.6.1-1)

toon@super:~$ gnat -v
GNAT 4.6.1
...

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news

Re: [PATCH]: Restore bootstrap with --enable-build-with-cxx

2011-05-18 Thread Toon Moene


On 05/18/2011 10:31 PM, Richard Guenther wrote:


Not that I'm too excited to see GCC built with a C++ compiler (or even C++
features being used).


Hmmm, you think using "false" as a value for a pointer-returning 
function is just A-OK ?


Duh, I'm glad I'm using Fortran, where the programmer isn't even 
supposed to know what the value of .FALSE. is, because it is 
implementation dependent.
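
To make the complaint concrete, here is a small sketch in C. The helper find_suffix is hypothetical and not taken from the GCC sources. A C compiler accepts "return false;" in a pointer-returning function because false is just the constant 0, whereas a C++ compiler warns about it or rejects it, depending on the language version, which is presumably how such code breaks a C++ bootstrap with -Werror. Spelling the null pointer explicitly is valid in both languages.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical lookup helper, for illustration only: return the text
   after the last occurrence of sep, or a null pointer if sep is absent.  */
const char *find_suffix(const char *name, char sep)
{
    const char *p = strrchr(name, sep);
    if (p == NULL)
        return NULL;   /* "return false;" would compile as C here, but
                          not cleanly as C++ */
    return p + 1;
}
```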


--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news

Re: [PATCH]: Restore bootstrap with --enable-build-with-cxx

2011-05-18 Thread Toon Moene


On 05/18/2011 05:41 AM, Gabriel Dos Reis wrote:


On Tue, May 17, 2011 at 2:46 PM, Toon Moene  wrote:



On 05/17/2011 08:32 PM, Uros Bizjak wrote:


Tested on x86_64-pc-linux-gnu {, m32} with --enable-build-with-cxx.
Committed to mainline SVN as obvious.


Does that mean that I can now remove the --disable-werror from my daily C++
bootstrap run ?


Well, that certainly worked, as exemplified by this:

http://gcc.gnu.org/ml/gcc-testresults/2011-05/msg01890.html

At least that would enable my daily run (between 18:10 and 20:10 UTC) to 
catch -Werror mistakes ...



It's great that some people understand the intricacies of the
infight^H^H^H^H^H^H differences between the C and C++ type model.

OK: 1/2 :-)


I suspect this infight would vanish if we just switched, as we discussed
in the past.


Perhaps it would just help if we implemented the next step of the plan 
(http://gcc.gnu.org/wiki/gcc-in-cxx):


# "it would be a good thing to try forcing the C++ host compiler 
requirement for GCC 4.[7] with just building stage1 with C++ and 
stage2/3 with the stage1 C compiler. --disable-build-with-cxx would be a 
workaround for a missing C++ host compiler."


Of course, that still wouldn't make it possible to implement C++ 
solutions for C hacks because the "--disable-build-with-cxx" crowd would 
cry "foul" over this ...


--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news

Re: [PATCH]: Restore bootstrap with --enable-build-with-cxx

2011-05-17 Thread Toon Moene


On 05/17/2011 08:32 PM, Uros Bizjak wrote:


Tested on x86_64-pc-linux-gnu {, m32} with --enable-build-with-cxx.
Committed to mainline SVN as obvious.


Does that mean that I can now remove the --disable-werror from my daily 
C++ bootstrap run ?


It's great that some people understand the intricacies of the 
infight^H^H^H^H^H^H differences between the C and C++ type model.


OK: 1/2 :-)

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news

Re: [Patch, libfortran] PR 48931 Async-signal-safety of backtrace signal handler

2011-05-17 Thread Toon Moene


On 05/17/2011 07:50 PM, Toon Moene wrote:


On 05/14/2011 09:40 PM, Janne Blomqvist wrote:


Hi,

the current version of showing the backtrace is not async-signal-safe
as it uses backtrace_symbols() which, in turn, uses malloc(). The
attached patch changes the backtrace printing functionality to instead
use backtrace_symbols_fd() and pipes.


Great - this would solve a problem I filed a bugzilla report for years
ago (unfortunately, I do not know the number of it).


It was 33905 (2007-10-26).

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news

Re: [Patch, libfortran] PR 48931 Async-signal-safety of backtrace signal handler

2011-05-17 Thread Toon Moene


On 05/14/2011 09:40 PM, Janne Blomqvist wrote:


Hi,

the current version of showing the backtrace is not async-signal-safe
as it uses backtrace_symbols() which, in turn, uses malloc(). The
attached patch changes the backtrace printing functionality to instead
use backtrace_symbols_fd() and pipes.


Great - this would solve a problem I filed a bugzilla report for years 
ago (unfortunately, I do not know the number of it).


I closed it WONTFIX, because neither FX nor I could come up with an 
alternative way *not* using malloc.


[ The problem was getting a traceback after corruption of the
  malloc arena, which just hangs under the current implementation. ]

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news

Re: [Patch, Fortran] Make -Ofast imply -fstack-arrays

2011-05-14 Thread Toon Moene


On 05/14/2011 09:14 AM, Tobias Burnus wrote:


As title says: Make -Ofast imply -fstack-arrays


I haven't commented on this before, but everyone should realize that 
automatic arrays were allocated on the stack *always* by g77.


I never even bothered to study how gfortran did it, because I assumed it 
would just have copied g77's behavior.


Duh.
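
For readers who think in C rather than Fortran, a rough analogy (an illustration only, not what either compiler actually emits): an automatic array on the stack behaves like a C99 variable-length array, while the heap variant behaves like a malloc/free pair around the same loop. The function names are made up for this sketch.

```c
#include <stddef.h>
#include <stdlib.h>

/* Scratch array on the stack (C99 VLA), comparable to an automatic
   array under -fstack-arrays. Requires n > 0; a very large n can
   overflow the stack. */
static double sum_squares_stack(const double *x, size_t n)
{
    double tmp[n];
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        tmp[i] = x[i] * x[i];
    for (size_t i = 0; i < n; i++)
        sum += tmp[i];
    return sum;
}

/* Same computation with the scratch array on the heap.
   Returns -1.0 if the allocation fails. */
static double sum_squares_heap(const double *x, size_t n)
{
    double *tmp = malloc(n * sizeof *tmp);
    double sum = 0.0;
    if (tmp == NULL)
        return -1.0;
    for (size_t i = 0; i < n; i++)
        tmp[i] = x[i] * x[i];
    for (size_t i = 0; i < n; i++)
        sum += tmp[i];
    free(tmp);
    return sum;
}
```

The stack version avoids the allocator call on every entry, at the price of needing a large enough stack limit for big arrays.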

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news

A five character qualifier saves the C++ build ...

2011-05-12 Thread Toon Moene


[ Strong Typing Is For People With Weak Memories ]

The attached patch fixes the C++ (--disable-werror) bootstrap:

2011-05-12  Toon Moene  

	* objc-next-runtime-abi-02.c (objc_build_internal_classname):
	Add const qualifier to constant variable pointer declaration.

Apply as obvious?

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
Index: objc-next-runtime-abi-02.c
===================================================================
--- objc-next-runtime-abi-02.c	(revision 173710)
+++ objc-next-runtime-abi-02.c	(working copy)
@@ -1879,7 +1879,7 @@ objc_build_internal_classname (tree ident, bool me
 static const char *
 newabi_append_ro (const char *name)
 {
-  char *dollar;
+  const char *dollar;
   char *p;
   static char string[BUFSIZE];
   dollar = strchr (name, '$');
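
A small standalone C sketch of why the qualifier matters (the helper dollar_offset is hypothetical and not part of the patch): in C, strchr() has traditionally returned a plain char * even for a const char * argument, so the old declaration compiled. In C++, the overload taking a const char * returns a const char *, so assigning the result to a char * is an error. Declaring the pointer const is valid in both languages.

```c
#include <stddef.h>
#include <string.h>

/* Return the offset of the first '$' in name, or -1 if there is none.
   The result of strchr() is kept in a const char *, which compiles
   both as C and as C++. */
static ptrdiff_t dollar_offset(const char *name)
{
    const char *dollar = strchr(name, '$');
    return dollar ? dollar - name : -1;
}
```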

Re: [Patch, Fortran+gcc/doc/invoke.texi] PR48864: -Ofast implies -fno-protect parens

2011-05-04 Thread Toon Moene


On 05/04/2011 02:00 PM, Tobias Burnus wrote:


As the example in the PR shows, using -fno-protect-parens can make a
huge difference. As -fno-protect-parens is in the spirit of -Ofast,
enable it with that option.


As long as -Ofast -fprotect-parens still works, I don't think this would 
be objectionable.
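
As a reminder of why one may want to keep -fprotect-parens available, a tiny sketch of how the grouping of a floating-point sum changes the result. It is written in C as an analogy (the option itself is a gfortran one), the function names are made up, and it assumes IEEE double arithmetic compiled without value-changing flags such as -ffast-math.

```c
/* (a + b) + c: the two large terms cancel first, so c survives. */
static double grouped_left(double a, double b, double c)
{
    return (a + b) + c;
}

/* a + (b + c): c is absorbed into the large term b before the
   cancellation, so it is lost when b is much larger than c. */
static double grouped_right(double a, double b, double c)
{
    return a + (b + c);
}
```

With a = 1e20, b = -1e20 and c = 1.0, the first grouping gives 1.0 and the second gives 0.0. A compiler that is allowed to ignore the parentheses may silently turn one form into the other.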


--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
