Re: [patch, fortran] Fix PR 82567

2017-10-18 Thread Thomas Koenig

Hi Jerry and Steve,


Well I know 42 is the answer to the ultimate question of the universe so this
must be OK.  I just don't know what the question is.

OK and thanks,

Jerry

+#define CONSTR_LEN_MAX 42

Actually, I was wondering about the choice myself.  With
most common hardware having fairly robust L1 and L2 cache
sizes, a double precision array constructor with 42
elements only occupies 336 bytes.  Seems small.


Well, the answer is that I didn't know how to chose a reasonable
constant.  I now actually ran some benchmarks using rdtsc, and
these seem to indicate that the optimum value for CONST_LEN_MAX
is actually quite short, 3 or 4, otherwise I just got a slowdown
or a break even.

So, I committed (r253872) with a length of 4 as a limit.  If anybody
comes up with a better number, we can always change this.

So, thanks for the review and the comments.

Regards

Thomas

If somebody wants to check, here is the test case:

main.f90:

module tick
  interface
 function rdtsc()
   integer(kind=8) :: rdtsc
 end function rdtsc
  end interface
end module tick

program main
  use tick
  use tst
  implicit none
  integer(8) :: t1, t2
  t1 = rdtsc()
  call sub1(2.0)
  t2 = rdtsc()
  !  print *,"sub1 : ", t2-t1

  t1 = rdtsc()
  do i=1,1
 call sub1(2.0)
  end do
  t2 = rdtsc()
  print *,"sub1 : ", t2-t1

  t1 = rdtsc()
  do i=1,1
 call sub2(2.0)
  end do
  t2 = rdtsc()
  print *,"sub2 : ", t2-t1

end program main

tst.f90:
module tst
  integer, parameter :: n=4
  real, dimension(n) :: x
  real, dimension(n), parameter :: s = [(i,i=1,n)]
contains
  subroutine sub1(a)
real, intent(in) :: a
x(1) = a * 1.0
x(2) = a * 2.0
x(3) = a * 3.0
x(4) = a * 3.0

  end subroutine sub1
  subroutine sub2(a)
x(:) = a * s(:)
  end subroutine sub2
end module tst

rdtsc.s:
.file   "rdtsc.s"
.text
.globl  rdtsc_
.type   rdtsc_, @function
rdtsc_:
.LFB0:
.cfi_startproc
rdtsc
shl $32, %rdx
or  %rdx, %rax
ret
.cfi_endproc
.LFE0:
.size   rdtsc_, .-rdtsc_
.section.note.GNU-stack,"",@progbits


Re: [patch, fortran] Fix PR 82567

2017-10-18 Thread Mikael Morin

Le 18/10/2017 à 04:05, Steve Kargl a écrit :

On Tue, Oct 17, 2017 at 06:14:16PM -0700, Jerry DeLisle wrote:

On 10/17/2017 03:36 PM, Thomas Koenig wrote:

Hello world,

this patch fixes a regression with long compile times,
which came about due to our handling of array constructors
at compile time.  This, togeteher with a simplification in
front end optimization, led to long compile times and large
code.

Regression-tested. OK for trunk and the other affected branches?



Well I know 42 is the answer to the ultimate question of the universe so this
must be OK.  I just don't know what the question is.

OK and thanks,

Jerry

+#define CONSTR_LEN_MAX 42


Actually, I was wondering about the choice myself.  With
most common hardware having fairly robust L1 and L2 cache
sizes, a double precision array constructor with 42
elements only occupies 336 bytes.  Seems small.


There is a -fmax-array-constructor=n option. Can’t we use it for the limit?


Re: [patch, fortran] Fix PR 82567

2017-10-17 Thread Steve Kargl
On Tue, Oct 17, 2017 at 06:14:16PM -0700, Jerry DeLisle wrote:
> On 10/17/2017 03:36 PM, Thomas Koenig wrote:
> > Hello world,
> > 
> > this patch fixes a regression with long compile times,
> > which came about due to our handling of array constructors
> > at compile time.  This, togeteher with a simplification in
> > front end optimization, led to long compile times and large
> > code.
> > 
> > Regression-tested. OK for trunk and the other affected branches?
> > 
> 
> Well I know 42 is the answer to the ultimate question of the universe so this
> must be OK.  I just don't know what the question is.
> 
> OK and thanks,
> 
> Jerry
> 
> +#define CONSTR_LEN_MAX 42

Actually, I was wondering about the choice myself.  With
most common hardware having fairly robust L1 and L2 cache
sizes, a double precision array constructor with 42 
elements only occupies 336 bytes.  Seems small.

-- 
Steve
20170425 https://www.youtube.com/watch?v=VWUpyCsUKR4
20161221 https://www.youtube.com/watch?v=IbCHE-hONow


Re: [patch, fortran] Fix PR 82567

2017-10-17 Thread Jerry DeLisle
On 10/17/2017 03:36 PM, Thomas Koenig wrote:
> Hello world,
> 
> this patch fixes a regression with long compile times,
> which came about due to our handling of array constructors
> at compile time.  This, togeteher with a simplification in
> front end optimization, led to long compile times and large
> code.
> 
> Regression-tested. OK for trunk and the other affected branches?
> 

Well I know 42 is the answer to the ultimate question of the universe so this
must be OK.  I just don't know what the question is.

OK and thanks,

Jerry

+#define CONSTR_LEN_MAX 42
+


[patch, fortran] Fix PR 82567

2017-10-17 Thread Thomas Koenig

Hello world,

this patch fixes a regression with long compile times,
which came about due to our handling of array constructors
at compile time.  This, togeteher with a simplification in
front end optimization, led to long compile times and large
code.

Regression-tested. OK for trunk and the other affected branches?

Regards

Thomas

2917-10-17  Thomas Koenig  

PR fortran/82567
* frontend-passes.c (combine_array_constructor): If an array
constructor is all constants and has more elements than a small
constant, don't convert a*[b,c] to [a*b,a*c] to reduce compilation
times.

2917-10-17  Thomas Koenig  

PR fortran/82567
* gfortran.dg/array_constructor_51.f90: New test.
! { dg-do compile }
! { dg-additional-options "-ffrontend-optimize -fdump-tree-original" }
! PR 82567 - long compile times caused by large constant constructors
! multiplied by variables

  SUBROUTINE sub()
  IMPLICIT NONE
  
  INTEGER, PARAMETER :: n = 1000
  REAL, ALLOCATABLE :: x(:)
  REAL :: xc, h
  INTEGER :: i
 
  ALLOCATE( x(n) )
  xc = 100.
  h = xc/n
  x = h*[(i,i=1,n)]
  
end
! { dg-final { scan-tree-dump-times "__var" 0 "original" } }
Index: frontend-passes.c
===
--- frontend-passes.c	(Revision 253768)
+++ frontend-passes.c	(Arbeitskopie)
@@ -1635,6 +1635,8 @@ combine_array_constructor (gfc_expr *e)
   gfc_constructor *c, *new_c;
   gfc_constructor_base oldbase, newbase;
   bool scalar_first;
+  int n_elem;
+  bool all_const;
 
   /* Array constructors have rank one.  */
   if (e->rank != 1)
@@ -1674,12 +1676,38 @@ combine_array_constructor (gfc_expr *e)
   if (op2->ts.type == BT_CHARACTER)
 return false;
 
-  scalar = create_var (gfc_copy_expr (op2), "constr");
+  /* This might be an expanded constructor with very many constant values. If
+ we perform the operation here, we might end up with a long compile time,
+ so an arbitrary length bound is in order here.  If the constructor
+ constains something which is not a constant, it did not come from an
+ expansion, so leave it alone.  */
 
+#define CONSTR_LEN_MAX 42
+
   oldbase = op1->value.constructor;
+
+  n_elem = 0;
+  all_const = true;
+  for (c = gfc_constructor_first (oldbase); c; c = gfc_constructor_next(c))
+{
+  if (c->expr->expr_type != EXPR_CONSTANT)
+	{
+	  all_const = false;
+	  break;
+	}
+  n_elem += 1;
+}
+
+  if (all_const && n_elem > CONSTR_LEN_MAX)
+return false;
+
+#undef CONSTR_LEN_MAX
+
   newbase = NULL;
   e->expr_type = EXPR_ARRAY;
 
+  scalar = create_var (gfc_copy_expr (op2), "constr");
+
   for (c = gfc_constructor_first (oldbase); c;
c = gfc_constructor_next (c))
 {