Re: [patch, fortran] Fix PR 82567

2017-10-17 Thread Jerry DeLisle
On 10/17/2017 03:36 PM, Thomas Koenig wrote:
> Hello world,
> 
> this patch fixes a regression with long compile times,
> which came about due to our handling of array constructors
> at compile time.  This, together with a simplification in
> front end optimization, led to long compile times and large
> code.
> 
> Regression-tested. OK for trunk and the other affected branches?
> 

Well I know 42 is the answer to the ultimate question of the universe so this
must be OK.  I just don't know what the question is.

OK and thanks,

Jerry

+#define CONSTR_LEN_MAX 42
+


Re: [patch, fortran] Fix PR 82567

2017-10-17 Thread Steve Kargl
On Tue, Oct 17, 2017 at 06:14:16PM -0700, Jerry DeLisle wrote:
> On 10/17/2017 03:36 PM, Thomas Koenig wrote:
> > Hello world,
> > 
> > this patch fixes a regression with long compile times,
> > which came about due to our handling of array constructors
> > at compile time.  This, together with a simplification in
> > front end optimization, led to long compile times and large
> > code.
> > 
> > Regression-tested. OK for trunk and the other affected branches?
> > 
> 
> Well I know 42 is the answer to the ultimate question of the universe so this
> must be OK.  I just don't know what the question is.
> 
> OK and thanks,
> 
> Jerry
> 
> +#define CONSTR_LEN_MAX 42

Actually, I was wondering about the choice myself.  With
most common hardware having fairly robust L1 and L2 cache
sizes, a double precision array constructor with 42 
elements only occupies 336 bytes.  Seems small.
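
The size arithmetic above can be checked with a small C sketch. The helper name constructor_bytes is hypothetical and not part of the patch; it assumes the usual 8-byte double.

```c
#include <stddef.h>

/* Hypothetical helper (not from the patch): bytes occupied by an
   array constructor of n double precision elements, assuming
   sizeof(double) == 8 as on common hardware. */
size_t constructor_bytes(size_t n)
{
    return n * sizeof(double);
}
```

Under that assumption, constructor_bytes(42) gives the 336 bytes mentioned above, well below typical L1 data cache sizes.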

-- 
Steve
20170425 https://www.youtube.com/watch?v=VWUpyCsUKR4
20161221 https://www.youtube.com/watch?v=IbCHE-hONow


Re: [patch, fortran] Fix PR 82567

2017-10-18 Thread Mikael Morin

On 18/10/2017 at 04:05, Steve Kargl wrote:
> On Tue, Oct 17, 2017 at 06:14:16PM -0700, Jerry DeLisle wrote:
> > On 10/17/2017 03:36 PM, Thomas Koenig wrote:
> > > Hello world,
> > >
> > > this patch fixes a regression with long compile times,
> > > which came about due to our handling of array constructors
> > > at compile time.  This, together with a simplification in
> > > front end optimization, led to long compile times and large
> > > code.
> > >
> > > Regression-tested. OK for trunk and the other affected branches?
> >
> > Well I know 42 is the answer to the ultimate question of the universe so this
> > must be OK.  I just don't know what the question is.
> >
> > OK and thanks,
> >
> > Jerry
> >
> > +#define CONSTR_LEN_MAX 42
>
> Actually, I was wondering about the choice myself.  With
> most common hardware having fairly robust L1 and L2 cache
> sizes, a double precision array constructor with 42
> elements only occupies 336 bytes.  Seems small.


There is a -fmax-array-constructor=n option. Can’t we use it for the limit?


Re: [patch, fortran] Fix PR 82567

2017-10-18 Thread Thomas Koenig

Hi Jerry and Steve,


> > Well I know 42 is the answer to the ultimate question of the universe so this
> > must be OK.  I just don't know what the question is.
> >
> > OK and thanks,
> >
> > Jerry
> >
> > +#define CONSTR_LEN_MAX 42
>
> Actually, I was wondering about the choice myself.  With
> most common hardware having fairly robust L1 and L2 cache
> sizes, a double precision array constructor with 42
> elements only occupies 336 bytes.  Seems small.


Well, the answer is that I didn't know how to choose a reasonable
constant.  I have now run some benchmarks using rdtsc, and
these seem to indicate that the optimum value for CONSTR_LEN_MAX
is actually quite short, 3 or 4; with anything longer I just got
a slowdown or broke even.

So, I committed (r253872) with a length of 4 as a limit.  If anybody
comes up with a better number, we can always change this.
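
For readers more at home in C, the two code shapes that the test case compares (sub1 and sub2) can be pictured roughly as follows. This is only an illustrative sketch: the names scale_unrolled and scale_loop are made up here, and it does not claim to show what the front end actually generates.

```c
#include <stddef.h>

#define N 4  /* mirrors n in the test case; illustrative only */

/* Constant multipliers, standing in for the constant array s. */
static const float s[N] = {1.0f, 2.0f, 3.0f, 4.0f};

/* Shape 1 (like sub1): one scalar assignment per element, with
   the constants folded into the code. Code size grows with N. */
void scale_unrolled(float x[N], float a)
{
    x[0] = a * 1.0f;
    x[1] = a * 2.0f;
    x[2] = a * 3.0f;
    x[3] = a * 4.0f;
}

/* Shape 2 (like sub2): a loop that reads the constants from
   memory. Code size stays fixed regardless of N. */
void scale_loop(float x[N], float a)
{
    for (size_t i = 0; i < N; ++i)
        x[i] = a * s[i];
}
```

Both functions compute the same values; the question behind the limit is only up to which length the first shape is worth its extra code.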

So, thanks for the review and the comments.

Regards

Thomas

If somebody wants to check, here is the test case:

main.f90:

module tick
  ! Interface to the external rdtsc routine (see rdtsc.s).
  interface
    function rdtsc()
      integer(kind=8) :: rdtsc
    end function rdtsc
  end interface
end module tick

program main
  use tick
  use tst
  implicit none
  integer(8) :: t1, t2
  integer :: i   ! loop counter, required by implicit none

  ! Warm-up call; its timing is not printed.
  t1 = rdtsc()
  call sub1(2.0)
  t2 = rdtsc()
  !  print *,"sub1 : ", t2-t1

  t1 = rdtsc()
  do i=1,1
    call sub1(2.0)
  end do
  t2 = rdtsc()
  print *,"sub1 : ", t2-t1

  t1 = rdtsc()
  do i=1,1
    call sub2(2.0)
  end do
  t2 = rdtsc()
  print *,"sub2 : ", t2-t1

end program main

tst.f90:
module tst
  integer, parameter :: n=4
  real, dimension(n) :: x
  real, dimension(n), parameter :: s = [(i,i=1,n)]
contains
  ! Element-by-element assignment with the constants written out.
  subroutine sub1(a)
    real, intent(in) :: a
    x(1) = a * 1.0
    x(2) = a * 2.0
    x(3) = a * 3.0
    x(4) = a * 4.0
  end subroutine sub1

  ! Same computation as an array expression on the constant array s.
  subroutine sub2(a)
    real, intent(in) :: a
    x(:) = a * s(:)
  end subroutine sub2
end module tst

rdtsc.s:
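        .file   "rdtsc.s"
        .text
        .globl  rdtsc_
        .type   rdtsc_, @function
rdtsc_:
.LFB0:
        .cfi_startproc
        rdtsc                   # time stamp counter into EDX:EAX
        shl     $32, %rdx       # move the high half into bits 63..32
        or      %rdx, %rax      # combine both halves in RAX (return value)
        ret
        .cfi_endproc
.LFE0:
        .size   rdtsc_, .-rdtsc_
        .section        .note.GNU-stack,"",@progbits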
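
The shift-and-or step in the rdtsc.s stub above can be written out in portable C as a minimal sketch. The function name combine_tsc is hypothetical; it only mirrors how the two 32-bit halves (EDX high, EAX low) are joined into one 64-bit count and does not read the counter itself.

```c
#include <stdint.h>

/* Hypothetical helper: join the high and low 32-bit halves the
   same way the assembly stub does with shl and or. */
uint64_t combine_tsc(uint32_t hi, uint32_t lo)
{
    return ((uint64_t)hi << 32) | (uint64_t)lo;
}
```

The cast to uint64_t before the shift matters: shifting a 32-bit value left by 32 would be undefined behaviour in C.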