[Numpy-discussion] fortran vs numpy on mac/linux - gcc performance?

2009-10-19 Thread Robin
Hi,

I have been looking at moving some of my bottleneck functions to
fortran with f2py. To get started I tried some simple things, and was
surprised they performend so much better than the number builtins -
which I assumed would be c and would be quite fast.

On my Macbook pro laptop (Intel core 2 duo) I got the following
results. Numpy is built with xcode gcc 4.0.1 and gfortran is 4.2.3 -
fortran code for shuffle and bincount below:
In [1]: x = np.random.random_integers(0,1023,100).astype(int)
In [2]: import ftest
In [3]: timeit np.bincount(x)
100 loops, best of 3: 3.97 ms per loop
In [4]: timeit ftest.bincount(x,1024)
1000 loops, best of 3: 1.15 ms per loop
In [5]: timeit np.random.shuffle(x)
1 loops, best of 3: 605 ms per loop
In [6]: timeit ftest.shuffle(x)
10 loops, best of 3: 139 ms per loop

So fortran was about 4 times faster for these loops - similarly faster
than cython as well. So I was really happy as these are two of my
biggest bottlenecks, but when I moved a linux workstation I got
different results. Here with gcc/gfortran 4.3.3 :
In [3]: x = np.random.random_integers(0,1023,100).astype(int)
In [4]: timeit np.bincount(x)
100 loops, best of 3: 8.18 ms per loop
In [5]: timeit ftest.bincount(x,1024)
100 loops, best of 3: 8.25 ms per loop
In [6]:
In [7]: timeit np.random.shuffle(x)
1 loops, best of 3: 379 ms per loop
In [8]: timeit ftest.shuffle(x)
10 loops, best of 3: 172 ms per loop

So shuffle is a bit faster, but bincount is now the same as fortran.
The only thing I can think is that it is due to much better
performance of the more recent c compiler. I think this would also
explain why f2py extension was performing so much better than cython
on the mac.

So my question is -  is there a way to build numpy with a more recent
compiler on leopard? (I guess I could upgrade to snow leopard now) -
Could I make the numpy install use gcc-4.2 from xcode or would it
break stuff? Could I use gcc 4.3.3 from macports? It would be great to
get a 4x speed up on all numpy c loops! (already just these two
functions I use a lot would make a big difference).

Cheers

Robin
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] fortran vs numpy on mac/linux - gcc performance?

2009-10-19 Thread Robin
Forgot to include the fortran code used:

jm-g26b101:fortran robince$ cat test.f95
subroutine bincount (x,c,n,m)
implicit none
integer, intent(in) :: n,m
integer, dimension(0:n-1), intent(in) :: x
integer, dimension(0:m-1), intent(out) :: c
integer :: i

c = 0
do i = 0, n-1
c(x(i)) = c(x(i)) + 1
end do
end


subroutine shuffle (x,s,n)
implicit none
integer, intent(in) :: n
integer, dimension(n), intent(in) :: x
integer, dimension(n), intent(out) :: s
integer :: i,randpos,temp
real :: r

! copy input
s = x
call init_random_seed()
! knuth shuffle from http://rosettacode.org/wiki/Knuth_shuffle#Fortran
do i = n, 2, -1
call random_number(r)
randpos = int(r * i) + 1
temp = s(randpos)
s(randpos) = s(i)
s(i) = temp
end do
end


subroutine init_random_seed()
! init_random_seed from gfortran documentation
integer :: i, n, clock
integer, dimension(:), allocatable :: seed

call random_seed(size = n)
allocate(seed(n))

call system_clock(count=clock)

seed = clock + 37 * (/ (i - 1, i = 1, n) /)
call random_seed(put = seed)

deallocate(seed)
end subroutine
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion