Hi Troels, You had it working in the first way that has been deleted in this commit:
- dot(Rexpo, r180x, evolution_matrix) I would really prefer to avoid making SciPy a dependency for relax users performing a dispersion analysis. It is only needed for the frame order analysis, but I have been working hard to eliminate this. SciPy as a dependency can be a major problem for many relax users when they do not have root access to their own machines - it is best to avoid it. Using the BLAS library directly is based on the heresy at http://www.huyng.com/posts/faster-numpy-dot-product/, which is dangerous. Note that that speed up is because he has also used the CPU optimised ATLAS rather than standard BLAS+LAPACK. Switching to the scipy blas_dot does not activate this important part. This post don't even have any meaningful comments from numpy users. It is likely to be because there is a better way, or that Huy Nguyen has missed something very important. If you cannot get the mightily important 'out' argument working, then we really need to solve this problem before the code goes too far. Can you run the script at https://gna.org/task/?7807#comment199 on your system? Regards, Edward On 15 June 2014 15:57, <[email protected]> wrote: > Author: tlinnet > Date: Sun Jun 15 15:57:33 2014 > New Revision: 23964 > > URL: http://svn.gna.org/viewcvs/relax?rev=23964&view=rev > Log: > Implemented the BLAS method of dot product, which should be faster. > > I cannot get the "out" argument to work. > > Task #7807 (https://gna.org/task/index.php?7807): Speed-up of dispersion > models for Clustered analysis. > > Modified: > branches/disp_spin_speed/lib/dispersion/ns_cpmg_2site_3d.py > > Modified: branches/disp_spin_speed/lib/dispersion/ns_cpmg_2site_3d.py > URL: > http://svn.gna.org/viewcvs/relax/branches/disp_spin_speed/lib/dispersion/ns_cpmg_2site_3d.py?rev=23964&r1=23963&r2=23964&view=diff > ============================================================================== > --- branches/disp_spin_speed/lib/dispersion/ns_cpmg_2site_3d.py (original) > +++ branches/disp_spin_speed/lib/dispersion/ns_cpmg_2site_3d.py Sun Jun 15 > 15:57:33 2014 > @@ -56,6 +56,7 @@ > # Python module imports. > from numpy import asarray, dot, fabs, isfinite, log, min, sum > from numpy.ma import fix_invalid, masked_where > +from scipy.linalg.blas import dgemm as blas_dot > > > # relax module imports. > @@ -158,14 +159,31 @@ > # This matrix is a propagator that will evolve the > magnetization with the matrix R for a delay tcp. > Rexpo = matrix_exponential(R*tcp_si_mi_di) > > + # The numpy way. Give dot two matrices that are both > C_CONTIGUOUS, then the performance is better: > # The essential evolution matrix. > # This is the first round. > - dot(Rexpo, r180x, evolution_matrix) > - evolution_matrix = dot(evolution_matrix, Rexpo) > + #dot(Rexpo.T, r180x.T, evolution_matrix) > + #evolution_matrix = dot(evolution_matrix, Rexpo) > # The second round. > - evolution_matrix = dot(evolution_matrix, evolution_matrix) > + #evolution_matrix = dot(evolution_matrix, evolution_matrix) > # The third and fourth round, > - evolution_matrix = dot(evolution_matrix, evolution_matrix) > + #evolution_matrix = dot(evolution_matrix, evolution_matrix) > + > + # Give dot two matrices that are both F_CONTIGUOUS, we can > use BLAS through the method: > + # The become F_CONTIGUOUS by transposing them. > + # See by: print Rexpo.flags.c_contiguous, > Rexpo.T.flags.c_contiguous > + # http://wiki.scipy.org/PerformanceTips > + # The FORTRAN code. > + # > tchar=s,d,c,z>gemm(m,n,k,alpha,a,b,beta,c,trans_a,trans_b,lda,ka,ldb,kb) > + # ! c = > gemm(alpha,a,b,beta=0,c=0,trans_a=0,trans_b=0,overwrite_c=0) > + # ! Calculate C <- alpha * op(A) * op(B) + beta * C > + # This is the first round. > + evolution_matrix[:] = blas_dot(alpha=1.0, a=Rexpo.T, > b=r180x.T, trans_a=True, trans_b=True) > + evolution_matrix[:] = blas_dot(alpha=1.0, > a=evolution_matrix.T, b=Rexpo.T, trans_a=True, trans_b=True) > + # The second round. > + evolution_matrix[:] = blas_dot(alpha=1.0, > a=evolution_matrix.T, b=evolution_matrix.T, trans_a=True, trans_b=True) > + # The third and fourth round. > + evolution_matrix[:] = blas_dot(alpha=1.0, > a=evolution_matrix.T, b=evolution_matrix.T, trans_a=True, trans_b=True) > > # Loop over the CPMG elements, propagating the magnetisation. > for j in range(power_si_mi_di/2): > > > _______________________________________________ > relax (http://www.nmr-relax.com) > > This is the relax-commits mailing list > [email protected] > > To unsubscribe from this list, get a password > reminder, or change your subscription options, > visit the list information page at > https://mail.gna.org/listinfo/relax-commits _______________________________________________ relax (http://www.nmr-relax.com) This is the relax-devel mailing list [email protected] To unsubscribe from this list, get a password reminder, or change your subscription options, visit the list information page at https://mail.gna.org/listinfo/relax-devel

