Wow, you managed to work out how to remove the slow Python looping from the numeric models! For the record, this is the speed up for this 'NS R1rho 2-site' model using Python 2.7.6 & numpy 1.8.0 on a 64-bit Linux system:
""" $ ./disp_profile_all.py /data/relax/branches/disp_spin_speed2 /data/relax/branches/disp_spin_speed [snip] New relax version: relax repository checkout r24308 svn+ssh://[email protected]/svn/relax/branches/disp_spin_speed Old relax version: relax repository checkout r24274 svn+ssh://[email protected]/svn/relax/branches/disp_spin_speed Execution iteration 1 $ python profiling_ns_r1rho_2site.py /data/relax/branches/disp_spin_speed2 1000 0.016 0.000 3.561 0.004 relax_disp.py:1582(func_ns_r1rho_2site) 10 0.013 0.001 3.237 0.324 relax_disp.py:1582(func_ns_r1rho_2site) $ python profiling_ns_r1rho_2site.py /data/relax/branches/disp_spin_speed 1000 0.017 0.000 6.245 0.006 relax_disp.py:1552(func_ns_r1rho_2site) 10 0.013 0.001 5.899 0.590 relax_disp.py:1552(func_ns_r1rho_2site) [snip] 100 single spins analysis: NS R1rho 2-site: 62.450+/-0.000 -> 35.610+/-0.000, 1.754x faster. Cluster of 100 spins analysis: NS R1rho 2-site: 58.990+/-0.000 -> 32.370+/-0.000, 1.822x faster. """ This is quite significant for these models. With all the changes since relax 3.2.2: """ $ ./disp_profile_all.py /data/relax/branches/disp_spin_speed2 /data/relax/tags/3.2.2/ [snip] New relax version: relax repository checkout r24308 svn+ssh://[email protected]/svn/relax/branches/disp_spin_speed Old relax version: relax 3.2.2 Execution iteration 1 $ python profiling_ns_r1rho_2site.py /data/relax/branches/disp_spin_speed2 1000 0.016 0.000 3.576 0.004 relax_disp.py:1582(func_ns_r1rho_2site) 10 0.013 0.001 3.247 0.325 relax_disp.py:1582(func_ns_r1rho_2site) $ python profiling_ns_r1rho_2site.py /data/relax/tags/3.2.2/ 1000 0.245 0.000 24.323 0.024 relax_disp.py:1681(func_ns_r1rho_2site) 10 0.266 0.027 24.439 2.444 relax_disp.py:1681(func_ns_r1rho_2site) [snip] 100 single spins analysis: NS R1rho 2-site: 243.230+/-0.000 -> 35.760+/-0.000, 6.802x faster. Cluster of 100 spins analysis: NS R1rho 2-site: 244.390+/-0.000 -> 32.470+/-0.000, 7.527x faster. """ Considering how slow these models are compared to the analytic models, this is a huge improvement! According to your profiling_ns_r1rho_2site.py script, the slow parts are: """ ('chi2 cluster:', 14153134802.76194) Wed Jun 25 07:59:18 2014 /tmp/tmp17lqUn 211077 function calls in 3.510 seconds Ordered by: cumulative time ncalls tottime percall cumtime percall filename:lineno(function) 1 0.000 0.000 3.510 3.510 <string>:1(<module>) 1 0.002 0.002 3.510 3.510 profiling_ns_r1rho_2site.py:553(cluster) 10 0.000 0.000 3.254 0.325 profiling_ns_r1rho_2site.py:515(calc) 10 0.015 0.002 3.254 0.325 relax_disp.py:1582(func_ns_r1rho_2site) 10 0.015 0.002 3.236 0.324 ns_r1rho_2site.py:190(ns_r1rho_2site) 10 0.104 0.010 3.041 0.304 matrix_exponential.py:33(matrix_exponential_rank_NE_NS_NM_NO_ND_x_x) 10 1.803 0.180 1.827 0.183 linalg.py:982(eig) 10 0.564 0.056 0.601 0.060 linalg.py:455(inv) 40 0.501 0.013 0.501 0.013 {numpy.core.multiarray.einsum} """ The single spin numbers are very similar. Looking at the total times, the numpy.einsum and numpy.inv functions are taking 1/7 of the time each, and numpy.eig 51% of the time - this is a total of 80% of the time. The lib.dispersion.matrix_exponential.matrix_exponential_rank_NE_NS_NM_NO_ND_x_x() function now takes up 86% of the cumulative time - so this is the new bottleneck. The only way to solve this would be using clever linear algebra tricks to simplify or to change the logic in matrix_exponential_rank_NE_NS_NM_NO_ND_x_x(). There are multiple techniques for finding the matrix exponential, it could be that the eigenvalue method is one of the slowest. This looks to be interesting for an overview of recent research in the mathematics field on this subject: http://www.maths.manchester.ac.uk/~higham/talks/exp09.pdf. Great work though, you have eliminated the worst bottleneck for the numeric models! Cheers, Edward On 25 June 2014 02:14, <[email protected]> wrote: > Author: tlinnet > Date: Wed Jun 25 02:14:36 2014 > New Revision: 24294 > > URL: http://svn.gna.org/viewcvs/relax?rev=24294&view=rev > Log: > Speeded up the code of NS r1rho 2site. > > This was essential done to numpy einsum, and doing the dot operations in > multiple dimensions. > It was though necessary to realize, that to do the proper dot product > operations, the outer two > axis if M0 should be swapped, by rolling the outer axis one back. > > Task #7807 (https://gna.org/task/index.php?7807): Speed-up of dispersion > models for Clustered analysis. > > Modified: > branches/disp_spin_speed/lib/dispersion/ns_r1rho_2site.py > branches/disp_spin_speed/target_functions/relax_disp.py > > Modified: branches/disp_spin_speed/lib/dispersion/ns_r1rho_2site.py > URL: > http://svn.gna.org/viewcvs/relax/branches/disp_spin_speed/lib/dispersion/ns_r1rho_2site.py?rev=24294&r1=24293&r2=24294&view=diff > ============================================================================== > --- branches/disp_spin_speed/lib/dispersion/ns_r1rho_2site.py (original) > +++ branches/disp_spin_speed/lib/dispersion/ns_r1rho_2site.py Wed Jun 25 > 02:14:36 2014 > @@ -50,10 +50,10 @@ > """ > > # Python module imports. > -from numpy import array, cos, dot, einsum, float64, log, multiply, rollaxis, > sin, sum > +from numpy import array, einsum, float64, isfinite, log, min, multiply, > rollaxis, sin, sum > +from numpy.ma import fix_invalid, masked_less > > # relax module imports. > -from lib.float import isNaN > from lib.dispersion.matrix_exponential import > matrix_exponential_rank_NE_NS_NM_NO_ND_x_x > > # Repetitive calculations (to speed up calculations). > @@ -187,14 +187,16 @@ > return matrix > > > -def ns_r1rho_2site(M0=None, r1rho_prime=None, omega=None, offset=None, > r1=0.0, pA=None, dw=None, kex=None, spin_lock_fields=None, relax_time=None, > inv_relax_time=None, back_calc=None, num_points=None): > +def ns_r1rho_2site(M0=None, M0_T=None, r1rho_prime=None, omega=None, > offset=None, r1=0.0, pA=None, dw=None, kex=None, spin_lock_fields=None, > relax_time=None, inv_relax_time=None, back_calc=None, num_points=None): > """The 2-site numerical solution to the Bloch-McConnell equation for > R1rho data. > > This function calculates and stores the R1rho values. > > > @keyword M0: This is a vector that contains the initial > magnetizations corresponding to the A and B state transverse magnetizations. > - @type M0: numpy float64, rank-1, 7D array > + @type M0: numpy float array of rank > [NE][NS][NM][NO][ND][6][1] > + @keyword M0_T: This is a vector that contains the initial > magnetizations corresponding to the A and B state transverse magnetizations, > where the outer two axis has been swapped for efficient dot operations. > + @type M0_T: numpy float array of rank > [NE][NS][NM][NO][ND][1][6] > @keyword r1rho_prime: The R1rho_prime parameter value (R1rho with > no exchange). > @type r1rho_prime: numpy float array of rank > [NE][NS][NM][NO][ND] > @keyword omega: The chemical shift for the spin in rad/s. > @@ -238,43 +240,22 @@ > # Magnetization evolution. > Rexpo_M0_mat = einsum('...ij,...jk', Rexpo_mat, M0) > > - # Transpose M0, to prepare for dot operation. Roll the last axis one > back. > - M0_T = rollaxis(M0, 6, 5) > - > - if NS == 1: > - # Loop over the spectrometer frequencies. > - for mi in range(NM): > - # Loop over offsets: > - for oi in range(NO): > - # Extract number of points. > - num_points_i = num_points[0, 0, mi, oi] > - > - # Loop over the time points, back calculating the R2eff > values. > - for j in range(num_points_i): > - > - # If the number of spins are 1, do the fastest method by > dot product. Else do it by einstein summation. > - # Set the spin index for one spin. > - si = 0 > - # Extract the preformed matrix that rotate the > magnetization previous to spin-lock into the weff frame. > - M0_i= M0_T[0, si, mi, oi, j] > - > - # Extract from the pre-formed Magnetization evolution > matrix. > - Rexpo_M0_mat_i = Rexpo_M0_mat[0, si, mi, oi, j] > - > - # Magnetization evolution. > - MA = dot(M0_i, Rexpo_M0_mat_i)[0, 0] > - > - # The next lines calculate the R1rho using a two-point > approximation, i.e. assuming that the decay is mono-exponential. > - if MA <= 0.0 or isNaN(MA): > - back_calc[0, si, mi, oi, j] = 1e99 > - else: > - back_calc[0, si, mi, oi, j]= -inv_relax_time[0, si, > mi, oi, j] * log(MA) > - > - else: > - # Magnetization evolution, which include all dimensions. > - MA_mat = einsum('...ij,...jk', M0_T, Rexpo_M0_mat)[:, :, :, :, :, 0, > 0] > - > - # Do back calculation. > - back_calc[:] = -inv_relax_time * log(MA_mat) > - > - > + # Magnetization evolution, which include all dimensions. > + MA_mat = einsum('...ij,...jk', M0_T, Rexpo_M0_mat)[:, :, :, :, :, 0, 0] > + > + # Insert safe checks. > + if min(MA_mat) < 0.0: > + mask_min_MA_mat = masked_less(MA_mat, 0.0) > + # Fill with high values. > + MA_mat[mask_min_MA_mat.mask] = 1e100 > + > + # Do back calculation. > + back_calc[:] = -inv_relax_time * log(MA_mat) > + > + # Catch errors, taking a sum over array is the fastest way to check for > + # +/- inf (infinity) and nan (not a number). > + if not isfinite(sum(back_calc)): > + # Replaces nan, inf, etc. with fill value. > + fix_invalid(back_calc, copy=False, fill_value=1e100) > + > + > > Modified: branches/disp_spin_speed/target_functions/relax_disp.py > URL: > http://svn.gna.org/viewcvs/relax/branches/disp_spin_speed/target_functions/relax_disp.py?rev=24294&r1=24293&r2=24294&view=diff > ============================================================================== > --- branches/disp_spin_speed/target_functions/relax_disp.py (original) > +++ branches/disp_spin_speed/target_functions/relax_disp.py Wed Jun 25 > 02:14:36 2014 > @@ -26,8 +26,7 @@ > > # Python module imports. > from copy import deepcopy > -from math import pi > -from numpy import add, array, arctan2, asarray, cos, complex64, dot, > float64, int16, max, multiply, ones, sin, sqrt, sum, tile, zeros > +from numpy import arctan2, cos, dot, float64, int16, multiply, ones, > rollaxis, pi, sin, sum, zeros > from numpy.ma import masked_equal > > # relax module imports. > @@ -415,6 +414,9 @@ > M0_2[2, 0] = 1 > M0_cos = multiply.outer( cos(theta_mat), M0_2 ) > self.M0 = M0_sin + M0_cos > + # Transpose M0, to prepare for dot operation. Roll the last axis > one back, corresponds to a transpose for the outer two axis. > + self.M0_T = rollaxis(self.M0, 6, 5) > + > if model in [MODEL_NS_R1RHO_3SITE, MODEL_NS_R1RHO_3SITE_LINEAR]: > self.M0 = zeros(9, float64) > # Offset of spin-lock from A. > @@ -1598,7 +1600,7 @@ > self.r20_struct[:] = multiply.outer( r1rho_prime.reshape(self.NE, > self.NS, self.NM), self.no_nd_ones ) > > # Back calculate the R2eff values. > - ns_r1rho_2site(M0=self.M0, r1rho_prime=self.r20_struct, > omega=self.chemical_shifts, offset=self.offset, r1=self.r1, pA=pA, > dw=self.dw_struct, kex=kex, spin_lock_fields=self.spin_lock_omega1, > relax_time=self.relax_times, inv_relax_time=self.inv_relax_times, > back_calc=self.back_calc, num_points=self.num_disp_points) > + ns_r1rho_2site(M0=self.M0, M0_T=self.M0_T, > r1rho_prime=self.r20_struct, omega=self.chemical_shifts, offset=self.offset, > r1=self.r1, pA=pA, dw=self.dw_struct, kex=kex, > spin_lock_fields=self.spin_lock_omega1, relax_time=self.relax_times, > inv_relax_time=self.inv_relax_times, back_calc=self.back_calc, > num_points=self.num_disp_points) > > # Clean the data for all values, which is left over at the end of > arrays. > self.back_calc = self.back_calc*self.disp_struct > > > _______________________________________________ > relax (http://www.nmr-relax.com) > > This is the relax-commits mailing list > [email protected] > > To unsubscribe from this list, get a password > reminder, or change your subscription options, > visit the list information page at > https://mail.gna.org/listinfo/relax-commits _______________________________________________ relax (http://www.nmr-relax.com) This is the relax-devel mailing list [email protected] To unsubscribe from this list, get a password reminder, or change your subscription options, visit the list information page at https://mail.gna.org/listinfo/relax-devel

