Wow, you managed to work out how to remove the slow Python looping
from the numeric models!  For the record, this is the speed up for
this 'NS R1rho 2-site' model using Python 2.7.6 & numpy 1.8.0 on a
64-bit Linux system:

"""
$ ./disp_profile_all.py /data/relax/branches/disp_spin_speed2
/data/relax/branches/disp_spin_speed
[snip]

New relax version:  relax repository checkout r24308
svn+ssh://[email protected]/svn/relax/branches/disp_spin_speed
Old relax version:  relax repository checkout r24274
svn+ssh://[email protected]/svn/relax/branches/disp_spin_speed

Execution iteration 1

$ python profiling_ns_r1rho_2site.py /data/relax/branches/disp_spin_speed2
     1000    0.016    0.000    3.561    0.004
relax_disp.py:1582(func_ns_r1rho_2site)
       10    0.013    0.001    3.237    0.324
relax_disp.py:1582(func_ns_r1rho_2site)
$ python profiling_ns_r1rho_2site.py /data/relax/branches/disp_spin_speed
     1000    0.017    0.000    6.245    0.006
relax_disp.py:1552(func_ns_r1rho_2site)
       10    0.013    0.001    5.899    0.590
relax_disp.py:1552(func_ns_r1rho_2site)
[snip]

100 single spins analysis:
NS R1rho 2-site:            62.450+/-0.000 ->  35.610+/-0.000,   1.754x faster.

Cluster of 100 spins analysis:
NS R1rho 2-site:            58.990+/-0.000 ->  32.370+/-0.000,   1.822x faster.
"""


This is quite significant for these models.  With all the changes
since relax 3.2.2:

"""
$ ./disp_profile_all.py /data/relax/branches/disp_spin_speed2
/data/relax/tags/3.2.2/
[snip]

New relax version:  relax repository checkout r24308
svn+ssh://[email protected]/svn/relax/branches/disp_spin_speed
Old relax version:  relax 3.2.2

Execution iteration 1

$ python profiling_ns_r1rho_2site.py /data/relax/branches/disp_spin_speed2
     1000    0.016    0.000    3.576    0.004
relax_disp.py:1582(func_ns_r1rho_2site)
       10    0.013    0.001    3.247    0.325
relax_disp.py:1582(func_ns_r1rho_2site)
$ python profiling_ns_r1rho_2site.py /data/relax/tags/3.2.2/
     1000    0.245    0.000   24.323    0.024
relax_disp.py:1681(func_ns_r1rho_2site)
       10    0.266    0.027   24.439    2.444
relax_disp.py:1681(func_ns_r1rho_2site)
[snip]

100 single spins analysis:
NS R1rho 2-site:           243.230+/-0.000 ->  35.760+/-0.000,   6.802x faster.

Cluster of 100 spins analysis:
NS R1rho 2-site:           244.390+/-0.000 ->  32.470+/-0.000,   7.527x faster.
"""


Considering how slow these models are compared to the analytic models,
this is a huge improvement!  According to your
profiling_ns_r1rho_2site.py script, the slow parts are:

"""
('chi2 cluster:', 14153134802.76194)
Wed Jun 25 07:59:18 2014    /tmp/tmp17lqUn

         211077 function calls in 3.510 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    3.510    3.510 <string>:1(<module>)
        1    0.002    0.002    3.510    3.510
profiling_ns_r1rho_2site.py:553(cluster)
       10    0.000    0.000    3.254    0.325
profiling_ns_r1rho_2site.py:515(calc)
       10    0.015    0.002    3.254    0.325
relax_disp.py:1582(func_ns_r1rho_2site)
       10    0.015    0.002    3.236    0.324
ns_r1rho_2site.py:190(ns_r1rho_2site)
       10    0.104    0.010    3.041    0.304
matrix_exponential.py:33(matrix_exponential_rank_NE_NS_NM_NO_ND_x_x)
       10    1.803    0.180    1.827    0.183 linalg.py:982(eig)
       10    0.564    0.056    0.601    0.060 linalg.py:455(inv)
       40    0.501    0.013    0.501    0.013 {numpy.core.multiarray.einsum}
"""


The single spin numbers are very similar.  Looking at the total times,
the numpy.einsum and numpy.inv functions are taking 1/7 of the time
each, and numpy.eig 51% of the time - this is a total of 80% of the
time.  The 
lib.dispersion.matrix_exponential.matrix_exponential_rank_NE_NS_NM_NO_ND_x_x()
function now takes up 86% of the cumulative time - so this is the new
bottleneck.  The only way to solve this would be using clever linear
algebra tricks to simplify or to change the logic in
matrix_exponential_rank_NE_NS_NM_NO_ND_x_x().  There are multiple
techniques for finding the matrix exponential, it could be that the
eigenvalue method is one of the slowest.  This looks to be interesting
for an overview of recent research in the mathematics field on this
subject: http://www.maths.manchester.ac.uk/~higham/talks/exp09.pdf.
Great work though, you have eliminated the worst bottleneck for the
numeric models!

Cheers,

Edward



On 25 June 2014 02:14,  <[email protected]> wrote:
> Author: tlinnet
> Date: Wed Jun 25 02:14:36 2014
> New Revision: 24294
>
> URL: http://svn.gna.org/viewcvs/relax?rev=24294&view=rev
> Log:
> Speeded up the code of NS r1rho 2site.
>
> This was essential done to numpy einsum, and doing the dot operations in 
> multiple dimensions.
> It was though necessary to realize, that to do the proper dot product 
> operations, the outer two
> axis if M0 should be swapped, by rolling the outer axis one back.
>
> Task #7807 (https://gna.org/task/index.php?7807): Speed-up of dispersion 
> models for Clustered analysis.
>
> Modified:
>     branches/disp_spin_speed/lib/dispersion/ns_r1rho_2site.py
>     branches/disp_spin_speed/target_functions/relax_disp.py
>
> Modified: branches/disp_spin_speed/lib/dispersion/ns_r1rho_2site.py
> URL: 
> http://svn.gna.org/viewcvs/relax/branches/disp_spin_speed/lib/dispersion/ns_r1rho_2site.py?rev=24294&r1=24293&r2=24294&view=diff
> ==============================================================================
> --- branches/disp_spin_speed/lib/dispersion/ns_r1rho_2site.py   (original)
> +++ branches/disp_spin_speed/lib/dispersion/ns_r1rho_2site.py   Wed Jun 25 
> 02:14:36 2014
> @@ -50,10 +50,10 @@
>  """
>
>  # Python module imports.
> -from numpy import array, cos, dot, einsum, float64, log, multiply, rollaxis, 
> sin, sum
> +from numpy import array, einsum, float64, isfinite, log, min, multiply, 
> rollaxis, sin, sum
> +from numpy.ma import fix_invalid, masked_less
>
>  # relax module imports.
> -from lib.float import isNaN
>  from lib.dispersion.matrix_exponential import 
> matrix_exponential_rank_NE_NS_NM_NO_ND_x_x
>
>  # Repetitive calculations (to speed up calculations).
> @@ -187,14 +187,16 @@
>      return matrix
>
>
> -def ns_r1rho_2site(M0=None, r1rho_prime=None, omega=None, offset=None, 
> r1=0.0, pA=None, dw=None, kex=None, spin_lock_fields=None, relax_time=None, 
> inv_relax_time=None, back_calc=None, num_points=None):
> +def ns_r1rho_2site(M0=None, M0_T=None, r1rho_prime=None, omega=None, 
> offset=None, r1=0.0, pA=None, dw=None, kex=None, spin_lock_fields=None, 
> relax_time=None, inv_relax_time=None, back_calc=None, num_points=None):
>      """The 2-site numerical solution to the Bloch-McConnell equation for 
> R1rho data.
>
>      This function calculates and stores the R1rho values.
>
>
>      @keyword M0:                This is a vector that contains the initial 
> magnetizations corresponding to the A and B state transverse magnetizations.
> -    @type M0:                   numpy float64, rank-1, 7D array
> +    @type M0:                   numpy float array of rank 
> [NE][NS][NM][NO][ND][6][1]
> +    @keyword M0_T:              This is a vector that contains the initial 
> magnetizations corresponding to the A and B state transverse magnetizations, 
> where the outer two axis has been swapped for efficient dot operations.
> +    @type M0_T:                 numpy float array of rank 
> [NE][NS][NM][NO][ND][1][6]
>      @keyword r1rho_prime:       The R1rho_prime parameter value (R1rho with 
> no exchange).
>      @type r1rho_prime:          numpy float array of rank 
> [NE][NS][NM][NO][ND]
>      @keyword omega:             The chemical shift for the spin in rad/s.
> @@ -238,43 +240,22 @@
>      # Magnetization evolution.
>      Rexpo_M0_mat = einsum('...ij,...jk', Rexpo_mat, M0)
>
> -    # Transpose M0, to prepare for dot operation. Roll the last axis one 
> back.
> -    M0_T = rollaxis(M0, 6, 5)
> -
> -    if NS == 1:
> -        # Loop over the spectrometer frequencies.
> -        for mi in range(NM):
> -            # Loop over offsets:
> -            for oi in range(NO):
> -                # Extract number of points.
> -                num_points_i = num_points[0, 0, mi, oi]
> -
> -                # Loop over the time points, back calculating the R2eff 
> values.
> -                for j in range(num_points_i):
> -
> -                    # If the number of spins are 1, do the fastest method by 
> dot product.  Else do it by einstein summation.
> -                    # Set the spin index for one spin.
> -                    si = 0
> -                    # Extract the preformed matrix that rotate the 
> magnetization previous to spin-lock into the weff frame.
> -                    M0_i= M0_T[0, si, mi, oi, j]
> -
> -                    # Extract from the pre-formed Magnetization evolution 
> matrix.
> -                    Rexpo_M0_mat_i = Rexpo_M0_mat[0, si, mi, oi, j]
> -
> -                    # Magnetization evolution.
> -                    MA = dot(M0_i, Rexpo_M0_mat_i)[0, 0]
> -
> -                    # The next lines calculate the R1rho using a two-point 
> approximation, i.e. assuming that the decay is mono-exponential.
> -                    if MA <= 0.0 or isNaN(MA):
> -                        back_calc[0, si, mi, oi, j] = 1e99
> -                    else:
> -                        back_calc[0, si, mi, oi, j]= -inv_relax_time[0, si, 
> mi, oi, j] * log(MA)
> -
> -    else:
> -        # Magnetization evolution, which include all dimensions.
> -        MA_mat = einsum('...ij,...jk', M0_T, Rexpo_M0_mat)[:, :, :, :, :, 0, 
> 0]
> -
> -        # Do back calculation.
> -        back_calc[:] = -inv_relax_time * log(MA_mat)
> -
> -
> +    # Magnetization evolution, which include all dimensions.
> +    MA_mat = einsum('...ij,...jk', M0_T, Rexpo_M0_mat)[:, :, :, :, :, 0, 0]
> +
> +    # Insert safe checks.
> +    if min(MA_mat) < 0.0:
> +        mask_min_MA_mat = masked_less(MA_mat, 0.0)
> +        # Fill with high values.
> +        MA_mat[mask_min_MA_mat.mask] = 1e100
> +
> +    # Do back calculation.
> +    back_calc[:] = -inv_relax_time * log(MA_mat)
> +
> +    # Catch errors, taking a sum over array is the fastest way to check for
> +    # +/- inf (infinity) and nan (not a number).
> +    if not isfinite(sum(back_calc)):
> +        # Replaces nan, inf, etc. with fill value.
> +        fix_invalid(back_calc, copy=False, fill_value=1e100)
> +
> +
>
> Modified: branches/disp_spin_speed/target_functions/relax_disp.py
> URL: 
> http://svn.gna.org/viewcvs/relax/branches/disp_spin_speed/target_functions/relax_disp.py?rev=24294&r1=24293&r2=24294&view=diff
> ==============================================================================
> --- branches/disp_spin_speed/target_functions/relax_disp.py     (original)
> +++ branches/disp_spin_speed/target_functions/relax_disp.py     Wed Jun 25 
> 02:14:36 2014
> @@ -26,8 +26,7 @@
>
>  # Python module imports.
>  from copy import deepcopy
> -from math import pi
> -from numpy import add, array, arctan2, asarray, cos, complex64, dot, 
> float64, int16, max, multiply, ones, sin, sqrt, sum, tile, zeros
> +from numpy import arctan2, cos, dot, float64, int16, multiply, ones, 
> rollaxis, pi, sin, sum, zeros
>  from numpy.ma import masked_equal
>
>  # relax module imports.
> @@ -415,6 +414,9 @@
>              M0_2[2, 0] = 1
>              M0_cos = multiply.outer( cos(theta_mat), M0_2 )
>              self.M0 = M0_sin + M0_cos
> +            # Transpose M0, to prepare for dot operation. Roll the last axis 
> one back, corresponds to a transpose for the outer two axis.
> +            self.M0_T = rollaxis(self.M0, 6, 5)
> +
>          if model in [MODEL_NS_R1RHO_3SITE, MODEL_NS_R1RHO_3SITE_LINEAR]:
>              self.M0 = zeros(9, float64)
>              # Offset of spin-lock from A.
> @@ -1598,7 +1600,7 @@
>          self.r20_struct[:] = multiply.outer( r1rho_prime.reshape(self.NE, 
> self.NS, self.NM), self.no_nd_ones )
>
>          # Back calculate the R2eff values.
> -        ns_r1rho_2site(M0=self.M0, r1rho_prime=self.r20_struct, 
> omega=self.chemical_shifts, offset=self.offset, r1=self.r1, pA=pA, 
> dw=self.dw_struct, kex=kex, spin_lock_fields=self.spin_lock_omega1, 
> relax_time=self.relax_times, inv_relax_time=self.inv_relax_times, 
> back_calc=self.back_calc, num_points=self.num_disp_points)
> +        ns_r1rho_2site(M0=self.M0, M0_T=self.M0_T, 
> r1rho_prime=self.r20_struct, omega=self.chemical_shifts, offset=self.offset, 
> r1=self.r1, pA=pA, dw=self.dw_struct, kex=kex, 
> spin_lock_fields=self.spin_lock_omega1, relax_time=self.relax_times, 
> inv_relax_time=self.inv_relax_times, back_calc=self.back_calc, 
> num_points=self.num_disp_points)
>
>          # Clean the data for all values, which is left over at the end of 
> arrays.
>          self.back_calc = self.back_calc*self.disp_struct
>
>
> _______________________________________________
> relax (http://www.nmr-relax.com)
>
> This is the relax-commits mailing list
> [email protected]
>
> To unsubscribe from this list, get a password
> reminder, or change your subscription options,
> visit the list information page at
> https://mail.gna.org/listinfo/relax-commits

_______________________________________________
relax (http://www.nmr-relax.com)

This is the relax-devel mailing list
[email protected]

To unsubscribe from this list, get a password
reminder, or change your subscription options,
visit the list information page at
https://mail.gna.org/listinfo/relax-devel

Reply via email to