Hi Troels,

I just checked this change using the master profiling script (1
iteration and only the 'NS MMQ 2-site' model):

"""
$ ./disp_profile_all.py /data/relax/branches/disp_spin_speed2
/data/relax/branches/disp_spin_speed
[snip]

New relax version:  relax repository checkout r24288
svn+ssh://[email protected]/svn/relax/branches/disp_spin_speed
Old relax version:  relax repository checkout r24274
svn+ssh://[email protected]/svn/relax/branches/disp_spin_speed

Execution iteration 1

$ python profiling_ns_mmq_2site.py /data/relax/branches/disp_spin_speed2
     1000    0.026    0.000   25.812    0.026
relax_disp.py:1450(func_ns_mmq_2site)
       10    0.003    0.000   25.587    2.559
relax_disp.py:1450(func_ns_mmq_2site)
$ python profiling_ns_mmq_2site.py /data/relax/branches/disp_spin_speed
     1000    0.025    0.000   25.860    0.026
relax_disp.py:1427(func_ns_mmq_2site)
       10    0.002    0.000   25.504    2.550
relax_disp.py:1427(func_ns_mmq_2site)

[snip]

100 single spins analysis:
NS MMQ 2-site:             258.600+/-0.000 -> 258.120+/-0.000,   1.002x faster.

Cluster of 100 spins analysis:
NS MMQ 2-site:             255.040+/-0.000 -> 255.870+/-0.000,   0.997x faster.
"""

This is strange!  I guess that the numpy.einsum() function is
performing its own internal Python looping, not using the highly
optimised and multi-threaded BLAS/LAPACK libraries, and hence this
change does not influence the speed.  This is nevertheless very useful
and important for the eventual elimination of the Python loops in the
lib.dispersion.ns_mmq_2site module.  But it is interesting that
numpy.einsum() is performing the same as Python looping in this case.
I guess that future numpy versions will have far greater optimisation
for numpy.einsum() so that this change will be significant later on.

Cheers,

Edward


On 24 June 2014 15:27,  <[email protected]> wrote:
> Author: tlinnet
> Date: Tue Jun 24 15:27:51 2014
> New Revision: 24282
>
> URL: http://svn.gna.org/viewcvs/relax?rev=24282&view=rev
> Log:
> Speeded up ns mmq 2site, by moving the forming of evolution matrix out of the 
> for loops, and preform it.
>
> Task #7807 (https://gna.org/task/index.php?7807): Speed-up of dispersion 
> models for Clustered analysis.
>
> Modified:
>     branches/disp_spin_speed/lib/dispersion/ns_mmq_2site.py
>
> Modified: branches/disp_spin_speed/lib/dispersion/ns_mmq_2site.py
> URL: 
> http://svn.gna.org/viewcvs/relax/branches/disp_spin_speed/lib/dispersion/ns_mmq_2site.py?rev=24282&r1=24281&r2=24282&view=diff
> ==============================================================================
> --- branches/disp_spin_speed/lib/dispersion/ns_mmq_2site.py     (original)
> +++ branches/disp_spin_speed/lib/dispersion/ns_mmq_2site.py     Tue Jun 24 
> 15:27:51 2014
> @@ -51,7 +51,7 @@
>
>  # Python module imports.
>  from math import floor
> -from numpy import array, conj, complex64, dot, float64, log, multiply, sum
> +from numpy import array, conj, complex64, dot, einsum, float64, log, 
> multiply, sum
>
>  # relax module imports.
>  from lib.float import isNaN
> @@ -197,8 +197,20 @@
>      M2_mat = matrix_exponential_rank_NS_NM_NO_ND_x_x(m2_mat, dtype=complex64)
>
>      # The complex conjugates M1* and M2*
> +    # Equivalent to D+*.
>      M1_star_mat = conj(M1_mat)
> +    # Equivalent to Z-*.
>      M2_star_mat = conj(M2_mat)
> +
> +    # Repetitive dot products (minimised for speed).
> +    M1_M2_mat = einsum('...ij,...jk', M1_mat, M2_mat)
> +    M2_M1_mat = einsum('...ij,...jk', M2_mat, M1_mat)
> +    M1_M2_M2_M1_mat = einsum('...ij,...jk', M1_M2_mat, M2_M1_mat)
> +    M2_M1_M1_M2_mat = einsum('...ij,...jk', M2_M1_mat, M1_M2_mat)
> +    M1_M2_star_mat = einsum('...ij,...jk', M1_star_mat, M2_star_mat)
> +    M2_M1_star_mat = einsum('...ij,...jk', M2_star_mat, M1_star_mat)
> +    M1_M2_M2_M1_star_mat = einsum('...ij,...jk', M1_M2_star_mat, 
> M2_M1_star_mat)
> +    M2_M1_M1_M2_star_mat = einsum('...ij,...jk', M2_M1_star_mat, 
> M1_M2_star_mat)
>
>      # Loop over spins.
>      for si in range(NS):
> @@ -210,79 +222,68 @@
>
>                  # Loop over the time points, back calculating the R2eff 
> values.
>                  for i in range(num_points_i):
> -                    # The M1 and M2 matrices.
> -                    # Equivalent to D+.
> -                    M1_i = M1_mat[si, mi, oi, i]
> -                    # Equivalent to Z-.
> -                    M2_i = M2_mat[si, mi, oi, i]
> -
> -                    # The complex conjugates M1* and M2*
> -                    # Equivalent to D+*.
> -                    M1_star_i = M1_star_mat[si, mi, oi, i]
> -                    # Equivalent to Z-*.
> -                    M2_star_i = M2_star_mat[si, mi, oi, i]
> -
> -                    # Repetitive dot products (minimised for speed).
> -                    M1_M2 = dot(M1_i, M2_i)
> -                    M2_M1 = dot(M2_i, M1_i)
> -                    M1_M2_M2_M1 = dot(M1_M2, M2_M1)
> -                    M2_M1_M1_M2 = dot(M2_M1, M1_M2)
> -                    M1_M2_star = dot(M1_star_i, M2_star_i)
> -                    M2_M1_star = dot(M2_star_i, M1_star_i)
> -                    M1_M2_M2_M1_star = dot(M1_M2_star, M2_M1_star)
> -                    M2_M1_M1_M2_star = dot(M2_M1_star, M1_M2_star)
> +                    # Extract data from array.
> +                    power_i = power[si, mi, oi, i]
> +                    M1_M2_i = M1_M2_mat[si, mi, oi, i]
> +                    M1_M2_star_i = M1_M2_star_mat[si, mi, oi, i]
> +                    M2_M1_i = M2_M1_mat[si, mi, oi, i]
> +                    M2_M1_star_i = M2_M1_star_mat[si, mi, oi, i]
> +                    M1_M2_M2_M1_i = M1_M2_M2_M1_mat[si, mi, oi, i]
> +                    M2_M1_M1_M2_star_i = M2_M1_M1_M2_star_mat[si, mi, oi, i]
> +                    M2_M1_M1_M2_i = M2_M1_M1_M2_mat[si, mi, oi, i]
> +                    M1_M2_M2_M1_star_i = M1_M2_M2_M1_star_mat[si, mi, oi, i]
>
>                      # Special case of 1 CPMG block - the power is zero.
> -                    if power[si, mi, oi, i] == 1:
> +                    if power_i == 1:
>                          # M1.M2.
> -                        A = M1_M2
> +                        A = M1_M2_i
>
>                          # M1*.M2*.
> -                        B = M1_M2_star
> +                        B = M1_M2_star_i
>
>                          # M2.M1.
> -                        C = M2_M1
> +                        C = M2_M1_i
>
>                          # M2*.M1*.
> -                        D = M2_M1_star
> +                        D = M2_M1_star_i
>
>                      # Matrices for even number of CPMG blocks.
> -                    elif power[si, mi, oi, i] % 2 == 0:
> +                    elif power_i % 2 == 0:
>                          # The power factor (only calculate once).
> -                        fact = int(floor(power[si, mi, oi, i] / 2))
> +                        fact = int(floor(power_i / 2))
>
>                          # (M1.M2.M2.M1)^(n/2).
> -                        A = square_matrix_power(M1_M2_M2_M1, fact)
> +                        A = square_matrix_power(M1_M2_M2_M1_i, fact)
>
>                          # (M2*.M1*.M1*.M2*)^(n/2).
> -                        B = square_matrix_power(M2_M1_M1_M2_star, fact)
> +                        B = square_matrix_power(M2_M1_M1_M2_star_i, fact)
>
>                          # (M2.M1.M1.M2)^(n/2).
> -                        C = square_matrix_power(M2_M1_M1_M2, fact)
> +                        C = square_matrix_power(M2_M1_M1_M2_i, fact)
>
>                          # (M1*.M2*.M2*.M1*)^(n/2).
> -                        D = square_matrix_power(M1_M2_M2_M1_star, fact)
> +                        D = square_matrix_power(M1_M2_M2_M1_star_i, fact)
>
>                      # Matrices for odd number of CPMG blocks.
>                      else:
>                          # The power factor (only calculate once).
> -                        fact = int(floor((power[si, mi, oi, i] - 1) / 2))
> +                        fact = int(floor((power_i - 1) / 2))
>
>                          # (M1.M2.M2.M1)^((n-1)/2).M1.M2.
> -                        A = square_matrix_power(M1_M2_M2_M1, fact)
> -                        A = dot(A, M1_M2)
> +                        A = square_matrix_power(M1_M2_M2_M1_i, fact)
> +                        A = dot(A, M1_M2_i)
>
>                          # (M1*.M2*.M2*.M1*)^((n-1)/2).M1*.M2*.
> -                        B = square_matrix_power(M1_M2_M2_M1_star, fact)
> -                        B = dot(B, M1_M2_star)
> +                        B = square_matrix_power(M1_M2_M2_M1_star_i, fact)
> +                        B = dot(B, M1_M2_star_i)
>
>                          # (M2.M1.M1.M2)^((n-1)/2).M2.M1.
> -                        C = square_matrix_power(M2_M1_M1_M2, fact)
> -                        C = dot(C, M2_M1)
> +                        C = square_matrix_power(M2_M1_M1_M2_i, fact)
> +                        C = dot(C, M2_M1_i)
>
>                          # (M2*.M1*.M1*.M2*)^((n-1)/2).M2*.M1*.
> -                        D = square_matrix_power(M2_M1_M1_M2_star, fact)
> -                        D = dot(D, M2_M1_star)
> +                        D = square_matrix_power(M2_M1_M1_M2_star_i, fact)
> +                        D = dot(D, M2_M1_star_i)
>
>                      # The next lines calculate the R2eff using a two-point 
> approximation, i.e. assuming that the decay is mono-exponential.
>                      A_B = dot(A, B)
> @@ -353,6 +354,11 @@
>      A_pos_mat =  matrix_exponential_rank_NS_NM_NO_ND_x_x(m1_mat, 
> dtype=complex64)
>      A_neg_mat =  matrix_exponential_rank_NS_NM_NO_ND_x_x(m2_mat, 
> dtype=complex64)
>
> +    # The evolution for one n.
> +    evol_block_mat = einsum('...ij,...jk', A_neg_mat, A_pos_mat)
> +    evol_block_mat = einsum('...ij,...jk', A_neg_mat, evol_block_mat)
> +    evol_block_mat = einsum('...ij,...jk', A_pos_mat, evol_block_mat)
> +
>      # Loop over spins.
>      for si in range(NS):
>          # Loop over the spectrometer frequencies.
> @@ -364,15 +370,12 @@
>
>                  # Loop over the time points, back calculating the R2eff 
> values.
>                  for i in range(num_points_i):
> -                    # The A+/- matrices.
> -                    A_pos_i = A_pos_mat[si, mi, oi, i]
> -                    A_neg_i = A_neg_mat[si, mi, oi, i]
> -
> -                    # The evolution for one n.
> -                    evol_block = dot(A_pos_i, dot(A_neg_i, dot(A_neg_i, 
> A_pos_i)))
> +                    # Extract data from array.
> +                    power_i = power[si, mi, oi, i]
> +                    evol_block_i = evol_block_mat[si, mi, oi, i]
>
>                      # The full evolution.
> -                    evol = square_matrix_power(evol_block, power[si, mi, oi, 
> i])
> +                    evol = square_matrix_power(evol_block_i, power_i)
>
>                      # The next lines calculate the R2eff using a two-point 
> approximation, i.e. assuming that the decay is mono-exponential.
>                      Mx = dot(F_vector, dot(evol, M0))
>
>
> _______________________________________________
> relax (http://www.nmr-relax.com)
>
> This is the relax-commits mailing list
> [email protected]
>
> To unsubscribe from this list, get a password
> reminder, or change your subscription options,
> visit the list information page at
> https://mail.gna.org/listinfo/relax-commits

_______________________________________________
relax (http://www.nmr-relax.com)

This is the relax-devel mailing list
[email protected]

To unsubscribe from this list, get a password
reminder, or change your subscription options,
visit the list information page at
https://mail.gna.org/listinfo/relax-devel

Reply via email to