For this problem, the following reference could be interesting for
summarising the different techniques:

Moler, C. and Van Loan, C. (2003)  Nineteen Dubious Ways to Compute
the Exponential of a Matrix, Twenty-Five Years Later.  SIAM Review,
45, 3-49.  (http://dx.doi.org/10.1137/S00361445024180 or
http://www.cs.cornell.edu/cv/researchpdf/19ways+.pdf).

I'm putting the link here for future reference as well.

Regards,

Edward



On 25 June 2014 18:12, Edward d'Auvergne <[email protected]> wrote:
> Well, looking at the Scipy source code of ./scipy/linalg/matfuncs.py,
> expm2() and expm3() are implemented in Python code, interfacing to
> other scipy functions.  And expm() is an alias for the function of the
> same name in ./scipy/sparse/linalg/matfuncs.py, which is also
> implemented in Python code.  So there is no BLAS/LAPACK magic
> happening here, except maybe in the functions that these expm*()
> functions call.
>
> Regards,
>
> Edward
>
>
>
> On 25 June 2014 17:54, Edward d'Auvergne <[email protected]> wrote:
>> Hi Troels,
>>
>> If you are interested in eliminating this 86% bottleneck in the
>> numeric R1rho dispersion models in the
>> lib.dispersion.matrix_exponential.matrix_exponential_rank_NE_NS_NM_NO_ND_x_x()
>> function, it might be worth looking at the scipy.linalg.expm(),
>> scipy.linalg.expm2(), and scipy.linalg.expm3() functions
>> (http://docs.scipy.org/doc/scipy/reference/tutorial/linalg.html#exponential-and-logarithm-functions),
>> or the BLAS/LAPACK functions these interface to.  Note that I
>> eliminated the use of these functions in the dispersion analysis a
>> while ago (http://article.gmane.org/gmane.science.nmr.relax.scm/18869/).
>> And for good reason, there was a nasty Scipy bug in the Pade
>> approximation of scipy.linalg.expm().  This and the fatal bugs in all
>> optimisation algorithms in Scipy (the catalyst for me to create minfx,
>> originally as part of relax, https://gna.org/projects/minfx/), is the
>> reason why I try to avoid Scipy as much as possible - I just cannot
>> trust it any more.  Scipy is a collection of poorly integrated tools,
>> often with the original authors completely disappearing, and it is
>> full of bugs.  Anyway, maybe one of these functions would be of use or
>> would be a good inspiration.  Oh, the scipy.linalg.expm() function in
>> the original code comes from the fitting_main_kex.py script from
>> Mathilde Lescanne, Paul Schanda, and Dominique Marion.  Note that if
>> the Taylor series approximation of scipy.linalg.expm3() works well for
>> the numeric dispersion models, which is untested, that might be by far
>> the fastest solution.
>>
>> Regards,
>>
>> Edward
>>
>>
>> On 25 June 2014 08:20, Edward d'Auvergne <[email protected]> wrote:
>>> Wow, you managed to work out how to remove the slow Python looping
>>> from the numeric models!  For the record, this is the speed up for
>>> this 'NS R1rho 2-site' model using Python 2.7.6 & numpy 1.8.0 on a
>>> 64-bit Linux system:
>>>
>>> """
>>> $ ./disp_profile_all.py /data/relax/branches/disp_spin_speed2
>>> /data/relax/branches/disp_spin_speed
>>> [snip]
>>>
>>> New relax version:  relax repository checkout r24308
>>> svn+ssh://[email protected]/svn/relax/branches/disp_spin_speed
>>> Old relax version:  relax repository checkout r24274
>>> svn+ssh://[email protected]/svn/relax/branches/disp_spin_speed
>>>
>>> Execution iteration 1
>>>
>>> $ python profiling_ns_r1rho_2site.py /data/relax/branches/disp_spin_speed2
>>>      1000    0.016    0.000    3.561    0.004
>>> relax_disp.py:1582(func_ns_r1rho_2site)
>>>        10    0.013    0.001    3.237    0.324
>>> relax_disp.py:1582(func_ns_r1rho_2site)
>>> $ python profiling_ns_r1rho_2site.py /data/relax/branches/disp_spin_speed
>>>      1000    0.017    0.000    6.245    0.006
>>> relax_disp.py:1552(func_ns_r1rho_2site)
>>>        10    0.013    0.001    5.899    0.590
>>> relax_disp.py:1552(func_ns_r1rho_2site)
>>> [snip]
>>>
>>> 100 single spins analysis:
>>> NS R1rho 2-site:            62.450+/-0.000 ->  35.610+/-0.000,   1.754x 
>>> faster.
>>>
>>> Cluster of 100 spins analysis:
>>> NS R1rho 2-site:            58.990+/-0.000 ->  32.370+/-0.000,   1.822x 
>>> faster.
>>> """
>>>
>>>
>>> This is quite significant for these models.  With all the changes
>>> since relax 3.2.2:
>>>
>>> """
>>> $ ./disp_profile_all.py /data/relax/branches/disp_spin_speed2
>>> /data/relax/tags/3.2.2/
>>> [snip]
>>>
>>> New relax version:  relax repository checkout r24308
>>> svn+ssh://[email protected]/svn/relax/branches/disp_spin_speed
>>> Old relax version:  relax 3.2.2
>>>
>>> Execution iteration 1
>>>
>>> $ python profiling_ns_r1rho_2site.py /data/relax/branches/disp_spin_speed2
>>>      1000    0.016    0.000    3.576    0.004
>>> relax_disp.py:1582(func_ns_r1rho_2site)
>>>        10    0.013    0.001    3.247    0.325
>>> relax_disp.py:1582(func_ns_r1rho_2site)
>>> $ python profiling_ns_r1rho_2site.py /data/relax/tags/3.2.2/
>>>      1000    0.245    0.000   24.323    0.024
>>> relax_disp.py:1681(func_ns_r1rho_2site)
>>>        10    0.266    0.027   24.439    2.444
>>> relax_disp.py:1681(func_ns_r1rho_2site)
>>> [snip]
>>>
>>> 100 single spins analysis:
>>> NS R1rho 2-site:           243.230+/-0.000 ->  35.760+/-0.000,   6.802x 
>>> faster.
>>>
>>> Cluster of 100 spins analysis:
>>> NS R1rho 2-site:           244.390+/-0.000 ->  32.470+/-0.000,   7.527x 
>>> faster.
>>> """
>>>
>>>
>>> Considering how slow these models are compared to the analytic models,
>>> this is a huge improvement!  According to your
>>> profiling_ns_r1rho_2site.py script, the slow parts are:
>>>
>>> """
>>> ('chi2 cluster:', 14153134802.76194)
>>> Wed Jun 25 07:59:18 2014    /tmp/tmp17lqUn
>>>
>>>          211077 function calls in 3.510 seconds
>>>
>>>    Ordered by: cumulative time
>>>
>>>    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
>>>         1    0.000    0.000    3.510    3.510 <string>:1(<module>)
>>>         1    0.002    0.002    3.510    3.510
>>> profiling_ns_r1rho_2site.py:553(cluster)
>>>        10    0.000    0.000    3.254    0.325
>>> profiling_ns_r1rho_2site.py:515(calc)
>>>        10    0.015    0.002    3.254    0.325
>>> relax_disp.py:1582(func_ns_r1rho_2site)
>>>        10    0.015    0.002    3.236    0.324
>>> ns_r1rho_2site.py:190(ns_r1rho_2site)
>>>        10    0.104    0.010    3.041    0.304
>>> matrix_exponential.py:33(matrix_exponential_rank_NE_NS_NM_NO_ND_x_x)
>>>        10    1.803    0.180    1.827    0.183 linalg.py:982(eig)
>>>        10    0.564    0.056    0.601    0.060 linalg.py:455(inv)
>>>        40    0.501    0.013    0.501    0.013 {numpy.core.multiarray.einsum}
>>> """
>>>
>>>
>>> The single spin numbers are very similar.  Looking at the total times,
>>> the numpy.einsum and numpy.inv functions are taking 1/7 of the time
>>> each, and numpy.eig 51% of the time - this is a total of 80% of the
>>> time.  The 
>>> lib.dispersion.matrix_exponential.matrix_exponential_rank_NE_NS_NM_NO_ND_x_x()
>>> function now takes up 86% of the cumulative time - so this is the new
>>> bottleneck.  The only way to solve this would be using clever linear
>>> algebra tricks to simplify or to change the logic in
>>> matrix_exponential_rank_NE_NS_NM_NO_ND_x_x().  There are multiple
>>> techniques for finding the matrix exponential, it could be that the
>>> eigenvalue method is one of the slowest.  This looks to be interesting
>>> for an overview of recent research in the mathematics field on this
>>> subject: http://www.maths.manchester.ac.uk/~higham/talks/exp09.pdf.
>>> Great work though, you have eliminated the worst bottleneck for the
>>> numeric models!
>>>
>>> Cheers,
>>>
>>> Edward
>>>
>>>
>>>
>>> On 25 June 2014 02:14,  <[email protected]> wrote:
>>>> Author: tlinnet
>>>> Date: Wed Jun 25 02:14:36 2014
>>>> New Revision: 24294
>>>>
>>>> URL: http://svn.gna.org/viewcvs/relax?rev=24294&view=rev
>>>> Log:
>>>> Speeded up the code of NS r1rho 2site.
>>>>
>>>> This was essential done to numpy einsum, and doing the dot operations in 
>>>> multiple dimensions.
>>>> It was though necessary to realize, that to do the proper dot product 
>>>> operations, the outer two
>>>> axis if M0 should be swapped, by rolling the outer axis one back.
>>>>
>>>> Task #7807 (https://gna.org/task/index.php?7807): Speed-up of dispersion 
>>>> models for Clustered analysis.
>>>>
>>>> Modified:
>>>>     branches/disp_spin_speed/lib/dispersion/ns_r1rho_2site.py
>>>>     branches/disp_spin_speed/target_functions/relax_disp.py
>>>>
>>>> Modified: branches/disp_spin_speed/lib/dispersion/ns_r1rho_2site.py
>>>> URL: 
>>>> http://svn.gna.org/viewcvs/relax/branches/disp_spin_speed/lib/dispersion/ns_r1rho_2site.py?rev=24294&r1=24293&r2=24294&view=diff
>>>> ==============================================================================
>>>> --- branches/disp_spin_speed/lib/dispersion/ns_r1rho_2site.py   (original)
>>>> +++ branches/disp_spin_speed/lib/dispersion/ns_r1rho_2site.py   Wed Jun 25 
>>>> 02:14:36 2014
>>>> @@ -50,10 +50,10 @@
>>>>  """
>>>>
>>>>  # Python module imports.
>>>> -from numpy import array, cos, dot, einsum, float64, log, multiply, 
>>>> rollaxis, sin, sum
>>>> +from numpy import array, einsum, float64, isfinite, log, min, multiply, 
>>>> rollaxis, sin, sum
>>>> +from numpy.ma import fix_invalid, masked_less
>>>>
>>>>  # relax module imports.
>>>> -from lib.float import isNaN
>>>>  from lib.dispersion.matrix_exponential import 
>>>> matrix_exponential_rank_NE_NS_NM_NO_ND_x_x
>>>>
>>>>  # Repetitive calculations (to speed up calculations).
>>>> @@ -187,14 +187,16 @@
>>>>      return matrix
>>>>
>>>>
>>>> -def ns_r1rho_2site(M0=None, r1rho_prime=None, omega=None, offset=None, 
>>>> r1=0.0, pA=None, dw=None, kex=None, spin_lock_fields=None, 
>>>> relax_time=None, inv_relax_time=None, back_calc=None, num_points=None):
>>>> +def ns_r1rho_2site(M0=None, M0_T=None, r1rho_prime=None, omega=None, 
>>>> offset=None, r1=0.0, pA=None, dw=None, kex=None, spin_lock_fields=None, 
>>>> relax_time=None, inv_relax_time=None, back_calc=None, num_points=None):
>>>>      """The 2-site numerical solution to the Bloch-McConnell equation for 
>>>> R1rho data.
>>>>
>>>>      This function calculates and stores the R1rho values.
>>>>
>>>>
>>>>      @keyword M0:                This is a vector that contains the 
>>>> initial magnetizations corresponding to the A and B state transverse 
>>>> magnetizations.
>>>> -    @type M0:                   numpy float64, rank-1, 7D array
>>>> +    @type M0:                   numpy float array of rank 
>>>> [NE][NS][NM][NO][ND][6][1]
>>>> +    @keyword M0_T:              This is a vector that contains the 
>>>> initial magnetizations corresponding to the A and B state transverse 
>>>> magnetizations, where the outer two axis has been swapped for efficient 
>>>> dot operations.
>>>> +    @type M0_T:                 numpy float array of rank 
>>>> [NE][NS][NM][NO][ND][1][6]
>>>>      @keyword r1rho_prime:       The R1rho_prime parameter value (R1rho 
>>>> with no exchange).
>>>>      @type r1rho_prime:          numpy float array of rank 
>>>> [NE][NS][NM][NO][ND]
>>>>      @keyword omega:             The chemical shift for the spin in rad/s.
>>>> @@ -238,43 +240,22 @@
>>>>      # Magnetization evolution.
>>>>      Rexpo_M0_mat = einsum('...ij,...jk', Rexpo_mat, M0)
>>>>
>>>> -    # Transpose M0, to prepare for dot operation. Roll the last axis one 
>>>> back.
>>>> -    M0_T = rollaxis(M0, 6, 5)
>>>> -
>>>> -    if NS == 1:
>>>> -        # Loop over the spectrometer frequencies.
>>>> -        for mi in range(NM):
>>>> -            # Loop over offsets:
>>>> -            for oi in range(NO):
>>>> -                # Extract number of points.
>>>> -                num_points_i = num_points[0, 0, mi, oi]
>>>> -
>>>> -                # Loop over the time points, back calculating the R2eff 
>>>> values.
>>>> -                for j in range(num_points_i):
>>>> -
>>>> -                    # If the number of spins are 1, do the fastest method 
>>>> by dot product.  Else do it by einstein summation.
>>>> -                    # Set the spin index for one spin.
>>>> -                    si = 0
>>>> -                    # Extract the preformed matrix that rotate the 
>>>> magnetization previous to spin-lock into the weff frame.
>>>> -                    M0_i= M0_T[0, si, mi, oi, j]
>>>> -
>>>> -                    # Extract from the pre-formed Magnetization evolution 
>>>> matrix.
>>>> -                    Rexpo_M0_mat_i = Rexpo_M0_mat[0, si, mi, oi, j]
>>>> -
>>>> -                    # Magnetization evolution.
>>>> -                    MA = dot(M0_i, Rexpo_M0_mat_i)[0, 0]
>>>> -
>>>> -                    # The next lines calculate the R1rho using a 
>>>> two-point approximation, i.e. assuming that the decay is mono-exponential.
>>>> -                    if MA <= 0.0 or isNaN(MA):
>>>> -                        back_calc[0, si, mi, oi, j] = 1e99
>>>> -                    else:
>>>> -                        back_calc[0, si, mi, oi, j]= -inv_relax_time[0, 
>>>> si, mi, oi, j] * log(MA)
>>>> -
>>>> -    else:
>>>> -        # Magnetization evolution, which include all dimensions.
>>>> -        MA_mat = einsum('...ij,...jk', M0_T, Rexpo_M0_mat)[:, :, :, :, :, 
>>>> 0, 0]
>>>> -
>>>> -        # Do back calculation.
>>>> -        back_calc[:] = -inv_relax_time * log(MA_mat)
>>>> -
>>>> -
>>>> +    # Magnetization evolution, which include all dimensions.
>>>> +    MA_mat = einsum('...ij,...jk', M0_T, Rexpo_M0_mat)[:, :, :, :, :, 0, 
>>>> 0]
>>>> +
>>>> +    # Insert safe checks.
>>>> +    if min(MA_mat) < 0.0:
>>>> +        mask_min_MA_mat = masked_less(MA_mat, 0.0)
>>>> +        # Fill with high values.
>>>> +        MA_mat[mask_min_MA_mat.mask] = 1e100
>>>> +
>>>> +    # Do back calculation.
>>>> +    back_calc[:] = -inv_relax_time * log(MA_mat)
>>>> +
>>>> +    # Catch errors, taking a sum over array is the fastest way to check 
>>>> for
>>>> +    # +/- inf (infinity) and nan (not a number).
>>>> +    if not isfinite(sum(back_calc)):
>>>> +        # Replaces nan, inf, etc. with fill value.
>>>> +        fix_invalid(back_calc, copy=False, fill_value=1e100)
>>>> +
>>>> +
>>>>
>>>> Modified: branches/disp_spin_speed/target_functions/relax_disp.py
>>>> URL: 
>>>> http://svn.gna.org/viewcvs/relax/branches/disp_spin_speed/target_functions/relax_disp.py?rev=24294&r1=24293&r2=24294&view=diff
>>>> ==============================================================================
>>>> --- branches/disp_spin_speed/target_functions/relax_disp.py     (original)
>>>> +++ branches/disp_spin_speed/target_functions/relax_disp.py     Wed Jun 25 
>>>> 02:14:36 2014
>>>> @@ -26,8 +26,7 @@
>>>>
>>>>  # Python module imports.
>>>>  from copy import deepcopy
>>>> -from math import pi
>>>> -from numpy import add, array, arctan2, asarray, cos, complex64, dot, 
>>>> float64, int16, max, multiply, ones, sin, sqrt, sum, tile, zeros
>>>> +from numpy import arctan2, cos, dot, float64, int16, multiply, ones, 
>>>> rollaxis, pi, sin, sum, zeros
>>>>  from numpy.ma import masked_equal
>>>>
>>>>  # relax module imports.
>>>> @@ -415,6 +414,9 @@
>>>>              M0_2[2, 0] = 1
>>>>              M0_cos = multiply.outer( cos(theta_mat), M0_2 )
>>>>              self.M0 = M0_sin + M0_cos
>>>> +            # Transpose M0, to prepare for dot operation. Roll the last 
>>>> axis one back, corresponds to a transpose for the outer two axis.
>>>> +            self.M0_T = rollaxis(self.M0, 6, 5)
>>>> +
>>>>          if model in [MODEL_NS_R1RHO_3SITE, MODEL_NS_R1RHO_3SITE_LINEAR]:
>>>>              self.M0 = zeros(9, float64)
>>>>              # Offset of spin-lock from A.
>>>> @@ -1598,7 +1600,7 @@
>>>>          self.r20_struct[:] = multiply.outer( r1rho_prime.reshape(self.NE, 
>>>> self.NS, self.NM), self.no_nd_ones )
>>>>
>>>>          # Back calculate the R2eff values.
>>>> -        ns_r1rho_2site(M0=self.M0, r1rho_prime=self.r20_struct, 
>>>> omega=self.chemical_shifts, offset=self.offset, r1=self.r1, pA=pA, 
>>>> dw=self.dw_struct, kex=kex, spin_lock_fields=self.spin_lock_omega1, 
>>>> relax_time=self.relax_times, inv_relax_time=self.inv_relax_times, 
>>>> back_calc=self.back_calc, num_points=self.num_disp_points)
>>>> +        ns_r1rho_2site(M0=self.M0, M0_T=self.M0_T, 
>>>> r1rho_prime=self.r20_struct, omega=self.chemical_shifts, 
>>>> offset=self.offset, r1=self.r1, pA=pA, dw=self.dw_struct, kex=kex, 
>>>> spin_lock_fields=self.spin_lock_omega1, relax_time=self.relax_times, 
>>>> inv_relax_time=self.inv_relax_times, back_calc=self.back_calc, 
>>>> num_points=self.num_disp_points)
>>>>
>>>>          # Clean the data for all values, which is left over at the end of 
>>>> arrays.
>>>>          self.back_calc = self.back_calc*self.disp_struct
>>>>
>>>>
>>>> _______________________________________________
>>>> relax (http://www.nmr-relax.com)
>>>>
>>>> This is the relax-commits mailing list
>>>> [email protected]
>>>>
>>>> To unsubscribe from this list, get a password
>>>> reminder, or change your subscription options,
>>>> visit the list information page at
>>>> https://mail.gna.org/listinfo/relax-commits

_______________________________________________
relax (http://www.nmr-relax.com)

This is the relax-devel mailing list
[email protected]

To unsubscribe from this list, get a password
reminder, or change your subscription options,
visit the list information page at
https://mail.gna.org/listinfo/relax-devel

Reply via email to