Re: Why does one have to loop over the dispersion points?

Edward d'Auvergne Fri, 09 May 2014 06:59:34 -0700

Hi,

To set up a branch, just read the 3 small sections of the 'Branches'
section of the user manual
(http://www.nmr-relax.com/manual/Branches.html).  Using git here is
fatal.  But all the commands you need are listed there.  Try the CR72
optimisations first though.  Those will then make the API changes much
easier.


Regards,

Edward



On 9 May 2014 15:34, Troels Emtekær Linnet <[email protected]> wrote:
> How do I setup a branch? :-)
>
> Best
> Troels
>
> 2014-05-09 15:31 GMT+02:00 Troels Emtekær Linnet <[email protected]>:
>> Hi Edward.
>>
>> I really think for my case, that 25 speed up is a deal breaker !
>> I have so much data to crunch, that 25 speed is absolutely perfect.
>>
>> I would only optimise this for CR72, and TSMFK01, since these are the
>> ones I need now.
>> And the change of code is only 3-5 lines?
>>
>> And i was thinking of one thing more.
>>
>> CR72 always go over loop.
>>
>> -----------
>>     # Loop over the time points, back calculating the R2eff values.
>>     for i in range(num_points):
>>         # The full eta+/- values.
>>         etapos = etapos_part / cpmg_frqs[i]
>>         etaneg = etaneg_part / cpmg_frqs[i]
>>
>>         # Catch large values of etapos going into the cosh function.
>>         if etapos > 100:
>>             back_calc[i] = 1e100
>>             continue
>>
>>         # The arccosh argument - catch invalid values.
>>         fact = Dpos * cosh(etapos) - Dneg * cos(etaneg)
>>         if fact < 1.0:
>>             back_calc[i] = r20_kex
>>             continue
>>
>>         # The full formula.
>>         back_calc[i] = r20_kex - cpmg_frqs[i] * arccosh(fact)
>> ------------
>> I would rather do:
>> etapos = etapos_part / cpmg_frqs
>>
>> And then check for nan values.
>> If any of these are there, just return the whole array with 1e100,
>> instead of single values.
>> That would replace a loop with a check.
>>
>> Best
>> Troels
>>
>>
>> 2014-05-09 14:58 GMT+02:00 Edward d'Auvergne <[email protected]>:
>>> Hi,
>>>
>>> This approach can add a little speed.  You really need to stress test
>>> and have profile timings to understand.  You should also try different
>>> Python versions (2 and 3) because each implementation is different.
>>> You can sometimes have a speed up in Python 2 which does nothing in
>>> Python 3 (due to Python 3 being more optimised).  There can also be
>>> huge differences between numpy versions.  Anyway, here is a powerful
>>> test which shows 3 different implementation ideas for the
>>> back-calculated R2eff data in the dispersion functions:
>>>
>>> """
>>> import cProfile as profile
>>> from numpy import array, cos, float64, sin, zeros
>>> import pstats
>>>
>>> def in_place(values, bc):
>>>     x = cos(values) * sin(values)
>>>     for i in range(len(bc)):
>>>         bc[i] = x[i]
>>>
>>> def really_slow(values, bc):
>>>     for i in range(len(bc)):
>>>         x = cos(values[i]) * sin(values[i])
>>>         bc[i] = x
>>>
>>> def return_bc(values):
>>>     return cos(values) * sin(values)
>>>
>>> def test_in_place(inc=None, values=None, values2=None, bc=None):
>>>     for i in range(inc):
>>>         in_place(values, bc[0])
>>>         in_place(values2, bc[1])
>>>     print(bc)
>>>
>>> def test_really_slow(inc=None, values=None, values2=None, bc=None):
>>>     for i in range(inc):
>>>         really_slow(values, bc[0])
>>>         really_slow(values2, bc[1])
>>>     print(bc)
>>>
>>> def test_return_bc(inc=None, values=None, values2=None, bc=None):
>>>     for i in range(inc):
>>>         bc[0] = return_bc(values)
>>>         bc[1] = return_bc(values2)
>>>     print(bc)
>>>
>>> def test():
>>>     values = array([1, 3, 0.1], float64)
>>>     values2 = array([0.1, 0.2, 0.3], float64)
>>>     bc = zeros((2, 3), float64)
>>>     inc = 1000000
>>>     test_in_place(inc=inc, values=values, values2=values2, bc=bc)
>>>     test_really_slow(inc=inc, values=values, values2=values2, bc=bc)
>>>     test_return_bc(inc=inc, values=values, values2=values2, bc=bc)
>>>
>>> def print_stats(stats, status=0):
>>>     pstats.Stats(stats).sort_stats('time', 'name').print_stats()
>>> profile.Profile.print_stats = print_stats
>>> profile.runctx('test()', globals(), locals())
>>> """"
>>>
>>>
>>> Try running this in Python 2 and 3.  If the cProfile import does not
>>> work on one version, try simply "import profile".  You should create
>>> such scripts for testing out code optimisation ideas.  Knowing how to
>>> profile is essential.  For Python 3, I see:
>>>
>>> """
>>> $ python3.4 edward.py
>>> [[ 0.45464871 -0.13970775  0.09933467]
>>>  [ 0.09933467  0.19470917  0.28232124]]
>>> [[ 0.45464871 -0.13970775  0.09933467]
>>>  [ 0.09933467  0.19470917  0.28232124]]
>>> [[ 0.45464871 -0.13970775  0.09933467]
>>>  [ 0.09933467  0.19470917  0.28232124]]
>>>          10001042 function calls (10001036 primitive calls) in 39.303 
>>> seconds
>>>
>>>    Ordered by: internal time, function name
>>>
>>>    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
>>>   2000000   19.744    0.000   19.867    0.000 edward.py:10(really_slow)
>>>   2000000    9.128    0.000    9.286    0.000 edward.py:5(in_place)
>>>   2000000    5.966    0.000    5.966    0.000 edward.py:15(return_bc)
>>>         1    1.964    1.964    7.931    7.931 edward.py:30(test_return_bc)
>>>         1    1.119    1.119   20.987   20.987 edward.py:24(test_really_slow)
>>>         1    1.099    1.099   10.385   10.385 edward.py:18(test_in_place)
>>>   4000198    0.281    0.000    0.281    0.000 {built-in method len}
>>>        27    0.001    0.000    0.001    0.000 {method 'reduce' of
>>> 'numpy.ufunc' objects}
>>> """
>>>
>>> Here you can see that test_return_bc() is 80% the speed of
>>> test_in_place().  The 'cumtime' is the important number, this is the
>>> total amount of time spent in that function.  So the speed up is not
>>> huge.  For Python 2:
>>>
>>> """
>>> $ python2.7 edward.py
>>> [[ 0.45464871 -0.13970775  0.09933467]
>>>  [ 0.09933467  0.19470917  0.28232124]]
>>> [[ 0.45464871 -0.13970775  0.09933467]
>>>  [ 0.09933467  0.19470917  0.28232124]]
>>> [[ 0.45464871 -0.13970775  0.09933467]
>>>  [ 0.09933467  0.19470917  0.28232124]]
>>>          14000972 function calls (14000966 primitive calls) in 38.625 
>>> seconds
>>>
>>>    Ordered by: internal time, function name
>>>
>>>    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
>>>   2000000   18.373    0.000   19.086    0.000 edward.py:10(really_slow)
>>>   2000000    8.798    0.000    9.576    0.000 edward.py:5(in_place)
>>>   2000000    5.937    0.000    5.937    0.000 edward.py:15(return_bc)
>>>         1    1.839    1.839    7.785    7.785 edward.py:30(test_return_bc)
>>>   4000021    1.141    0.000    1.141    0.000 {range}
>>>         1    1.086    1.086   10.675   10.675 edward.py:18(test_in_place)
>>>         1    1.070    1.070   20.165   20.165 edward.py:24(test_really_slow)
>>>   4000198    0.379    0.000    0.379    0.000 {len}
>>> """
>>>
>>> Hmmm, Python 2 is faster than Python 3 for this example!  See for
>>> yourself.  If you really think that making the code 1.25 times faster,
>>> as shown in these tests, is worth your time, then this must be done in
>>> a subversion branch (http://svn.gna.org/viewcvs/relax/branches/).
>>> That way we can have timing tests between the trunk and the branch.
>>> As this affects all dispersion models, the changes will be quite
>>> disruptive.  And if the implementation is not faster or if it breaks
>>> everything, then the branch can be deleted.  What ever you do, please
>>> don't use a git-svn branch.
>>>
>>> Regards,
>>>
>>> Edward
>>>
>>>
>>>
>>>
>>> On 9 May 2014 14:07, Troels Emtekær Linnet <[email protected]> wrote:
>>>> Hi Edward.
>>>>
>>>> How about this script?
>>>> Here I try to pass the back the r2eff values, and then set them in the
>>>> back_calculated class object.
>>>> Will this work ??
>>>>
>>>> Or else I found this post about updating values.
>>>> http://stackoverflow.com/questions/14916284/in-python-class-object-how-to-auto-update-attributes
>>>> They talk about
>>>> @property
>>>> and setter, which I dont get yet. :-)
>>>>
>>>> Best
>>>> Troels
>>>>
>>>>
>>>> ---------------
>>>>
>>>>
>>>> def loop_rep(x, nr):
>>>>     y = [98, 99]
>>>>     for i in range(nr):
>>>>         x[i] = y[i]
>>>>
>>>> def not_loop_rep(x, nr):
>>>>     y = [98, 99]
>>>>     x = y
>>>>
>>>> def not_loop_rep_new(x, nr):
>>>>     y = [98, 99]
>>>>     x = y
>>>>     return x
>>>>
>>>>
>>>> class MyClass:
>>>>     def __init__(self, x):
>>>>         self.x = x
>>>>         self.nr = len(x)
>>>>
>>>>     def printc(self):
>>>>         print self.x, self.nr
>>>>
>>>>     def calc_loop_rep(self, x=None, nr=None):
>>>>         loop_rep(x=self.x, nr=self.nr)
>>>>
>>>>     def calc_not_loop_rep(self, x=None, nr=None):
>>>>         not_loop_rep(x=self.x, nr=self.nr)
>>>>
>>>>     def calc_not_loop_rep_new(self, x=None, nr=None):
>>>>         self.x = not_loop_rep_new(x=self.x, nr=self.nr)
>>>>
>>>> print("For class where we loop replace ")
>>>> "Create object of class"
>>>> t_rep = MyClass([0, 1])
>>>> "Print object of class"
>>>> t_rep.printc()
>>>> "Calc object of class"
>>>> t_rep.calc_loop_rep()
>>>> " Then print"
>>>> t_rep.printc()
>>>>
>>>> print("\nFor class where we not loop replace ")
>>>> " Now try with replace "
>>>> t = MyClass([3, 4])
>>>> t.printc()
>>>> t.calc_not_loop_rep()
>>>> t.printc()
>>>>
>>>> print("\nFor class where we not loop replace ")
>>>> t_new = MyClass([5, 6])
>>>> t_new.printc()
>>>> t_new.calc_not_loop_rep_new()
>>>> t_new.printc()
>>>>
>>>> 2014-05-05 19:07 GMT+02:00 Edward d'Auvergne <[email protected]>:
>>>>> :)  It does slow it down a little, but that's unavoidable.  It's also
>>>>> unavoidable in C, Fortran, Perl, etc.  As long as the number of
>>>>> operations in that loop is minimal, then it's the best you can do.  If
>>>>> it worries you, you could run a test where you call the target
>>>>> function say 1e6 times, with and without the loop to see the timing
>>>>> difference.  Or simply running in Python 2:
>>>>>
>>>>> for i in xrange(1000000):
>>>>>  x = 1
>>>>>
>>>>> Then try:
>>>>>
>>>>> for i in xrange(100000000):
>>>>>  x = 2
>>>>>
>>>>> These two demonstrate the slowness of the Python loop.  But the second
>>>>> case is extreme and you won't encounter that much looping in these
>>>>> functions.  So while it is theoretically slower than C and Fortran
>>>>> looping, you can probably see that no one would care :)  Here is
>>>>> another test, with Python 2 code:
>>>>>
>>>>> """
>>>>> import cProfile as profile
>>>>>
>>>>> def loop_1e6():
>>>>>     for i in xrange(int(1e6)):
>>>>>         x = 1
>>>>>
>>>>> def loop_1e8():
>>>>>     for i in xrange(int(1e8)):
>>>>>         x = 1
>>>>>
>>>>> def sum_conv():
>>>>>     for i in xrange(100000000):
>>>>>         x = 2 + 2.
>>>>>
>>>>> def sum_normal():
>>>>>     for i in xrange(100000000):
>>>>>         x = 2. + 2.
>>>>>
>>>>> def test():
>>>>>     loop_1e6()
>>>>>     loop_1e8()
>>>>>     sum_normal()
>>>>>     sum_conv()
>>>>>
>>>>> profile.runctx('test()', globals(), locals())
>>>>> """
>>>>>
>>>>> Running this on my system shows:
>>>>>
>>>>> """
>>>>>          7 function calls in 6.707 seconds
>>>>>
>>>>>    Ordered by: standard name
>>>>>
>>>>>    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
>>>>>         1    0.000    0.000    6.707    6.707 <string>:1(<module>)
>>>>>         1    2.228    2.228    2.228    2.228 aaa.py:11(sum_conv)
>>>>>         1    2.228    2.228    2.228    2.228 aaa.py:15(sum_normal)
>>>>>         1    0.000    0.000    6.707    6.707 aaa.py:19(test)
>>>>>         1    0.022    0.022    0.022    0.022 aaa.py:3(loop_1e6)
>>>>>         1    2.228    2.228    2.228    2.228 aaa.py:7(loop_1e8)
>>>>>         1    0.000    0.000    0.000    0.000 {method 'disable' of
>>>>> '_lsprof.Profiler' objects}
>>>>> """
>>>>>
>>>>> That should be self explanatory.  The better optimisation targets are
>>>>> the repeated maths operations and the maths operations that can be
>>>>> shifted into the target function or the target function
>>>>> initialisation.  Despite the numbers above which prove my int to float
>>>>> speed argument as utter nonsense, it might still good to remove the
>>>>> int to float conversions, if only to match the other functions.
>>>>>
>>>>> Regards,
>>>>>
>>>>> Edward
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On 5 May 2014 18:45, Troels Emtekær Linnet <[email protected]> wrote:
>>>>>> The reason why I ask, is that I am afraid that this for loop slows
>>>>>> everything down.
>>>>>>
>>>>>> What do you think?
>>>>>>
>>>>>> 2014-05-05 18:41 GMT+02:00 Edward d'Auvergne <[email protected]>:
>>>>>>> This is not Python specific though :)  As far as I know, C uses
>>>>>>> pass-by-value for arguments, unless they are arrays or other funky
>>>>>>> objects/functions/etc..  This is the same behaviour as Python.
>>>>>>> Pass-by-reference and pass-by-value is something that needs to be
>>>>>>> mastered in all languages, whether or not you have pointers to play
>>>>>>> with.
>>>>>>>
>>>>>>> Regards,
>>>>>>>
>>>>>>> Edward
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 5 May 2014 18:30, Troels Emtekær Linnet <[email protected]> 
>>>>>>> wrote:
>>>>>>>> This reminds me:
>>>>>>>>
>>>>>>>> http://combichem.blogspot.dk/2013/08/you-know-what-really-grinds-my-gears-in.html
>>>>>>>>
>>>>>>>> 2014-05-05 17:52 GMT+02:00 Edward d'Auvergne <[email protected]>:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> This is an important difference.  In the first case (back_calc[i] =
>>>>>>>>> Minty[i]), what is happening is that your are copying the data into a
>>>>>>>>> pre-existing structure.  In the second case (back_calc = Minty), the
>>>>>>>>> existing back_calc structure will be overwritten.  Therefore the
>>>>>>>>> back_calc structure in the calling code will be modified in the first
>>>>>>>>> case but not the second.  Here is some demo code:
>>>>>>>>>
>>>>>>>>> def mod1(x):
>>>>>>>>>     x[0] = 1
>>>>>>>>>
>>>>>>>>> def mod2(x):
>>>>>>>>>     x = [3, 2]
>>>>>>>>>
>>>>>>>>> x = [0, 2]
>>>>>>>>> print(x)
>>>>>>>>> mod1(x)
>>>>>>>>> print(x)
>>>>>>>>> mod2(x)
>>>>>>>>> print(x)
>>>>>>>>>
>>>>>>>>> I don't know of a way around this.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>>
>>>>>>>>> Edward
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 5 May 2014 17:42, Troels Emtekær Linnet <[email protected]> 
>>>>>>>>> wrote:
>>>>>>>>>> Hi Edward.
>>>>>>>>>>
>>>>>>>>>> In the library function of b14.py, i am looping over
>>>>>>>>>> the dispersion points to put in the data.
>>>>>>>>>>
>>>>>>>>>>     for i in range(num_points):
>>>>>>>>>>
>>>>>>>>>>         # The full formula.
>>>>>>>>>>         back_calc[i] = Minty[i]
>>>>>>>>>>
>>>>>>>>>> Why can I not just set:
>>>>>>>>>> back_calc = Minty
>>>>>>>>>>
>>>>>>>>>> _______________________________________________
>>>>>>>>>> relax (http://www.nmr-relax.com)
>>>>>>>>>>
>>>>>>>>>> This is the relax-devel mailing list
>>>>>>>>>> [email protected]
>>>>>>>>>>
>>>>>>>>>> To unsubscribe from this list, get a password
>>>>>>>>>> reminder, or change your subscription options,
>>>>>>>>>> visit the list information page at
>>>>>>>>>> https://mail.gna.org/listinfo/relax-devel

_______________________________________________
relax (http://www.nmr-relax.com)

This is the relax-devel mailing list
[email protected]

To unsubscribe from this list, get a password
reminder, or change your subscription options,
visit the list information page at
https://mail.gna.org/listinfo/relax-devel

Re: Why does one have to loop over the dispersion points?

Reply via email to