Hi,
This approach can add a little speed, but you really need to stress
test and profile the timings to understand the effect. You should also
try different Python versions (2 and 3), because each implementation
is different. A change can sometimes give a speed up in Python 2 which
does nothing in Python 3 (due to Python 3 being more optimised). There
can also be huge differences between numpy versions. Anyway, here is a
comprehensive test which shows 3 different implementation ideas for
the back-calculated R2eff data in the dispersion functions:
"""
import cProfile as profile
from numpy import array, cos, float64, sin, zeros
import pstats


def in_place(values, bc):
    x = cos(values) * sin(values)
    for i in range(len(bc)):
        bc[i] = x[i]

def really_slow(values, bc):
    for i in range(len(bc)):
        x = cos(values[i]) * sin(values[i])
        bc[i] = x

def return_bc(values):
    return cos(values) * sin(values)

def test_in_place(inc=None, values=None, values2=None, bc=None):
    for i in range(inc):
        in_place(values, bc[0])
        in_place(values2, bc[1])
    print(bc)

def test_really_slow(inc=None, values=None, values2=None, bc=None):
    for i in range(inc):
        really_slow(values, bc[0])
        really_slow(values2, bc[1])
    print(bc)

def test_return_bc(inc=None, values=None, values2=None, bc=None):
    for i in range(inc):
        bc[0] = return_bc(values)
        bc[1] = return_bc(values2)
    print(bc)

def test():
    values = array([1, 3, 0.1], float64)
    values2 = array([0.1, 0.2, 0.3], float64)
    bc = zeros((2, 3), float64)
    inc = 1000000
    test_in_place(inc=inc, values=values, values2=values2, bc=bc)
    test_really_slow(inc=inc, values=values, values2=values2, bc=bc)
    test_return_bc(inc=inc, values=values, values2=values2, bc=bc)

def print_stats(stats, status=0):
    pstats.Stats(stats).sort_stats('time', 'name').print_stats()
profile.Profile.print_stats = print_stats

profile.runctx('test()', globals(), locals())
""""
Try running this in Python 2 and 3. If the cProfile import does not
work on one version, try simply "import profile". You should create
such scripts for testing out code optimisation ideas. Knowing how to
profile is essential. For Python 3, I see:
"""
$ python3.4 edward.py
[[ 0.45464871 -0.13970775 0.09933467]
[ 0.09933467 0.19470917 0.28232124]]
[[ 0.45464871 -0.13970775 0.09933467]
[ 0.09933467 0.19470917 0.28232124]]
[[ 0.45464871 -0.13970775 0.09933467]
[ 0.09933467 0.19470917 0.28232124]]
         10001042 function calls (10001036 primitive calls) in 39.303 seconds

   Ordered by: internal time, function name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
  2000000   19.744    0.000   19.867    0.000 edward.py:10(really_slow)
  2000000    9.128    0.000    9.286    0.000 edward.py:5(in_place)
  2000000    5.966    0.000    5.966    0.000 edward.py:15(return_bc)
        1    1.964    1.964    7.931    7.931 edward.py:30(test_return_bc)
        1    1.119    1.119   20.987   20.987 edward.py:24(test_really_slow)
        1    1.099    1.099   10.385   10.385 edward.py:18(test_in_place)
  4000198    0.281    0.000    0.281    0.000 {built-in method len}
       27    0.001    0.000    0.001    0.000 {method 'reduce' of 'numpy.ufunc' objects}
"""
Here you can see that test_return_bc() takes about 80% of the time of
test_in_place() (7.9 versus 10.4 seconds). The 'cumtime' column is the
important number - it is the total amount of time spent in that
function, including everything it calls. So the speed up is not huge.
For Python 2:
"""
$ python2.7 edward.py
[[ 0.45464871 -0.13970775 0.09933467]
[ 0.09933467 0.19470917 0.28232124]]
[[ 0.45464871 -0.13970775 0.09933467]
[ 0.09933467 0.19470917 0.28232124]]
[[ 0.45464871 -0.13970775 0.09933467]
[ 0.09933467 0.19470917 0.28232124]]
         14000972 function calls (14000966 primitive calls) in 38.625 seconds

   Ordered by: internal time, function name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
  2000000   18.373    0.000   19.086    0.000 edward.py:10(really_slow)
  2000000    8.798    0.000    9.576    0.000 edward.py:5(in_place)
  2000000    5.937    0.000    5.937    0.000 edward.py:15(return_bc)
        1    1.839    1.839    7.785    7.785 edward.py:30(test_return_bc)
  4000021    1.141    0.000    1.141    0.000 {range}
        1    1.086    1.086   10.675   10.675 edward.py:18(test_in_place)
        1    1.070    1.070   20.165   20.165 edward.py:24(test_really_slow)
  4000198    0.379    0.000    0.379    0.000 {len}
"""
Hmmm, Python 2 is faster than Python 3 for this example! See for
yourself. If you really think that making the code roughly 1.3 times
faster, as shown in these tests, is worth your time, then this must be
done in a Subversion branch (http://svn.gna.org/viewcvs/relax/branches/).
That way we can have timing tests between the trunk and the branch.
As this affects all dispersion models, the changes will be quite
disruptive. And if the implementation is not faster, or if it breaks
everything, then the branch can simply be deleted. Whatever you do,
please don't use a git-svn branch.
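If you do go down this path, here is one more idea worth adding to the
profiling script. I have not timed it against the three implementations
above, so treat it as a sketch: numpy ufuncs accept an out= argument, and
array rows support slice assignment, both of which update the preallocated
structure without an element-by-element Python loop:

```python
from numpy import allclose, array, cos, float64, multiply, sin, zeros

values = array([1, 3, 0.1], float64)
bc = zeros((2, 3), float64)

# Idea 4a: slice assignment copies the result into the existing row, so
# the bc structure seen by the calling code is updated in place.
bc[0][:] = cos(values) * sin(values)

# Idea 4b: the ufunc out= argument writes the product directly into the
# row, avoiding even the temporary array created by the multiplication.
multiply(cos(values), sin(values), bc[1])

print(bc)
```

Both rows end up holding the same cos*sin values, but neither rebinds the
bc name, so the caller's structure is modified just as with in_place().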
Regards,
Edward
On 9 May 2014 14:07, Troels Emtekær Linnet <[email protected]> wrote:
> Hi Edward.
>
> How about this script?
> Here I try to pass back the r2eff values, and then set them in the
> back_calculated class object.
> Will this work?
>
> Otherwise, I found this post about updating values:
> http://stackoverflow.com/questions/14916284/in-python-class-object-how-to-auto-update-attributes
> They talk about @property and setters, which I don't get yet. :-)
>
> Best
> Troels
>
>
> ---------------
>
>
> def loop_rep(x, nr):
>     y = [98, 99]
>     for i in range(nr):
>         x[i] = y[i]
>
> def not_loop_rep(x, nr):
>     y = [98, 99]
>     x = y
>
> def not_loop_rep_new(x, nr):
>     y = [98, 99]
>     x = y
>     return x
>
>
> class MyClass:
>     def __init__(self, x):
>         self.x = x
>         self.nr = len(x)
>
>     def printc(self):
>         print self.x, self.nr
>
>     def calc_loop_rep(self, x=None, nr=None):
>         loop_rep(x=self.x, nr=self.nr)
>
>     def calc_not_loop_rep(self, x=None, nr=None):
>         not_loop_rep(x=self.x, nr=self.nr)
>
>     def calc_not_loop_rep_new(self, x=None, nr=None):
>         self.x = not_loop_rep_new(x=self.x, nr=self.nr)
>
>
> print("For class where we loop replace")
> # Create object of class.
> t_rep = MyClass([0, 1])
> # Print object of class.
> t_rep.printc()
> # Calc object of class.
> t_rep.calc_loop_rep()
> # Then print.
> t_rep.printc()
>
> print("\nFor class where we do not loop replace")
> t = MyClass([3, 4])
> t.printc()
> t.calc_not_loop_rep()
> t.printc()
>
> print("\nFor class where we do not loop replace, but return")
> t_new = MyClass([5, 6])
> t_new.printc()
> t_new.calc_not_loop_rep_new()
> t_new.printc()
>
> 2014-05-05 19:07 GMT+02:00 Edward d'Auvergne <[email protected]>:
>> :) It does slow it down a little, but that's unavoidable. It's also
>> unavoidable in C, Fortran, Perl, etc. As long as the number of
>> operations in that loop is minimal, then it's the best you can do. If
>> it worries you, you could run a test where you call the target
>> function say 1e6 times, with and without the loop to see the timing
>> difference. Or simply running in Python 2:
>>
>> for i in xrange(1000000):
>>     x = 1
>>
>> Then try:
>>
>> for i in xrange(100000000):
>>     x = 2
>>
>> These two demonstrate the slowness of the Python loop. But the second
>> case is extreme and you won't encounter that much looping in these
>> functions. So while it is theoretically slower than C and Fortran
>> looping, you can probably see that no one would care :) Here is
>> another test, with Python 2 code:
>>
>> """
>> import cProfile as profile
>>
>> def loop_1e6():
>>     for i in xrange(int(1e6)):
>>         x = 1
>>
>> def loop_1e8():
>>     for i in xrange(int(1e8)):
>>         x = 1
>>
>> def sum_conv():
>>     for i in xrange(100000000):
>>         x = 2 + 2.
>>
>> def sum_normal():
>>     for i in xrange(100000000):
>>         x = 2. + 2.
>>
>> def test():
>>     loop_1e6()
>>     loop_1e8()
>>     sum_normal()
>>     sum_conv()
>>
>> profile.runctx('test()', globals(), locals())
>> """
>>
>> Running this on my system shows:
>>
>> """
>> 7 function calls in 6.707 seconds
>>
>> Ordered by: standard name
>>
>>    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
>>         1    0.000    0.000    6.707    6.707 <string>:1(<module>)
>>         1    2.228    2.228    2.228    2.228 aaa.py:11(sum_conv)
>>         1    2.228    2.228    2.228    2.228 aaa.py:15(sum_normal)
>>         1    0.000    0.000    6.707    6.707 aaa.py:19(test)
>>         1    0.022    0.022    0.022    0.022 aaa.py:3(loop_1e6)
>>         1    2.228    2.228    2.228    2.228 aaa.py:7(loop_1e8)
>>         1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
>> """
>>
>> That should be self-explanatory. The better optimisation targets are
>> the repeated maths operations and the maths operations that can be
>> shifted into the target function or the target function
>> initialisation. Despite the numbers above, which show my int to float
>> speed argument to be utter nonsense, it might still be good to remove
>> the int to float conversions, if only to match the other functions.
>>
>> Regards,
>>
>> Edward
>>
>>
>>
>>
>> On 5 May 2014 18:45, Troels Emtekær Linnet <[email protected]> wrote:
>>> The reason I ask is that I am afraid that this for loop slows
>>> everything down.
>>>
>>> What do you think?
>>>
>>> 2014-05-05 18:41 GMT+02:00 Edward d'Auvergne <[email protected]>:
>>>> This is not Python specific though :) As far as I know, C uses
>>>> pass-by-value for arguments, unless they are arrays or other funky
>>>> objects/functions/etc.. This is the same behaviour as Python.
>>>> Pass-by-reference and pass-by-value is something that needs to be
>>>> mastered in all languages, whether or not you have pointers to play
>>>> with.
>>>>
>>>> Regards,
>>>>
>>>> Edward
>>>>
>>>>
>>>>
>>>> On 5 May 2014 18:30, Troels Emtekær Linnet <[email protected]> wrote:
>>>>> This reminds me:
>>>>>
>>>>> http://combichem.blogspot.dk/2013/08/you-know-what-really-grinds-my-gears-in.html
>>>>>
>>>>> 2014-05-05 17:52 GMT+02:00 Edward d'Auvergne <[email protected]>:
>>>>>> Hi,
>>>>>>
>>>>>> This is an important difference. In the first case (back_calc[i] =
>>>>>> Minty[i]), what is happening is that you are copying the data into a
>>>>>> pre-existing structure. In the second case (back_calc = Minty), only
>>>>>> the local name back_calc is rebound to a new object. Therefore the
>>>>>> back_calc structure in the calling code will be modified in the first
>>>>>> case but not the second. Here is some demo code:
>>>>>>
>>>>>> def mod1(x):
>>>>>>     x[0] = 1
>>>>>>
>>>>>> def mod2(x):
>>>>>>     x = [3, 2]
>>>>>>
>>>>>> x = [0, 2]
>>>>>> print(x)
>>>>>> mod1(x)
>>>>>> print(x)
>>>>>> mod2(x)
>>>>>> print(x)
>>>>>>
>>>>>> I don't know of a way around this.
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Edward
>>>>>>
>>>>>>
>>>>>> On 5 May 2014 17:42, Troels Emtekær Linnet <[email protected]> wrote:
>>>>>>> Hi Edward.
>>>>>>>
>>>>>>> In the library function of b14.py, i am looping over
>>>>>>> the dispersion points to put in the data.
>>>>>>>
>>>>>>> for i in range(num_points):
>>>>>>>     # The full formula.
>>>>>>>     back_calc[i] = Minty[i]
>>>>>>>
>>>>>>> Why can I not just set:
>>>>>>> back_calc = Minty
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> relax (http://www.nmr-relax.com)
>>>>>>>
>>>>>>> This is the relax-devel mailing list
>>>>>>> [email protected]
>>>>>>>
>>>>>>> To unsubscribe from this list, get a password
>>>>>>> reminder, or change your subscription options,
>>>>>>> visit the list information page at
>>>>>>> https://mail.gna.org/listinfo/relax-devel