Would you agree on discarding the whole loop, return one array with all values = 1e100, rather than some elements have 1e100, but other rather high values.?
Best Troels 2014-05-09 15:58 GMT+02:00 Edward d'Auvergne <[email protected]>: > Hi, > > To set up a branch, just read the 3 small sections of the 'Branches' > section of the user manual > (http://www.nmr-relax.com/manual/Branches.html). Using git here is > fatal. But all the commands you need are listed there. Try the CR72 > optimisations first though. Those will then make the API changes much > easier. > > Regards, > > Edward > > > > On 9 May 2014 15:34, Troels Emtekær Linnet <[email protected]> wrote: >> How do I setup a branch? :-) >> >> Best >> Troels >> >> 2014-05-09 15:31 GMT+02:00 Troels Emtekær Linnet <[email protected]>: >>> Hi Edward. >>> >>> I really think for my case, that 25 speed up is a deal breaker ! >>> I have so much data to crunch, that 25 speed is absolutely perfect. >>> >>> I would only optimise this for CR72, and TSMFK01, since these are the >>> ones I need now. >>> And the change of code is only 3-5 lines? >>> >>> And i was thinking of one thing more. >>> >>> CR72 always go over loop. >>> >>> ----------- >>> # Loop over the time points, back calculating the R2eff values. >>> for i in range(num_points): >>> # The full eta+/- values. >>> etapos = etapos_part / cpmg_frqs[i] >>> etaneg = etaneg_part / cpmg_frqs[i] >>> >>> # Catch large values of etapos going into the cosh function. >>> if etapos > 100: >>> back_calc[i] = 1e100 >>> continue >>> >>> # The arccosh argument - catch invalid values. >>> fact = Dpos * cosh(etapos) - Dneg * cos(etaneg) >>> if fact < 1.0: >>> back_calc[i] = r20_kex >>> continue >>> >>> # The full formula. >>> back_calc[i] = r20_kex - cpmg_frqs[i] * arccosh(fact) >>> ------------ >>> I would rather do: >>> etapos = etapos_part / cpmg_frqs >>> >>> And then check for nan values. >>> If any of these are there, just return the whole array with 1e100, >>> instead of single values. >>> That would replace a loop with a check. >>> >>> Best >>> Troels >>> >>> >>> 2014-05-09 14:58 GMT+02:00 Edward d'Auvergne <[email protected]>: >>>> Hi, >>>> >>>> This approach can add a little speed. You really need to stress test >>>> and have profile timings to understand. You should also try different >>>> Python versions (2 and 3) because each implementation is different. >>>> You can sometimes have a speed up in Python 2 which does nothing in >>>> Python 3 (due to Python 3 being more optimised). There can also be >>>> huge differences between numpy versions. Anyway, here is a powerful >>>> test which shows 3 different implementation ideas for the >>>> back-calculated R2eff data in the dispersion functions: >>>> >>>> """ >>>> import cProfile as profile >>>> from numpy import array, cos, float64, sin, zeros >>>> import pstats >>>> >>>> def in_place(values, bc): >>>> x = cos(values) * sin(values) >>>> for i in range(len(bc)): >>>> bc[i] = x[i] >>>> >>>> def really_slow(values, bc): >>>> for i in range(len(bc)): >>>> x = cos(values[i]) * sin(values[i]) >>>> bc[i] = x >>>> >>>> def return_bc(values): >>>> return cos(values) * sin(values) >>>> >>>> def test_in_place(inc=None, values=None, values2=None, bc=None): >>>> for i in range(inc): >>>> in_place(values, bc[0]) >>>> in_place(values2, bc[1]) >>>> print(bc) >>>> >>>> def test_really_slow(inc=None, values=None, values2=None, bc=None): >>>> for i in range(inc): >>>> really_slow(values, bc[0]) >>>> really_slow(values2, bc[1]) >>>> print(bc) >>>> >>>> def test_return_bc(inc=None, values=None, values2=None, bc=None): >>>> for i in range(inc): >>>> bc[0] = return_bc(values) >>>> bc[1] = return_bc(values2) >>>> print(bc) >>>> >>>> def test(): >>>> values = array([1, 3, 0.1], float64) >>>> values2 = array([0.1, 0.2, 0.3], float64) >>>> bc = zeros((2, 3), float64) >>>> inc = 1000000 >>>> test_in_place(inc=inc, values=values, values2=values2, bc=bc) >>>> test_really_slow(inc=inc, values=values, values2=values2, bc=bc) >>>> test_return_bc(inc=inc, values=values, values2=values2, bc=bc) >>>> >>>> def print_stats(stats, status=0): >>>> pstats.Stats(stats).sort_stats('time', 'name').print_stats() >>>> profile.Profile.print_stats = print_stats >>>> profile.runctx('test()', globals(), locals()) >>>> """" >>>> >>>> >>>> Try running this in Python 2 and 3. If the cProfile import does not >>>> work on one version, try simply "import profile". You should create >>>> such scripts for testing out code optimisation ideas. Knowing how to >>>> profile is essential. For Python 3, I see: >>>> >>>> """ >>>> $ python3.4 edward.py >>>> [[ 0.45464871 -0.13970775 0.09933467] >>>> [ 0.09933467 0.19470917 0.28232124]] >>>> [[ 0.45464871 -0.13970775 0.09933467] >>>> [ 0.09933467 0.19470917 0.28232124]] >>>> [[ 0.45464871 -0.13970775 0.09933467] >>>> [ 0.09933467 0.19470917 0.28232124]] >>>> 10001042 function calls (10001036 primitive calls) in 39.303 >>>> seconds >>>> >>>> Ordered by: internal time, function name >>>> >>>> ncalls tottime percall cumtime percall filename:lineno(function) >>>> 2000000 19.744 0.000 19.867 0.000 edward.py:10(really_slow) >>>> 2000000 9.128 0.000 9.286 0.000 edward.py:5(in_place) >>>> 2000000 5.966 0.000 5.966 0.000 edward.py:15(return_bc) >>>> 1 1.964 1.964 7.931 7.931 edward.py:30(test_return_bc) >>>> 1 1.119 1.119 20.987 20.987 >>>> edward.py:24(test_really_slow) >>>> 1 1.099 1.099 10.385 10.385 edward.py:18(test_in_place) >>>> 4000198 0.281 0.000 0.281 0.000 {built-in method len} >>>> 27 0.001 0.000 0.001 0.000 {method 'reduce' of >>>> 'numpy.ufunc' objects} >>>> """ >>>> >>>> Here you can see that test_return_bc() is 80% the speed of >>>> test_in_place(). The 'cumtime' is the important number, this is the >>>> total amount of time spent in that function. So the speed up is not >>>> huge. For Python 2: >>>> >>>> """ >>>> $ python2.7 edward.py >>>> [[ 0.45464871 -0.13970775 0.09933467] >>>> [ 0.09933467 0.19470917 0.28232124]] >>>> [[ 0.45464871 -0.13970775 0.09933467] >>>> [ 0.09933467 0.19470917 0.28232124]] >>>> [[ 0.45464871 -0.13970775 0.09933467] >>>> [ 0.09933467 0.19470917 0.28232124]] >>>> 14000972 function calls (14000966 primitive calls) in 38.625 >>>> seconds >>>> >>>> Ordered by: internal time, function name >>>> >>>> ncalls tottime percall cumtime percall filename:lineno(function) >>>> 2000000 18.373 0.000 19.086 0.000 edward.py:10(really_slow) >>>> 2000000 8.798 0.000 9.576 0.000 edward.py:5(in_place) >>>> 2000000 5.937 0.000 5.937 0.000 edward.py:15(return_bc) >>>> 1 1.839 1.839 7.785 7.785 edward.py:30(test_return_bc) >>>> 4000021 1.141 0.000 1.141 0.000 {range} >>>> 1 1.086 1.086 10.675 10.675 edward.py:18(test_in_place) >>>> 1 1.070 1.070 20.165 20.165 >>>> edward.py:24(test_really_slow) >>>> 4000198 0.379 0.000 0.379 0.000 {len} >>>> """ >>>> >>>> Hmmm, Python 2 is faster than Python 3 for this example! See for >>>> yourself. If you really think that making the code 1.25 times faster, >>>> as shown in these tests, is worth your time, then this must be done in >>>> a subversion branch (http://svn.gna.org/viewcvs/relax/branches/). >>>> That way we can have timing tests between the trunk and the branch. >>>> As this affects all dispersion models, the changes will be quite >>>> disruptive. And if the implementation is not faster or if it breaks >>>> everything, then the branch can be deleted. What ever you do, please >>>> don't use a git-svn branch. >>>> >>>> Regards, >>>> >>>> Edward >>>> >>>> >>>> >>>> >>>> On 9 May 2014 14:07, Troels Emtekær Linnet <[email protected]> wrote: >>>>> Hi Edward. >>>>> >>>>> How about this script? >>>>> Here I try to pass the back the r2eff values, and then set them in the >>>>> back_calculated class object. >>>>> Will this work ?? >>>>> >>>>> Or else I found this post about updating values. >>>>> http://stackoverflow.com/questions/14916284/in-python-class-object-how-to-auto-update-attributes >>>>> They talk about >>>>> @property >>>>> and setter, which I dont get yet. :-) >>>>> >>>>> Best >>>>> Troels >>>>> >>>>> >>>>> --------------- >>>>> >>>>> >>>>> def loop_rep(x, nr): >>>>> y = [98, 99] >>>>> for i in range(nr): >>>>> x[i] = y[i] >>>>> >>>>> def not_loop_rep(x, nr): >>>>> y = [98, 99] >>>>> x = y >>>>> >>>>> def not_loop_rep_new(x, nr): >>>>> y = [98, 99] >>>>> x = y >>>>> return x >>>>> >>>>> >>>>> class MyClass: >>>>> def __init__(self, x): >>>>> self.x = x >>>>> self.nr = len(x) >>>>> >>>>> def printc(self): >>>>> print self.x, self.nr >>>>> >>>>> def calc_loop_rep(self, x=None, nr=None): >>>>> loop_rep(x=self.x, nr=self.nr) >>>>> >>>>> def calc_not_loop_rep(self, x=None, nr=None): >>>>> not_loop_rep(x=self.x, nr=self.nr) >>>>> >>>>> def calc_not_loop_rep_new(self, x=None, nr=None): >>>>> self.x = not_loop_rep_new(x=self.x, nr=self.nr) >>>>> >>>>> print("For class where we loop replace ") >>>>> "Create object of class" >>>>> t_rep = MyClass([0, 1]) >>>>> "Print object of class" >>>>> t_rep.printc() >>>>> "Calc object of class" >>>>> t_rep.calc_loop_rep() >>>>> " Then print" >>>>> t_rep.printc() >>>>> >>>>> print("\nFor class where we not loop replace ") >>>>> " Now try with replace " >>>>> t = MyClass([3, 4]) >>>>> t.printc() >>>>> t.calc_not_loop_rep() >>>>> t.printc() >>>>> >>>>> print("\nFor class where we not loop replace ") >>>>> t_new = MyClass([5, 6]) >>>>> t_new.printc() >>>>> t_new.calc_not_loop_rep_new() >>>>> t_new.printc() >>>>> >>>>> 2014-05-05 19:07 GMT+02:00 Edward d'Auvergne <[email protected]>: >>>>>> :) It does slow it down a little, but that's unavoidable. It's also >>>>>> unavoidable in C, Fortran, Perl, etc. As long as the number of >>>>>> operations in that loop is minimal, then it's the best you can do. If >>>>>> it worries you, you could run a test where you call the target >>>>>> function say 1e6 times, with and without the loop to see the timing >>>>>> difference. Or simply running in Python 2: >>>>>> >>>>>> for i in xrange(1000000): >>>>>> x = 1 >>>>>> >>>>>> Then try: >>>>>> >>>>>> for i in xrange(100000000): >>>>>> x = 2 >>>>>> >>>>>> These two demonstrate the slowness of the Python loop. But the second >>>>>> case is extreme and you won't encounter that much looping in these >>>>>> functions. So while it is theoretically slower than C and Fortran >>>>>> looping, you can probably see that no one would care :) Here is >>>>>> another test, with Python 2 code: >>>>>> >>>>>> """ >>>>>> import cProfile as profile >>>>>> >>>>>> def loop_1e6(): >>>>>> for i in xrange(int(1e6)): >>>>>> x = 1 >>>>>> >>>>>> def loop_1e8(): >>>>>> for i in xrange(int(1e8)): >>>>>> x = 1 >>>>>> >>>>>> def sum_conv(): >>>>>> for i in xrange(100000000): >>>>>> x = 2 + 2. >>>>>> >>>>>> def sum_normal(): >>>>>> for i in xrange(100000000): >>>>>> x = 2. + 2. >>>>>> >>>>>> def test(): >>>>>> loop_1e6() >>>>>> loop_1e8() >>>>>> sum_normal() >>>>>> sum_conv() >>>>>> >>>>>> profile.runctx('test()', globals(), locals()) >>>>>> """ >>>>>> >>>>>> Running this on my system shows: >>>>>> >>>>>> """ >>>>>> 7 function calls in 6.707 seconds >>>>>> >>>>>> Ordered by: standard name >>>>>> >>>>>> ncalls tottime percall cumtime percall filename:lineno(function) >>>>>> 1 0.000 0.000 6.707 6.707 <string>:1(<module>) >>>>>> 1 2.228 2.228 2.228 2.228 aaa.py:11(sum_conv) >>>>>> 1 2.228 2.228 2.228 2.228 aaa.py:15(sum_normal) >>>>>> 1 0.000 0.000 6.707 6.707 aaa.py:19(test) >>>>>> 1 0.022 0.022 0.022 0.022 aaa.py:3(loop_1e6) >>>>>> 1 2.228 2.228 2.228 2.228 aaa.py:7(loop_1e8) >>>>>> 1 0.000 0.000 0.000 0.000 {method 'disable' of >>>>>> '_lsprof.Profiler' objects} >>>>>> """ >>>>>> >>>>>> That should be self explanatory. The better optimisation targets are >>>>>> the repeated maths operations and the maths operations that can be >>>>>> shifted into the target function or the target function >>>>>> initialisation. Despite the numbers above which prove my int to float >>>>>> speed argument as utter nonsense, it might still good to remove the >>>>>> int to float conversions, if only to match the other functions. >>>>>> >>>>>> Regards, >>>>>> >>>>>> Edward >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> On 5 May 2014 18:45, Troels Emtekær Linnet <[email protected]> wrote: >>>>>>> The reason why I ask, is that I am afraid that this for loop slows >>>>>>> everything down. >>>>>>> >>>>>>> What do you think? >>>>>>> >>>>>>> 2014-05-05 18:41 GMT+02:00 Edward d'Auvergne <[email protected]>: >>>>>>>> This is not Python specific though :) As far as I know, C uses >>>>>>>> pass-by-value for arguments, unless they are arrays or other funky >>>>>>>> objects/functions/etc.. This is the same behaviour as Python. >>>>>>>> Pass-by-reference and pass-by-value is something that needs to be >>>>>>>> mastered in all languages, whether or not you have pointers to play >>>>>>>> with. >>>>>>>> >>>>>>>> Regards, >>>>>>>> >>>>>>>> Edward >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On 5 May 2014 18:30, Troels Emtekær Linnet <[email protected]> >>>>>>>> wrote: >>>>>>>>> This reminds me: >>>>>>>>> >>>>>>>>> http://combichem.blogspot.dk/2013/08/you-know-what-really-grinds-my-gears-in.html >>>>>>>>> >>>>>>>>> 2014-05-05 17:52 GMT+02:00 Edward d'Auvergne <[email protected]>: >>>>>>>>>> Hi, >>>>>>>>>> >>>>>>>>>> This is an important difference. In the first case (back_calc[i] = >>>>>>>>>> Minty[i]), what is happening is that your are copying the data into a >>>>>>>>>> pre-existing structure. In the second case (back_calc = Minty), the >>>>>>>>>> existing back_calc structure will be overwritten. Therefore the >>>>>>>>>> back_calc structure in the calling code will be modified in the first >>>>>>>>>> case but not the second. Here is some demo code: >>>>>>>>>> >>>>>>>>>> def mod1(x): >>>>>>>>>> x[0] = 1 >>>>>>>>>> >>>>>>>>>> def mod2(x): >>>>>>>>>> x = [3, 2] >>>>>>>>>> >>>>>>>>>> x = [0, 2] >>>>>>>>>> print(x) >>>>>>>>>> mod1(x) >>>>>>>>>> print(x) >>>>>>>>>> mod2(x) >>>>>>>>>> print(x) >>>>>>>>>> >>>>>>>>>> I don't know of a way around this. >>>>>>>>>> >>>>>>>>>> Regards, >>>>>>>>>> >>>>>>>>>> Edward >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On 5 May 2014 17:42, Troels Emtekær Linnet <[email protected]> >>>>>>>>>> wrote: >>>>>>>>>>> Hi Edward. >>>>>>>>>>> >>>>>>>>>>> In the library function of b14.py, i am looping over >>>>>>>>>>> the dispersion points to put in the data. >>>>>>>>>>> >>>>>>>>>>> for i in range(num_points): >>>>>>>>>>> >>>>>>>>>>> # The full formula. >>>>>>>>>>> back_calc[i] = Minty[i] >>>>>>>>>>> >>>>>>>>>>> Why can I not just set: >>>>>>>>>>> back_calc = Minty >>>>>>>>>>> >>>>>>>>>>> _______________________________________________ >>>>>>>>>>> relax (http://www.nmr-relax.com) >>>>>>>>>>> >>>>>>>>>>> This is the relax-devel mailing list >>>>>>>>>>> [email protected] >>>>>>>>>>> >>>>>>>>>>> To unsubscribe from this list, get a password >>>>>>>>>>> reminder, or change your subscription options, >>>>>>>>>>> visit the list information page at >>>>>>>>>>> https://mail.gna.org/listinfo/relax-devel _______________________________________________ relax (http://www.nmr-relax.com) This is the relax-devel mailing list [email protected] To unsubscribe from this list, get a password reminder, or change your subscription options, visit the list information page at https://mail.gna.org/listinfo/relax-devel

