How do I set up a branch? :-)

Best
Troels
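A branch is normally created server-side with "svn copy" and then checked
out.  A minimal sketch, assuming the standard trunk/branches layout of the
relax repository; the repository URL and the branch name "disp_speed" are
only placeholders:

"""
$ svn copy <repository_url>/trunk <repository_url>/branches/disp_speed \
      -m "Created the disp_speed branch for the dispersion speed-up tests."
$ svn checkout <repository_url>/branches/disp_speed relax-disp_speed
"""

The branch can later be merged back into the trunk, or simply removed with
"svn delete" if the speed-up does not work out.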
2014-05-09 15:31 GMT+02:00 Troels Emtekær Linnet <[email protected]>:
> Hi Edward.
>
> I really think that for my case a 25% speed-up is worth the effort!
> I have so much data to crunch that a 25% speed-up would be absolutely perfect.
>
> I would only optimise this for CR72 and TSMFK01, since these are the
> ones I need now.  And the code change is only 3-5 lines?
>
> And I was thinking of one more thing.
>
> CR72 always goes through this loop:
>
> -----------
> # Loop over the time points, back calculating the R2eff values.
> for i in range(num_points):
>     # The full eta+/- values.
>     etapos = etapos_part / cpmg_frqs[i]
>     etaneg = etaneg_part / cpmg_frqs[i]
>
>     # Catch large values of etapos going into the cosh function.
>     if etapos > 100:
>         back_calc[i] = 1e100
>         continue
>
>     # The arccosh argument - catch invalid values.
>     fact = Dpos * cosh(etapos) - Dneg * cos(etaneg)
>     if fact < 1.0:
>         back_calc[i] = r20_kex
>         continue
>
>     # The full formula.
>     back_calc[i] = r20_kex - cpmg_frqs[i] * arccosh(fact)
> ------------
>
> I would rather do:
>
>     etapos = etapos_part / cpmg_frqs
>
> and then check for nan values.  If any are present, just return the
> whole array filled with 1e100, instead of handling single values.
> That would replace a loop with a single check.
>
> Best
> Troels
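A minimal sketch of that vectorised idea, assuming numpy arrays throughout.
The helper name r2eff_cr72_vector is hypothetical, and the whole-array 1e100
fall-back deliberately follows the proposal above (it replaces the per-point
r20_kex substitution of the current loop), so this is an illustration rather
than the relax implementation:

"""
from numpy import arccosh, cos, cosh, errstate, isfinite

def r2eff_cr72_vector(r20_kex=None, Dpos=None, Dneg=None, etapos_part=None,
                      etaneg_part=None, cpmg_frqs=None, back_calc=None):
    # The eta+/- values for all dispersion points at once.
    etapos = etapos_part / cpmg_frqs
    etaneg = etaneg_part / cpmg_frqs

    # The full formula for all points, letting overflows and invalid
    # arccosh arguments become inf or nan quietly.
    with errstate(over='ignore', invalid='ignore'):
        fact = Dpos * cosh(etapos) - Dneg * cos(etaneg)
        back_calc[:] = r20_kex - cpmg_frqs * arccosh(fact)

    # If anything overflowed or fell outside the arccosh domain (fact < 1),
    # fail the whole grid point as proposed, rather than patching single values.
    if not isfinite(back_calc).all():
        back_calc[:] = 1e100
"""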
>
>
> 2014-05-09 14:58 GMT+02:00 Edward d'Auvergne <[email protected]>:
>> Hi,
>>
>> This approach can add a little speed.  You really need to stress test
>> and have profile timings to understand it.  You should also try different
>> Python versions (2 and 3), because each implementation is different.
>> You can sometimes have a speed-up in Python 2 which does nothing in
>> Python 3 (due to Python 3 being more optimised).  There can also be
>> huge differences between numpy versions.  Anyway, here is a powerful
>> test which shows 3 different implementation ideas for the
>> back-calculated R2eff data in the dispersion functions:
>>
>> """
>> import cProfile as profile
>> from numpy import array, cos, float64, sin, zeros
>> import pstats
>>
>> def in_place(values, bc):
>>     x = cos(values) * sin(values)
>>     for i in range(len(bc)):
>>         bc[i] = x[i]
>>
>> def really_slow(values, bc):
>>     for i in range(len(bc)):
>>         x = cos(values[i]) * sin(values[i])
>>         bc[i] = x
>>
>> def return_bc(values):
>>     return cos(values) * sin(values)
>>
>> def test_in_place(inc=None, values=None, values2=None, bc=None):
>>     for i in range(inc):
>>         in_place(values, bc[0])
>>         in_place(values2, bc[1])
>>     print(bc)
>>
>> def test_really_slow(inc=None, values=None, values2=None, bc=None):
>>     for i in range(inc):
>>         really_slow(values, bc[0])
>>         really_slow(values2, bc[1])
>>     print(bc)
>>
>> def test_return_bc(inc=None, values=None, values2=None, bc=None):
>>     for i in range(inc):
>>         bc[0] = return_bc(values)
>>         bc[1] = return_bc(values2)
>>     print(bc)
>>
>> def test():
>>     values = array([1, 3, 0.1], float64)
>>     values2 = array([0.1, 0.2, 0.3], float64)
>>     bc = zeros((2, 3), float64)
>>     inc = 1000000
>>     test_in_place(inc=inc, values=values, values2=values2, bc=bc)
>>     test_really_slow(inc=inc, values=values, values2=values2, bc=bc)
>>     test_return_bc(inc=inc, values=values, values2=values2, bc=bc)
>>
>> def print_stats(stats, status=0):
>>     pstats.Stats(stats).sort_stats('time', 'name').print_stats()
>> profile.Profile.print_stats = print_stats
>>
>> profile.runctx('test()', globals(), locals())
>> """
>>
>> Try running this in both Python 2 and 3.  If the cProfile import does not
>> work on one version, try simply "import profile".  You should create
>> such scripts for testing out code optimisation ideas.  Knowing how to
>> profile is essential.  For Python 3, I see:
>>
>> """
>> $ python3.4 edward.py
>> [[ 0.45464871 -0.13970775  0.09933467]
>>  [ 0.09933467  0.19470917  0.28232124]]
>> [[ 0.45464871 -0.13970775  0.09933467]
>>  [ 0.09933467  0.19470917  0.28232124]]
>> [[ 0.45464871 -0.13970775  0.09933467]
>>  [ 0.09933467  0.19470917  0.28232124]]
>>          10001042 function calls (10001036 primitive calls) in 39.303 seconds
>>
>>    Ordered by: internal time, function name
>>
>>    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
>>   2000000   19.744    0.000   19.867    0.000 edward.py:10(really_slow)
>>   2000000    9.128    0.000    9.286    0.000 edward.py:5(in_place)
>>   2000000    5.966    0.000    5.966    0.000 edward.py:15(return_bc)
>>         1    1.964    1.964    7.931    7.931 edward.py:30(test_return_bc)
>>         1    1.119    1.119   20.987   20.987 edward.py:24(test_really_slow)
>>         1    1.099    1.099   10.385   10.385 edward.py:18(test_in_place)
>>   4000198    0.281    0.000    0.281    0.000 {built-in method len}
>>        27    0.001    0.000    0.001    0.000 {method 'reduce' of 'numpy.ufunc' objects}
>> """
>>
>> Here you can see that test_return_bc() takes about 80% of the time of
>> test_in_place().  The 'cumtime' column is the important number - it is
>> the total amount of time spent in that function, including its
>> sub-calls.  So the speed-up is not huge.  For Python 2:
>>
>> """
>> $ python2.7 edward.py
>> [[ 0.45464871 -0.13970775  0.09933467]
>>  [ 0.09933467  0.19470917  0.28232124]]
>> [[ 0.45464871 -0.13970775  0.09933467]
>>  [ 0.09933467  0.19470917  0.28232124]]
>> [[ 0.45464871 -0.13970775  0.09933467]
>>  [ 0.09933467  0.19470917  0.28232124]]
>>          14000972 function calls (14000966 primitive calls) in 38.625 seconds
>>
>>    Ordered by: internal time, function name
>>
>>    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
>>   2000000   18.373    0.000   19.086    0.000 edward.py:10(really_slow)
>>   2000000    8.798    0.000    9.576    0.000 edward.py:5(in_place)
>>   2000000    5.937    0.000    5.937    0.000 edward.py:15(return_bc)
>>         1    1.839    1.839    7.785    7.785 edward.py:30(test_return_bc)
>>   4000021    1.141    0.000    1.141    0.000 {range}
>>         1    1.086    1.086   10.675   10.675 edward.py:18(test_in_place)
>>         1    1.070    1.070   20.165   20.165 edward.py:24(test_really_slow)
>>   4000198    0.379    0.000    0.379    0.000 {len}
>> """
>>
>> Hmmm, Python 2 is faster than Python 3 for this example!  See for
>> yourself.  If you really think that making the code 1.25 times faster,
>> as shown in these tests, is worth your time, then this must be done in
>> a subversion branch (http://svn.gna.org/viewcvs/relax/branches/).
>> That way we can have timing tests between the trunk and the branch.
>> As this affects all dispersion models, the changes will be quite
>> disruptive.  And if the implementation is not faster, or if it breaks
>> everything, then the branch can be deleted.  Whatever you do, please
>> don't use a git-svn branch.
>>
>> Regards,
>>
>> Edward
>>
>>
>> On 9 May 2014 14:07, Troels Emtekær Linnet <[email protected]> wrote:
>>> Hi Edward.
>>>
>>> How about this script?  Here I try to pass back the R2eff values, and
>>> then set them in the back_calculated class object.  Will this work?
>>>
>>> Otherwise, I found this post about updating values:
>>> http://stackoverflow.com/questions/14916284/in-python-class-object-how-to-auto-update-attributes
>>> They talk about @property and a setter, which I don't get yet. :-)
>>>
>>> Best
>>> Troels
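On the @property question, a minimal self-contained sketch of a property
with a setter, showing how a derived attribute can be kept up to date
automatically.  The class and attribute names below are only illustrative
and are not relax code:

"""
class Dispersion(object):
    def __init__(self, values):
        self.values = values    # This goes through the setter below.

    @property
    def values(self):
        # Getter - called for "obj.values".
        return self._values

    @values.setter
    def values(self, new_values):
        # Setter - called for "obj.values = ...", so derived
        # attributes can be kept in sync automatically.
        self._values = new_values
        self.num = len(new_values)

obj = Dispersion([1.0, 2.0])
print(obj.num)                  # 2
obj.values = [1.0, 2.0, 3.0]
print(obj.num)                  # 3, refreshed by the setter.
"""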
>>>
>>> ---------------
>>>
>>> def loop_rep(x, nr):
>>>     y = [98, 99]
>>>     for i in range(nr):
>>>         x[i] = y[i]
>>>
>>> def not_loop_rep(x, nr):
>>>     y = [98, 99]
>>>     x = y
>>>
>>> def not_loop_rep_new(x, nr):
>>>     y = [98, 99]
>>>     x = y
>>>     return x
>>>
>>>
>>> class MyClass:
>>>     def __init__(self, x):
>>>         self.x = x
>>>         self.nr = len(x)
>>>
>>>     def printc(self):
>>>         print self.x, self.nr
>>>
>>>     def calc_loop_rep(self, x=None, nr=None):
>>>         loop_rep(x=self.x, nr=self.nr)
>>>
>>>     def calc_not_loop_rep(self, x=None, nr=None):
>>>         not_loop_rep(x=self.x, nr=self.nr)
>>>
>>>     def calc_not_loop_rep_new(self, x=None, nr=None):
>>>         self.x = not_loop_rep_new(x=self.x, nr=self.nr)
>>>
>>>
>>> print("For the class where we loop and replace element by element")
>>> # Create an object of the class.
>>> t_rep = MyClass([0, 1])
>>> # Print the object.
>>> t_rep.printc()
>>> # Calculate.
>>> t_rep.calc_loop_rep()
>>> # Then print again.
>>> t_rep.printc()
>>>
>>> print("\nFor the class where we rebind the local name instead of looping")
>>> t = MyClass([3, 4])
>>> t.printc()
>>> t.calc_not_loop_rep()
>>> t.printc()
>>>
>>> print("\nFor the class where we assign the returned value")
>>> t_new = MyClass([5, 6])
>>> t_new.printc()
>>> t_new.calc_not_loop_rep_new()
>>> t_new.printc()
>>>
>>> 2014-05-05 19:07 GMT+02:00 Edward d'Auvergne <[email protected]>:
>>>> :)  It does slow it down a little, but that's unavoidable.  It's also
>>>> unavoidable in C, Fortran, Perl, etc.  As long as the number of
>>>> operations in that loop is minimal, it's the best you can do.  If
>>>> it worries you, you could run a test where you call the target
>>>> function say 1e6 times, with and without the loop, to see the timing
>>>> difference.  Or simply run this in Python 2:
>>>>
>>>>     for i in xrange(1000000):
>>>>         x = 1
>>>>
>>>> Then try:
>>>>
>>>>     for i in xrange(100000000):
>>>>         x = 2
>>>>
>>>> These two demonstrate the slowness of the Python loop.  But the second
>>>> case is extreme and you won't encounter that much looping in these
>>>> functions.  So while it is theoretically slower than C and Fortran
>>>> looping, you can probably see that no one would care :)  Here is
>>>> another test, with Python 2 code:
>>>>
>>>> """
>>>> import cProfile as profile
>>>>
>>>> def loop_1e6():
>>>>     for i in xrange(int(1e6)):
>>>>         x = 1
>>>>
>>>> def loop_1e8():
>>>>     for i in xrange(int(1e8)):
>>>>         x = 1
>>>>
>>>> def sum_conv():
>>>>     for i in xrange(100000000):
>>>>         x = 2 + 2.
>>>>
>>>> def sum_normal():
>>>>     for i in xrange(100000000):
>>>>         x = 2. + 2.
>>>>
>>>> def test():
>>>>     loop_1e6()
>>>>     loop_1e8()
>>>>     sum_normal()
>>>>     sum_conv()
>>>>
>>>> profile.runctx('test()', globals(), locals())
>>>> """
>>>>
>>>> Running this on my system shows:
>>>>
>>>> """
>>>>          7 function calls in 6.707 seconds
>>>>
>>>>    Ordered by: standard name
>>>>
>>>>    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
>>>>         1    0.000    0.000    6.707    6.707 <string>:1(<module>)
>>>>         1    2.228    2.228    2.228    2.228 aaa.py:11(sum_conv)
>>>>         1    2.228    2.228    2.228    2.228 aaa.py:15(sum_normal)
>>>>         1    0.000    0.000    6.707    6.707 aaa.py:19(test)
>>>>         1    0.022    0.022    0.022    0.022 aaa.py:3(loop_1e6)
>>>>         1    2.228    2.228    2.228    2.228 aaa.py:7(loop_1e8)
>>>>         1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
>>>> """
>>>>
>>>> That should be self-explanatory.  The better optimisation targets are
>>>> the repeated maths operations and the maths operations that can be
>>>> shifted into the target function or the target function
>>>> initialisation.  Despite the numbers above, which prove my int to
>>>> float speed argument to be utter nonsense, it might still be good to
>>>> remove the int to float conversions, if only to match the other
>>>> functions.
>>>>
>>>> Regards,
>>>>
>>>> Edward
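A tiny sketch of that last point about shifting maths into the target
function initialisation.  The class, names, and formula here are made up
purely for illustration and are not a real dispersion model:

"""
from numpy import array, pi

class Target(object):
    def __init__(self, cpmg_frqs):
        # Store the dispersion point data.
        self.cpmg_frqs = array(cpmg_frqs)

        # Anything that does not depend on the optimised parameters can be
        # computed once here, rather than in every func() call below.
        self.two_pi_frqs = 2.0 * pi * self.cpmg_frqs

    def func(self, params):
        # The per-call work is now only the parameter-dependent maths.
        r20, kex = params
        return r20 + kex / self.two_pi_frqs

target = Target([50.0, 100.0, 200.0])
print(target.func((2.0, 1000.0)))
"""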
>>>>
>>>> On 5 May 2014 18:45, Troels Emtekær Linnet <[email protected]> wrote:
>>>>> The reason why I ask is that I am afraid that this for loop slows
>>>>> everything down.
>>>>>
>>>>> What do you think?
>>>>>
>>>>> 2014-05-05 18:41 GMT+02:00 Edward d'Auvergne <[email protected]>:
>>>>>> This is not Python specific though :)  As far as I know, C uses
>>>>>> pass-by-value for arguments, unless they are arrays or other funky
>>>>>> objects/functions/etc.  This is the same behaviour as Python.
>>>>>> Pass-by-reference and pass-by-value is something that needs to be
>>>>>> mastered in all languages, whether or not you have pointers to play
>>>>>> with.
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Edward
>>>>>>
>>>>>> On 5 May 2014 18:30, Troels Emtekær Linnet <[email protected]> wrote:
>>>>>>> This reminds me:
>>>>>>>
>>>>>>> http://combichem.blogspot.dk/2013/08/you-know-what-really-grinds-my-gears-in.html
>>>>>>>
>>>>>>> 2014-05-05 17:52 GMT+02:00 Edward d'Auvergne <[email protected]>:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> This is an important difference.  In the first case (back_calc[i] =
>>>>>>>> Minty[i]), what is happening is that you are copying the data into a
>>>>>>>> pre-existing structure.  In the second case (back_calc = Minty), only
>>>>>>>> the local back_calc name is rebound to a new object.  Therefore the
>>>>>>>> back_calc structure in the calling code will be modified in the first
>>>>>>>> case but not the second.  Here is some demo code:
>>>>>>>>
>>>>>>>> def mod1(x):
>>>>>>>>     x[0] = 1
>>>>>>>>
>>>>>>>> def mod2(x):
>>>>>>>>     x = [3, 2]
>>>>>>>>
>>>>>>>> x = [0, 2]
>>>>>>>> print(x)
>>>>>>>> mod1(x)
>>>>>>>> print(x)
>>>>>>>> mod2(x)
>>>>>>>> print(x)
>>>>>>>>
>>>>>>>> I don't know of a way around this.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>>
>>>>>>>> Edward
>>>>>>>>
>>>>>>>> On 5 May 2014 17:42, Troels Emtekær Linnet <[email protected]>
>>>>>>>> wrote:
>>>>>>>>> Hi Edward.
>>>>>>>>>
>>>>>>>>> In the library function of b14.py, I am looping over
>>>>>>>>> the dispersion points to put in the data:
>>>>>>>>>
>>>>>>>>>     for i in range(num_points):
>>>>>>>>>         # The full formula.
>>>>>>>>>         back_calc[i] = Minty[i]
>>>>>>>>>
>>>>>>>>> Why can I not just set:
>>>>>>>>>
>>>>>>>>>     back_calc = Minty
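One possible way around the rebinding problem in that last question, for
what it is worth: slice assignment copies the new values into the existing
array, so the calling code sees them without an explicit Python loop.  A
minimal sketch, reusing the back_calc/Minty names from the thread purely
for illustration:

"""
from numpy import arange, zeros

def fill_by_rebinding(back_calc, minty):
    # Rebinds the local name only - the caller's array is untouched.
    back_calc = minty

def fill_in_place(back_calc, minty):
    # Slice assignment copies element by element into the caller's array,
    # without an explicit Python loop over the dispersion points.
    back_calc[:] = minty

back_calc = zeros(3)
minty = arange(3.0)

fill_by_rebinding(back_calc, minty)
print(back_calc)    # [ 0.  0.  0.] - unchanged.

fill_in_place(back_calc, minty)
print(back_calc)    # [ 0.  1.  2.] - updated in place.
"""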

