I get the results below.

They show a 4x-5x speed-up.

That is quite nice!



-------
Checked on a MacBook Pro
2.4 GHz Intel Core i5
8 GB 1067 MHz DDR3 RAM.
Python distribution: Python 2.7.3 |EPD 7.3-2 (32-bit)|

Timing setup:
2 fields
20 dispersion points
1000 iterations of the function call

Timed for simulating 1 or 100 clustered spins.
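These timings come from a cProfile driver script; the sketch below only shows the shape of such a harness (the make_target()/chi2_like() dummies are invented stand-ins, not the real relax_disp target function setup):

-------------
# Sketch only: a toy target function stands in for the real
# relax_disp calc_CR72_chi2() data setup used for the timings below.
import cProfile
from numpy import float64, ones, sum

def make_target(num_spins):
    """Return a toy chi2-style function over num_spins x 2 fields x 20 points."""
    data = ones((num_spins, 2, 20), float64)
    def chi2_like(scale):
        back_calc = scale * data            # stand-in for the back-calculation
        return sum((back_calc - data)**2)   # chi-squared style sum
    return chi2_like

def timing(num_spins, iterations=1000):
    func = make_target(num_spins)
    for i in range(iterations):
        func(1.0)

cProfile.run('timing(1)')      # 1 spin
cProfile.run('timing(100)')    # 100 clustered spins
-------------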

svn ls "^/tags"

########
For tag 3.2.2
svn switch ^/tags/3.2.2
########

1 spin:
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     2000    0.168    0.000    0.198    0.000 cr72.py:100(r2eff_CR72)
     1000    0.040    0.000    0.280    0.000 relax_disp.py:456(calc_CR72_chi2)
     2000    0.028    0.000    0.039    0.000 chi2.py:32(chi2)

100 spins:
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   200000   16.810    0.000   19.912    0.000 cr72.py:100(r2eff_CR72)
     1000    4.185    0.004   28.518    0.029 relax_disp.py:456(calc_CR72_chi2)
   200000    3.018    0.000    4.144    0.000 chi2.py:32(chi2)


########
For tag 3.2.1
svn switch ^/tags/3.2.1
########

1 spin:
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     2000    0.696    0.000    0.697    0.000 cr72.py:98(r2eff_CR72)
     1000    0.038    0.000    0.781    0.001 relax_disp.py:456(calc_CR72_chi2)
     2000    0.031    0.000    0.043    0.000 chi2.py:32(chi2)

100 spins:
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   200000   75.880    0.000   76.078    0.000 cr72.py:98(r2eff_CR72)
     1000    4.201    0.004   85.519    0.086 relax_disp.py:456(calc_CR72_chi2)
   200000    3.513    0.000    4.940    0.000 chi2.py:32(chi2)
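The 4x-5x figure above is simply the ratio of the r2eff_CR72() tottime columns between the two tags:

1 spin:    0.696 / 0.168 ≈ 4.1x
100 spins: 75.880 / 16.810 ≈ 4.5x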



2014-06-05 11:36 GMT+02:00 Edward d'Auvergne <[email protected]>:

> Hi,
>
> The best place might be to create a special directory in the
> test_suite/shared_data/dispersion directories.  Or another option
> would be to create a devel_scripts/profiling/ directory and place it
> there.  The first option might be the best though as you could then
> save additional files there, such as the relax log files with the
> profile timings.  Or simply have everything on one page in the wiki -
> script and output.  What do you think is best?
>
> Regards,
>
> Edward
>
>
>
> On 5 June 2014 11:27, Troels Emtekær Linnet <[email protected]> wrote:
> > Hi Ed.
> >
> > I have worked on a rather long profiling script now.
> >
> > It creates the necessary data structures, and then calls the
> > relax_disp target function.
> >
> > Can you devise a "place" to put this script?
> >
> > Best
> > Troels
> >
> >
> >
> > 2014-06-05 11:13 GMT+02:00 Edward d'Auvergne <[email protected]>:
> >
> >> Hi Troels,
> >>
> >> This huge speed up you see also applies when you have multiple field
> >> strength data.  To understand how you can convert the long rank-1
> >> array you have in your g_* data structures into the multi-index rank-5
> >> back_calc array with dimensions {Ei, Si, Mi, Oi, Di}, see the numpy
> >> reshape() function:
> >>
> >> http://docs.scipy.org/doc/numpy/reference/generated/numpy.reshape.html
> >>
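A minimal numpy sketch of the reshape() round trip described above (the {Ei, Si, Mi, Oi, Di} sizes here are invented for illustration only):

-------------
from numpy import arange

# Hypothetical dimension sizes: 1 experiment, 100 spins, 2 fields,
# 1 offset, 20 dispersion points.
Ei, Si, Mi, Oi, Di = 1, 100, 2, 1, 20

# The long rank-1 array, as in the g_* data structures.
flat = arange(Ei * Si * Mi * Oi * Di, dtype=float)

# View it as the rank-5 back_calc-style structure.
back_calc = flat.reshape(Ei, Si, Mi, Oi, Di)

# ... pass the higher-dimensional data to lib.dispersion, fill in R2eff ...

# Flatten again (or reshape back) when repacking the results.
flat_again = back_calc.reshape(-1)
-------------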
> >> You can obtain this huge speed up if you convert the
> >> target_functions.relax_disp data structures to be similar to your g_*
> >> data structures, delete the looping in the func_*() target functions
> >> over the {Ei, Si, Mi, Oi, Di} dimensions (for the numeric models, this
> >> looping would need to be shifted into the lib.dispersion code to keep
> >> the API consistent), pass the new higher-dimensional data into the
> >> lib.dispersion modules, and finally use R2eff.reshape() to place the
> >> data back into the back_calc data structure.  This would again need to
> >> be in a new branch, and you should only do it if you wish to have huge
> >> speed ups for multi-experiment, clustered, multi-field, or
> >> multi-offset data.  The speed ups will also only be for the analytic
> >> models as the numeric models unfortunately do not have the necessary
> >> maths derived for calculating everything simultaneously in one linear
> >> algebra operation.
> >>
> >> Regards,
> >>
> >> Edward
> >>
> >>
> >>
> >> On 4 June 2014 17:11, Edward d'Auvergne <[email protected]> wrote:
> >> > Hi,
> >> >
> >> > The huge differences are because of the changes in the lib.dispersion
> >> > modules.  But wait!  The r2eff_CR72() receives the data for each
> >> > experiment, spin, and offset separately.  So this insane speed up is
> >> > not realised in the current target functions.  But the potential for
> >> > these speed ups is there thanks to your infrastructure work in the
> >> > 'disp_speed' branch.  I have mentioned this before:
> >> >
> >> > http://thread.gmane.org/gmane.science.nmr.relax.devel/5726
> >> >
> >> > Specifically, the follow-up at:
> >> >
> >> > http://thread.gmane.org/gmane.science.nmr.relax.devel/5726/focus=5806
> >> >
> >> > The idea mentioned in this post is exactly the speed up you see in
> >> > this test!  So if the idea is implemented in relax then, yes, you will
> >> > see this insane speed up in a clustered analysis.  Especially for
> >> > large clusters and a large number of offsets (for R1rho but also for
> >> > CPMG when off-resonance effects are implemented,
> >> > http://thread.gmane.org/gmane.science.nmr.relax.devel/5414/focus=5445).
> >> > But unfortunately, you currently do not.
> >> >
> >> > Regards,
> >> >
> >> > Edward
> >> >
> >> >
> >> >
> >> >
> >> > On 4 June 2014 16:45, Troels Emtekær Linnet <[email protected]>
> >> > wrote:
> >> >> Hi Edward.
> >> >>
> >> >> Ah yes.
> >> >> I overwrite the state file for each new global fitting, with the new
> >> >> pipe.
> >> >> So that is growing quite a lot.
> >> >> I will change that.
> >> >>
> >> >> I just checked my scripts.
> >> >> In both cases, I would do one grid search for the first run, and then
> >> >> the
> >> >> recurring analysis would copy the parameters from the first pipe.
> >> >>
> >> >> And the speed-up is between these analyses.
> >> >>
> >> >> Hm.
> >> >> I have to eliminate the grid search as a variable!
> >> >>
> >> >> I am trying to devise a profiling script which I can put in the base
> >> >> folder of older versions of relax.
> >> >> For example relax 3.1.6, which I also have.
> >> >>
> >> >> It looks like this:
> >> >> -------------
> >> >> # Python module imports.
> >> >> from numpy import array, float64, pi, zeros
> >> >> import sys
> >> >> import os
> >> >> import cProfile
> >> >>
> >> >> # relax module imports.
> >> >> from lib.dispersion.cr72 import r2eff_CR72
> >> >>
> >> >> # Default parameter values.
> >> >> r20a = 2.0
> >> >> r20b = 4.0
> >> >> pA = 0.95
> >> >> dw = 2.0
> >> >> kex = 1000.0
> >> >>
> >> >> relax_times = 0.04
> >> >> ncyc_list = [2, 4, 8, 10, 20, 40, 500]
> >> >>
> >> >> # Required data structures.
> >> >> s_ncyc = array(ncyc_list)
> >> >> s_num_points = len(s_ncyc)
> >> >> s_cpmg_frqs = s_ncyc / relax_times
> >> >> s_R2eff = zeros(s_num_points, float64)
> >> >>
> >> >> g_ncyc = array(ncyc_list*100)
> >> >> g_num_points = len(g_ncyc)
> >> >> g_cpmg_frqs = g_ncyc / relax_times
> >> >> g_R2eff = zeros(g_num_points, float64)
> >> >>
> >> >> # The spin Larmor frequencies.
> >> >> sfrq = 200. * 1E6
> >> >>
> >> >> # Calculate pB.
> >> >> pB = 1.0 - pA
> >> >>
> >> >> # Exchange rates.
> >> >> k_BA = pA * kex
> >> >> k_AB = pB * kex
> >> >>
> >> >> # Calculate spin Larmor frequencies in 2pi.
> >> >> frqs = sfrq * 2 * pi
> >> >>
> >> >> # Convert dw from ppm to rad/s.
> >> >> dw_frq = dw * frqs / 1.e6
> >> >>
> >> >>
> >> >> def single():
> >> >>     for i in xrange(0,10000):
> >> >>         r2eff_CR72(r20a=r20a, r20b=r20b, pA=pA, dw=dw_frq, kex=kex,
> >> >>             cpmg_frqs=s_cpmg_frqs, back_calc=s_R2eff, num_points=s_num_points)
> >> >>
> >> >> cProfile.run('single()')
> >> >>
> >> >> def cluster():
> >> >>     for i in xrange(0,10000):
> >> >>         r2eff_CR72(r20a=r20a, r20b=r20b, pA=pA, dw=dw_frq, kex=kex,
> >> >>             cpmg_frqs=g_cpmg_frqs, back_calc=g_R2eff, num_points=g_num_points)
> >> >>
> >> >> cProfile.run('cluster()')
> >> >> ------------------------
> >> >>
> >> >> For 3.1.6
> >> >> [tlinnet@tomat relax-3.1.6]$ python profile_lib_dispersion_cr72.py
> >> >>          20003 function calls in 0.793 CPU seconds
> >> >>
> >> >>    Ordered by: standard name
> >> >>
> >> >>    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
> >> >>         1    0.000    0.000    0.793    0.793 <string>:1(<module>)
> >> >>     10000    0.778    0.000    0.783    0.000 cr72.py:98(r2eff_CR72)
> >> >>         1    0.010    0.010    0.793    0.793 profile_lib_dispersion_cr72.py:69(single)
> >> >>         1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
> >> >>     10000    0.005    0.000    0.005    0.000 {range}
> >> >>
> >> >>
> >> >>          20003 function calls in 61.901 CPU seconds
> >> >>
> >> >>    Ordered by: standard name
> >> >>
> >> >>    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
> >> >>         1    0.000    0.000   61.901   61.901 <string>:1(<module>)
> >> >>     10000   61.853    0.006   61.887    0.006 cr72.py:98(r2eff_CR72)
> >> >>         1    0.013    0.013   61.901   61.901 profile_lib_dispersion_cr72.py:75(cluster)
> >> >>         1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
> >> >>     10000    0.035    0.000    0.035    0.000 {range}
> >> >>
> >> >>
> >> >> For trunk
> >> >>
> >> >> [tlinnet@tomat relax_trunk]$ python profile_lib_dispersion_cr72.py
> >> >>          80003 function calls in 0.514 CPU seconds
> >> >>
> >> >>    Ordered by: standard name
> >> >>
> >> >>    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
> >> >>         1    0.000    0.000    0.514    0.514 <string>:1(<module>)
> >> >>     10000    0.390    0.000    0.503    0.000 cr72.py:100(r2eff_CR72)
> >> >>     10000    0.008    0.000    0.040    0.000 fromnumeric.py:1314(sum)
> >> >>     10000    0.007    0.000    0.037    0.000 fromnumeric.py:1708(amax)
> >> >>     10000    0.006    0.000    0.037    0.000 fromnumeric.py:1769(amin)
> >> >>         1    0.011    0.011    0.514    0.514 profile_lib_dispersion_cr72.py:69(single)
> >> >>     10000    0.007    0.000    0.007    0.000 {isinstance}
> >> >>         1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
> >> >>     10000    0.030    0.000    0.030    0.000 {method 'max' of 'numpy.ndarray' objects}
> >> >>     10000    0.030    0.000    0.030    0.000 {method 'min' of 'numpy.ndarray' objects}
> >> >>     10000    0.025    0.000    0.025    0.000 {method 'sum' of 'numpy.ndarray' objects}
> >> >>
> >> >>
> >> >>          80003 function calls in 1.209 CPU seconds
> >> >>
> >> >>    Ordered by: standard name
> >> >>
> >> >>    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
> >> >>         1    0.000    0.000    1.209    1.209 <string>:1(<module>)
> >> >>     10000    1.042    0.000    1.196    0.000 cr72.py:100(r2eff_CR72)
> >> >>     10000    0.009    0.000    0.049    0.000 fromnumeric.py:1314(sum)
> >> >>     10000    0.007    0.000    0.052    0.000 fromnumeric.py:1708(amax)
> >> >>     10000    0.007    0.000    0.052    0.000 fromnumeric.py:1769(amin)
> >> >>         1    0.014    0.014    1.209    1.209 profile_lib_dispersion_cr72.py:75(cluster)
> >> >>     10000    0.007    0.000    0.007    0.000 {isinstance}
> >> >>         1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
> >> >>     10000    0.045    0.000    0.045    0.000 {method 'max' of 'numpy.ndarray' objects}
> >> >>     10000    0.045    0.000    0.045    0.000 {method 'min' of 'numpy.ndarray' objects}
> >> >>     10000    0.033    0.000    0.033    0.000 {method 'sum' of 'numpy.ndarray' objects}
> >> >> ---------------
> >> >>
> >> >> For 10000 iterations
> >> >>
> >> >> 3.1.6
> >> >> Single: 0.778
> >> >> 100 cluster: 61.853
> >> >>
> >> >> trunk
> >> >> Single: 0.390
> >> >> 100 cluster: 1.042
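(For reference, the ratios from these 10000-iteration numbers: 0.778 / 0.390 ≈ 2x for a single spin, but 61.853 / 1.042 ≈ 59x for the 100-spin cluster.)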
> >> >>
> >> >> ------
> >> >>
> >> >> For 1000000 iterations
> >> >> 3.1.6
> >> >> Single: 83.365
> >> >> 100 cluster:  ???? Still running....
> >> >>
> >> >> trunk
> >> >> Single: 40.825
> >> >> 100 cluster: 106.339
> >> >>
> >> >> Am I doing something wrong here?
> >> >>
> >> >> That is such a massive speed-up for clustered analysis that I simply
> >> >> can't believe it!
> >> >>
> >> >> Best
> >> >> Troels
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >> 2014-06-04 15:04 GMT+02:00 Edward d'Auvergne <[email protected]>:
> >> >>
> >> >>> Hi,
> >> >>>
> >> >>> Such a huge speed up cannot be from the changes of the 'disp_speed'
> >> >>> branch alone.  I would expect from that branch a maximum drop from 30
> >> >>> min to 15 min.  Therefore it must be your grid search changes.  When
> >> >>> changing, simplifying, or eliminating the grid search, you have to be
> >> >>> very careful about the introduced bias.  This bias is unavoidable.  It
> >> >>> needs to be mentioned in the methods of any paper.  The key is to be
> >> >>> happy that the bias you have introduced will not negatively impact
> >> >>> your results.  For example, you should believe that the grid search
> >> >>> replacement is close enough to the true solution that the
> >> >>> optimisation will be able to reach the global minimum.  You also have
> >> >>> to convince the people reading your paper that the introduced bias is
> >> >>> reasonable.
> >> >>>
> >> >>> As for a script to show the speed changes, you could have a look at
> >> >>> maybe the
> >> >>> test_suite/shared_data/dispersion/Hansen/relax_results/relax_disp.py
> >> >>> file.  This performs a full analysis with a large range of dispersion
> >> >>> models on the truncated data set from Flemming Hansen.  Or
> >> >>> test_suite/shared_data/dispersion/Hansen/relax_disp.py which uses all
> >> >>> of Flemming's data.  These could be run before and after the merger of
> >> >>> the 'disp_speed' branch, maybe with different models and the profile
> >> >>> flag turned on.  You could then create a text file in the
> >> >>> test_suite/shared_data/dispersion/Hansen/relax_results/ directory
> >> >>> called something like 'relax_timings' to permanently record the speed
> >> >>> ups.  This file can be used in the future for documenting any other
> >> >>> speed ups as well.
> >> >>>
> >> >>> Regards,
> >> >>>
> >> >>> Edward
> >> >>>
> >> >>>
> >> >>>
> >> >>>
> >> >>> On 4 June 2014 14:37, Troels Emtekær Linnet <[email protected]>
> >> >>> wrote:
> >> >>> > Looking at my old data, I can see that writing out the data between
> >> >>> > each global fit analysis previously took around 30 min.
> >> >>> >
> >> >>> > They now take 2-6 mins.
> >> >>> >
> >> >>> > I almost can't believe that speed up!
> >> >>> >
> >> >>> > Could we devise a devel-script, which we could use to simulate the
> >> >>> > change?
> >> >>> >
> >> >>> > Best
> >> >>> > Troels
> >> >>> >
> >> >>> >
> >> >>> >
> >> >>> > 2014-06-04 14:24 GMT+02:00 Troels Emtekær Linnet
> >> >>> > <[email protected]>:
> >> >>> >
> >> >>> >> Hi Edward.
> >> >>> >>
> >> >>> >> After the changes to the lib/dispersion/model.py files, I see a
> >> >>> >> massive speed-up of the computations.
> >> >>> >>
> >> >>> >> Over 2 days, I performed over 600 global fittings for a 68-residue
> >> >>> >> protein, where all residues were clustered.  I just did it with 1
> >> >>> >> CPU.
> >> >>> >>
> >> >>> >> This is really really impressive.
> >> >>> >>
> >> >>> >> I did, though, also alter how the grid search was performed,
> >> >>> >> pre-setting some of the values to known values referred to in a
> >> >>> >> paper.  So I can't really say what has cut the time down.
> >> >>> >>
> >> >>> >> But looking at the calculations running, the minimisation runs
> >> >>> >> quite fast.
> >> >>> >>
> >> >>> >> So, how does relax do the collecting of data for global fitting?
> >> >>> >>
> >> >>> >> Does it collect all the R2eff values for the clustered spins and
> >> >>> >> send them to the target function together with the array of
> >> >>> >> parameters to vary?
> >> >>> >>
> >> >>> >> Or does it calculate per spin, and share the common parameters?
> >> >>> >>
> >> >>> >> My current bottleneck actually seems to be the saving of the state
> >> >>> >> file between each iteration of the global analysis.
> >> >>> >>
> >> >>> >> Best
> >> >>> >> Troels
> >> >>> >>
> >> >>
> >> >>
> >
> >
>