Hi Edward.

So, I have tried to directly implement the infrastructure data format for
NO * NM * NS * NE.
The result is a 4.1x-4.5x speed-up.

I think that would make a very nice message for the release list.

It is obvious that the largest speed-up will be gained by getting rid of
the NS loop.

Could one just re-shape the numpy arrays in the target function?
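
Something like this minimal sketch is what I have in mind (the names and
counts here are just illustrative):

-------------
# Python module imports.
from numpy import arange, float64

# Illustrative counts: experiments, spins, fields, offsets, dispersion
# points.
NE, NS, NM, NO, ND = 1, 100, 2, 1, 20

# A flat rank-1 array of all R2eff values.
R2eff_flat = arange(NE*NS*NM*NO*ND, dtype=float64)

# Reshape into the rank-5 {Ei, Si, Mi, Oi, Di} back_calc structure.
back_calc = R2eff_flat.reshape(NE, NS, NM, NO, ND)
-------------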

Best
Troels


2014-06-05 14:36 GMT+02:00 Edward d'Auvergne <[email protected]>:

> That is your infrastructure at work :)  As I mentioned previously, we
> are, however, not yet tapping into the full speed possible that you see
> in this test
> (http://thread.gmane.org/gmane.science.nmr.relax.devel/6022/focus=6029),
> especially for data with many spins in a cluster, many magnetic field
> strengths, or many offsets.  Let's say that we have the following
> counts:
>
>  - NE, the number of different dispersion experiments,
>  - NS, the number of spins in one cluster,
>  - NM, the number of magnetic field strengths,
>  - NO, the number of offsets,
>  - ND, the number of dispersion points,
>
> and that these counts are the same for all data combinations.  And
> let's say that t_diff is the time difference between Python and numpy
> for the calculation of one R2eff value.  Then compared to the 3.2.1
> release, the total speed up possible with your infrastructure is
> t_diff * ND * NO * NM * NS * NE.  With the 3.2.2 release we have the
> t_diff * ND speed up, but not the rest.  If your NO * NM * NS * NE
> value is not very high, then you will not see much of a speed up
> compared to the ultimate speed up of t_diff * ND * NO * NM * NS * NE.
> But if NO * NM * NS * NE is high, then the implementation of this
> speed up in the relax target functions might be worth considering (as
> described at
> http://thread.gmane.org/gmane.science.nmr.relax.devel/5726/focus=5806).
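>
> For example, with the counts from your test below (ND = 20, NM = 2,
> NS = 100, and, hypothetically, NO = NE = 1), the full speed up scales
> as t_diff * 20 * 1 * 2 * 100 * 1 = t_diff * 4000, whereas the current
> 3.2.2 release only saves the t_diff * 20 part per call.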
>
> Regards,
>
> Edward
>
>
>
>
> On 5 June 2014 14:18, Troels Emtekær Linnet <[email protected]> wrote:
> > I get the results below.
> >
> > They show a 4x-5x speed-up.
> >
> > That is quite nice!
> >
> >
> >
> > -------
> > Checked on MacBook Pro
> > 2.4 GHz Intel Core i5
> > 8 GB 1067 MHz DDR3 RAM.
> > Python Distribution -- Python 2.7.3 |EPD 7.3-2 (32-bit)|
> >
> > Timing for:
> > 2 fields
> > 20 dispersion points
> > iterations of function call: 1000
> >
> > Timed for simulating 1 or 100 clustered spins.
> >
> > svn ls "^/tags"
> >
> > ########
> > For tag 3.2.2
> > svn switch ^/tags/3.2.2
> > ########
> >
> > 1 spin:
> >    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
> >      2000    0.168    0.000    0.198    0.000 cr72.py:100(r2eff_CR72)
> >      1000    0.040    0.000    0.280    0.000 relax_disp.py:456(calc_CR72_chi2)
> >      2000    0.028    0.000    0.039    0.000 chi2.py:32(chi2)
> >
> > 100 spins:
> >    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
> >    200000   16.810    0.000   19.912    0.000 cr72.py:100(r2eff_CR72)
> >      1000    4.185    0.004   28.518    0.029 relax_disp.py:456(calc_CR72_chi2)
> >    200000    3.018    0.000    4.144    0.000 chi2.py:32(chi2)
> >
> >
> > ########
> > For tag 3.2.1
> > svn switch ^/tags/3.2.1
> > ########
> >
> > 1 spin:
> >    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
> >      2000    0.696    0.000    0.697    0.000 cr72.py:98(r2eff_CR72)
> >      1000    0.038    0.000    0.781    0.001 relax_disp.py:456(calc_CR72_chi2)
> >      2000    0.031    0.000    0.043    0.000 chi2.py:32(chi2)
> >
> > 100 spins:
> >    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
> >    200000   75.880    0.000   76.078    0.000 cr72.py:98(r2eff_CR72)
> >      1000    4.201    0.004   85.519    0.086 relax_disp.py:456(calc_CR72_chi2)
> >    200000    3.513    0.000    4.940    0.000 chi2.py:32(chi2)
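> >
> > (The 4x-5x figure comes from the r2eff_CR72 tottime ratios:
> > 0.696/0.168 = 4.1x for a single spin, and 75.880/16.810 = 4.5x for
> > 100 clustered spins.)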
> >
> >
> >
> > 2014-06-05 11:36 GMT+02:00 Edward d'Auvergne <[email protected]>:
> >
> >> Hi,
> >>
> >> The best place might be a special directory under the
> >> test_suite/shared_data/dispersion directory.  Or another option
> >> would be to create a devel_scripts/profiling/ directory and place it
> >> there.  The first option might be the best though, as you could then
> >> save additional files there, such as the relax log files with the
> >> profile timings.  Or simply have everything on one page in the wiki -
> >> script and output.  What do you think is best?
> >>
> >> Regards,
> >>
> >> Edward
> >>
> >>
> >>
> >> On 5 June 2014 11:27, Troels Emtekær Linnet <[email protected]> wrote:
> >> > Hi Ed.
> >> >
> >> > I have worked on a rather long profiling script now.
> >> >
> >> > It creates the necessary data structures, and then calls the
> >> > relax_disp target function.
> >> >
> >> > Can you suggest a place to put this script?
> >> >
> >> > Best
> >> > Troels
> >> >
> >> >
> >> >
> >> > 2014-06-05 11:13 GMT+02:00 Edward d'Auvergne <[email protected]>:
> >> >
> >> >> Hi Troels,
> >> >>
> >> >> This huge speed up you see also applies when you have multiple field
> >> >> strength data.  To understand how you can convert the long rank-1
> >> >> array you have in your g_* data structures into the multi-index
> >> >> rank-5 back_calc array with dimensions {Ei, Si, Mi, Oi, Di}, see the
> >> >> numpy reshape() function:
> >> >>
> >> >> http://docs.scipy.org/doc/numpy/reference/generated/numpy.reshape.html
> >> >>
> >> >> You can obtain this huge speed up if you convert the
> >> >> target_functions.relax_disp data structures to be similar to your g_*
> >> >> data structures, delete the looping in the func_*() target functions
> >> >> over the {Ei, Si, Mi, Oi, Di} dimensions (for the numeric models, this
> >> >> looping would need to be shifted into the lib.dispersion code to keep
> >> >> the API consistent), pass the new higher-dimensional data into the
> >> >> lib.dispersion modules, and finally use R2eff.reshape() to place the
> >> >> data back into the back_calc data structure.  This would again need to
> >> >> be in a new branch, and you should only do it if you wish to have huge
> >> >> speed ups for multi-experiment, clustered, multi-field, or
> >> >> multi-offset data.  The speed ups will also only be for the analytic
> >> >> models as the numeric models unfortunately do not have the necessary
> >> >> maths derived for calculating everything simultaneously in one linear
> >> >> algebra operation.
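> >> >>
> >> >> As a minimal sketch of the idea (the function and dimension names
> >> >> here are illustrative only, not the real relax code):
> >> >>
> >> >> -------------
> >> >> # Python module imports.
> >> >> from numpy import float64, zeros
> >> >>
> >> >> # Illustrative dimensions: spins in one cluster, dispersion points.
> >> >> NS, ND = 100, 20
> >> >>
> >> >> def calc(cpmg_frqs, back_calc):
> >> >>     # Stand-in for a lib.dispersion function - a single numpy
> >> >>     # operation over however many points it is given.
> >> >>     back_calc[:] = 2.0 * cpmg_frqs
> >> >>
> >> >> # Current scheme: one lib.dispersion call per spin.
> >> >> def func_old(cpmg_frqs, back_calc):
> >> >>     for si in range(NS):
> >> >>         calc(cpmg_frqs[si], back_calc[si])
> >> >>
> >> >> # Proposed scheme: one call over the whole rank-2 {Si, Di} block,
> >> >> # eliminating the per-spin Python loop overhead.
> >> >> def func_new(cpmg_frqs, back_calc):
> >> >>     calc(cpmg_frqs, back_calc)
> >> >>
> >> >> cpmg_frqs = zeros((NS, ND), float64)
> >> >> back_calc = zeros((NS, ND), float64)
> >> >> func_new(cpmg_frqs, back_calc)
> >> >> -------------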
> >> >>
> >> >> Regards,
> >> >>
> >> >> Edward
> >> >>
> >> >>
> >> >>
> >> >> On 4 June 2014 17:11, Edward d'Auvergne <[email protected]> wrote:
> >> >> > Hi,
> >> >> >
> >> >> > The huge differences are because of the changes in the
> >> >> > lib.dispersion modules.  But wait!  The r2eff_CR72() function
> >> >> > receives the data for each experiment, spin, and offset separately.
> >> >> > So this insane speed up is not realised in the current target
> >> >> > functions.  But the potential for these speed ups is there thanks
> >> >> > to your infrastructure work in the 'disp_speed' branch.  I have
> >> >> > mentioned this before:
> >> >> >
> >> >> > http://thread.gmane.org/gmane.science.nmr.relax.devel/5726
> >> >> >
> >> >> > Specifically the follow-up at:
> >> >> >
> >> >> > http://thread.gmane.org/gmane.science.nmr.relax.devel/5726/focus=5806
> >> >> >
> >> >> > The idea mentioned in this post is exactly the speed up you see in
> >> >> > this test!  So if the idea is implemented in relax then, yes, you
> >> >> > will see this insane speed up in a clustered analysis, especially
> >> >> > for large clusters and a large number of offsets (for R1rho, but
> >> >> > also for CPMG when off-resonance effects are implemented,
> >> >> > http://thread.gmane.org/gmane.science.nmr.relax.devel/5414/focus=5445).
> >> >> > But unfortunately, currently you do not.
> >> >> >
> >> >> > Regards,
> >> >> >
> >> >> > Edward
> >> >> >
> >> >> >
> >> >> >
> >> >> >
> >> >> > On 4 June 2014 16:45, Troels Emtekær Linnet <[email protected]> wrote:
> >> >> >> Hi Edward.
> >> >> >>
> >> >> >> Ah yes.
> >> >> >> I overwrite the state file for each new global fitting, with the
> >> >> >> new pipe.
> >> >> >> So that file is growing quite a lot.
> >> >> >> I will change that.
> >> >> >>
> >> >> >> I just checked my scripts.
> >> >> >> In both cases, I would do one grid search for the first run, and
> >> >> >> then the recurring analysis would copy the parameters from the
> >> >> >> first pipe.
> >> >> >>
> >> >> >> And the speed-up is between these analyses.
> >> >> >>
> >> >> >> Hm.
> >> >> >> I have to eliminate that variable, the grid search!
> >> >> >>
> >> >> >> I am trying to devise a profile script which I can put in the
> >> >> >> base folder of older versions of relax.
> >> >> >> For example relax 3.1.6, which I also have.
> >> >> >>
> >> >> >> It looks like this:
> >> >> >> -------------
> >> >> >> # Python module imports.
> >> >> >> from numpy import array, float64, pi, zeros
> >> >> >> import cProfile
> >> >> >>
> >> >> >> # relax module imports.
> >> >> >> from lib.dispersion.cr72 import r2eff_CR72
> >> >> >>
> >> >> >> # Default parameter values.
> >> >> >> r20a = 2.0
> >> >> >> r20b = 4.0
> >> >> >> pA = 0.95
> >> >> >> dw = 2.0
> >> >> >> kex = 1000.0
> >> >> >>
> >> >> >> # The fixed relaxation time period (s) and the numbers of CPMG cycles.
> >> >> >> relax_times = 0.04
> >> >> >> ncyc_list = [2, 4, 8, 10, 20, 40, 500]
> >> >> >>
> >> >> >> # Required data structures for a single spin (s_ prefix).
> >> >> >> s_ncyc = array(ncyc_list)
> >> >> >> s_num_points = len(s_ncyc)
> >> >> >> s_cpmg_frqs = s_ncyc / relax_times
> >> >> >> s_R2eff = zeros(s_num_points, float64)
> >> >> >>
> >> >> >> # Data structures simulating 100 clustered spins (g_ prefix).
> >> >> >> g_ncyc = array(ncyc_list*100)
> >> >> >> g_num_points = len(g_ncyc)
> >> >> >> g_cpmg_frqs = g_ncyc / relax_times
> >> >> >> g_R2eff = zeros(g_num_points, float64)
> >> >> >>
> >> >> >> # The spin Larmor frequencies.
> >> >> >> sfrq = 200. * 1E6
> >> >> >>
> >> >> >> # Calculate pB.
> >> >> >> pB = 1.0 - pA
> >> >> >>
> >> >> >> # Exchange rates.
> >> >> >> k_BA = pA * kex
> >> >> >> k_AB = pB * kex
> >> >> >>
> >> >> >> # Calculate spin Larmor frequencies in 2pi.
> >> >> >> frqs = sfrq * 2 * pi
> >> >> >>
> >> >> >> # Convert dw from ppm to rad/s.
> >> >> >> dw_frq = dw * frqs / 1.e6
> >> >> >>
> >> >> >>
> >> >> >> def single():
> >> >> >>     for i in xrange(0, 10000):
> >> >> >>         r2eff_CR72(r20a=r20a, r20b=r20b, pA=pA, dw=dw_frq, kex=kex,
> >> >> >>             cpmg_frqs=s_cpmg_frqs, back_calc=s_R2eff, num_points=s_num_points)
> >> >> >>
> >> >> >> cProfile.run('single()')
> >> >> >>
> >> >> >> def cluster():
> >> >> >>     for i in xrange(0, 10000):
> >> >> >>         r2eff_CR72(r20a=r20a, r20b=r20b, pA=pA, dw=dw_frq, kex=kex,
> >> >> >>             cpmg_frqs=g_cpmg_frqs, back_calc=g_R2eff, num_points=g_num_points)
> >> >> >>
> >> >> >> cProfile.run('cluster()')
> >> >> >> ------------------------
> >> >> >>
> >> >> >> For 3.1.6
> >> >> >> [tlinnet@tomat relax-3.1.6]$ python profile_lib_dispersion_cr72.py
> >> >> >>          20003 function calls in 0.793 CPU seconds
> >> >> >>
> >> >> >>    Ordered by: standard name
> >> >> >>
> >> >> >>    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
> >> >> >>         1    0.000    0.000    0.793    0.793 <string>:1(<module>)
> >> >> >>     10000    0.778    0.000    0.783    0.000 cr72.py:98(r2eff_CR72)
> >> >> >>         1    0.010    0.010    0.793    0.793 profile_lib_dispersion_cr72.py:69(single)
> >> >> >>         1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
> >> >> >>     10000    0.005    0.000    0.005    0.000 {range}
> >> >> >>
> >> >> >>
> >> >> >>          20003 function calls in 61.901 CPU seconds
> >> >> >>
> >> >> >>    Ordered by: standard name
> >> >> >>
> >> >> >>    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
> >> >> >>         1    0.000    0.000   61.901   61.901 <string>:1(<module>)
> >> >> >>     10000   61.853    0.006   61.887    0.006 cr72.py:98(r2eff_CR72)
> >> >> >>         1    0.013    0.013   61.901   61.901 profile_lib_dispersion_cr72.py:75(cluster)
> >> >> >>         1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
> >> >> >>     10000    0.035    0.000    0.035    0.000 {range}
> >> >> >>
> >> >> >>
> >> >> >> For trunk
> >> >> >>
> >> >> >> [tlinnet@tomat relax_trunk]$ python profile_lib_dispersion_cr72.py
> >> >> >>          80003 function calls in 0.514 CPU seconds
> >> >> >>
> >> >> >>    Ordered by: standard name
> >> >> >>
> >> >> >>    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
> >> >> >>         1    0.000    0.000    0.514    0.514 <string>:1(<module>)
> >> >> >>     10000    0.390    0.000    0.503    0.000 cr72.py:100(r2eff_CR72)
> >> >> >>     10000    0.008    0.000    0.040    0.000 fromnumeric.py:1314(sum)
> >> >> >>     10000    0.007    0.000    0.037    0.000 fromnumeric.py:1708(amax)
> >> >> >>     10000    0.006    0.000    0.037    0.000 fromnumeric.py:1769(amin)
> >> >> >>         1    0.011    0.011    0.514    0.514 profile_lib_dispersion_cr72.py:69(single)
> >> >> >>     10000    0.007    0.000    0.007    0.000 {isinstance}
> >> >> >>         1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
> >> >> >>     10000    0.030    0.000    0.030    0.000 {method 'max' of 'numpy.ndarray' objects}
> >> >> >>     10000    0.030    0.000    0.030    0.000 {method 'min' of 'numpy.ndarray' objects}
> >> >> >>     10000    0.025    0.000    0.025    0.000 {method 'sum' of 'numpy.ndarray' objects}
> >> >> >>
> >> >> >>
> >> >> >>          80003 function calls in 1.209 CPU seconds
> >> >> >>
> >> >> >>    Ordered by: standard name
> >> >> >>
> >> >> >>    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
> >> >> >>         1    0.000    0.000    1.209    1.209 <string>:1(<module>)
> >> >> >>     10000    1.042    0.000    1.196    0.000 cr72.py:100(r2eff_CR72)
> >> >> >>     10000    0.009    0.000    0.049    0.000 fromnumeric.py:1314(sum)
> >> >> >>     10000    0.007    0.000    0.052    0.000 fromnumeric.py:1708(amax)
> >> >> >>     10000    0.007    0.000    0.052    0.000 fromnumeric.py:1769(amin)
> >> >> >>         1    0.014    0.014    1.209    1.209 profile_lib_dispersion_cr72.py:75(cluster)
> >> >> >>     10000    0.007    0.000    0.007    0.000 {isinstance}
> >> >> >>         1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
> >> >> >>     10000    0.045    0.000    0.045    0.000 {method 'max' of 'numpy.ndarray' objects}
> >> >> >>     10000    0.045    0.000    0.045    0.000 {method 'min' of 'numpy.ndarray' objects}
> >> >> >>     10000    0.033    0.000    0.033    0.000 {method 'sum' of 'numpy.ndarray' objects}
> >> >> >> ---------------
> >> >> >>
> >> >> >> For 10000 iterations
> >> >> >>
> >> >> >> 3.1.6
> >> >> >> Single: 0.778
> >> >> >> 100 cluster: 61.853
> >> >> >>
> >> >> >> trunk
> >> >> >> Single: 0.390
> >> >> >> 100 cluster: 1.042
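> >> >> >>
> >> >> >> (So for the clustered case that is 61.853 s down to 1.042 s per
> >> >> >> 10000 iterations, i.e. roughly a 59x speed-up.)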
> >> >> >>
> >> >> >> ------
> >> >> >>
> >> >> >> For 1000000 iterations
> >> >> >> 3.1.6
> >> >> >> Single: 83.365
> >> >> >> 100 cluster:  ???? Still running....
> >> >> >>
> >> >> >> trunk
> >> >> >> Single: 40.825
> >> >> >> 100 cluster: 106.339
> >> >> >>
> >> >> >> Am I doing something wrong here?
> >> >> >>
> >> >> >> That is such a massive speed up for clustered analysis that I
> >> >> >> simply can't believe it!
> >> >> >>
> >> >> >> Best
> >> >> >> Troels
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> 2014-06-04 15:04 GMT+02:00 Edward d'Auvergne <[email protected]>:
> >> >> >>
> >> >> >>> Hi,
> >> >> >>>
> >> >> >>> Such a huge speed up cannot be from the changes of the
> >> >> >>> 'disp_speed' branch alone.  I would expect from that branch a
> >> >> >>> maximum drop from 30 min to 15 min.  Therefore it must be your
> >> >> >>> grid search changes.  When changing, simplifying, or eliminating
> >> >> >>> the grid search, you have to be very careful about the introduced
> >> >> >>> bias.  This bias is unavoidable.  It needs to be mentioned in the
> >> >> >>> methods of any paper.  The key is to be confident that the bias
> >> >> >>> you have introduced will not negatively impact your results - for
> >> >> >>> example, that the grid search replacement is close enough to the
> >> >> >>> true solution that the optimisation will be able to reach the
> >> >> >>> global minimum.  You also have to convince the people reading
> >> >> >>> your paper that the introduced bias is reasonable.
> >> >> >>>
> >> >> >>> As for a script to show the speed changes, you could have a look
> >> >> >>> at maybe the
> >> >> >>> test_suite/shared_data/dispersion/Hansen/relax_results/relax_disp.py
> >> >> >>> file.  This performs a full analysis with a large range of
> >> >> >>> dispersion models on the truncated data set from Flemming Hansen.
> >> >> >>> Or test_suite/shared_data/dispersion/Hansen/relax_disp.py, which
> >> >> >>> uses all of Flemming's data.  These could be run before and after
> >> >> >>> the merger of the 'disp_speed' branch, maybe with different
> >> >> >>> models and the profile flag turned on.  You could then create a
> >> >> >>> text file in the
> >> >> >>> test_suite/shared_data/dispersion/Hansen/relax_results/ directory
> >> >> >>> called something like 'relax_timings' to permanently record the
> >> >> >>> speed ups.  This file can be used in the future for documenting
> >> >> >>> any other speed ups as well.
> >> >> >>>
> >> >> >>> Regards,
> >> >> >>>
> >> >> >>> Edward
> >> >> >>>
> >> >> >>>
> >> >> >>>
> >> >> >>>
> >> >> >>> On 4 June 2014 14:37, Troels Emtekær Linnet <[email protected]> wrote:
> >> >> >>> > Looking at my old data, I can see that writing out the data
> >> >> >>> > between each global fit analysis previously took around 30 min.
> >> >> >>> >
> >> >> >>> > They now take 2-6 min.
> >> >> >>> >
> >> >> >>> > I almost can't believe that speed up!
> >> >> >>> >
> >> >> >>> > Could we devise a devel script which we could use to simulate
> >> >> >>> > the change?
> >> >> >>> >
> >> >> >>> > Best
> >> >> >>> > Troels
> >> >> >>> >
> >> >> >>> >
> >> >> >>> >
> >> >> >>> > 2014-06-04 14:24 GMT+02:00 Troels Emtekær Linnet <[email protected]>:
> >> >> >>> >
> >> >> >>> >> Hi Edward.
> >> >> >>> >>
> >> >> >>> >> After the changes to the lib/dispersion model files, I see a
> >> >> >>> >> massive speed-up of the computations.
> >> >> >>> >>
> >> >> >>> >> During 2 days, I performed over 600 global fittings for a 68
> >> >> >>> >> residue protein, where all residues were clustered.  I just
> >> >> >>> >> did it with 1 CPU.
> >> >> >>> >>
> >> >> >>> >> This is really really impressive.
> >> >> >>> >>
> >> >> >>> >> I did, though, also alter how the grid search was performed,
> >> >> >>> >> pre-setting some of the parameters to known values referred
> >> >> >>> >> to in a paper.  So I can't really say what has cut the time
> >> >> >>> >> down.
> >> >> >>> >>
> >> >> >>> >> But looking at the calculations running, the minimisation
> >> >> >>> >> runs quite fast.
> >> >> >>> >>
> >> >> >>> >> So, how does relax collect the data for global fitting?
> >> >> >>> >>
> >> >> >>> >> Does it collect all the R2eff values for the clustered spins
> >> >> >>> >> and send them to the target function together with the array
> >> >> >>> >> of parameters to vary?
> >> >> >>> >>
> >> >> >>> >> Or does it calculate per spin, and share the common parameters?
> >> >> >>> >>
> >> >> >>> >> My current bottleneck actually seems to be the saving of the
> >> >> >>> >> state file between each iteration of the global analysis.
> >> >> >>> >>
> >> >> >>> >> Best
> >> >> >>> >> Troels
> >> >> >>> >>
_______________________________________________
relax (http://www.nmr-relax.com)

This is the relax-devel mailing list
[email protected]

To unsubscribe from this list, get a password
reminder, or change your subscription options,
visit the list information page at
https://mail.gna.org/listinfo/relax-devel
