Hi Edward.
Ah, yes. I overwrite the state file for each new global fit, and each new
pipe makes it grow quite a lot. I will change that.
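A rough sketch of what I plan instead, inside the relax script (the file,
directory, and pipe names are just placeholders):
-------------
# Save only the current pipe after each global fit, instead of
# re-saving the full state with all previous pipes in it.
results.write(file='results_%s' % pipe_name, dir='global_fits', force=True)

# Optionally delete pipes that are no longer needed, so that a final
# state.save() stays small.
pipe.delete(pipe_name=old_pipe_name)
-------------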
I just checked my scripts. In both cases, I would do one grid search for
the first run, and the recurring analyses would then copy the parameters
from the first pipe. And the speed-up comparison is between these
analyses. Hm. I have to take the grid search out as a variable!
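To isolate the dispersion code changes, I could fix the starting values
in both scripts and skip the grid search completely, something like this
(the parameter values are just placeholders, and the exact user function
names depend on the relax version):
-------------
# Skip the grid search and start both versions from the same fixed
# point, so only the dispersion library code differs between timings.
value.set(val=2.0, param='r2')
value.set(val=0.95, param='pA')
value.set(val=2.0, param='dw')
value.set(val=1000.0, param='kex')

# Then optimise directly.
minimise('simplex', constraints=True)
-------------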
I am trying to devise a profiling script which I can put in the base
folder of older versions of relax, for example relax 3.1.6, which I also
have. It looks like this:
-------------
# Python module imports.
from numpy import array, float64, pi, zeros
import cProfile

# relax module imports.
from lib.dispersion.cr72 import r2eff_CR72

# Default parameter values.
r20a = 2.0
r20b = 4.0
pA = 0.95
dw = 2.0
kex = 1000.0
relax_times = 0.04
ncyc_list = [2, 4, 8, 10, 20, 40, 500]

# Required data structures, for a single spin and for a simulated
# 100-spin cluster (the ncyc list repeated 100 times).
s_ncyc = array(ncyc_list)
s_num_points = len(s_ncyc)
s_cpmg_frqs = s_ncyc / relax_times
s_R2eff = zeros(s_num_points, float64)

g_ncyc = array(ncyc_list * 100)
g_num_points = len(g_ncyc)
g_cpmg_frqs = g_ncyc / relax_times
g_R2eff = zeros(g_num_points, float64)

# The spin Larmor frequency.
sfrq = 200. * 1E6

# Calculate pB.
pB = 1.0 - pA

# Exchange rates.
k_BA = pA * kex
k_AB = pB * kex

# Calculate the spin Larmor frequency in rad/s.
frqs = sfrq * 2 * pi

# Convert dw from ppm to rad/s.
dw_frq = dw * frqs / 1.e6


def single():
    # The single-spin target function call, repeated for stable timings.
    for i in xrange(10000):
        r2eff_CR72(r20a=r20a, r20b=r20b, pA=pA, dw=dw_frq, kex=kex,
                   cpmg_frqs=s_cpmg_frqs, back_calc=s_R2eff,
                   num_points=s_num_points)

cProfile.run('single()')


def cluster():
    # The same call for the simulated 100-spin cluster.
    for i in xrange(10000):
        r2eff_CR72(r20a=r20a, r20b=r20b, pA=pA, dw=dw_frq, kex=kex,
                   cpmg_frqs=g_cpmg_frqs, back_calc=g_R2eff,
                   num_points=g_num_points)

cProfile.run('cluster()')
------------------------
For 3.1.6
[tlinnet@tomat relax-3.1.6]$ python profile_lib_dispersion_cr72.py
         20003 function calls in 0.793 CPU seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.793    0.793 <string>:1(<module>)
    10000    0.778    0.000    0.783    0.000 cr72.py:98(r2eff_CR72)
        1    0.010    0.010    0.793    0.793 profile_lib_dispersion_cr72.py:69(single)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
    10000    0.005    0.000    0.005    0.000 {range}

         20003 function calls in 61.901 CPU seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000   61.901   61.901 <string>:1(<module>)
    10000   61.853    0.006   61.887    0.006 cr72.py:98(r2eff_CR72)
        1    0.013    0.013   61.901   61.901 profile_lib_dispersion_cr72.py:75(cluster)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
    10000    0.035    0.000    0.035    0.000 {range}
For trunk
[tlinnet@tomat relax_trunk]$ python profile_lib_dispersion_cr72.py
         80003 function calls in 0.514 CPU seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.514    0.514 <string>:1(<module>)
    10000    0.390    0.000    0.503    0.000 cr72.py:100(r2eff_CR72)
    10000    0.008    0.000    0.040    0.000 fromnumeric.py:1314(sum)
    10000    0.007    0.000    0.037    0.000 fromnumeric.py:1708(amax)
    10000    0.006    0.000    0.037    0.000 fromnumeric.py:1769(amin)
        1    0.011    0.011    0.514    0.514 profile_lib_dispersion_cr72.py:69(single)
    10000    0.007    0.000    0.007    0.000 {isinstance}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
    10000    0.030    0.000    0.030    0.000 {method 'max' of 'numpy.ndarray' objects}
    10000    0.030    0.000    0.030    0.000 {method 'min' of 'numpy.ndarray' objects}
    10000    0.025    0.000    0.025    0.000 {method 'sum' of 'numpy.ndarray' objects}

         80003 function calls in 1.209 CPU seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    1.209    1.209 <string>:1(<module>)
    10000    1.042    0.000    1.196    0.000 cr72.py:100(r2eff_CR72)
    10000    0.009    0.000    0.049    0.000 fromnumeric.py:1314(sum)
    10000    0.007    0.000    0.052    0.000 fromnumeric.py:1708(amax)
    10000    0.007    0.000    0.052    0.000 fromnumeric.py:1769(amin)
        1    0.014    0.014    1.209    1.209 profile_lib_dispersion_cr72.py:75(cluster)
    10000    0.007    0.000    0.007    0.000 {isinstance}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
    10000    0.045    0.000    0.045    0.000 {method 'max' of 'numpy.ndarray' objects}
    10000    0.045    0.000    0.045    0.000 {method 'min' of 'numpy.ndarray' objects}
    10000    0.033    0.000    0.033    0.000 {method 'sum' of 'numpy.ndarray' objects}
---------------
For 10000 iterations (tottime of r2eff_CR72, in seconds):

                   single   100-spin cluster
  3.1.6             0.778             61.853
  trunk             0.390              1.042

For 1000000 iterations:

                   single   100-spin cluster
  3.1.6            83.365   ???? Still running....
  trunk            40.825            106.339
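Computing the speed-up factors from the 10000-iteration timings above:
-------------
# Ratios of the 3.1.6 to trunk tottime values for r2eff_CR72.
print(61.853 / 1.042)   # 100-spin cluster: ~59x faster.
print(0.778 / 0.390)    # single spin: ~2x faster.
-------------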
Am I doing something wrong here? That is such a massive speed-up for the
clustered analysis that I simply can't believe it!
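Or is there a simple explanation? The trunk profiles are the only ones
with numpy ndarray method calls (sum, max, min), so maybe 3.1.6 loops
over the dispersion points in Python while trunk operates on whole numpy
arrays, making the cost almost independent of the number of points. A
toy illustration of the difference (not relax code):
-------------
import numpy as np

def per_point(cpmg_frqs, back_calc):
    # Python-level loop: the cost grows linearly with the number of
    # dispersion points, so 700 points cost ~100x more than 7.
    for i in range(len(cpmg_frqs)):
        back_calc[i] = 1.0 / cpmg_frqs[i]

def vectorised(cpmg_frqs, back_calc):
    # Single numpy expression: the loop runs in C, so the Python
    # overhead per call is constant and 700 points cost little more
    # than 7.
    back_calc[:] = 1.0 / cpmg_frqs
-------------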
Best
Troels
2014-06-04 15:04 GMT+02:00 Edward d'Auvergne <[email protected]>:
> Hi,
>
> Such a huge speed up cannot come from the changes of the 'disp_speed'
> branch alone.  From that branch I would expect at most a drop from 30
> min to 15 min.  Therefore it must be your grid search changes.  When
> changing, simplifying, or eliminating the grid search, you have to be
> very careful about the introduced bias.  This bias is unavoidable and
> needs to be mentioned in the methods section of any paper.  The key is
> to be confident that the bias you have introduced will not negatively
> impact your results: for example, that the grid search replacement is
> close enough to the true solution that the optimisation can still reach
> the global minimum.  You also have to convince the people reading your
> paper that the introduced bias is reasonable.
>
> As for a script to show the speed changes, you could have a look at
> maybe the
> test_suite/shared_data/dispersion/Hansen/relax_results/relax_disp.py
> file. This performs a full analysis with a large range of dispersion
> models on the truncated data set from Flemming Hansen. Or
> test_suite/shared_data/dispersion/Hansen/relax_disp.py which uses all
> of Flemming's data. These could be run before and after the merger of
> the 'disp_speed' branch, maybe with different models and the profile
> flag turned on. You could then create a text file in the
> test_suite/shared_data/dispersion/Hansen/relax_results/ directory
> called something like 'relax_timings' to permanently record the speed
> ups. This file can be used in the future for documenting any other
> speed ups as well.
>
> Regards,
>
> Edward
>
>
>
>
> On 4 June 2014 14:37, Troels Emtekær Linnet <[email protected]> wrote:
> > Looking at my old data, I can see that writing out the data between
> > each global fit analysis previously took around 30 min.
> >
> > It now takes 2-6 min.
> >
> > I almost can't believe that speed up!
> >
> > Could we devise a devel-script, which we could use to simulate the
> change?
> >
> > Best
> > Troels
> >
> >
> >
> > 2014-06-04 14:24 GMT+02:00 Troels Emtekær Linnet <[email protected]
> >:
> >
> >> Hi Edward.
> >>
> >> After the changes to the lib/dispersion model files, I see a massive
> >> speed-up of the computations.
> >>
> >> Over 2 days, I performed over 600 global fits for a 68 residue
> >> protein, where all residues were clustered.  I did it with just 1 CPU.
> >>
> >> This is really really impressive.
> >>
> >> I did, though, also alter how the grid search was performed,
> >> pre-setting some of the parameters to known values referred to in a
> >> paper.  So I can't really say what has cut the time down.
> >>
> >> But looking at the calculations running, the minimisation runs quite
> fast.
> >>
> >> So, how does relax collect the data for global fitting?
> >>
> >> Does it collect all the R2eff values for the clustered spins and send
> >> them to the target function together with the array of parameters to
> >> vary?
> >>
> >> Or does it calculate per spin, and share the common parameters?
> >>
> >> My current bottleneck actually seems to be the saving of the state
> >> file between each iteration of the global analysis.
> >>
> >> Best
> >> Troels
> >>
_______________________________________________
relax (http://www.nmr-relax.com)
This is the relax-devel mailing list
[email protected]
To unsubscribe from this list, get a password
reminder, or change your subscription options,
visit the list information page at
https://mail.gna.org/listinfo/relax-devel