Hi Edward.

Thank you for your input.

I have implemented scipy.optimize.leastsq as a means of giving the user
freedom of choice.

I consider this an experimental feature, but it gives the user the
freedom to try different options.

The incredible speed-up, with exact calculation of R2eff and i0 but only
a rough estimation of the errors, can still be a valuable tool.

I find myself having a hard time explaining to fellow colleagues why
even screening the data for results is so time-consuming.

With this new implementation, I see it as a way to:

1) Estimate R2eff and errors.
2) Do model fitting. Inspect results.
3) Try some global fitting/clustering analysis.
4) Select residues for clustering.
5) Do model fitting with clustered residues. Inspect results.

When this looks promising:
1) Estimate R2eff and errors by Monte Carlo simulations.
2) Repeat the steps as before.

The key point is to make it possible to screen results quickly.

The new user function, "relax_disp.r2eff_estimate()", could be extended
to select other algorithms for R2eff and error estimation.

scipy.optimize.leastsq is a well-known option for minimisation in the
Python community.
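
As a minimal sketch of the idea (with hypothetical data and start
values, not the actual relax implementation), such an exponential fit
could look like:

    import numpy as np
    from scipy.optimize import leastsq

    def residuals(params, times, intensities, errors):
        # Weighted residuals for the decay I(t) = i0 * exp(-r2eff * t).
        r2eff, i0 = params
        return (intensities - i0 * np.exp(-r2eff * times)) / errors

    # Purely illustrative data for a single decay curve.
    times = np.array([0.0, 0.02, 0.04, 0.08, 0.12])
    intensities = 1000.0 * np.exp(-10.0 * times)
    errors = 5.0 * np.ones(len(times))

    # Initial estimates for R2eff and i0, then the fit itself.
    x0 = [5.0, intensities[0]]
    params, ier = leastsq(residuals, x0, args=(times, intensities, errors))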

Whether to try this out is then the user's free choice.

Best
Troels


2014-08-25 10:05 GMT+02:00 Edward d'Auvergne <[email protected]>:
> Hi Troels,
>
> On the subject of optimisation, if you use the scipy.optimize.fmin()
> function in the profiling, you will be able to compare the speed of
> the Nelder-Mead simplex algorithm in minfx vs. scipy.  These should
> however be similar.
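>
> As a minimal sketch of such a comparison (assuming a chi-squared
> function chi2(params, times, values, errors) is already defined -
> hypothetical names):
>
>     from scipy.optimize import fmin
>
>     # Nelder-Mead simplex minimisation of the chi-squared target
>     # function, for timing against the minfx equivalent.
>     x0 = [10.0, 1000.0]  # Start values for R2eff and i0.
>     xopt = fmin(chi2, x0, args=(times, values, errors), disp=False)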
>
> Regards,
>
> Edward
>
>
>
> On 25 August 2014 09:32, Edward d'Auvergne <[email protected]> wrote:
>> Hi Troels,
>>
>> We should probably discuss how you would like to implement this in
>> relax, prior to writing too much code.  That way we can develop the
>> code in the best direction from the start and save a -lot- of time.
>> The way I see it is that there are two separate parts to this task:
>>
>> 1)  Optimisation algorithm.
>>
>> 2)  Error estimation algorithm.
>>
>> In this thread, I'll cover 1) and then write a second message about 2).
>>
>> For the scipy.optimize.leastsq() function, the Levenberg-Marquardt
>> optimisation algorithm is being used.  See
>> https://en.wikipedia.org/wiki/Levenberg%E2%80%93Marquardt_algorithm,
>> though this Wikipedia description is terrible at the moment and a
>> better reference would be the book Numerical Recipes in
>> C/Fortran/Pascal/etc.  This algorithm is also available in minfx
>> http://home.gna.org/minfx/minfx.levenberg_marquardt-module.html.  For
>> the algorithm, gradients are required.  These are converted into a
>> special matrix known as the Jacobian.  You can see this as the Dfun
>> argument to the scipy.optimize.leastsq() function.  If you do not
>> supply Dfun, then the algorithm uses a much, much, much slower
>> technique - the numeric approximation of the gradient.  From the
>> docs:
>>
>>     Dfun : callable
>>         A function or method to compute the Jacobian of func with derivatives
>>         across the rows. If this is None, the Jacobian will be estimated.
>>
>> So, if you derive the gradient equations and add these to the relax C
>> module, you will have a much greater speed up as estimating the
>> gradients will be the majority of the computation time for this
>> algorithm as you are currently using it.  These equations are
>> incredibly basic - just take the partial derivative of the exponential
>> equation with respect to each parameter.  The floating point value for
>> each parameter derivative is then packed into an array with the same
>> dimensions as the parameter vector.  This is the gradient.  As an
>> aside, the Hessian is similar but is second partial derivatives in a
>> matrix form, and using this together with the even faster Newton
>> algorithm is really the absolute ultimate solution (this powerful
>> algorithm should probably only need 2-3 steps of optimisation to solve
>> this simple problem).
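>>
>> As a sketch of this (with assumed variable names, not relax's actual
>> C module interface), the analytic Jacobian of the exponential
>> I(t) = i0 * exp(-r2eff * t) could be supplied to scipy as:
>>
>>     import numpy as np
>>     from scipy.optimize import leastsq
>>
>>     def residuals(params, times, intensities, errors):
>>         r2eff, i0 = params
>>         return (intensities - i0 * np.exp(-r2eff * times)) / errors
>>
>>     def jacobian(params, times, intensities, errors):
>>         # Partial derivatives of the residuals with respect to each
>>         # parameter, one row per parameter (hence col_deriv=1 below).
>>         r2eff, i0 = params
>>         exp_term = np.exp(-r2eff * times)
>>         d_r2eff = i0 * times * exp_term / errors  # d(residual)/d(r2eff)
>>         d_i0 = -exp_term / errors                 # d(residual)/d(i0)
>>         return np.array([d_r2eff, d_i0])
>>
>>     # Illustrative data for a single decay curve.
>>     times = np.array([0.0, 0.02, 0.04, 0.08, 0.12])
>>     intensities = 1000.0 * np.exp(-10.0 * times)
>>     errors = 5.0 * np.ones(len(times))
>>
>>     params, ier = leastsq(residuals, [5.0, 1000.0],
>>                           args=(times, intensities, errors),
>>                           Dfun=jacobian, col_deriv=1)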
>>
>> The key part that is missing from minfx is the numerical gradient
>> approximation.  However, deriving the gradient is usually such a basic
>> exercise - in this case it is ridiculously simple - and gives so much
>> more speed that I've never bothered with numerical approximations in
>> minfx.  I also did not derive or implement the gradient for the
>> exponential curve-fitting as the code was fast enough for my purposes.
>> To use gradient functions with minfx, there are the dchi2_func and
>> dfunc arguments, which then construct the Jacobian form for you that
>> the Dfun scipy argument expects.
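>>
>> For comparison, such a numerical gradient approximation is
>> conceptually just finite differences (a generic sketch, not minfx or
>> MINPACK code):
>>
>>     import numpy as np
>>
>>     def numerical_gradient(func, params, h=1e-8):
>>         # Central differences:
>>         # df/dx_i ~ (f(x + h*e_i) - f(x - h*e_i)) / (2h).
>>         params = np.asarray(params, dtype=float)
>>         grad = np.zeros(len(params))
>>         for i in range(len(params)):
>>             step = np.zeros(len(params))
>>             step[i] = h
>>             grad[i] = (func(params + step) - func(params - step)) / (2.0 * h)
>>         return grad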
>>
>> Anyway, that was just the background to all of this.  How do you see
>> this being implemented in relax?  All of the optimisation is via minfx
>> (https://gna.org/projects/minfx/), so maybe you should sign up to that
>> Gna! project as well.  Remember that minfx was just a normal relax
>> package that I separated into its own project, just like bmrblib
>> (https://gna.org/projects/bmrblib/).
>>
>> Regards,
>>
>> Edward
>>
>>
>> On 24 August 2014 17:56, Troels E. Linnet
>> <[email protected]> wrote:
>>> URL:
>>>   <http://gna.org/task/?7822>
>>>
>>>                  Summary: Implement user function to estimate R2eff and
>>> associated errors for exponential curve fitting.
>>>                  Project: relax
>>>             Submitted by: tlinnet
>>>             Submitted on: Sun 24 Aug 2014 03:56:36 PM UTC
>>>          Should Start On: Sun 24 Aug 2014 12:00:00 AM UTC
>>>    Should be Finished on: Sun 24 Aug 2014 12:00:00 AM UTC
>>>                 Category: relax's source code
>>>                 Priority: 5 - Normal
>>>                   Status: In Progress
>>>         Percent Complete: 0%
>>>              Assigned to: tlinnet
>>>              Open/Closed: Open
>>>          Discussion Lock: Any
>>>                   Effort: 0.00
>>>
>>>     _______________________________________________________
>>>
>>> Details:
>>>
>>> A verification script showed that using scipy.optimize.leastsq reaches the
>>> exact same parameters as minfx for exponential curve fitting.
>>>
>>> The verification script is in:
>>> test_suite/shared_data/curve_fitting/profiling/profiling_relax_fit.py
>>> test_suite/shared_data/curve_fitting/profiling/verify_error.py
>>>
>>> The profiling script shows that a 10x increase in speed can be reached by
>>> removing the linear constraints when using minfx.
>>>
>>> The profiling also shows that scipy.optimize.leastsq is 10x as fast as
>>> minfx, even without linear constraints.
>>>
>>> scipy.optimize.leastsq is a wrapper around MINPACK's lmdif and lmder
>>> algorithms.
>>>
>>> MINPACK is a FORTRAN90 library which solves systems of nonlinear equations,
>>> or carries out the least squares minimization of the residual of a set of
>>> linear or nonlinear equations.
>>>
>>> The verification script also shows that a very heavy and time-consuming
>>> Monte Carlo simulation of 2000 steps reaches the same errors as those
>>> reported by scipy.optimize.leastsq.
>>>
>>> The return from scipy.optimize.leastsq gives the estimated covariance.
>>> Taking the square root of the covariance corresponds to 2x the error
>>> reported by minfx after 2000 Monte Carlo simulations.
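>>>
>>> As a sketch of how such errors could be extracted (assuming a
>>> residuals() function, data arrays times, intensities, errors and
>>> start values x0 as in a standard weighted exponential fit -
>>> hypothetical names), the raw cov_x from leastsq is commonly scaled
>>> by the variance of the residuals first:
>>>
>>>     import numpy as np
>>>     from scipy.optimize import leastsq
>>>
>>>     params, cov_x, infodict, mesg, ier = leastsq(
>>>         residuals, x0, args=(times, intensities, errors),
>>>         full_output=True)
>>>
>>>     # Scale the raw covariance by the variance of the residuals, then
>>>     # take the square root of the diagonal as the parameter errors.
>>>     n_data, n_params = len(times), len(params)
>>>     s_sq = np.sum(infodict['fvec']**2) / (n_data - n_params)
>>>     param_errors = np.sqrt(np.diag(cov_x) * s_sq)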
>>>
>>> This could be an extremely time-saving step when performing model fitting
>>> in R1rho, where the errors of the R2eff values are estimated by Monte Carlo
>>> simulations.
>>>
>>> The following setup illustrates the problem.
>>> This was analysed on a MacBook Pro (13-inch, late 2011),
>>> with no multi-core setup.
>>>
>>> Script running is:
>>> test_suite/shared_data/dispersion/Kjaergaard_et_al_2013/2_pre_run_r2eff.py
>>>
>>> This script analyses just the R2eff values for 15 residues.
>>> It estimates the errors of R2eff based on 2000 Monte Carlo simulations.
>>> For each residue, there are 14 exponential graphs.
>>>
>>> The script was stopped after 35 simulations, which took 20 minutes
>>> (about 34 seconds per simulation).
>>> At this rate, 500 simulations would take about 4.8 hours.
>>>
>>> The R2eff values and errors can instead be calculated by
>>> scipy.optimize.leastsq in: 15 residues * 0.02 seconds = 0.3 seconds.
>>>
