Re: [task #7822] Implement user function to estimate R2eff and associated errors for exponential curve fitting.

Troels Emtekær Linnet Mon, 25 Aug 2014 04:03:13 -0700

2014-08-25 11:19 GMT+02:00 Edward d'Auvergne <[email protected]>:
> Hi Troels,
>
> Unfortunately you have gone ahead an implemented a solution without
> first discussing or planning it.  Hence the current solution has a
> number of issues:
>
> 1)  Target function replication.  The solution should have reused the
> C modules.  The original Python code for fitting exponential curves
> was converted to C code for speed
> (http://gna.org/forum/forum.php?forum_id=1043).  Note that two point
> exponentials that decay to zero is not the only way that data can be
> collected, and that is the reason for Sebastien Morin's
> inversion-recovery branch (which was never completed).  Anyway, the
> code duplication is not acceptable.  If the C module is extended with
> new features, such as having the true gradient and Hessian functions,
> then the Python module will then be out of sync.  And vice-versa.  If
> a bug is found in one module and fixed, it may still be present in the
> second.  This is a very non-ideal situation for relax to be in, and is
> the exact reason why I did not allow the cst branch to be merged back
> to trunk.


Hi Edward.

I prefer not to make this target function dependent on C-code compilation.

Compilation of code on windows can be quite a hairy thing.

For example see:
http://wiki.nmr-relax.com/Installation_windows_Python_x86-32_Visual_Studio_Express_for_Windows_Desktop#Install_Visual_Studio_Express_2012_for_Windows_Desktop

Visua Studio Express is several hundreds of megabyte installation, for
just compiling an exponential curve. ?
This is way, way overkill for this situation.




> 2)  Scipy is now a dependency for the dispersion analysis!  Why was
> this not discussed?  Coding a function for calculating the covariance
> matrix is basic.  Deriving and coding the real gradient function is
> also basic.  I do not understand why Scipy is now a dependency.  I
> have been actively trying to remove Scipy as a relax dependency and
> only had a single call for numeric quadratic intergration via QUADPACK
> wrappers left to remove for the frame order analysis.  Now Scipy is
> back :(

Hi Edward.

Scipy is a dependency for trying calculation with scipy.optimize.leastsq.

How could it be anymore different?

What you are aiming at, is to add yet another feature for estimating the errors.
A third solution.

What ever the third solution would come up with of dependency, would
depend on the method implemented.
One could also possible imagine to extend this procedures in R, Matlab
or whatever.

Byt they would also need to meet some dependencies.

Of course the best solution would always try to make relax most independent.

But if the desire is to try with scipy.optimize.leastsq, then you are
bound with this dependency.

> 3)  If the covariance function was coded, then the specific analysis
> API could be extended with a new covariance method and the
> relax_disp.r2eff_estimate user function could have simply been called
> error_estimate.covariance_matrix, or something like that.  Then this
> new error_estimate.covariance_matrix user function could replace the
> monte_carlo user functions for all analyses, as a rough error
> estimator.

That would be the third possibility.

> 4)  For the speed of optimisation part of the new
> relax_disp.r2eff_estimate user function, this is not because scipy is
> faster than minfx!!!  It is the choice of algorithms, the numerical
> gradient estimate, etc.
> (http://thread.gmane.org/gmane.science.nmr.relax.scm/22979/focus=6812).

This sound good.

But I can only say, that as I user I meet a "big wall of time
consumption", for the error
estimation of R2eff via Monte-Carlo.

As a user, I needed more options to try out.

> 5)  Back to Scipy.  Scipy optimisation is buggy full stop.  The
> developers ignored my feedback back in 2003.  I assumed that the
> original developers had just permanently disappeared, and they really
> never came back.  The Scipy optimisation code did not change for many,
> many years.  While it looks like optimisation works, in some cases it
> does fails hard, stopping in a position in the space where there is no
> minimum!  I added the dx.map user function to relax to understand
> these Scipy rubbish results.  And I created minfx to work around these
> nasty hidden failures.  I guess such failures are due to them not
> testing the functions as part of a test suite.  Maybe they have fixed
> the bugs now, but I really can no longer trust Scipy optimisation.
>

I am sorry to hear about this.

And I am totally convinced that minfx is better for minimising the
dispersion models.
You have proven that quite well in your papers.

I do though have a hard time believing that minimisation of an
exponential function should be
subject to erroneous results.

Anyway, this is still left to "freedom of choice" for the user.


> There is another solution that avoids this, and which would have been
> a much better solution:
>
> a)  Deriving and coding the gradients (and maybe Hessian),
> b)  Add a covariance function to the relax library.
>
> This way you could simply continue using minfx, though with the faster
> optimisation logarithms and have a new user function that is
> independent of the analysis type and hence much more useful.  If you
> did have the Hessian in the C module, then this with Newton
> optimisation followed by calling error_estimate.covariance_matrix
> should be even faster.  Please discuss and plan solutions on the
> mailing lists before implementing them.  The quickest solution to
> implement is not always the best.
>
> Alternatively, a quick solution for a once-off problem could be coded
> into a temporary branch, but that branch is then not merged back into
> the main line.
>
> Cheers,
>
> Edward
>
>
>
>
> On 24 August 2014 17:56, Troels E. Linnet
> <[email protected]> wrote:
>> URL:
>>   <http://gna.org/task/?7822>
>>
>>                  Summary: Implement user function to estimate R2eff and
>> associated errors for exponential curve fitting.
>>                  Project: relax
>>             Submitted by: tlinnet
>>             Submitted on: Sun 24 Aug 2014 03:56:36 PM UTC
>>          Should Start On: Sun 24 Aug 2014 12:00:00 AM UTC
>>    Should be Finished on: Sun 24 Aug 2014 12:00:00 AM UTC
>>                 Category: relax's source code
>>                 Priority: 5 - Normal
>>                   Status: In Progress
>>         Percent Complete: 0%
>>              Assigned to: tlinnet
>>              Open/Closed: Open
>>          Discussion Lock: Any
>>                   Effort: 0.00
>>
>>     _______________________________________________________
>>
>> Details:
>>
>> A verification script, showed that using scipy.optimize.leastsq reaches the
>> exact same parameters as minfx for exponential curve fitting.
>>
>> The verification script is in:
>> test_suite/shared_data/curve_fitting/profiling/profiling_relax_fit.py
>> test_suite/shared_data/curve_fitting/profiling/verify_error.py
>>
>> The profiling script shows that a 10 X increase in speed can be reached by
>> removing
>> the linear constraints when using minfx.
>>
>> The profiling also shows that scipy.optimize.leastsq is 10X as fast as using
>> minfx, even without linear constraints.
>>
>> scipy.optimize.leastsq is a wrapper around wrapper around MINPACK's lmdif and
>> lmder algorithms.
>>
>> MINPACK is a FORTRAN90 library which solves systems of nonlinear equations, 
>> or
>> carries out the least squares minimization of the residual of a set of linear
>> or nonlinear equations.
>>
>>  The verification script also shows, that a very heavy and time consuming
>> monte carlo simulation of 2000 steps, reaches the same errors as the errors
>> reported by scipy.optimize.leastsq.
>>
>> The return from scipy.optimize.leastsq, gives the estimated co-variance.
>> Taking the square root of the co-variance corresponds with 2X error reported
>> by minfx after 2000 Monte-Carlo simulations.
>>
>> This could be an extremely time saving step, when performing model fitting in
>> R1rho, where the errors of the R2eff values, are estimated by Monte-Carlo
>> simulations.
>>
>> The following setup illustrates the problem.
>> This was analysed on a: MacBook Pro, 13-inch, Late 2011.
>> With no multi-core setup.
>>
>> Script running is:
>> test_suite/shared_data/dispersion/Kjaergaard_et_al_2013/2_pre_run_r2eff.py
>>
>> This script analyses just the R2eff values for 15 residues.
>> It estimates the errors of R2eff based on 2000 Monte Carlo simulations.
>> For each residues, there is 14 exponential graphs.
>>
>> The script was broken after 35 simulations.
>> This was measured to 20 minutes.
>> So 500 simulations would take about 4.8 Hours.
>>
>> The R2eff values and errors can by scipy.optimize.leastsq can instead be
>> calculated in: 15 residues * 0.02 seconds = 0.3 seconds.
>>
>>
>>
>>
>>     _______________________________________________________
>>
>> Reply to this item at:
>>
>>   <http://gna.org/task/?7822>
>>
>> _______________________________________________
>>   Message sent via/by Gna!
>>   http://gna.org/
>>
>>
>> _______________________________________________
>> relax (http://www.nmr-relax.com)
>>
>> This is the relax-devel mailing list
>> [email protected]
>>
>> To unsubscribe from this list, get a password
>> reminder, or change your subscription options,
>> visit the list information page at
>> https://mail.gna.org/listinfo/relax-devel

_______________________________________________
relax (http://www.nmr-relax.com)

This is the relax-devel mailing list
[email protected]

To unsubscribe from this list, get a password
reminder, or change your subscription options,
visit the list information page at
https://mail.gna.org/listinfo/relax-devel

Re: [task #7822] Implement user function to estimate R2eff and associated errors for exponential curve fitting.

Reply via email to