Re: [sr #3154] Implementation of Baldwin (2014) B14 model - 2-site exact solution model for all time scales

Edward d'Auvergne Sun, 04 May 2014 04:44:22 -0700

Hi Troels,

Awesome work!  It's looking great.  For completing the B14 model
support (http://wiki.nmr-relax.com/B14), I have a few hints.  But
first, to fully have the model supported in relax requires a few more
small steps:


1)  Adding the B14 model to the dispersion auto-analysis
(http://wiki.nmr-relax.com/Tutorial_for_adding_relaxation_dispersion_models_to_relax#The_auto-analysis).
 This is needed for the output file creation.

2)  Adding the B14 model to the GUI
(http://wiki.nmr-relax.com/Tutorial_for_adding_relaxation_dispersion_models_to_relax#The_GUI).
 Users currently cannot see the model in the model list.

3)  Adding the B14 model to the manual
(http://wiki.nmr-relax.com/Tutorial_for_adding_relaxation_dispersion_models_to_relax#The_relax_manual).
 This is very important.  It needs its own section in the dispersion
chapter, the equations for which can come directly from the wiki page
(http://wiki.nmr-relax.com/B14).  It also needs to be added to the
dispersion model table and the dispersion software comparison table.

These 3 steps will complete the implementation in relax.  Then there
is some clean ups.

- For the RelaxWarnings you see there, you can find exactly where they
come from by using the --traceback and --numpy-raise command line
options.  These are actually numpy math errors, for example divide by
zero, math domain errors, etc.  They should all be caught and the
error avoided.  See the other lib.dispersion modules for what to do
when this occurs, as I have already written the code to handle these
situations for all other models.

- As you said, code optimisation.  There is a lot that can be done
here thanks to the design of relax, see
http://thread.gmane.org/gmane.science.nmr.relax.scm/20681/focus=5516
and http://thread.gmane.org/gmane.science.nmr.relax.devel/5506/focus=5509.
 Shifting the all calculations possible into target function
initialisation or the highest level of the target function will really
massively speed things up.  Refactoring Andy's code to user the
parameter naming convention used in relax and the relax coding style
would also be worthwhile.


I also have one last suggestion.  It's rather simple and quick, once
the B14 model is complete.  Split it into two models!  This is done
for all other models in relax which support the R20A and R20B
parameters.  The 'CR72' and 'CR72 full' models are probably the best
example.  The idea is to create a model whereby R20A = R20B.  For
this, each section of the tutorial at
http://wiki.nmr-relax.com/Tutorial_for_adding_relaxation_dispersion_models_to_relax
should be followed sequentially.  Users in the field will often use
the R20A = R20B models as the full R20A != R20B models are sometimes
too numerically unstable for their noisy data.  Therefore having both
'B14' and 'B14 full' models would be beneficial.  Following the
tutorial step by step and having one commit per section (or multiple),
and using the CR72 models as the reference, I usually manage to
complete this model splitting within 30 min.

Cheers,

Edward



On 3 May 2014 22:27, Troels Emtekær Linnet <[email protected]> wrote:
> Dear Andrew.
>
> I am happy to announce, that I have now implemented your model in relax.
>
> At this step, I created sample data R2eff data with your provided
> script, from the spin dynamic parameters from your paper:
> Dynamic parameters
> 1) kex=1000. rad/s
> 2) pb=0.01
> 3) dw_ppm=2.0
> 4) R2g= r2a = 2.0 rad/s
> 5) R2e=r2b = 100. rad/s.
>
> Experiment settings
> 1) isotope('1H', spin_id='@H')
> 2) ncycs = [2, 4, 8, 10, 20, 40, 500]
> 3) sfrq= 200. * 1E6 # 1H Larmor frequency.
> 4) Trelax=0.04 # Total time of CPMG block.
>
> When loading the generated R2eff data, relax will now do grid search,
> minimise, and find the
> dynamic parameters.
>
> This is implemented with the systemtest:
>     relax -s Relax_disp.test_baldwin_synthetic -d
>
> (Look for def test_baldwin_synthetic(self): )
> http://svn.gna.org/viewcvs/*checkout*/relax/trunk/test_suite/system_tests/relax_disp.py?revision=HEAD
>
> An example graph for the test data is here:
> http://wiki.nmr-relax.com/B14#Example_graph_of_test_data
>
> The main code for the implementation is:
>
>         The library code.
>         
> http://svn.gna.org/viewcvs/*checkout*/relax/trunk/lib/dispersion/b14.py?revision=HEAD
>
>          The target function. The func_B14().
>           
> http://svn.gna.org/viewcvs/*checkout*/relax/trunk/target_functions/relax_disp.py?revision=HEAD
>
> All commits that created the code, approx 20-30 commits, with
> corrections, can be found on the mailing lists with this search.
> http://search.gmane.org/?query=https%3A%2F%2Fgna.org%2Fsupport%2F%3F3154&author=&group=gmane.science.nmr.relax.scm&sort=revdate&DEFAULTOP=and&TOPDOC=10&xP=https%09gna%09org%09support%093154&xFILTERS=Gscience.nmr.relax.scm---A
>
> This follows the tutorial at:
> http://wiki.nmr-relax.com/Tutorial_for_adding_relaxation_dispersion_models_to_relax
>
> Step 1-9 is completed.
>
> Now follows step 10, which is debugging.
> 1) Optimisation of code
> 2) Prevent trigonometric/math functions to be overloaded. Currently
> the Grid search makes these complaints, for some edge parameter
> values.
>
> RelaxWarning: invalid value encountered in divide
> RelaxWarning: invalid value encountered in sqrt
> RelaxWarning: invalid value encountered in arccosh
> RelaxWarning: invalid value encountered in log
>
> A good test case would also be the creation of R2eff data with the
> already implemented numerical models, and the fit with B14.
>
> Best
> Troels
>
>
>
>
>
>
>
>
> 2014-05-03 14:54 GMT+02:00 Troels Emtekær Linnet <[email protected]>:
>> Dear Andrew Baldwin.
>>
>> I am in the process of implementing your code in relax.
>>
>> I am getting myself into a terrible mess by trying to compare
>> equations in papers, code and conversions.
>>
>> But I hope you maybe have a little time to clear something up?
>>
>> I am looking at the current version of your paper:
>> dx.doi.org/10.1016/j.jmr.2014.02.023
>> which still is still "early version of the manuscript. The manuscript
>> will undergo copyediting, typesetting, and review of the resulting
>> proof before it is published in its final form."
>>
>> 1) Equations are different ?
>> In Appendix 1 – recipe for exact calculation of R2,eff
>> h1 = 2 dw (dR2 + kEG - kGE)
>>
>> But when I compare to:
>>
>> Supplementary Section 4. Relation to Carver Richards equation. Equation 70.
>> h1 = zeta = 2 dw alpha_ =  2 dw (dR2 + kGE - kEG)
>>
>> Is there a typo here? The GE and EG has been swapped.
>>
>> 2) Missing zeta in equation 70
>> There is missing \zeta instead of z in equation 70, which is probably
>> a typesetting error.
>>
>> 3) Definition of " Delta R2" is opposite convention?
>> Near equation 10, 11 you define: delta R2 = R2e - R2g
>> if we let e=B and g=A, then delta R2 = R2B - R2A
>>
>> And in equation 70:
>> alpha_ =  delta R2 + kGE - kEG
>>
>> but i CR original work, A8,
>> alpha_ =  R2A - R2B + kA - kB = - delta R2 + kGE - kEG
>>
>> So, that doesn't match?
>>
>> Sorry if these questions are trivial.
>>
>> But I hope you can help clear them up for me. :-)
>>
>> Best
>> Troels
>>
>>
>>
>>
>>
>>
>> 2014-05-01 10:07 GMT+02:00 Andrew Baldwin <[email protected]>:
>>>
>>> The Carver and Richards code in relax is fast enough, though Troels
>>> might have an interest in squeezing out a little more speed.  Though
>>> it would require different value checking to avoid NaNs, divide by
>>> zeros, trig function overflows (which don't occur for math.sqrt), etc.
>>>  Any changes would have to be heavily documented in the manual, wiki,
>>> etc.
>>>
>>>
>>> The prove that the two definitions are equivalent is relatively
>>> straightforward. The trig-to-exp command in the brilliant (and free)
>>> symbolic program sage might prove helpful in doing that.
>>>
>>>
>>> For the tau_c, tau_cp, tau_cpmg, etc. definitions, comparing the relax
>>> code to your script will ensure that the correct definition is used.
>>> That's how I've made sure that all other dispersion models are
>>> correctly handled - simple comparison to other software.  I'm only
>>> aware of two definitions though:
>>>
>>> tau_cpmg = 1 / (4 nu_cpmg)
>>> tau_cpmg = 1 / (2 nu_cpmg)
>>>
>>>
>>> tau_cpmg = 1/(nu_cpmg).
>>>
>>> Carver and Richard's switch definitions half way through the paper
>>> somewhere.
>>>
>>>
>>> What is the third?  Internally in relax, the 1 / (4 nu_cpmg)
>>> definition is used.  But the user inputs nu_cpmg.  This avoids the
>>> user making this mistake as nu_cpmg only has one definition.
>>>
>>>
>>> I've seen people use nu_cpmg defined as the pulse frequency. It's just an
>>> error that students make when things aren't clear. I've often seen brave
>>> student from a lab that has never tried CPMG before do this. Without someone
>>> to tell them that this is wrong, it's not obvious to them that they've made
>>> a mistake. I agree with you that this is solved with good documentation.
>>>
>>>
>>>
>>> You guys are free to use my code (I don't mind the gnu license) or of course
>>> implement from scratch as needed.
>>>
>>> Cheers!  For a valid copyright licensing agreement, you'll need text
>>> something along the lines of:
>>>
>>> "I agree to licence my contributions to the code in the file
>>> http://gna.org/support/download.php?file_id=20615 attached to
>>> http://gna.org/support/?3154 under the GNU General Public Licence,
>>> version three or higher."
>>>
>>> Feel free to copy and paste.
>>>
>>>
>>> No problem:
>>>
>>> "I agree to licence my contributions to the code in the file
>>> http://gna.org/support/download.php?file_id=20615 attached to
>>> http://gna.org/support/?3154 under the GNU General Public Licence,
>>> version three or higher."
>>>
>>>
>>> I'd like to note again though that anyone using this formula to fit data,
>>> though exact in the case of 2site exchange/inphase magnetisation, evaluated
>>> at double floating point precision should not be doing so! Neglecting scalar
>>> coupling/off resonance/spin flips/the details of the specific pulse sequence
>>> used will lead to avoidable foobars. I do see value in this, as described in
>>> the paper, as being a feeder for initial conditions for a more fancy
>>> implemenation. But beyond that, 'tis a bad idea. Ideally this should appear
>>> in big red letters in your (very nice!) gui when used.
>>>
>>> In relax, we allow the user to do anything though, via the
>>> auto-analysis (hence the GUI), we direct the user along the best path.
>>>  The default is to use the CR72 and 'NS CPMG 2-site expanded' models
>>> (Carver & Richards, and Martin Tollinger and Nikolai Skrynnikov's
>>> Maple expansion numeric model).  We use the CR72 model result as the
>>> starting point for optimisation of the numeric models, allowing a huge
>>> speed up in the analysis.  The user can also choose to not use the
>>> CR72 model results for the final model selection - for determining
>>> dispersion curve significance.
>>>
>>>
>>> Here's the supp from my CPMG formula paper (attached). Look at the last
>>> figure. Maybe relevant. Nikolai's algorithm blows up sometimes when you
>>> evaluate to double float point precision (as you will when you have a
>>> version in python or matlab). The advantage of Nicolai's formula, or mine is
>>> that they won't fail when Pb starts to creep above a per cent or so.
>>>
>>> Using the simple formula as a seed for the more complex on is a good idea.
>>> The most recent versions of CATIA have something analogous.
>>>
>>>
>>> As for the scalar coupling and spin flips, I am unaware of any
>>> dispersion software that handles this.
>>>
>>>
>>> CATIA. Also cpmg_fig I believe. In fact, I think we may have had this
>>> discussion before?
>>>
>>> https://plus.google.com/s/cpmg%20glove
>>>
>>> If you consider experimental validation a reasonable justification for
>>> inclusion of the effects then you might find this interesting:
>>>
>>> Spin flips are also quite relevant to NH/N (and in fact any spin system).
>>> The supplementary section of Flemming and Pramodh go into it here for NH/N
>>> http://www.pnas.org/content/105/33/11766.full.pdf
>>>
>>> And this:
>>> JACS (2010) 132: 10992-5
>>> Figure 2:
>>> r2 0.97, rmsd  8.0 ppb (no spin flips)
>>> r2 0.99, rmsd  5.7 ppb (spin flips).
>>>
>>> The improvements are larger than experimental uncertainties.
>>>
>>> When we design these experiments and test them, we need to think about the
>>> details. This is in part because Lewis beats us if we don't. You can imagine
>>> that it comes as a surprise then when we see people neglecting this. In
>>> short, the parameters you extract from fitting data will suffer if the
>>> details are not there. In the case of spin flips, the bigger the protein,
>>> the bigger the effect. In your code, you have the opportunity to do things
>>> properly. This leaves the details behind the scenes, so the naive user
>>> doesn't have to think about them.
>>>
>>>
>>> And only Flemming's CATIA
>>> handles the CPMG off-resonance effects.  This is all explained in the
>>> relax manual.  I have asked the authors of most dispersion software
>>> about this too, just to be sure.  I don't know how much of an effect
>>> these have though.  But one day they may be implemented in relax as
>>> well, and then the user can perform the comparison themselves and see
>>> if all these claims hold.
>>>
>>>
>>> Myint, W. & Ishima, R. Chemical exchange effects during refocusing pulses in
>>> constant-time CPMG relaxation dispersion experiments. J Biomol Nmr 45,
>>> 207-216, (2009).
>>>
>>> also:
>>>
>>> Bain, A. D., Kumar Anand, C. & Nie, Z. Exact solution of the CPMG pulse
>>> sequence with phase variation down the echo train: application to R(2)
>>> measurements. J Magn Reson 209, 183- 194, (2011).
>>>
>>>
>>> Or just simulate the off-resonance effects yourself to see what happens. For
>>> NH you see the effects clearly for glycines and side chains, if the carrier
>>> is in the centre around 120ppm. The problem generally gets worse the higher
>>> field you go though this of course depends when you bought your probe. You
>>> seem to imply that these effects are almost mythical. I assure you that they
>>> come out happily from the Bloch equations.
>>>
>>> Out of curiosity, from a philosophical perspective, I wonder if you'd agree
>>> with this statement:
>>>
>>> "the expected deviations due to approximations in a model should be lower
>>> than the experimental errors on each datapoint."
>>>
>>>
>>>
>>> Anyway, the warnings about analytic versers numeric are described in
>>> the manual.  But your new model which adds a new factor to the CR72
>>> model, just as Dimitry Korzhnev's cpmg_fit software does for his
>>> multiple quantum extension of the equation (from his 2004 and 2005
>>> papers), sounds like it removes the major differences between the
>>> analytic and numeric results anyway.  In any case, I have never seen
>>> an analytic result which is more than 10% different in parameter
>>> values (kex specifically) compared to the numeric results.  I am
>>> constantly on the lookout for a real or synthetic data set to add to
>>> relax to prove this wrong though.
>>>
>>>
>>> I think there's a misunderstanding in what I mean by numerical modelling.
>>> For the 2x2 matrix (basis: I+, ground and excited) from the Bloch McConnell
>>> equation, you can manipulate this to get an exact solution. Nikolai's
>>> approach does this, though his algorithm can trip up sometimes for
>>> relatively exotic parameters when you use doubles (see attached). My formula
>>> also does this. I agree with you entirely that in this instance, numerically
>>> solving the equations via many matrix exponentials is irrelevant as you'll
>>> get an identical result to using a formula.
>>>
>>> My central thesis here though is that to get an accurate picture of the
>>> experiment you need more physics. This means a bigger basis. To have off
>>> resonance effects, you need a minimum 6x6 (Ix,Iy,Iz, ground and excited). To
>>> include scalar coupling and off resonance, you need a 12x12
>>> (Ix,Iy,Iz,IxHz,IyHz,IzHz, ground and excited). Including R1 means another
>>> element and so on. The methyl group, for example, means you need 24.
>>>
>>> So when we use the term numerical treatment, we generally mean a calculation
>>> in a larger basis, as is necessary to take into account the appropriate spin
>>> physics. There aren't formulas to deal with these situations. In the 6x6
>>> case for example, you need 6 Eigenvalues, which is going to make life very
>>> rough for someone brave enough to attempt a close form solution. The Palmer
>>> and Trott trick used in 2002 for R1rho is a smart way of ducking the problem
>>> of having 6 Eigenvalues, but for CPMG unfortunately you need several
>>> Eigenvalues, not just the smallest.
>>>
>>> The 2x2 matrix that Nikolai and Martin, Carver Richard's and I analyse does
>>> not include scalar coupling, as magnetisation is held in-phase (in addition
>>> to neglecting all the other stuff detailed above). So it is a good
>>> representation for describing the continuous wave in-phase experiments
>>> introduced here (neglecting relaxation effects and off resonance):
>>>
>>> Vallurupalli, P.; Hansen, D. F.; Stollar, E. J.; Meirovitch, E.; Kay, L. E.
>>> Proc. Natl. Acad. Sci. U.S.A. 2007, 104, 18473–18477.
>>>
>>> and here:
>>>
>>> Baldwin, A. J.; Hansen, D. F.; Vallurupalli, P.; Kay, L. E. J. Am. Chem.
>>> Soc. 2009, 131, 11939–48.
>>>
>>> But I think are the only two where this formula is directly applicable. Only
>>> if you have explicit decoupling during the CPMG period do you satisfy this
>>> condition. So this is not the case for all other experiments and certainly
>>> not true for those used most commonly.
>>>
>>> Anyhow. Best of luck with the software. I would recommend that you consider
>>> implementing these effects and have a look at some of the references. The
>>> physics are fairly complex, but the implementations are relatively
>>> straightforward and amount to taking many matrix exponentials. If you do
>>> this, I think you'd end up with a solution that really is world-leading.
>>>
>>> As it stands though, in your position, I would worry that on occasion, users
>>> will end up getting slightly wrong parameters out from your code by
>>> neglecting these effects. If a user trusts this code then, in turn, they
>>> might lead themselves to dodgy biological conclusions. For the time being,
>>> I'll stick to forcing my students to code things up themselves.
>>>
>>> All best wishes,
>>>
>>> Andy.
>>>
>>>
>>>

_______________________________________________
relax (http://www.nmr-relax.com)

This is the relax-devel mailing list
[email protected]

To unsubscribe from this list, get a password
reminder, or change your subscription options,
visit the list information page at
https://mail.gna.org/listinfo/relax-devel

Re: [sr #3154] Implementation of Baldwin (2014) B14 model - 2-site exact solution model for all time scales

Reply via email to