Re: Full analysis issue

Edward d'Auvergne Tue, 27 May 2008 05:31:44 -0700

On Fri, May 9, 2008 at 5:45 PM, Sébastien Morin
<[EMAIL PROTECTED]> wrote:
> Hi Ed,
>
> First, thanks a lot for this help !
>
> Second, I have to apologize for the length of this mail...
>
>
> Ok...
>
>
> My system is a 271 residue globular protein (230 residues with data at 3
> fields = 2070 observables). A homologous protein is being studied in
> the lab and analysing relaxation data using either the diffusion seeded
> approach in ModelFree or the new protocol of the full_analysis script
> yields similar results with a high mean S2 (~0.90) and a few Rex (15-20)
> throughout the protein. Thus, the problem here with my system is
> probably external to the approaches and the user...
>
>
> Ok...
>
>
> I tried using ModelFree with relax (script palmer.py : ModelFree as an
> engine for optimization, but relax for automating and AIC model
> selection) and got results similar to those from the full_analysis.py
> approach... For the two situations tested (see below), no oscillation
> occurred. Here are some stats :
>
> =======================================================================
> Approach        Diff     Iter  Chi2    AIC     Nb_Rex  <Rex>_+-_StdDev
> ==============  =======  ====  ======  ======  ======  ===============
> palmer          prolate  15    ~12990  ~14060  182     1.602_+-_0.770
>
> palmer_hybrid   prolate  12    ~ 2715  ~ 3660  129     0.902_+-_0.571
>
> full            prolate   5    ~13090  ~14125  181     1.671_+-_0.782
>
> full_hybrid     prolate   7    ~ 2750  ~ 3720  145     2.431_+-_1.546
> =======================================================================
>
> It seems that the new protocol is not the source of the problem.
> Moreover, it is obvious from the AIC value (and also from the diffusion
> tensor details, not shown here) that the hybrid (without the highly
> flexible C-terminus) is a better description of the system. However, as
> is seen here, the Rex values seem quite small and there are far too many
> Rex (> 50 % of all residues)... These may thus not be significant, but
> then, how can one exclude such "artifacts" when doing iterative
> optimization (with either approach)..? How can one decide to choose
> a model other than one with Rex when iterating to find the best diffusion
> tensor..?


This Rex all over the place is an indication of trouble.  Especially
if the local tm models don't show the same Rex pattern!  These are
likely to be the artificial Rex values as described in Tjandra et al.,
1996 (also discussed in depth in d'Auvergne and Gooley, 2007 and 2008b
(http://dx.doi.org/10.1039/b702202f and
http://dx.doi.org/10.1007/s10858-007-9213-3)).  The problem is likely
that the diffusion tensor description is inadequate.


> Ok...
>
>
> Maybe, as you proposed, the problem arises because of the crystal
> structure being inappropriate for describing the solution structure...
> The crystal structure I use has a resolution of 1.95 A. Protons were not
> visible but were added using CHARMM.  Moreover, different snapshots from
> molecular mechanics in CHARMM were also tested to see if fluctuations in
> NH bond orientation could yield better optimizations... It was not the case.
>
> I'll try to assess this issue of the crystal structure by running tests
> (with palmer.py and also full_analysis.py approaches) using a different
> structure (a point mutant) also from crystallography... The
> resolution of this structure is also quite low (1.75 A). Anyway, I don't
> have a choice since no solution structure exists, nor any better crystal
> structures... If the crystal structure turns out to be the cause of this
> problem, what can one do ? Is one obliged to do the analysis with a
> local_tm or a sphere diffusion tensor ? Is it a waste if one does so with
> good quality data at three fields ???

The underlying structure is only one of a few issues which may trigger
this problem.  I'll describe more below.  But if this structure is not
representative of solution conditions, then the local tm and spherical
diffusion models are all you can use.  Because of its construction, the
local tm model is highly sensitive to noise and can be quite unstable,
as can be seen in the plot of the tm value, and hence this model
significantly benefits from more and higher quality data.  This will
improve the dynamic description obtained from the local tm model, and
hence help any next steps in the analysis.


> Ok...
>
>
> What about the AIC for the local_tm model VS the ellipsoid in the
> full_analysis approach ? Here are some stats :
>
> =======================================================================
> Approach     Models  Diff       AIC
> ===========  ======  =========  ======
> full         m1-m5   local_tm   ~ 4510
> full         m1-m5   ellipsoid  ~12710
>
> full         m0-m9   local_tm   ~ 4410
> full         m0-m9   ellipsoid  ~ 5210

Ok, this huge differential decrease in AIC values between the 2
diffusion models using the different sets of model-free models is
probably an indication that the ellipsoid diffusion model with
model-free models m0 to m9 is absorbing an artifact.  This is probably
artificial Rex or ts values absorbing the inadequacy of the diffusion
model, a bit like the artificial Rex of Tjandra et al., 1996 but with
the diffusion model being more complex than the ellipsoid because of
the C-terminus (rather than the artificial Rex when using spherical
diffusion rather than a spheroid or ellipsoid).


> full_hybrid  m1-m5   local_tm   ~ 4510
> full_hybrid  m1-m5   ellipsoid  ~ 4720 *
>
> full_hybrid  m0-m9   local_tm   ~ 4410
> full_hybrid  m0-m9   ellipsoid  ~ 4570 **

So excluding the C-terminal tail fixes this, but still the ellipsoid
is insufficient.
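
To make the comparison concrete, the differences can be tabulated from
the approximate AIC values quoted in this table.  This is only a sketch
in plain Python, not relax code; the dictionary layout and names are
made up for illustration, and the numbers are the rounded (~) values
from the table.

```python
# Approximate AIC values copied from the quoted table (all are "~" values).
# Keys are (approach, model-free model set); this layout is illustrative only.
aic = {
    ("full", "m1-m5"): {"local_tm": 4510, "ellipsoid": 12710},
    ("full", "m0-m9"): {"local_tm": 4410, "ellipsoid": 5210},
    ("full_hybrid", "m1-m5"): {"local_tm": 4510, "ellipsoid": 4720},
    ("full_hybrid", "m0-m9"): {"local_tm": 4410, "ellipsoid": 4570},
}

# AIC(ellipsoid) - AIC(local_tm); a positive value means the local tm
# model has the lower AIC and would be the one selected.
delta = {run: v["ellipsoid"] - v["local_tm"] for run, v in aic.items()}

for (approach, models), d in delta.items():
    print(f"{approach:12s} {models:6s} AIC(ellipsoid) - AIC(local_tm) = {d:+d}")
```

With these rounded values, going from m1-m5 to m0-m9 lowers the
ellipsoid AIC by about 7500 in the full run but the local tm AIC by
only about 100, which is the differential decrease mentioned above.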


> =======================================================================
> *  not converged after 35 rounds (oscillates)
> ** not converged after 26 rounds (oscillates)
>
> As said before, the hybrid improves the description of the diffusion,
> however, there is still a problem : first, the local_tm diffusion is
> still selected over the ellipsoid (even if the difference is now
> smaller), second, the ellipsoid optimizations don't converge and
> oscillate...

The oscillation is not really a problem.  But somehow I (or anyone
who's interested) should try to add a method or algorithm to relax to
detect this oscillatory swinging between different universes around
the 'universal' solution to stop any automated procedures.
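
As a rough illustration of what such a check might look like, here is a
minimal sketch in plain Python.  It is not relax code and uses no relax
functions; the function name and the idea of summarising each round as
a hashable tuple (for example the model selected for each residue, plus
rounded diffusion parameters) are assumptions made for the example.

```python
def classify_iterations(history):
    """Classify an iterative run from its per-round state summaries.

    history: list of hashable states, one per round (hypothetical
    representation, e.g. a tuple of the model selected for each residue).

    Returns ("converged", 1) if a round reproduces the previous round,
    ("oscillating", period) if a round reproduces an earlier, non-adjacent
    round, and ("running", None) if no state has repeated yet.
    """
    first_seen = {}
    for round_index, state in enumerate(history):
        if state in first_seen:
            period = round_index - first_seen[state]
            if period == 1:
                return ("converged", 1)
            return ("oscillating", period)
        first_seen[state] = round_index
    return ("running", None)


# Made-up example: three residues, model selection flips back and forth.
rounds = [
    ("m2", "m4", "m1"),
    ("m2", "m3", "m1"),
    ("m2", "m4", "m1"),
]
print(classify_iterations(rounds))
```

The check only makes sense if the state summary captures everything
that feeds into the next round; floating point parameters would need to
be rounded before being compared, and the rounding precision is a
judgment call.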

The selection of the local tm says one thing - that the most complex
hybrid model using an ellipsoidal core is insufficient.  The diffusion
is probably more complex.  Along these lines, there is one issue which
would cause the diffusion tensor to be more complex than a simple
isolated particle tumbling as an ellipsoid (with no large concerted
internal motions such as inter-domain dynamics).  That is a phenomenon
first investigated in Schurr et al., 1994 at the back of that paper,
and that is partial dimerisation.  If you have 5% dimer in the NMR
tube, even a non-specific dimerisation, then this could cause the
problems you are seeing.  The diffusion tensor would then be a
superposition of 2 very different ellipsoidal diffusion tensors, say
D1 and D2 weighted by the populations p1 and p2 (the isotropic and
spheroid tensors could be simplifications of D1 and/or D2).  Then the
single ellipsoid would be insufficient.  The local tm diffusion model
could absorb p1.D1+p2.D2 into the single tm (very roughly, considering
that each vector then experiences up to 2*5 global correlation times) and
return a slightly better picture of the internal dynamics, and hence
be chosen by AIC model selection.  This has been investigated
elsewhere by looking at concentration dependence, but I can't remember
off the top of my head the references right now.
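
A very simplified numerical illustration of the superposition idea is
sketched below.  It assumes things that are not established for this
system: rigid isotropic tumbling for both species (rather than two
ellipsoids), no interconversion between monomer and dimer on the time
scale of tumbling, and a dimer correlation time of twice the monomer
value, which is just a guess.  The ~13 ns value is the local tm figure
quoted further down in this message and the 5% is the figure used
above.  All function names are made up for the example.

```python
import math


def j_rigid_isotropic(omega, tm):
    """Spectral density of a rigid, isotropically tumbling vector:
    J(omega) = (2/5) * tm / (1 + (omega * tm)^2)."""
    return 0.4 * tm / (1.0 + (omega * tm) ** 2)


def j_mixture(omega, populations, tms):
    """Population-weighted sum of the spectral densities of each species."""
    return sum(p * j_rigid_isotropic(omega, tm) for p, tm in zip(populations, tms))


# Illustrative numbers only (see the assumptions in the text above).
p = (0.95, 0.05)        # monomer and dimer populations
tm = (13e-9, 26e-9)     # correlation times in seconds; the dimer value is a guess

# The single correlation time that reproduces the mixture at zero frequency.
tm_eff = sum(pi * ti for pi, ti in zip(p, tm))

# Approximate 15N Larmor frequency (rad/s) on a 600 MHz (1H) spectrometer.
omega_n = 2.0 * math.pi * 60.8e6

j_mix = j_mixture(omega_n, p, tm)
j_single = j_rigid_isotropic(omega_n, tm_eff)

print(f"tm_eff = {tm_eff * 1e9:.2f} ns")
print(f"J(wN): mixture = {j_mix:.4e}, single tm_eff = {j_single:.4e}")
```

Under these assumptions, a single correlation time chosen to match the
mixture at J(0) does not match it at the 15N frequency.  A misfit of
this kind may be what extra parameters (Rex, ts, or a per-residue tm)
end up absorbing.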

I'm not sure if this would have an effect, but maybe the large
movement of the C-terminus is modulating the diffusion of the core as
well, shifting it away from ellipsoidal behaviour.  There are many
other dynamic events which could cause the full single ellipsoid
equations to be insufficient.  Whatever is happening I think you are
walking on the cutting edge.  The theory you need for your current
data set does not exist, as far as I know, let alone has been properly
tested.

Oh, one other thing that it could be (although I think I remember you
saying that that wasn't the case already) is that a number of
different NMR samples were used, and that the protein and/or salt
concentration was not 100% identical in each.  Although not in your
case, this could also be caused by improper temperature calibration,
i.e. not using MeOH or another temp reference to calibrate different
experiments and different spectrometers, or temperature compensatory
blocks at the start of the R2, or single scan interleaving, etc.


> Now, what about the Rex and slow motions (ts) in the local_tm diffusion
> ? Here are some stats :
>
> =======================================================================
> Approach     Models  Diff       Nb_Rex  Nb_ts
> ===========  ======  =========  ======  =====
> full         m1-m5   local_tm    58      30
> full         m1-m5   ellipsoid  171      21
>
> full         m0-m9   local_tm    63      41
> full         m0-m9   ellipsoid  144      49
>
> full_hybrid  m1-m5   local_tm    58      30
> full_hybrid  m1-m5   ellipsoid  142 *    28
>
> full_hybrid  m0-m9   local_tm    64      41
> full_hybrid  m0-m9   ellipsoid  145 **   50
> =======================================================================
> *  not converged after 35 rounds (oscillates)
> ** not converged after 26 rounds (oscillates)

Maybe the flexible tail is causing your sample to oscillate, swimming
around like a cork-screw, in the NMR tube ;)  Seriously though, the
oscillation isn't a worry but the Rex parameter count is probably
demonstrating that this Rex is artificial, caused by the full
diffusion description being insufficient.


> As you can see, there are way more Rex in the ellipsoid, which probably
> means that there is a problem with the diffusion tensor... For the slow
> ns motions, there doesn't seem to be significantly more in the ellipsoid
> description... Moreover, the sphere diffusion tensor, which is not
> NH-vector-orientation-dependent, also has a high degree of Rex, similar
> ns motions and AIC values similar (just a bit higher) to what is
> observed for the ellipsoid :
>
> =======================================================================
> Approach     Models  Diff       Nb_Rex  Nb_ts  AIC
> ===========  ======  =========  ======  =====  ======
> full         m1-m5   sphere     191      20    ~15200
>
> full         m0-m9   sphere     155      47    ~ 5640
>
> full_hybrid  m1-m5   sphere     145      31    ~ 5190
>
> full_hybrid  m0-m9   sphere     153      47    ~ 5030
> =======================================================================
>
> Should the sphere diffusion tensor yield similar results as the local_tm
> ? If there is a major difference between those two, does it mean that
> concerted motions may be present and that a hybrid model could solve
> the issue ?

No, the sphere should show artificial motions all over the place.  I
could guarantee that for all molecules, nothing tumbles truly as a
sphere!


> Ok...
>
>
> Now, are there concerted motions apparent from the local_tm results..? I
> plotted the results from the local_tm run after AIC model selection
> (Would it be better if I'd look at the local_tm run for model 1 or 2
> only ? Can model selection here bias the results ?) and couldn't find
> any obvious link between different parts of the protein for one or more
> parameters among S2, S2f, S2s, Rex, te, tf, ts, chi2.

I can't remember if anyone has tried to isolate concerted motions from
model-free results.  I wouldn't use the local tm values though as
these simply indicate the shape of the diffusion tensor (well very
roughly considering that the single tm mimics in reality a number of
global correlation times, i.e. 5 in the pure ellipsoid).


> However, a small relation seems to exist for the local_tm distribution
> and the domain (The inverse is seen for the S2, but to a lesser extent.
> When looking at the tm1 run, the local_tm is also a bit smaller in the
> same domain [a small difference of 0.5-1.0 ns for values of ~13 ns], but
> the S2 are similar, which points to a difference for the two domains).
>
> My protein is globular, but has two structural domains side by side, an
> all alpha domain and an alpha/beta domain. In the homologous protein,
> there seems to exist Rex at the interface (which spans a surface of four
> 10 residue beta strands, which is big and is expected to be quite
> rigid). Maybe the two domains are a bit different in my system which
> could cause the problems I encounter. I'll try to assess this by running
> full_analysis runs on the different domains alone...

Because you have 2 domains, I would try the analysis like I did in
Horne et al., 2007 (http://dx.doi.org/10.1016/j.jmb.2007.05.067).
Maybe try a hybrid with the local tm values for the C-terminus, and 2
different diffusion tensors for each domain separately.  Maybe you
have inter-domain motions which would also explain why the single
ellipsoid is insufficient.  As long as the hybrid diffusion model
covers exactly the same spin systems as in all other models it is
compared to, you can construct highly complex models with relax.
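
The requirement that the hybrid covers exactly the same spin systems
can be checked with simple set operations before comparing AIC values.
The following is only a sketch in plain Python, not relax code; the
function name, the component labels and the residue numbers are all
made up for illustration.

```python
def check_hybrid_coverage(hybrid_parts, reference_spins):
    """Check that the hybrid components partition the reference spin set.

    hybrid_parts: dict mapping a component label to a set of residue numbers.
    reference_spins: set of residue numbers used by the models compared against.

    Returns (missing, extra, overlapping) as sets.  All three being empty
    means the hybrid and the reference models cover the same spins, with
    each spin assigned to exactly one component.
    """
    covered = set()
    overlapping = set()
    for spins in hybrid_parts.values():
        overlapping |= covered & spins   # spins already claimed by another part
        covered |= spins
    missing = reference_spins - covered
    extra = covered - reference_spins
    return missing, extra, overlapping


# Made-up residue ranges, purely for illustration.
reference = set(range(1, 21))
parts = {
    "domain_1_tensor": set(range(1, 9)),
    "domain_2_tensor": set(range(9, 17)),
    "c_terminus_local_tm": set(range(17, 21)),
}
missing, extra, overlapping = check_hybrid_coverage(parts, reference)
print(missing, extra, overlapping)
```

If any of the three sets is non-empty, the AIC values of the hybrid and
of the other models would not be based on the same data.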


> Ok...
>
>
> Well, I'm out of ideas now... If you have any ideas that could help, they
> will be more than welcome !

I think I'm out of ideas too!  For now anyway.


> I hope this discussion can also help other people solving difficulties
> encountered in their analysis or help them get more information out of
> their system...
>
> Thanks a lot once more !
>
> Cheers !
>
>
>
> Sébastien
>
>
> P.S. Again, sorry for the length of the mail...

That's not a problem.  I hope some of my suggestions will be useful.

Regards,

Edward

_______________________________________________
relax (http://nmr-relax.com)

This is the relax-users mailing list
relax-users@gna.org

To unsubscribe from this list, get a password
reminder, or change your subscription options,
visit the list information page at
https://mail.gna.org/listinfo/relax-users
