On Thu, 30 Jun 2022 at 08:19, Johan Wallerstein <[email protected]> wrote:
>
> Hi,
>
> I perform CPMG-RD cluster fitting using relax, where a cluster refers to a
> group of several residues (between 3 and 14), on data from a 45 kDa protein.
> The software is a good tool for this analysis. I marginally adjust the
> core protocol with the header
>
>
>
> """Script for performing a full relaxation dispersion analysis using
> CPMG-type data."""
>
> I use only the CR72-model and I have a PRE_RUN_DIR from a run with individual
> residues. I use duplicates for error estimation, on both the 800 MHz and 900
> MHz data set, and AIC for model selection.
> When I analyse the clustered data I’m curious to get R2eff_(back_calc) for
> each data point. I clarify my main question by attaching some of my data.
>
> ###########
>
> For residue 530, when I do an individual fit I get this output.
>
> From the log-file:
>
> ———
>
> The spin cluster [':530@N'].
> # Data pipe            Num_params_(k)    Num_data_sets_(n)    Chi2        Criterion
> No Rex - relax_disp    2                 25                   21.11216    25.11216
> CR72 - relax_disp      5                 25                   13.93686    23.93686
> The model from the data pipe 'CR72 - relax_disp' has been selected.
>
> ———
>
> The file ‘disp_530_N.out’ in /final gives the following data table:
>
> # Experiment_name  Field_strength_(MHz)  Disp_point_(Hz)  R2eff_(measured)    R2eff_(back_calc)   R2eff_errors
> 'SQ CPMG'          799.870000000         25.000000        17.523783179912268  16.953711340740483  0.831932502443187
> 'SQ CPMG'          799.870000000         50.000000        16.513029763549930  16.914478241596726  0.805586049587058
> 'SQ CPMG'          799.870000000         75.000000        16.920353186819355  16.875245142453196  0.816049323427317
> 'SQ CPMG'          799.870000000         100.000000       16.667402888129434  16.836012043882192  0.809527349094067
> 'SQ CPMG'          799.870000000         150.000000       16.454146002323920  16.757546676539960  0.804090431533660
> 'SQ CPMG'          799.870000000         200.000000       16.359623786385509  16.679111600438773  0.801698521274394
> 'SQ CPMG'          799.870000000         300.000000       15.525257427659495  16.523477804972345  0.781054888748662
> 'SQ CPMG'          799.870000000         350.000000       16.609858567997016  16.447662190184474  0.808054742944598
> 'SQ CPMG'          799.870000000         400.000000       16.844330710216166  16.374401478154368  0.814080812205130
> 'SQ CPMG'          799.870000000         500.000000       17.414128601521103  16.238705811895670  0.829011905615397
> 'SQ CPMG'          799.870000000         600.000000       16.093980388806685  16.120475644003818  0.795034804815920
> 'SQ CPMG'          799.870000000         800.000000       15.988036247232372  15.937187687218284  0.792401090807446
> 'SQ CPMG'          799.870000000         1000.000000      15.732649459437805  15.811741022120714  0.786107934589661
> 'SQ CPMG'          900.130000000         57.000000        19.386713898811351  20.163621643615215  0.801212497068354
> 'SQ CPMG'          900.130000000         114.000000       21.873502893081564  20.050473540803750  0.859660006101508
> 'SQ CPMG'          900.130000000         171.000000       19.133628964210569  19.937331394199191  0.795598311646227
> 'SQ CPMG'          900.130000000         228.000000       20.497316023709256  19.824330722189416  0.826567798566107
> 'SQ CPMG'          900.130000000         285.000000       20.091262254550443  19.712140304427066  0.817160298225920
> 'SQ CPMG'          900.130000000         400.000000       19.177817248045365  19.494278567900892  0.796574222459005
> 'SQ CPMG'          900.130000000         514.000000       19.111643299707755  19.300194513689348  0.795113430566997
> 'SQ CPMG'          900.130000000         628.000000       18.432363807026835  19.135138271300775  0.780352695478047
> 'SQ CPMG'          900.130000000         742.000000       19.383070346051138  18.999531230125285  0.801131245976946
> 'SQ CPMG'          900.130000000         857.000000       18.560791856291317  18.889165645990943  0.783110910382522
> 'SQ CPMG'          900.130000000         971.000000       18.810639108776328  18.801416118812085  0.788520121686263
> 'SQ CPMG'          900.130000000         1085.000000      18.943973311789268  18.730884832131551  0.791430360141496
>
> ###########
>
> For a cluster fit (including residue 530) I get this output from the log-file:
>
> ———
>
> The spin cluster [':530@N', ':536@N', ':537@N', ':538@N', ':550@N', ':551@N',
> ':552@N'].
> # Data pipe            Num_params_(k)    Num_data_sets_(n)    Chi2         Criterion
> No Rex - relax_disp    14                175                  458.66116    486.66116
> CR72 - relax_disp      23                175                  117.29418    163.29418
> The model from the data pipe 'CR72 - relax_disp' has been selected.
> ———
This looks reasonable. With 7 spins in the cluster, the average
chi-squared per spin is 117.29/7 = 16.76, which is only a little more
than the single-spin value of 13.94.
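
The arithmetic above can be checked with a minimal Python sketch. The numbers are copied from the two quoted model-selection tables. The chi2 + 2k form of the criterion is inferred from those tables (it reproduces every value in the Criterion column), and the reduced chi-squared, chi2/(n - k), is an extra check added here as an assumption, not something relax printed.

```python
# Values copied from the two model-selection tables quoted above (CR72 rows).
fits = {
    "single spin :530@N": {"k": 5, "n": 25, "chi2": 13.93686, "spins": 1},
    "cluster of 7 spins": {"k": 23, "n": 175, "chi2": 117.29418, "spins": 7},
}


def aic(chi2, k):
    """Criterion as it appears in the quoted tables: chi2 + 2k."""
    return chi2 + 2 * k


def summarise(fit):
    """Return the criterion plus two normalised chi-squared values."""
    return {
        "aic": aic(fit["chi2"], fit["k"]),
        "chi2_per_spin": fit["chi2"] / fit["spins"],
        # Assumption: degrees of freedom taken as n - k.
        "reduced_chi2": fit["chi2"] / (fit["n"] - fit["k"]),
    }


if __name__ == "__main__":
    for name, fit in fits.items():
        s = summarise(fit)
        print(f"{name}: criterion={s['aic']:.5f}, "
              f"chi2/spin={s['chi2_per_spin']:.2f}, "
              f"reduced chi2={s['reduced_chi2']:.3f}")
```

On either measure the clustered fit is only slightly worse per data point than the individual fit.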
> But there is no corresponding data table.
Do you mean that there is no ‘disp_530_N.out’ file for the clustered analysis?
> ###########
>
> QUESTION 1:
> Is it possible to get, or easily create, a table with the (in my case) 175
> R2eff_(back_calc) values for the cluster, so that I can see how the
> Chi2 = 117.29418 above breaks down?
> I could then also study how a single residue affects the cluster fitting.
Try the value.write() user function:
https://www.nmr-relax.com/manual/value_write.html
Make sure to set the 'bc' flag to True, so that the back-calculated
values are written to the file rather than the measured ones.
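
Once you have the measured and back-calculated values side by side, the chi-squared can be broken down per data point. Here is a minimal standalone Python sketch, assuming a whitespace-separated table laid out like your 'disp_530_N.out' (the first three rows are reused as sample data, each rejoined onto one line). The function name chi2_terms is hypothetical, and a file written by value.write() may use different columns, so the unpacking line would need adapting.

```python
import shlex

# First three rows of the residue 530 table quoted above, one row per line.
SAMPLE = """\
# Experiment_name Field_strength_(MHz) Disp_point_(Hz) R2eff_(measured) R2eff_(back_calc) R2eff_errors
'SQ CPMG' 799.870000000 25.000000 17.523783179912268 16.953711340740483 0.831932502443187
'SQ CPMG' 799.870000000 50.000000 16.513029763549930 16.914478241596726 0.805586049587058
'SQ CPMG' 799.870000000 75.000000 16.920353186819355 16.875245142453196 0.816049323427317
"""


def chi2_terms(text):
    """Return (experiment, field_MHz, disp_point_Hz, chi2_term) per row."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # Skip blank lines and the header.
        # shlex keeps the quoted experiment name together as one field.
        name, field, point, meas, bc, err = shlex.split(line)
        term = ((float(meas) - float(bc)) / float(err)) ** 2
        rows.append((name, float(field), float(point), term))
    return rows


if __name__ == "__main__":
    terms = chi2_terms(SAMPLE)
    for name, field, point, term in terms:
        print(f"{name} {field:.2f} MHz {point:7.1f} Hz  chi2 term = {term:.4f}")
    print(f"Sum over these rows: {sum(t[-1] for t in terms):.4f}")
```

Summing the terms per spin over the whole cluster should show which residues contribute most to the total of 117.29418.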
> QUESTION 2:
> Are there any references to methods for efficiently selecting the
> residues to include in a cluster? There is obviously an immense number of
> ways to combine residues into clusters in a normal-sized protein. I
> am considering writing a program/script for this process and would be
> curious to get some inspiration.
As far as I am aware, human judgement is used for this process. You
identify a rigid moving unit in your system yourself, one whose spins
show similar dispersion results, and then use clustering on that. I
would assume that an automated system for finding clusters would be
computationally very expensive, even when run on a computer cluster
via MPI, and that such a project would take up half or more of a PhD
student's time. Then again, I wouldn't be surprised if there is now a
publication exploring this concept. If you do find one, I'd be
interested in hearing about it.
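
To give a feel for the size of the search space, here is a minimal Python sketch that counts the candidate clusters an exhaustive search would have to consider. The cluster size range of 3 to 14 comes from your message. The residue counts in the loop are hypothetical, and count_clusters is just an illustrative name. Each candidate would need its own full clustered fit.

```python
from math import comb


def count_clusters(n_residues, min_size=3, max_size=14):
    """Count the distinct residue subsets with min_size..max_size members."""
    max_size = min(max_size, n_residues)
    return sum(comb(n_residues, k) for k in range(min_size, max_size + 1))


if __name__ == "__main__":
    # Hypothetical numbers of residues showing dispersion.
    for n in (20, 50, 100):
        print(f"{n} residues: {count_clusters(n):.3e} candidate clusters")
```

Any automated approach would therefore need some way of pruning the candidates, for example by only grouping residues with similar individually fitted parameters, rather than enumerating them all.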
Regards,
Edward