Re: [ccp4bb] bond lengths, angles, ideality and refinements

Anastassis Perrakis Wed, 09 Jan 2008 11:48:52 -0800

I would only like to reiterate a small comment I posted before:

Should the cell parameters be inaccurate, optimization of weights by cross-validation (getting the best Rfree) will result in a 'higher' RMSD. It is easy to think about it: if a cell is measured to be 1% larger than in reality, all bonds would 'prefer' to be 1% larger than the 'correct' dictionary values, resulting in a higher RMSD to satisfy that, and that structure would have the lowest Rfree because the X-ray data would be fitted better.
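
A minimal sketch of that arithmetic, assuming an isotropic 1% error on every cell edge and a handful of illustrative dictionary bond lengths (the values and the function names are assumptions for this example, not taken from any refinement program):

```python
import math

# Illustrative dictionary bond lengths in Angstrom (assumed values for this sketch).
IDEAL = {"C-N": 1.33, "CA-C": 1.52, "C-O": 1.23, "N-CA": 1.46}

def rmsd_from_ideal(observed, ideal):
    """Root-mean-square deviation of observed bond lengths from the ideal ones."""
    squares = [(observed[name] - ideal[name]) ** 2 for name in ideal]
    return math.sqrt(sum(squares) / len(squares))

def scale_bonds(ideal, cell_error):
    """Bond lengths the X-ray data would 'prefer' if every cell edge is
    overestimated by the fraction cell_error (isotropic error assumed)."""
    return {name: d * (1.0 + cell_error) for name, d in ideal.items()}

preferred = scale_bonds(IDEAL, 0.01)          # cell measured 1% too large
rmsd = rmsd_from_ideal(preferred, IDEAL)
print(f"RMSD bonds from a 1% cell error alone: {rmsd:.4f} A")
```

With these assumed values the cell error alone accounts for an RMSD of roughly 0.014 Å, before any real deviation from ideality is considered.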

I actually think that inaccurate cells are a big source of misery in many refinements. I have found the idea of WhatCheck to actually check your cell by looking at the projection of bond lengths of certain types along the cell axes most useful. I would hardly advocate measuring your cell that way, but going back to your data and looking at the cell again would be worth it.


To make it more fun, cells change during radiation damage, so ...

best regards, Tassos

On 9 Jan 2008, at 20:15, Ian Tickle wrote:

Hi William & others,
Indeed, phenix.refine uses cross-validation to optimise the scaling of the X-ray & B-factor weights. All I did was demonstrate that you can do essentially the same thing as phenix.refine but using Refmac instead. I don't claim to have done anything new, except I modified Refmac to print out the free likelihood and used that as a target function instead of Rfree, as suggested by Gerard Bricogne in Meth. Enzymol. (1997) 276, 361-423. Whatever value of the RMSD (or better the RMS Z-score) comes out of that, you can be sure that it's based purely objectively on the experimental data, not on completely arbitrary and unjustifiable subjective choices, which is what Jaskolski et al. appear to be suggesting. Cross-validation is a well-established methodology in statistics, it's certainly not 'numerology'!
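
The idea of scanning the weight and letting the free data choose can be sketched with a toy model. This is not Refmac or phenix.refine code: `refine` is a hypothetical closed-form stand-in for a refinement run on a single restrained parameter, the noise levels are invented, and the free target is a plain sum of squared residuals (equivalent to a Gaussian log-likelihood with fixed sigma, up to constants) rather than the free likelihood used above.

```python
import random

def refine(work_obs, ideal, weight, sigma_x=0.05, sigma_g=0.02):
    """Closed-form minimum of
         weight * sum((y - x)^2) / sigma_x^2 + (x - ideal)^2 / sigma_g^2
    for one restrained parameter x (hypothetical stand-in for a refinement run)."""
    a = weight * len(work_obs) / sigma_x ** 2   # pull towards the working data
    b = 1.0 / sigma_g ** 2                      # pull towards the dictionary value
    mean_y = sum(work_obs) / len(work_obs)
    return (a * mean_y + b * ideal) / (a + b)

def residual(obs_sets, values):
    """Sum of squared differences between observations and model values."""
    return sum((y - x) ** 2 for obs, x in zip(obs_sets, values) for y in obs)

rng = random.Random(0)                          # fixed seed: repeatable toy data
ideal = 1.50
true = [ideal + rng.gauss(0.0, 0.01) for _ in range(200)]
work = [[t + rng.gauss(0.0, 0.05) for _ in range(3)] for t in true]
free = [[t + rng.gauss(0.0, 0.05)] for t in true]   # never used in refine()

WEIGHTS = [0.01, 0.03, 0.1, 0.3, 1.0, 3.0, 10.0, 100.0]
scan = {}
for w in WEIGHTS:
    model = [refine(obs, ideal, w) for obs in work]
    scan[w] = (residual(work, model), residual(free, model))

best = min(WEIGHTS, key=lambda w: scan[w][1])   # chosen on the free data only
for w in WEIGHTS:
    print(f"weight {w:7.2f}  work {scan[w][0]:.4f}  free {scan[w][1]:.4f}")
print("weight chosen by cross-validation:", best)
```

In this toy the working residual can only improve as the X-ray weight rises, so it cannot be used to choose the weight; the held-out residual is the one that discriminates.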
Of course then you have to come up with some theory to explain the experimental results, i.e. why the RMSD that comes out must always be <= the RMS standard uncertainty, but actually that's not difficult since the RMSD is related to the accuracy and the SU is related to the precision, and on the face of it there's no reason why these should be related at all (as Gerard nicely demonstrated with his dartboard analogy in Leeds!). Jaskolski et al.'s theory that always RMSD = <SU> regardless of resolution just doesn't fit the experimental results, and as every good scientist knows, it only takes one ugly fact to destroy a beautiful theory.
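
A small dartboard-style illustration of why accuracy and precision need not track each other. The throw coordinates and function names below are made up for this sketch and are not a reconstruction of the example given in Leeds.

```python
import math

def rms_from_target(throws, target=(0.0, 0.0)):
    """RMS distance of the throws from the bullseye: a measure of accuracy."""
    squares = [(x - target[0]) ** 2 + (y - target[1]) ** 2 for x, y in throws]
    return math.sqrt(sum(squares) / len(squares))

def rms_about_centroid(throws):
    """RMS distance of the throws from their own mean: a measure of precision."""
    cx = sum(x for x, _ in throws) / len(throws)
    cy = sum(y for _, y in throws) / len(throws)
    return rms_from_target(throws, (cx, cy))

# Tight cluster, well off the bullseye: precise but inaccurate.
player_a = [(2.0, 2.1), (2.1, 1.9), (1.9, 2.0), (2.0, 2.0)]
# Wide spread centred on the bullseye: imprecise, but unbiased.
player_b = [(1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0)]

for name, throws in (("A", player_a), ("B", player_b)):
    print(f"player {name}: RMS from target {rms_from_target(throws):.2f}, "
          f"scatter about own mean {rms_about_centroid(throws):.2f}")
```

Player A has the smaller scatter but the larger distance from the target, so neither number can be inferred from the other.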
As you point out, setting a target value of 0.02 Ang or higher for the RMSD bonds and similarly for the angles, unless you have very high resolution data, will inevitably result in take-up of some fraction of the random experimental errors into the refined parameters, in order to inflate the RMSD/RMSZ's to their target values and reduce Rwork at the expense of Rfree - otherwise known as overfitting! It's not recommended practice to deliberately cause random errors (however small) to be added to your co-ordinates! This is obvious if you think about what happens at low resolution: there's no justification for refining individual xyz & B's, so the optimal procedure is to use constrained refinement with the torsion angles as parameters, or restrained refinement with *very* tight restraints (if that's feasible). Whether you use constrained refinement or its restrained equivalent, it will keep the bond lengths & angles fixed at the initial dictionary values so the RMSD's will be identically zero, or very nearly so, throughout the refinement.
Someone mentioned 'experienced crystallographers': actually since the distinction between RMSD & SU is purely a question of statistics not of crystallography, any crystallographic experience is unlikely to be relevant!
The other question you raised is why Refmac doesn't refine the RMSD's much nearer to zero - this is something I also commented on; also why the Rfree & LLfree plots are so noisy compared with those from CNS & phenix.refine. I think it's to do with rounding errors in the gradient calculation and/or optimisation code. Refmac may be using single precision, whereas phenix.refine may be using double - I'm just guessing, maybe the programmers could comment? This is something I would like to see improved, in order to make cross-validation with Refmac more reliable & useful.
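
To show the size of effect that single versus double precision can have when many small terms are accumulated, here is a generic sketch. It says nothing about what Refmac or phenix.refine actually do internally; the repeated term 0.1 is only a stand-in for many small gradient contributions, and `to_single` is a helper written for this example.

```python
import struct

def to_single(x):
    """Round a Python float (IEEE double) to the nearest IEEE single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def accumulate(terms, single):
    """Sum the terms, rounding every intermediate result to single precision
    when single is True, otherwise working in double precision throughout."""
    total = 0.0
    for t in terms:
        if single:
            total = to_single(total + to_single(t))
        else:
            total += t
    return total

N = 100_000
terms = [0.1] * N            # stand-in for many small contributions to a sum
exact = N / 10               # 10000.0

err_single = abs(accumulate(terms, single=True) - exact)
err_double = abs(accumulate(terms, single=False) - exact)
print(f"single-precision accumulation error: {err_single:.3e}")
print(f"double-precision accumulation error: {err_double:.3e}")
```

The single-precision error is many orders of magnitude larger, which is the kind of noise that could plausibly show up in a gradient or a target-function plot if the guess above is right.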
Cheers

-- Ian
-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of William Scott
Sent: 09 January 2008 17:32
To: William Scott
Cc: ccp4bb@jiscmail.ac.uk
Subject: Re: [ccp4bb] bond lengths, angles, ideality and refinements

Sorry, that should have read

"because the value is established by social consensus, it is thus NOT
guaranteed to be perfectly accurate, ..."

In other words, one can imagine some source of systematic error in
establishing an ideal bond length.  For example, the crystal packing
environment of small molecules might tend to distort a bond by a couple
hundredths of an Ångstrom.


William Scott wrote:
Dear Yang Li:


Happy New Year to you, too, (ahead of Feb. 7th).

You certainly owe us no apology; the reverse may not be true.

Your question is an important one, as is what you have written below.
I'm not certain I have a completely satisfactory answer.

The reason is that ideal bond lengths may or may not be "true" in the
sense that the value is established by social consensus, and is thus
guaranteed to be perfectly accurate, even though it may be quite precise.
Because of this, and because of natural deviations from ideality (which
really only become trustworthy observations at extremely high resolution),
a certain amount of "wiggle room" is typically allowed in terms of rmsd.
The more conservative the refinement, the smaller the rmsd from ideality
will be.

Some people believe 0.02 Å deviation from ideality is reasonable, based on
the accuracy of the dictionary values of bond lengths and angles; others
consider that to be "too sloppy" and a way to artificially deflate
Rfactors.

I seem to have detected a tendency in the literature to aim for about 0.01
Å deviation.  The new refinement program phenix.refine, which is supposed
to optimize weighting between X-ray terms and stereochemical constraints
automatically, seems to settle in at quite conservative values, such as
0.005 Å, whereas with refmac, I can't seem to get the geometry any more
ideal than 0.005 Å even if I try to idealize a structure in the absence of
X-ray data.

So, like you, I am a bit confused, and wouldn't mind hearing more from the
experts.

All the best,

Bill






yang li wrote:
Dear All,
      I am very sorry to have involved you in such an insignificant
discussion. I have reached agreement with Prof Gerard; please stop talking
about things beyond science, thanks!
      I read a book today, which said "A refined model should exhibit rms
deviations of no more than 0.02 Å for bond lengths and 4° for bond
angles". I just wonder about the standard for the bond lengths and the
bond angles. I think most of you have read similar words! But maybe I did
not express myself clearly and made some phrasing mistakes.
      Finally, happy new year to you all, though very late!


Sincerely!
Yang Li