It is true that multicopy refinement was essential for the suppression
of Rwork. However, the whole point of the Rfree is that it is supposed
to be independent of the number of parameters you're refining. Simply
throwing multiple copies of the model into the refinement shouldn't have
affected Rfree, IF IT WERE TRULY "FREE".
It was almost certainly NCS-mediated spillover that allowed the
multicopy, parameter-driven reduction in Rwork to pull down the Rfree
values as well. The experiment is probably not worth the time it would
take to do, but I suspect that if MsbA and EmrE test sets had been
chosen in thin shells, then Rfree wouldn't have shown nearly the
"improvement" it did.
Dean
Phil Jeffrey wrote:
While NCS probably played a role in the first crystal form of MsbA (P1,
8 monomers), this is also the one that showed the greatest improvement
in R-free once the structure was correctly redetermined (7% or 14%
depending on which refinement protocols you compare).
The other crystal form of MsbA and the crystal forms of EmrE didn't have
particularly high-copy NCS (2 dimers, 4 monomers, dimer, 2 tetramers)
and the R-frees were somewhat comparable in all cases (31-36% for the
redetermined structures).
The *major* source of the R-free suppression in all these cases was the
inappropriate use of multi-copy refinement at low resolution.
Phil Jeffrey
Princeton
Dean Madden wrote:
Hi Dirk,
I disagree with your final sentence. Even if you don't apply NCS
restraints/constraints during refinement, there is a serious risk of
NCS "contaminating" your Rfree. Consider the limiting case in which
the "NCS" is produced simply by working in an artificially low
symmetry space-group (e.g. P1, when the true symmetry is P2): in this
case, putting one symmetry mate in the Rfree set, and one in the Rwork
set will guarantee that Rfree tracks Rwork. The same effect applies to
a large extent even if the NCS is not crystallographic.
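The limiting case above can be illustrated with a toy calculation. The sketch below is not taken from any refinement program: the index grid, the 5% test fraction, and the helper names (`p2_mate`, `leaked_fraction`, `free_mask_by_class`) are all assumptions made for illustration. It treats P1 data whose true symmetry is P2 (unique axis b, so (h,k,l) and (-h,k,-l) are equivalent) and counts how many randomly chosen free reflections have their symmetry mate in the working set.

```python
import numpy as np

def p2_mate(hkl):
    """Return the P2 (unique axis b) mate (-h, k, -l) of each index."""
    return hkl * np.array([-1, 1, -1])

def leaked_fraction(hkl, free_mask):
    """Fraction of free reflections whose P2 mate is in the working set."""
    lookup = {tuple(i): f for i, f in zip(hkl.tolist(), free_mask.tolist())}
    free_hkl = hkl[free_mask]
    leaked = counted = 0
    for idx, mate in zip(free_hkl.tolist(), p2_mate(free_hkl).tolist()):
        mate = tuple(mate)
        if mate == tuple(idx) or mate not in lookup:
            continue  # self-related (h = l = 0) or mate not measured
        counted += 1
        if not lookup[mate]:
            leaked += 1
    return leaked / counted if counted else 0.0

def free_mask_by_class(hkl, fraction, rng):
    """Assign free flags per symmetry class so that mates share a flag."""
    mates = p2_mate(hkl)
    keys = [max(tuple(a), tuple(b))
            for a, b in zip(hkl.tolist(), mates.tolist())]
    classes = sorted(set(keys))
    picks = (rng.random(len(classes)) < fraction).tolist()
    class_free = dict(zip(classes, picks))
    return np.array([class_free[k] for k in keys])

# Hypothetical toy data set: k > 0 only, so every mate is in the list.
hkl = np.array([(h, k, l)
                for h in range(-10, 11)
                for k in range(1, 11)
                for l in range(-10, 11)])
rng = np.random.default_rng(0)

random_mask = rng.random(len(hkl)) < 0.05           # usual random test set
class_mask = free_mask_by_class(hkl, 0.05, rng)     # mates kept together

print("random selection :", leaked_fraction(hkl, random_mask))
print("per-class flags  :", leaked_fraction(hkl, class_mask))
```

With a random 5% test set, nearly every free reflection has its mate in the working set, which is why Rfree has to track Rwork in this case. Flagging whole symmetry classes removes the overlap in the exact-symmetry case only; it says nothing about how much coupling remains for true, non-crystallographic NCS.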
Bottom line: thin shells are not a perfect solution, but if NCS is
present, choosing the free set randomly is *never* a better choice,
and almost always significantly worse. Together with multicopy
refinement, randomly chosen test sets were almost certainly a major
contributor to the spuriously good Rfree values associated with the
retracted MsbA and EmrE structures.
Best wishes,
Dean
Dirk Kostrewa wrote:
Dear CCP4ers,
I'm not convinced that thin shells are sufficient: I think, in
principle, one should omit thick shells (greater than the diameter of
the G-function of the molecule/assembly that is used to describe
NCS-interactions in reciprocal space), and use the inner thin layer
of these thick shells, because only those should be completely
independent of any working set reflections. But this would be too
"expensive" given the low number of observed reflections that one
usually has ...
However, if you don't apply NCS restraints/constraints, there is no
need for any such precautions.
Best regards,
Dirk.
On 07.02.2008 at 16:35, Doug Ohlendorf wrote:
It is important when using NCS that the Rfree reflections be selected in
distributed thin resolution shells. That way application of NCS should
not mix Rwork and Rfree sets. Normal random selection of Rfree + NCS
(especially 4x or higher) will drive Rfree down unfairly.
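A minimal sketch of what "distributed thin resolution shells" could look like in practice, assuming only a list of d-spacings per reflection. The function name `thin_shell_free_mask` and its parameters are hypothetical and do not correspond to any particular program's option; real tools have their own shell-selection logic.

```python
import numpy as np

def thin_shell_free_mask(d_spacing, n_shells=20, free_fraction=0.05):
    """Flag a test set made of n_shells thin shells spread over resolution.

    Each shell is a contiguous run of reflections in 1/d order, so the
    free reflections are grouped by resolution rather than scattered.
    """
    s = 1.0 / np.asarray(d_spacing, dtype=float)   # reciprocal resolution
    order = np.argsort(s)                          # low to high resolution
    n = len(s)
    per_shell = max(1, int(round(n * free_fraction / n_shells)))
    # Shell centres spaced evenly through the resolution-sorted list.
    centres = ((np.arange(n_shells) + 0.5) * n / n_shells).astype(int)
    free = np.zeros(n, dtype=bool)
    for c in centres:
        start = max(0, c - per_shell // 2)
        free[order[start:start + per_shell]] = True
    return free

# Hypothetical data: 10000 reflections between 20 A and 2 A.
rng = np.random.default_rng(1)
d = 1.0 / rng.uniform(1.0 / 20.0, 1.0 / 2.0, size=10000)
free = thin_shell_free_mask(d)
print(free.sum(), "free reflections in 20 thin shells")
```

The sketch only shows the selection step. Whether shells this thin are wide enough to decouple the two sets is a separate question that depends on the NCS and is not addressed here.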
Doug Ohlendorf
-----Original Message-----
From: CCP4 bulletin board [EMAIL PROTECTED] On Behalf Of Eleanor Dodson
Sent: Tuesday, February 05, 2008 3:38 AM
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] an over refined structure
I agree that the difference in Rwork to Rfree is quite acceptable at
your resolution. You cannot/should not use R-factors as a criterion
for structure correctness.
As Ian points out - choosing a different Rfree set of reflections
can change Rfree a good deal.
Certain NCS operators can relate reflections exactly, making it hard
to get a truly independent Free R set, and there are other reasons
that make it a blunt-edged tool.
The map is the best validator - are there blobs still not fitted?
(maybe side chains you have placed wrongly..) Are there many
positive or negative peaks in the difference map? How well does the
NCS match the 2 molecules?
etc etc.
Eleanor
George M. Sheldrick wrote:
Dear Sun,
If we take Ian's formula for the ratio of R(free) to R(work) from
his paper Acta D56 (2000) 442-450 and make some reasonable
approximations, we can reformulate it as:
R(free)/R(work) = sqrt[(1+Q)/(1-Q)] with Q = 0.025pd^3(1-s)
where s is the fractional solvent content, d is the resolution, p is
the effective number of parameters refined per atom after allowing for
the restraints applied, d^3 means d cubed and sqrt means square root.
The difficult number to estimate is p. It would be 4 for an
isotropic refinement without any restraints. I guess that p=1.5
might be an appropriate value for a typical protein refinement
(giving an R-factor ratio of about 1.4 for s=0.6 and d=2.8). In that
case, your R-factor ratio of 0.277/0.215 = 1.29 is well within the
allowed range!
However, it should be added that this formula is almost a
self-fulfilling prophecy. If we relax the geometric restraints we
increase p, which then leads to a larger 'allowed' R-factor ratio!
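For anyone who wants to try other values, the approximate formula quoted above is easy to evaluate. This is just a sketch of that expression as written in this message; the function name `expected_r_ratio` is made up for illustration, and p remains a guess exactly as described above.

```python
import math

def expected_r_ratio(p, d, s):
    """R(free)/R(work) = sqrt[(1+Q)/(1-Q)] with Q = 0.025 p d^3 (1-s).

    p: effective parameters refined per atom (after restraints)
    d: resolution in Angstrom
    s: fractional solvent content
    """
    q = 0.025 * p * d**3 * (1.0 - s)
    if not 0.0 <= q < 1.0:
        raise ValueError("Q = %.3f is outside [0, 1); ratio undefined" % q)
    return math.sqrt((1.0 + q) / (1.0 - q))

# Values quoted in the message: p = 1.5, d = 2.8, s = 0.6.
allowed = expected_r_ratio(p=1.5, d=2.8, s=0.6)
observed = 0.277 / 0.215
print("allowed ratio  ~ %.2f" % allowed)    # about 1.4
print("observed ratio = %.2f" % observed)   # 1.29

# Raising p (looser restraints) raises the 'allowed' ratio.
print("p = 4.0 gives  ~ %.2f" % expected_r_ratio(p=4.0, d=2.8, s=0.6))
```

The last line shows the circularity mentioned above: the allowed ratio grows with p, so the formula cannot by itself say whether the restraints are too loose.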
Best wishes, George
Prof. George M. Sheldrick FRS
Dept. Structural Chemistry,
University of Goettingen,
Tammannstr. 4,
D37077 Goettingen, Germany
Tel. +49-551-39-3021 or -3068
Fax. +49-551-39-2582
*******************************************************
Dirk Kostrewa
Gene Center, A 5.07
Ludwig-Maximilians-University
Feodor-Lynen-Str. 25
81377 Munich
Germany
Phone: +49-89-2180-76845
Fax: +49-89-2180-76999
E-mail: [EMAIL PROTECTED]
*******************************************************
--
Dean R. Madden, Ph.D.
Department of Biochemistry
Dartmouth Medical School
7200 Vail Building
Hanover, NH 03755-3844 USA
tel: +1 (603) 650-1164
fax: +1 (603) 650-1128
e-mail: [EMAIL PROTECTED]