I agree with Tim's reasoning in general, but as Pavel also implied by
offering the statistics, I would worry less about the R/Rfree
difference than about the unreasonably high absolute value of Rfree
for 2.0 A resolution. I do not think that it's simple 'over-fitting':
the problem is not just the large gap between R and Rfree, but simply
the high Rfree. I would more or less bet that if you check your model
in the MolProbity server (http://molprobity.biochem.duke.edu/) it will
also have appalling geometry scores(*).
Here is what I would suspect and what I would do to check what is wrong:
- very incomplete or badly built model: validate in MolProbity, and
  then rebuild (Coot, obviously ...).
... while doing this, check the following, which just takes computer
time, not yours:
- twinning: try any de-twinning tool; I would simply switch on twin
  refinement in REFMAC, sit back, and read the log carefully.
- wrong space group: try the Zanuda server:
  http://www.ysbl.york.ac.uk/YSBLPrograms/index.jsp
regards -
Tassos
(*) On the other hand, I did bet on Germany winning against Spain, so
my betting skills are worse than those of a cephalopod mollusk named
'Paul', http://en.wikipedia.org/wiki/Paul_the_Octopus
PS When you use TLS the average B will always go down, since a lot of
the movement is 'absorbed' by the TLS tensors that describe domain
movement; what is in the PDB is just the 'residual' B that cannot be
explained by domain motions.
On Jul 8, 2010, at 8:32, Tim Gruene wrote:
Dear Sampath,
You are right, the gap between R and Rfree is significant and
indicates that your model was overfitted.
Without knowing your data or your model, some reasons for overfitting
might be:
- you used automated placement of water molecules (e.g. through
  arpwaters or in coot) and never checked the water molecules for
  chemical reasonability. How many residues are there in your
  structure and how many water molecules?
- there might be a domain that - despite the resolution - does not
  resolve with your data but you built something nevertheless
- you built things into your model while using too low a sigma-level
  (approx. <1.0 sigma) for your map. At too low a contour level you
  can often see what you _want_ to see, in my experience, and not what
  is there
- you screwed up the Rfree set and it's not independent anymore.
  However, in that case I would rather expect the difference or the
  ratio to be too small rather than too big.
- Your data may be twinned.
That's just a first set of reasons, but there might be something one
could only know by looking at your data.
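
To answer the question in the list above about how many residues and
how many water molecules the model contains, a minimal Python sketch
could look like the following. It assumes a standard fixed-column
PDB-format file; the function name count_residues_and_waters and the
sample lines are made up for illustration, and no particular
water-to-residue ratio is implied to be right or wrong.

```python
def count_residues_and_waters(pdb_lines):
    """Count distinct polymer residues and water molecules in PDB-format lines.

    Assumes fixed-column PDB records: residue name in columns 18-20,
    chain ID in column 22, residue number in columns 23-26 and
    insertion code in column 27.
    """
    residues = set()
    waters = set()
    for line in pdb_lines:
        record = line[:6]
        if record not in ("ATOM  ", "HETATM"):
            continue
        res_name = line[17:20].strip()
        # chain ID, residue number, insertion code identify one residue
        key = (line[21:22], line[22:26], line[26:27])
        if res_name in ("HOH", "WAT"):
            waters.add(key)
        elif record == "ATOM  ":
            residues.add(key)
    return len(residues), len(waters)


# Illustrative sample only, not data from this thread.
sample = [
    "ATOM      1  N   ALA A   1      11.104  13.207   2.100  1.00 20.00           N",
    "ATOM      2  CA  ALA A   1      12.560  13.300   2.200  1.00 20.00           C",
    "ATOM      3  N   GLY A   2      13.000  14.000   3.000  1.00 22.00           N",
    "HETATM    4  O   HOH A 101      20.000  20.000  20.000  1.00 35.00           O",
]

n_res, n_wat = count_residues_and_waters(sample)
print(f"{n_res} residues, {n_wat} waters, {n_wat / n_res:.2f} waters per residue")
```

For a real model, pass the lines of the coordinate file instead, e.g.
count_residues_and_waters(open("model.pdb")), where model.pdb is a
placeholder name.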
Tim
On Wed, Jul 07, 2010 at 10:17:14PM -0700, Sampath Natarajan wrote:
Dear all,
I have a question about the Rfree value. I refined a structure at 2 A
resolution. After model building and restrained refinement with the
Refmac program, the average B factor was around 50 for all atoms and
R/Rfree were around 22/34. I then used TLS refinement, choosing the
entire molecule. R/Rfree dropped to 20/32, but the average B factor
dropped to 30. The R/Rfree difference is about 12% in the final
refinement, which I feel is significantly high.
Could anyone suggest how to reduce the Rfree value further? Or is it
acceptable to submit the data to the PDB with this 12% difference?
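
For reference, the arithmetic behind the numbers quoted above can be
sketched in a few lines of Python. The helper name r_gap_and_ratio is
hypothetical; the values are the approximate percentages given in this
message. The sketch only shows that TLS lowered R and Rfree by the
same amount, so the gap itself did not change.

```python
def r_gap_and_ratio(r_work, r_free):
    """Return (Rfree - R, Rfree / R) for R-factors given in percent."""
    return r_free - r_work, r_free / r_work


# Approximate values quoted in the message, in percent.
stages = {
    "restrained refinement": (22.0, 34.0),
    "after TLS": (20.0, 32.0),
}

for label, (r_work, r_free) in stages.items():
    gap, ratio = r_gap_and_ratio(r_work, r_free)
    print(f"{label}: R={r_work:.0f} Rfree={r_free:.0f} "
          f"gap={gap:.0f} ratio={ratio:.2f}")
```

Both stages give a gap of 12 percentage points, while the Rfree/R
ratio rises slightly because R is smaller after TLS.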
Thanks for the suggestions.
Sincerely,
Sampath N
--
Tim Gruene
Institut fuer anorganische Chemie
Tammannstr. 4
D-37077 Goettingen
GPG Key ID = A46BEE1A
Anastassis (Tassos) Perrakis, Principal Investigator / Staff Member
Department of Biochemistry (B8)
Netherlands Cancer Institute,
Dept. B8, 1066 CX Amsterdam, The Netherlands
Tel: +31 20 512 1951 Fax: +31 20 512 1954 Mobile / SMS: +31 6 28 597791