I do agree with Tim's reasoning in general, but, as Pavel also implied by offering the statistics, I would be worried not about the difference but about the unreasonably high absolute value of Rfree for 2.0 A resolution.

I do not think that it's simple 'over-fitting', and my worry would not be just the very large gap between R and Rfree, but simply the high Rfree. I would more or less bet that if you check your model on the MolProbity server (http://molprobity.biochem.duke.edu/) it will also have appalling geometry scores (*).

Here is what I would suspect and what I would do to check what is wrong:

-very incomplete or badly built model: validate in MolProbity, and then rebuild (Coot, obviously ...).

While doing this, check the following, which just takes computer time, not yours:

-twinning: try any detwinning tool; I would simply switch on twin refinement in REFMAC, sit back, and read the log carefully.
-wrong space group: Try the Zanuda server: 
http://www.ysbl.york.ac.uk/YSBLPrograms/index.jsp

regards -

Tassos

(*) On the other hand, I did bet on Germany winning against Spain, so my betting skills are worse than those of a cephalopod mollusk named 'Paul':
http://en.wikipedia.org/wiki/Paul_the_Octopus


PS When you use TLS, the average B will always go down, since a lot of the movement is 'absorbed' by the TLS tensors that describe domain movement; what is in the PDB file is just the 'residual' B that cannot be explained by domain motion.
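
To make the bookkeeping in the PS explicit, here is a rough sketch of how the per-atom B factor splits under TLS. The symbols are illustrative notation, not taken from the thread: U_TLS(i) is the displacement tensor that the TLS group predicts at atom i, and the factor 8*pi^2/3 is the usual conversion from the trace of an anisotropic U to an equivalent isotropic B.

```latex
B_{\mathrm{total}}(i) = B_{\mathrm{TLS}}(i) + B_{\mathrm{residual}}(i),
\qquad
B_{\mathrm{TLS}}(i) = \frac{8\pi^{2}}{3}\,\operatorname{tr} U_{\mathrm{TLS}}(i)
```

Under this reading, an average B that drops from about 50 to about 30 after TLS refinement does not by itself mean the atoms are better ordered; the missing part may simply have moved into the TLS term.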

On Jul 8, 2010, at 8:32, Tim Gruene wrote:

Dear Sampath,

You are right, the gap between R and Rfree is significant and indicates that
your model was overfitted.
Without knowing your data or your model, some reasons for overfitting might be:
- you used automated placement of water molecules (e.g. through arpwaters or in
  coot) and never checked the water molecules for chemical reasonability. How
  many residues are there in your structure and how many water molecules?
- there might be a domain that - despite the resolution - does not resolve with
  your data but you built something nevertheless
- you built things into your model while using too low a sigma level (approx.
  <1.0 sigma) for your map. At too low a contour level you can often see what
  you _want_ to see, in my experience, and not what is there
- you screwed up the Rfree set and it's not independent anymore. However, in
  that case I would rather expect the difference or the ratio to be too small
  rather than too big.
- your data may be twinned.
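
For the first point, the residue and water counts can be read straight off the coordinate file. The following is only a minimal sketch, assuming a standard fixed-column PDB file with waters named HOH (or WAT); the function name count_residues_and_waters is made up for this illustration and is not part of any crystallographic package.

```python
def count_residues_and_waters(pdb_lines):
    """Count non-water residues and water molecules in PDB-format lines.

    Assumes fixed-column PDB records; multi-model files are not handled
    (identical residue IDs in different models are counted once).
    """
    residues = set()
    waters = set()
    for line in pdb_lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue
        res_name = line[17:20].strip()  # columns 18-20: residue name
        # column 22: chain ID; columns 23-27: residue number + insertion code
        res_id = (line[21:22], line[22:27].strip())
        if res_name in ("HOH", "WAT"):
            waters.add(res_id)
        elif line.startswith("ATOM"):
            # HETATM ligands and ions are deliberately left out of the count
            residues.add(res_id)
    return len(residues), len(waters)


# Usage (the file name is a placeholder):
#   with open("model.pdb") as fh:
#       n_res, n_wat = count_residues_and_waters(fh)
#       print(n_res, "residues,", n_wat, "waters")
```

What counts as too many waters per residue depends on the resolution and is not stated in this thread; the counts are only a starting point for inspecting the waters in the map.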

That's just a first set of reasons but there might be something one could only
know by looking at your data.

Tim



On Wed, Jul 07, 2010 at 10:17:14PM -0700, Sampath Natarajan wrote:
Dear all,

I have a question about the Rfree value. I refined a structure at 2 A resolution. After model building and restrained refinement with the Refmac program, the average B factor was around 50 for all atoms and R/Rfree were around 22/34. I then used TLS refinement, choosing the entire molecule as the TLS group. R/Rfree dropped to 20/32, but the average B factor also dropped, to 30. The
R/Rfree difference is about 12% in the final refinement. I feel it is
significantly too high.

Could anyone suggest how to reduce the Rfree value further? Or is it acceptable
to submit the data to the PDB with this 12% difference?

Thanks for the suggestions.

Sincerely,
Sampath N

--
Tim Gruene
Institut fuer anorganische Chemie
Tammannstr. 4
D-37077 Goettingen

GPG Key ID = A46BEE1A


Anastassis (Tassos) Perrakis, Principal Investigator / Staff Member
Department of Biochemistry (B8)
Netherlands Cancer Institute,
Dept. B8, 1066 CX Amsterdam, The Netherlands
Tel: +31 20 512 1951 Fax: +31 20 512 1954 Mobile / SMS: +31 6 28 597791



