Dear Derek,

I suggest you try 10% for the test set. You should still be able to judge the effect of various restraints (or constraints) as long as you keep the same test set. If you switch test sets and re-refine, Rfree might change by as much as 2% for a test set consisting of 200 reflections; see Fig. 6 in A. T. Brunger, Free R value: Cross-validation in crystallography, Methods in Enzym. 277, 366-396 (1997). Using the same test set throughout, however, should allow you to judge the best restraints protocol or weights.
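To make the bookkeeping concrete, here is a minimal sketch (not the procedure of any particular refinement program) of how one might assign each reflection to one of k groups reproducibly. The function name `assign_test_groups`, the choice of k = 10, and the seed are illustrative assumptions; the reflection count of 2300 is taken from Derek's message. Group 0 would serve as the single fixed 10% test set, and rotating through all k groups corresponds to a k-fold scheme.

```python
import random

def assign_test_groups(n_reflections, k=10, seed=0):
    """Return one group label (0..k-1) per reflection.

    Hypothetical helper for illustration only: labels are dealt out
    evenly and then shuffled with a fixed seed, so the same call
    always reproduces the same partition.
    """
    labels = [i % k for i in range(n_reflections)]
    random.Random(seed).shuffle(labels)
    return labels

# 2300 reflections in total (from the original question), k = 10 groups.
labels = assign_test_groups(2300, k=10, seed=0)

# Fixed test set: group 0, roughly 10% of the data.
test_set = [i for i, g in enumerate(labels) if g == 0]
work_set = [i for i, g in enumerate(labels) if g != 0]
print(len(test_set), len(work_set))  # 230 2070

# k-fold idea: each group takes a turn as the test set, and the
# model is re-refined against the remaining reflections each time.
folds = [[i for i, g in enumerate(labels) if g == fold] for fold in range(10)]
```

In practice the flags would be generated once with your refinement package's own tool and stored in the reflection file, so that every protocol being compared sees exactly the same test set.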
Axel

PS: The Methods in Enzym. review also briefly discusses "complete cross-validation".

PPS: For refinement at very low resolution, see also: A. T. Brunger, P. D. Adams, P. Fromme, R. Fromme, M. Levitt, G. F. Schroder. Improving the accuracy of macromolecular structure refinement at 7 Å resolution. Structure 20, 957-966 (2012).

> On Dec 20, 2014, at 1:05 AM, CCP4BB automatic digest system <lists...@jiscmail.ac.uk> wrote:
>
> Date: Fri, 19 Dec 2014 11:18:37 +0000
> From: Derek Logan <derek.lo...@biochemistry.lu.se>
> Subject: Cross-validation when test set is miniscule
>
> Hi everyone,
>
> Right now we have one of those very difficult Rfree situations where it's
> impossible to generate a single meaningful Rfree set. Since we're in a bit of
> a hurry with this structure it would be good if someone could point me in the
> right direction. We have crystals with 1542 non-H atoms in the asymmetric
> unit that diffract to only 3.6 Å in P65, which gives us a whopping 2300
> reflections in total. 5% of this is only about 100 reflections. Luckily the
> protein is only a single point mutation of a wild type that has been solved
> to much better resolution, so we know what it should look like and I simply
> want to investigate the effect of different levels of conservatism in the
> refinement, e.g. NCS in xyz and B, group B-factors, reference model,
> Ramachandran restraints etc. However since the quality criterion for this is
> Rfree I'm not able to do this.
>
> I believe the correct approach is k-fold statistical cross-validation, but
> can someone remind me of the correct way to do this? I've done a bit of
> Googling without finding anything very helpful.
>
> Thanks
> Derek
> ________________________________________________________________________
> Derek Logan                                      tel: +46 46 222 1443
> Associate Professor                              mob: +46 76 8585 707
> Dept. of Biochemistry and Structural Biology     www.cmps.lu.se
> Centre for Molecular Protein Science             www.maxlab.lu.se/crystal
> Lund University, Box 124, 221 00 Lund, Sweden    www.saromics.com

Axel T. Brunger
Investigator, Howard Hughes Medical Institute
Professor and Chair, Dept. of Molecular and Cellular Physiology
Stanford University
Web: http://atbweb.stanford.edu
Email: brun...@stanford.edu
Phone: +1 650-736-1031