> Selecting a test set that minimizes Rfree is so wrong on so many levels.
> Unless, of course, the only thing I know about Rfree is that it is the
> magic number that I need to make small by all means necessary.

Using a simple genetic algorithm, I managed to get Rfree for a
well-refined model as low as 14.6% and as high as 19.1%.  The dataset is
not too small (~40,000 reflections in all, with a standard-sized 5% test
set).  So you can get a spread as wide as 4.5% even with a not-so-small
dataset.  Only ~1/3 of the test reflections need to be exchanged to
achieve this.
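For anyone curious what such an "optimization" looks like, here is a toy sketch of the idea. The per-reflection residuals are simulated stand-ins (a real run would recompute Rfree with an actual refinement program after each swap), and every name and parameter below is hypothetical, not the code I actually used:

```python
import random

def make_rfree_fn(n, seed=0):
    # Stand-in for a real refinement program: give each reflection a
    # fixed "residual"; the toy Rfree is the mean residual over the
    # chosen test set.  In reality you would rerun refinement and read
    # the reported Rfree.
    rng = random.Random(seed)
    resid = [rng.gauss(0.17, 0.04) for _ in range(n)]
    return lambda test_idx: sum(resid[i] for i in test_idx) / len(test_idx)

def ga_minimize_rfree(n_refl, rfree, test_frac=0.05, pop=20, gens=50,
                      n_swaps=2, seed=1):
    """Toy GA: an individual is a list of test-set indices; mutation
    swaps a few reflections between the test and working sets."""
    rng = random.Random(seed)
    k = int(n_refl * test_frac)

    def mutate(ind):
        ind = list(ind)
        outside = list(set(range(n_refl)) - set(ind))
        for _ in range(n_swaps):
            i = rng.randrange(k)
            j = rng.randrange(len(outside))
            ind[i], outside[j] = outside[j], ind[i]
        return ind

    popu = [rng.sample(range(n_refl), k) for _ in range(pop)]
    for _ in range(gens):
        popu.sort(key=rfree)          # fitter = lower Rfree
        popu = popu[:pop // 2]        # elitist selection: keep top half
        popu += [mutate(rng.choice(popu)) for _ in range(pop - len(popu))]
    popu.sort(key=rfree)
    return popu[0], rfree(popu[0])

n = 2000
rf = make_rfree_fn(n)
random_set = random.Random(2).sample(range(n), int(n * 0.05))
best, best_r = ga_minimize_rfree(n, rf)
print(f"random test set Rfree: {rf(random_set):.3f}")
print(f"GA-minimized   Rfree: {best_r:.3f}")
```

To "optimize" upward instead, sort by `-rfree` in the selection step. The point the toy makes is the same as the post's: the optimizer only needs to exchange a modest fraction of reflections to shift the statistic substantially.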

What's curious is that, contrary to my expectations, the test set
remains well distributed across resolution shells after this awful
"optimization", and the <F/sigF> values for the working and test sets
remain close.  I'm not sure how to judge which model is actually better,
but it's noteworthy that the FOM gets worse for *both* upward and
downward "optimization" of the test set.


-- 
After much deep and profound brain things inside my head, 
I have decided to thank you for bringing peace to our home.
                                    Julian, King of Lemurs
