On Fri, 2011-10-14 at 13:07 -0700, Nat Echols wrote:

> You should enter the statistics for the model and data that you
> actually deposit, not statistics for some other model that you might
> have had at one point but which the PDB will never see.  

If you read my post carefully, you'll see that I never suggested
reporting statistics for one model and depositing the other.

> Not only does refining against R-free make it impossible to verify and
> validate your structure, it also means that any time you or anyone
> else wants to solve an isomorphous structure by MR using your
> structure as a search model, or continue the refinement with
> higher-resolution data, you will be starting with a model that has
> been refined against all reflections.  So any future refinements done
> with that model against isomorphous data are pre-biased, making your
> model potentially useless.

Frankly, I think you are exaggerating the magnitude of model bias in the
situation that I described.  You assume that the refinement will become
severely unstable after tossing in the test reflections.  The rms shift
of the model will vary with resolution etc., but even if it is, say,
half an angstrom, the model hardly becomes useless (and that figure is a
huge overestimate).  And at least in theory, including *all the data*
should make the model more, not less, accurate.

> The benefit of including those extra 5% of data is always minimal 

And so, probably, is the benefit of excluding them once all the steps
that require cross-validation have already been performed.  My thinking
is that excluding data from analysis should always be justified (and in
the initial stages of refinement it might be, as it prevents
overfitting), not the other way around.

Cheers,

Ed.

-- 
"Hurry up before we all come back to our senses!"
                           Julian, King of Lemurs
