Frank,

> Point #1 - fair point; the reason Rfree is popular, though, is because it
> is a *relative* metric, i.e. by now we have a sense of what "good" is. So
> I predict an uphill fight for LLfree.
Why? I don't see any difference. As you say, Rfree is a relative metric, so your sense of what 'good' is relies on comparisons with other Rfrees (i.e. it can only be 'better' or 'worse', not 'good' or 'bad') - but then the same is true of LLfree (note that both assume that exactly the same data were used and that only the model has changed). So when choosing between alternative model parameterisations in order to minimise over-fitting, we compare their Rfrees and choose the lower one - same with LLfree. Or we compare the observed Rfree with the expected Rfree based on Rwork and the obs/param ratio to check for problems with the model - same with LLfree. In fact you can do it better, because the observations in LLfree are weighted in exactly the same way as those in the target function.

> Point #2 would hold if we routinely let our refinements run to
> convergence; seems common though to run "10 cycles" or "50 cycles" instead
> and draw conclusions from the behaviour of the metrics. Are the conclusions
> really much different from the comparison-at-convergence you advocate?
> Which is in practice often less convenient.

You might do 10 cycles for a quick optimisation of the coordinates, but then I wouldn't place much faith in the R factors! How can you draw any conclusions from their behaviour? There's no way of predicting how they will change in further cycles; the only way to find out is to do it. I'm not saying that you need to refine exhaustively on every run - that would be silly, since you don't need to know the correct value of the R factors for every run - but certainly on the final run before PDB submission I would regard stopping the refinement early based on Rfree, as implied in Tim's original posting, as something akin to 'cheating'.

Cheers

-- Ian