Hi Ian
(Yes, your technical point about semantics is correct, I meant
over-fitting.)
To pin down your points, though, you're saying:
1) Don't use Rfree, instead look at LLfree or Hamilton Rfree.
2) Compare only the final values at convergence when choosing
between different parametrizations (=models).
Point #1 - fair point; the reason Rfree is popular, though, is that
it is a /relative/ metric, i.e. by now we have a sense of what "good"
is. So I predict an uphill fight for LLfree.
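For concreteness, a minimal sketch of the two free-set metrics (function names are mine; the Gaussian form of LLfree is a deliberate simplification, since real ML refinement targets use Rice/Wilson-type distributions):

```python
import math

def r_free(f_obs, f_calc):
    # Conventional Rfree over the test set: every reflection counts
    # equally, regardless of how well it was measured.
    return sum(abs(fo - fc) for fo, fc in zip(f_obs, f_calc)) / sum(f_obs)

def ll_free(f_obs, f_calc, sigmas):
    # Sketch of a free log-likelihood: each reflection's contribution
    # is weighted by its measurement error sigma.  A plain Gaussian is
    # assumed here purely for illustration.
    return -0.5 * sum(((fo - fc) / s) ** 2 + math.log(2 * math.pi * s * s)
                      for fo, fc, s in zip(f_obs, f_calc, sigmas))
```

Note that r_free is dimensionless and so directly comparable across structures (the "relative" property above), whereas the absolute value of ll_free depends on the data set itself.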
Point #2 would hold if we routinely let our refinements run to
convergence; it seems common, though, to run "10 cycles" or "50 cycles"
instead and draw conclusions from the behaviour of the metrics. Are
those conclusions really much different from the comparison-at-convergence
you advocate, which in practice is often less convenient?
Cheers
phx
> Isn't this the purpose of cross-validation, to use an independent
> measure to judge when the refinement is /not/ producing the "best" model?
If the value of your chosen X-validation metric at convergence
indicates a problem with the model, parameterisation, weighting etc.
then clearly the target function is not in fact the final word: the
solution is to fix whatever is wrong and redo the refinement,
until you get a satisfactory value for your metric.
> This may be true; but as it is independent of refinement, is it
> not nevertheless the only measure I should trust?
No, there are several possible functions of the test set (e.g. Hamilton
Rfree, LLfree) that you could use, all potentially equally valid
X-validation metrics. I would have more faith in a function such as
LLfree in which the contributions of the reflections are at least
weighted according to their reliability. It just seems bizarre that
important decisions are being based on measurements that may have
gross errors without taking those errors into account.
> Or maybe what you intended to say: only trust refinements for
> which Rfree decreases monotonically, because only then do you have
> a valid choice of parameters.
No, as I indicated above, what Rfree does before convergence is
attained is totally meaningless; only the value obtained _at_
convergence is meaningful as an X-validation statistic. We wouldn't be
having this discussion if the refinement program omitted the
meaningless intermediate values and only printed out the final Rfree
or LLfree. I'm saying that Rfree is not the best X-validation metric
because poorly measured data are not properly weighted: this is what
the Acta paper I referenced is saying.
Cheers
-- Ian