Hi Ian

(Yes, your technical point about semantics is correct, I meant over-fitting.)

To pin down your points, though, you're saying:
    1) Don't use Rfree; instead look at LLfree or the Hamilton Rfree.
    2) Compare only the final values at convergence when choosing between different parametrizations (= models).

Point #1 - a fair point; the reason Rfree is popular, though, is that it is a /relative/ metric, i.e. by now we have a sense of what "good" is. So I predict an uphill fight for LLfree.

Point #2 would hold if we routinely let our refinements run to convergence; it seems common, though, to run "10 cycles" or "50 cycles" instead and draw conclusions from the behaviour of the metrics. Are the conclusions really much different from the comparison-at-convergence you advocate, which is in practice often less convenient?

Cheers
phx

> Isn't this the purpose of cross-validation, to use an independent measure to judge when the refinement is /not/ producing the "best" model?

If the value of your chosen X-validation metric at convergence indicates a problem with the model, parameterisation, weighting, etc., then the target function is indeed not the final word: the solution is to fix whatever is wrong and redo the refinement until you get a satisfactory value for your metric.


    This may be true;  but as it is independent of refinement, is it
    not nevertheless the only measure I should trust?

No, there are several possible functions of the test set (e.g. Hamilton Rfree, LLfree) that you could use, all potentially equally valid X-validation metrics. I would have more faith in a function such as LLfree, in which the contributions of the reflections are at least weighted according to their reliability. It just seems bizarre that important decisions are being based on measurements that may have gross errors, without taking those errors into account.
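The contrast can be sketched numerically. Below is a minimal, hypothetical illustration (all numbers invented): Rfree gives every test reflection equal say, whereas a sigma-weighted Gaussian log-likelihood over the test set down-weights poorly measured reflections. This toy Gaussian stands in for the full likelihood targets that real refinement programs use, which are more involved.

```python
import numpy as np

def r_free(f_obs, f_calc):
    # Conventional Rfree: every test reflection contributes with
    # equal weight, regardless of its measurement error.
    return np.sum(np.abs(f_obs - f_calc)) / np.sum(f_obs)

def ll_free_gaussian(f_obs, f_calc, sig_obs):
    # Toy Gaussian log-likelihood over the test set: each residual is
    # scaled by its sigma, so poorly measured reflections count for
    # less. (Real ML refinement targets are more sophisticated.)
    resid = (f_obs - f_calc) / sig_obs
    return -0.5 * np.sum(resid**2 + np.log(2 * np.pi * sig_obs**2))

# Hypothetical test-set amplitudes; the last reflection is both the
# worst-fitting and the most poorly measured.
f_obs = np.array([100.0, 50.0, 20.0])
f_calc = np.array([95.0, 55.0, 30.0])
sig_obs = np.array([2.0, 3.0, 15.0])

# The weak reflection dominates Rfree, but its large sigma shrinks
# its contribution to the log-likelihood.
print(r_free(f_obs, f_calc))
print(ll_free_gaussian(f_obs, f_calc, sig_obs))
```

The point of the sketch: swapping in a gross error on a high-sigma reflection moves Rfree a lot but the weighted likelihood only a little, which is exactly the behaviour one wants from a validation statistic.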

    Or maybe what you intended to say:  only trust refinements for
    which Rfree decreases monotonically, because only then do you have
    a valid choice of parameters.


No, as I indicated above, what Rfree does before convergence is attained is totally meaningless; only the value obtained _at_ convergence is meaningful as an X-validation statistic. We wouldn't be having this discussion if the refinement program omitted the meaningless intermediate values and only printed out the final Rfree or LLfree. I'm saying that Rfree is not the best X-validation metric because poorly measured data are not properly weighted: this is what the Acta paper I referenced is saying.

Cheers

-- Ian
