As the K & D (Karplus & Diederichs) paper points out, as the signal/noise ratio 
declines at higher resolution Rmerge rises towards infinity, so there is no 
sensible way to set a limiting value of Rmerge to determine "resolution".
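
To see why, here is a minimal numerical sketch (made-up numbers, not from the 
paper), using the usual unweighted definition Rmerge = sum |Ii - <I>| / sum Ii 
over repeated measurements of a reflection: as the true intensity fades into 
the noise the denominator tends to zero while the numerator does not, so 
Rmerge blows up.

  import numpy as np

  rng = np.random.default_rng(0)

  def rmerge(obs):
      # Rmerge = sum |I_i - <I>| / sum I_i for repeated measurements
      # of one reflection (illustration only, no weighting).
      return np.abs(obs - obs.mean()).sum() / obs.sum()

  sigma = 1.0
  for true_i in (20.0, 5.0, 2.0, 0.5):   # signal fading towards the noise
      obs = true_i + rng.normal(0.0, sigma, size=10)
      print(f"<I>/sigma ~ {true_i/sigma:4.1f}   Rmerge ~ {rmerge(obs):.2f}")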

That is not to say that Rmerge has no use: as you say, it is a reasonably good 
metric to plot against image number to detect a problem during data collection. 
It is just not a suitable metric for deciding resolution.

I/sigI is pretty good for this, even though the sigma estimates are not very 
reliable. CC1/2 is probably better, since it is independent of the sigmas and 
has well-defined values from 1.0 down to 0.0 as the signal/noise ratio 
decreases. But we should be wary of any dogma that dictates what data we should 
discard and what the cutoff limits should be: I/sigI > 3, 2, or 1? CC1/2 > 0.2, 
0.3, 0.5 ...? Usually it does not make a huge difference, but why discard useful 
data? Provided the data are properly weighted in refinement by weights 
incorporating the observed sigmas (true in Refmac, not true in phenix.refine at 
present, I believe), adding extra weak data should do no harm, at least out to 
some point. Program algorithms are improving in their treatment of weak data, 
but are by no means perfect.
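
For what it's worth, a minimal sketch of the half-dataset CC1/2 idea (simulated 
intensities with a random split of the repeated observations of each reflection, 
then a Pearson correlation of the two half-dataset means - not any particular 
program's implementation):

  import numpy as np

  rng = np.random.default_rng(1)

  def cc_half(obs_per_refl):
      # obs_per_refl: one array of repeated measurements per unique reflection.
      half1, half2 = [], []
      for obs in obs_per_refl:
          idx = rng.permutation(len(obs))
          half1.append(obs[idx[:len(obs) // 2]].mean())
          half2.append(obs[idx[len(obs) // 2:]].mean())
      return np.corrcoef(half1, half2)[0, 1]

  true_i = rng.exponential(10.0, size=500)          # one resolution shell
  for sigma in (1.0, 10.0, 50.0):                   # increasing noise
      shell = [ti + rng.normal(0.0, sigma, size=8) for ti in true_i]
      print(f"sigma = {sigma:5.1f}   CC1/2 ~ {cc_half(shell):.2f}")

CC1/2 falls smoothly from ~1.0 towards 0.0 as the noise grows, which is what 
makes it usable as a cutoff statistic without reference to the sigma estimates.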

One problem, as discussed earlier in this thread, is that we have got used to 
the idea that nominal resolution is a single number indicating the quality of a 
structure, but this has never been true, irrespective of the cutoff method. 
Apart from the considerable problem of anisotropy, we all need to note the 
wisdom of Ethan Merritt:

> "We should also encourage people not to confuse the quality of 
> the data with the quality of the model."

Phil



On 1 Jun 2012, at 18:59, aaleshin wrote:

> Please excuse my ignorance, but I cannot understand why Rmerge is unreliable 
> for estimating the resolution.
> I mean, from a theoretical point of view, <I/sigma> is indeed a better 
> criterion, but it is not obvious from a practical point of view.
> 
> <I/sigma> depends on the method used to estimate the sigmas, and so the same 
> data processed by different programs may have different <I/sigma>. Moreover, 
> HKL2000 allows users to adjust the sigmas manually. Rmerge estimates the 
> errors from differences between measurements of the same structure factor, 
> and hence is independent of our preferences.  But it also has a very 
> important ability to validate the consistency of the merged data. If my 
> crystal changed during the data collection, or something went wrong with the 
> diffractometer, Rmerge will show it immediately, but <I/sigma> will not.
> 
> So, please explain why we should stop using Rmerge as a criterion of data 
> resolution.
> 
> Alex
> Sanford-Burnham Medical Research Institute
> 10901 North Torrey Pines Road
> La Jolla, California 92037
> 
> 
> 
> On Jun 1, 2012, at 5:07 AM, Ian Tickle wrote:
> 
>> On 1 June 2012 03:22, Edward A. Berry <ber...@upstate.edu> wrote:
>>> Leo will probably answer better than I can, but I would say I/SigI counts
>>> only the present reflection, so eliminating noise by anisotropic truncation
>>> should improve it, raising the average I/SigI in the last shell.
>> 
>> We always include unmeasured reflections with I/sigma(I) = 0 in the
>> calculation of the mean I/sigma(I) (i.e. we divide the sum of I/sigma(I)
>> for the measured reflections by the predicted total number of reflections,
>> including unmeasureds), since for unmeasureds I is (almost) completely
>> unknown and therefore sigma(I) is effectively infinite (or at least finite
>> but large, since you do have some idea of what range I must fall in).  A
>> shell with <I/sigma(I)> = 2 and 50% completeness clearly doesn't carry the
>> same information content as one with the same <I/sigma(I)> and 100%
>> completeness; therefore IMO it's very misleading to quote <I/sigma(I)>
>> including only the measured reflections.  This also means we can use a
>> single cut-off criterion (we use mean I/sigma(I) > 1), and we don't need
>> another arbitrary cut-off criterion for completeness.  As many others seem
>> to be doing now, we don't use Rmerge, Rpim etc. as criteria to estimate
>> resolution; they're just too unreliable - Rmerge is indeed dead and buried!
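
A minimal sketch of that bookkeeping (made-up numbers, not any particular 
program's code; unmeasured reflections contribute I/sigma(I) = 0 but still 
count in the denominator):

  import numpy as np

  def mean_i_over_sigma(i_over_sig_measured, n_predicted):
      # Sum I/sigma(I) over the measured reflections, divide by the
      # predicted total number of reflections, so unmeasureds enter as 0.
      return np.sum(i_over_sig_measured) / n_predicted

  measured = np.full(50, 2.0)   # 50 measured reflections, each I/sigma(I) = 2
  print(mean_i_over_sigma(measured, n_predicted=50))    # 100% complete -> 2.0
  print(mean_i_over_sigma(measured, n_predicted=100))   #  50% complete -> 1.0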
>> 
>> Actually a mean value of I/sigma(I) of 2 is highly statistically
>> significant, i.e. very unlikely to have arisen by chance variations,
>> and the significance threshold for the mean must be much closer to 1
>> than to 2.  Taking an average always increases the statistical
>> significance, therefore it's not valid to compare an _average_ value
>> of I/sigma(I) = 2 with a _single_ value of I/sigma(I) = 3 (taking 3
>> sigma as the threshold of statistical significance of an individual
>> measurement): that's a case of "comparing apples with pears".  In
>> other words, in the outer shell you would need a lot of highly
>> significant individual values >> 3 to attain an overall average of 2,
>> since the majority of individual values will be < 1.
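
A rough back-of-the-envelope illustration of that point (idealised assumption: 
the n values of I/sigma(I) in the shell are independent with unit variance):

  import numpy as np

  n = 1000                       # reflections in the outer shell
  mean_i_over_sig = 2.0
  # Standard error of the shell mean ~ 1/sqrt(n), so the mean sits
  # roughly mean * sqrt(n) standard errors above zero.
  z = mean_i_over_sig * np.sqrt(n)
  print(f"shell mean of 2 is ~{z:.0f} standard errors above zero")   # ~63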
>> 
>>> F/sigF is expected to be better than I/sigI because d(x^2) = 2x dx, so
>>> d(x^2)/x^2 = 2 dx/x, i.e. dI/I = 2 dF/F  (or approaches that in the limit . . .)
>> 
>> That depends on what you mean by 'better': every metric must be
>> compared with a criterion appropriate to that metric.  So if we are
>> comparing I/sigma(I) with a criterion value = 3, then we must compare
>> F/sigma(F) with a criterion value = 6 ('in the limit' of zero I), in
>> which case the comparison is no 'better' (in terms of information
>> content) with I than with F: they are entirely equivalent.  It's
>> meaningless to compare F/sigma(F) with the criterion value appropriate
>> to I/sigma(I): again that's "comparing apples and pears"!
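
Numerically, with I = F^2 and first-order error propagation giving 
sigma(I) ~ 2 F sigma(F) (away from the very weak intensities where that 
approximation breaks down), a quick check with made-up numbers:

  f, sig_f = 10.0, 1.0
  i = f ** 2
  sig_i = 2.0 * f * sig_f   # first-order propagation for I = F^2
  print(f / sig_f)          # F/sigma(F) = 10
  print(i / sig_i)          # I/sigma(I) = 5, i.e. half of F/sigma(F), so a
                            # criterion of 3 on I/sigma(I) matches 6 on F/sigma(F)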
>> 
>> Cheers
>> 
>> -- Ian
