Wow, quite a lecture here! It is much appreciated. I admit some (most?) of my statements were questionable. In particular, I did not know how sigI would be calculated in the case of multiple observations, and, indeed, its proper handling should make <sigI/I> similar to Rmerge. Consequently, <I/sigI> substitutes for Rmerge fairly well.
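For concreteness, here is a minimal sketch of how the two statistics are computed from the same unmerged observations (toy Python of my own, not taken from any CCP4 program; the names and conventions are invented, and real programs use weighted means and multiplicity-aware variants such as Rmeas/Rpim):

    import numpy as np

    def merging_stats(intensities, sigmas, groups):
        """Rmerge and <I/sigma(I)> from the same unmerged observations.

        intensities, sigmas : 1-D arrays, one entry per observation
        groups : integer array mapping each observation to its unique
                 (symmetry-merged) reflection
        """
        num = 0.0   # sum of |I_i - <I>| over all observations
        den = 0.0   # sum of I_i over all observations
        i_over_sig = []
        for g in np.unique(groups):
            sel = groups == g
            I, sig = intensities[sel], sigmas[sel]
            if len(I) < 2:
                continue  # singletons carry no merging information
            mean_I = I.mean()
            num += np.abs(I - mean_I).sum()
            den += I.sum()
            # sigma of the (unweighted) merged mean by error propagation
            sig_merged = np.sqrt((sig ** 2).sum()) / len(I)
            i_over_sig.append(mean_I / sig_merged)
        return num / den, np.mean(i_over_sig)

The point of putting them side by side: Rmerge sees the scatter among equivalents directly, while <I/sigma(I)> sees it only if the sigmas have been inflated to match that scatter - which is exactly the SDFAC/SDADD adjustment discussed below.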
Now, where did the Rmerge = 0.5 criterion come from? If I remember correctly, it was proposed here on the CCP4BB; one reviewer also suggested using it. I admit that it is quite an arbitrary value, but when everyone follows it, structures become comparable by this metric. If there is a better approach to estimating the resolution, let's use it, but a common rule should be enforced; otherwise the resolution becomes another avenue for cheating. Once again, I was talking about a metric for the resolution; it does not need to be the same as the metric for the data cutoff.

Alex

On Jun 3, 2012, at 2:55 PM, Ian Tickle wrote:

> Hi Alex
>
> On 3 June 2012 07:00, aaleshin <aales...@burnham.org> wrote:
>> I was also taught that under "normal conditions" this would occur
>> when the data are collected up to the shell in which Rmerge = 0.5.
>
> Do you have a reference for that? I have not seen a demonstration of
> such an exact relationship between Rmerge and resolution, even for
> 'normal' data, and I don't think everyone uses 0.5 as the cut-off
> anyway (e.g. some people use 0.4, some 0.8, etc. - though I agree with
> Phil that we shouldn't get too hung up about the exact number!).
> Certainly having used the other suggested criteria for resolution
> cut-off (I/sigma(I) & CC(1/2)), the corresponding Rmerge (and Rpim
> etc.) seems to vary a lot (or maybe my data weren't 'normal').
>
>> One can collect more data (up to Rmerge = 1.0 or even 100) but the
>> resolution of the electron density map will not change significantly.
>
> I think we are all at least agreed that beyond some resolution
> cut-off, adding further higher-resolution 'data' will not result in
> any further improvement in the map (because the weights will become
> negligible). So it would appear prudent at least to err on the
> high-resolution side!
>
>> I solved several structures of my own, and this simple rule worked
>> every time.
>
> In what sense do you mean it 'worked'? Do you mean you tried
> different cut-offs in Rmerge (e.g. 0.25, 0.50, 0.75, 1.00, ...) and
> then used some metric to judge when there was no further significant
> change in the map, and you noted that the optimal value of your chosen
> metric always occurred around Rmerge = 0.5? If so, how did you judge a
> 'significant change'? Personally I go along with Dale's suggestion to
> use the optical resolution of the map to judge when no further
> improvement occurs. This would need to be done with the completely
> refined structure, because presumably optical resolution will be
> reduced by phase errors. Note that it wouldn't be necessary to
> actually quote the optical resolution in place of the X-ray resolution
> (that would confuse everyone!); you just need to know the value of the
> X-ray resolution cut-off beyond which the optical resolution no longer
> changes (it should be clear from a plot of X-ray vs. optical
> resolution).
>
>> I is measured as the number of detector counts in the reflection
>> minus background counts.
>> sigI is measured as the square root of I, plus the standard deviation
>> (SD) of the background, plus various deviations from the ideal
>> experiment (like noise from satellite crystals).
>
> The most important contribution to the sigma(I)'s, except maybe for
> the weak reflections, actually comes from differences between the
> intensities of equivalent reflections, due to variations in absorption
> and illuminated volume, and other errors in image scale factors
> (though these are all highly correlated). These are of course exactly
> the same differences that contribute to Rmerge. E.g. in Scala the
> SDFAC & SDADD parameters are automatically adjusted to fit the
> observed QQ plot to the expected one, in order to account for such
> differences.
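To illustrate that adjustment: as far as I understand, the Scala/Aimless error model has the general form sig'(I)^2 = SdFac^2 * (sig(I)^2 + SdB*I + (SdAdd*I)^2). The sketch below (my own toy Python, with SdB = 0 and invented names) only checks whether a given pair of parameters brings the normalized deviations to unit RMS; it does not reproduce Scala's actual QQ-plot fitting procedure:

    import numpy as np

    def corrected_sigma(I, sig, sdfac, sdadd, sdb=0.0):
        # Assumed general form of the Scala/Aimless error model:
        #   sig'^2 = SdFac^2 * (sig^2 + SdB*I + (SdAdd*I)^2)
        return sdfac * np.sqrt(sig ** 2 + sdb * I + (sdadd * I) ** 2)

    def rms_normalized_deviation(intensities, sigmas, groups, sdfac, sdadd):
        """RMS of (I_i - <I>) / sig'_i over all groups of equivalents.
        Close to 1.0 when the error model is right; values above 1
        mean the sigmas are still too optimistic."""
        devs = []
        for g in np.unique(groups):
            sel = groups == g
            I = intensities[sel]
            if len(I) < 2:
                continue  # need equivalents to measure the scatter
            s = corrected_sigma(I, sigmas[sel], sdfac, sdadd)
            devs.append((I - I.mean()) / s)
        d = np.concatenate(devs)
        return np.sqrt(np.mean(d ** 2))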
>> Obviously, sigI cannot be measured accurately. Moreover, the
>> 'resolution' is related to errors in the structure factors, which are
>> averaged from several measurements. Errors in their scaling would
>> affect the 'resolution', and <I/sigI> does not detect them, but
>> Rmerge does!
>
> Sorry, you've lost me here; I don't see why <I/sigI> should not detect
> scaling errors: as indicated above, if there are errors in the scale
> factors this will inflate the sigma(I) values via increased SDFAC
> and/or SDADD, which will in turn reduce the <I/sigma(I)> values
> exactly as expected. I see no difference in the behaviour of Rmerge
> and <I/sigma(I)> (or indeed of CC(1/2)) in this respect, since they
> all depend on the differences between equivalents.
>
>> Rmerge, it means that the symmetry-related reflections did not merge
>> well. Under those conditions, Rmerge becomes a much better criterion
>> for estimation of the 'resolution' than <I/sigI>.
>
> As indicated above, if the symmetry equivalents don't merge well it
> will increase the sigma(I)'s and reduce <I/sigma(I)>, so in this
> respect I don't see why Rmerge should be any better than <I/sigma(I)>.
> My biggest objection to Rmerge (and this applies also to CC(1/2)) is
> that it involves throwing away valuable information, namely the
> measured sigma(I) values from counting statistics. This is not usually
> a good idea (in statistical parlance it reduces the 'power' of the
> test) - and it's not as though one can argue that the sigmas are so
> small that they can be neglected (at least not for the weak
> reflections). Even though, as you say, the estimates of sigma(I) may
> not be very accurate, it seems to me that any estimate is better than
> no estimate. In any case the estimates of sigma(I) are probably quite
> accurate for the weak reflections; it's just for the strong ones that
> the assumptions tend to break down. However, if we're estimating
> resolution from <I/sigma(I)> it's only the weak reflections in the
> outer shell that are relevant, so I don't think the accuracy of
> sigma(I) is an issue.
>
>> If someone decides to use <I/sigI> instead of Rmerge, fine, let it be
>> 2.0.
>
> As I indicated previously, I think 2 is too high; it should be much
> closer to 1 (and again it would appear prudent to err on the side of
> the lower value), because in the outer shell the majority of
> I/sigma(I) values will be < 1 (just from the normal distribution of
> errors). This means that in order to get an average value of
> I/sigma(I) = 2 you need a lot of very significant intensities >> 3.
> The fallacy here lies in comparing the average I/sigma(I) with the
> standard '3 sigma' criterion, which is actually appropriate only for a
> single intensity. Of course data anisotropy may well "throw a spanner
> in the works".
>
>> Alternatively, the resolution could be estimated from the electron
>> density maps.
>
> I agree, using the optical resolution in the manner indicated above,
> but still quoting the corresponding X-ray resolution for backwards
> compatibility!
>
>> I hope everyone agrees that the resolution should not be dead...
>
> I completely agree: I say "Long live the resolution!" (sorry, I
> couldn't resist it).
>
> -- Ian
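A toy simulation of the statistical point about the outer shell (assuming Wilson statistics, i.e. exponentially distributed true intensities, and a constant Gaussian measurement error; all numbers are invented for illustration):

    import numpy as np

    rng = np.random.default_rng(0)
    n, sigma = 100_000, 1.0  # observations and their (constant) error
    for mean_true in (1.0, 2.0):  # true <I> in units of sigma
        I_true = rng.exponential(scale=mean_true, size=n)
        I_obs = I_true + rng.normal(0.0, sigma, size=n)
        z = I_obs / sigma
        print(f"<I/sigma> = {z.mean():.2f}: "
              f"{(z < 1).mean():.0%} below 1, {(z > 3).mean():.0%} above 3")

Under these assumptions a shell that averages <I/sigma(I)> = 1 has the majority of its reflections below 1 sigma, while reaching an average of 2 already requires roughly a quarter of the intensities to exceed 3 sigma - which is the sense in which 2 is a conservative cut-off.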