On Sun, May 23, 2010 at 5:24 AM, Pavel Afonine <pafon...@lbl.gov> wrote:
> Hi,
>
>
> On 5/22/10 8:23 PM, Zhang, Hailiang wrote:
>>
>> Thanks a lot! Actually my map has very weak density only in a certain
>> region, and I want to say numerically that this region is "weakly"
>> correlated to the atomic map in this region. But according to the CC
>> formula, if investigating by residue, CC only depends on the relative
>> "shape" of the density, NOT the relative "density".
>
> This is exactly why I suggested looking at the triplet of values {map CC,
> 2mFo-DFc, mFo-DFc}. The map CC will tell you the similarity of shapes, indeed,
> and the two maps (2mFo-DFc, mFo-DFc) will tell you how to qualify this
> similarity.
>
>> Instead, I think real-space R will reflect the weak density.

I have long been advocating the use of the RMS density deviation
Z-score statistic for this purpose, i.e.
sqrt(sum(delta_rho^2)/Npoints)/sigma(rho), where delta_rho is a
grid-point value in a 2(mFo-DFc) map, the sum is over the region of
interest (atom, residue, side-chain etc), Npoints is the number of
grid points in the region, and sigma(rho) is the uncertainty in the
density (which must be estimated for the asymmetric unit of the map,
NOT just the region).  Note that its square is, up to a constant
factor and offset, the negative log-likelihood, which IMO is the
appropriate statistic for this situation, if it's assumed that the
errors in the density have a normal distribution.
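
As a minimal sketch of the statistic described above (plain Python; the
function names, the sample density values and the value of sigma are
illustrative assumptions, not taken from any crystallographic package, and
the likelihood additionally assumes independent errors at each grid point):

```python
import math

def rms_z(delta_rho, sigma_rho):
    """RMS density deviation Z-score: sqrt(sum(delta_rho^2)/Npoints)/sigma(rho)."""
    if sigma_rho <= 0:
        raise ValueError("sigma_rho must be positive")
    n_points = len(delta_rho)
    if n_points == 0:
        raise ValueError("region contains no grid points")
    return math.sqrt(sum(d * d for d in delta_rho) / n_points) / sigma_rho

def neg_log_likelihood(delta_rho, sigma_rho):
    """Negative log-likelihood for independent normal errors (an assumption)."""
    n_points = len(delta_rho)
    data_term = 0.5 * sum((d / sigma_rho) ** 2 for d in delta_rho)
    norm_term = n_points * math.log(sigma_rho * math.sqrt(2.0 * math.pi))
    return data_term + norm_term

# Hypothetical difference-density values at the grid points of one residue.
region = [0.12, -0.30, 0.25, -0.08, 0.41]
# Assumed uncertainty, estimated over the whole asymmetric unit, not the region.
sigma = 0.10

z = rms_z(region, sigma)
nll = neg_log_likelihood(region, sigma)
# The part of nll that depends on the data is 0.5 * Npoints * z**2.
data_dependent_part = 0.5 * len(region) * z ** 2
```

Under these assumptions the two quantities rank regions identically, since
the remaining term in the likelihood depends only on sigma and Npoints.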

As you correctly point out, the CC, being insensitive to both relative
scale and zero-level, is essentially sensitive only to shape, not to
density values; the problem is that the CC statistic is designed
specifically for use in the case where the relative scale and level
are a priori unknown, which is clearly not the case here, so it's
hardly surprising that the CC can be misleading.  The advantage of the
RMS-Z statistic over the crystallographic R-factor is that the
statistical properties of the former have been thoroughly investigated
by professional statisticians in numerous papers over many years
(indeed centuries!) and are dealt with in all standard textbooks on
statistics, whereas information on the statistical properties of the
R-factor is very scant and is limited to the crystallographic
literature (the statistics of the Hamilton R-factor are of course
known, but that's defined differently and there are specific
conditions on its use).  This means that it's necessary to rely on
empirical comparisons rather than solid statistics if you want to do
any kind of significance testing on R-factors.
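
To illustrate the point about the CC numerically, here is a small sketch in
plain Python. The density values, the 0.25 scale factor and sigma are made-up
numbers, and the difference between two maps on a common absolute scale is
used as a simplified stand-in for the difference-map values; none of this
comes from a real data set or program.

```python
import math

def pearson_cc(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def rms_z(delta_rho, sigma_rho):
    """RMS density deviation Z-score over the grid points of a region."""
    return math.sqrt(sum(d * d for d in delta_rho) / len(delta_rho)) / sigma_rho

# Hypothetical model density at five grid points of a region.
rho_calc = [0.2, 0.9, 1.5, 0.7, 0.3]
# "Weak" observed density: same shape, only a quarter of the height.
rho_obs_weak = [0.25 * v for v in rho_calc]

cc = pearson_cc(rho_obs_weak, rho_calc)

# Difference density on the common scale, and an assumed map uncertainty.
delta_rho = [o - c for o, c in zip(rho_obs_weak, rho_calc)]
z = rms_z(delta_rho, sigma_rho=0.10)
```

In this toy case the CC is 1 to within rounding, because the shapes are
identical, while the RMS-Z comes out at several sigma and so flags the
missing density.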

-- Ian
