Re: [ccp4bb] xds question

Robert Immormino Tue, 08 Feb 2011 06:07:54 -0800

Hi,
I've pasted below the reasons from Dan Gewirth and the HKL2000 manual
authors for having a -3 sigma cutoff... I'll add briefly that if you
assume the weak data have a Gaussian distribution around zero, a -3
sigma cutoff allows you to record ~99.9% of the data.
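
As a quick check of that fraction, here is a minimal Python sketch. The function name is my own, and it assumes pure-noise measurements drawn from a unit Gaussian centred on zero, with everything above the cutoff kept:

```python
import math

def fraction_kept(cutoff_in_sigma):
    """Fraction of a zero-mean, unit-variance Gaussian lying above the cutoff."""
    # For a standard normal x, P(x > c) = 0.5 * erfc(c / sqrt(2)).
    return 0.5 * math.erfc(cutoff_in_sigma / math.sqrt(2.0))

for cutoff in (-3.0, 0.0, 1.0):
    kept = 100.0 * fraction_kept(cutoff)
    print(f"cutoff {cutoff:+.1f} sigma keeps {kept:.2f}% of pure-noise data")
```

A cutoff of zero throws away half of such data, and a cutoff of 1 throws away roughly 84% of it.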
-bob



SIGMA CUTOFF

Cutoff for rejecting measurements on input. Default = -3.0. Be very
careful if you increase this.

What is the rationale for using sigma cutoff -3.0 in SCALEPACK?
Wouldn't you want to reject all negative intensities? Why shouldn't
you use a sigma cutoff 1.0 or zero? The answer to these questions is
as follows: The best estimate of I may be negative, due to background
subtraction and background fluctuation. Negative measurements
typically represent random fluctuations in the detector's response to
an X-ray signal. If a measurement is highly negative (<= -3 sigma),
then it may be more likely the result of a mistake, rather than just
random fluctuation.
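
The rejection rule described above could be sketched as follows. This is an illustration only, not SCALEPACK's code: the function name and the (I, sigma) tuple layout are hypothetical, and rejecting a measurement sitting exactly at the cutoff is an assumption taken from the "<= -3 sigma" wording.

```python
def apply_sigma_cutoff(measurements, sigma_cutoff=-3.0):
    """Keep (I, sigma) pairs whose I/sigma lies strictly above sigma_cutoff."""
    kept = []
    for intensity, sigma in measurements:
        if sigma <= 0:
            # Assumption: a non-positive sigma is not a usable error estimate.
            continue
        if intensity / sigma > sigma_cutoff:
            kept.append((intensity, sigma))
    return kept

observations = [(12.0, 4.0), (-3.0, 4.0), (-14.0, 4.0)]
print(apply_sigma_cutoff(observations))  # -> [(12.0, 4.0), (-3.0, 4.0)]
```

The mildly negative measurement (I/sigma = -0.75) survives the default cutoff; only the one at -3.5 sigma is dropped.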

If one eliminates negative fluctuations, but not the positive ones
before averaging, the result will be highly biased. In SCALEPACK,
sigma cutoff is applied before averaging. If one rejects all negative
intensities before averaging a number of things would happen:

   1.  The averaged intensity would always be positive;
   2.  For totally random data with redundancy 8, in a shell where
there was no signal, there would be on average 4 positive
measurements, with average intensity one sigma. This is because the
negative measurements had been thrown out. So the average of the four
remaining measurements would be about 2 sigma! This would look like a
resolution shell with a meaningful signal;
   3.  R-merge would be always less than the R-merge with negative
measurements included;
   4.  A SIGMA CUTOFF of 1 would improve R-merge even more, by
excluding even more valid measurements.
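
The bias described in point 2 can be reproduced with a small simulation. This is a sketch under stated assumptions: pure Gaussian noise with sigma = 1 and no signal, unweighted averaging, and a hypothetical function name. It is not how SCALEPACK merges data.

```python
import random

def mean_merged_intensity(cutoff=None, n_reflections=10000, redundancy=8, seed=0):
    """Average merged intensity of pure-noise reflections after a cutoff."""
    rng = random.Random(seed)
    merged = []
    for _ in range(n_reflections):
        # True intensity is zero; every observation is noise with sigma = 1.
        obs = [rng.gauss(0.0, 1.0) for _ in range(redundancy)]
        if cutoff is not None:
            obs = [x for x in obs if x > cutoff]  # reject before averaging
        if obs:
            merged.append(sum(obs) / len(obs))
    return sum(merged) / len(merged)

for cutoff in (None, -3.0, 0.0):
    print(f"cutoff {cutoff}: mean merged I = {mean_merged_intensity(cutoff):.3f} sigma")
```

With no cutoff, or with -3, the merged intensities average out near zero. With a cutoff of zero they average about 0.8 sigma (the mean of a half-Gaussian), and since roughly four measurements survive per reflection, the error on each merged value is only about 0.5 sigma, so an empty shell appears to carry signal.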

Why should this worry you? Exclusion of valid measurements will
deteriorate the final data set. One may notice an inverse relationship
between R-merge and data quality as a function of "sigma cutoff". So
much for using R-merge as any criterion of success.
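
To illustrate that inverse relationship, the sketch below computes R-merge for simulated weak data at several cutoffs. The assumptions are mine: a true intensity of 0.5 sigma, Gaussian noise with sigma = 1, the usual definition R-merge = sum|I_i - <I>| / sum(I_i), and hypothetical names throughout.

```python
import random

def r_merge(cutoff=None, true_intensity=0.5, n_reflections=5000, redundancy=8, seed=1):
    """R-merge of simulated weak reflections after rejecting I <= cutoff."""
    rng = random.Random(seed)
    numerator = 0.0
    denominator = 0.0
    for _ in range(n_reflections):
        obs = [rng.gauss(true_intensity, 1.0) for _ in range(redundancy)]
        if cutoff is not None:
            obs = [x for x in obs if x > cutoff]
        if not obs:
            continue  # every measurement of this reflection was rejected
        mean = sum(obs) / len(obs)
        numerator += sum(abs(x - mean) for x in obs)
        denominator += sum(obs)
    return numerator / denominator

for cutoff in (None, 0.0, 1.0):
    print(f"cutoff {cutoff}: R-merge = {r_merge(cutoff):.2f}")
```

R-merge falls as the cutoff rises, even though the surviving measurements give an increasingly biased estimate of the true intensity.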

Even the best (averaged) estimate of intensity may be negative. How to
use negative I estimates in subsequent phasing and refinement steps is
a separate story. The author of SCALEPACK suggests the following:

   1. You should never convert I into F.
   2. You should square Fcalc and compare it to I. Most crystallography
programs, though not all, do not do this. That is life. In the
absence of the proper treatment one can do approximations. One of them
is provided by French and also by French and Wilson. An implementation
of their ideas is in the CCP4 program TRUNCATE. A very simplified and
somewhat imprecise implementation of TRUNCATE is this:

if I > sigma(I), F=sqrt(I)

if I < sigma(I), F=sqrt(sigma(I))
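
Written out as code, the two-line approximation above might look like the sketch below. It is neither the French and Wilson procedure nor TRUNCATE's actual code; the function name is hypothetical, and treating I equal to sigma(I) as the weak case is an assumption, since the text leaves that case open.

```python
import math

def simplified_amplitude(intensity, sigma):
    """Crude amplitude estimate following the simplified rule quoted above."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    if intensity > sigma:
        return math.sqrt(intensity)
    # Weak or negative intensity: fall back to sqrt(sigma(I)).
    return math.sqrt(sigma)

print(simplified_amplitude(9.0, 1.0))   # strong reflection -> 3.0
print(simplified_amplitude(-2.0, 4.0))  # negative intensity -> 2.0
```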

format          SIGMA CUTOFF value
default         -3
example         SIGMA CUTOFF -2.5

referenced from:
http://www.hkl-xray.com/hkl_web1/hkl/Scalepack_Keywords.html
