Re: [ccp4bb] truncate ignorance

Ethan Merritt Mon, 08 Sep 2008 13:04:42 -0700

On Monday 08 September 2008 12:30:29 Phoebe Rice wrote:
> Dear Experts,
> 
> At the risk of exposing excess ignorance, truncate makes me 
> very nervous because I don't quite get exactly what it is 
> doing with my data and what its assumptions are.
> 
> From the documentation:
> ========================================================
> ... the "truncate" procedure (keyword TRUNCATE YES, the 
> default) calculates a best estimate of F from I, sd(I), and 
> the distribution of intensities in resolution shells (see 
> below). This has the effect of forcing all negative 
> observations to be positive, and inflating the weakest 
> reflections (less than about 3 sd), because an observation 
> significantly smaller than the average intensity is likely 
> to be underestimated. 
> =========================================================
> 
> But is it really true, with data from nice modern detectors, 
> that the weaklings are underestimated?


It isn't really an issue of the detector per se, although in
principle you could worry about non-linear response to the
input rate of arriving photons.

In practice the issue, now as it was in 1978 (French & Wilson),
arises from the background estimation, profile fitting, and
rescaling that are applied to the individual pixel contents
before they are bundled up into a nice "Iobs".

I will try to restate the original French & Wilson argument,
avoiding the terminology of maximum likelihood and Bayesian statistics.

1) We know the true intensity cannot be negative.
2) The existence of Iobs<0 reflections in the data set means
   that whatever we are doing is producing some values of
   Iobs that are too low.
3) Assuming that all weak-ish reflections are being processed
   equivalently, then whatever we are doing wrong for reflections with
   Iobs near zero on the negative side surely is also going wrong
   for their neighbors that happen to be near Iobs=0 on the positive
   side.
4) So if we "correct" the values of Iobs that went negative, for
   consistency we should also correct the values that are nearly
   the same but didn't quite tip over into the negative range.
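
To put numbers on that argument, here is a minimal sketch in Python.
It is an illustration, not the code that truncate itself runs. It
assumes a Gaussian measurement error with standard deviation sigma
and an acentric Wilson prior p(J) = exp(-J/S)/S for J >= 0, where S
is the mean intensity in the resolution shell. The function name and
the test values (sigma = 1, S = 5) are my own choices. Under those
assumptions the posterior for the true intensity J is a normal
distribution truncated at zero, so its mean has a closed form.

```python
import math

def posterior_mean_intensity(i_obs, sigma, mean_intensity):
    """Posterior mean of the true intensity J given an observed i_obs.

    Assumes (for illustration only) Gaussian error with sd `sigma` and
    an acentric Wilson prior exp(-J/S)/S on J >= 0, S = mean_intensity.
    """
    # Prior times likelihood is a normal in J with a shifted mean,
    # truncated to J >= 0.
    mu = i_obs - sigma ** 2 / mean_intensity
    z = mu / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * math.erfc(-z / math.sqrt(2.0))
    # Mean of a normal(mu, sigma) truncated below at zero.
    return mu + sigma * pdf / cdf

for i_obs in (-2.0, -0.5, 0.5, 2.0, 10.0):
    estimate = posterior_mean_intensity(i_obs, sigma=1.0, mean_intensity=5.0)
    print(f"Iobs = {i_obs:6.2f}  ->  estimate = {estimate:6.3f}")
```

Negative observations come out positive, and small positive ones are
nudged upward by the same formula, which is points 3 and 4 above. For
strong reflections the estimate stays within sigma^2/S of Iobs.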

> Do I really want to inflate them?

Yes.

> Exactly what assumptions is it making about the expected 
> distributions?

Primarily that:
1) The histogram of true intensities is smooth
2) No true intensities are negative

> How compatible are those assumptions with serious anisotropy 
> and the weird Wilson plots that nucleic acids give?

Not relevant

> Note the original 1978 French and Wilson paper says:
> "It is nevertheless important to validate this agreement for 
> each set of data independently, as the presence of atoms in 
> special positions or the existence of noncrystallographic 
> elements of symmetry (or pseudosymmetry) may abrogate the 
> application of these prior beliefs for some crystal 
> structures."

It is true that such things matter when you get down to the
nitty-gritty details of what to use as the "expected distribution".
But *all* plausible expected distributions will be non-negative
and smooth.
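
As a rough check of that last point, the sketch below computes the
posterior mean by brute-force numerical integration, so the prior can
be swapped freely. The two priors shown are the standard acentric and
centric Wilson distributions. The helper names, the integration limit,
and the values S = 5 and sigma = 1 are assumptions made purely for
illustration.

```python
import math

def posterior_mean(i_obs, sigma, prior, upper=20.0, steps=20000):
    """Midpoint-rule posterior mean of J on (0, upper).

    `prior` is any non-negative function of J; the measurement error
    is assumed Gaussian with sd `sigma`.
    """
    h = upper / steps
    num = 0.0
    den = 0.0
    for k in range(steps):
        j = (k + 0.5) * h  # midpoints never touch J = 0
        weight = prior(j) * math.exp(-0.5 * ((i_obs - j) / sigma) ** 2)
        num += j * weight
        den += weight
    return num / den

S = 5.0  # assumed mean intensity in the shell

def acentric(j):
    return math.exp(-j / S) / S

def centric(j):
    return math.exp(-j / (2.0 * S)) / math.sqrt(2.0 * math.pi * S * j)

for i_obs in (-1.0, 0.5):
    a = posterior_mean(i_obs, 1.0, acentric)
    c = posterior_mean(i_obs, 1.0, centric)
    print(f"Iobs = {i_obs:5.2f}  acentric: {a:.3f}  centric: {c:.3f}")
```

The two priors give different numbers, but because neither puts any
weight on J < 0, both estimates are positive even when Iobs is not.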


> 
> Please help truncate my ignorance ...
> 
>     Phoebe
> 
> ==========================================================
> Phoebe A. Rice
> Assoc. Prof., Dept. of Biochemistry & Molecular Biology
> The University of Chicago
> phone 773 834 1723
> http://bmb.bsd.uchicago.edu/Faculty_and_Research/01_Faculty/01_Faculty_Alphabetically.php?faculty_id=123
> 
> RNA is really nifty
> DNA is over fifty
> We have put them 
>   both in one book
> Please do take a 
>   really good look
> http://www.rsc.org/shop/books/2008/9780854042722.asp
> 



-- 
Ethan A Merritt
Biomolecular Structure Center
University of Washington, Seattle 98195-7742
