James,

Where we diverge is with your interpretation that big differences lead to small 
FOMs.  The size of the FOM depends on the product of Fo and Fc, not their 
difference.  The FOM for a reflection where Fo=1000 and Fc=10 is very different 
from the FOM for a reflection with Fo=5000 and Fc=4010, even though the 
difference is the same.

Expanding on this: 

1. The FOM actually depends more on the E values, i.e. reflections smaller than 
average get lower FOM values than ones bigger than average.  In the resolution 
bin from 5.12 to 5.64Å of 2vb1, the mean observed intensity is 20687 and the 
mean calculated intensity is 20022, which means that for your -5,2,2 example 
Eobs=Sqrt(145.83/20687)=0.084 and Ecalc=Sqrt(7264/20022)=0.602.  This 
reflection gets a low FOM because the product (about 0.05) is such a small 
number, not because the difference is big.

2. You have to consider the role of the model error in the difference, because 
for precisely-measured data most of the difference comes from model error.  In 
this resolution shell, the correlation coefficient between Iobs and Fcalc^2 is 
about 0.88, which means that sigmaA is about Sqrt(0.88) = 0.94.  The variance 
of both the real and imaginary components of Ec (as an estimate of the phased 
true E) will be (1-0.94^2)/2 = 0.058, so the standard deviations of the real 
and imaginary components of Ec will be about 0.24.  In that context, the 
difference between Eobs and Ecalc is nothing like a 2000-sigma outlier.
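
The arithmetic in points 1 and 2 can be reproduced in a few lines.  This is 
only a sketch using the numbers quoted in this thread; the variable names are 
made up for illustration, and sigmaA is rounded to two decimals to follow the 
text.

```python
import math

# Values quoted in the thread for reflection (-5,2,2) of 2vb1
i_obs = 145.83          # observed intensity
fc_sq = 7264.0          # calculated intensity, Fcalc^2
mean_i_obs = 20687.0    # mean observed intensity, 5.12-5.64 A bin
mean_i_calc = 20022.0   # mean calculated intensity, same bin

# Point 1: normalised amplitudes (E values) and their product
e_obs = math.sqrt(i_obs / mean_i_obs)
e_calc = math.sqrt(fc_sq / mean_i_calc)
product = e_obs * e_calc

# Point 2: sigmaA approximated as sqrt of the intensity correlation
# coefficient, rounded to two decimals as in the text
cc = 0.88
sigma_a = round(math.sqrt(cc), 2)

# Variance and standard deviation of each component (real, imaginary) of Ec
var_component = (1.0 - sigma_a**2) / 2.0
sd_component = math.sqrt(var_component)

print(f"Eobs = {e_obs:.3f}, Ecalc = {e_calc:.3f}, product = {product:.3f}")
print(f"sigmaA = {sigma_a:.2f}, variance = {var_component:.3f}, "
      f"sd = {sd_component:.2f}")
```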

Looking at this another way, the reason why the FOM is low for this reflection 
is that the conditional probability distribution of Eo given Ec has significant 
values on the other side of the origin of the complex plane. That means that 
the *phase* of the complex Eo is very uncertain.  The figures in this web page 
(https://www-structmed.cimr.cam.ac.uk/Course/Statistics/statistics.html) should 
help to explain that idea.
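
To put a rough number on this, here is a small sketch.  It assumes the 
standard acentric result that the probability of a phase difference dphi is 
proportional to exp(X*cos(dphi)), with X = 2*sigmaA*Eo*Ec/(1-sigmaA^2), and it 
ignores the measurement error in Iobs.  The FOM is then the 
probability-weighted mean of cos(dphi).  The function name and the simple grid 
integration are for illustration only and are not taken from any refinement 
program; the second pair of E values is hypothetical.

```python
import math

def fom_acentric(e_obs, e_calc, sigma_a, n=3600):
    """Probability-weighted mean cosine of the phase error (acentric sketch)."""
    x = 2.0 * sigma_a * e_obs * e_calc / (1.0 - sigma_a**2)
    num = 0.0
    den = 0.0
    for k in range(n):
        dphi = 2.0 * math.pi * (k + 0.5) / n   # midpoint of each phase step
        w = math.exp(x * math.cos(dphi))       # unnormalised probability
        num += w * math.cos(dphi)
        den += w
    return num / den

# The weak reflection discussed in point 1
fom_weak = fom_acentric(0.084, 0.602, 0.94)

# Hypothetical pairs with the same amplitude difference (0.5)
# but very different products
fom_small_product = fom_acentric(0.6, 0.1, 0.94)
fom_large_product = fom_acentric(2.0, 1.5, 0.94)

print(f"weak reflection:            FOM = {fom_weak:.2f}")
print(f"same difference, small EoEc: FOM = {fom_small_product:.2f}")
print(f"same difference, large EoEc: FOM = {fom_large_product:.2f}")
```

Under these assumptions the weak reflection comes out with an FOM of roughly 
0.4, and the two hypothetical pairs with identical differences get very 
different FOMs because their products differ.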

Best wishes,

Randy

> On 16 Oct 2019, at 16:02, James Holton <jmhol...@lbl.gov> wrote:
> 
> 
> All very true Randy,
> 
> But nevertheless every hkl has an FOM assigned to it, and that is used to 
> calculate the map.  Statistical distribution or not, the trend is that hkls 
> with big amplitude differences get smaller FOMs, so that means large 
> model-to-data discrepancies are down-weighted.  I wonder sometimes at what 
> point this becomes a self-fulfilling prophecy?  If you look in detail at the 
> Fo-Fc differences in pretty much any refined structure in the PDB you will 
> find huge outliers.  Some are hundreds of sigmas, and they can go in either 
> direction.
> 
> Take for example reflection -5,2,2 in the highest-resolution lysozyme 
> structure in the PDB: 2vb1.  Iobs(-5,2,2) was recorded as 145.83 ± 3.62 (at 
> 5.4 Ang) with Fcalc^2(-5,2,2) = 7264.  A 2000-sigma outlier!  What are the 
> odds?   On the other hand, Iobs(4,-6,2) = 1611.21 ± 30.67 vs Fcalc^2(4,-6,2) 
> = 73, which is in the opposite direction.  One can always suppose 
> "experimental errors", but ZD sent me these images and I have looked at all 
> the spots involved in these hkls.  I don't see anything wrong with any of 
> them.  The average multiplicity of this data set was 7.1 and involved 3 
> different kappa angles, so I don't think these are "zingers" or other weird 
> measurement problems.
> 
> I'm not just picking on 2vb1 here.  EVERY PDB entry has this problem.  Not 
> sure where it comes from, but the FOM assigned to these huge differences is 
> always small, so whatever is causing them won't show up in an FOM-weighted 
> map.
> 
> Is there a way to "change up" the statistical distribution that assigns FOMs 
> to hkls?  Or are we stuck with this systematic error?
> 
> -James Holton
> MAD Scientist
> 
> On 10/4/2019 9:31 AM, Randy Read wrote:
>> Hi James,
>> 
>> I'm sure you realise this, but it's important for other readers to remember 
>> that the FOM is a statistical quantity: we have a probability distribution 
>> for the true phase, we pick one phase (the "centroid" phase that should 
>> minimise the RMS error in the density map), and then the FOM is the expected 
>> value of the cosine of the phase error, obtained by taking the cosines of 
>> all possible phase differences and weighting by the probability of that 
>> phase difference. 
>>  Because it's a statistical quantity from a random distribution, you really 
>> can't expect this to agree reflection by reflection!  It's a good start to 
>> see that the overall values are good, but if you want to look more closely 
>> you have to look at groups of reflections, e.g. bins of resolution, bins of 
>> observed amplitude, bins of calculated amplitude.  However, each bin has to 
>> have enough members that the average will generally be close to the expected 
>> value.
>> 
>> Best wishes,
>> 
>> Randy Read
>> 
>>> On 4 Oct 2019, at 16:38, James Holton <jmhol...@lbl.gov> wrote:
>>> 
>>> I've done a few little experiments over the years using simulated data 
>>> where I know the "correct" phase, trying to see just how accurate FOMs are. 
>>>  What I have found in general is that overall FOM values are fairly well 
>>> correlated to overall phase error, but if you go reflection-by-reflection 
>>> they are terrible.  I suppose this is because FOM estimates are rooted in 
>>> amplitudes.  Good agreement in amplitude gives you more confidence in the 
>>> model (and therefore the phases), but if your R factor is 55% then your 
>>> phases probably aren't very good either.  However, if you look at any given 
>>> h,k,l those assumptions become less and less applicable.  Still, it's the 
>>> only thing we've got.
>>> 
>>> At the end of the day, the phase you get out of a refinement program is 
>>> the phase of the model.  All those fancy "FWT" coefficients with "m" and 
>>> "D" or "FOM" weights are modifications to the amplitudes, not the phases.  
>>> The phases in your 2mFo-DFc map are identical to those of just an Fc map.  
>>> Seriously, have a look!  Sometimes you will get a 180 flip to keep the sign 
>>> of the amplitude positive, but that's it.  Nevertheless, the electron 
>>> density of a 2mFo-DFc map is closer to the "correct" electron density than 
>>> any other map.  This is quite remarkable considering that the "phase error" 
>>> is the same.
>>> 
>>> This realization is what led my colleagues and me to forget about "phase 
>>> error" and start looking at the error in the electron density itself 
>>> (10.1073/pnas.1302823110).  We did this rather pedagogically.  Basically, 
>>> pretend you did the whole experiment again, but "change up" the source of 
>>> error of interest.  For example if you want to see the effect of sigma(F) 
>>> then you add random noise with the same magnitude as sigma(F) to the Fs, 
>>> and then re-refine the structure.  This gives you your new phases, and a 
>>> new map. Do this 50 or so times and you get a pretty good idea of how any  
>>> source of error of interest propagates into your map.  There is even a 
>>> little feature in coot for animating these maps, which gives a much more 
>>> intuitive view of the "noise".  You can also look at variation of model 
>>> parameters like the refined occupancy of a ligand, which is a good way to 
>>> put an "error bar" on it.  The trick is finding the right source of error 
>>> to propagate.
>>> 
>>> -James Holton
>>> MAD Scientist
>>> 
>>> 
>>> On 10/2/2019 2:47 PM, Andre LB Ambrosio wrote:
>>>> Dear all,
>>>> 
>>>> How is the phase error estimated for any given reflection, specifically in 
>>>> the context of model refinement? In terms of math I mean.
>>>> 
>>>> How useful is FOM in assessing the phase quality, when not for initial 
>>>> experimental phases?
>>>> 
>>>> Many thanks in advance,
>>>> 
>>>> Andre.
>>>> 
>>>> To unsubscribe from the CCP4BB list, click the following link:
>>>> https://www.jiscmail.ac.uk/cgi-bin/webadmin?SUBED1=CCP4BB&A=1
>>> 
>> ------
>> Randy J. Read
>> Department of Haematology, University of Cambridge
>> Cambridge Institute for Medical Research    Tel: + 44 1223 336500
>> The Keith Peters Building                   Fax: + 44 1223 336827
>> Hills Road                                  E-mail: rj...@cam.ac.uk
>> Cambridge CB2 0XY, U.K.                     www-structmed.cimr.cam.ac.uk
> 

------
Randy J. Read
Department of Haematology, University of Cambridge
Cambridge Institute for Medical Research    Tel: + 44 1223 336500
The Keith Peters Building                   Fax: + 44 1223 336827
Hills Road                                  E-mail: rj...@cam.ac.uk
Cambridge CB2 0XY, U.K.                     www-structmed.cimr.cam.ac.uk

