Re: [ccp4bb] question about SIGF

Ronald E Stenkamp Sat, 20 Aug 2011 16:44:45 -0700

James, could you please give more information about where and/or how you obtained the 
relationship "sigma(I)/I = 2*sigma(F)/F"?  A different equation, 
sigma(I)=2*F*sigma(F), can be derived from sigma(I)^2 = (d(I)/dF)^2 * sigma(F)^2.  I 
understand that that equation is based on normal distribution of errors and has numerical 
problems when F is small, so there are other approximations that have been used to 
convert sigma(I) to sigma(F).  However, none that I've seen end up stating that 
sigma(F)=0.5.  Thanks.  Ron


On Sat, 20 Aug 2011, James Holton wrote:

There is a formula for sigma(F) (aka "SIGF"), but it is actually a common misconception that it is simply related to F. You need to know a few other things about the experiment that was done to collect the data. The misconception seems to arise because the first thing textbooks tell you is that F = sqrt(I), where "I" is the intensity of the spot. Then, later on, they tell you that sigma(I) = sqrt(I) because of "counting statistics". Now, if you look up a table of error-propagation formulas, you will find that if I=F^2, then sigma(I)/I = 2*sigma(F)/F, and by substituting these equations together you readily obtain:
sigma(F) = F/2*sigma(I)/I
sigma(F) = F/2*sigma(I)/F^2
sigma(F) = sigma(I)/(2*F)
sigma(F) = sigma(I)/(2*sqrt(I))
sigma(F) = sqrt(I)/(2*sqrt(I))
sigma(F) = 0.5
Which says that the error in F is always the same, no matter what your exposure time? Hmm.
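The substitution chain above can be checked numerically. The following is only a minimal sketch of the naive textbook reasoning (no scale factor, sigma(I) = sqrt(I)); the function name naive_sigma_f is made up for illustration and is not taken from any crystallographic package.

```python
import math

def naive_sigma_f(intensity):
    """Naive first-order propagation for F = sqrt(I), assuming sigma(I) = sqrt(I).

    Hypothetical helper: it reproduces the textbook shortcut only and
    deliberately ignores any scale factor between photons and F^2.
    """
    f = math.sqrt(intensity)          # F = sqrt(I)
    sigma_i = math.sqrt(intensity)    # "counting statistics" with no scale factor
    return sigma_i / (2.0 * f)        # sigma(F) = sigma(I) / (2*F)

# The result does not depend on the intensity at all.
for intensity in (10.0, 1000.0, 1.0e6):
    print(intensity, naive_sigma_f(intensity))
```

Whatever intensity is fed in, the function returns the same value, which is the suspicious result being pointed out.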
The critical thing missing from the equations above is something we crystallographers call a "scale factor". We love scale factors because they let us get away with not knowing a great many things, like the volume of the crystal, the absolute intensity of the x-ray beam, and the exact "gain" of the detector. It's not that we can't measure or look up these things, but few of us have the time. And, by and large, as long as you are aware that there is always an unknown "scale factor", it doesn't really get in your way. So, the real equation is:
I_in_photons = scale*F^2
where scale = Ibeam*re^2*Vxtal*lambda^3*Lorentz_factor*Polar_factor*Attenuation_factor*exposure_time/deltaphi/Vcell^2
This "scale factor" comes from Equation 1 in the following paper:
http://dx.doi.org/10.1107/S0907444910007262
where we took pains to describe the exact meaning of each of these variables (and their units!) in great detail. It is open access, so I won't go through them here. I will, however, add that for spots on the detector there are a few other factors still missing, like the detector gain, obliquity, parallax, and spot partiality, but these are all "taken care of" by the data processing program. The main thing is to figure out the number of photons that were accumulated for a given h,k,l index, and then take the square root of that to get the "counting error". Oh, and you also need to know the number of "background" photons that fell into the pixels used to add up photons for the h,k,l of interest. The square root of this count must be combined with the "counting error" of the spot photons, along with a few other sources of error. This is what we discuss around Equation (18) in the linked-to paper above.
The short answer, however, is that sqrt(I_in_photons) is only one component of sigma(I). The other factors fall into three main categories: readout noise, counting noise and what I call "fractional noise". Now, if you have a number of different sources of noise, you get the total noise by adding up the squares of all the components, and then taking the square root:
sigma(I_in_photons) = sqrt( I_in_photons + background_photons + sigma_readout^2 + frac_error*I_in_photons^2 )
For those of you who use SCALA and think the sqrt( sigI^2 + B*I + sdadd*I^2 ) form of this equation looks a lot like the SDCORRection line, good job! That is a very perceptive observation.
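As a rough illustration of adding noise sources in quadrature, here is a small sketch of the sigma(I_in_photons) formula. It rests on one assumption that should be stated loudly: frac_error is treated as a fractional standard deviation (e.g. 0.03 for 3%), so its variance term is written (frac_error*I)^2. The function name and the example numbers are invented for illustration.

```python
import math

def sigma_i_photons(spot_photons, background_photons, sigma_readout, frac_error):
    """Total noise on a spot intensity, all quantities in photons.

    Assumption (not spelled out in the text): frac_error is a fractional
    standard deviation, so it enters the variance as (frac_error * I)**2.
    """
    variance = (
        spot_photons                           # counting noise of the spot
        + background_photons                   # counting noise of the background
        + sigma_readout ** 2                   # readout noise
        + (frac_error * spot_photons) ** 2     # fractional noise
    )
    return math.sqrt(variance)

# Made-up example values: a weak spot and a very bright spot.
weak = sigma_i_photons(100.0, 50.0, 2.0, 0.03)
bright = sigma_i_photons(1.0e6, 50.0, 2.0, 0.03)
print("weak spot:   sigma(I)/I =", weak / 100.0)
print("bright spot: sigma(I)/I =", bright / 1.0e6)
```

With these invented numbers the weak spot is dominated by counting noise, while the relative error of the bright spot stays close to the assumed 3% fractional error rather than the 0.1% that counting statistics alone would suggest.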
What separates the three kinds of noise is how they relate to the exposure time. For example, readout noise is always the same, no matter what the exposure time is, but as you increase the exposure time, the number of photons in the spots and the background go up proportionally. This means that the contribution of "counting noise" to sigma(I) increases as the square root of the exposure time. On modern detectors, the read-out noise is equivalent to the "counting noise" of a few (or even zero) photons/pixel, and so as soon as you have more than about 10 photons/pixel of background, the readout noise is no longer significant.
So, in general, noise increases with the square root of exposure time, but the signal (I_in_photons) increases in direct proportion to exposure time, so the signal-to-noise ratio (from counting noise alone) goes up with the square root of exposure time. That is, until you hit the third type of noise: fractional noise. There are many sources of fractional noise: shutter timing error, crystal vibration, flicker in the incident beam intensity, inaccurate scaling factors (including the absorption correction), and variations in detector sensitivity across its face. Essentially, these amount to the error bars on all the terms in the "scale" formula above. On a good diffractometer, all these errors are small, usually less than a few percent, but one thing they all have in common is their contribution to sigma(I) is proportional to the signal (I_in_photons), not the square root of it! This is the reason why the Rmerge of bright, low-angle spots never gets down to the 0.1% you would expect from counting a million photons, even if you did count that many. It is also the reason why "SDadd" in SCALA (the "estimated error" in scalepack) tends to be around 3-5%. It is also the reason why measuring anomalous differences smaller than ~3% is so difficult.
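The exposure-time argument can be sketched in a few lines. This is a toy model only: the photon rates are made-up numbers, the function name is hypothetical, and frac_error is again assumed to be a fractional standard deviation entering the variance as (frac_error*I)^2.

```python
import math

def signal_to_noise(exposure_time, spot_rate, background_rate=0.0,
                    sigma_readout=0.0, frac_error=0.0):
    """I/sigma(I) for one spot as a function of exposure time (toy model).

    spot_rate and background_rate are photons per unit time (assumed values);
    readout noise is independent of exposure time.
    """
    spot = spot_rate * exposure_time
    background = background_rate * exposure_time
    variance = (spot + background + sigma_readout ** 2
                + (frac_error * spot) ** 2)
    return spot / math.sqrt(variance)

# Compare counting noise alone with counting noise plus 3% fractional noise.
for t in (1.0, 4.0, 16.0, 1.0e4):
    counting_only = signal_to_noise(t, 100.0)
    with_fractional = signal_to_noise(t, 100.0, frac_error=0.03)
    print(t, counting_only, with_fractional)
```

In this model the counting-only signal-to-noise doubles for every fourfold increase in exposure time, whereas with a fractional error included it levels off near 1/frac_error (about 33 for the assumed 3%), however long the exposure.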
So, if you know the magnitude of all these sources of error, then it should be possible to derive a formula for SIGF, but I don't think anyone has quite written down the whole thing yet. Possibly because the difference between Fobs and Fcalc is about 4-5x bigger than SIGF anyway.
-James Holton
MAD Scientist

On 8/18/2011 8:19 AM, G Y wrote:
Dear all,
I am a student in crystallography, so I am not quite familiar with some even basic concepts. In a shelx .hkl file or ccp4 .mtz file there is a column SIGF which is related to the standard deviation of the structure factor. I have read through many textbooks on crystallography, and there are many formulas about this topic. Sometimes it is a square of sigma, sometimes it is not.
My question is:
1. What is the exact mathematical formula for SIGF or SIGFP in a ccp4 or shelx format file?
2. If it can be calculated from F, why is it necessary to include it in the ccp4 or shelx reflection file (they have F already)?
3. Is this value really important in structure determination? Why and how? As I understand it, during data collection each reflection is measured several times, so there is a deviation from the average F. That is the meaning of SIGF. But how is this value used in structure determination? Is there some kind of correction or refinement on F according to SIGF?
And also, when multiplicity during data collection is low, the SIGF would not be so interesting. So does that mean the SIGF would not be so important in some measurements?
Any kind reply from you guys would be greatly appreciated. Many thanks!

Best regards,
G
