[ccp4bb] mosflm gain
wondering if mosflm can automatically estimate the gain. i.e. i gather it is still estimated the usual way. -Bryan
Re: [ccp4bb] mosflm gain
Usually Mosflm will use a default value for the gain that depends on the type of detector used. This value is not realistic for CCD detectors, that is, it is not really equal to the ratio of ADUs to incident X-ray photons; however, it satisfies typical images under the assumptions of pixel independence and Poisson distribution, which are not true either. Inasmuch as the gain is just a scale factor in the data, it doesn't really matter that it isn't physically meaningful in the way you might expect from its name.

However, the procedure of calculating gain from the variance-to-mean ratio of a background region of the image, which is the only simple automatic approach available if all you have is an image, should be avoided if you are looking for the gain in "real" units.

I realise that didn't answer the question, but I thought it might be worth pointing out.

-- David

On 3 March 2011 20:34, Bryan Lepore wrote:
> wondering if mosflm can automatically estimate the gain.
>
> i.e. i gather it is still estimated the usual way.
>
> -Bryan
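For readers unfamiliar with the variance-to-mean procedure mentioned above, here is a minimal sketch. It assumes a flat background patch, independent pixels and pure Poisson counting noise (the very assumptions the message questions). The function name and the sample pixel values are illustrative only and are not taken from Mosflm.

```python
import statistics


def gain_from_background(pixels_adu, adc_offset=0.0):
    """Estimate gain (ADU/photon) as variance / mean of a flat background patch.

    If each pixel records N photons (Poisson, mean lam) and stores gain * N ADU,
    then mean = gain * lam and variance = gain**2 * lam, so variance / mean = gain.
    This only holds for independent pixels with pure shot noise.
    """
    signal = [p - adc_offset for p in pixels_adu]
    mean = statistics.fmean(signal)
    if mean <= 0:
        raise ValueError("background mean must be positive after offset subtraction")
    return statistics.variance(signal) / mean


# Hypothetical background patch, already offset-subtracted.
patch = [8, 12, 10, 10, 6, 14]
print(gain_from_background(patch))
```

Any smoothing of the image (for example by the detector point spread function) lowers the variance without changing the mean, so this estimate then falls below the physical gain.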
Re: [ccp4bb] mosflm gain
Dear Bryan,

The quick answer is no. As David Waterman mentioned, it has a default value for the gain for each type of detector that it can deal with.

A more detailed answer: an incorrect value for the gain can be indicated by values of the BGRATIO which differ significantly from unity (1.0). BGRATIO is the ratio of the rms variation in the background to the variation expected on the basis of Poisson statistics, using the gain to convert from digitised values in the image to X-ray photons. This is calculated for all measured spots, and binned as a function of intensity for each image measured (and printed in the full logfile). Mosflm prints a warning message if this differs from 1.0 by more than 10% and will suggest an "improved" value for the gain that should be used.

There are a host of caveats in this procedure. For example, if the images contain significant diffuse scatter around the Bragg spots, the BGRATIO may be above 1.0 ... this is probably the commonest effect, but does not mean the gain is wrong. If for any reason the mask definition (defining the boundary between background and spot) has not worked correctly so the spot extends into the background, this will also give a BGRATIO of one (in this case, the BGRATIO will tend to be close to 1.0 for weak spots but greater than 1.0 for strong ones). The boundary is controlled by the "Profile tolerance" parameters, which are sometimes set artificially high to help process images where the spots are not fully resolved. This is why Mosflm does not automatically update its default value for the gain based on the BGRATIO.

As David has mentioned, this procedure also assumes that adjacent pixels are independent, which they most certainly are not (except possibly for some pixel detectors), due to the point spread function of the detectors and corrections that are applied to the raw images.

Does it matter?
The gain is used to identify outliers in the background plane determination (e.g. due to zingers, shadows, ice spots, hot pixels etc.) which are rejected from the calculation, so if it is significantly in error this will introduce systematic errors in the integrated intensities. This can show up in the cumulative intensity distribution in Truncate if the gain is a very long way off. I have not done a proper study of this, but I think it would need to be out by more than 20% to have a significant effect.

The gain is also used to calculate sig(I); however, the sig(I) values from Mosflm are adjusted in SCALA to reflect the true variation between symmetry related reflections, so that, providing the multiplicity is high enough for this to work correctly, this will not have any real effect on the final merged data.

The bottom line is that the estimates for sig(I) that emerge from this procedure seem to be quite good, in that the correction factors that are subsequently applied in SCALA for cases where other systematic errors are small (i.e. no radiation damage, absorption etc.) are very close to 1.0.

Best wishes,
Andrew

On 3 Mar 2011, at 20:34, Bryan Lepore wrote:
> wondering if mosflm can automatically estimate the gain.
> i.e. i gather it is still estimated the usual way.
> -Bryan
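The BGRATIO definition in this message can be sketched as follows. This is only an illustration of the stated definition (observed background rms over the Poisson expectation, with a warning beyond 10%). The function names are hypothetical, and the rescaling rule `gain * ratio**2` is an assumption that follows from pure shot noise, not Mosflm's documented algorithm, which works per spot and bins by intensity.

```python
import math
import statistics


def bgratio(background_adu, gain, adc_offset=0.0):
    """Observed background rms divided by the Poisson expectation (both in ADU)."""
    signal = [p - adc_offset for p in background_adu]
    mean = statistics.fmean(signal)
    observed_rms = statistics.stdev(signal)
    # N photons = mean / gain, so sigma in ADU = gain * sqrt(N) = sqrt(gain * mean).
    expected_rms = math.sqrt(gain * mean)
    return observed_rms / expected_rms


def suggested_gain(gain, ratio, tolerance=0.10):
    """Return the input gain if ratio is within tolerance of 1.0, else a rescaled gain.

    Assumption: variance in ADU scales linearly with gain, so matching the
    observed rms requires multiplying the gain by ratio squared.
    """
    if abs(ratio - 1.0) <= tolerance:
        return gain
    return gain * ratio ** 2


# Hypothetical background pixels (offset-subtracted) and a trial gain.
pixels = [8, 12, 10, 10, 6, 14]
ratio = bgratio(pixels, gain=0.5)
print(ratio, suggested_gain(0.5, ratio))
```

As the message explains, a ratio above 1.0 can also come from diffuse scatter or a poorly defined spot mask, so a suggested value of this kind is a diagnostic and not a measurement.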
Re: [ccp4bb] mosflm gain
Dear All,

I spotted a mistake in my response; please see the correction below (in capitals):

There are a host of caveats in this procedure. For example, if the images contain significant diffuse scatter around the Bragg spots, the BGRATIO may be above 1.0 ... this is probably the commonest effect, but does not mean the gain is wrong. If for any reason the mask definition (defining the boundary between background and spot) has not worked correctly so the spot extends into the background, this will also give a BGRATIO of GREATER THAN one (in this case, the BGRATIO will tend to be close to 1.0 for weak spots but greater than 1.0 for strong ones). The boundary is controlled by the "Profile tolerance" parameters, which are sometimes set artificially high to help process images where the spots are not fully resolved.

Andrew

On 3 Mar 2011, at 20:34, Bryan Lepore wrote:
> wondering if mosflm can automatically estimate the gain.
> i.e. i gather it is still estimated the usual way.
> -Bryan
Re: [ccp4bb] mosflm gain
I have found that the best way to get the GAIN "right" in MOSFLM is to have a look at the optimum "Sdfac" parameter at the end of SCALA (the first of the three SDCORRection values). Specifically, if SDFac is > 1, then you need to increase the GAIN. This is because SDFac>1 means that the spots were noisier than MOSFLM thought they should be, and if a given number of ADU is noisier than expected, then there must have been fewer photons involved in generating the signal. This means that the "true gain" was higher.

Yes, there are other sources of error, like shutter jitter, beam flicker, calibration errors, absorption effects, scale factor errors, etc. But these are all directly proportional to the intensity, and therefore accounted for by adjusting SDadd (the last of the three SDCORR values). SDfac accounts for noise proportional to the square root of intensity, and only shot noise (like photon counting) behaves like that.

David Waterman makes an excellent point that the point-spread function (PSF) acts like a smoothing filter and makes the background look less noisy than photon-counting error permits. This makes the BGRATIO-estimated GAIN lower than the "true" GAIN. However, one can argue that this is not always a bad thing, since the error in measuring the intensity of a given area of flat background really is "better than photon counting". This is because you have the smoothing effect of the PSF working "for you": bringing in signal from areas outside the region you are measuring (prior knowledge of "flatness" if you will).

However, this smoothing effect of the PSF does not apply to spots because spot photons all arrive in essentially the same place, and no "smoothing" will change the intrinsic noise of the total number of photons that actually arrived. The upshot of this is that we really need two different values for GAIN, one for the background and one for the background-subtracted spot intensity.
The influence on sigma(I) would depend on the relative contributions from the spot vs the background under it. I am pretty sure this is not implemented.

It is perhaps interesting that there is also a third type of noise which is independent of the spot intensity: "read-out noise". This used to be called "fog" on film detectors. Despite all the money we spend on detectors that minimize it, there is no specific accounting for read-out noise in MOSFLM or any other integration package I am aware of. However, a "trick" to account for it is to simply lower the ADCOFFSET. For example, using 1 A X-rays on an ADSC Q315r detector in hwbin mode, the true GAIN is 1.8 ADU/photon, the ADCOFFSET is 40 ADU, and the read-out noise is equivalent to the noise deposited by ~2 photons/pixel of X-ray background. This means that a blank image has an average value of 40 ADU and rms variation of ~2.5 ADU, but this is equivalent to an image from a detector with the same gain, no read-out noise, and ADCOFFSET of 36 that was "fogged" by 2 photons/pixel (regardless of exposure time). Yes, this is a small change in ADCOFFSET, and I doubt you will notice the difference. I think this speaks to the fact that, on modern detectors at least, read-out noise is essentially negligible.

Another way to get the GAIN, of course, is to measure it directly. I did this on an ADSC Q315 detector in swbin mode by comparison to a NaI:Tl scintillator (after accounting for the window and sensor thickness of the latter device): http://bl831.als.lbl.gov/~jamesh/pickup/Q315_gain.png

You can see how the GAIN changes appreciably with photon energy, and this is largely because lower-energy photons generate less signal. GAIN also changes with the detector read-out mode. For example, this number is 3 times higher for a Q315r in hwbin mode.
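The arithmetic behind the ADCOFFSET "trick" can be checked with a few lines. This sketch uses only the numbers quoted in the message (1.8 ADU/photon, offset 40 ADU, ~2 photons/pixel of equivalent fog); the function name is illustrative.

```python
import math


def fog_equivalent(gain, adc_offset, fog_photons):
    """Return (rms in ADU, equivalent ADCOFFSET) for read-out noise modelled as fog.

    The read-out noise is treated as the Poisson noise of `fog_photons`
    photons per pixel: rms = gain * sqrt(fog_photons). Those fictitious
    photons would contribute gain * fog_photons ADU to the mean, so the
    offset is lowered by that amount to keep the blank-image mean unchanged.
    """
    rms_adu = gain * math.sqrt(fog_photons)
    equivalent_offset = adc_offset - gain * fog_photons
    return rms_adu, equivalent_offset


rms, offset = fog_equivalent(gain=1.8, adc_offset=40.0, fog_photons=2.0)
print(f"rms = {rms:.2f} ADU, equivalent ADCOFFSET = {offset:.1f} ADU")
```

The results (about 2.55 ADU and 36.4 ADU) agree with the rounded values of ~2.5 ADU and 36 given above.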
I have listed my best information on the typical GAIN and read-out noise of common detectors on my "minimum crystal size" page here: http://bl831.als.lbl.gov/xtalsize.html

You can extract the parameters by selecting the "detector type = " you want, and then switching it again to "Custom..."

-James Holton
MAD Scientist

On 3/3/2011 12:34 PM, Bryan Lepore wrote:
> wondering if mosflm can automatically estimate the gain.
> i.e. i gather it is still estimated the usual way.
> -Bryan
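The direction of the SdFac argument in this message (SdFac > 1 implies a higher gain) can be made quantitative under one assumption that the message does not state explicitly: that the sigma in question is pure shot noise, so that its variance in ADU is `gain * I`. Under that assumption, scaling sigma by SdFac is the same as scaling the gain by SdFac squared. Both function names below are hypothetical.

```python
import math


def shot_sigma_adu(intensity_adu, gain):
    """Shot-noise sigma in ADU for a signal of intensity_adu at the given gain."""
    # N = I / gain photons, so sigma = gain * sqrt(N) = sqrt(gain * I).
    return math.sqrt(gain * intensity_adu)


def gain_from_sdfac(gain, sdfac):
    """Gain that would absorb SdFac, assuming sigma is shot noise only."""
    return gain * sdfac ** 2


old_gain = 1.0
new_gain = gain_from_sdfac(old_gain, sdfac=1.3)
ratio = shot_sigma_adu(1000.0, new_gain) / shot_sigma_adu(1000.0, old_gain)
print(new_gain, ratio)
```

With real data, SdFac, SdB and SdAdd are refined together, so a rescaling of this kind is at best a starting point for a rerun.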
Re: [ccp4bb] mosflm gain
I have to say that I don't fully agree with James' recommendation to adjust the GAIN in MOSFLM until the calculated SDFAC parameter in SCALA is 1.0.

(Background information: the sigmas from Mosflm, sd(I), are corrected in SCALA according to

    sd(I) corrected = SdFac * sqrt{sd(I)**2 + SdB*Ihl + (SdAdd*Ihl)**2}

in order to get the best agreement between corrected sigmas and the observed differences between symmetry/Friedel related intensities.)

While I fully agree with his argument that systematic errors such as absorption, etc. give an error proportional to the intensity, and therefore should be corrected by the SDADD term rather than SDFAC, in any "real world" data set that I have come across the situation is not so simple. Indeed, according to the usual treatment of errors there should be no need for the SDB term in SCALA, but in practice it is essential to have this term to be able to match corrected sigmas with the observed differences between symmetry related reflections. It also turns out that the three variable parameters SDFAC, SDB and SDADD are highly correlated, so one can get rather different values for any individual parameter from very similar datasets. Radiation damage is certainly one source of error which would not be expected to follow a simple error model, or non-isomorphism if multiple crystals have been used.

Phil Evans is not entirely happy with the behaviour of the refinement of these parameters and is in fact currently looking at this, but there is a basic problem here that one is trying to use a simple error model for a situation where (for whatever reason) it does not really apply. The sigma estimates from MOSFLM are only intended to give an estimate of the random error in the intensities. In my opinion, trying to account for systematic errors is best done at the point of merging the data where much more information is available (i.e. symmetry related measurements).
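The SCALA correction quoted in this message translates directly into code. The sketch below implements only that formula; the function name, argument names and the numbers in the example are illustrative and are not SCALA output.

```python
import math


def corrected_sd(sd, ihl, sdfac, sdb, sdadd):
    """sd(I) corrected = SdFac * sqrt(sd(I)**2 + SdB*Ihl + (SdAdd*Ihl)**2)."""
    variance = sd ** 2 + sdb * ihl + (sdadd * ihl) ** 2
    if variance < 0:
        # Possible when SdB * Ihl is negative and outweighs the other terms.
        raise ValueError("corrected variance is negative for these parameters")
    return sdfac * math.sqrt(variance)


# Hypothetical reflection: sd(I) = 3, Ihl = 100, SdFac = 1.2, SdB = 0, SdAdd = 0.04.
print(corrected_sd(sd=3.0, ihl=100.0, sdfac=1.2, sdb=0.0, sdadd=0.04))
```

The three terms scale differently with intensity (constant, linear and quadratic in the variance), which is why the SdAdd term dominates for strong reflections and why the parameters can trade off against each other over a limited intensity range.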
I would be most interested to hear of any examples where the default value of the GAIN in MOSFLM is clearly wrong, but to the best of my current knowledge the default GAIN is perfectly adequate.

Best wishes
Andrew

On 4 Mar 2011, at 19:47, James Holton wrote:
> I have found that the best way to get the GAIN "right" in MOSFLM is to have a look at the optimum "Sdfac" parameter at the end of SCALA (the first of the three SDCORRection values). [...]
Re: [ccp4bb] mosflm gain
Andrew! You don't believe me? Well, I suppose it serves me right for not explaining where the idea came from (see below).

I do, however, agree with Andrew's assessment that the default-chosen gain in MOSFLM is adequate for all practical purposes. Any error in GAIN will be almost exactly compensated for by a corresponding change in Sdfac in SCALA, and the final value of sigma(I) will be essentially the same. The only possible difference will be in the sigma-based outlier rejection within MOSFLM, but since the typical errors in the sigma are only ~30%, I predict it will be hard to find a situation where this makes or breaks a structure determination.

So, by way of explanation: there are three things that led me to this conclusion:

1) the control: fake data with all pixels independent.
   Adjusting the GAIN as MOSFLM recommends from the BGRATIO analysis does, in fact, reproduce the "correct" value of the gain used to generate the fake data. In SCALA, Sdfac refines to ~1.0, SdB refines to 0, and Sdadd refines to the actual magnitude of fractional error (introduced by beam flicker, shutter jitter, etc.). No surprises here.
2) "blur" the fake data with the point-spread function (PSF) empirically derived for my detector.
   In this case, the "MOSFLM-refined gain" is too low. In SCALA, Sdfac refines to ~1.3, SdB refines to 3-5, and Sdadd is a bit low. These parameters are about what I see processing good real data.
3) use real data, but force MOSFLM to use the GAIN calibrated independently for the detector.
   MOSFLM grumbles a lot about the BGRATIO. In SCALA, Sdfac refines to ~1, and SdB refines to ~0. Sdadd is consistent with my independently-measured fractional error sources.

Now, I have not evaluated this approach on a huge number of data sets, but in this case the PSF was both necessary and sufficient to explain the "mystery of SdB". That is: the need for SdB arises because using an "incorrect" gain creates a correlation between Sdfac and Sdadd.
I imagine there are other ways to get a non-zero SdB as well, but for "good data" I suspect this is the dominant mechanism. I never wrote this up because I am fairly certain the article would do nothing to improve the impact factor of the journal in which it was published, but this anecdote might perhaps be useful to Andrew, Phil, and a few other readers of this list.

-James Holton
MAD Scientist

On 3/7/2011 2:00 AM, A Leslie wrote:
> I have to say that I don't fully agree with James' recommendation to adjust the GAIN in MOSFLM until the calculated SDFAC parameter in SCALA is 1.0. [...]
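The first two experiments described in this message (independent pixels, then the same data blurred by a PSF) can be imitated with a toy one-dimensional simulation. Everything here is an assumption made for illustration: the 3-point kernel is a stand-in and not the empirically derived PSF, the noise is a Gaussian approximation to Poisson counting, and the gain is estimated by a plain variance-to-mean ratio and not by MOSFLM.

```python
import random
import statistics


def variance_to_mean(values):
    """Gain estimate (ADU/photon) under the independent-pixel Poisson assumption."""
    return statistics.variance(values) / statistics.fmean(values)


def blur(values, kernel=(0.25, 0.5, 0.25)):
    """Circular 1-D convolution with a normalised kernel (a stand-in PSF)."""
    n = len(values)
    half = len(kernel) // 2
    return [
        sum(w * values[(i + k - half) % n] for k, w in enumerate(kernel))
        for i in range(n)
    ]


def simulate(true_gain=1.8, photons_per_pixel=100.0, n_pixels=20000, seed=0):
    """Return (gain estimate from independent pixels, gain estimate after blurring)."""
    rng = random.Random(seed)
    sigma = photons_per_pixel ** 0.5
    # Gaussian approximation to Poisson counting noise (reasonable at ~100 photons).
    raw = [true_gain * rng.gauss(photons_per_pixel, sigma) for _ in range(n_pixels)]
    return variance_to_mean(raw), variance_to_mean(blur(raw))


raw_gain, blurred_gain = simulate()
print(f"independent pixels: {raw_gain:.2f}   after blur: {blurred_gain:.2f}")
```

For this kernel the variance drops by the sum of the squared weights (0.375) while the mean is unchanged, so the blurred estimate should come out near 0.375 * 1.8 = 0.675 ADU/photon, well below the gain used to generate the data. That reproduces, in miniature, the "MOSFLM-refined gain is too low" observation in point 2.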
Re: [ccp4bb] mosflm gain
Dear James,

Many thanks for the detailed explanation. I do find your results very interesting and (when time allows!) I will certainly investigate this effect in more detail and see if I find similar results for data that shows significant levels of radiation damage (as mine invariably seem to do). I have to admit that it is not entirely clear to me why PSF would result in a correlation between SDFAC and SDADD, although this is clearly what you see. It would be rewarding to get to the bottom of this.

As I mentioned earlier, Phil Evans is currently looking at the refinement of the SD parameters in relation to Aimless (the imminent replacement for SCALA) so he is also very interested in figuring out exactly what is going on here (but does not have any answers as yet).

Best wishes,
Andrew

> Andrew! You don't believe me? Well, I suppose it serves me right for
> not explaining where the idea came from (see below). [...]