Re: [ccp4bb] help with wwPDB validation warning
This old message claims that <I>/<s> = <I/s> "weighted" by s. If counting error is significant, s will be larger for stronger reflections, which are likely to have large I/s, so in general <I>/<s> > unweighted <I/s>, as Werten points out. However, this seems to be the opposite of the OP's situation, if both measures refer to the same outer shell.

EAB

---
To unsubscribe from the CCP4BB list, click the following link: https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB&A=1

This message was issued to members of www.jiscmail.ac.uk/CCP4BB, a mailing list hosted by www.jiscmail.ac.uk, terms & conditions are available at https://www.jiscmail.ac.uk/policyandsecurity/

--- Begin Message ---
b...@freesurf.fr wrote:

*** For details on how to be removed from this list visit the ***
*** CCP4 home page http://www.ccp4.ac.uk ***

From these we calculate a weighted mean <I> and an error estimate of the mean sd(<I>), and a ratio for that unique reflection <I>/sd(<I>). What is printed as Mn(I)/sd is the mean value of that ratio for all reflections (possibly in a resolution bin), i.e. < <I>/sd(<I>) >.

Which is very different (!) from anything one could extract from a Scalepack/HKL2000 output file, which only lists average intensity and average sigma (in the final table at the end). The ratio of those two numbers, i.e. <I>/<sigma>, tends to be considerably more "optimistic" than a proper <I/sigma>.

Werten leaves it as an exercise for the reader to verify this. Here is my submission:

The (ratio of means) is equivalent to a weighted (mean of ratios):

<I>/<s> = (sum(I)/n) / (sum(s)/n) = sum(I) / sum(s) = sum(s*(I/s)) / sum(s) = <I/s> "weighted" by s

This means <I>/<s> looks like <I/s> but with more weight given to those measurements with large s. If the standard deviation is related to counting error, then s is larger for strong reflections (yes, of course as a percentage it is smaller for strong reflections, but in absolute value it is larger). Thus the statistic <I>/<s> is weighted toward strong reflections, and will be more optimistic than unweighted <I/s>.
The next-to-last table of truncate output lists, among other columns, <I/s> and <I>/<s> vs resolution. In the critical last shells <I/s> and <I>/<s> tend to be the same, perhaps because all reflections are around the noise level and s is dominated by background counts, so that all reflections are weighted equally.

For R-type factors (error/magnitude) the ratio of the means (as in R-merge) is more optimistic than the mean of the ratios (as in std deviation), and by a much larger extent: it is weighted by I, whereas "s" varies at most as the sqrt(I). If we used the mean ratio the result would be too horrible to contemplate, with all those weak reflections at the noise level weighted equally with the strong!

But seriously, it makes some sense for a statistic to be weighted toward the strong reflections, since they contribute more to the map. Who cares if a reflection measurement has 70% error if it is so weak that it could be omitted completely with no visible effect on the map?

Ed

Best regards,
S. Werten
--- End Message ---
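The identity in Ed's derivation can be checked numerically. What follows is only a minimal sketch and is not taken from the thread: the intensities are made up, and the error model (s = sqrt(I) plus a constant background term) is an assumption standing in for counting error.

```python
import math

# Hypothetical intensities for illustration; not data from the thread.
intensities = [4.0, 25.0, 100.0, 400.0, 2500.0]
# Assumed error model: counting error sqrt(I) plus a constant background term.
sigmas = [math.sqrt(i) + 2.0 for i in intensities]

n = len(intensities)
ratios = [i / s for i, s in zip(intensities, sigmas)]

# Ratio of means: <I>/<s> = (sum(I)/n) / (sum(s)/n)
ratio_of_means = (sum(intensities) / n) / (sum(sigmas) / n)

# Unweighted mean of ratios: <I/s>
mean_of_ratios = sum(ratios) / n

# Mean of ratios weighted by s: sum(s * (I/s)) / sum(s)
weighted_mean_of_ratios = sum(s * r for s, r in zip(sigmas, ratios)) / sum(sigmas)

print(ratio_of_means, weighted_mean_of_ratios, mean_of_ratios)
```

Under this assumed error model the first two printed values agree, and both exceed the unweighted mean of ratios, as the derivation predicts. With a different error model the inequality need not hold.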
Re: [ccp4bb] help with wwPDB validation warning
Dear Gerard,

I disagree on two points with what you write:

On Mon, 10 Jun 2024 19:15:43 +0100, Gerard Bricogne wrote:
...
> Much worse, in fact: that quantity (I_avg/sigI_avg) makes no sense
>whatsoever in statistical terms. It must be a relic of a quantity that may
>have seemed like a good idea to someone at some stage, and has since been
>dutifully carried along forever after, and "gold-plated" so as to still be
>present in the latest revision of the mmCIF dictionary.

The quantity I_avg/sigI_avg (= <I>/<sigI>) makes as much sense in statistical terms as does Mn(I/sigI) (= <I/sigI>). As I said, numerically they are often similar (within 20% or so), in particular at high resolution.

IIRC, SCALEPACK (from the HKL package) prints out <I>/<sigI> (or used to print it out; maybe this has changed). So there might be a reference to the usage of "I_avg/sigI_avg".

>
> Perhaps you could request a reference to the publication in which this
>quantity was proposed as a validation criterion and its acceptable limits
>were derived :-) .
>
> This being said, if it is indeed the case that the average value of
>your intensities is smaller than the average of their standard deviations,
>there is definitely something wrong somewhere. Perhaps a confusion between
>columns containing values pertaining to intensities vs. amplitudes?

There is nothing wrong if "the average value of your intensities is smaller than the average of their standard deviations", provided the warning that Aline reports refers to the outer shell. If it refers to the whole dataset, yes, then there is something wrong.

Best wishes,
Kay
Re: [ccp4bb] help with wwPDB validation warning
Ok - I have tracked down where it is coming from.

The value being reported is sum(I)/sum(sigma_I). This is not the same as Mean(I/sigmaI), as I interpret the latter as (Sum(I/sigma))/n.

The I values come from the intensity_meas, intensity_meas_au, or intensity column, whichever is present (priority left to right). No averages here.

It is a simple diagnostic: if the value is > 80 or < 2, report a warning.

This is from the old sf_convert program (written > 15 years ago) and the various warnings were carried over to a re-implementation in python.

At the time the program was implemented, the PDB had only started trying to make sense of the experimental data provided by the authors. Remember, experimental X-ray data were not required by the PDB until 2008. Prior to that the data were optional, and some are a mess. Use of experimental data in validation came afterwards. MTZ files might have been accepted back then (I cannot remember), so ensuring that the conversion did not result in incorrect translation was important.

The person doing the work at the time was a structural biologist, but may have come up with his own analyses to find conversion issues. There will likely not be a reference. Certainly the sources do not reference a methodology here.

So - what can we do moving forward? Community standards for identifying such errors should be incorporated. Changing the code is relatively easy. Choosing the correct formulas would be the most meaningful.

And if sf_convert reports different data from AIMLESS, we should strive to understand why.

Ezra

On 6/10/24 2:15 PM, Gerard Bricogne wrote:

Dear Aline,

This is an intriguing message: by what exact piece of software was it produced?
The notation I_avg/sigI_avg does not appear in the definition of the closest item in the mmCIF dictionary, which would be _reflns_shell.meanI_over_sigI_obs that can be found at

https://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v50.dic/Items/_reflns_shell.meanI_over_sigI_obs.html

As Kay explained, the quantity for which you are getting a warning is a ratio of averages, which is not at all the same as the usual average of (signal-to-noise) ratios, denoted Mean((I)/sd(I)) in AIMLESS.

Much worse, in fact: that quantity (I_avg/sigI_avg) makes no sense whatsoever in statistical terms. It must be a relic of a quantity that may have seemed like a good idea to someone at some stage, and has since been dutifully carried along forever after, and "gold-plated" so as to still be present in the latest revision of the mmCIF dictionary.

Perhaps you could request a reference to the publication in which this quantity was proposed as a validation criterion and its acceptable limits were derived :-) .

This being said, if it is indeed the case that the average value of your intensities is smaller than the average of their standard deviations, there is definitely something wrong somewhere. Perhaps a confusion between columns containing values pertaining to intensities vs. amplitudes?

To sober me up from all this speculation, Clemens Vonrhein tells me that it is very likely that it is not the I_avg/sigI_avg quantity that is actually being calculated, and that it is simply a "normal" quantity (e.g. Mean((I)/sd(I))) that is being mis-described in the warning message.

With best wishes,
Gerard.

--
On Fri, Jun 07, 2024 at 03:03:30PM +0100, Aline Dias da Purificação wrote:

Dear all,

I am currently validating a structure for deposition in the wwPDB and encountered the following warning in the validation system:

Warning: Value of (I_avg/sigI_avg = 0.83) is out of range (check Io or SigIo in SF file).

The Mean((I)/sd(I)) in the aimless log is 1.7 in the OuterShell, so I didn't understand the warning.

Has anyone experienced this before and could assist me?

Thank you.
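The diagnostic Ezra describes (sum(I)/sum(sigma_I), with a warning when the value is > 80 or < 2) can be sketched as below. This is not the sf_convert source, which is not shown in the thread: the function name, its parameters, and the treatment of the exact boundary values are all hypothetical, and the input numbers are invented so that the ratio comes out at 0.83.

```python
def i_over_sigma_warning(intensities, sigmas, low=2.0, high=80.0):
    """Hypothetical re-creation of the sum(I)/sum(sigma_I) range check.

    Returns (value, warning), where warning is None when the value lies
    within [low, high]. The thresholds come from Ezra's description; whether
    the real check is inclusive at the boundaries is an assumption.
    """
    total_sigma = sum(sigmas)
    if total_sigma <= 0:
        raise ValueError("sum of sigmas must be positive")
    value = sum(intensities) / total_sigma
    warning = None
    if value > high or value < low:
        warning = f"Value of (I_avg/sigI_avg = {value:.2f}) is out of range"
    return value, warning


# Invented weak data, chosen only so that sum(I)/sum(sigma_I) is 0.83.
weak_value, weak_warning = i_over_sigma_warning([8.3, 8.3], [10.0, 10.0])
print(weak_value, weak_warning)

# Kay's two-reflection example gives 10, which is inside the accepted range.
ok_value, ok_warning = i_over_sigma_warning([100.0, 200.0], [20.0, 10.0])
print(ok_value, ok_warning)
```

Note that a check of this form is applied to whatever reflections are passed in, so a weak dataset or a weak shell can trigger it without any conversion error being present.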
Re: [ccp4bb] help with wwPDB validation warning
Dear Aline,

This is an intriguing message: by what exact piece of software was it produced? The notation I_avg/sigI_avg does not appear in the definition of the closest item in the mmCIF dictionary, which would be _reflns_shell.meanI_over_sigI_obs that can be found at

https://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v50.dic/Items/_reflns_shell.meanI_over_sigI_obs.html

As Kay explained, the quantity for which you are getting a warning is a ratio of averages, which is not at all the same as the usual average of (signal-to-noise) ratios, denoted Mean((I)/sd(I)) in AIMLESS.

Much worse, in fact: that quantity (I_avg/sigI_avg) makes no sense whatsoever in statistical terms. It must be a relic of a quantity that may have seemed like a good idea to someone at some stage, and has since been dutifully carried along forever after, and "gold-plated" so as to still be present in the latest revision of the mmCIF dictionary.

Perhaps you could request a reference to the publication in which this quantity was proposed as a validation criterion and its acceptable limits were derived :-) .

This being said, if it is indeed the case that the average value of your intensities is smaller than the average of their standard deviations, there is definitely something wrong somewhere. Perhaps a confusion between columns containing values pertaining to intensities vs. amplitudes?

To sober me up from all this speculation, Clemens Vonrhein tells me that it is very likely that it is not the I_avg/sigI_avg quantity that is actually being calculated, and that it is simply a "normal" quantity (e.g. Mean((I)/sd(I))) that is being mis-described in the warning message.

With best wishes,
Gerard.
--
On Fri, Jun 07, 2024 at 03:03:30PM +0100, Aline Dias da Purificação wrote:
> Dear all,
>
> I am currently validating a structure for deposition in the wwPDB and
> encountered the following warning in the validation system:
>
> Warning: Value of (I_avg/sigI_avg = 0.83) is out of range (check Io or SigIo
> in SF file).
>
> The Mean((I)/sd(I)) in the aimless log is 1.7 in the OuterShell, so I didn't
> understand the warning.
>
> Has anyone experienced this before and could assist me?
>
> Thank you.
Re: [ccp4bb] help with wwPDB validation warning
Hi Aline,

I see nothing wrong with <I>/<sigI> being 0.83.

Mean((I)/sd(I)) is <I/sigI>, which is not the same as <I>/<sigI>, so you cannot expect the numerical values to be the same (even if the resolution shell definition is identical), although the two values usually do not differ much.

Take for example two reflections, I1=100 sigI1=20 and I2=200 sigI2=10. Then <I/sigI>=(5+20)/2=12.5 and <I>/<sigI>=150/15=10.

Best,
Kay

On Fri, 7 Jun 2024 15:03:30 +0100, Aline Dias da Purificação wrote:
>Dear all,
>
>I am currently validating a structure for deposition in the wwPDB and
>encountered the following warning in the validation system:
>
>Warning: Value of (I_avg/sigI_avg = 0.83) is out of range (check Io or SigIo
>in SF file).
>
>The Mean((I)/sd(I)) in the aimless log is 1.7 in the OuterShell, so I didn't
>understand the warning.
>
>Has anyone experienced this before and could assist me?
>
>Thank you.
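Kay's two-reflection example can be reproduced in a few lines. This is just a sketch of the arithmetic in the message above; the variable names are my own and nothing here reflects how AIMLESS or the wwPDB software computes its statistics.

```python
# The two reflections from Kay's example.
intensities = [100.0, 200.0]
sigmas = [20.0, 10.0]
n = len(intensities)

# Mean of ratios, <I/sigI>: average the per-reflection I/sigI values.
mean_of_ratios = sum(i / s for i, s in zip(intensities, sigmas)) / n

# Ratio of means, <I>/<sigI>: average I and sigI separately, then divide.
ratio_of_means = (sum(intensities) / n) / (sum(sigmas) / n)

print(mean_of_ratios)  # 12.5
print(ratio_of_means)  # 10.0
```

In this particular example the stronger reflection has the smaller sigma, which is why the mean of ratios comes out larger than the ratio of means.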
Re: [ccp4bb] help with wwPDB validation warning
Hmm - no idea, but perhaps interesting that 0.83 ~ 1.7/2

Eleanor

On Fri, 7 Jun 2024 at 15:13, Aline Dias da Purificação <d5ed37c6eb7b-dmarc-requ...@jiscmail.ac.uk> wrote:
> Dear all,
>
> I am currently validating a structure for deposition in the wwPDB and
> encountered the following warning in the validation system:
>
> Warning: Value of (I_avg/sigI_avg = 0.83) is out of range (check Io or
> SigIo in SF file).
>
> The Mean((I)/sd(I)) in the aimless log is 1.7 in the OuterShell, so I
> didn't understand the warning.
>
> Has anyone experienced this before and could assist me?
>
> Thank you.