Re: [ccp4bb] help with wwPDB validation warning

2024-06-14 Thread Edward Berry

This old message claims that <I>/<s> = <I/s> "weighted" by s.

If counting error is significant, s will be larger for stronger reflections, which are likely to
have large I/s, so in general <I>/<s> > unweighted <I/s>, as Werten points out.

However this seems to be the opposite of the OP's situation, if both measures 
refer to the same outer shell.
EAB
---



To unsubscribe from the CCP4BB list, click the following link:
https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB&A=1

This message was issued to members of www.jiscmail.ac.uk/CCP4BB, a mailing list 
hosted by www.jiscmail.ac.uk, terms & conditions are available at 
https://www.jiscmail.ac.uk/policyandsecurity/
--- Begin Message ---



b...@freesurf.fr wrote:

***  For details on how to be removed from this list visit the  ***
***  CCP4 home page http://www.ccp4.ac.uk ***




From these we calculate a weighted mean <I> and an error estimate of
the mean sd(<I>), and a ratio for that unique reflection
<I>/sd(<I>)

What is printed as Mn(I)/sd is the mean value of that ratio for all
reflections (possibly in a resolution bin) ie

< <I>/sd(<I>) >

Which is very different (!) from anything one could extract from a
Scalepack/HKL2000 output file, which only lists average intensity and
average sigma (in the final table at the end). The ratio of those two
numbers, i.e. <I>/<sigma>, tends to be considerably more "optimistic" than
a proper <I/sigma>.



Werten leaves it as an exercise for the reader
to verify this. Here is my submission:

The (ratio of means) is equivalent to a weighted
(mean of ratios):

<I>/<s> = (sum(I)/n) / (sum(s)/n)

        = sum(I) / sum(s)

        = sum(s*(I/s)) / sum(s)

        = <I/s> "weighted" by s

This means <I>/<s> looks like <I/s> but with more
weight given to those measurements with large s.
If the standard deviation is related to counting
error, then s is larger for strong reflections
(yes of course as a percentage it is smaller
for strong reflections, but in absolute value
it is larger).
Thus the statistic <I>/<s> is weighted toward
strong reflections, and will be more optimistic
than unweighted <I/s>.
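
A minimal numerical sketch of the identity above (ratio of means = mean of ratios weighted by s). The intensities and the error model s = sqrt(I + background) are made-up assumptions chosen only so that s grows with I; the function names are illustrative and do not come from any CCP4 program.

```python
import math
import random


def ratio_of_means(I, s):
    """Mean intensity divided by mean sigma."""
    return (sum(I) / len(I)) / (sum(s) / len(s))


def mean_of_ratios(I, s):
    """Unweighted mean of the per-reflection I/s values."""
    return sum(i / si for i, si in zip(I, s)) / len(I)


def weighted_mean_of_ratios(I, s):
    """Mean of I/s with each ratio weighted by its own s."""
    return sum(si * (i / si) for i, si in zip(I, s)) / sum(s)


random.seed(0)
# Assumed synthetic data: intensities from weak to strong.
I = [random.uniform(10.0, 1000.0) for _ in range(500)]
# Assumed error model: counting error plus a constant background term,
# so s is larger in absolute value for stronger reflections.
s = [math.sqrt(i + 400.0) for i in I]

print("ratio of means          :", ratio_of_means(I, s))
print("s-weighted mean of I/s  :", weighted_mean_of_ratios(I, s))
print("unweighted mean of I/s  :", mean_of_ratios(I, s))
```

With this error model both s and I/s increase with I, so the weighted value comes out above the unweighted one; a different error model could behave differently.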

The next-to-last table of truncate output lists
<I/s>, <I>/<s>, <F/s>, <F>/<s> vs resolution.
In the critical last shells <I/s> and <I>/<s>
tend to be the same, perhaps because all
reflections are around the noise level and s
is dominated by background counts so that all
reflections are weighted equally.


For R-type factors (error/magnitude) the ratio of
the means (as in R-merge) is more optimistic
than the mean of the ratios (as in std deviation),
and by a much larger extent: it is weighted by I,
whereas "s" varies at most as the sqrt(I).
If we used the mean ratio the result would be
too horrible to contemplate, with all those weak
reflections at the noise level weighted equally
with the strong! But seriously, it makes some
sense for a statistic to be weighted toward the
strong reflections, since they contribute more
to the map. Who cares if a reflection measurement
has 70% error if it is so weak that it could be
omitted completely with no visible effect on
the map?
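
The same bookkeeping can be sketched for an R-type statistic: sum(err)/sum(I) is the mean of err/I weighted by I. The data and the error model below are assumptions for illustration only, and this is not how R-merge is computed from real multiplicity data.

```python
import math
import random

random.seed(1)
# Assumed synthetic intensities and absolute errors (err grows like sqrt(I)).
I = [random.uniform(10.0, 1000.0) for _ in range(500)]
err = [math.sqrt(i + 400.0) for i in I]

# Ratio of sums, the form used by R-type factors.
ratio_of_sums = sum(err) / sum(I)

# The same number written as a mean of err/I weighted by I.
weighted_mean = sum(i * (e / i) for i, e in zip(I, err)) / sum(I)

# Unweighted mean of the relative errors: weak reflections count as much
# as strong ones, so this is the less flattering number.
unweighted_mean = sum(e / i for i, e in zip(I, err)) / len(I)

print("ratio of sums           :", ratio_of_sums)
print("I-weighted mean of err/I:", weighted_mean)
print("unweighted mean of err/I:", unweighted_mean)
```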

Ed


Best regards,

   S. Werten










--- End Message ---


Re: [ccp4bb] help with wwPDB validation warning

2024-06-11 Thread Kay Diederichs
Dear Gerard,

I disagree in two points with what you write:

On Mon, 10 Jun 2024 19:15:43 +0100, Gerard Bricogne wrote:
...
> Much worse, in fact: that quantity (I_avg/sigI_avg) makes no sense
>whatsoever in statistical terms. It must be a relic of a quantity that may
>have seemed like a good idea to someone at some stage, and has since been
>dutifully carried along forever after, and "gold-plated" so as to still be
>present in the latest revision of the mmCIF dictionary.

The quantity I_avg/sigI_avg (= <I>/<sigI>) makes as much sense in statistical
terms as does Mn(I/sigI) (= <I/sigI>).
As I said, numerically they are often similar (within 20% or so), in particular
at high resolution.
IIRC, SCALEPACK (from the HKL package) prints out <I>/<sigI> (or used to print
it out; maybe this has changed).
So there might be a reference to the usage of "I_avg/sigI_avg".

>
> Perhaps you could request a reference to the publication in which this
>quantity was proposed as a validation criterion and its acceptable limits
>were derived :-) .
>
> This being said, if it is indeed the case that the average value of
>your intensities is smaller than the average of their standard deviations,
>there is definitely something wrong somewhere. Perhaps a confusion between
>columns containing values pertaining to intensities vs. amplitudes?

There is nothing wrong if "the average value of your intensities is smaller 
than the average of their standard deviations",
if the warning that Aline reports refers to the outer shell. If it refers to 
the whole dataset, yes then there's something wrong.

Best wishes,
Kay





Re: [ccp4bb] help with wwPDB validation warning

2024-06-10 Thread Ezra Peisach

Ok - I have tracked down where it is coming from.

The value being reported is sum(I)/sum(sigma_I).   This is not the same 
as Mean(I/sigmaI) - as I interpret the latter as (Sum(I/sigma))/n.


Where I comes from is the intensity_meas, intensity_meas_au, or 
intensity column - whichever is present (priority left to right).


No averages here.

It is a simple diagnostic: if the value is > 80 or < 2, report a warning.
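
A rough sketch of the check as described above, not the actual sf_convert source. The function names are hypothetical; only the column names, the sum(I)/sum(sigma_I) formula, the 2 and 80 limits and the wording of the warning are taken from this thread.

```python
# Column priority as described above (left to right).
INTENSITY_COLUMNS = ("intensity_meas", "intensity_meas_au", "intensity")


def pick_intensity_column(available):
    """Return the first intensity column present, or None if there is none."""
    for name in INTENSITY_COLUMNS:
        if name in available:
            return name
    return None


def i_avg_over_sigi_avg(intensities, sigmas):
    """sum(I)/sum(sigma_I); the 1/n factors of the two means cancel."""
    return sum(intensities) / sum(sigmas)


def range_warning(value, low=2.0, high=80.0):
    """Return the warning text if value is > high or < low, else None."""
    if value > high or value < low:
        return (
            f"Warning: Value of (I_avg/sigI_avg = {value:.2f}) is out of range "
            "(check Io or SigIo in SF file)."
        )
    return None


print(pick_intensity_column(["intensity", "intensity_meas_au"]))
print(range_warning(i_avg_over_sigi_avg([100.0, 200.0], [20.0, 10.0])))
print(range_warning(0.83))
```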

This is from the old sf_convert program (written > 15 years ago) and the 
various warnings were carried over to a re-implementation in Python.


At the time the program was implemented, the PDB had only started trying 
to make sense of the experimental data provided by the authors.  
Remember, experimental X-ray data were not required by the PDB until 
2008. Prior to that the data were optional, and some are a mess.  Use of 
experimental data in validation came afterwards.  MTZ files might have 
been accepted back then (I cannot remember) - so ensuring that the 
conversion did not result in incorrect translation was important.


The person doing the work at the time was a structural biologist, but 
may have come up with his own analyses to find conversion issues.  There 
will likely not be a reference.  Certainly the sources do not reference 
a methodology here.


So - what can we do moving forward?  Community standards for 
identifying such errors should be incorporated. Changing the code 
is relatively easy.  Choosing the correct formulas would be the most 
meaningful.  And if sf_convert reports different data from AIMLESS - we 
should strive to understand why.



Ezra




On 6/10/24 2:15 PM, Gerard Bricogne wrote:

Dear Aline,

  This is an intriguing message: by what exact piece of software was it
produced?

  The notation I_avg/sigI_avg does not appear in the definition of the
closest item in the mmCIF dictionary, which would be

_reflns_shell.meanI_over_sigI_obs

that can be found at

https://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v50.dic/Items/_reflns_shell.meanI_over_sigI_obs.html

  As Kay explained, the quantity for which you are getting a warning is a
ratio of averages, which is not at all the same as the usual average of
(signal-to-noise) ratios, denoted Mean((I)/sd(I)) in AIMLESS.

  Much worse, in fact: that quantity (I_avg/sigI_avg) makes no sense
whatsoever in statistical terms. It must be a relic of a quantity that may
have seemed like a good idea to someone at some stage, and has since been
dutifully carried along forever after, and "gold-plated" so as to still be
present in the latest revision of the mmCIF dictionary.

  Perhaps you could request a reference to the publication in which this
quantity was proposed as a validation criterion and its acceptable limits
were derived :-) .

  This being said, if it is indeed the case that the average value of
your intensities is smaller than the average of their standard deviations,
there is definitely something wrong somewhere. Perhaps a confusion between
columns containing values pertaining to intensities vs. amplitudes?

  To sober me up from all this speculation, Clemens Vonrhein tells me
that it is very likely that it is not the I_avg/sigI_avg quantity that is
actually being calculated, and that it is simply a "normal" quantity (e.g.
Mean((I)/sd(I))) that is being mis-described in the warning message.


  With best wishes,

   Gerard.

--
On Fri, Jun 07, 2024 at 03:03:30PM +0100, Aline Dias da Purificação wrote:

Dear all,

I am currently validating a structure for deposition in the wwPDB and 
encountered the following warning in the validation system:

Warning: Value of (I_avg/sigI_avg = 0.83) is out of range (check Io or SigIo in 
SF file).

The Mean((I)/sd(I)) in the aimless log is 1.7 in the OuterShell, so I didn't 
understand the warning.

Has anyone experienced this before and could assist me?

Thank you.




Re: [ccp4bb] help with wwPDB validation warning

2024-06-10 Thread Gerard Bricogne
Dear Aline,

 This is an intriguing message: by what exact piece of software was it
produced? 

 The notation I_avg/sigI_avg does not appear in the definition of the
closest item in the mmCIF dictionary, which would be

   _reflns_shell.meanI_over_sigI_obs

that can be found at 

https://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v50.dic/Items/_reflns_shell.meanI_over_sigI_obs.html

 As Kay explained, the quantity for which you are getting a warning is a
ratio of averages, which is not at all the same as the usual average of
(signal-to-noise) ratios, denoted Mean((I)/sd(I)) in AIMLESS. 

 Much worse, in fact: that quantity (I_avg/sigI_avg) makes no sense
whatsoever in statistical terms. It must be a relic of a quantity that may
have seemed like a good idea to someone at some stage, and has since been
dutifully carried along forever after, and "gold-plated" so as to still be
present in the latest revision of the mmCIF dictionary. 

 Perhaps you could request a reference to the publication in which this
quantity was proposed as a validation criterion and its acceptable limits
were derived :-) .

 This being said, if it is indeed the case that the average value of
your intensities is smaller than the average of their standard deviations,
there is definitely something wrong somewhere. Perhaps a confusion between
columns containing values pertaining to intensities vs. amplitudes?

 To sober me up from all this speculation, Clemens Vonrhein tells me
that it is very likely that it is not the I_avg/sigI_avg quantity that is
actually being calculated, and that it is simply a "normal" quantity (e.g.
Mean((I)/sd(I))) that is being mis-described in the warning message.


 With best wishes,

  Gerard.

--
On Fri, Jun 07, 2024 at 03:03:30PM +0100, Aline Dias da Purificação wrote:
> Dear all,
> 
> I am currently validating a structure for deposition in the wwPDB and 
> encountered the following warning in the validation system: 
> 
> Warning: Value of (I_avg/sigI_avg = 0.83) is out of range (check Io or SigIo 
> in SF file). 
> 
> The Mean((I)/sd(I)) in the aimless log is 1.7 in the OuterShell, so I didn't 
> understand the warning.
> 
> Has anyone experienced this before and could assist me?
> 
> Thank you.





Re: [ccp4bb] help with wwPDB validation warning

2024-06-08 Thread Kay Diederichs
Hi Aline,

I see nothing wrong with <I>/<sigI> being 0.83.

Mean((I)/sd(I)) is <I/sigI>, which is not the same as <I>/<sigI>, so you cannot
expect the numerical values to be the same (even in case the resolution shell
definition is identical), although the two values usually do not differ much.
Take for example two reflections, I1=100 sigI1=20 and I2=200 sigI2=10. Then
<I/sigI>=(5+20)/2=12.5 and <I>/<sigI>=150/15=10.
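
The two-reflection example can be checked in a few lines. The variable names are illustrative only; the numbers are the ones given above.

```python
# The two reflections from the example.
I = [100.0, 200.0]
sigI = [20.0, 10.0]

# Mean of the per-reflection ratios: (100/20 + 200/10) / 2
mean_of_ratios = sum(i / s for i, s in zip(I, sigI)) / len(I)

# Ratio of the means: (300/2) / (30/2)
ratio_of_means = (sum(I) / len(I)) / (sum(sigI) / len(sigI))

print(mean_of_ratios)  # 12.5
print(ratio_of_means)  # 10.0
```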

Best,
Kay


On Fri, 7 Jun 2024 15:03:30 +0100, Aline Dias da Purificação wrote:

>Dear all,
>
>I am currently validating a structure for deposition in the wwPDB and 
>encountered the following warning in the validation system: 
>
>Warning: Value of (I_avg/sigI_avg = 0.83) is out of range (check Io or SigIo 
>in SF file). 
>
>The Mean((I)/sd(I)) in the aimless log is 1.7 in the OuterShell, so I didn't 
>understand the warning.
>
>Has anyone experienced this before and could assist me?
>
>Thank you.





Re: [ccp4bb] help with wwPDB validation warning

2024-06-07 Thread Eleanor Dodson
Hmm - no idea but perhaps interesting that 0.83 ~ 1.7/2
Eleanor

On Fri, 7 Jun 2024 at 15:13, Aline Dias da Purificação <
d5ed37c6eb7b-dmarc-requ...@jiscmail.ac.uk> wrote:

> Dear all,
>
> I am currently validating a structure for deposition in the wwPDB and
> encountered the following warning in the validation system:
>
> Warning: Value of (I_avg/sigI_avg = 0.83) is out of range (check Io or
> SigIo in SF file).
>
> The Mean((I)/sd(I)) in the aimless log is 1.7 in the OuterShell, so I
> didn't understand the warning.
>
> Has anyone experienced this before and could assist me?
>
> Thank you.


