Dear Randy
Yes this makes sense.
Certainly cut-offs are bad. I hope my post wasn’t implying one should cut 
off the data at some particular resolution shell. Some reflections in a shell 
will be weak and some stronger. Knowing which are which is of course 
information.
I will have a look at the 2019 CCP4 Study Weekend paper.
Regards
Colin

From: Randy Read <rj...@cam.ac.uk>
Sent: 20 February 2020 11:45
To: Nave, Colin (DLSLtd,RAL,LSCI) <colin.n...@diamond.ac.uk>
Cc: CCP4BB@jiscmail.ac.uk
Subject: Re: [ccp4bb] [3dem] Which resolution?

Dear Colin,

Over the last few years we've been implementing measures of information gain to 
evaluate X-ray diffraction data in our program Phaser. Some results in a paper 
that has been accepted for publication in the 2019 CCP4 Study Weekend special 
issue are relevant to this discussion.

First, looking at data deposited in the PDB, we see that the information gain 
in the highest resolution shell is typically about 0.5-1 bit per reflection 
(though we haven't done a comprehensive analysis yet).  A very rough 
calculation suggests that a half-bit resolution threshold is equivalent to 
something like an I/SIGI threshold of one.  So that would fit with the idea 
that a possible resolution limit measure would be the resolution where the 
average information per reflection drops to half a bit.

Second, even if the half-bit threshold is where the data are starting to 
contribute less to the image and to likelihood targets for tasks like molecular 
replacement and refinement, weaker data still contribute some useful signal 
down to limits as low as 0.01 bit per reflection.  So any number attached to 
the nominal resolution of a data set should not necessarily be applied as a 
resolution cutoff, at least as long as the refinement target (such as our 
log-likelihood-gain on intensity or LLGI score) accounts properly for large 
measurement errors.

Best wishes,

Randy


On 20 Feb 2020, at 10:15, Nave, Colin (DLSLtd,RAL,LSCI) 
<colin.n...@diamond.ac.uk> wrote:

Dear all,
I have received a request to clarify what I mean by threshold in my 
contribution of 17 Feb  below and then post the clarification on CCP4BB. Being 
a loyal (but very sporadic) CCP4BBer I am now doing this. My musings in this 
thread are as much auto-didactic as didactic. In other words I am trying to 
understand it all myself.

Accepting that the FSC is a suitable metric (I believe it is) I think the most 
useful way of explaining the concept of the threshold is to refer to section 
4.2 and fig. 4 of van Heel and Schatz (2005), Journal of Structural Biology, 
151, 250-262. Figure 4C shows an FSC together with a half bit information 
curve and figure 4D shows the FSC with a 3sigma curve.

The point I was trying to make in rather an obtuse fashion is that the choice 
of threshold will depend on what one is trying to see in the image. I will try 
and give an example related to protein structures rather than uranium hydride 
or axons in the brain. In general protein structures consist of atoms with 
similar scattering power (C, N, O with the hydrogens for the moment invisible) 
and high occupancy. When we can for example distinguish side chains along the 
backbone we have a good basis for starting to interpret the map as a particular 
structure. An FSC with a half bit threshold at the appropriate resolution 
appears to be a good guide to whether one can do this. However, if a particular 
sidechain is disordered with 2 conformations, or a substrate is only 50% 
occupied, the contribution in the electron density map is reduced and might be 
difficult to distinguish from the noise. A higher threshold might be necessary 
to see these atoms but this would occur at a lower resolution than given by the 
half bit threshold. One could instead increase the exposure to improve the 
resolution but of course radiation damage lurks. For reporting structures, the 
obvious thing to do is to show the complete FSC curves together with a few 
threshold curves (e.g. half bit, one bit, 2 bits). This would enable people to 
judge whether the data is likely to meet their requirements. This of course 
departs significantly from the desire to have one number. A compromise might be 
to report FSC resolutions at several thresholds.
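
As a rough illustration of reporting resolutions at several thresholds, the 
sketch below evaluates information-based threshold curves and finds where an 
FSC curve first drops below each one. It assumes the threshold has the form 
given in van Heel and Schatz (2005), with the half-data-set SNR set to 
(2**bits - 1)/2. The function names, the synthetic FSC curve and the voxel 
counts per shell are hypothetical and are there only to make the example run.

```python
import math


def bit_threshold(n_voxels, bits):
    """FSC threshold for a given information content per voxel.

    Assumed form (after van Heel and Schatz, 2005):
        T = (s + 2*sqrt(s)/sqrt(n) + 1/sqrt(n)) / (s + 2*sqrt(s)/sqrt(n) + 1)
    where s is the SNR of each half data set and n is the number of
    (independent) voxels in the Fourier shell.
    """
    snr_half = (2.0 ** bits - 1.0) / 2.0
    root_n = math.sqrt(n_voxels)
    cross = 2.0 * math.sqrt(snr_half) / root_n
    return (snr_half + cross + 1.0 / root_n) / (snr_half + cross + 1.0)


def crossing_resolution(freqs, fsc, n_voxels, bits):
    """Resolution (1/frequency) of the first shell whose FSC is below the
    threshold curve, or None if the FSC never drops below it."""
    for f, c, n in zip(freqs, fsc, n_voxels):
        if c < bit_threshold(n, bits):
            return 1.0 / f
    return None


# Hypothetical data: 20 shells, a smooth synthetic FSC fall-off, and shell
# populations growing with the square of the shell radius.
shells = range(1, 21)
freqs = [0.02 * i for i in shells]                      # 1/Angstrom
fsc = [1.0 / (1.0 + (f / 0.2) ** 6) for f in freqs]     # synthetic curve
n_voxels = [4.0 * math.pi * i * i for i in shells]

for bits in (0.5, 1.0, 2.0):
    res = crossing_resolution(freqs, fsc, n_voxels, bits)
    print(f"{bits} bit threshold: resolution {res} Angstrom")
```

With a monotonically falling FSC, a higher-bit threshold is crossed at a lower 
spatial frequency, so the table of resolutions gets numerically larger as the 
number of bits goes up.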

I understand that fixed value thresholds (e.g. 0.143) were originally adopted 
for EM to conform to standards prevalent for crystallography at the time. This 
would have enabled comparison between the two techniques. For many cases (as 
stated in van Heel and Schatz) there will be little difference between the 
resolution given by a half bit and that given by 0.143. However, if the former 
is mathematically correct and easy to implement then why not use it for all 
techniques? The link to Shannon is a personal reason I have for preferring a 
threshold based on information content. If I had scientific “heroes” he would 
be one of them.


I have recently had a paper on x-ray imaging of biological cells accepted for 
publication. This includes

“In order to compare theory or simulations with experiment, standard methods of 
reporting results covering parameters such as the feature examined (e.g. which 
cellular organelle), resolution, contrast, depth of material (for 2D), estimate 
of noise and dose should be encouraged. Much effort has gone in to doing this 
for fields such as macromolecular crystallography but it has to be admitted 
that this is still an ongoing process.”
I think recent activity agrees with the last 6 words!



Don’t read the next bit if not interested in the relationship between the Rose 
criterion and FSC thresholds.

The recently submitted paper also includes

“A proper analysis of the relationship between the Rose criterion and FSC 
thresholds is outside the scope of this paper and would need to take account of 
factors such as the number of image voxels, whether one is in an atomicity or 
uniform voxel regime and the contrast of features to be identified in the 
image.”

This can justifiably be interpreted as saying I did not fully understand the 
relationship itself and was a partial reason why I raised the issue in another 
message to this thread.
Who cares anyway about the headline resolution? Well, defining a resolution can 
be important if one wants to calculate the exposure required to see particular 
features and whether they are then degraded by radiation damage. This relates 
to the issue I raised concerning the Rose criterion. As an example one might 
have a virus particle with an average density of 1.1 embedded in an object (a 
biological cell) of density 1.0 (I am keeping the numbers simple). The virus 
has a diameter of 50nm. There are 5000 voxels in the image (the number 5000 was 
used by Rose when analysing images from televisions). This gives 5000 chances 
of a false alarm, so I want to ensure the signal to noise ratio in the image is 
sufficiently high. This is why Rose adopted a contrast to noise ratio of 5 
(Rose criterion K of 5). For each voxel in the image we need a noise level 
sufficiently low to identify the feature. For a Rose criterion of 5 and the 
contrast of 0.1 it means that we need an average (?) of 625 photons per Shannon 
reciprocal voxel (the “speckle” given by the object as a whole) at the required 
resolution (1/50nm) in order to achieve this. The expression for the required 
number of photons is (K/(2C))**2, where K is the Rose criterion and C is the 
contrast. However, if we have already identified a 
candidate voxel for the virus (perhaps using labelled fluorescent methods) we 
can get away with a Rose criterion of 3 (equivalent to K=5 over 5000 pixels) 
and 225 photons will suffice. For this case, a signal to noise ratio of 3 
corresponds to a 0.0027 probability of the event occurring due to random 
noise. The information content is therefore -log2(0.0027), which is 8.5 bits. I 
therefore have a real space information content of 8.5 bits and an average 225 
photons at the resolution limit. The question is to relate these and come up 
with the appropriate value for the FSC threshold so I can judge whether a 
particle with this low contrast can be identified. In the above example, the 
object (biological cell) as a whole has a defined boundary and forms a natural 
sharp edged mask. The hard edge mask (see van Heel and Schatz section 4.7) is 
therefore present.
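
The arithmetic in the paragraph above can be checked with a short sketch. It 
assumes Gaussian noise and a two-sided tail probability (which is what 
reproduces the 0.0027 figure for a signal to noise ratio of 3). The function 
names are hypothetical.

```python
import math


def photons_required(k, contrast):
    """Photons needed for Rose criterion k at a given contrast: (k/(2C))**2."""
    return (k / (2.0 * contrast)) ** 2


def false_alarm_probability(k):
    """Two-sided probability that Gaussian noise alone exceeds k sigma
    (assumption: Gaussian statistics, both tails counted)."""
    return math.erfc(k / math.sqrt(2.0))


def information_bits(probability):
    """Information content, in bits, of an event with the given probability."""
    return -math.log2(probability)


contrast = 0.1   # density 1.1 against a background of 1.0
n_voxels = 5000  # number of chances of a false alarm

print("photons for K=5:", photons_required(5, contrast))
print("photons for K=3:", photons_required(3, contrast))

p3 = false_alarm_probability(3)
print("false alarm probability at K=3:", p3)
print("information content in bits:", information_bits(p3))

# Chance of at least roughly one false alarm among 5000 voxels at K=5,
# to compare with the single-voxel figure at K=3.
print("K=5 over 5000 voxels:", n_voxels * false_alarm_probability(5))
```

The last line gives a value close to the single-voxel probability at K=3, 
which is the sense in which K=3 for one candidate voxel is roughly equivalent 
to K=5 over 5000 voxels.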

I am sure Marin (or others) will put me right if there are mistakes in the 
above.

Finally, for those interested in the relationship between information content 
and probability the article by Weaver (one of Shannon’s collaborators) gives a 
non-mathematical and perhaps philosophical description. It can be found at
http://www.mt-archive.info/50/SciAm-1949-Weaver.pdf

Sorry for the long reply – but at least some of it was requested!

Colin



From: CCP4 bulletin board <CCP4BB@JISCMAIL.AC.UK> 
On Behalf Of colin.n...@diamond.ac.uk
Sent: 17 February 2020 11:26
To: CCP4BB@JISCMAIL.AC.UK
Subject: [ccp4bb] FW: [ccp4bb] [3dem] Which resolution?


Dear all.
Would it help to separate out the issue of the FSC from the value of the 
threshold? My understanding is that the FSC addresses the spatial frequency at 
which there is a reliable information content in the image. This concept should 
apply to a wide variety of types of image. The issue is then what value of the 
threshold to use. For interpretation of protein structures (whether by x-ray or 
electron microscopy), a half bit threshold appears to be appropriate. However, 
for imaging the human brain (one of Marin’s examples) a higher threshold might 
be adopted as a range of contrasts might be present (axons for example have a 
similar density to the surroundings). For crystallography, if one wants to see 
lighter atoms (hydrogens in the presence of uranium or in proteins) a higher 
threshold might also be appropriate. I am not sure about this to be honest as a 
2 bit threshold (for example) would mean that there is information to higher 
resolution at a threshold of a half bit (unless one is at a diffraction or 
instrument limited resolution).

Most CCP4BBers will understand that a single number is not good enough. 
However, many users of the protein structure databases will simply search for 
the structure with the highest named resolution. It might be difficult to send 
these users to re-education camps.

Regards
Colin

From: CCP4 bulletin board <CCP4BB@JISCMAIL.AC.UK> 
On Behalf Of Petrus Zwart
Sent: 16 February 2020 21:50
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] [3dem] Which resolution?

Hi All,

How is the 'correct' resolution estimation related to the estimated error on 
some observed hydrogen bond length of interest, or an error on the estimated 
occupancy of a ligand or conformation or anything else that has structural 
significance?

In crystallography, it isn't really (only in some very approximate fashion), 
and I doubt that in EM there is something to that effect. If you want to use 
the resolution to get a gut feeling on how your maps look and how your data 
behaves, it doesn't really matter what standard you use, as long as you apply 
the metric consistently. If you want to use this estimate 
to get to uncertainties of model parameters, you had better try something else.

Regards
Peter Zwart



On Sun, Feb 16, 2020 at 8:38 AM Marin van Heel 
<0000057a89ab08a1-dmarc-requ...@jiscmail.ac.uk> wrote:
Dear Pawel and All others ....
This 2010 review is - unfortunately - largely based on the flawed statistics I 
mentioned before, namely on the a priori assumption that the inner product of a 
signal vector and a noise vector is ZERO (an orthogonality assumption).  The 
(Frank & Al-Ali 1975) paper we have refuted on a number of occasions (for 
example in 2005, and most recently in our BioRxiv paper) but you still take 
that as the correct relation between SNR and FRC (and you never cite the 
criticism...).
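
The finite-sample side of this point can be illustrated numerically. The 
sketch below is only an illustration, not the derivation in any of the papers 
mentioned: it assumes independent Gaussian "signal" and "noise" vectors and 
shows that their normalised inner product is not exactly zero for a finite 
number of samples n, but has a typical size of about 1/sqrt(n). The function 
names and parameters are hypothetical.

```python
import math
import random


def normalised_inner_product(n, rng):
    """Cosine of the angle between two independent Gaussian vectors of length n."""
    signal = [rng.gauss(0.0, 1.0) for _ in range(n)]
    noise = [rng.gauss(0.0, 1.0) for _ in range(n)]
    dot = sum(s * z for s, z in zip(signal, noise))
    norm = math.sqrt(sum(s * s for s in signal) * sum(z * z for z in noise))
    return dot / norm


def rms_cross_term(n, trials=200, seed=0):
    """Root-mean-square normalised inner product over many random trials."""
    rng = random.Random(seed)
    total = sum(normalised_inner_product(n, rng) ** 2 for _ in range(trials))
    return math.sqrt(total / trials)


for n in (25, 100, 400, 1600):
    print(n, rms_cross_term(n), 1.0 / math.sqrt(n))
```

The cross-term shrinks as the shell population grows but does not vanish, so 
treating it as exactly zero matters most where shells contain few voxels.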
Sorry
Marin

On Thu, Feb 13, 2020 at 10:42 AM Penczek, Pawel A 
<pawel.a.penc...@uth.tmc.edu> wrote:
Dear Teige,

I am wondering whether you are familiar with

Penczek PA. Resolution measures in molecular electron microscopy. 
Methods Enzymol. 2010;482:73-100. doi: 10.1016/S0076-6879(10)82003-8.

You will find there answers to all questions you asked and much more.

Regards,
Pawel Penczek

_______________________________________________
3dem mailing list
3...@ncmir.ucsd.edu
https://mail.ncmir.ucsd.edu/mailman/listinfo/3dem

________________________________
To unsubscribe from the CCP4BB list, click the following link:
https://www.jiscmail.ac.uk/cgi-bin/webadmin?SUBED1=CCP4BB&A=1


--
------------------------------------------------------------------------
P.H. Zwart
Staff Scientist
Molecular Biophysics and Integrated Bioimaging &
Center for Advanced Mathematics for Energy Research Applications
Lawrence Berkeley National Laboratories
1 Cyclotron Road, Berkeley, CA-94703, USA
Cell: 510 289 9246

PHENIX: http://www.phenix-online.org
CAMERA: http://camera.lbl.gov/
-------------------------------------------------------------------------

--
This e-mail and any attachments may contain confidential, copyright and or 
privileged material, and are for the use of the intended addressee only. If you 
are not the intended addressee or an authorised recipient of the addressee 
please notify us of receipt by returning the e-mail and do not use, copy, 
retain, distribute or disclose the information in or attached to the e-mail.
Any opinions expressed within this e-mail are those of the individual and not 
necessarily of Diamond Light Source Ltd.
Diamond Light Source Ltd. cannot guarantee that this e-mail or any attachments 
are free from viruses and we cannot accept liability for any damage which you 
may sustain as a result of software viruses which may be transmitted in or with 
the message.
Diamond Light Source Limited (company no. 4375679). Registered in England and 
Wales with its registered office at Diamond House, Harwell Science and 
Innovation Campus, Didcot, Oxfordshire, OX11 0DE, United Kingdom


------
Randy J. Read
Department of Haematology, University of Cambridge
Cambridge Institute for Medical Research
The Keith Peters Building
Hills Road
Cambridge CB2 0XY, U.K.
Tel: +44 1223 336500
Fax: +44 1223 336827
E-mail: rj...@cam.ac.uk
www-structmed.cimr.cam.ac.uk

