Hi Jacob,

On Thu, Mar 12, 2020 at 9:13 AM Keller, Jacob <kell...@janelia.hhmi.org> wrote:
> I would think the most information-reflecting representation for systematic absences (or maybe for all reflections) would be not I/sig but the reflection's (|log|) ratio to the expected intensity in that shell (median intensity, say).

Xtriage does something like this as part of its space group assignment algorithm. A choice of space group implies labeling each reflection as acentric, centric or absent. Each of these classes has its own prior distribution, which can be convolved with a Gaussian to compute a likelihood for that specific space group hypothesis. It provides a decent way of assigning space groups in an automated manner.

> (...)
>
> Maybe more generally, should refinement incorporate weighting for these deviant spots? Or maybe it already does, but my understanding was that I/sig was the most salient for weighting.

The best option is to have a decent likelihood function that takes the (almost) full uncertainty of the observation into account, as described by Read & Pannu (https://bit.ly/2W6qmVR), including various numerical/mathematical approaches to compute this (Read & McCoy https://bit.ly/2Qa6b5I; Perpendicular Pronoun & Perryman https://bit.ly/2TKjJXH).

P

> JPK
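As a rough illustration of the likelihood idea in the reply above (this is a sketch, not Xtriage's actual code), each label can be scored by integrating a prior on the true intensity against a Gaussian error model. The Wilson acentric and centric priors on shell-normalized intensities, the point mass at zero for absences, the integration grids and all function names are assumptions made for this example.

```python
import numpy as np

def gauss(x, mu, sigma):
    """Normal density N(x; mu, sigma)."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def trapezoid(y, x):
    """Plain trapezoidal rule on a 1-D grid."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)

def likelihood(i_obs, sigma, label):
    """p(i_obs | label) for an intensity normalized so the shell mean is 1.

    Assumed priors on the true intensity J:
      absent   : all probability at J = 0
      acentric : exp(-J)
      centric  : (2*pi*J)**-0.5 * exp(-J/2), integrated with J = t**2
    sigma should not be much smaller than the grid spacing used here.
    """
    if label == "absent":
        return float(gauss(i_obs, 0.0, sigma))
    if label == "acentric":
        j = np.linspace(0.0, 40.0, 20001)
        return trapezoid(np.exp(-j) * gauss(i_obs, j, sigma), j)
    if label == "centric":
        # Substituting J = t**2 removes the integrable singularity at J = 0.
        t = np.linspace(0.0, 8.0, 20001)
        prior = np.sqrt(2.0 / np.pi) * np.exp(-0.5 * t ** 2)
        return trapezoid(prior * gauss(i_obs, t ** 2, sigma), t)
    raise ValueError(f"unknown label: {label}")

def log_likelihood(observations, label):
    """Sum of log-likelihoods over (i_obs, sigma) pairs sharing one label."""
    return float(sum(np.log(likelihood(i, s, label)) for i, s in observations))

# Hypothetical normalized measurements of candidate absences along one axis.
weak = [(0.05, 0.10), (-0.08, 0.10), (0.12, 0.15)]
scores = {lab: log_likelihood(weak, lab) for lab in ("absent", "acentric", "centric")}
print(scores)
```

A space group hypothesis then amounts to a labeling of all reflections, and competing hypotheses can be compared by summing these log-likelihoods over the reflections whose labels differ.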
>
> -----Original Message-----
> From: CCP4 bulletin board <CCP4BB@JISCMAIL.AC.UK> On Behalf Of Kay Diederichs
> Sent: Tuesday, March 10, 2020 2:48 AM
> To: CCP4BB@JISCMAIL.AC.UK
> Subject: Re: [ccp4bb] [3dem] Which resolution?
>
> I'd say that it depends on your state of knowledge, and on their I and sigma.
>
> - if you know the space group for sure before you do the measurement of the systematic absences, their I and sigma don't matter to you (because they don't influence your mental model of the experiment), so their information content is (close to) zero.
> - if the space group is completely unknown, some groups of reflections (e.g. h,k,l = 0,0,2n+1) can only be considered "potentially systematic absences". Then both I and sigma matter. "small" or "high" I/sigma for each member of such a group of reflections would indeed add quite some information in this situation, so an information content of up to 1 bit would be justified. "intermediate" I/sigma (say, 0.5 to 2) would be closer to zero bit, since it does not let you safely decide between "yes" or "no" (the recent paper by Randy Read and coworkers relates I and sigma to bits of information, but not in the context of decision making from potentially systematic absent reflections).
>
> So it is not quite straightforward, I think.
>
> best wishes,
> Kay
>
> On Tue, 10 Mar 2020 01:26:03 +0100, James Holton <jmhol...@lbl.gov> wrote:
>
> >I'd say they are 1 bit each, since they are the answer to a yes-or-no question.
> >
> >-James Holton
> >MAD Scientist
> >
> >On 2/27/2020 6:32 PM, Keller, Jacob wrote:
> >> How would one evaluate the information content of systematic absences?
> >>
> >> JPK
> >>
> >> On Feb 26, 2020 8:14 PM, James Holton <jmhol...@lbl.gov> wrote:
> >> In my opinion the threshold should be zero bits. Yes, this is where CC1/2 = 0 (or FSC = 0).
> >> If there is correlation then there is information, and why throw out information if there is information to be had? Yes, this information comes with noise attached, but that is why we have weights.
> >>
> >> It is also important to remember that zero intensity is still useful information. Systematic absences are an excellent example. They have no intensity at all, but they speak volumes about the structure. In a similar way, high-angle zero-intensity observations also tell us something. Ever tried unrestrained B factor refinement at poor resolution? It is hard to do nowadays because of all the safety catches in modern software, but you can get great R factors this way. A telltale sign of this kind of "over fitting" is remarkably large Fcalc values beyond the resolution cutoff. These don't contribute to the R factor, however, because Fobs is missing for these hkls. So, including zero-intensity data suppresses at least some types of over-fitting.
> >>
> >> The thing I like most about the zero-information resolution cutoff is that it forces us to address the real problem: what do you mean by "resolution"? Not long ago, claiming your resolution was 3.0 A meant that after discarding all spots with individual I/sigI < 3 you still have 80% completeness in the 3.0 A bin. Now we are saying we have a 3.0 A data set when we can prove statistically that a few non-background counts fell into the sum of all spot areas at 3.0 A. These are not the same thing.
> >>
> >> Don't get me wrong, including the weak high-resolution information makes the model better, and indeed I am even advocating including all the noisy zeroes. However, weak data at 3.0 A is never going to be as good as having strong data at 3.0 A. So, how do we decide?
> >> I personally think that the resolution assigned to the PDB deposition should remain the classical I/sigI > 3 at 80% rule. This is really the only way to have meaningful comparison of resolution between very old and very new structures. One should, of course, deposit all the data, but don't claim that cut-off as your "resolution". That is just plain unfair to those who came before.
> >>
> >> Oh yeah, and I also have a session on "interpreting low-resolution maps" at the GRC this year.
> >> https://urldefense.com/v3/__https://www.grc.org/diffraction-methods-in-structural-biology-conference/2020/__;!!Eh6p8Q!XrEJFTzyDh5AKIyF7aqXMswM8g5VF_7U-msuYRN_IWolD5KPaoP8Xsj8THkFrPUFJmw$
> >>
> >> So, please, let the discussion continue!
> >>
> >> -James Holton
> >> MAD Scientist
> >>
> >> On 2/22/2020 11:06 AM, Nave, Colin (DLSLtd,RAL,LSCI) wrote:
> >>>
> >>> Alexis
> >>>
> >>> This is a very useful summary.
> >>>
> >>> You say you were not convinced by Marin's derivation in 2005. Are you convinced now and, if not, why?
> >>>
> >>> My interest in this is that the FSC with half bit thresholds have the danger of being adopted elsewhere because they are becoming standard for protein structure determination (by EM or MX). If it is used for these mature techniques it must be right!
> >>>
> >>> It is the adoption of the ½ bit threshold I worry about. I gave a rather weak example for MX which consisted of partial occupancy of side chains, substrates etc. For x-ray imaging a wide range of contrasts can occur and, if you want to see features with only a small contrast above the surroundings then I think the half bit threshold would be inappropriate.
> >>>
> >>> It would be good to see a clear message from the MX and EM communities as to why an information content threshold of ½ a bit is generally appropriate for these techniques and an acknowledgement that this threshold is technique/problem dependent.
> >>>
> >>> We might then progress from the bronze age to the iron age.
> >>>
> >>> Regards
> >>>
> >>> Colin
> >>>
> >>> From: CCP4 bulletin board <CCP4BB@JISCMAIL.AC.UK> On Behalf Of Alexis Rohou
> >>> Sent: 21 February 2020 16:35
> >>> To: CCP4BB@JISCMAIL.AC.UK
> >>> Subject: Re: [ccp4bb] [3dem] Which resolution?
> >>>
> >>> Hi all,
> >>>
> >>> For those bewildered by Marin's insistence that everyone's been messing up their stats since the bronze age, I'd like to offer my understanding of the situation. More details in this thread from a few years ago on the exact same topic:
> >>>
> >>> https://urldefense.com/v3/__https://mail.ncmir.ucsd.edu/pipermail/3dem/2015-August/003939.html__;!!Eh6p8Q!XrEJFTzyDh5AKIyF7aqXMswM8g5VF_7U-msuYRN_IWolD5KPaoP8Xsj8THkFyeegrI8$
> >>>
> >>> https://urldefense.com/v3/__https://mail.ncmir.ucsd.edu/pipermail/3dem/2015-August/003944.html__;!!Eh6p8Q!XrEJFTzyDh5AKIyF7aqXMswM8g5VF_7U-msuYRN_IWolD5KPaoP8Xsj8THkFj5n6OLY$
> >>>
> >>> Notwithstanding notational problems (e.g. strict equations as opposed to approximation symbols, or omission of symbols to denote estimation), I believe Frank & Al-Ali and "descendent" papers (e.g. appendix of Rosenthal & Henderson 2003) are fine.
> >>> The cross terms that Marin is agitated about do in fact have an expectation value of 0.0 (in the ensemble; if the experiment were performed an infinite number of times with different realizations of noise). I don't believe Pawel or Jose Maria or any of the other authors really believe that the cross-terms are orthogonal.
> >>>
> >>> When N (the number of independent Fourier voxels in a shell) is large enough, mean(Signal x Noise) ~ 0.0 is only an approximation, but a pretty good one, even for a single FSC experiment. This is why, in my book, derivations that depend on Frank & Al-Ali are OK, under the strict assumption that N is large. Numerically, this becomes apparent when Marin's half-bit criterion is plotted - asymptotically it has the same behavior as a constant threshold.
> >>>
> >>> So, is Marin wrong to worry about this? No, I don't think so. There are indeed cases where the assumption of large N is broken. And under those circumstances, any fixed threshold (0.143, 0.5, whatever) is dangerous. This is illustrated in figures of van Heel & Schatz (2005). Small boxes, high-symmetry, small objects in large boxes, and a number of other conditions can make fixed thresholds dangerous.
> >>>
> >>> It would indeed be better to use a non-fixed threshold. So why am I not using the 1/2-bit criterion in my own work? While numerically it behaves well at most resolution ranges, I was not convinced by Marin's derivation in 2005. Philosophically though, I think he's right - we should aim for FSC thresholds that are more robust to the kinds of edge cases mentioned above. It would be the right thing to do.
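The large-N argument quoted above can be checked numerically with a small sketch. It assumes, purely for illustration, that both signal and noise in a shell are independent complex Gaussian values (real signal is of course not random), and the function names are made up for this example. It reports the average size of the normalized signal-noise cross term for a single realization as the number of voxels N grows.

```python
import numpy as np

def normalized_cross_term(n_voxels, rng):
    """|Re<S, N>| / (|S| |N|) for one realization of a shell with n_voxels."""
    signal = rng.normal(size=n_voxels) + 1j * rng.normal(size=n_voxels)
    noise = rng.normal(size=n_voxels) + 1j * rng.normal(size=n_voxels)
    # np.vdot conjugates its first argument; the real part is the term
    # that enters a Fourier shell correlation.
    inner = np.vdot(signal, noise).real
    return abs(inner) / (np.linalg.norm(signal) * np.linalg.norm(noise))

def mean_cross_term(n_voxels, trials=200, seed=0):
    """Average normalized cross term over independent realizations."""
    rng = np.random.default_rng(seed)
    values = [normalized_cross_term(n_voxels, rng) for _ in range(trials)]
    return float(np.mean(values))

for n in (10, 100, 1000, 10000):
    print(n, mean_cross_term(n))
```

Under these assumptions the cross term shrinks roughly as 1/sqrt(N), so it is negligible for large shells but not for shells with only a handful of independent voxels, which is the regime where fixed thresholds are said to become dangerous.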
> >>>
> >>> Hope this helps,
> >>>
> >>> Alexis
> >>>
> >>> On Sun, Feb 16, 2020 at 9:00 AM Penczek, Pawel A <pawel.a.penc...@uth.tmc.edu> wrote:
> >>>
> >>> Marin,
> >>>
> >>> The statistics in the 2010 review is fine. You may disagree with assumptions, but I can assure you the “statistics” (as you call it) is fine. Careful reading of the paper would reveal to you this much.
> >>>
> >>> Regards,
> >>>
> >>> Pawel
> >>>
> >>> On Feb 16, 2020, at 10:38 AM, Marin van Heel <marin.vanh...@googlemail.com> wrote:
> >>>
> >>> Dear Pawel and All others ....
> >>>
> >>> This 2010 review is - unfortunately - largely based on the flawed statistics I mentioned before, namely on the a priori assumption that the inner product of a signal vector and a noise vector is ZERO (an orthogonality assumption). The (Frank & Al-Ali 1975) paper we have refuted on a number of occasions (for example in 2005, and most recently in our BioRxiv paper) but you still take that as the correct relation between SNR and FRC (and you never cite the criticism...).
> >>>
> >>> Sorry
> >>>
> >>> Marin
> >>>
> >>> On Thu, Feb 13, 2020 at 10:42 AM Penczek, Pawel A <pawel.a.penc...@uth.tmc.edu> wrote:
> >>>
> >>> Dear Teige,
> >>>
> >>> I am wondering whether you are familiar with
> >>>
> >>> Penczek PA. Resolution measures in molecular electron microscopy. Methods Enzymol. 2010;482:73-100. doi: 10.1016/S0076-6879(10)82003-8.
> >>>
> >>> You will find there answers to all questions you asked and much more.
> >>>
> >>> Regards,
> >>>
> >>> Pawel Penczek
> >>>
> >>> Regards,
> >>>
> >>> Pawel
> >>>
> >>> _______________________________________________
> >>> 3dem mailing list
> >>> 3...@ncmir.ucsd.edu
> >>> https://urldefense.com/v3/__https://mail.ncmir.ucsd.edu/mailman/listinfo/3dem__;!!Eh6p8Q!XrEJFTzyDh5AKIyF7aqXMswM8g5VF_7U-msuYRN_IWolD5KPaoP8Xsj8THkFWAPvO-k$

--
------------------------------------------------------------------------
P.H. Zwart
Staff Scientist
Molecular Biophysics and Integrated Bioimaging & Center for Advanced Mathematics for Energy Research Applications
Lawrence Berkeley National Laboratories
1 Cyclotron Road, Berkeley, CA-94703, USA
Cell: 510 289 9246
PHENIX: http://www.phenix-online.org
CAMERA: http://camera.lbl.gov/
-------------------------------------------------------------------------

########################################################################

To unsubscribe from the CCP4BB list, click the following link:
https://www.jiscmail.ac.uk/cgi-bin/webadmin?SUBED1=CCP4BB&A=1