[ccp4bb] AW: [ccp4bb] AW: [ccp4bb] AW: [EXTERNAL] Re: [ccp4bb] number of frames to get a full dataset?
Hi David,

you are right, the M in MPR is just a count of “whatever” is averaged to get the final intensities. However, from this “inexhaustible thread” it is also clear that there will be no agreement on what to call this “whatever”.

Best,
Herman

From: David Waterman
Sent: Friday, 3 July 2020 13:11
To: Schreuder, Herman /DE
Cc: CCP4BB@jiscmail.ac.uk
Subject: Re: [ccp4bb] AW: [ccp4bb] AW: [EXTERNAL] Re: [ccp4bb] number of frames to get a full dataset?

Hi Herman,

I started googling and ended up completely lost down a rabbit hole (have a look here if you want to see what I mean: https://plato.stanford.edu/entries/measurement-science/). As a result I'm no longer sure I know what the word "measurement" means!

I tried to simplify things with a practical example. Let's say I take a set of clearly real-world measurements (temperature values over time, say). I can take the mean of subsets of these values and maybe that is a "measurement" too, especially for readings taken in quick succession, expressly done to reduce measurement error. But what if I fit a line to a series of values: is the gradient of the line also a "measurement"? Maybe?

Anyway, for MPR it probably doesn't matter if the measurement is of a response variable, rather than something "raw". That's because MPR isn't actually affected by the values themselves (ignoring the thorny issue of outlier rejection); it is just a count of them.

Cheers
-- David
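David's point that MPR is only a count, independent of the intensity values, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function name, the record layout `(hkl, intensity, rejected)`, and the choice to average over all unique reflections are hypothetical and are not taken from any particular scaling program.

```python
from collections import defaultdict


def measurements_per_reflection(records):
    """Return (mpr, contributing_mpr) for an iterable of records.

    Each record is a hypothetical (hkl, intensity, rejected) tuple, where
    hkl is assumed to be already reduced to the unique (asymmetric-unit)
    index and rejected marks a measurement flagged as an outlier.
    """
    total = defaultdict(int)
    contributing = defaultdict(int)
    for hkl, _intensity, rejected in records:
        # The intensity value is deliberately ignored: only counts matter.
        total[hkl] += 1
        if not rejected:
            contributing[hkl] += 1
    n_unique = len(total)
    if n_unique == 0:
        raise ValueError("no measurements supplied")
    mpr = sum(total.values()) / n_unique
    contributing_mpr = sum(contributing.values()) / n_unique
    return mpr, contributing_mpr


records = [
    ((1, 0, 0), 100.0, False),
    ((1, 0, 0), 105.0, False),
    ((1, 0, 0), 900.0, True),   # flagged as an outlier
    ((2, 0, 0), 50.0, False),
]
print(measurements_per_reflection(records))  # (2.0, 1.5)
```

The two numbers differ only because of outlier rejection, which is exactly the "thorny issue" that makes the count depend on which measurements one decides to include.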
Re: [ccp4bb] AW: [ccp4bb] AW: [ccp4bb] AW: [EXTERNAL] Re: [ccp4bb] number of frames to get a full dataset?
Dear Herman and David,

This thread seems inexhaustible :-) .

On the matter of "measurement" vs. "observation", we seem again to be in a situation described by the British idiom "six of one and half-a-dozen of the other", i.e. distinct but synonymous terms between which a choice is quite indifferent.

In the work on STARANISO and the documentation of that work, a distinction had to be made between the two terms, for which readers are referred to Ian's carefully crafted material at http://staraniso.globalphasing.org/anisotropy_about.html and http://staraniso.globalphasing.org/staraniso_glossary.html

Here, a measurement is a number plucked out of examining the raw data, namely an integrated intensity obtained by considering the pixel values around the position in 3D reciprocal space predicted from an indexing solution. The next step is to determine whether this qualifies as an observation, in the sense of containing information that a structural model would be expected to comply with. This determination is carried out by computing a local average of I/sig(I) through reciprocal space and applying a cut-off criterion based on a threshold value for that local average. Other criteria can be considered, and are indeed offered by the program as alternatives. Measurements complying with this selection criterion are then called "observations". In this picture, an observation is defined as a significant measurement.

This basic distinction of vocabulary is then extended to talking about "unmeasured" reflections (for which there weren't any detector pixels to catch any photons at their predicted position - e.g. in gaps between detector modules) and "unobserved" reflections (that are unmeasured but for which the analysis of the I/sig(I) distribution predicts that they would have been significant, had they been measured - e.g. in cusps or missing angular ranges, as well as in module gaps etc.).
The display of the latter as blue dots in the STARANISO Reciprocal Lattice Viewer then gives a vivid picture of the inadequacies of the experimental protocol used, in failing to catch all the significant diffraction from the sample.

This being said, things could very well have been done the other way, saying that the blindly integrated intensity was an observation, and that the subsequent analysis was intended to determine whether you had really measured something significant (i.e. a useful integrated intensity) by making that observation. We were aware of this ambivalence, but felt that we had to comply with the boundary condition that what we ended up with, after conversion to an amplitude, had to be denoted "Fobs" ;-) . If the early crystallographers had used the notation "Fmeas" for what they considered as their experimental data, the choice of terminology would definitely have gone the other way.

As Graeme said, use the terminology you want, but document exactly what you mean by it. The two URLs quoted above (especially the second) show that this suggestion was conscientiously followed by the STARANISO developers.

With best wishes,
Gerard
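The measurement/observation distinction Gerard describes (a local average of I/sig(I) compared against a threshold) can be sketched roughly as follows. This is not STARANISO's implementation: the function name, the spherical neighbourhood, the `radius` and `threshold` defaults and the dictionary layout are all assumptions chosen only to make the idea concrete.

```python
from math import dist
from statistics import mean


def classify_measurements(measurements, radius=0.05, threshold=1.2):
    """Label each measurement as an 'observation' or a 'measurement only'.

    measurements: list of dicts with hypothetical keys
        'pos'  - reciprocal-space position as a 3-tuple
        'I'    - integrated intensity
        'sigI' - its estimated standard deviation
    A measurement counts as an observation when the mean I/sig(I) of all
    measurements within `radius` of it (itself included) reaches `threshold`.
    """
    ratios = [m["I"] / m["sigI"] for m in measurements]
    labels = []
    for m in measurements:
        # Collect I/sig(I) of every neighbour inside the local sphere.
        local = [
            ratio
            for other, ratio in zip(measurements, ratios)
            if dist(m["pos"], other["pos"]) <= radius
        ]
        local_mean = mean(local)  # never empty: m is its own neighbour
        labels.append(
            "observation" if local_mean >= threshold else "measurement only"
        )
    return labels


data = [
    {"pos": (0.10, 0.0, 0.0), "I": 50.0, "sigI": 5.0},
    {"pos": (0.12, 0.0, 0.0), "I": 4.0, "sigI": 5.0},
    {"pos": (0.50, 0.0, 0.0), "I": 2.0, "sigI": 5.0},
    {"pos": (0.52, 0.0, 0.0), "I": 3.0, "sigI": 5.0},
]
for m, label in zip(data, classify_measurements(data)):
    print(m["pos"], label)
```

Note that in this sketch the second measurement is individually weak (I/sig(I) of 0.8) yet is still labelled an observation, because the decision rests on the local average rather than on the single value.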
Re: [ccp4bb] AW: [ccp4bb] AW: [ccp4bb] AW: [EXTERNAL] Re: [ccp4bb] number of frames to get a full dataset?
Dear Colleagues,

Now that Herman has announced a quietude I thought you might enjoy this quite short report on a synchrotron radiation issue that came up some years back via the JSR Main Editors into the IUCr Nomenclature Committee, chaired by Andre Authier, Past President of the IUCr:
https://journals.iucr.org/s/issues/2005/03/00/es0344/es0344.pdf

Have a great weekend,
John
Emeritus Professor John R Helliwell DSc
[ccp4bb] AW: [ccp4bb] AW: [ccp4bb] AW: [EXTERNAL] Re: [ccp4bb] number of frames to get a full dataset?
Dear David,

Thank you for your reaction. It has become clear to me that although most people understand what I intended with “measurement”, in practice it is very much in the eye of the beholder. It was suggested in the BB to use observation instead, but I am fairly sure that some people will also have issues with that.

The advantage of multiplicity/redundancy is that it does not mention what is multiple or redundant and that one can refer to the program documentation for an exact definition. Since most people are happy with the multiplicity/redundancy they grew up with, that is the way it will stay.

Best regards,
Herman

From: David Waterman
Sent: Friday, 3 July 2020 10:49
To: Schreuder, Herman /DE
Cc: CCP4BB@jiscmail.ac.uk
Subject: Re: [ccp4bb] AW: [ccp4bb] AW: [EXTERNAL] Re: [ccp4bb] number of frames to get a full dataset?

Hi Herman,

I like the idea of MPR, but I continue to worry about the term "measurement". The intensity associated with a particular reflection is a fit based on a scaling model, and ultimately, depending on your integration software, may be linked to a weighted sum of two raw measurements: the summation and profile-fitted intensities. I think these are the measurements, not the intensity derived during the scaling procedure. Sure, anyone who wants to be even more pedantic than me will point out that these "raw measurements" are also the result of fitting procedures. However, to my eyes, the difference is that we don't consider the profile and summation integrated intensities to change as a result of the procedure that ultimately determines the statistic (MPR) of interest. During that procedure they are independent, not dependent variables.

Maybe I am worrying about nothing. I agree it is fairly clear what you mean by MPR. I just wanted to explore if there was any opportunity for further reducing ambiguity.
Cheers
-- David

On Fri, 3 Jul 2020 at 08:12, Schreuder, Herman /DE <herman.schreu...@sanofi.com> wrote:

Dear Ian,

Since some very advanced countries still use miles, Fahrenheit and inches, I did not expect anything to change. It was an escalating discussion in this thread on data completeness(!) on the use of multiplicity vs redundancy that made me suggest a different term. Except for an occasional discussion in the BB, there is nothing against people using the term they are most comfortable with.

However, I insist that trying to impose a different definition of “measurement” for MPR vs the definition used for the calculation of redundancy/multiplicity is not a valid argument against MPR.

Cheers,
Herman

From: CCP4 bulletin board <CCP4BB@JISCMAIL.AC.UK> On behalf of Ian Tickle
Sent: Thursday, 2 July 2020 22:06
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] AW: [EXTERNAL] Re: [ccp4bb] number of frames to get a full dataset?

Well I very much doubt that many software developers are going to trawl through all their code, comments, output statements & documentation to change 'redundancy' or 'multiplicity' to 'MPR' or whatever terminology is agreed on (assuming of course we do manage to come to an agreement, which I doubt). And good luck with persuading wwPDB to change 'redundancy' in their mmCIF dictionary! That would be not only pointless but also a lot of work, partly because terms get abbreviated in code and in outputs (e.g. to 'redund' in mine, or 'mult'). And don't say I can keep the code & comments the same and only change the outputs and documentation: that will really tax my brain! Also don't say this need only apply to new code: no code is ever completely new, and mixing up old & new terminology would be a disaster waiting to happen!
Also it won't end there: someone will always find terminology that they disagree with: I can think of plenty cans of worms that we could open, but I think one is already one too many! By the way, "measurements per reflection" won't float, because some measurements will be rejected as outliers (that's why we need redundancy! - as opposed to simply measuring intensities for longer). What I call redundancy is "the count of _contributing_ measurements per reflection" (CCMPR, sigh). Personally I think that adding one more term is going to confuse things even more since if I'm right most people will continue to use the old terms in parallel anyway. IMO we should all be free to use the terminology we are most comfortable with, and it's up to the receivers of the information to perform the translation. That's how it always has been, and IMO always will be. Of course it behoves (behooves?) the sender to point to or make available any necessary translation tools,