Re: [ccp4bb] am I doing this right?

Jurgen Bosch Sun, 17 Oct 2021 10:57:34 -0700

The EM and Cryo-EM world might have something to say about that perhaps.

Isn’t RIP phasing coming from what you are describing as well?


Just curious,

Jürgen

__________________________________________
Jürgen Bosch, PhD, MBA
Center for Global Health & Diseases
Case Western Reserve University

https://www.linkedin.com/in/jubosch/

CEO & Co-Founder at InterRayBio, LLC

Johns Hopkins University
Bloomberg School of Public Health
Department of Biochemistry & Molecular Biology

On Oct 17, 2021, at 13:24, James Holton <jmhol...@lbl.gov> wrote:

Thank you Gergely.  That is interesting!

I don't mind at all making this Bayesian, as long as it works!

Something I'm not quite sure about: does the prior distribution HAVE to
be a gamma distribution? Not that that really narrows things down since
there are an infinite number of them, but is that really the "i have no
idea" prior? Or just a convenient closed-form choice? I've only just
recently heard of conjugate priors.

Much appreciate any thoughts you may have on this,

-James


On 10/16/2021 3:48 PM, Gergely Katona wrote:

Dear James,


If I understand correctly you are looking for a single rate parameter to
describe the pixels in a block. It would also be possible to estimate the
rates for individual pixels or estimate the thickness of the sample from
the counts if you have a good model, that is where Bayesian methods really
shine. I tested the simplest first Bayesian network with 10 and 100 zero
count pixels, respectively:


https://colab.research.google.com/drive/1TGJx2YT9I-qyOT1D9_HCC7G7as1KXg2e?usp=sharing



The two posterior distributions are markedly different even if they start
from the same prior distribution, which I find more intuitive than the
frequentist treatment of uncertainty. You can test different parameters for
the gamma prior or change to another prior distribution. It is possible to
reduce the posterior distributions to their mean or posterior maximum, if
needed. If you are looking for an alternative to the Bayesian perspective
then this will not help, unfortunately.


Best wishes,


Gergely


-----Original Message-----

From: CCP4 bulletin board <CCP4BB@JISCMAIL.AC.UK> On Behalf Of James Holton

Sent: den 16 oktober 2021 21:01

To: CCP4BB@JISCMAIL.AC.UK

Subject: Re: [ccp4bb] am I doing this right?


Thank you everyone for your thoughtful and thought-provoking responses!


But, I am starting to think I was not as clear as I could have been about
my question.  I am actually concerning myself with background, not
necessarily Bragg peaks.  With Bragg photons you want the sum, but for
background you want the average.


What I'm getting at is: how does one properly weight a zero-photon
observation when it comes time to combine it with others?  Hopefully they
are not all zero.  If they are, check your shutter.


So, ignoring Bragg photons for the moment (let us suppose it is a
systematic absence) what I am asking is: what is the variance, or, better
yet,what is the WEIGHT one should assign to the observation of zero photons
in a patch of 10x10 pixels?


In the absence of any prior knowledge this is a difficult question, but a
question we kind of need to answer if we want to properly measure data from
weak images.  So, what do we do?


Well, with the "I have no idea" uniform prior, it would seem that
expectation (Epix) and variance (Vpix) would be k+1 = 1 for each pixel, and
therefore the sum of Epix and Vpix over the 100 independent pixels is:


Epatch=Vpatch=100 photons


I know that seems weird to assume 100 photons should have hit when we
actually saw none, but consider what that zero-photon count, all by itself,
is really telling you:

a) Epix > 20 ? No way. That is "right out". Given we know its Poisson
distributed, and that background is flat, it is VERY unlikely you have E
that big when you saw zero. Cross all those E values off your list.

b) Epix=0 ? Well, that CAN be true, but other things are possible and all
of them are E>0. So, most likely E is not 0, but at least a little bit
higher.

c) Epix=1e-6 ?  Yeah, sure, why not?

d) Epix= -1e-6 ?  No. Don't be silly.

e) If I had to guess? Meh. 1 photon per pixel?  That would be k+1


I suppose my objection to E=V=0 is because V=0 implies infinite confidence
in the value of E, and that we don't have. Yes, it is true that we are
quite confident in the fact that we did not see any photons this time, but
the remember that E and V are the mean and variance that you would see if
you did a million experiments under the same conditions. We are trying to
guess those from what we've got. Just because you've seen zero a hundred
times doesn't mean the 101st experiment won't give you a count.  If it
does, then maybe Epatch=0.01 and Epix=0.0001?  But what do you do before
you see your first photon?

All you can really do is bracket it.


But what if you come up with a better prior than "I have no idea" ?

Well, we do have other pixels on the detector, and presuming the background
is flat, or at least smooth, maybe the average counts/pixel is a better
prior?


So, let us consider an ideal detector with 1e6 independent pixels. Let us
further say that 1e5 background photons have hit that detector.  I want to
still ignore Bragg photons because those have a very different prior
distribution to the background.  Let us say we have masked off all the
Bragg areas.


The average overall background is then 0.1 photons/pixel. Let us assign
that to the prior probability Ppix = 0.1.  Now let us look again at that
patch of 10x10 pixels with zero counts on it.  We expected to see 10, but
got 0.  What are the odds of that?  Pretty remote.  Less than 1 in a
million.


I suspect in this situation where such an unlikely event has occurred it
should perhaps be given a variance larger than 100. Perhaps quite a bit
larger?  Subsequent "sigma-weighted" summation would then squash its
contribution down to effectively 0. So, relative to any other observation
with even a shred of merit it would have no impact. Giving it V=0, however?
That can't be right.


But what if Ppix=0.01 ?  Then we expect to see zero counts on our 100-pixel
patch about 1/3 of the time. Same for 1-photon observations.

Giving these two kinds of observations the same weight seems more sensible,
given the prior.


Another prior might be to take the flux and sample thickness into account.
Given the cross section of light elements the expected photons/pixel on
most any detector would be:


Ppix = 1.2e-5*flux*exposure*thickness*omega/Npixels

where:

Ppix = expected photons/pixel

Npixels = number of pixels on the detector omega  = fraction of scattered
photons that hit it (about 0.5) thickness = thickness of sample and loop in
microns exposure = exposure time in seconds flux = incident beam flux in
photons/s

1.2e-5 = 1e-4 cm/um * 1.2 g/cm^3 * 0.2 cm^2/g (cross section of oxygen)


If you don't know anything else about the sample, you can at least know
that.


Or am I missing something?


-James Holton

MAD Scientist



On 10/16/2021 12:47 AM, Kay Diederichs wrote:

Dear Gergely,


with " 10 x 10 patch of pixels ", I believe James means that he observes
100 neighbouring pixels each with 0 counts. Thus the frequentist view can
be taken, and results in 0 as the variance, right?


best,

Kay



On Fri, 15 Oct 2021 21:07:26 +0000, Gergely Katona <gergely.kat...@gu.se>
wrote:


Dear James,


Uniform distribution sounds like “I have no idea”, but a uniform
distribution does not go from -inf to +inf. If I believe that every count
from 0 to 65535 has the same probability, then I also expect counts with an
average of 32768 on the image. It is not an objective belief in the end and
probably not a very good idea for an X-ray experiment if the number of
observations are small. Concerning which variance is the right one, the
frequentist view requires frequencies to be observed. In the absence of
frequencies, there is no error estimate. Bayesians at least can determine a
single distribution as an answer without observations and that will be
their prior belief of the variance. Again, I would avoid a uniform a priori
distribution for the variance. For a Poisson distribution the convenient
conjugate prior is the gamma distribution. It can control the magnitude of
k and strength of belief with its location and scale parameter,
respectively.


Best wishes,


Gergely


Gergely Katona, Professor, Chairman of the Chemistry Program Council

Department of Chemistry and Molecular Biology, University of

Gothenburg Box 462, 40530 Göteborg, Sweden

Tel: +46-31-786-3959 / M: +46-70-912-3309 / Fax: +46-31-786-3910

Web: http://katonalab.eu, Email: gergely.kat...@gu.se


From: CCP4 bulletin board <CCP4BB@JISCMAIL.AC.UK> On Behalf Of James

Holton

Sent: 15 October, 2021 18:06

To: CCP4BB@JISCMAIL.AC.UK

Subject: Re: [ccp4bb] am I doing this right?


Well I'll be...


Kay Diederichs pointed out to me off-list that the k+1 expectation and
variance from observing k photons is in "Bayesian Reasoning in Data
Analysis: A Critical Introduction" by Giulio D. Agostini.  Granted, that is
with a uniform prior, which I take as the Bayesean equivalent of "I have no
idea".


So, if I'm looking to integrate a 10 x 10 patch of pixels on a weak
detector image, and I find that area has zero counts, what variance shall I
put on that observation?  Is it:


a) zero

b) 1.0

c) 100


Wish I could say there are no wrong answers, but I think at least two

of those are incorrect,


-James Holton

MAD Scientist

On 10/13/2021 2:34 PM, Filipe Maia wrote:

I forgot to add probably the most important. James is correct, the expected
value of u, the true mean, given a single observation k is indeed k+1 and
k+1 is also the mean square error of using k+1 as the estimator of the true
mean.


Cheers,

Filipe


On Wed, 13 Oct 2021 at 23:17, Filipe Maia <fil...@xray.bmc.uu.se<mailto:
fil...@xray.bmc.uu.se>> wrote:

Hi,


The maximum likelihood estimator for a Poisson distributed variable is
equal to the mean of the observations. In the case of a single observation,
it will be equal to that observation. As Graeme suggested, you can
calculate the probability mass function for a given observation with
different Poisson parameters (i.e. true means) and see that function peaks
when the parameter matches the observation.


The root mean squared error of the estimation of the true mean from a
single observation k seems to be sqrt(k+2). Or to put it in another way,
mean squared error, that is the expected value of (k-u)**2, for an
observation k and a true mean u, is equal to k+2.


You can see some example calculations at

https://colab.research.google.com/drive/1eoaNrDqaPnP-4FTGiNZxMllP7SFH

kQuS?usp=sharing


Cheers,

Filipe


On Wed, 13 Oct 2021 at 17:14, Winter, Graeme (DLSLtd,RAL,LSCI) <
00006a19cead4548-dmarc-requ...@jiscmail.ac.uk<mailto:
00006a19cead4548-dmarc-requ...@jiscmail.ac.uk>> wrote:

This rang a bell to me last night, and I think you can derive this

from first principles


If you assume an observation of N counts, you can calculate the

probability of such an observation for a given Poisson rate constant

X. If you then integrate over all possible value of X to work out the

central value of the rate constant which is most likely to result in

an observation of N I think you get X = N+1


I think it is the kind of calculation you can perform on a napkin, if

memory serves


All the best Graeme



On 13 Oct 2021, at 16:10, Andrew Leslie - MRC LMB <and...@mrc-lmb.cam.ac.uk
<mailto:and...@mrc-lmb.cam.ac.uk>> wrote:


Hi Ian, James,


                      I have a strong feeling that I have seen this result
before, and it was due to Andy Hammersley at ESRF. I’ve done a literature
search and there is a paper relating to errors in analysis of counting
statistics (se below), but I had a quick look at this and could not find
the (N+1) correction, so it must have been somewhere else. I Have cc’d Andy
on this Email (hoping that this Email address from 2016 still works) and
maybe he can throw more light on this. What I remember at the time I saw
this was the simplicity of the correction.


Cheers,


Andrew


Reducing bias in the analysis of counting statistics data Hammersley,

AP<https://www.webofscience.com/wos/author/record/2665675>

(Hammersley, AP) Antoniadis,

A<https://www.webofscience.com/wos/author/record/13070551>

(Antoniadis, A) NUCLEAR INSTRUMENTS & METHODS IN PHYSICS RESEARCH

SECTION A-ACCELERATORS SPECTROMETERS DETECTORS AND ASSOCIATED

EQUIPMENT Volume

394

Issue

1-2

Page

219-224

DOI

10.1016/S0168-9002(97)00668-2

Published

JUL 11 1997



On 12 Oct 2021, at 18:55, Ian Tickle <ianj...@gmail.com<mailto:
ianj...@gmail.com>> wrote:



Hi James


What the Poisson distribution tells you is that if the true count is N then
the expectation and variance are also N.  That's not the same thing as
saying that for an observed count N the expectation and variance are N.
Consider all those cases where the observed count is exactly zero.  That
can arise from any number of true counts, though as you noted larger values
become increasingly unlikely.  However those true counts are all >= 0 which
means that the mean and variance of those true counts must be positive and
non-zero.  From your results they are both 1 though I haven't been through
the algebra to prove it.


So what you are saying seems correct: for N observed counts we should be
taking the best estimate of the true value and variance as N+1.  For
reasonably large N the difference is small but if you are concerned with
weak images it might start to become significant.


Cheers


-- Ian



On Tue, 12 Oct 2021 at 17:56, James Holton <jmhol...@lbl.gov<mailto:
jmhol...@lbl.gov>> wrote:

All my life I have believed that if you're counting photons then the

error of observing N counts is sqrt(N).  However, a calculation I

just performed suggests its actually sqrt(N+1).


My purpose here is to understand the weak-image limit of data

processing. Question is: for a given pixel, if one photon is all you

got, what do you "know"?


I simulated millions of 1-second experiments. For each I used a "true"

beam intensity (Itrue) between 0.001 and 20 photons/s. That is, for

Itrue= 0.001 the average over a very long exposure would be 1 photon

every 1000 seconds or so. For a 1-second exposure the observed count

(N) is almost always zero. About 1 in 1000 of them will see one

photon, and roughly 1 in a million will get N=2. I do 10,000 such

experiments and put the results into a pile.  I then repeat with

Itrue=0.002, Itrue=0.003, etc. All the way up to Itrue = 20. At Itrue

20 I never see N=1, not even in 1e7 experiments. With Itrue=0, I

also see no N=1 events.

Now I go through my pile of results and extract those with N=1, and

count up the number of times a given Itrue produced such an event.

The histogram of Itrue values in this subset is itself Poisson, but

with mean = 2 ! If I similarly count up events where 2 and only 2

photons were seen, the mean Itrue is 3. And if I look at only

zero-count events the mean and standard deviation is unity.


Does that mean the error of observing N counts is really sqrt(N+1) ?


I admit that this little exercise assumes that the distribution of

Itrue is uniform between 0.001 and 20, but given that one photon has

been observed Itrue values outside this range are highly unlikely.

The

Itrue=0.001 and N=1 events are only a tiny fraction of the whole.

So, I wold say that even if the prior distribution is not uniform, it

is certainly bracketed. Now, Itrue=0 is possible if the shutter

didn't open, but if the rest of the detector pixels have N=~1,

doesn't this affect the prior distribution of Itrue on our pixel of
interest?


Of course, two or more photons are better than one, but these days

with small crystals and big detectors N=1 is no longer a trivial situation.

I look forward to hearing your take on this.  And no, this is not a trick.


-James Holton

MAD Scientist


#####################################################################

###


To unsubscribe from the CCP4BB list, click the following link:

https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB&A=1


This message was issued to members of

www.jiscmail.ac.uk/CCP4BB<http://www.jiscmail.ac.uk/CCP4BB>, a

mailing list hosted by

www.jiscmail.ac.uk<http://www.jiscmail.ac.uk/>, terms & conditions

are available at https://www.jiscmail.ac.uk/policyandsecurity/


________________________________


To unsubscribe from the CCP4BB list, click the following link:

https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB&A=1



________________________________


To unsubscribe from the CCP4BB list, click the following link:

https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB&A=1





--


This e-mail and any attachments may contain confidential, copyright and or
privileged material, and are for the use of the intended addressee only. If
you are not the intended addressee or an authorised recipient of the
addressee please notify us of receipt by returning the e-mail and do not
use, copy, retain, distribute or disclose the information in or attached to
the e-mail.

Any opinions expressed within this e-mail are those of the individual and
not necessarily of Diamond Light Source Ltd.

Diamond Light Source Ltd. cannot guarantee that this e-mail or any
attachments are free from viruses and we cannot accept liability for any
damage which you may sustain as a result of software viruses which may be
transmitted in or with the message.

Diamond Light Source Limited (company no. 4375679). Registered in

England and Wales with its registered office at Diamond House,

Harwell Science and Innovation Campus, Didcot, Oxfordshire, OX11 0DE,

United Kingdom



________________________________


To unsubscribe from the CCP4BB list, click the following link:

https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB&A=1


________________________________


To unsubscribe from the CCP4BB list, click the following link:

https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB&A=1



________________________________


To unsubscribe from the CCP4BB list, click the following link:

https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB&A=1


#####################################################################

###


To unsubscribe from the CCP4BB list, click the following link:

https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB&A=1


This message was issued to members of www.jiscmail.ac.uk/CCP4BB, a

mailing list hosted by www.jiscmail.ac.uk, terms & conditions are

available at https://www.jiscmail.ac.uk/policyandsecurity/

######################################################################

##


To unsubscribe from the CCP4BB list, click the following link:

https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB&A=1


This message was issued to members of www.jiscmail.ac.uk/CCP4BB, a

mailing list hosted by www.jiscmail.ac.uk, terms & conditions are

available at https://www.jiscmail.ac.uk/policyandsecurity/

########################################################################


To unsubscribe from the CCP4BB list, click the following link:

https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB&A=1


This message was issued to members of www.jiscmail.ac.uk/CCP4BB, a mailing
list hosted by www.jiscmail.ac.uk, terms & conditions are available at
https://www.jiscmail.ac.uk/policyandsecurity/


########################################################################

To unsubscribe from the CCP4BB list, click the following link:
https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB&A=1

This message was issued to members of www.jiscmail.ac.uk/CCP4BB, a mailing
list hosted by www.jiscmail.ac.uk, terms & conditions are available at
https://www.jiscmail.ac.uk/policyandsecurity/

########################################################################

To unsubscribe from the CCP4BB list, click the following link:
https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB&A=1

This message was issued to members of www.jiscmail.ac.uk/CCP4BB, a mailing list 
hosted by www.jiscmail.ac.uk, terms & conditions are available at 
https://www.jiscmail.ac.uk/policyandsecurity/

Re: [ccp4bb] am I doing this right?

Reply via email to