Re: [ccp4bb] am I doing this right?

2021-10-17 Thread Gergely Katona
Dear James,

No, the prior does not have to be a gamma distribution. As I understand it (and 
I am out of my comfort zone here), the reason for choosing a gamma distribution 
is indeed convenience. I believe this convenience is rapidly lost as the 
Bayesian network gets complicated and any hope of an analytical expression for 
the posterior disappears. 

The Poisson distribution has a given likelihood function, so let's look for a 
function to pair it with that achieves something useful. The need for pairing 
comes from the product of the likelihood and the prior probability. The 
probability of the data in Bayes' formula is just a constant, as long as we 
compare the posterior probabilities of the parameters given the same data. As 
you probably noticed in my example, the posterior distribution has an 
exponential shape even though it was generated by sampling. I would also expect 
it to transform into a bell shape as the rate increases, very much like the 
Poisson distribution does, only the gamma distribution is continuous. So would 
it not be nice to have a prior distribution that matches this posterior? Then 
one could recycle the posterior as a new prior whenever new data are added, 
until the end of time.
One could also describe the posterior with just two exact parameter values, 
instead of my embarrassing sampling of this simple scenario, and one could then 
compare those parameter values between the prior and the posterior, etc. (a 
small sketch of this update follows below).
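To make that concrete, here is a minimal sketch of the gamma-Poisson conjugate 
update (my own illustration, not from the Colab notebook; the prior parameters 
are arbitrary). With a Gamma(alpha, beta) prior on the rate, in the shape/rate 
parameterization, observing Poisson counts k_1..k_n gives the exact posterior 
Gamma(alpha + sum(k), beta + n), which can be recycled as the next prior:

# Hedged sketch of the conjugate gamma-Poisson update; numbers are illustrative.
from scipy.stats import gamma

def update(alpha, beta, counts):
    """Posterior (alpha, beta) after observing the given Poisson counts."""
    return alpha + sum(counts), beta + len(counts)

a, b = 1.0, 0.1                      # assumed weakly informative prior
a, b = update(a, b, [0] * 10)        # ten zero-count pixels
print("posterior mean %.3f, variance %.4f" % (a / b, a / b**2))
print(gamma(a, scale=1.0 / b).interval(0.95))   # 95% credible interval for the rate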

Aside from this convenience, I have other reasons to prefer the gamma 
distribution over a uniform distribution: 
1. It has no region of zero probability (well, within the positive territory). 
Even strong priors can be overturned with enough evidence, but not a prior 
probability of 0. That value will always force a posterior probability of 0, 
the equivalent of a dogma. 
2. With rates, perhaps you would like to consider 0.1, 0.01 and 1e-6 with equal 
probability, but a uniform distribution does not say that. So perhaps the 
logarithm of the rates should have equal probabilities... and voila, an 
exponential distribution does not sound like a bad idea, which happens to be 
the same as a gamma distribution with alpha=1. It is still prudent to choose a 
nearly flat exponential distribution as a weakly informative prior (a small 
numerical illustration of point 1 follows this list).
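Here is that illustration: a hedged sketch (invented numbers, nothing from the 
thread) showing that a prior which assigns exactly zero probability to part of 
the rate axis can never place posterior mass there, however strong the data, 
while a gamma prior always can:

# Grid posteriors under a "dogmatic" prior (zero above 5) and a gamma prior.
import numpy as np
from scipy.stats import poisson, gamma

rates = np.linspace(0.01, 20.0, 2000)
dr = rates[1] - rates[0]
counts = np.full(50, 9)                          # fake data: 50 pixels, 9 counts each
loglike = poisson.logpmf(counts[:, None], rates).sum(axis=0)

def posterior_mean(logprior):
    w = logprior + loglike
    w = np.exp(w - w.max())
    w /= w.sum() * dr                            # normalize on the grid
    return (rates * w).sum() * dr

dogmatic = np.where(rates < 5.0, 0.0, -np.inf)   # "a rate above 5 is impossible"
gentle = gamma(1.0, scale=10.0).logpdf(rates)    # flat-ish exponential prior

print("dogmatic prior -> posterior mean %.2f" % posterior_mean(dogmatic))
print("gamma prior    -> posterior mean %.2f" % posterior_mean(gentle))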
 
In the end, mathematical convenience alone should not dictate what a good prior 
is; in my opinion, expertise and crystallographic needs should take precedence. 
In most cases there is much to gain from continuing to build more complicated 
Bayesian networks. I always underestimate the power of sampling methods and am 
surprised when they work on problems that I believed intractable. It is easy to 
fall into the trap of frequentist thinking and reduce the data one step at a 
time.

Best wishes,

Gergely


-Original Message-
From: James Holton  
Sent: 17 October, 2021 19:25
To: Gergely Katona ; CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] am I doing this right?

Thank you Gergely.  That is interesting!

I don't mind at all making this Bayesian, as long as it works!

Something I'm not quite sure about: does the prior distribution HAVE to be a 
gamma distribution? Not that that really narrows things down since there are an 
infinite number of them, but is that really the "i have no idea" prior? Or just 
a convenient closed-form choice? I've only just recently heard of conjugate 
priors.

Much appreciate any thoughts you may have on this,

-James


On 10/16/2021 3:48 PM, Gergely Katona wrote:
> Dear James,
>
> If I understand correctly, you are looking for a single rate parameter to 
> describe the pixels in a block. It would also be possible to estimate the 
> rates for individual pixels, or to estimate the thickness of the sample from 
> the counts if you have a good model; that is where Bayesian methods really 
> shine. I first tested the simplest Bayesian network, with 10 and 100 
> zero-count pixels, respectively:
>
> https://colab.research.google.com/drive/1TGJx2YT9I-qyOT1D9_HCC7G7as1KXg2e?usp=sharing
>
>
> The two posterior distributions are markedly different even if they start 
> from the same prior distribution, which I find more intuitive than the 
> frequentist treatment of uncertainty. You can test different parameters for 
> the gamma prior or change to another prior distribution. It is possible to 
> reduce the posterior distributions to their mean or posterior maximum, if 
> needed. If you are looking for an alternative to the Bayesian perspective 
> then this will not help, unfortunately.
>
> Best wishes,
>
> Gergely
>
> -Original Message-
> From: CCP4 bulletin board  On Behalf Of James 
> Holton
> Sent: den 16 oktober 2021 21:01
> To: CCP4BB@JISCMAIL.AC.UK
> Subject: Re: [ccp4bb] am I doing this right?
>
> Thank you everyone for your thoughtful and thought-provoking responses!
>
> But, I am starting to think I was not as clear as I could have been about my 
> question.  I am actually concerning myself with background, not necessarily 
> Bragg peaks. 

Re: [ccp4bb] am I doing this right?

2021-10-17 Thread Nave, Colin (DLSLtd,RAL,LSCI)
Hi James
For the case under consideration, isn't the gamma distribution the maximum 
entropy prior, i.e. a default with minimum information content?
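For reference, the characterization being invoked (stated here as a standard 
result, not taken from the thread): among densities on (0, infinity) with 
prescribed mean and mean logarithm, the maximum-entropy form is

p(x) \;\propto\; x^{\alpha-1} e^{-\beta x}, \qquad x > 0,

i.e. exactly the gamma family, with alpha and beta fixed by the two constraints.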
Colin

-Original Message-
From: CCP4 bulletin board  On Behalf Of James Holton
Sent: 17 October 2021 18:25
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] am I doing this right?

Thank you Gergely.  That is interesting!

I don't mind at all making this Bayesian, as long as it works!

Something I'm not quite sure about: does the prior distribution HAVE to be a 
gamma distribution? Not that that really narrows things down since there are an 
infinite number of them, but is that really the "i have no idea" prior? Or just 
a convenient closed-form choice? I've only just recently heard of conjugate 
priors.

Much appreciate any thoughts you may have on this,

-James


On 10/16/2021 3:48 PM, Gergely Katona wrote:
> Dear James,
>
> If I understand correctly, you are looking for a single rate parameter to 
> describe the pixels in a block. It would also be possible to estimate the 
> rates for individual pixels, or to estimate the thickness of the sample from 
> the counts if you have a good model; that is where Bayesian methods really 
> shine. I first tested the simplest Bayesian network, with 10 and 100 
> zero-count pixels, respectively:
>
> https://colab.research.google.com/drive/1TGJx2YT9I-qyOT1D9_HCC7G7as1KXg2e?usp=sharing
>
>
> The two posterior distributions are markedly different even if they start 
> from the same prior distribution, which I find more intuitive than the 
> frequentist treatment of uncertainty. You can test different parameters for 
> the gamma prior or change to another prior distribution. It is possible to 
> reduce the posterior distributions to their mean or posterior maximum, if 
> needed. If you are looking for an alternative to the Bayesian perspective 
> then this will not help, unfortunately.
>
> Best wishes,
>
> Gergely
>
> -Original Message-
> From: CCP4 bulletin board  On Behalf Of James 
> Holton
> Sent: den 16 oktober 2021 21:01
> To: CCP4BB@JISCMAIL.AC.UK
> Subject: Re: [ccp4bb] am I doing this right?
>
> Thank you everyone for your thoughtful and thought-provoking responses!
>
> But, I am starting to think I was not as clear as I could have been about my 
> question.  I am actually concerning myself with background, not necessarily 
> Bragg peaks.  With Bragg photons you want the sum, but for background you 
> want the average.
>
> What I'm getting at is: how does one properly weight a zero-photon 
> observation when it comes time to combine it with others?  Hopefully they are 
> not all zero.  If they are, check your shutter.
>
> So, ignoring Bragg photons for the moment (let us suppose it is a systematic 
> absence) what I am asking is: what is the variance, or, better yet, what is 
> the WEIGHT one should assign to the observation of zero photons in a patch of 
> 10x10 pixels?
>
> In the absence of any prior knowledge this is a difficult question, but a 
> question we kind of need to answer if we want to properly measure data from 
> weak images.  So, what do we do?
>
> Well, with the "I have no idea" uniform prior, it would seem that expectation 
> (Epix) and variance (Vpix) would be k+1 = 1 for each pixel, and therefore the 
> sum of Epix and Vpix over the 100 independent pixels is:
>
> Epatch=Vpatch=100 photons
>
> I know that seems weird to assume 100 photons should have hit when we 
> actually saw none, but consider what that zero-photon count, all by itself, 
> is really telling you:
> a) Epix > 20 ? No way. That is "right out". Given we know it's Poisson 
> distributed, and that background is flat, it is VERY unlikely you have E that 
> big when you saw zero. Cross all those E values off your list.
> b) Epix=0 ? Well, that CAN be true, but other things are possible and all of 
> them are E>0. So, most likely E is not 0, but at least a little bit higher.
> c) Epix=1e-6 ?  Yeah, sure, why not?
> d) Epix= -1e-6 ?  No. Don't be silly.
> e) If I had to guess? Meh. 1 photon per pixel?  That would be k+1
>
> I suppose my objection to E=V=0 is because V=0 implies infinite confidence in 
> the value of E, and that we don't have. Yes, it is true that we are quite 
> confident in the fact that we did not see any photons this time, but the 
> remember that E and V are the mean and variance that you would see if you did 
> a million experiments under the same conditions. We are trying to guess those 
> from what we've got. Just because you've seen zero a hundred times doesn't 
> mean the 101st experiment won't give you a count.  If it does, then maybe 
> Epatch=0.01 and Epix=0.0001?  But what do you do before you see your first 
> photon?
> All you can really do is bracket it.
>
> But what if you come up with a better prior than "I have no idea" ?
> Well, we do have other pixels on the detector, and presuming the background 
> is flat, or at least smooth, maybe the average counts/pixel is a better prior?
>
> So, let 

Re: [ccp4bb] am I doing this right?

2021-10-17 Thread Jurgen Bosch
The EM and Cryo-EM world might have something to say about that perhaps.

Isn’t RIP phasing coming from what you are describing as well?

Just curious,

Jürgen

__
Jürgen Bosch, PhD, MBA
Center for Global Health & Diseases
Case Western Reserve University

https://www.linkedin.com/in/jubosch/

CEO & Co-Founder at InterRayBio, LLC

Johns Hopkins University
Bloomberg School of Public Health
Department of Biochemistry & Molecular Biology

On Oct 17, 2021, at 13:24, James Holton  wrote:

Thank you Gergely.  That is interesting!

I don't mind at all making this Bayesian, as long as it works!

Something I'm not quite sure about: does the prior distribution HAVE to
be a gamma distribution? Not that that really narrows things down since
there are an infinite number of them, but is that really the "i have no
idea" prior? Or just a convenient closed-form choice? I've only just
recently heard of conjugate priors.

Much appreciate any thoughts you may have on this,

-James


On 10/16/2021 3:48 PM, Gergely Katona wrote:

Dear James,


If I understand correctly, you are looking for a single rate parameter to
describe the pixels in a block. It would also be possible to estimate the
rates for individual pixels, or to estimate the thickness of the sample from
the counts if you have a good model; that is where Bayesian methods really
shine. I first tested the simplest Bayesian network, with 10 and 100 zero-count
pixels, respectively:


https://colab.research.google.com/drive/1TGJx2YT9I-qyOT1D9_HCC7G7as1KXg2e?usp=sharing



The two posterior distributions are markedly different even if they start
from the same prior distribution, which I find more intuitive than the
frequentist treatment of uncertainty. You can test different parameters for
the gamma prior or change to another prior distribution. It is possible to
reduce the posterior distributions to their mean or posterior maximum, if
needed. If you are looking for an alternative to the Bayesian perspective
then this will not help, unfortunately.


Best wishes,


Gergely


-Original Message-

From: CCP4 bulletin board  On Behalf Of James Holton

Sent: den 16 oktober 2021 21:01

To: CCP4BB@JISCMAIL.AC.UK

Subject: Re: [ccp4bb] am I doing this right?


Thank you everyone for your thoughtful and thought-provoking responses!


But, I am starting to think I was not as clear as I could have been about
my question.  I am actually concerning myself with background, not
necessarily Bragg peaks.  With Bragg photons you want the sum, but for
background you want the average.


What I'm getting at is: how does one properly weight a zero-photon
observation when it comes time to combine it with others?  Hopefully they
are not all zero.  If they are, check your shutter.


So, ignoring Bragg photons for the moment (let us suppose it is a
systematic absence) what I am asking is: what is the variance, or, better
yet, what is the WEIGHT one should assign to the observation of zero photons
in a patch of 10x10 pixels?


In the absence of any prior knowledge this is a difficult question, but a
question we kind of need to answer if we want to properly measure data from
weak images.  So, what do we do?


Well, with the "I have no idea" uniform prior, it would seem that
expectation (Epix) and variance (Vpix) would be k+1 = 1 for each pixel, and
therefore the sum of Epix and Vpix over the 100 independent pixels is:


Epatch=Vpatch=100 photons


I know that seems weird to assume 100 photons should have hit when we
actually saw none, but consider what that zero-photon count, all by itself,
is really telling you:

a) Epix > 20 ? No way. That is "right out". Given we know it's Poisson
distributed, and that background is flat, it is VERY unlikely you have E
that big when you saw zero. Cross all those E values off your list.

b) Epix=0 ? Well, that CAN be true, but other things are possible and all
of them are E>0. So, most likely E is not 0, but at least a little bit
higher.

c) Epix=1e-6 ?  Yeah, sure, why not?

d) Epix= -1e-6 ?  No. Don't be silly.

e) If I had to guess? Meh. 1 photon per pixel?  That would be k+1


I suppose my objection to E=V=0 is because V=0 implies infinite confidence
in the value of E, and that we don't have. Yes, it is true that we are
quite confident in the fact that we did not see any photons this time, but
remember that E and V are the mean and variance that you would see if
you did a million experiments under the same conditions. We are trying to
guess those from what we've got. Just because you've seen zero a hundred
times doesn't mean the 101st experiment won't give you a count.  If it
does, then maybe Epatch=0.01 and Epix=0.0001?  But what do you do before
you see your first photon?

All you can really do is bracket it.


But what if you come up with a better prior than "I have no idea" ?

Well, we do have other pixels on the detector, and presuming the background
is flat, or at least smooth, maybe the average counts/pixel is a 

Re: [ccp4bb] am I doing this right?

2021-10-17 Thread James Holton

Thank you Gergely.  That is interesting!

I don't mind at all making this Bayesian, as long as it works!

Something I'm not quite sure about: does the prior distribution HAVE to 
be a gamma distribution? Not that that really narrows things down since 
there are an infinite number of them, but is that really the "i have no 
idea" prior? Or just a convenient closed-form choice? I've only just 
recently heard of conjugate priors.


Much appreciate any thoughts you may have on this,

-James


On 10/16/2021 3:48 PM, Gergely Katona wrote:

Dear James,

If I understand correctly, you are looking for a single rate parameter to 
describe the pixels in a block. It would also be possible to estimate the rates 
for individual pixels, or to estimate the thickness of the sample from the 
counts if you have a good model; that is where Bayesian methods really shine. I 
first tested the simplest Bayesian network, with 10 and 100 zero-count pixels, 
respectively:

https://colab.research.google.com/drive/1TGJx2YT9I-qyOT1D9_HCC7G7as1KXg2e?usp=sharing


The two posterior distributions are markedly different even if they start from 
the same prior distribution, which I find more intuitive than the frequentist 
treatment of uncertainty. You can test different parameters for the gamma prior 
or change to another prior distribution. It is possible to reduce the posterior 
distributions to their mean or posterior maximum, if needed. If you are looking 
for an alternative to the Bayesian perspective then this will not help, 
unfortunately.
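As a guess at the kind of comparison in the notebook (I have not reproduced 
Gergely's Colab; this sketch uses the exact conjugate result instead of 
sampling, and the Gamma(1, 1) prior is assumed): starting from the same prior, 
10 and 100 zero-count pixels give Gamma(alpha, beta + 10) and 
Gamma(alpha, beta + 100) posteriors, which are indeed markedly different:

# Hedged sketch, not Gergely's notebook: exact posteriors for the rate after
# observing 10 versus 100 zero-count pixels, from an assumed Gamma(1, 1) prior.
from scipy.stats import gamma

alpha0, beta0 = 1.0, 1.0
for n_zero_pixels in (10, 100):
    post = gamma(alpha0, scale=1.0 / (beta0 + n_zero_pixels))
    low, high = post.interval(0.95)
    print("n=%3d  mean=%.4f  95%% interval=(%.4f, %.4f)"
          % (n_zero_pixels, post.mean(), low, high))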

Best wishes,

Gergely

-Original Message-
From: CCP4 bulletin board  On Behalf Of James Holton
Sent: den 16 oktober 2021 21:01
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] am I doing this right?

Thank you everyone for your thoughtful and thought-provoking responses!

But, I am starting to think I was not as clear as I could have been about my 
question.  I am actually concerning myself with background, not necessarily 
Bragg peaks.  With Bragg photons you want the sum, but for background you want 
the average.

What I'm getting at is: how does one properly weight a zero-photon observation 
when it comes time to combine it with others?  Hopefully they are not all zero. 
 If they are, check your shutter.

So, ignoring Bragg photons for the moment (let us suppose it is a systematic 
absence) what I am asking is: what is the variance, or, better yet, what is the 
WEIGHT one should assign to the observation of zero photons in a patch of 10x10 
pixels?

In the absence of any prior knowledge this is a difficult question, but a 
question we kind of need to answer if we want to properly measure data from 
weak images.  So, what do we do?

Well, with the "I have no idea" uniform prior, it would seem that expectation 
(Epix) and variance (Vpix) would be k+1 = 1 for each pixel, and therefore the sum of Epix 
and Vpix over the 100 independent pixels is:

Epatch=Vpatch=100 photons

I know that seems weird to assume 100 photons should have hit when we actually 
saw none, but consider what that zero-photon count, all by itself, is really 
telling you:
a) Epix > 20 ? No way. That is "right out". Given we know it's Poisson 
distributed, and that background is flat, it is VERY unlikely you have E that big when you 
saw zero. Cross all those E values off your list.
b) Epix=0 ? Well, that CAN be true, but other things are possible and all of them 
are E>0. So, most likely E is not 0, but at least a little bit higher.
c) Epix=1e-6 ?  Yeah, sure, why not?
d) Epix= -1e-6 ?  No. Don't be silly.
e) If I had to guess? Meh. 1 photon per pixel?  That would be k+1
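A quick numerical check of the k+1 intuition above (my own sketch, using the 
standard result that a flat prior on the rate combined with a Poisson 
likelihood for an observed count k gives a Gamma(k+1, 1) posterior, whose mean 
and variance are both k+1):

# Sketch only: posterior density under a flat prior is proportional to
# lambda^k * exp(-lambda), i.e. Gamma(k+1, 1), so E = V = k+1.
from scipy.stats import gamma

for k in (0, 1, 5):
    post = gamma(k + 1, scale=1.0)
    print("k=%d  posterior mean=%.1f  variance=%.1f" % (k, post.mean(), post.var()))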

I suppose my objection to E=V=0 is because V=0 implies infinite confidence in 
the value of E, and that we don't have. Yes, it is true that we are quite 
confident in the fact that we did not see any photons this time, but 
remember that E and V are the mean and variance that you would see if you did a 
million experiments under the same conditions. We are trying to guess those 
from what we've got. Just because you've seen zero a hundred times doesn't mean 
the 101st experiment won't give you a count.  If it does, then maybe 
Epatch=0.01 and Epix=0.0001?  But what do you do before you see your first 
photon?
All you can really do is bracket it.

But what if you come up with a better prior than "I have no idea" ?
Well, we do have other pixels on the detector, and presuming the background is 
flat, or at least smooth, maybe the average counts/pixel is a better prior?

So, let us consider an ideal detector with 1e6 independent pixels. Let us 
further say that 1e5 background photons have hit that detector.  I want to 
still ignore Bragg photons because those have a very different prior 
distribution to the background.  Let us say we have masked off all the Bragg 
areas.

The average overall background is then 0.1 photons/pixel. Let us assign that to 
the prior probability Ppix = 0.1.  Now let us look again at that patch of 10x10 
pixels with zero counts on 

Re: [ccp4bb] am I doing this right?

2021-10-17 Thread James Holton


Well Frank, I think it comes down to something I believe you were the 
first to call "dose slicing".


Like fine phi slicing, collecting a larger number of weaker images 
records the same photons, but with more information about the sample 
before it dies. In fine phi slicing the extra information allows you to 
do better background rejection, and in "dose slicing" the extra 
information is about radiation damage. We lose that information when we 
use longer exposures per image, and if you burn up the entire useful 
life of your crystal in one shot, then all information about how the 
spots decayed during the exposure is lost. Your data are also rather 
incomplete.


How much information is lost? Well, how much more disk space would be 
taken up, even after compression, if you collected only 1 photon per 
image?  And kept collecting all the way out to 30 MGy in dose? That's 
about 1 million photons (images) per cubic micron of crystal.  So, I'd 
say the amount of information lost is "quite a bit".


But what makes matters worse is that if you did collect this data set 
and preserved all information available from your crystal you'd have no 
way to process it. This is not because it's impossible, it's just that we 
don't have the software. Your only choice would be to go find images 
with the same "phi" value and add them together until you have enough 
photons/pixel to index it. Once you've got an indexing solution you can 
map every photon hit to a position in reciprocal space as well as give 
it a time/dose stamp. What do you do with that?  You can do zero-dose 
extrapolation, of course! Damage-free data! Wouldn't that be nice. Or 
can you?  The data you will have in hand for each reciprocal-space pixel 
might look something like:
tic tic .. tic . tic ... tic tictic ... 
tictic.


So. Eight photons.  With time-of-arrival information.  How do you fit a 
straight line to that?  You could "bin" the data or do some kind of 
smoothing thing, but then you are losing information again. Perhaps also 
making ill-founded assumptions. You need error bars of some kind, and, 
better yet, the shape of the distribution implied by those error bars.
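One possible way to formalize the eight-tic problem (my sketch, not something 
proposed in the thread, and the arrival times below are invented): treat the 
arrivals as an inhomogeneous Poisson process with rate r(t) = a + b*t over the 
exposure [0, T]. The log-likelihood is sum_i log r(t_i) - (a*T + b*T^2/2), 
which can be maximized directly from the raw photon time stamps, with no 
binning or smoothing:

# Hedged sketch: maximum-likelihood straight-line rate fitted to photon
# arrival times; the eight times are made up for illustration.
import numpy as np
from scipy.optimize import minimize

T = 1.0                                           # total exposure/dose, arbitrary units
times = np.array([0.03, 0.05, 0.21, 0.28, 0.45, 0.58, 0.61, 0.90])

def negloglike(params):
    a, b = params
    r = a + b * times
    if a <= 0 or np.any(r <= 0):
        return np.inf                             # the rate must stay positive
    return -(np.sum(np.log(r)) - (a * T + 0.5 * b * T**2))

fit = minimize(negloglike, x0=[len(times) / T, 0.0], method="Nelder-Mead")
a_hat, b_hat = fit.x
print("intercept a = %.2f photons/unit time, slope b = %.2f" % (a_hat, b_hat))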


And all this makes me think somebody must have already done this. I'm 
willing to bet probably some time in the late 1700s to early 1800s. All 
we're really talking about here is augmenting maximum-likelihood 
estimation of an average value to maximum-likelihood estimation of a 
straight line. That is, slope and intercept, with sigmas on both. I 
suspect the proper approach is to first bring everything down to the 
exact information content of a single photon (or lack of a photon), and 
build up from there.  If you are lucky enough to have a large number of 
photons then linear regression will work, and you are back to Diederichs 
(2003). But when you're photon-starved the statistics of single photons 
become more and more important.  This led me to: is it k? or k+1 ?  When 
k=0 getting this wrong could introduce a factor of infinity.


So, perhaps the big "consequence of getting it wrong" is embarrassing 
myself by re-making a 200-year-old mistake I am not currently aware of. 
I am confident a solution exists, but only recently started working on 
this.  So, I figured ... ask the world?


-James Holton
MAD Scientist


On 10/17/2021 1:51 AM, Frank Von Delft wrote:
James, I've been watching the thread with fascination, but also the 
confusion of wild ignorance. I've finally realised why.


What I've missed is: what exactly makes the question so important?  
I've understood what brought it up, of course, but not the consequence 
of getting it wrong.


Frank

Sent from tiny silly touch screen

*From:* James Holton 
*Sent:* Saturday, 16 October 2021 20:01
*To:* CCP4BB@JISCMAIL.AC.UK
*Subject:* Re: [ccp4bb] am I doing this right?

Thank you everyone for your thoughtful and thought-provoking responses!

But, I am starting to think I was not as clear as I could have been
about my question.  I am actually concerning myself with background, not
necessarily Bragg peaks.  With Bragg photons you want the sum, but for
background you want the average.

What I'm getting at is: how does one properly weight a zero-photon
observation when it comes time to combine it with others? Hopefully
they are not all zero.  If they are, check your shutter.

So, ignoring Bragg photons for the moment (let us suppose it is a
systematic absence) what I am asking is: what is the variance, or,
better yet, what is the WEIGHT one should assign to the observation of
zero photons in a patch of 10x10 pixels?

In the absence of any prior knowledge this is a difficult question, but
a question we kind of need to answer if we want to properly measure data
from weak images.  So, what do we do?

Well, with the "I have no idea" uniform prior, it would seem that
expectation (Epix) and variance (Vpix) would 

Re: [ccp4bb] am I doing this right?

2021-10-17 Thread Kay Diederichs
If you write/analyze/improve a data-reduction program in crystallography, then 
these questions are important: how to estimate the intensity of a Bragg spot by 
evaluating the counts in the pixels of the signal area, and those of the 
background area? When calculating mean and standard deviation of counts, is the 
frequentist view the only correct one, or can we improve the result e.g. by 
thinking about priors?
The consequence of getting it wrong is less accurate intensities, and less 
accurate downstream results that depend on those intensities.
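For concreteness, the plain frequentist estimate Kay describes might look like 
the following (my sketch of the textbook summation-integration formulas, not 
code from any particular data-reduction program):

# Sketch of summation integration: signal sum minus the area-scaled background
# estimate, with a variance from counting statistics only.
import numpy as np

def integrate_spot(signal_pixels, background_pixels):
    """Return (intensity, variance) from raw photon counts."""
    s = np.asarray(signal_pixels, dtype=float)
    b = np.asarray(background_pixels, dtype=float)
    ratio = s.size / b.size                  # scale background area to signal area
    intensity = s.sum() - ratio * b.sum()
    variance = s.sum() + ratio**2 * b.sum()
    return intensity, variance

print(integrate_spot([3, 5, 2, 4], [1, 0, 2, 1, 0, 1, 1, 0]))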

This is at least how I understand James' questions.

Best wishes,
Kay

On Sun, 17 Oct 2021 08:51:30 +, Frank Von Delft 
 wrote:

>James, I've been watching the thread with fascination, but also the confusion 
>of wild ignorance. I've finally realised why.
>
>What I've missed is: what exactly makes the question so important?  I've 
>understood what brought it up, of course, but not the consequence of getting 
>it wrong.
>
>Frank





Re: [ccp4bb] am I doing this right?

2021-10-17 Thread Frank Von Delft
James, I've been watching the thread with fascination, but also the confusion 
of wild ignorance. I've finally realised why.

What I've missed is: what exactly makes the question so important?  I've 
understood what brought it up, of course, but not the consequence of getting it 
wrong.

Frank

Sent from tiny silly touch screen

From: James Holton 
Sent: Saturday, 16 October 2021 20:01
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] am I doing this right?

Thank you everyone for your thoughtful and thought-provoking responses!

But, I am starting to think I was not as clear as I could have been
about my question.  I am actually concerning myself with background, not
necessarily Bragg peaks.  With Bragg photons you want the sum, but for
background you want the average.

What I'm getting at is: how does one properly weight a zero-photon
observation when it comes time to combine it with others?  Hopefully
they are not all zero.  If they are, check your shutter.

So, ignoring Bragg photons for the moment (let us suppose it is a
systematic absence) what I am asking is: what is the variance, or,
better yet, what is the WEIGHT one should assign to the observation of
zero photons in a patch of 10x10 pixels?

In the absence of any prior knowledge this is a difficult question, but
a question we kind of need to answer if we want to properly measure data
from weak images.  So, what do we do?

Well, with the "I have no idea" uniform prior, it would seem that
expectation (Epix) and variance (Vpix) would be k+1 = 1 for each pixel,
and therefore the sum of Epix and Vpix over the 100 independent pixels is:

Epatch=Vpatch=100 photons

I know that seems weird to assume 100 photons should have hit when we
actually saw none, but consider what that zero-photon count, all by
itself, is really telling you:
a) Epix > 20 ? No way. That is "right out". Given we know it's Poisson
distributed, and that background is flat, it is VERY unlikely you have E
that big when you saw zero. Cross all those E values off your list.
b) Epix=0 ? Well, that CAN be true, but other things are possible and
all of them are E>0. So, most likely E is not 0, but at least a little
bit higher.
c) Epix=1e-6 ?  Yeah, sure, why not?
d) Epix= -1e-6 ?  No. Don't be silly.
e) If I had to guess? Meh. 1 photon per pixel?  That would be k+1

I suppose my objection to E=V=0 is because V=0 implies infinite
confidence in the value of E, and that we don't have. Yes, it is true
that we are quite confident in the fact that we did not see any photons
this time, but remember that E and V are the mean and variance that
you would see if you did a million experiments under the same
conditions. We are trying to guess those from what we've got. Just
because you've seen zero a hundred times doesn't mean the 101st
experiment won't give you a count.  If it does, then maybe Epatch=0.01
and Epix=0.0001?  But what do you do before you see your first photon?
All you can really do is bracket it.

But what if you come up with a better prior than "I have no idea" ?
Well, we do have other pixels on the detector, and presuming the
background is flat, or at least smooth, maybe the average counts/pixel
is a better prior?

So, let us consider an ideal detector with 1e6 independent pixels. Let
us further say that 1e5 background photons have hit that detector.  I
want to still ignore Bragg photons because those have a very different
prior distribution to the background.  Let us say we have masked off all
the Bragg areas.

The average overall background is then 0.1 photons/pixel. Let us assign
that to the prior probability Ppix = 0.1.  Now let us look again at that
patch of 10x10 pixels with zero counts on it.  We expected to see 10,
but got 0.  What are the odds of that?  Pretty remote: about exp(-10), or
roughly 1 in 22,000.

I suspect in this situation where such an unlikely event has occurred it
should perhaps be given a variance larger than 100. Perhaps quite a bit
larger?  Subsequent "sigma-weighted" summation would then squash its
contribution down to effectively 0. So, relative to any other
observation with even a shred of merit it would have no impact. Giving
it V=0, however? That can't be right.

But what if Ppix=0.01 ?  Then we expect to see zero counts on our
100-pixel patch about 1/3 of the time. Same for 1-photon observations.
Giving these two kinds of observations the same weight seems more
sensible, given the prior.
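Putting numbers on the two scenarios above (my quick check, with scipy):

# Probability of a completely empty 100-pixel patch when the expected patch
# total is 10 photons versus 1 photon.
from scipy.stats import poisson

print("P(0 | mean 10) = %.2e" % poisson.pmf(0, 10))   # ~4.5e-05, about 1 in 22,000
print("P(0 | mean 1)  = %.3f" % poisson.pmf(0, 1))    # ~0.37, about 1/3 of the time
print("P(1 | mean 1)  = %.3f" % poisson.pmf(1, 1))    # also ~0.37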

Another prior might be to take the flux and sample thickness into
account.  Given the cross section of light elements the expected
photons/pixel on most any detector would be:

Ppix = 1.2e-5*flux*exposure*thickness*omega/Npixels
where:
Ppix = expected photons/pixel
Npixels = number of pixels on the detector
omega  = fraction of scattered photons that hit it (about 0.5)
thickness = thickness of sample and loop in microns
exposure = exposure time in seconds
flux = incident beam flux in photons/s
1.2e-5 = 1e-4 cm/um * 1.2
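Transcribing that rule of thumb into code (the example numbers here are mine 
and purely illustrative):

# Sketch of the background rule of thumb quoted above.
def photons_per_pixel(flux, exposure, thickness_um, omega=0.5, npixels=1e6):
    """Expected background photons per pixel from light-element scattering."""
    return 1.2e-5 * flux * exposure * thickness_um * omega / npixels

# e.g. 1e12 ph/s incident flux, 0.01 s exposure, 10 um of sample plus loop,
# half the scattered photons hitting a 1e6-pixel detector:
print(photons_per_pixel(flux=1e12, exposure=0.01, thickness_um=10))   # ~0.6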