Re: [R] sample size estimation for count (poisson?) data?

David Winsemius Thu, 13 Nov 2008 20:18:51 -0800

The notion that you can just add or subtract 0.03 from the estimate is  
obviously incorrect (3% of a mean near 5 would be about 0.15, not 0.03).


Presuming you meant to call your lower bound q05 and the upper bound  
q95, the numbers I get from your 10,000-iteration loop are 4.97 and  
5.18 (around a mean of 5.08). So roughly a 0.1 swing on each side of  
the mean, or a 2% "margin of error", assuming you mean a 90% confidence  
limit. That would be a reasonable 90% CI for an estimate under the  
assumption that it is Poisson, which would require some checking ...  
at a minimum the variance should be near the mean (as it obviously  
would have been in your simulation). Traditionally this sort of estimate  
would be a 95% CI, and the simulation estimate for that would be
 > q.025=quantile(d, 0.025)
 > q.975=quantile(d,0.975)
 > q.025; q.975
     2.5%
4.946429
    97.5%
5.208333
That is more like the 3% you were initially talking about.
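
As a rough cross-check on that interval, and only as a sketch: drawing 168 of the same 200 values without replacement is sampling from a finite population, so the usual standard error shrinks by the finite population correction. The code below assumes the 200 simulated values stand in for the whole population and that their variance is close to lambda = 5; the variable names are mine, not from the original code.

```r
# Sketch: why resampling 168 of 200 values without replacement gives a narrow interval.
# Assumption: the 200 simulated values act as the whole population, variance ~ lambda.
N <- 200        # size of the simulated "population"
n <- 168        # sample.size from the quoted code
lambda <- 5     # Poisson mean, so the variance is also about 5

se_infinite <- sqrt(lambda / n)            # SE if every sample were a fresh draw
fpc <- sqrt((N - n) / (N - 1))             # finite population correction
se_finite <- se_infinite * fpc             # SE when sampling without replacement

half_width_95 <- qnorm(0.975) * se_finite  # approximate 95% half-width
rel_margin <- half_width_95 / lambda       # half-width as a fraction of the mean

c(half_width_95 = half_width_95, rel_margin = rel_margin)
```

Under those assumptions the half-width comes out near 0.14, a little under 3% of the mean, which is in line with the quantiles above.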

But I would have thought that it would be more appropriate to draw new  
samples rather than to resample from the same relatively small sample,  
and the code I would substitute is
 > for (i in 1:10000) {
+ samp = rpois(sample.size,lambda = 5)
+ d[i] = mean(samp)
+ }

 > q.025=quantile(d, 0.025)
 > q.975=quantile(d,0.975)
 > q.025; q.975
     2.5%
4.666667
    97.5%
5.339286
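
The simulated limits can also be compared with the normal approximation, in which the mean of n Poisson draws has standard error sqrt(lambda / n). This is a minimal sketch rather than a replacement for the loop; the seed and the use of replicate() are my own choices, and sample.size = 168 and lambda = 5 are taken from the code in this thread.

```r
set.seed(1)          # arbitrary seed, only so the run is reproducible
sample.size <- 168   # number of hikers, as in the quoted code
lambda <- 5

# Same simulation as the loop above, written with replicate()
d <- replicate(10000, mean(rpois(sample.size, lambda = lambda)))
sim_limits <- quantile(d, c(0.025, 0.975))

# Normal approximation: mean +/- z * sqrt(lambda / n)
approx_limits <- lambda + c(-1, 1) * qnorm(0.975) * sqrt(lambda / sample.size)

rbind(simulated = sim_limits, approximate = approx_limits)
```

The two rows should agree to about two decimal places, which suggests the closed-form standard error is adequate for planning at this sample size.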

So a 6-7% swing on either side with that size sample. I would think  
that 5 would be a fairly meager observation count. I would ask the  
questions:
- where did the number 5 come from? (The variance of a Poisson  
variable is set once you know the mean, since it is a one-parameter  
distribution.)
- the notion of "margin of error" is getting mixed up with a 95%  
confidence interval. Which one do you really want? Do you want the  
standard error of the mean to be less than a specific amount, or to  
be a specific fraction of the estimate?
- have you considered that there may be extra-Poisson variation due to  
heterogeneity? Some sections of the hiker population may be more  
observant, in which case the variance of the sample will exceed the  
mean.
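
If what is wanted is a 95% interval whose half-width is a fixed fraction f of the mean, the normal approximation can be turned around to give a sample size: z * sqrt(phi * lambda / n) = f * lambda, so n = phi * z^2 / (f^2 * lambda). The sketch below assumes that approximation holds; n_for_relative_margin is a hypothetical helper of mine, not a function from any package, and phi is an assumed overdispersion factor (variance = phi * mean, with phi = 1 for a pure Poisson).

```r
# Hypothetical helper (not from any package): number of hikers needed so that
# the approximate CI half-width is a fraction f of the mean.
# phi is an assumed overdispersion factor: variance = phi * mean.
n_for_relative_margin <- function(lambda, f, conf = 0.95, phi = 1) {
  z <- qnorm(1 - (1 - conf) / 2)
  ceiling(phi * z^2 / (f^2 * lambda))
}

n_pois <- n_for_relative_margin(lambda = 5, f = 0.03)           # pure Poisson
n_over <- n_for_relative_margin(lambda = 5, f = 0.03, phi = 2)  # variance twice the mean
c(poisson = n_pois, overdispersed = n_over)

# Crude dispersion check on pilot counts: a ratio well above 1 hints at
# extra-Poisson variation. rpois() here is only a stand-in for real pilot data.
pilot <- rpois(200, lambda = 5)
var(pilot) / mean(pilot)
```

Note that the required n scales linearly with phi, so heterogeneity among hikers feeds straight into the number of interviews needed.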

-- 
David Winsemius

On Nov 13, 2008, at 4:43 PM, Shawn Morrison wrote:

> Thanks. I did the search before I posted and found those threads.  
> However, it does not seem to do what I want. All I want to do is  
> estimate the sample size for a point estimate, not do a GLM. I just  
> want the mean within a margin of error, and to a given CI.
>
> I've tried writing some code to do a simulation (below). Will this  
> do the job?
>
> #Generate data from Poisson distribution, with lambda = 5
> data = rpois(200, lambda = 5)
> mean(data); var(data)
>
> #Parameter Estimates
> moe = 0.03 # margin of error = +/- 3%
> sample.size = 168 # number of hunters to sample
>
> #Draw sample size from population, calc mean. Run 10,000 iterations
> d = numeric(10000)
> for (i in 1:10000) {
> samp = (sample(data, sample.size, replace = FALSE))
> d[i] = mean(samp)
> }
>
> #What are the bounds on the values that correspond to the margin of  
> error?
> lower=mean(data)-moe
> upper=mean(data)+moe
>
> #values from 'd' based on 90% confidence intervals
> q25=quantile(d, 0.05)
> q95=quantile(d,0.95)
>
> #top row = bounds on the mean from the margin of error, second row =  
> bounds based on simulated data and sample size, third row = 1 =  
> true, 0 = false in terms of the sample size being adequate to meet  
> requirements of the margin of error.
> output=rbind(cbind(lower,upper), cbind(q25,q95), cbind(q25>lower,  
> q95<upper))
> row.names(output) = c("known", "estimated","True/False")
> output
>
> On 12-Nov-08, at 4:41 PM, David Winsemius wrote:
>
>> The first hit for search on "sample size" and "poisson" on Baron's  
>> search engine web interface appears on target:
>>
>> http://search.r-project.org/cgi-bin/namazu.cgi?query=%22sample+size%22+poisson&max=100&result=normal&sort=score&idxname=functions&idxname=Rhelp02a
>>
>> Getting the same result from your console window requires a couple  
>> of extra back-slashes:
>>
>> > RSiteSearch(""sample size" poisson")
>> Error: syntax error
>> > RSiteSearch("\"sample size\" poisson")
>> A search query has been submitted to http://search.r-project.org
>> The results page should open in your browser shortly.
>>
>> -- 
>> David Winsemius
>> Heritage Labs
>>
>>
>> On Nov 12, 2008, at 2:46 PM, Shawn Morrison wrote:
>>
>>> Is there a function in R that will allow me to estimate the sample
>>> size required from count data (poisson data?), given the known
>>> variance and desired margin of error and confidence interval?
>>>
>>> My specific data set will be based on a survey of hikers that will  
>>> be
>>> asked about the number of animals of species 'x' they observed  
>>> during
>>> a given period. I need to know the number of hikers to interview.  
>>> ie,
>>> I would like to calculate the mean number of species 'x' +/-  
>>> margin of
>>> error with 95% confidence.
>>>
>>> This is a simple exercise for normally distributed continuous data,
>>> but I'm running into roadblocks for count data.
>>>
>>> Sincerely,
>>> Shawn Morrison
>>>
>>> ______________________________________________
>>> R-help@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>
>

