Re: [R] power and sample size for a GLM with Poisson response variable

2006-03-08 Thread Wassell, James T., Ph.D.
Craig, thanks for your follow-up note on using the asypow package.  My
problem was not only constructing the "constraints" matrix; for my
particular situation (Poisson regression, two groups, sample sizes of
1081 and 3180), I also get very different results from the asypow
package compared to my other (home-grown) approaches.

library(asypow)
pois.mean<-c(0.0065,0.0003)   # event rates for the two groups
info.pois <- info.poisson.kgroup(pois.mean, group.size=c(1081,3180))
constraints <- matrix(c(2,1,2), ncol = 3, byrow=TRUE)   # H0: two groups, groups 1 and 2 equal
poisson.object <- asypow.noncent(pois.mean, info.pois, constraints)
power.pois <- asypow.power(poisson.object, c(1081,3180), 0.05)
print(power.pois)

[1] 0.2438309

asy.pwr()   #  the function is shown below.

$power
[1] 0.96

sim.pwr()   #  the function is shown below.
4261000 Poisson random variates simulated 
$power
[1] 0.567

I tend to think the true power is greater than 0.567 but less than 0.96
(not as small as 0.2438309).  

Maybe I am still not using the asypow functions correctly?
Suggestions/comments welcome.  


-Original Message-
From: Craig A. Faulhaber [mailto:[EMAIL PROTECTED] 
Sent: Saturday, March 04, 2006 12:04 PM
To: Wassell, James T., Ph.D.
Subject: Re: [R] power and sample size for a GLM with Poisson response
variable

Hi James,

Thanks again for your help.  With the assistance of a statistician
colleague of mine, I figured out the "constraints" argument.  It is a
three-column matrix describing the null hypothesis.  For example, say
you have 3 populations and your null hypothesis is that there is no
difference among the 3.  The first row of the matrix would be "3 1 2",
indicating that you have 3 populations and that populations 1 and 2 are
equal.  The second row would be "3 2 3", indicating that you have 3
populations and that populations 2 and 3 are equal.  If you had only 2
populations, there would be only one row ("2 1 2", indicating 2
populations with populations 1 and 2 equal).  I hope this helps.
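
In R terms, that description translates into something like this (just a
sketch of how I read it, not copied from the asypow documentation):

# H0 for three populations: pop 1 = pop 2 and pop 2 = pop 3
constraints3 <- matrix(c(3,1,2,
                         3,2,3), ncol=3, byrow=TRUE)
# H0 for two populations: pop 1 = pop 2
constraints2 <- matrix(c(2,1,2), ncol=3, byrow=TRUE)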

Craig

Wassell, James T., Ph.D. wrote:

> Craig, I found the package asypow difficult to use, and it did not
> yield results in the ballpark of other approaches (I could not figure
> out the "constraints" matrix).
>
> I wrote some simple functions; one, asy.pwr, uses the non-central
> chi-square distribution.
>
> asy.pwr<-function(counts=c(7,1),py=c(1081,3180))
> { # a two-group Poisson regression power computation
> # py is person-years, or person-time however measured
> group<-gl(2,1)
> ncp<-summary(glm(counts~group+offset(log(py)),family=poisson))$null.deviance
> q.tile<-qchisq(.95,1)  # just the X2 critical value of 3.841459
> list(power=round(1-pchisq(q.tile,df=1,ncp),2))}
>
> The second function, sim.pwr, estimates power using simulated Poisson
> random variates.  The most time-consuming step is the call to rpois.
> (Maybe someone knows a more efficient way to accomplish this?)  The
> "for" loop is rather quick in comparison.  I hope you may find this
> helpful, or if you have solved your problem some other way, please
> pass along your approach.  Note that for this problem, with very small
> values of lambda, the two approaches give much different power
> estimates (96% vs. 55% or so).  My problem may be better addressed as
> binomial logistic regression; maybe then the simulation and the
> asymptotic estimates may agree better.
>
> sim.pwr<-function(means=c(0.0065,0.0003),ptime=c(1081,3180),nsim=1000)
> { # a two-group Poisson regression power computation
> # based on simulating lots of Poisson r.v.'s
> # input: rates, followed by a vector of the corresponding person times
> # the most time-consuming part is the r.v. generation
> # power is determined by counting how often p-values are <= 0.05
> group<-as.factor(rep(c(1,2),ptime))
> rej<-vector(length=nsim)
> y<-rpois(ptime[1]*nsim,means[1])
> y<-c(y,rpois(ptime[2]*nsim,means[2]))
> y<-matrix(y,nrow=nsim)
> cat(sum(ptime)*nsim,"Poisson random variates simulated","\n")
> for(i in 1:nsim){res<-glm(y[i,]~group,family=poisson())
> rej[i]<-summary(res)$coeff[2,4]<=0.05}
> list(power=sum(rej)/nsim) }
>



[R] power and sample size for a GLM with Poisson response variable

2006-02-27 Thread Wassell, James T., Ph.D.
Craig, I found the package asypow difficult to use, and it did not
yield results in the ballpark of other approaches (I could not figure
out the "constraints" matrix).

I wrote some simple functions; one, asy.pwr, uses the non-central
chi-square distribution.

asy.pwr<-function(counts=c(7,1),py=c(1081,3180))
{ # a two-group Poisson regression power computation
# py is person-years, or person-time however measured
group<-gl(2,1)
# the null deviance of the fitted Poisson model is used as the noncentrality parameter
ncp<-summary(glm(counts~group+offset(log(py)),family=poisson))$null.deviance
q.tile<-qchisq(.95,1)  # just the X2 critical value of 3.841459
list(power=round(1-pchisq(q.tile,df=1,ncp),2))}
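
As a quick sanity check on the defaults: counts of (7,1) are roughly what the
rates discussed in this thread imply, since expected counts are rate times
person-time.

round(c(0.0065,0.0003)*c(1081,3180),1)   # about (7.0, 1.0) expected events
asy.pwr(counts=c(7,1),py=c(1081,3180))   # same as calling asy.pwr() with the defaults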

The second function, sim.pwr, estimates power using simulated Poisson
random variates.  The most time-consuming step is the call to rpois.
(Maybe someone knows a more efficient way to accomplish this?)  The
"for" loop is rather quick in comparison.  I hope you may find this
helpful, or if you have solved your problem some other way, please pass
along your approach.  Note that for this problem, with very small values
of lambda, the two approaches give much different power estimates (96%
vs. 55% or so).  My problem may be better addressed as binomial logistic
regression; maybe then the simulation and the asymptotic estimates may
agree better (a sketch of that variant follows sim.pwr below).

sim.pwr<-function(means=c(0.0065,0.0003),ptime=c(1081,3180),nsim=1000)
{ # a two-group Poisson regression power computation
# based on simulating lots of Poisson r.v.'s
# input: rates, followed by a vector of the corresponding person times
# the most time-consuming part is the r.v. generation
# power is determined by counting how often p-values are <= 0.05
group<-as.factor(rep(c(1,2),ptime))
rej<-vector(length=nsim)
y<-rpois(ptime[1]*nsim,means[1])
y<-c(y,rpois(ptime[2]*nsim,means[2]))
y<-matrix(y,nrow=nsim)   # one simulated data set per row
cat(sum(ptime)*nsim,"Poisson random variates simulated","\n")
for(i in 1:nsim){res<-glm(y[i,]~group,family=poisson())
rej[i]<-summary(res)$coeff[2,4]<=0.05}
list(power=sum(rej)/nsim) }
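
For reference, an untested sketch of the binomial/logistic variant mentioned
above, treating each unit of person-time as a Bernoulli trial with the given
event probability (the function name and defaults are mine, not from any
package):

sim.pwr.binom<-function(p=c(0.0065,0.0003),n=c(1081,3180),nsim=1000)
{ # two-group logistic regression power by simulation
group<-as.factor(rep(c(1,2),n))
rej<-logical(nsim)
for(i in 1:nsim){
y<-c(rbinom(n[1],1,p[1]),rbinom(n[2],1,p[2]))
res<-glm(y~group,family=binomial())
rej[i]<-summary(res)$coefficients[2,4]<=0.05}
list(power=sum(rej)/nsim) }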







Re: [R] nlme question

2005-11-21 Thread Wassell, James T., Ph.D.
Deepayan, 

Yes, thanks for confirming my suspicions.  I know mixed models are
"different", but I did not think they were so different as to preclude
estimating the var-cov matrix (via the Hessian in maximum likelihood, as
you point out).

Thanks for prompting me to think about MCMC.  Your suggestion makes me
realize that using BUGS, I could directly sample from the posterior of
the linear combination of parameters to get its variance, and so
eliminate the extra step involving the var-cov matrix.  As you say, the
results should be better than the asymptotic approximation.  (Maybe I
can do the same thing with mcmcsamp?  I'm not familiar with it and will
have to take a look.)
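
The post-processing would be simple once the draws exist; for example, with a
made-up stand-in for real MCMC output (columns mu, tau, sigma):

set.seed(1)
draws <- cbind(mu=rnorm(1000,-5.5,0.2),      # fake draws, for illustration only
               tau=abs(rnorm(1000,0.5,0.1)),
               sigma=abs(rnorm(1000,1.0,0.1)))
kappa <- draws[,"mu"] + 1.645*draws[,"tau"] + 1.645*draws[,"sigma"]
var(kappa)   # posterior variance of Kappa; parameter covariances handled automatically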
 
-Original Message-
From: Deepayan Sarkar [mailto:[EMAIL PROTECTED] 
Sent: Thursday, November 17, 2005 2:22 PM
To: Doran, Harold
Cc: Wassell, James T., Ph.D.; r-help@stat.math.ethz.ch
Subject: Re: nlme question

On 11/17/05, Doran, Harold <[EMAIL PROTECTED]> wrote:
> I think the authors are mistaken. Sigma is random error, and due to its
> randomness it cannot be systematically related to anything. It is this
> ind. assumption that allows for the likelihood to be expressed as
> described in Pinheiro and Bates, p. 62.

I think not.  The issue is dependence between the _estimates_ of sigma,
tau, etc., and that may well be present.  Presumably, if one can compute
the likelihood surface as a function of the 3 parameters, the Hessian
at the MLEs would give the estimated covariance.  However, I don't
think nlme does this.
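
A rough sketch of that idea for the one-way model above (my illustration only,
not something nlme does for you; it assumes the "mydata" data frame posted
elsewhere in this thread):

# negative marginal log-likelihood in terms of (mu, tau, sigma)
negll <- function(par, data) {
  mu <- par[1]; tau <- par[2]; sigma <- par[3]
  nll <- 0
  for (yi in split(data$y, data$subject)) {
    ni <- length(yi)
    Vi <- sigma^2*diag(ni) + tau^2*matrix(1,ni,ni)   # marginal covariance for one subject
    ri <- yi - mu
    nll <- nll + 0.5*(ni*log(2*pi) + as.numeric(determinant(Vi)$modulus) +
                      drop(crossprod(ri, solve(Vi, ri))))
  }
  nll
}
fit <- optim(c(mean(mydata$y), 0.5, 1), negll, data=mydata,
             method="L-BFGS-B", lower=c(-Inf,1e-6,1e-6), hessian=TRUE)
vcov3 <- solve(fit$hessian)   # approximate 3x3 var-cov matrix of (mu, tau, sigma)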

A different approach you might want to consider is using mcmcsamp in
the lme4 package (or more precisely, the Matrix package) to get
samples from the joint posterior distribution. This is likely to be
better than the asymptotic normal approximation in any case.

Deepayan



Re: [R] nlme question

2005-11-17 Thread Wassell, James T., Ph.D.
Thank you for taking the time to think about my problem. 

The reference states:  "The covariance structure must be considered,
because for unbalanced data the estimates" (i.e. mu, sigma and tau hats)
"are not typically independent." Page 105.  It would be nice to simply
assume zero covariance terms, but the authors reject this
simplification.



[R] nlme question

2005-11-17 Thread Wassell, James T., Ph.D.


-Original Message-
From: Wassell, James T., Ph.D. 
Sent: Thursday, November 17, 2005 9:40 AM
To: 'Deepayan Sarkar'
Subject: RE: nlme question

Deepayan, 

Thanks for your interest.  It's difficult to write in email, but I need
the variance of Kappa = mu + 1.645*tau + 1.645*sigma.

Just using the standard variance calculation, Var(K) = Var(mu) +
(1.645)^2 * Var(tau) + (1.645)^2 * Var(sigma) + [three covariance terms
with constant multipliers].

So I can get Var(mu) [or actually the standard error] from the summary
function, and Var(tau) and Var(sigma) using the VarCorr function, but I
still need the covariance terms.
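
In other words, if I had the full 3x3 var-cov matrix the rest would just be a
quadratic form; a sketch with made-up numbers standing in for the real matrix:

a <- c(1, 1.645, 1.645)              # weights on (mu, tau, sigma)
V <- matrix(c(0.040, 0.005, 0.002,   # placeholder var-cov matrix, illustrative values only
              0.005, 0.020, 0.003,
              0.002, 0.003, 0.010), nrow=3, byrow=TRUE)
drop(t(a) %*% V %*% a)               # Var(Kappa) = a' V a, covariance terms included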

I am trying to duplicate methods in a paper by Nicas & Neuhaus,
"Variability in Respiratory Protection and the Assigned Protection
Factor," J. Occ & Environ Health, Feb. 2004, pp. 99-109 (see eqn. 12).

The authors used PROC MIXED, but I can't figure out how to get the
covariance terms with SAS either.

Thanks again. 

Terry Wassell

-Original Message-
From: Deepayan Sarkar [mailto:[EMAIL PROTECTED] 
Sent: Thursday, November 17, 2005 2:52 AM
To: Wassell, James T., Ph.D.
Cc: r-help@stat.math.ethz.ch
Subject: Re: nlme question

On 11/16/05, Wassell, James T., Ph.D. <[EMAIL PROTECTED]> wrote:
> I am using the package nlme to fit a simple random effects (variance
> components model)
>
> with 3 parameters:  overall mean (fixed effect), between subject
> variance (random) and  within subject variance (random).

So to paraphrase, your model can be written as (with the index i
representing subject)

y_ij = \mu + b_i + e_ij

where

b_i ~ N(0, \tau^2)
e_ij ~ N(0, \sigma^2)

and all b_i's and e_ij's are mutually independent.  The model has, as
you say, 3 parameters: \mu, \tau, and \sigma.

> I have 16 subjects with 1-4 obs per subject.
>
> I need a 3x3 variance-covariance matrix that includes all 3 parameters
> in order to compute the variance of a specific linear combination.

Can you specify the 'linear combination' that you want to estimate in
terms of the model above?

Deepayan



[R] nmle question

2005-11-16 Thread Wassell, James T., Ph.D.
Hello.

I have 16 subjects with 1-4 obs per subject. 

I am using the package "nlme" to fit a simple random-effects (variance
components) model with 3 parameters: overall mean (fixed effect),
between-subject variance (random), and within-subject variance (random).
 
I need a 3x3 variance-covariance matrix that includes all 3 parameters
in order to compute the variance of a specific linear combination.

But I can't get the 3x3 matrix.  Should I specify the formula in lme
differently, or is there something else I might try?

Thank you very much for any advice.  My data and code follow.

"mydata" <-
structure(list(subject = c(17, 17, 17, 17, 5, 16, 16, 8, 8, 8, 
8, 7, 7, 7, 7, 9, 9, 9, 10, 10, 11, 11, 11, 12, 12, 12, 12, 14, 
14, 14, 14, 15, 15, 15, 15, 13, 13, 13, 1, 1, 1, 2, 2, 2, 2, 
3, 3, 3, 4, 4, 4), y = c(-2.944, -5.521, -4.644, -4.736, -5.799, 
-4.635, -5.986, -5.011, -3.989, -4.682, -6.975, -6.064, -5.991, 
-8.068, -5.075, -5.298, -6.446, -5.037, -6.534, -4.828, -5.886, 
-4.025, -6.607, -5.914, -4.159, -6.757, -4.564, -5.011, -5.416, 
-5.371, -5.768, -7.962, -5.635, -4.575, -5.268, -6.975, -5.598, 
-7.669, -8.292, -7.265, -5.858, -7.003, -3.807, -5.829, -5.613, 
-3.135, -5.136, -5.394, -5.011, -5.598, -4.174)), .Names = c("subject", 
"y"), class = "data.frame", row.names = c("1", "2", "3", "4", 
"5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15", 
"16", "17", "18", "19", "20", "21", "22", "23", "24", "25", "26", 
"27", "28", "29", "30", "31", "32", "33", "34", "35", "36", "37", 
"38", "39", "40", "41", "42", "43", "44", "45", "46", "47", "48", 
"49", "50", "51"))

library(nlme)
lme.res<-lme(fixed=y~1,data=mydata,random=~1|subject,method="ML")
VarCorr(lme.res)


