Hi all, I am trying to run a power analysis using simulated data to compare the power of a glm versus a binomial proportion test to detect differences in proportions. For example, suppose you have some proportion that decreases by some amount over X number of time steps. .4,.39,.38,.37 . . . . .01 and simulate data over those time steps based on the decrease. Does a glm approach (differences in slope) "outperform" an approach whereby you simply look at proportional differences. I ran simulations and basically summed up the number of times p<.05 divided by the number of trials. I was expecting that a glm approach would have more power because it would be utilizing all the data across multiple time steps, whereas a binomial proportion test is only comparing two populations ( the beginning and the end points).
However the results indicated that a binomial proportion test had more power relative to a glm at fewer time steps and that, depending on the simulated decrease in proportions, the relationship between glm power and binomial proportion test power changed. Interestingly, a greater decreases, a binomial proportion test seems have greater power compared to a glm, which to me is counterintuitive since the slope should be greater. I am attaching the code below my questions in case anyone is interested. My questions are: (1) Am I interpreting the results and p-values correctly? (2) If I am interested in trends, does glm results really have lower power and, if so, is there a way to combine the two tests? I know that I am ignoring a lot of issues such as autocorrelation, but I am really just trying to understand the output. Any suggestions or insights would be appreciated. ##### Attempt at using glm model ###### ltProb <- 0.4 ## longterm average of nest survival probability change <- c(.01,.03,.05) yrs <- seq(1,20, by=1) ## Years of inquiry, probability of detecting a yearly change nest survival samplesize <- 50 ## Reasonable sample size range reps <- 1000 ## of simulations per SurvProb <- ltProb ## initiating survival probablity for later use, nuisance power <- matrix(nrow=length(change)*length(yrs)*length(samplesize), ncol=5) ## creating a matrix to hold data scenario <- 0 ## initializing scenarios which will be a placeholder later on for (a in 1:length(change)){ ## loop through yearly pop change scenarios for (c in 1:length(samplesize)){ ## loop through sample size scenarios for (b in 1:length(yrs)){ ## loop through years sceanarios scenario=scenario+1 power[scenario,1] <- change[a] ## filling matrix with pop change used power[scenario,2]<-ltProb*((1-change[a])**yrs[b]) power[scenario,3] <- yrs[b] ## filling matrix with number of years used power[scenario,4] <- samplesize[c] ## filling matrix with sample size used } } } colnames(power) <- c("PopChange", "Proportion2","yrs", "Sample_Size", "Power") power=as.data.frame(power) power$Sample_Size=as.numeric(power$Sample_Size) scenario=levels(as.factor(power$PopChange)) ## To subset power ps = levels(as.factor(power$Sample_Size)) results <- matrix(nrow=0, ncol=5) ## final matrix results=as.data.frame(results) reps=1000 ## set number of reps for (k in 1:length(scenario)){ for (m in 1:length(ps)){ sub=power[(power$PopChange==scenario[k]) & power$Sample_Size==ps[m],] ## Subset by scenario and sample size probs = matrix(nrow=1000,ncol=16) ## this is matrix for probabilities. dat=matrix(nrow=0,ncol=3,NA) dat=as.data.frame(dat) for (l in 1:reps){ ##This should be for the replicates dat=matrix(nrow=0,ncol=3,NA) dat=as.data.frame(dat) print(l) for (j in 1:nrow(sub)){ tmp=rbinom(n=sub$Sample_Size[j],size=1,prob=sub[j,2]) tmp.dat=cbind(tmp,rep(sub$yrs[j],sub$Sample_Size[j]),rep(sub$Proportion2[j],sub$Sample_Size[j])) dat=rbind(dat,tmp.dat) if (sub$yrs[j]>4){ x=(glm(dat[,1]~dat[,2],family=binomial)) probs[l,j-4]=ifelse(summary(x)$coefficients[2,4]<.05,1,0) } } } sub$Power[5:20]=apply(probs,2,sum)/nrow(probs) results=rbind(results,sub) } } tmp=results[!is.na(results$Power),] ### remove NAs tmp$SampleSize_Decline=paste(tmp$Sample_Size,tmp$PopChange,sep=":") tmp$SampleSize_Decline=as.factor(tmp$SampleSize_Decline) ### Now do the same using prop.test ##### tmp$PropPower = NA for (i in 1:nrow(tmp)){ print(i) x=0 for (j in 1:1000){ a = rbinom(tmp$Sample_Size[i],1,prob = (tmp$Proportion2[i])) b=binom.test(sum(a),length(a),p=.4,conf.level=.95,alternative="less") if (b$p.value < .05) {x=x+1} } tmp$PropPower[i] = x/1000 } ##### graph results d=unique(tmp$PopChange) for (i in 1:length(d)){ par(mfcol = c(1,2)) sub = tmp[tmp$PopChange == d[i],] plot(sub$yrs,sub$Power,col="red", main = d[i], xlab = "Number of years",ylab="Power") points(sub$yrs,sub$PropPower,col="black") plot(sub$PropPower,sub$Power,xlab="Binomial Proportion Power", ylab="GLM Power",main=d[i]) abline(0,1) readline("Press <return to continue") } ##### End Code ##### Wade A. Wall US Army ERDC-CERL P.O. Box 9005 Champaign, IL 61826-9005 1-217-373-4420 wade.a.w...@usace.army.mil<mailto:wade.a.w...@usace.army.mil> [[alternative HTML version deleted]] ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.