Re: [R] Lambert (1992) simulation
On Sat, 5 May 2012, Christopher Desjardins wrote: Hi, I am a little confused at the output from predict() for a zeroinfl object. Here's my confusion: ## From zeroinfl package fm_zinb2 - zeroinfl(art ~ . | ., data = bioChemists, dist = negbin) ## The raw zero-inflated overdispersed data table(bioChemists$art) 0 1 2 3 4 5 6 7 8 9 10 11 12 16 19 275 246 178 84 67 27 17 12 1 2 1 1 2 1 1 ## The default output from predict. It looks like it is doing a horrible job. Does it really predict 7 zeros? No, see also this R-help post on Zero-inflated regression models: predicting no 0s: https://stat.ethz.ch/pipermail/r-help/2011-June/279765.html The predicted _mean_ of a negative binomial distribution is not the most likely outcome (i.e., the _mode_) of the distribution. The post above presents some hands on examples. table(round(predict(fm_zinb2)) ) 0 1 2 3 4 5 6 10 7 354 487 45 12 6 3 1 ## The output from predict using count table(round(predict(fm_zinb2,type=count))) 1 2 3 4 5 6 10 312 536 45 12 6 3 1 ## The output from predict using zero, but here it predicts 24 structural zeros? table(round(predict(fm_zinb2,type=zero))) 0 1 891 24 So my question is how do I interpret these different outputs from the zeroinf object? What are the differences? The help page just left me confused. I would expect that table(round(predict(fm_zinb2))) would be E(Y) and would most accurately track table(bioChemists$art) but I am wrong. How can I find the E(Y) that would most closely track the raw data? Thanks, Chris __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Lambert (1992) simulation
On Thu, 26 Apr 2012, Christopher Desjardins wrote: Hi, I am trying to replicate Lambert (1992)'s simulation with zero-inflated Poisson models. The citation is here: @article{lambert1992zero, Author = {Lambert, D.}, Journal = {Technometrics}, Pages = {1--14}, Publisher = {JSTOR}, Title = {Zero-inflated {P}oisson regression, with an application to defects in manufacturing}, Year = {1992}} Specifically I am trying to recreate Table 2. But my estimates for Gamma are not working out. Any ideas why? Your implementation of the inverse logit link is wrong, it should be exp(x)/(1 + exp(x)) and not exp(x)/exp(1 + x). In R I never code this by hand but either use qlogis()/plogis() or make.link(logit). Your setting resulting in an almost constant probability of zero inflation and hence gamma was completely off. Below is my cleaned up code which nicely replicates the results for n = 100. The other two would require additional work because one would need to catch non-convergence etc. hth, Z ## data-generating process dgp - function(nobs = 100) { gamma - c(-1.5, 2) beta - c(1.5, -2) x - seq(0, 1, length.out = nobs) lambda - exp(beta[1] + beta[2] * x) p - plogis(gamma[1] + gamma[2] * x) y - ifelse(runif(nobs) = p, 0, rpois(nobs, lambda = lambda)) data.frame(y = y, x = x) } ## simulation of coefficients and standard errors sim - function(nrep = 2000, ...) { onesim - function(i) { d - dgp(...) m - zeroinfl(y ~ x, data = d) c(coef(m), sqrt(diag(vcov(m } t(sapply(1:nrep, onesim)) } ## run simulation #3 library(pscl) set.seed(1) cfse - sim(nobs = 100) ## mean coefficient estimates apply(cfse[, 1:4], 2, mean) ## median coefficient estimates apply(cfse[, 1:4], 2, median) ## sd of coefficient estimates apply(cfse[, 1:4], 2, sd) ## mean of the standard deviation estimates from observed information apply(cfse[, 5:8], 2, mean) Please cc me on a reply! Thanks, Chris ## ZIP simulations based on Lambert (1992)'s conditions # Empty workspace rm(list=ls()) # Load zero-inflation package library(pscl) # Simulation conditions #3 NumSimulations=2000 X=seq(from=0,to=1,length.out=N) Model.Matrix=cbind(rep(1,length(X)),X) Gamma=c(-1.5,2) Beta=c(1.5,-2) P=exp(Model.Matrix%*%Gamma)/exp(1+Model.Matrix%*%Gamma) Lambda=exp(Model.Matrix%*%Beta) CoefSimulations=matrix(nrow=NumSimulations,ncol=2*dim(Model.Matrix)[2]) for(i in 1 : NumSimulations){ Lambda.Draw=rpois(N,Lambda) U=runif(N) Y=ifelse(U=P,0,Lambda.Draw) CoefSimulations[i,]=coef(zeroinfl(Y~X|X)) } # What were the estimates? colMeans(CoefSimulations) # My gamma estimates aren't even close ... [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Lambert (1992) simulation
On Fri, Apr 27, 2012 at 1:53 AM, Achim Zeileis achim.zeil...@uibk.ac.atwrote: On Thu, 26 Apr 2012, Christopher Desjardins wrote: Hi, I am trying to replicate Lambert (1992)'s simulation with zero-inflated Poisson models. The citation is here: @article{lambert1992zero, Author = {Lambert, D.}, Journal = {Technometrics}, Pages = {1--14}, Publisher = {JSTOR}, Title = {Zero-inflated {P}oisson regression, with an application to defects in manufacturing}, Year = {1992}} Specifically I am trying to recreate Table 2. But my estimates for Gamma are not working out. Any ideas why? Your implementation of the inverse logit link is wrong, it should be exp(x)/(1 + exp(x)) and not exp(x)/exp(1 + x). In R I never code this by hand but either use qlogis()/plogis() or make.link(logit). Doh! So obvious. I really should have noticed that. Thanks! I'll have a look at the rest of your code too. Cheers, Chris Your setting resulting in an almost constant probability of zero inflation and hence gamma was completely off. Below is my cleaned up code which nicely replicates the results for n = 100. The other two would require additional work because one would need to catch non-convergence etc. hth, Z ## data-generating process dgp - function(nobs = 100) { gamma - c(-1.5, 2) beta - c(1.5, -2) x - seq(0, 1, length.out = nobs) lambda - exp(beta[1] + beta[2] * x) p - plogis(gamma[1] + gamma[2] * x) y - ifelse(runif(nobs) = p, 0, rpois(nobs, lambda = lambda)) data.frame(y = y, x = x) } ## simulation of coefficients and standard errors sim - function(nrep = 2000, ...) { onesim - function(i) { d - dgp(...) m - zeroinfl(y ~ x, data = d) c(coef(m), sqrt(diag(vcov(m } t(sapply(1:nrep, onesim)) } ## run simulation #3 library(pscl) set.seed(1) cfse - sim(nobs = 100) ## mean coefficient estimates apply(cfse[, 1:4], 2, mean) ## median coefficient estimates apply(cfse[, 1:4], 2, median) ## sd of coefficient estimates apply(cfse[, 1:4], 2, sd) ## mean of the standard deviation estimates from observed information apply(cfse[, 5:8], 2, mean) Please cc me on a reply! Thanks, Chris ## ZIP simulations based on Lambert (1992)'s conditions # Empty workspace rm(list=ls()) # Load zero-inflation package library(pscl) # Simulation conditions #3 NumSimulations=2000 X=seq(from=0,to=1,length.out=**N) Model.Matrix=cbind(rep(1,**length(X)),X) Gamma=c(-1.5,2) Beta=c(1.5,-2) P=exp(Model.Matrix%*%Gamma)/**exp(1+Model.Matrix%*%Gamma) Lambda=exp(Model.Matrix%*%**Beta) CoefSimulations=matrix(nrow=**NumSimulations,ncol=2*dim(** Model.Matrix)[2]) for(i in 1 : NumSimulations){ Lambda.Draw=rpois(N,Lambda) U=runif(N) Y=ifelse(U=P,0,Lambda.Draw) CoefSimulations[i,]=coef(**zeroinfl(Y~X|X)) } # What were the estimates? colMeans(CoefSimulations) # My gamma estimates aren't even close ... [[alternative HTML version deleted]] __** R-help@r-project.org mailing list https://stat.ethz.ch/mailman/**listinfo/r-helphttps://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/** posting-guide.html http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Lambert (1992) simulation
Hi, I am trying to replicate Lambert (1992)'s simulation with zero-inflated Poisson models. The citation is here: @article{lambert1992zero, Author = {Lambert, D.}, Journal = {Technometrics}, Pages = {1--14}, Publisher = {JSTOR}, Title = {Zero-inflated {P}oisson regression, with an application to defects in manufacturing}, Year = {1992}} Specifically I am trying to recreate Table 2. But my estimates for Gamma are not working out. Any ideas why? Please cc me on a reply! Thanks, Chris ## ZIP simulations based on Lambert (1992)'s conditions # Empty workspace rm(list=ls()) # Load zero-inflation package library(pscl) # Simulation conditions #3 NumSimulations=2000 X=seq(from=0,to=1,length.out=N) Model.Matrix=cbind(rep(1,length(X)),X) Gamma=c(-1.5,2) Beta=c(1.5,-2) P=exp(Model.Matrix%*%Gamma)/exp(1+Model.Matrix%*%Gamma) Lambda=exp(Model.Matrix%*%Beta) CoefSimulations=matrix(nrow=NumSimulations,ncol=2*dim(Model.Matrix)[2]) for(i in 1 : NumSimulations){ Lambda.Draw=rpois(N,Lambda) U=runif(N) Y=ifelse(U=P,0,Lambda.Draw) CoefSimulations[i,]=coef(zeroinfl(Y~X|X)) } # What were the estimates? colMeans(CoefSimulations) # My gamma estimates aren't even close ... [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.