Re: [R] opposite estimates from zeroinfl() and hurdle()
At 12:36 23/10/2009, Tord Snäll wrote:

> Dear all,
> A question related to the following has been asked on R-help before, but
> I could not find any answer to it. Input will be much appreciated.

The vignette explains this, and much more. I found it extremely instructive both for its intended purpose, count data, and for general tips about manipulating several models and comparing them.

> I got an unexpected sign of the "slope" parameter associated with a
> covariate (diam) using zeroinfl(). It led me to compare the estimates
> given by zeroinfl() and hurdle():
>
> The (significant) negative estimate here is surprising, given the
> biology of the species:
>
> > summary(zeroinfl(bnl ~ 1 | diam, dist = "poisson", data = valdaekar, EM = TRUE))
>
> Count model coefficients (poisson with log link):
>             Estimate Std. Error z value Pr(>|z|)
> (Intercept)  3.74604    0.02635   142.2   <2e-16 ***
>
> Zero-inflation model coefficients (binomial with logit link):
>             Estimate Std. Error z value Pr(>|z|)
> (Intercept)  21.7510     7.6525   2.842  0.00448 **
> diam         -1.1437     0.3941  -2.902  0.00371 **
>
> Number of iterations in BFGS optimization: 1
> Log-likelihood: -582.8 on 3 Df
>
> The hurdle model gives the same estimates, but with opposite (and
> expected) signs of the parameters:
>
> > summary(hurdle(bnl ~ 1 | diam, dist = "poisson", data = valdaekar))
>
> Count model coefficients (truncated poisson with log link):
>             Estimate Std. Error z value Pr(>|z|)
> (Intercept)  3.74604    0.02635   142.2   <2e-16 ***
>
> Zero hurdle model coefficients (binomial with logit link):
>             Estimate Std. Error z value Pr(>|z|)
> (Intercept) -21.7510     7.6525  -2.842  0.00448 **
> diam          1.1437     0.3941   2.902  0.00371 **
> ---
> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
> Number of iterations in BFGS optimization: 8
> Log-likelihood: -582.8 on 3 Df
>
> Why is this so?
>
> thanks,
> Tord
>
> Windows NT, R 2.8.1, pscl 1.03
>
> --
> Tord Snäll
> Department of Ecology / Swedish Species Information Centre
> Swedish University of Agricultural Sciences (SLU)
> P.O. 7044, SE-750 07 Uppsala, Sweden
> Office/Mobile/Fax +46-18-672612/+46-76-7662612/+46-18-673537
> www.ekol.slu.se/staff_tordsnall
> www.artdata.slu.se/personal/fototsn.asp

Michael Dewey
http://www.aghmed.fsnet.co.uk

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] opposite estimates from zeroinfl() and hurdle()
Tord:

The logistic zero-inflation portion of the zeroinfl() implementation of ZIP or ZINB predicts the probability of 0 rather than the probability of 1 (>0 counts), so the signs of the coefficients are often reversed from what you would expect if you had just performed a logistic regression. I'm guessing that the hurdle model, as a two-stage model, uses a logistic regression predicting the probability of 1, hence the reversed signs of the estimates in the logistic regression portion of the model.

Brian

Brian S. Cade, PhD
U. S. Geological Survey
Fort Collins Science Center
2150 Centre Ave., Bldg. C
Fort Collins, CO 80526-8818
email: brian_c...@usgs.gov
tel: 970 226-9326
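The explanation above can be checked in base R, without pscl. The following is only a sketch on simulated data, not the poster's data: the variable names `diam` and `bnl` are borrowed from the post, and the simulation coefficients (-6 and 0.3) are invented for illustration. It fits the same logistic regression twice, once with "zero" as the success and once with "positive count" as the success. In the original post the fitted count mean is about exp(3.746), roughly 42, so the Poisson part contributes essentially no zeros; that is presumably why the two sets of binomial estimates there agree in magnitude to the printed precision.

```r
# Simulated data; names mimic the post, values are invented for illustration.
set.seed(1)
n <- 200
diam <- runif(n, 10, 30)

# Assumption: the probability of a non-zero count rises with diam.
p_pos <- plogis(-6 + 0.3 * diam)
present <- rbinom(n, 1, p_pos)
bnl <- ifelse(present == 1, rpois(n, 40) + 1, 0)  # "+ 1" keeps positives positive

# Parameterisation 1: success = zero count (the direction zeroinfl() reports).
fit_zero <- glm(I(bnl == 0) ~ diam, family = binomial)

# Parameterisation 2: success = positive count (the direction hurdle() reports).
fit_pos <- glm(I(bnl > 0) ~ diam, family = binomial)

# Same magnitudes, opposite signs.
cbind(zero = coef(fit_zero), positive = coef(fit_pos))
```

Because logit(1 - p) = -logit(p), swapping which outcome counts as the "success" negates every coefficient while leaving standard errors and the log-likelihood unchanged.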
Re: [R] opposite estimates from zeroinfl() and hurdle()
Tord Snäll-4 wrote:
>
> Dear all,
> A question related to the following has been asked on R-help before, but
> I could not find any answer to it. Input will be much appreciated.
>
> I got an unexpected sign of the "slope" parameter associated with a
> covariate (diam) using zeroinfl(). It led me to compare the estimates
> given by zeroinfl() and hurdle():
> [snip]
>

The right thing to do in this case is to poke through the code of hurdle() and zeroinfl(), but a simple (?) demonstration shows that hurdle() and zeroinfl() are indeed reporting opposite values:

hurdle() reports -log(p/(1-p)) = -qlogis(p), where p is the probability of a zero count:

  library(pscl)              ## provides hurdle() and zeroinfl()
  z = rpois(500,lambda=3)
  z = (z[z>0])[1:90]         ## 90 strictly positive counts
  z = c(z,rep(0,10))         ## plus exactly 10 zeros
  hurdle(z~1)
  ## -qlogis(0.1)
  ## zero coefficient always == -qlogis(0.1)

zeroinfl() reports log(p/(1-p)), where p is the zero-inflation probability:

  z = rpois(90,lambda=3)
  z = c(z,rep(0,10))
  zeroinfl(z~1)
  ## qlogis(0.1)

  tmpf = function() {
    z = rpois(90,lambda=3)
    z = c(z,rep(0,10))
    coef(zeroinfl(z~1))[2]
  }
  rr = replicate(1000,tmpf())
  hist(rr,breaks=1000)
  summary(rr)
  qlogis(0.1)

Perhaps it would be worth sending an e-mail to the package maintainers to request a note to this effect in the documentation, particularly if this is a FAQ ...
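To make the sign convention concrete, the coefficients printed in the original post can be turned back into probabilities with plogis(). This is a sketch only: the three diameters are hypothetical values chosen for illustration, and the zeroinfl() result is read as the probability of an excess zero, which matches the overall probability of a zero only approximately (here closely, since the fitted Poisson mean is large).

```r
# Coefficients copied from the two summaries in the original post.
zi <- c(intercept = 21.7510, diam = -1.1437)   # zeroinfl(): logit of P(excess zero)
hu <- c(intercept = -21.7510, diam = 1.1437)   # hurdle():   logit of P(count > 0)

# Hypothetical diameters, chosen only for illustration.
diam <- c(15, 19, 23)

p_zero <- plogis(zi[["intercept"]] + zi[["diam"]] * diam)
p_pos  <- plogis(hu[["intercept"]] + hu[["diam"]] * diam)

# The two columns are complementary: each row sums to 1.
round(cbind(diam, p_zero, p_pos, total = p_zero + p_pos), 3)
```

Read this way, both fits say the same thing about the species: larger `diam` makes a zero less likely and a positive count more likely.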
[R] opposite estimates from zeroinfl() and hurdle()
Dear all,

A question related to the following has been asked on R-help before, but I could not find any answer to it. Input will be much appreciated.

I got an unexpected sign of the "slope" parameter associated with a covariate (diam) using zeroinfl(). It led me to compare the estimates given by zeroinfl() and hurdle():

The (significant) negative estimate here is surprising, given the biology of the species:

> summary(zeroinfl(bnl ~ 1 | diam, dist = "poisson", data = valdaekar, EM = TRUE))

Count model coefficients (poisson with log link):
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  3.74604    0.02635   142.2   <2e-16 ***

Zero-inflation model coefficients (binomial with logit link):
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  21.7510     7.6525   2.842  0.00448 **
diam         -1.1437     0.3941  -2.902  0.00371 **

Number of iterations in BFGS optimization: 1
Log-likelihood: -582.8 on 3 Df

The hurdle model gives the same estimates, but with opposite (and expected) signs of the parameters:

> summary(hurdle(bnl ~ 1 | diam, dist = "poisson", data = valdaekar))

Count model coefficients (truncated poisson with log link):
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  3.74604    0.02635   142.2   <2e-16 ***

Zero hurdle model coefficients (binomial with logit link):
            Estimate Std. Error z value Pr(>|z|)
(Intercept) -21.7510     7.6525  -2.842  0.00448 **
diam          1.1437     0.3941   2.902  0.00371 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Number of iterations in BFGS optimization: 8
Log-likelihood: -582.8 on 3 Df

Why is this so?

thanks,
Tord

Windows NT, R 2.8.1, pscl 1.03

--
Tord Snäll
Department of Ecology / Swedish Species Information Centre
Swedish University of Agricultural Sciences (SLU)
P.O. 7044, SE-750 07 Uppsala, Sweden
Office/Mobile/Fax +46-18-672612/+46-76-7662612/+46-18-673537
www.ekol.slu.se/staff_tordsnall
www.artdata.slu.se/personal/fototsn.asp