Z is correct, of course. I was being a little too simplistic in my
explanation, which was only meant to emphasize the reversal of signs of the
coefficients in the logistic regression part of the zero-inflated model.
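
To make Z's point concrete, here is a rough sketch in base R that plugs the
coefficient estimates from Valentina's output (quoted below) into the mixture
formula P(Y = 0) = pi + (1 - pi) * f(0), where pi is the excess-zero
probability and f(0) is the negative binomial probability of a zero. The
discharge value of 8 is an arbitrary choice for illustration only (it is not
taken from the data), and the variable names are made up for this example.

```r
# Coefficient estimates copied from the zeroinfl() output quoted below
b0_count <- 2.557066; b1_count <- 0.064698   # count part (negbin, log link)
b0_zero  <- 13.01011; b1_zero  <- -1.64293   # zero-inflation part (logit link)
theta    <- 0.4604

discharge <- 8   # hypothetical value, chosen only for illustration

# Probability of an "excess" zero, component (2)
p_excess <- plogis(b0_zero + b1_zero * discharge)

# Mean of the count component (1) and its probability of a "random" zero
mu        <- exp(b0_count + b1_count * discharge)
p_nb_zero <- dnbinom(0, size = theta, mu = mu)

# Probability of observing a zero from either source
p_zero <- p_excess + (1 - p_excess) * p_nb_zero

# Negating the zero-inflation coefficients gives 1 - p_excess,
# which is not the probability of a nonzero count
p_negated <- plogis(-(b0_zero + b1_zero * discharge))

round(c(excess_zero = p_excess, any_zero = p_zero,
        negated_coef = p_negated, nonzero = 1 - p_zero), 3)
```

At this discharge the excess-zero probability is about 0.47, while the
probability of observing a zero is about 0.56, so simply flipping the signs
(about 0.53) overstates the probability of a nonzero count (about 0.44).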


Brian S. Cade, PhD

U. S. Geological Survey
Fort Collins Science Center
2150 Centre Ave., Bldg. C
Fort Collins, CO  80526-8818

email:  ca...@usgs.gov <brian_c...@usgs.gov>
tel:  970 226-9326

On Wed, Aug 14, 2013 at 4:07 AM, Achim Zeileis <achim.zeil...@uibk.ac.at> wrote:

> On Tue, 13 Aug 2013, Cade, Brian wrote:
>> Lauria: For historical reasons the logistic regression (binomial with
>> logit link) model portion of a zero-inflated count model is usually
>> structured to predict the probability of the 0 counts rather than the
>> nonzero (>=1) counts so the coefficients will be the negative of what you
>> expect based on the count model portion (as in your output).  It is simple
>> to interpret the probability of the logistic regression portion as the
>> probability of the nonzero counts by just taking the negative of the
>> coefficient estimates provided for the probability of the zero counts.
> This is a common misinterpretation but not quite correct.
> The zero-inflation model is a mixture model of two components: (1) a count
> component (Poisson, NB, ...), and (2) a zero mass component (i.e., zero
> with probability 1). Hence, the observed zeros in the data can come from
> both sources: either they are "random" zeros from component (1) or "excess"
> zeros from component (2).
> The binomial zero-inflation part of the model predicts the probability
> that a given observation belongs to component (2), i.e., the probability of
> an "excess zero". But this is _not_ the probability of observing a zero in
> the data (which is larger than the excess zero probability).
> If you want a model that first models zero vs. non-zero and second the
> non-zero counts, use the hurdle model. This has exactly the interpretation
> you describe above.
> Best,
> Z
>> Brian
>> On Tue, Aug 13, 2013 at 9:06 AM, Lauria, Valentina <
>> valentina.lau...@nuigalway.ie> wrote:
>>> Dear All,
>>> I am running a negative binomial model in R using the package pscl in
>>> order to estimate bed sediment movements versus river discharge. Currently
>>> we have deployed 4 different plates to test if a combination of more than
>>> one plate would better describe the sediment movements when the river
>>> discharge changes over time.
>>> My data are positively skewed and zero-inflated. I ran both
>>> zero-inflated Poisson and zero-inflated negative binomial regressions and
>>> compared them using the Vuong test, which showed that the negative
>>> binomial works better than a simple zero-inflated Poisson.
>>> My models look like:
>>> 1) plate 1 ~ river discharge
>>> 2) (plate 1 + plate 2) ~ river discharge
>>> 3) (plate 1 + plate 2 + plate 3) ~ river discharge
>>> 4) (plate 1 + plate 2 + plate 3 + plate 4) ~ river discharge
>>> My main problem, as I am new to this type of model, is that I get a
>>> different sign for the coefficient of discharge in the output of the
>>> zero-inflated negative binomial model (please see below). What does this
>>> mean? Also, how could I compare the different models (1-4), i.e., what
>>> tells me which is performing best? Thank you very much in advance for any
>>> comments and suggestions!
>>> Kind Regards,
>>> Valentina
>>> Call:
>>> zeroinfl(formula = plate1 ~ discharge, data = datafit_plates, dist =
>>> "negbin", EM = TRUE)
>>> Pearson residuals:
>>>     Min      1Q  Median      3Q     Max
>>> -0.6770 -0.3564 -0.2101 -0.0814 12.3421
>>> Count model coefficients (negbin with log link):
>>>              Estimate Std. Error z value Pr(>|z|)
>>> (Intercept)  2.557066   0.036593   69.88   <2e-16 ***
>>> discharge    0.064698   0.001983   32.63   <2e-16 ***
>>> Log(theta)  -0.775736   0.012451  -62.30   <2e-16 ***
>>> Zero-inflation model coefficients (binomial with logit link):
>>>             Estimate Std. Error z value Pr(>|z|)
>>> (Intercept) 13.01011    0.22602   57.56   <2e-16 ***
>>> discharge   -1.64293    0.03092  -53.14   <2e-16 ***
>>> Theta = 0.4604
>>> Number of iterations in BFGS optimization: 1
>>> Log-likelihood: -6.933e+04 on 5 Df
