Dear List,

I apologise in advance for all my questions. 

I am interested in predicting the habitat selection of fish species using a 
hurdle model. I know that I can do this in R with the function 
predict.hurdle() on newdata; however, how this works is not entirely clear to me.

Usually, in a two-step approach, a binary model and a Poisson model are fitted 
to deal with zero-inflated and over-dispersed data, and the predictions of the 
binary model are then multiplied by those of the Poisson model in order to 
weight them. Is this already done by the predict.hurdle function? 
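
For reference, here is a minimal sketch of how the combination asked about 
above can be checked. It assumes the pscl package is installed and uses its 
bundled bioChemists data as a stand-in for the fish data (the variables art, 
fem, mar, kid5, phd and ment belong to that example data set, not to this 
thread). As far as the pscl documentation describes it, type = "response" 
returns the overall expected count, type = "count" the mean of the untruncated 
count component, and type = "zero" the ratio of probabilities of a non-zero 
count under the two components, so the first should be the product of the 
other two.

```r
library(pscl)

# Example data shipped with pscl; a stand-in, not the fish data from this thread
data("bioChemists", package = "pscl")

# Hurdle model: binomial zero part plus truncated Poisson count part
fit <- hurdle(art ~ fem + mar + kid5 + phd + ment,
              data = bioChemists, dist = "poisson")

# Hypothetical "new" data: here simply the first rows of the example data
nd <- head(bioChemists)

mu_count  <- predict(fit, newdata = nd, type = "count")     # untruncated count mean
ratio_pos <- predict(fit, newdata = nd, type = "zero")      # P_zero(y > 0) / P_count(y > 0)
mu_resp   <- predict(fit, newdata = nd, type = "response")  # overall expected count

# The overall prediction already combines both components
all.equal(unname(mu_resp), unname(mu_count * ratio_pos))
```

If the comparison holds, the weighting by the binary part is already built 
into type = "response", and no manual multiplication of two separate model 
predictions should be needed.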

I am also using the function dredge (from the MuMIn package) to select my best 
model based on AIC, but in this case the best model selected seems to be a 
combination of the truncated Poisson and the binary model (i.e. the full 
hurdle model). Is there any way that I could dredge the two model components 
separately? I did some research and found in the NEWS section that a package 
pscf was created for this, but when I dug further I did not have much luck.

I would be grateful if someone could help me. 
Thank you very much once again,
Valentina




-----Original Message-----
From: Achim Zeileis [mailto:achim.zeil...@uibk.ac.at] 
Sent: 18 October 2013 18:57
To: Lauria, Valentina
Cc: r-help@r-project.org
Subject: Re: [R] hurdle model error why does need integer values for the 
dependent variable?

On Fri, 18 Oct 2013, Lauria, Valentina wrote:

> Dear list,
>
> I am using the hurdle model for modelling the habitat of rare fish 
> species. However I do get an error message when I try to model my data:
>
>> test_new1<-hurdle(GALUMEL~ depth + sal + slope + vrm + lat:long + 
>> offset(log(haul_numb)), dist = "negbin", data = datafit_elasmo)
>
> Error in hurdle(GALUMEL ~ depth + sal + slope + vrm + lat:long + 
> offset(log(haul_numb)),  :
>  invalid dependent variable, non-integer values
>
> When I fit the same model with round(my dependent variable), the 
> model works. Sorry for the stupid question, but could anyone explain 
> why? My data are zero-inflated (78% zeros) and positively 
> skewed.

hurdle() fits a count data distribution (poisson, negbin, geometric) by maximum 
likelihood. Hence, its response needs to be a count variable (i.e., integer). 
See vignette("countreg", package = "pscl") for the underlying likelihoods 
employed.
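
The integer requirement can be reproduced with a small sketch. It assumes the 
pscl package is installed and uses its bundled bioChemists data rather than 
the fish data from this thread; art_half is a hypothetical non-integer 
response created only for illustration.

```r
library(pscl)

# Example data shipped with pscl; a stand-in for the data in this thread
data("bioChemists", package = "pscl")
d <- bioChemists

# Hypothetical non-integer response, e.g. a count divided by effort
d$art_half <- d$art / 2

# A non-integer response is rejected before any likelihood is evaluated
bad <- try(hurdle(art_half ~ fem + ment, data = d, dist = "negbin"),
           silent = TRUE)
inherits(bad, "try-error")
conditionMessage(attr(bad, "condition"))

# The raw integer counts are accepted
ok <- hurdle(art ~ fem + ment, data = d, dist = "negbin")
coef(ok)
```

If the non-integer values arise from dividing counts by sampling effort, one 
option is to keep the raw counts as the response and let the offset carry the 
effort, rather than rounding a rate.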

> Thank you very much in advance.
> Kind Regards,
> Valentina
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

