Strubbe Diederik <diederik.strubbe <at> ua.ac.be> writes:

> Dear R community,
>
> I have some questions regarding the analysis of a zero-inflated count dataset
> with a repeated measures design.
>
> The dataset is arranged as follows:
> Unit of analysis: point - these are points where birds were counted during a
> certain amount of time. In total we have about 175 points. Each point is
> located within a certain habitat fragment (here: "site" = A-B-C-D-..., in
> reality we have 25 sites, i.e. forest fragments). All points were counted five
> times per year over three years (thus in total, each point was counted 15
> times). We want to relate the bird abundance to a number of habitat variables
> (here: X1-X2-X3) collected at the site level.
> Abundance: this is the number of birds counted at a point. In most cases
> (> 90%), no birds were detected and the abundance dataset is thus
> zero-inflated.
>
> I have been looking for code to analyze this zero-inflated, Poisson-distributed
> dataset with a repeated measures design, and I have arrived at the glmmADMB
> package.
>
> library(glmmADMB)
> data <- read.table("D:/Boris/Borisdataset.csv",sep=",",header=TRUE)
> count <- data$count
> site <- data$site
> abundance <- data$abundance
> test<-glmm.admb(abundance~data$X1+data$X2+data$year,random=~count, group="site",data=data,family="poisson",zeroInflation=TRUE)
>
> [for clarity, in the above syntax: count ranges from 1-5 as each site has been
> counted 5 times in a year, site refers to one of the 25 forest fragments in
> which the point counts were conducted, Xi are the habitat variables].
>
> My questions are:
> - Does it make sense to analyze these data at the point level, as all habitat
> variables are collected at the site level, meaning that for all points
> belonging to a certain forest fragment, the habitat variables have the same
> value? If it does make sense, is the proposed syntax ok?
> Is there any option to include year as a random effect, as I am not
> especially interested in differences between years?
> - It looks appealing to average the point count values for each forest
> fragment, and to analyze the data with "forest fragment" as the unit of
> analysis. However, even after averaging, the dataset is still zero-inflated.
> It is however impossible to use a zero-inflated Poisson distribution for this
> analysis, as the averaged forest fragment values are not always discrete
> values.

Rather than averaging, one can side-step that problem by summing the counts
over points within each site and using a log(time) offset to account for any
fixed differences in observation time across sites. A site-level analysis
sounds sensible to me, but my credentials are not in statistics, so a more
authoritative answer may well be offered.

-- 
David Winsemius

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
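
The sum-plus-offset suggestion in the reply above could be sketched roughly as
follows. This is only an illustration in base R under stated assumptions: the
data are simulated, and the names pts, site_X1, site_minutes and minutes are
made up for the example and do not come from the original dataset. It also
uses a plain Poisson glm() rather than glmmADMB, so it shows the aggregation
and the offset, not the zero-inflation or random-effects parts of the question.

```r
# Hypothetical point-level data: 4 sites, 3 points per site, 5 visits per point.
# All names and values below are invented for illustration only.
set.seed(1)
pts <- expand.grid(visit = 1:5, point = 1:3, site = c("A", "B", "C", "D"))

# One site-level habitat variable, constant for every point within a site.
site_X1 <- c(A = 0.2, B = 0.8, C = 1.5, D = 2.1)
pts$X1 <- unname(site_X1[as.character(pts$site)])

# Assumed observation time (minutes) per visit, differing between sites.
site_minutes <- c(A = 10, B = 10, C = 15, D = 5)
pts$minutes <- unname(site_minutes[as.character(pts$site)])

# Simulated counts, mostly zeros, with a rate that depends on X1 and on time.
pts$abundance <- rpois(nrow(pts), lambda = 0.01 * pts$minutes * exp(0.8 * pts$X1))

# Sum counts and observation time over all points and visits within each site.
# X1 is constant within a site, so grouping by it just carries it along.
site_df <- aggregate(cbind(abundance, minutes) ~ site + X1, data = pts, FUN = sum)

# Poisson GLM on the summed counts (still integers), with log(time) as offset.
fit <- glm(abundance ~ X1 + offset(log(minutes)), family = poisson, data = site_df)
print(site_df)
print(coef(fit))
```

Because the summed counts remain integers, a count distribution can still be
used at the site level, which is the point of summing rather than averaging.
If the site totals still show excess zeros, a zero-inflated model could be
swapped in for glm() with the same offset term.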