Hi all, I am trying to fit a generalized estimating equation (GEE) model with the "geepack" package, and I am not 100% sure what exactly the "id" argument means. It seems to be an important argument, because the results differ considerably depending on how the clusters are defined.
I have a data set of counts (Poisson distributed): the number of butterfly species counted every month over one year (12 repeated measures) at seven sites, three of them "continuous forest sites" and four of them "secondary forest sites". The aim is to compare continuous and secondary forests.

Would you use the site or the forest type as the id argument:

    model1 <- geeglm(formula = number ~ type + month, family = poisson, id = site, corstr = "ar1")
    model2 <- geeglm(formula = number ~ type + month, family = poisson, id = type, corstr = "ar1")

Or should almost every count even have its own id (e.g. id = interaction(month, site) or id = interaction(month, type))?

Thanks for your help...
Anna
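P.S. To make the setup concrete, here is a minimal sketch of the data layout described above. The counts are simulated (made up purely for illustration), and the names butterflies, site_type and fit_site are placeholders, not the real data. It only shows how many rows fall into each cluster under the two candidate id choices; fitting with id = site here is an assumption for the sake of the example, not a settled answer. As far as I understand the geepack documentation, rows belonging to one cluster have to be contiguous, so the data are sorted by site and month first.

```r
# Simulated data only: the counts below are invented for illustration.
set.seed(1)

# 7 sites: 3 continuous forest, 4 secondary forest (as in the description)
site_type <- c(rep("continuous", 3), rep("secondary", 4))

# One row per site and month: 7 sites x 12 months = 84 rows
butterflies <- expand.grid(month = 1:12, site = 1:7)
butterflies$type <- factor(site_type[butterflies$site])
butterflies$number <- rpois(nrow(butterflies), lambda = 10)

# Keep the rows of each cluster contiguous and in time order,
# since "ar1" relies on the order of observations within a cluster
butterflies <- butterflies[order(butterflies$site, butterflies$month), ]

# Cluster sizes under each candidate id
print(table(butterflies$site))  # 7 clusters of 12 rows each
print(table(butterflies$type))  # 2 clusters: 36 and 48 rows

# Fit only if geepack is installed; id = site is an assumption here
if (requireNamespace("geepack", quietly = TRUE)) {
  fit_site <- geepack::geeglm(number ~ type + month, family = poisson,
                              data = butterflies, id = site, corstr = "ar1")
  print(summary(fit_site))
}
```

With id = site there are seven clusters of 12 monthly counts each; with id = type there are only two clusters (36 and 48 rows), each mixing counts from several sites.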