> On Dec 29, 2015, at 5:33 AM, Frank van Berkum <frankieboy...@hotmail.com> wrote:
>
> Hi all,
> My problem is the following.
> Suppose I have a dataset with observations Y and explanatory variables X1,
> ..., Xn, and suppose one of these explanatory variables is geographical area
> (of which there are ten, j=1,...,10). For some observations I know the area,
> but for others it is unknown and therefore record as NA.
> I want to estimate a model of the form Y[i] ~ Poisson( lambda[i] ) with
>
>   log(lambda[i]) = constant + \sum_j I[!is.na(area[i])] * I[area[i]==j] * beta[j]
>
> In words: we estimate a constant for all observations and a factor for each
> area. If it is unknown what the area is, we only include the constant.
> When estimating this model using glm(), the records with is.na(area[i]) are
> 'deleted' from the dataset, and this I do not want. I had hoped that the
> model as described above could be estimated using the function I() (interpret
> as), but so far my attempts have not succeeded.
> Any help on how to approach this is kindly appreciated.
> Kind regards,
> Frank van Berkum
> [[alternative HTML version deleted]]

I don't understand why you don't just recode the NAs to "unknown" and refit the model. This code is untested, but I think it demonstrates the two steps needed, adding a level and then recoding the NA values, assuming this factor is named `X6`:

    dat$X6 <- factor(dat$X6, levels = c(levels(dat$X6), "unknown"))
    dat$X6[is.na(dat$X6)] <- "unknown"

(I think this might be a bit quicker than Rolf's approach.)

In R the default contrasts are "treatment" (which differ from the contrasts you describe), so each area will be referenced to the first area (and the first level of all other factors) in the lexical ordering of area names. This ordering and the contrast type can be changed. There are many postings on rhelp over the years demonstrating how to do this. Study these and the basics of R before reposting, and if you do repost, then post in plain text and include a small example constructed in R:

?factor
?C
?contrasts
?contr.sum

-- 
David Winsemius
Alameda, CA, USA
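
For anyone who wants to try the recoding end to end, here is a minimal sketch on simulated data. The names `dat`, `y`, `area` and `area2`, and all simulated values, are assumptions for illustration and are not taken from the original post. The sketch also calls relevel() to make "unknown" the reference level; under the default treatment contrasts that should give the parameterisation asked for, with the intercept alone for records of unknown area and the intercept plus one area coefficient otherwise.

```r
# Simulated data; 'dat', 'y', 'area' and 'area2' are hypothetical names.
set.seed(1)
n <- 200
area <- factor(sample(paste0("A", 1:10), n, replace = TRUE))
area[sample(n, 40)] <- NA                 # make 40 areas unknown
dat <- data.frame(y = rpois(n, lambda = 3), area = area)

# With the default na.action (na.omit), glm() drops the rows where area is NA.
fit_drop <- glm(y ~ area, family = poisson, data = dat)
nobs(fit_drop)                            # 160

# Step 1: add an explicit "unknown" level. Step 2: recode the NAs to it.
dat$area2 <- factor(dat$area, levels = c(levels(dat$area), "unknown"))
dat$area2[is.na(dat$area2)] <- "unknown"

# Make "unknown" the reference level, so the intercept is the constant for
# records with unknown area and each area coefficient is an offset from it.
dat$area2 <- relevel(dat$area2, ref = "unknown")

fit_all <- glm(y ~ area2, family = poisson, data = dat)
nobs(fit_all)                             # 200
coef(fit_all)                             # intercept plus one term per area
```

Note that the intercept here is estimated from the unknown-area records only, so it is a baseline for that group rather than an overall mean across all records.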