Joris, the point is that 'z' is NOT used as a predictor in the model. Therefore it should not affect predictions. Also, I find it suspicious that the error only occurs when the response variable conitains missings and 'z' is unique (I have tested several other cases to confirm this).
-Mark Op vr 16 mrt. 2018 om 13:03 schreef Joris Meys <jorism...@gmail.com>: > It's not a bug per se. It's the effect of removing all observations linked > to a certain level in your data frame. So the output of lm() doesn't > contain a coefficient for level a of z, but your new data contains that > level a. With a small addition, this works again: > > d <- data.frame(x=rnorm(12),y=rnorm(12),z=rep(letters[1:6],2)) > > d$x[1] <- NA > m <- lm(x ~ . -z, data=d) > p <- predict(m, newdata=d) > > This is linked to another discussion earlier on stackoverflow : > https://stackoverflow.com/questions/48461980/prediction-in-r-glmm > which lead to an update to lme4 : https://github.com/lme4/lme4/issues/452 > > The point being that factors in your newdata should have the same levels > as factors in the original data that was used to fit the model. If you add > levels to these factors, it's impossible to use that model to predict for > these new data. > > Cheers > Joris > > On Fri, Mar 16, 2018 at 10:21 AM, Mark van der Loo < > mark.vander...@gmail.com> wrote: > >> Dear R-developers, >> >> In the 'lm' documentation, the '-' operator is only specified to be used >> with -1 (to remove the intercept from the model). >> >> However, the documentation also refers to the 'formula' help file, which >> indicates that it is possible to subtract any term. Indeed, the following >> works with no problems (the period '.' stands for 'all terms except the >> lhs'): >> >> d <- data.frame(x=rnorm(6), y=rnorm(6), z=letters[1:2]) >> m <- lm(x ~ . -z, data=d) >> p <- predict(m,newdata=d) >> >> Now, if I change 'z' so that it has only unique values, and I introduce an >> NA in the predicted variable, the following happens: >> >> d <- data.frame(x=rnorm(6),y=rnorm(6),z=letters[1:6]) >> d$x[1] <- NA >> m <- lm(x ~ . -z, data=d) >> p <- predict(m, newdata=d) >> Error in model.frame.default(Terms, newdata, na.action = na.action, xlev = >> object$xlevels) : factor z has new levels a >> >> It seems a bug to me, although one could argue that 'lm's documentation >> does not allow one to expect that the '-' operator should work generally. >> >> If it is a bug I'm happy to report it to bugzilla. >> >> Thanks for all your efforts, >> Mark >> >> ps: I was not able to test this on R3.4.4 yet, but the NEWS does not >> mention fixes related to 'lm' or 'predict'. >> >> >> > sessionInfo() >> R version 3.4.3 (2017-11-30) >> Platform: x86_64-pc-linux-gnu (64-bit) >> Running under: Ubuntu 16.04.4 LTS >> >> Matrix products: default >> BLAS: /usr/lib/libblas/libblas.so.3.6.0 >> LAPACK: /usr/lib/lapack/liblapack.so.3.6.0 >> >> locale: >> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C >> LC_TIME=nl_NL.UTF-8 LC_COLLATE=en_US.UTF-8 >> [5] LC_MONETARY=nl_NL.UTF-8 LC_MESSAGES=en_US.UTF-8 >> LC_PAPER=nl_NL.UTF-8 LC_NAME=C >> [9] LC_ADDRESS=C LC_TELEPHONE=C >> LC_MEASUREMENT=nl_NL.UTF-8 LC_IDENTIFICATION=C >> >> attached base packages: >> [1] stats graphics grDevices utils datasets methods base >> >> loaded via a namespace (and not attached): >> [1] compiler_3.4.3 tools_3.4.3 yaml_2.1.16 >> >> [[alternative HTML version deleted]] >> >> ______________________________________________ >> R-devel@r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-devel >> > > > > -- > Joris Meys > Statistical consultant > > Department of Data Analysis and Mathematical Modelling > Ghent University > Coupure Links 653, B-9000 Gent (Belgium) > > <https://maps.google.com/?q=Coupure+links+653,%C2%A0B-9000+Gent,%C2%A0Belgium&entry=gmail&source=g> > > ----------- > Biowiskundedagen 2017-2018 > http://www.biowiskundedagen.ugent.be/ > > ------------------------------- > Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php > [[alternative HTML version deleted]] ______________________________________________ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel