Hi WizaRds!

I don't know if my understanding of models is screwed up or
if there's a fundamental limitation to the way they are 
used in R.

Here's what I'd like to do: create a model and then test data
against that model using generics like 'predict' and 'residuals'.
The problem is that generics don't work for new data sets, except
of course 'predict': 

        set.seed(1)
        df<- data.frame(matrix(rnorm(10*4), nc=4))
        cn<-  c("y", "x1", "x2", "x3")
        colnames(df)<- cn
        model<- lm(y ~ x1 + x2 + x3 + 1, data=df)
        u<- residuals(model, data=df)

        test<- data.frame(matrix(rnorm(10*4), nc=4))
        colnames(test)<- cn
        y.hat<- predict(model, newdata=test)
        u.new<- residuals(model, newdata=test)

# I would expect to get the residuals to the 'test' 
# data fit to 'model' with 'residuals(model, newdata=test)',
# alas no:
        cbind(u, u.new)
                      u        u.new
        1   0.032996295  0.032996295
        2   0.575261650  0.575261650
        3  -0.702501425 -0.702501425
        4   0.482075538  0.482075538
        5   0.300600839  0.300600839
        6  -0.980832577 -0.980832577
        7   0.265239680  0.265239680
        8  -0.341625071 -0.341625071
        9   0.363224527  0.363224527
        10  0.005560545  0.005560545

# I would expect to get the 'test' fit to 'model'
# with 'fitted(model, newdata=test)' but again no:
        cbind(fitted(model), fitted(model, newdata=test))
                  [,1]        [,2]
        1  -0.65945011 -0.65945011
        2  -0.39161833 -0.39161833
        3  -0.13312719 -0.13312719
        4   1.11320526  1.11320526
        5   0.02890693  0.02890693
        6   0.16036419  0.16036419
        7   0.22218937  0.22218937
        8   1.07994978  1.07994978
        9   0.21255683  0.21255683
        10 -0.31094893 -0.31094893

I have to do:

        y.hat<- predict(model, newdata=test)
        # get test residuals:
        u.new<- test$y - y.hat
        u.new
         1          2          3          4          5          6          7 
 1.3661751 -0.4077999  1.1740108  0.4492772 -1.5865168 -0.7633108 -0.8800892 
         8          9         10 
 1.7474658 -0.1016104  2.1081361 
         

to get the residuals. Maybe this is what one is supposed to do?

To be fair, the help pages for e.g. 'residuals.lm'
don't say that you can do things like 'newdata=test' in:

'residuals(model, newdata=test)' 

or 'data=test' in:

'residuals(model, data=test)' 

but that is exactly what I would like to do!

Am I missing something here, or is this just the way it is?

Thanks guys!

Jack.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to