Hello,

Why mail a question just to me? Post to the list and the odds of getting more (and better) answers are higher.

As for your question, the problem is in the call to glm. You don't need the prefix 'train$' in the formula; the argument 'data' takes care of that. When predicting, R looks in 'newdata' for columns with the names used in the formula. The new data.frame 'test' has no columns called train$Outcome and train$Weight, so R falls back on the original vectors from 'train', which is why you get predictions for the 85 training rows instead of the 55 test rows. Corrected:

mylogit <- glm(Outcome ~ Weight, data=train, family = binomial("logit"))
predictions <- predict(mylogit, newdata = test, type= "response")
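
For anyone reading along without the original files, here is a minimal sketch of the difference. The data below are simulated stand-ins (random Weight values and a random Outcome), so only the row counts, 85 and 55, match your case. The names bad, good, p_bad and p_good are just for illustration.

```r
# Simulated stand-ins for train_data.txt / test_data.txt (assumption: not the real data)
set.seed(1)
train <- data.frame(Weight = runif(85, 0.05, 1.5))
train$Outcome <- rbinom(85, 1, plogis(-2 + 3 * train$Weight))
test <- data.frame(Weight = runif(55, 0.05, 1.5))

# Formula with the 'train$' prefix: the term is literally named "train$Weight"
bad <- glm(train$Outcome ~ train$Weight, data = train, family = binomial("logit"))
attr(terms(bad), "term.labels") %in% names(test)    # FALSE: no such column in 'test'
p_bad <- predict(bad, newdata = test, type = "response")  # gives the row-count warning
length(p_bad)                                       # 85, the training rows

# Formula with bare column names: the term "Weight" is found in 'test'
good <- glm(Outcome ~ Weight, data = train, family = binomial("logit"))
attr(terms(good), "term.labels") %in% names(test)   # TRUE
p_good <- predict(good, newdata = test, type = "response")
length(p_good)                                      # 55, the test rows
```

Checking the term labels against names(test) before calling predict is a quick way to catch this kind of mismatch.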


Hope this helps,

Rui Barradas
On 26-11-2012 01:42, somnath bandyopadhyay wrote:

Hi,
I am trying some basic logistic regression analysis using glm. I just have one dependent 
variable (Outcome) which is binary in nature and one independent variable (Weight). I fit 
a model using a training data set (train) which has 85 observations and try to apply it 
on an independent dataset (test) which has 55 observations. When I apply the predict 
function on the fitted model for the new dataset, I get the following warning 
"Warning message: 'newdata' had 55 rows but variable(s) found have 85 rows" and 
the predict works on the training observations and not on the test observations.

Below are the session info, the code, and the training and test datasets I am
using.

What am I doing wrong? Any help would be greatly appreciated.

Thanks,
S.

train <- read.table("train_data.txt", header=T, row.names=1, sep="\t")
test<- read.table("test_data.txt", header=T, row.names=1, sep="\t")
mylogit <- glm(train$Outcome ~ train$Weight, data=train, family = binomial("logit"))
predictions <- predict(mylogit, newdata = test, type= "response")
Warning message:
'newdata' had 55 rows but variable(s) found have 85 rows


sessionInfo()
R version 2.15.0 (2012-03-30)
Platform: x86_64-pc-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base




train
Outcome Weight
AB256939_21 0 0.331
AB257076_21 0 0.308
AB257079_21 0 0.453
AB415508_21 0 0.303
AB700497_21 0 0.354
AB904508_21 0 0.336
AC048719_21 0 0.420
AC185939_21 0 0.249
AC185940_21 0 1.525
AC445840_21 0 0.261
E7490523_21 0 0.269
E7490524_21 0 0.213
E7659579_21 0 0.360
E7661528_21 0 0.271
E7781094_21 0 0.156
E7781095_21 0 0.221
E7781096_21 0 0.098
E7969081_21 0 0.430
E8117594_21 0 0.321
E8133295_21 0 0.166
E8161578_22 0 0.269
E8483037_21 0 0.162
E8559720_21 0 0.226
L1065550_18 0 0.396
L1065607_17 0 0.541
L1065944_24 0 0.131
L1066017_20 0 0.421
L1069261_12 0 0.357
L1069262_14 0 0.309
L1069263_27 0 0.283
L1069297_24 0 0.620
L1081528_21 0 0.561
L1084066_21 0 0.564
L1086090_21 0 0.649
L1104280_17 0 0.181
L1111362_22 0 0.199
L1118063_15 0 0.369
L1133550_21 0 0.302
L1144201_14 0 0.249
L1155023_7 0 0.257
L1158386_21 0 0.470
L1163051_4 0 0.446
...........................
...........................
...........................


test
Weight
AB256870_21 0.364
AB256873_21 0.329
AB415518_21 0.219
AB460669_21 0.481
AB609036_21 0.313
AB609038_21 0.196
AB700495_21 0.402
AB700498_21 0.343
AC112834_21 0.372
AC185937_21 0.270
AC269527_21 0.285
E7352023_21 0.358
E7661554_21 0.471
E7750502_21 0.437
E7845183_21 0.232
E7854155_21 0.474
E7854156_21 0.121
E7924877_21 0.312
E7969079_21 0.423
E8139256_21 0.329
E8161577_22 1.060
E8161580_21 0.157
E8364473_21 0.227
E8364474_21 0.069
L1065940_14 0.256
L1065946_10 0.184
L1066018_25 0.282
L1069260_15 1.094
................................
................................






______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.