That did it!
Thanks so much as always. I emailed the question to you because I think you are an R expert, based on all the suggestions, feedback and code I have received from you in the past. Yes, I do look for answers in the open forum, but when it comes to a question for which the answers out there are not very clear, I prefer to ask somebody whom I can trust. I hope I don't bother you with my questions.

Thanks once again.
Cheers!
-Som.

Date: Mon, 26 Nov 2012 03:33:43 -0800
From: ml-node+s789695n4650829...@n4.nabble.com
To: genome1...@hotmail.com
Subject: Re: Help with predict function in glm

Hello,

Why mail a question just to me? Post to the list and the odds of getting more (and better) answers are bigger.

As for your question, the problem is in the call to glm: you don't need the prefix 'train$' in the formula, because the argument 'data' takes care of that. When predicting, R looks for columns with the names used in the formula, and it is unable to find columns called train$Outcome and train$Weight in the new data.frame 'test'.

Corrected:

mylogit <- glm(Outcome ~ Weight, data=train, family = binomial("logit"))
predictions <- predict(mylogit, newdata = test, type= "response")

Hope this helps,

Rui Barradas

On 26-11-2012 01:42, somnath bandyopadhyay wrote:
>
> Hi,
> I am trying some basic logistic regression analysis using glm. I just have
> one dependent variable (Outcome) which is binary in nature and one
> independent variable (Weight). I fit a model using a training data set
> (train) which has 85 observations and try to apply it on an independent
> dataset (test) which has 55 observations. When I apply the predict function
> on the fitted model for the new dataset, I get the following warning "Warning
> message: 'newdata' had 55 rows but variable(s) found have 85 rows" and the
> predict works on the training observations and not on the test observations.
>
> Following is the session info, code and the training and test datasets I am
> using.
>
> What am I doing wrong?
> Any help would be greatly appreciated.
>
> Thanks,
> S.
>
>> train <- read.table("train_data.txt", header=T, row.names=1, sep="\t")
>> test<- read.table("test_data.txt", header=T, row.names=1, sep="\t")
>> mylogit <- glm(train$Outcome ~ train$Weight, data=train, family = binomial("logit"))
>> predictions <- predict(mylogit, newdata = test, type= "response")
> Warning message:
> 'newdata' had 55 rows but variable(s) found have 85 rows
>
>
>> sessionInfo()
> R version 2.15.0 (2012-03-30)
> Platform: x86_64-pc-mingw32/x64 (64-bit)
>
> locale:
> [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
> LC_MONETARY=English_United States.1252
> [4] LC_NUMERIC=C LC_TIME=English_United States.1252
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
>
>
>
>> train
>             Outcome Weight
> AB256939_21       0  0.331
> AB257076_21       0  0.308
> AB257079_21       0  0.453
> AB415508_21       0  0.303
> AB700497_21       0  0.354
> AB904508_21       0  0.336
> AC048719_21       0  0.420
> AC185939_21       0  0.249
> AC185940_21       0  1.525
> AC445840_21       0  0.261
> E7490523_21       0  0.269
> E7490524_21       0  0.213
> E7659579_21       0  0.360
> E7661528_21       0  0.271
> E7781094_21       0  0.156
> E7781095_21       0  0.221
> E7781096_21       0  0.098
> E7969081_21       0  0.430
> E8117594_21       0  0.321
> E8133295_21       0  0.166
> E8161578_22       0  0.269
> E8483037_21       0  0.162
> E8559720_21       0  0.226
> L1065550_18       0  0.396
> L1065607_17       0  0.541
> L1065944_24       0  0.131
> L1066017_20       0  0.421
> L1069261_12       0  0.357
> L1069262_14       0  0.309
> L1069263_27       0  0.283
> L1069297_24       0  0.620
> L1081528_21       0  0.561
> L1084066_21       0  0.564
> L1086090_21       0  0.649
> L1104280_17       0  0.181
> L1111362_22       0  0.199
> L1118063_15       0  0.369
> L1133550_21       0  0.302
> L1144201_14       0  0.249
> L1155023_7        0  0.257
> L1158386_21       0  0.470
> L1163051_4        0  0.446
> ...........................
> ...........................
> ...........................
>
>
>> test
>             Weight
> AB256870_21  0.364
> AB256873_21  0.329
> AB415518_21  0.219
> AB460669_21  0.481
> AB609036_21  0.313
> AB609038_21  0.196
> AB700495_21  0.402
> AB700498_21  0.343
> AC112834_21  0.372
> AC185937_21  0.270
> AC269527_21  0.285
> E7352023_21  0.358
> E7661554_21  0.471
> E7750502_21  0.437
> E7845183_21  0.232
> E7854155_21  0.474
> E7854156_21  0.121
> E7924877_21  0.312
> E7969079_21  0.423
> E8139256_21  0.329
> E8161577_22  1.060
> E8161580_21  0.157
> E8364473_21  0.227
> E8364474_21  0.069
> L1065940_14  0.256
> L1065946_10  0.184
> L1066018_25  0.282
> L1069260_15  1.094
> ................................
> ................................
>
>
>
>
> ______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

--
View this message in context: http://r.789695.n4.nabble.com/Calculating-all-possible-ratios-tp4627405p4650836.html
Sent from the R help mailing list archive at Nabble.com.
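
For readers who want to reproduce the behaviour discussed in this thread without the original files, here is a minimal sketch. It rests on assumptions: the data are simulated (train_data.txt and test_data.txt are not available), and the seed, the coefficients used to simulate Outcome, and the names bad_fit, good_fit, bad_pred and good_pred are hypothetical, chosen only for illustration. It fits the model both ways and compares the length of the predictions.

```r
# Simulated stand-ins for the poster's data (assumed; not the original files)
set.seed(1)
train <- data.frame(Weight = runif(85, 0.05, 1.5))
train$Outcome <- rbinom(85, 1, plogis(-2 + 3 * train$Weight))
test <- data.frame(Weight = runif(55, 0.05, 1.5))

# Problematic call from the question: the formula names train$Weight,
# so predict() cannot substitute the Weight column of 'test' and
# re-evaluates train$Weight instead (with the warning quoted in the thread,
# suppressed here so the script runs quietly).
bad_fit <- glm(train$Outcome ~ train$Weight, data = train,
               family = binomial("logit"))
bad_pred <- suppressWarnings(
  predict(bad_fit, newdata = test, type = "response")
)
length(bad_pred)   # 85: one prediction per training row

# Corrected call from the reply: bare column names, resolved through 'data'
good_fit <- glm(Outcome ~ Weight, data = train, family = binomial("logit"))
good_pred <- predict(good_fit, newdata = test, type = "response")
length(good_pred)  # 55: one prediction per test row
```

Comparing the length of the predictions with nrow(test) is a quick check that newdata was actually used.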