Re: [R] Help with predict function in glm

genome1976 Mon, 26 Nov 2012 08:48:17 -0800

That did it!

Thanks so much, as always.

I emailed the question to you because I think you are an R expert, based
on all the suggestions, feedback and code I have received from you in
the past.
Yes, I do look for answers in the open forum, but when the answers out
there are not very clear I prefer to ask somebody I can trust. I hope I
don't bother you with my questions.
Thanks once again.

Cheers!
-Som.

Date: Mon, 26 Nov 2012 03:33:43 -0800
From: ml-node+s789695n4650829...@n4.nabble.com
To: genome1...@hotmail.com
Subject: Re: Help with predict function in glm



Hello,

Why mail a question just to me? Post to the list and the odds of getting
more (and better) answers are higher.

As for your question, the problem is in the call to glm. You don't need
the prefix 'train$' in the formula; the argument 'data' takes care of
that. When predicting, R looks in newdata for columns with the names
used in the formula, and it is unable to find columns called
train$Outcome and train$Weight in the new data.frame 'test', so it falls
back to the original 'train' object with its 85 rows. Corrected:


mylogit <- glm(Outcome ~ Weight, data=train, family = binomial("logit"))

predictions <- predict(mylogit, newdata = test, type= "response")
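
To see the difference without the original files, here is a minimal sketch
on simulated data. The values below are made up and only mimic the shapes
described in the question (85 training rows, 55 test rows); the names
bad_fit, good_fit, bad_pred and good_pred are hypothetical and used only
for illustration.

```r
# Simulated stand-ins for the poster's data; the numbers are invented.
set.seed(1)
train <- data.frame(Weight = runif(85))
train$Outcome <- rbinom(85, 1, plogis(-1 + 2 * train$Weight))
test <- data.frame(Weight = runif(55))

# With 'train$' in the formula, the model terms are literally
# train$Outcome and train$Weight, so predict() evaluates them against
# the 'train' object and effectively ignores 'newdata'.
bad_fit <- glm(train$Outcome ~ train$Weight, data = train,
               family = binomial("logit"))
# suppressWarnings() hides the "'newdata' had 55 rows ..." warning here.
bad_pred <- suppressWarnings(
  predict(bad_fit, newdata = test, type = "response")
)
length(bad_pred)   # number of training rows, not test rows

# With bare column names, predict() finds 'Weight' in 'test'.
good_fit <- glm(Outcome ~ Weight, data = train, family = binomial("logit"))
good_pred <- predict(good_fit, newdata = test, type = "response")
length(good_pred)  # one prediction per row of 'test'
```

Comparing the two lengths is a quick check that the predictions really
come from the new data rather than from the training set.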



Hope this helps,


Rui Barradas

On 26-11-2012 01:42, somnath bandyopadhyay wrote:

>

> Hi,

> I am trying some basic logistic regression analysis using glm. I just have 
> one dependent variable (Outcome) which is binary in nature and one 
> independent variable (Weight). I fit a model using a training data set 
> (train) which has 85 observations and try to apply it on an independent 
> dataset (test) which has 55 observations. When I apply the predict function 
> on the fitted model for the new dataset, I get the following warning "Warning 
> message: 'newdata' had 55 rows but variable(s) found have 85 rows" and the 
> predict works on the training observations and not on the test observations.

>

> Following is the session info, code and the training and test datasets I am 
> using.

>

> What am I doing wrong? Any help would be greatly appreciated.

>

> Thanks,

> S.

>

>> train <- read.table("train_data.txt", header=T, row.names=1, sep="\t")

>> test<- read.table("test_data.txt", header=T, row.names=1, sep="\t")

>> mylogit <- glm(train$Outcome ~ train$Weight, data=train, family = 
>> binomial("logit"))

>> predictions <- predict(mylogit, newdata = test, type= "response")

> Warning message:

> 'newdata' had 55 rows but variable(s) found have 85 rows

>

>

>> sessionInfo()

> R version 2.15.0 (2012-03-30)

> Platform: x86_64-pc-mingw32/x64 (64-bit)

>

> locale:

> [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 
> LC_MONETARY=English_United States.1252

> [4] LC_NUMERIC=C LC_TIME=English_United States.1252

>

> attached base packages:

> [1] stats graphics grDevices utils datasets methods base

>

>

>

>

>> train

> Outcome Weight

> AB256939_21 0 0.331

> AB257076_21 0 0.308

> AB257079_21 0 0.453

> AB415508_21 0 0.303

> AB700497_21 0 0.354

> AB904508_21 0 0.336

> AC048719_21 0 0.420

> AC185939_21 0 0.249

> AC185940_21 0 1.525

> AC445840_21 0 0.261

> E7490523_21 0 0.269

> E7490524_21 0 0.213

> E7659579_21 0 0.360

> E7661528_21 0 0.271

> E7781094_21 0 0.156

> E7781095_21 0 0.221

> E7781096_21 0 0.098

> E7969081_21 0 0.430

> E8117594_21 0 0.321

> E8133295_21 0 0.166

> E8161578_22 0 0.269

> E8483037_21 0 0.162

> E8559720_21 0 0.226

> L1065550_18 0 0.396

> L1065607_17 0 0.541

> L1065944_24 0 0.131

> L1066017_20 0 0.421

> L1069261_12 0 0.357

> L1069262_14 0 0.309

> L1069263_27 0 0.283

> L1069297_24 0 0.620

> L1081528_21 0 0.561

> L1084066_21 0 0.564

> L1086090_21 0 0.649

> L1104280_17 0 0.181

> L1111362_22 0 0.199

> L1118063_15 0 0.369

> L1133550_21 0 0.302

> L1144201_14 0 0.249

> L1155023_7 0 0.257

> L1158386_21 0 0.470

> L1163051_4 0 0.446

> ...........................

> ...........................

> ...........................

>

>

>> test

> Weight

> AB256870_21 0.364

> AB256873_21 0.329

> AB415518_21 0.219

> AB460669_21 0.481

> AB609036_21 0.313

> AB609038_21 0.196

> AB700495_21 0.402

> AB700498_21 0.343

> AC112834_21 0.372

> AC185937_21 0.270

> AC269527_21 0.285

> E7352023_21 0.358

> E7661554_21 0.471

> E7750502_21 0.437

> E7845183_21 0.232

> E7854155_21 0.474

> E7854156_21 0.121

> E7924877_21 0.312

> E7969079_21 0.423

> E8139256_21 0.329

> E8161577_22 1.060

> E8161580_21 0.157

> E8364473_21 0.227

> E8364474_21 0.069

> L1065940_14 0.256

> L1065946_10 0.184

> L1066018_25 0.282

> L1069260_15 1.094

> ................................

> ................................

>

>

>

>

>

______________________________________________

[hidden email] mailing list

https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



        
        
        
        

        

        
        