On 04-May-05 Maciej Bliziński wrote: > Hello, > > I'd like to create a model with a factor-type response variable. > This is an example: > >> mydata <- data.frame(factor_var = as.factor(c(rep('one', 100), >> rep('two', 100), rep('three', 100))), real_var = c(rnorm(150), >> rnorm(150) + 5)) >> summary(mydata) > factor_var real_var > one :100 Min. :-2.742877 > three:100 1st Qu.:-0.009493 > two :100 Median : 2.361669 > Mean : 2.490411 > 3rd Qu.: 4.822394 > Max. : 6.924588 >> mymodel = glm(factor_var ~ real_var, family = 'binomial', data = >> mydata) >> summary(mymodel) > > Call: > glm(formula = factor_var ~ real_var, family = "binomial", data = > mydata) > > Deviance Residuals: > Min 1Q Median 3Q Max > -1.7442 -0.6774 0.1849 0.3133 2.1187 > > Coefficients: > Estimate Std. Error z value Pr(>|z|) > (Intercept) -0.6798 0.1882 -3.613 0.000303 *** > real_var 0.8971 0.1066 8.417 < 2e-16 *** > --- > Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1 > > (Dispersion parameter for binomial family taken to be 1) > > Null deviance: 381.91 on 299 degrees of freedom > Residual deviance: 213.31 on 298 degrees of freedom > AIC: 217.31 > > Number of Fisher Scoring iterations: 6
Have you noticed that you get identical results with set.seed(214354) mydata <- data.frame(factor.var = as.factor(c(rep('one', 100), rep('two',100), rep('three', 100))), real.var = c(rnorm(150), rnorm(150) + 5)) mymodel <- glm(factor.var ~ real.var, family='binomial', data=mydata) summary(mymodel) and set.seed(214354) mydata <- data.frame(factor.var = as.factor(c(rep('one', 100), rep('two',200))),real.var = c(rnorm(150),rnorm(150) + 5)) mymodel <- glm(factor.var ~ real.var, family='binomial', data=mydata) summary(mymodel) (I've left out the "summary(mydata)" since these do naturally differ, and I've replaced "factor_var" with "factor.var" and "real_var" with "real.var" because of potential complications with "_"; also "mymodel =" to "mymodel <-"). So I think the interpretation of the results from your first model is that, because of the "family='binomial'", glm is treating "factor.var='one'" as binomial response "0", say, and "factor.var='two'" or "factor.var='three'" as binomial response "1". You're trying to fit a multinomial response, but you've specified a binomial family to 'glm'. 'glm' does not have a multinomial response family. You could try 'multinom' from package 'nnet' which fits loglinear models to factor responses with more than 2 levels. E.g. library(nnet) mymodel <- multinom(factor.var ~ real.var,data=mydata) ### weights: 9 (4 variable) ## initial value 329.583687 ## iter 10 value 209.780666 ## final value 209.779951 ## converged summary(mymodel) ## Re-fitting to get Hessian ## Call: ## multinom(formula = factor.var ~ real.var, data = mydata) ## Coefficients: ## (Intercept) real.var ## three -3.4262565 1.3838231 ## two -0.6754253 0.7116955 ## ## Std. Errors: ## (Intercept) real.var ## three 0.5028541 0.1480138 ## two 0.1846827 0.1068821 ## ## Residual Deviance: 419.5599 ## AIC: 427.5599 ## ## Correlation of Coefficients: ## three:(Intercept) three:real.var two:(Intercept) ## three:real.var -0.7286258 ## two:(Intercept) 0.1986995 -0.1261034 ## two:real.var -0.1411377 0.7012481 -0.3285741 This output does suggest a fairly clear interpretation! Hoping this helps, Ted. -------------------------------------------------------------------- E-Mail: (Ted Harding) <[EMAIL PROTECTED]> Fax-to-email: +44 (0)870 094 0861 Date: 04-May-05 Time: 09:18:03 ------------------------------ XFMail ------------------------------ ______________________________________________ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html