Re: [R] Visualizing binary response data?
You could also try using interactive graphics in iplots. Linking from a barchart of your binary response variable to your eight continuous predictors in a parallel coordinate plot, and to your four categorical predictors in some form of mosaicplot, could be very informative.

Graphics are not necessarily the method of choice to select your predictor variables, as Frank Harrell has pointed out. It is also sensible not to rely on modelling alone. Graphic displays can help you better understand your data and models. The two approaches are complementary.

Antony Unwin
University of Augsburg
Germany

On Tue, May 4, 2010 at 9:04 PM, Kim Jung Hwa wrote:
> Hi All,
>
> I'm dealing with binary response data for the first time, and I'm confused
> about what kind of graphics I could explore in order to pick relevant
> predictors and their relation with response variable.
>
> I have 8-10 continuous predictors and 4-5 categorical predictors. Can anyone
> suggest what kind of graphics I can explore to see how predictors behave
> w.r.t. response variable...
>
> Any help would be greatly appreciated, thanks,
> Kim

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
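
A note on the linked display described above: iplots needs an interactive session, so what follows is only a rough static stand-in, not the iplots approach itself. It colours a parallel coordinate plot by the response (roughly what selecting a bar of the response would highlight) and draws a mosaicplot for one categorical predictor. The data frame `d` and its columns are made up for illustration; `parcoord()` comes from the MASS package.

```r
library(MASS)  # for parcoord(); MASS ships with standard R installations

set.seed(1)
n <- 200
# Hypothetical data: 3 continuous and 1 categorical predictor, binary response y
d <- data.frame(
  x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n),
  g  = factor(sample(c("a", "b", "c"), n, replace = TRUE))
)
d$y <- rbinom(n, 1, plogis(0.8 * d$x1 - 0.5 * (d$g == "c")))

# Parallel coordinate plot of the continuous predictors,
# coloured by the response (static analogue of selecting a bar of y)
cols <- ifelse(d$y == 1, "red", "grey60")
parcoord(d[, c("x1", "x2", "x3")], col = cols)

# Mosaicplot of a categorical predictor against the response
tab <- table(g = d$g, y = d$y)
mosaicplot(tab, main = "g by y", color = TRUE)
```

With real data, replace `d` by your own data frame and repeat the mosaicplot for each categorical predictor.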
Re: [R] Visualizing binary response data?
On 05/04/2010 09:12 PM, Thomas Stewart wrote:
> For binary w.r.t. continuous, how about a smoothing spline? As in,
>
>   x <- rnorm(100)
>   # simulate y with a quadratic effect of x on the logit scale
>   y <- rbinom(100, 1, exp(.3*x - .07*x^2) / (1 + exp(.3*x - .07*x^2)))
>   plot(x, y)
>   lines(smooth.spline(x, y))  # smoothed mean of y against x
>
> OR how about a more parametric approach, logistic regression? As in,
>
>   # logistic regression with linear and quadratic terms
>   glm1 <- glm(y ~ x + I(x^2), family = binomial)
>   plot(x, y)
>   # fitted probabilities, drawn in increasing order of x
>   lines(sort(x), predict(glm1, newdata = data.frame(x = sort(x)), type = "response"))
>
> FOR binary w.r.t. categorical it depends. Are the categories ordinal (is
> there a natural ordering?) or are the categories nominal (no ordering)?
> For nominal categories, the data is essentially a contingency table, and
> "strength of the predictor" is a test of independence. You can still do a
> graphical exploration: maybe plotting the proportion of Y=1 for each
> category of X. As in,
>
>   z <- cut(x, breaks = -3:3)  # bin x into categories
>   plot(tapply(y, z, mean))    # proportion of y = 1 in each bin
>
> If your goal is to find strong predictors of Y, you may want to consider
> graphical measures that look at the predictors jointly. Maybe with a
> generalized additive model (gam)?
>
> There is probably a lot more you can do. Be creative.
>
> -tgs

And you have to decide why you would look to a graph to select predictors. This can badly distort later inferences (confidence intervals, P-values, biased regression coefficients, biased R^2, etc.).

Frank

> On Tue, May 4, 2010 at 9:04 PM, Kim Jung Hwa wrote:
> > Hi All,
> >
> > I'm dealing with binary response data for the first time, and I'm confused
> > about what kind of graphics I could explore in order to pick relevant
> > predictors and their relation with response variable.
> >
> > I have 8-10 continuous predictors and 4-5 categorical predictors. Can anyone
> > suggest what kind of graphics I can explore to see how predictors behave
> > w.r.t. response variable...
> >
> > Any help would be greatly appreciated, thanks,
> > Kim

--
Frank E Harrell Jr
Professor and Chairman
School of Medicine
Department of Biostatistics
Vanderbilt University
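
To make the warning about graph-based selection concrete, here is a small simulation sketch. It is an illustration under made-up settings (20 pure-noise predictors, 100 observations, 200 replications), not anything taken from the posts above. In each simulated data set the predictor that looks most related to the response is picked, as one might do by eye, and then tested with logistic regression as if it had been chosen in advance.

```r
set.seed(42)
n <- 100      # observations per simulated data set
p <- 20       # noise predictors, none related to y
reps <- 200   # number of simulated data sets

pvals <- replicate(reps, {
  X <- matrix(rnorm(n * p), n, p)
  y <- rbinom(n, 1, 0.5)            # response independent of every column of X
  # "Eyeball" selection: keep the predictor most correlated with y
  best <- which.max(abs(cor(X, y)))
  fit <- glm(y ~ X[, best], family = binomial)
  summary(fit)$coefficients[2, 4]   # Wald p-value of the selected predictor
})

# Share of data sets where the selected noise predictor looks "significant"
mean(pvals < 0.05)
```

Although no predictor has any real effect, the selected one should come out "significant" far more often than the nominal 5%. For 20 independent tests the chance that at least one falls below 0.05 is 1 - 0.95^20, about 0.64.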
Re: [R] Visualizing binary response data?
For binary w.r.t. continuous, how about a smoothing spline? As in,

  x <- rnorm(100)
  # simulate y with a quadratic effect of x on the logit scale
  y <- rbinom(100, 1, exp(.3*x - .07*x^2) / (1 + exp(.3*x - .07*x^2)))
  plot(x, y)
  lines(smooth.spline(x, y))  # smoothed mean of y against x

OR how about a more parametric approach, logistic regression? As in,

  # logistic regression with linear and quadratic terms
  glm1 <- glm(y ~ x + I(x^2), family = binomial)
  plot(x, y)
  # fitted probabilities, drawn in increasing order of x
  lines(sort(x), predict(glm1, newdata = data.frame(x = sort(x)), type = "response"))

FOR binary w.r.t. categorical it depends. Are the categories ordinal (is there a natural ordering?) or are the categories nominal (no ordering)? For nominal categories, the data is essentially a contingency table, and "strength of the predictor" is a test of independence. You can still do a graphical exploration: maybe plotting the proportion of Y=1 for each category of X. As in,

  z <- cut(x, breaks = -3:3)  # bin x into categories
  plot(tapply(y, z, mean))    # proportion of y = 1 in each bin

If your goal is to find strong predictors of Y, you may want to consider graphical measures that look at the predictors jointly. Maybe with a generalized additive model (gam)?

There is probably a lot more you can do. Be creative.

-tgs

On Tue, May 4, 2010 at 9:04 PM, Kim Jung Hwa wrote:
> Hi All,
>
> I'm dealing with binary response data for the first time, and I'm confused
> about what kind of graphics I could explore in order to pick relevant
> predictors and their relation with response variable.
>
> I have 8-10 continuous predictors and 4-5 categorical predictors. Can anyone
> suggest what kind of graphics I can explore to see how predictors behave
> w.r.t. response variable...
>
> Any help would be greatly appreciated, thanks,
> Kim
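
The gam suggestion above could be sketched roughly as follows. This assumes the mgcv package (one of several packages that fit generalized additive models; the post does not say which was meant), and the data frame `d`, its columns, and the simulated effects are invented for illustration.

```r
library(mgcv)  # recommended package shipped with R

set.seed(1)
n <- 300
# Hypothetical data: two continuous predictors and one nominal factor
d <- data.frame(
  x1 = rnorm(n),
  x2 = runif(n, -2, 2),
  g  = factor(sample(c("a", "b", "c"), n, replace = TRUE))
)
eta <- 0.3 * d$x1 - 0.7 * d$x2^2 + 0.5 * (d$g == "b")
d$y <- rbinom(n, 1, plogis(eta))

# Smooth terms for the continuous predictors, a plain factor term for g
fit <- gam(y ~ s(x1) + s(x2) + g, family = binomial, data = d)

# One panel per smooth, on the logit scale, with confidence bands
plot(fit, pages = 1, shade = TRUE)

# Fitted probabilities for each observation
p_hat <- predict(fit, type = "response")
```

The panels show each continuous predictor's estimated effect after adjusting for the others, which is the "jointly" part of the suggestion. `summary(fit)` gives the accompanying table of terms.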
[R] Visualizing binary response data?
Hi All,

I'm dealing with binary response data for the first time, and I'm confused about what kind of graphics I could explore in order to pick relevant predictors and their relation with response variable.

I have 8-10 continuous predictors and 4-5 categorical predictors. Can anyone suggest what kind of graphics I can explore to see how predictors behave w.r.t. response variable...

Any help would be greatly appreciated, thanks,
Kim