Often, when exploring a dataset, I'd like to plot several very different Y
variables against the same X variable, in panels stacked one over the other. Is
there an easy way to do this?
I'd like to achieve an elegant look similar to what lattice produces in
conditioned plots, for instance no space between panels. But unlike in a
straightforward conditioned plot, each panel may be on a different scale.
Example:
• Plot Estrogen, Creatinine, and their ratio, all against the same
  predictor variable (say, Day). Or: in a longitudinal study of hormones in
  reproductive-age women, plot progesterone, estradiol, testosterone,
  luteinizing hormone, follicle-stimulating hormone, and thyroid-stimulating
  hormone all on one page, in parallel. Note that several of these variables
  are measured in different units.
• One panel for each outcome variable, arranged one above the other.
• Minimal vertical space between the panels.
To make this concrete, let's generate toy data:
N<-40
set.seed(5234767)
JUNK<-data.frame(Day=1:N)
JUNK$Creatinine<-exp(2*rnorm(nrow(JUNK)))
JUNK$Estrogen<- (sin(JUNK$Day/pi) + 2.5) * ( exp(2*rnorm(nrow(JUNK))) *
JUNK$Creatinine )
JUNK$Creatinine[10]<-0.0001
JUNK$Ratio<- JUNK$Estrogen / JUNK$Creatinine
The following traditional graphics commands put an annoyingly wide space between the
"panels" by default. Also, the X-axis ticks are repeated unnecessarily.
par(mfrow=c(3,1))
par(oma=c(0,0,1,0))
plot(JUNK$Day, JUNK$Estrogen, xlab="", ylab="Estrogen", type="o")
plot(JUNK$Day, JUNK$Creatinine, xlab="", ylab="Creatinine", type="o")
plot(JUNK$Day, JUNK$Ratio, xlab="Day", ylab="Ratio", type="o")
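One possible way to tighten the traditional-graphics version, offered only as a minimal sketch: set the top and bottom inner margins to zero with par(mar=...), reserve room for a shared X axis with par(oma=...), and suppress the X axis on every panel but the last. The vector name yvars and the particular margin values are assumptions chosen for illustration, not a recommended setting.

```r
# Regenerate the toy data so the sketch is self-contained.
N <- 40
set.seed(5234767)
JUNK <- data.frame(Day = 1:N)
JUNK$Creatinine <- exp(2 * rnorm(N))
JUNK$Estrogen <- (sin(JUNK$Day / pi) + 2.5) * (exp(2 * rnorm(N)) * JUNK$Creatinine)
JUNK$Creatinine[10] <- 0.0001
JUNK$Ratio <- JUNK$Estrogen / JUNK$Creatinine

# Hypothetical helper vector: the outcomes to stack, top to bottom.
yvars <- c("Estrogen", "Creatinine", "Ratio")

# Zero top/bottom inner margins make the panels touch; the outer
# bottom margin leaves room for the single shared X axis and its label.
op <- par(mfrow = c(length(yvars), 1),
          mar = c(0, 4, 0, 1),
          oma = c(4, 0, 1, 0))
for (v in yvars) {
  # xaxt = "n" suppresses the X axis; it is drawn once, below the last panel.
  plot(JUNK$Day, JUNK[[v]], type = "o", xaxt = "n", xlab = "", ylab = v)
  if (v == yvars[length(yvars)]) axis(1)
}
mtext("Day", side = 1, line = 2.5, outer = TRUE)
par(op)  # restore the previous graphics settings
```

Each panel keeps its own Y scale and its own Y-axis label, since every call to plot() chooses its limits independently.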
The following lattice approach gives a fairly nice-looking end product, but it seems so
counterintuitive that I want to call it a workaround. For instance, reshape() generates a
"time" variable that is actually a category. And the variable names are
converted into the levels of a factor by hand, so the process is susceptible to
human error.
Also, the Y variables are not labeled on the axes, only in the strip. This is
not ideal.
JUNK2<-JUNK
names(JUNK2)[names(JUNK2)=="Creatinine"]<-"Y.1"
names(JUNK2)[names(JUNK2)=="Estrogen"]<-"Y.2"
names(JUNK2)[names(JUNK2)=="Ratio"]<-"Y.3"
JUNKlong<- reshape(JUNK2, dir="long", varying=2:4)
JUNKlong$outcome<-factor( JUNKlong$time, levels=1:3, labels=c("Creatinine", "Estrogen",
"Ratio") )
JUNKlong$time<-NULL
library(lattice)
xyplot( Y ~ Day | outcome, data=JUNKlong, layout=c(1,3), type="o",
scales=list(x=list(alternating=3), y=list(relation="free", alternating=3)), ylab="")
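A variant that avoids the renaming step, sketched under the assumption that the v.names, timevar, and times arguments of reshape() behave as documented: the original column names are passed as the "times", so the factor levels come straight from the data frame rather than being typed by hand. Using strip.left=TRUE is one way to move the variable names next to the Y axes. The names yvars and p are illustrative.

```r
library(lattice)

# Regenerate the toy data so the sketch is self-contained.
N <- 40
set.seed(5234767)
JUNK <- data.frame(Day = 1:N)
JUNK$Creatinine <- exp(2 * rnorm(N))
JUNK$Estrogen <- (sin(JUNK$Day / pi) + 2.5) * (exp(2 * rnorm(N)) * JUNK$Creatinine)
JUNK$Creatinine[10] <- 0.0001
JUNK$Ratio <- JUNK$Estrogen / JUNK$Creatinine

# Hypothetical helper vector: the outcome columns to stack.
yvars <- c("Creatinine", "Estrogen", "Ratio")

# Long format: one row per (Day, outcome); the column names themselves
# become the values of the "outcome" variable, so nothing is retyped.
JUNKlong <- reshape(JUNK, direction = "long",
                    varying = yvars, v.names = "Y",
                    timevar = "outcome", times = yvars,
                    idvar = "Day")
JUNKlong$outcome <- factor(JUNKlong$outcome, levels = yvars)

# Strips on the left act as per-panel Y labels; each panel has a free Y scale.
p <- xyplot(Y ~ Day | outcome, data = JUNKlong, layout = c(1, 3), type = "o",
            strip = FALSE, strip.left = TRUE,
            scales = list(y = list(relation = "free")), ylab = "")
print(p)
```

The "outcome" column is still a reshaped category rather than a real time variable, so this only removes the hand-typed labels; it does not make the long format itself any more natural.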
Am I making this harder than it needs to be?
Thanks
Jacob A. Wegelin
Assistant Professor
Department of Biostatistics
Virginia Commonwealth University
730 East Broad Street Room 3006
P. O. Box 980032
Richmond VA 23298-0032
U.S.A.