I have the task of producing some boxplot graphics, with the requirement
that these have the same general appearance as a set of such graphics
produced last year. I do not have access to the code that was
used to produce the "last year" graphics.
There are multiple boxplots per figure and these boxplots appear in
groups (with two boxplots in each group in the simplest instance; there
are four or more per group in other instances, but I figure that if I
can work out how to handle two, then ....).
After a bit of Googling I found that ggplot() does basically what I
want. However my mindset seems to be substantially incompatible with
that of ggplot() and I cannot figure out how to make some adjustments
which are needed in order to make my plots look like last year's.
In last year's graphics the boxes were unfilled and were distinguished
(within groups) by their boundary colours, which were "red" and "black"
in the simple two-per-group instance. I achieved the "unfilled" effect
by setting alpha=0 inside the call to geom_boxplot(). (Is this the
Right Thing to Do?) However I cannot get the boundary colours of the
boxes to be "red" and "black".
I have attached a sourceable script ("demo.txt") showing what I have
tried so far. I don't really understand the code; I simply copied and
adjusted code that I saw on stackoverflow. (Fairly mindlessly I'm afraid.)
Problems:

(1) The borders of the boxes are distinct, but they are sort-of-pink and
sort-of-blue, and I cannot for the life of me figure out how to make
them red and black.

(2) Putting in "color=Type" seemed to have the effect of creating two
legends, one with the desired legend title but all in black, and one
with legend title equal to "Type" but using the colours that actually
appear. How can I get just one "appropriate" legend?

(3) Last year's graphics have the x-axis starting at 0 (rather than at
c. 3.5). I tried using + xlim(0,8.5) but got told "Error: Discrete
value supplied to continuous scale". How can I make the appropriate
adjustment?

(4) Last year's graphics have y-axis tick marks, labels and grid lines
at 700, 800, 900, ..., 2000, 2100. How can I reproduce this?

I actually had several additional questions, but thought I'd better
scrounge around a bit more before posting this, and thereby managed
(mirabile dictu!) to answer them myself.
Can anyone help me out with questions (1) to (4)? Please keep it
simple and very explicit, for I am a bear of very little brain and long
words bother me!
Thanks.
cheers,
Rolf Turner
--
Technical Editor ANZJS
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276
#
# Script to demonstrate what I am trying to do.
#
# Simulate some data:
Year <- factor(rep(4:8,each=50,times=2))
Type <- rep(c("National","Local"),each=250)
M0 <- 1300+50*(0:4)
set.seed(42)
M1 <- M0 + runif(5,-100,-50)
X0 <- rnorm(250,rep(M0,each=50),150)
X1 <- rnorm(250,rep(M1,each=50),100)
DemoDat <- data.frame(Year=Year,Score=c(X0,X1),Type=Type)
# Grouped boxplots:
library(ggplot2)
print(ggplot(data=DemoDat) +
      geom_boxplot(aes(x=Year, y=Score, color=Type, fill=Type),
                   position=position_dodge(1), alpha=0) +
      theme_minimal() +
      scale_fill_discrete(name="National v. Local") +
      ylim(700,2100))
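
For comparison, here is a rough sketch of one way that points (1) to (4)
might be handled. It is a guess at the intent rather than a reconstruction
of last year's code, and it rests on several assumptions: that "National"
should be red and "Local" black (the post does not say which is which),
that it is acceptable to treat Year as a number so that the x-axis can run
from 0, and that fill=NA is a reasonable substitute for alpha=0. The names
DemoNum and p are made up for this illustration.

```r
library(ggplot2)

# Rebuild the simulated data from the script above.
Year <- factor(rep(4:8, each = 50, times = 2))
Type <- rep(c("National", "Local"), each = 250)
M0 <- 1300 + 50 * (0:4)
set.seed(42)
M1 <- M0 + runif(5, -100, -50)
X0 <- rnorm(250, rep(M0, each = 50), 150)
X1 <- rnorm(250, rep(M1, each = 50), 100)
DemoDat <- data.frame(Year = Year, Score = c(X0, X1), Type = Type)

# (3) Assumption: make Year numeric so that a continuous x-axis from 0 is
# possible. A factor gives a discrete scale, which is why xlim() failed.
DemoNum <- DemoDat
DemoNum$Year <- as.numeric(as.character(DemoNum$Year))

p <- ggplot(data = DemoNum) +
  # Only colour is mapped to Type, so there is only one legend (2).
  # fill = NA is a constant (outside aes), giving unfilled boxes.
  # group tells ggplot to draw one box per Year-and-Type combination.
  geom_boxplot(aes(x = Year, y = Score, colour = Type,
                   group = interaction(Year, Type)),
               fill = NA, position = position_dodge(1)) +
  # (1) Pick the border colours by hand; which Type is red is a guess.
  scale_colour_manual(name = "National v. Local",
                      values = c(National = "red", Local = "black")) +
  scale_x_continuous(breaks = 0:8) +
  # (4) Ticks, labels and grid lines every 100; no in-between grid lines.
  scale_y_continuous(breaks = seq(700, 2100, by = 100),
                     minor_breaks = NULL) +
  # Zoom the axes without discarding any data points.
  coord_cartesian(xlim = c(0, 8.5), ylim = c(700, 2100)) +
  theme_minimal()
print(p)
```

The second legend in the original script seems to come from mapping both
color and fill to Type while only naming the fill scale; mapping just one
of them, and naming that scale, leaves a single legend.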