I have been asked to produce some boxplot graphics that have the same general appearance as a set of such graphics produced last year. I do not have access to the code that was used to produce last year's graphics.

There are multiple boxplots per figure and these boxplots appear in groups (with two boxplots in each group in the simplest instance; there are four or more per group in other instances, but I figure that if I can work out how to handle two, then ....).

After a bit of Googling I found that ggplot() does basically what I want. However my mindset seems to be substantially incompatible with that of ggplot() and I cannot figure out how to make some adjustments which are needed in order to make my plots look like last year's.

In last year's graphics the boxes were unfilled and were distinguished (within groups) by their boundary colours, which were "red" and "black" in the simple two-per-group instance. I achieved the "unfilled" effect by setting alpha=0 inside the call to geom_boxplot(). (Is this the Right Thing to Do?) However I cannot get the boundary colours of the boxes to be "red" and "black".
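
On the "unfilled" question: as far as I can tell, an alternative to alpha=0 is to pass fill=NA to geom_boxplot(), which asks for no fill at all rather than a fill that is present but fully transparent. A minimal sketch follows; the toy data and the names toy, p_alpha and p_na are invented purely for illustration.

```r
library(ggplot2)

# Toy data, invented purely for illustration (not the real data).
set.seed(1)
toy <- data.frame(g = rep(c("a", "b"), each = 20),
                  y = c(rnorm(20), rnorm(20, mean = 1)))

# The approach described above: a fill that exists but is fully transparent.
p_alpha <- ggplot(toy, aes(x = g, y = y)) + geom_boxplot(alpha = 0)

# Alternative: no fill at all.
p_na <- ggplot(toy, aes(x = g, y = y)) + geom_boxplot(fill = NA)

print(p_na)
```

On screen the two should look much the same; whether the difference matters probably depends on the output device and on what is drawn behind the boxes.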

I have attached a sourceable script ("demo.txt") showing what I have tried so far. I don't really understand the code; I simply copied and adjusted code that I saw on stackoverflow. (Fairly mindlessly I'm afraid.)

Problems:

(1) The borders of the boxes are distinct, but they are sort-of-pink and sort-of-blue, and I cannot for the life of me figure out how to make them red and black.

(2) Putting in "color=Type" seemed to have the effect of creating two legends, one with the desired legend title but all in black, and one with legend title equal to "Type" but using the colours that actually appear. How can I get just one "appropriate" legend?

(3) Last year's graphics have the x-axis starting at 0 (rather than at c. 3.5). I tried using + xlim(0,8.5) but got told "Error: Discrete value supplied to continuous scale". How can I make the appropriate adjustment?

(4) Last year's graphics have y-axis tick marks, labels and grid lines at 700, 800, 900, ..., 2000, 2100. How can I reproduce this?

I actually had several additional questions, but thought I'd better scrounge around a bit more before posting this, and thereby managed (mirabile dictu!) to answer them myself.

Can anyone help me out with questions (1) --- (4)? Please keep it simple and very explicit, for I am a bear of very little brain and long words bother me!

Thanks.

cheers,

Rolf Turner

--
Technical Editor ANZJS
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276
#
# Script to demonstrate what I am trying to do.
#

# Simulate some data:
Year <- factor(rep(4:8,each=50,times=2))
Type <- rep(c("National","Local"),each=250)
M0   <- 1300+50*(0:4)
set.seed(42)
M1   <- M0 + runif(5,-100,-50)
X0   <- rnorm(250,rep(M0,each=50),150)
X1   <- rnorm(250,rep(M1,each=50),100)
DemoDat <- data.frame(Year=Year,Score=c(X0,X1),Type=Type)

# Grouped boxplots:
library(ggplot2)
print(ggplot(data=DemoDat) +
    geom_boxplot(aes(x=Year, y=Score, color=Type,fill=Type),
                 position=position_dodge(1),alpha=0) +
    theme_minimal() +
    scale_fill_discrete(name="National v. Local") +
    ylim(700,2100))
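
# A possible revision, offered only as a sketch.

One way the script above might be adjusted to address points (1) to (4) is sketched below. Several things here are assumptions rather than facts about last year's graphics: which Type is red and which is black, and the idea that "starting at 0" can be imitated by giving the discrete Year axis the empty positions 0 to 3. The sketch maps only colour (not fill) to Type and names the colour scale, which should leave a single legend; xlim() fails because Year is a factor, so scale_x_discrete() is used instead.

```r
library(ggplot2)

# Same simulated data as in the script above.
Year <- factor(rep(4:8, each = 50, times = 2))
Type <- rep(c("National", "Local"), each = 250)
M0   <- 1300 + 50 * (0:4)
set.seed(42)
M1   <- M0 + runif(5, -100, -50)
X0   <- rnorm(250, rep(M0, each = 50), 150)
X1   <- rnorm(250, rep(M1, each = 50), 100)
DemoDat <- data.frame(Year = Year, Score = c(X0, X1), Type = Type)

p <- ggplot(data = DemoDat, aes(x = Year, y = Score, colour = Type)) +
    # fill = NA gives unfilled boxes; only colour is mapped to Type.
    geom_boxplot(fill = NA, position = position_dodge(1)) +
    # (1) and (2): explicit border colours and one named legend.
    # Assumption: National is red and Local is black; swap if needed.
    scale_colour_manual(name = "National v. Local",
                        values = c(National = "red", Local = "black")) +
    # (3): Year is a factor, so extend the discrete axis instead of xlim().
    # Assumption: empty slots for 0 to 3 are what "starting at 0" means.
    scale_x_discrete(limits = as.character(0:8)) +
    # (4): ticks, labels and major grid lines at 700, 800, ..., 2100.
    scale_y_continuous(limits = c(700, 2100),
                       breaks = seq(700, 2100, by = 100)) +
    theme_minimal()

print(p)
```

As with ylim(700,2100) in the original script, setting limits on the y scale drops any scores outside that range before the boxplot statistics are computed.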