I have a very large dataset with three variables that I need to graph using
a scatterplot. However, the variable plotted first gets masked by the other
two, so the graph looks entirely different depending on the order in which
the variables are plotted. Does anyone have any suggestions for how to
manage this?

This code is an illustration of what I am dealing with:

x <- 10000
plot(rnorm(x,mean=20),rnorm(x),col=1,xlim=c(16,24))
points(rnorm(x,mean=21),rnorm(x),col=2)
points(rnorm(x,mean=19),rnorm(x),col=3)

gives an entirely different looking graph to:

x <- 10000
plot(rnorm(x,mean=19),rnorm(x),col=3,xlim=c(16,24))
points(rnorm(x,mean=20),rnorm(x),col=1)
points(rnorm(x,mean=21),rnorm(x),col=2)

despite the two being drawn from the same distributions and differing only
in the order in which the variables are plotted.

I have tried using pch=".", but the colours are then very difficult to
discern. I have also experimented with a number of other symbols, without
finding a real solution.

The only approach that appears to work is to build up the plot in a for
loop, progressively adding a few points from each variable, as below.
Although this is simple with random numbers, as here, it is extremely
cumbersome with real datasets.

plot(1,1,xlim=c(16,24),ylim=c(-4,4),col="white")
x <- 100
for (i in 1:100) {
  points(rnorm(x,mean=19),rnorm(x),col=3)
  points(rnorm(x,mean=20),rnorm(x),col=1)
  points(rnorm(x,mean=21),rnorm(x),col=2)
}

Is there some function in R that could solve this through automatically
iterating my data as above, using transparent symbols, or something else? Is
there some other way of solving this issue that I haven't thought of?
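
For illustration, here is a minimal sketch of the two ideas asked about
above: drawing the points in a random order so that no variable is
systematically plotted last, and using semi-transparent colours. It uses
only base R (sample() and adjustcolor() from grDevices). The data frame d
and its column names are hypothetical, the alpha value of 0.2 is an
arbitrary choice, and transparency will only show on a graphics device that
supports it (pdf() does, for example), so this is a starting point rather
than a definitive solution.

```r
set.seed(1)   # only so that the illustration is repeatable
n <- 10000

# Stack the three variables into one data frame (hypothetical column names)
d <- data.frame(
  x   = c(rnorm(n, mean = 19), rnorm(n, mean = 20), rnorm(n, mean = 21)),
  y   = rnorm(3 * n),
  grp = rep(c(3, 1, 2), each = n)   # colour indices as in the examples above
)

# Shuffle the rows so that the drawing order is random across variables
d <- d[sample(nrow(d)), ]

# Semi-transparent versions of palette colours 1 to 3 (alpha is a guess)
cols <- adjustcolor(palette()[1:3], alpha.f = 0.2)

# A single plot() call draws the points in the shuffled row order
plot(d$x, d$y, col = cols[d$grp], pch = 16, cex = 0.5,
     xlim = c(16, 24), xlab = "x", ylab = "y")
```

With a real dataset, the same pattern would presumably apply after stacking
the variables in long form with a grouping column; the shuffle replaces the
for loop, and the alpha value would need tuning to the number of points.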

Thank you,

Samuel Dennis

