Re: [R] Scatter plot - using colour to group points?
Hello all,

Yesterday I wrote to Michael Weylandt to ask for help in understanding a line of code he used when responding to SarahH's query about controlling colours in scatter plots. He wrote an excellent explanation that deserves to be shared here. Below I include the code I wrote while experimenting with the problem (marking the specific line of code I asked him about), followed by Michael's thoughtful reply.

Saludos - Ian

--
Ian G. Robertson
Department of Anthropology
Building 50, 450 Serra Mall
Stanford University, CA 94305-2034
e: i...@stanford.edu

# the code:
##
x1 <- rnorm(13)
y1 <- rnorm(13)
# these two lines from R. Michael Weylandt
X = letters[c(1,2,3,3,1,2,1,3,3,1,2,2,1)]
colX = c("red","green","blue")[as.factor(X)]  # ?? How does this work? Ask RMW
table(colX)
plot(x1, y1, col=colX, pch=20, cex=2)
##

# Michael Weylandt's explanation:

In short, there are two key bits to follow:

1) What happens when you factorize something -- R stores factors internally as integers with special labels and a few special behaviors for some calculations that won't come up here. The labels aren't so important for our purpose; the key is that each unique value of X gets its own level, and each level gets its own integer code. By default the codes follow the sorted order of the unique values (1 for the first level, 2 for the second, and so on); they are not the original values themselves, even if X was already integer or double. As a side point, this means that floating point trouble can sometimes show up, so if you want to bin real numbers, it's safer to use cut() for the factoring step.

2) What happens when you use a factor to subset -- R simply tosses out the factor-ness and only uses the internal integer representation. If we wanted to be more explicit, we'd write colVec[as.integer(as.factor(X))], but the as.integer happens automatically.

So the whole path is: assign integers to each unique value of X and subset by those integers. If there are as many unique values as there are elements of the color vector, the end result is a direct matching. If there are too many unique values, the extra ones index past the end of the color vector and come back as NA; if there are too few, some colors go unused. Something like:

c("red","green","blue")[as.factor(letters[1:4])]  ## 4th element is NA
c("red","green","blue")[as.factor(letters[1:2])]  ## blue not used

Hope this helps,

Michael

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
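The two steps in the explanation above (factor levels become integer codes, and those codes do the subsetting) can be checked one at a time. The following is only a minimal sketch; the name palette_vec is made up for illustration and is not from the thread.

```r
# Illustrative data: 13 group labels drawn from "a", "b", "c".
X <- letters[c(1, 2, 3, 3, 1, 2, 1, 3, 3, 1, 2, 2, 1)]

f <- as.factor(X)
levels(f)                  # the unique values of X, in sorted order
codes <- as.integer(f)     # the integer code stored for each element

palette_vec <- c("red", "green", "blue")   # hypothetical name for the colour vector

# Indexing by the factor and indexing by its integer codes give the same result.
colX     <- palette_vec[f]
colX_int <- palette_vec[codes]
identical(colX, colX_int)

# Levels are sorted, so the order of first appearance does not set the codes.
appearance_codes <- as.integer(as.factor(c("c", "a", "b")))

# A fourth level indexes past the end of a three-colour vector and yields NA.
too_many <- palette_vec[as.factor(letters[1:4])]
```

If a specific colour must go with a specific group regardless of sort order, one option is to set the levels explicitly with factor(X, levels = ...) before indexing.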
Re: [R] Scatter plot - using colour to group points?
Thanks all for the suggestions. I now have a nice plot showing the temperature of 6 different sites, each site distinguished by different coloured points, using nested ifelse.

My apologies: I thought I could change the type to "l" and the same arguments would be applied to a line graph, with 6 different lines, one for each site...? I wanted to try lines as I think they might show the trends more clearly. I have just found the plotrix package manual and will try that to achieve this, and look at ggplot too.

--
View this message in context: http://r.789695.n4.nabble.com/Scatter-plot-using-colour-to-group-points-tp4092794p4095079.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] Scatter plot - using colour to group points?
There's also the lines() command, which takes a col argument, if you want to draw multiple lines (I usually wind up wrapping it in a for loop, though there might be something smarter).

ggplot2 is great, though the learning curve is a little rough: you can get good help here, but if you go down that path, there's also a dedicated ggplot2 list that's worth checking out.

Glad to have you as a new useR!

Michael

On Nov 22, 2011, at 5:13 AM, SarahH sarah@hotmail.co.uk wrote:

> Thanks all for suggestions. I now have a nice plot showing the temperature of 6 different sites, each site distinguished by different coloured points, using nested ifelse.
>
> My apologies I thought I could change the type to "l" and the same arguments would be applied to line graph, with 6 different lines for each site...? I wanted to try lines as I think they might show the trends more clearly. I have just found the plotrix package manual and will try that to achieve this, and look at ggplot too.
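The for-loop approach with lines() mentioned above might look roughly like the sketch below. The data frame temps, the third site name "XX9", and all the values are invented for illustration; only the column names SITE, TIME and TEMP follow the original question.

```r
# Hypothetical data: three sites, five time points each (all values invented).
temps <- data.frame(
  SITE = rep(c("BG1", "EA7", "XX9"), each = 5),
  TIME = rep(1:5, times = 3),
  TEMP = c(10, 12, 11, 13, 14,
            8,  9,  9, 10, 12,
           15, 14, 16, 17, 16)
)

sites     <- unique(temps$SITE)
site_cols <- c("blue", "green", "red")   # one colour per site, in the order of `sites`

# Draw empty axes that cover all the data, then add one line per site.
plot(temps$TIME, temps$TEMP, type = "n",
     xlab = "Date", ylab = "Temperature [C]", main = "Temperature changes")
for (i in seq_along(sites)) {
  one_site <- temps[temps$SITE == sites[i], ]
  lines(one_site$TIME, one_site$TEMP, col = site_cols[i])
}
legend("topleft", legend = sites, col = site_cols, lty = 1)
```

Drawing each site separately is also a likely cure for the stray straight line reported with type = "l": when all sites sit in one vector, a single line runs from the last point of one site back to the first point of the next.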
Re: [R] Scatter plot - using colour to group points?
Success with the lines command and col argument! I have some nice point and line plots. Thanks so much for your help.

Ongoing project - I will probably be back!

Sarah

--
View this message in context: http://r.789695.n4.nabble.com/Scatter-plot-using-colour-to-group-points-tp4092794p4097625.html
Re: [R] Scatter plot - using colour to group points?
On Nov 21, 2011, at 2:17 PM, SarahH wrote:

> Dear All,
>
> I am very new to R - trying to teach myself it for some MSc coursework.
>
> I am plotting temperature data for two different sites over the same time period which I have downloaded from a university weather station data archive.
>
> I am using the following code to create the plot
>
> plot ( x = TEMP3[,"TIME"], y = TEMP3[,"TEMP"], type = "p", col = TEMP3[,"SITE"], pch = 3, main = "Temperature changes", xlab = "Date", ylab = "Temperature [C]")
>
> I managed to use col = TEMP3["SITE"] to plot the two different sites (BG1 and EA7) in different colours, but I am struggling to change the colours. I wanted to set up a colour scheme to match the site, so tried

Instead try

num.site <- as.numeric(TEMP3[,"SITE"])
plot ( x = TEMP3[,"TIME"], y = TEMP3[,"TEMP"], type = "p", col = num.site, pch = 3, main = "Temperature changes", xlab = "Date", ylab = "Temperature [C]")

That would create a vector of integer values that are specific to the sites and then offer that as the argument to col=.

> BG1 <- "blue"
> EA7 <- "green"

That would only have created two new objects by those names (unless of course you were following someone's misguided directions to use attach()).

> before the plot function, but the graphic just came out with red and black as before.
>
> There are other datasets in which there are more than two sites so I would really like to learn how to use colour to distinguish between them on a plot.
>
> Any direction would be very greatly received!
>
> Thank you very much
> Sarah
>
> --
> View this message in context: http://r.789695.n4.nabble.com/Scatter-plot-using-colour-to-group-points-tp4092794p4092794.html

David Winsemius, MD
West Hartford, CT
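A small sketch of the as.numeric() suggestion above, with one extra step for choosing the colours. The data frame here is an invented stand-in for TEMP3, and site.cols is a hypothetical name. The explicit factor() call is an assumption about the data: the original post implies SITE was already a factor, but a character column would need the conversion.

```r
# Hypothetical stand-in for TEMP3 (values invented for illustration).
TEMP3 <- data.frame(
  SITE = c("BG1", "BG1", "EA7", "EA7", "BG1", "EA7"),
  TIME = c(1, 2, 1, 2, 3, 3),
  TEMP = c(10.1, 11.4, 8.7, 9.9, 12.0, 10.5)
)

# factor() is applied explicitly: since R 4.0.0, data.frame() keeps strings as
# character by default, and as.numeric() on site names would give NA.
num.site <- as.numeric(factor(TEMP3[, "SITE"]))

# Used directly as col=, these integers are looked up in palette(), whose
# first two entries are black and a red. To pick the colours yourself,
# index a colour vector by the same codes.
site.cols <- c("blue", "green")[num.site]

plot(x = TEMP3[, "TIME"], y = TEMP3[, "TEMP"], type = "p", col = site.cols,
     pch = 3, main = "Temperature changes", xlab = "Date",
     ylab = "Temperature [C]")
```

The palette() lookup would explain why the original plot came out in black and red whatever objects named BG1 and EA7 existed in the workspace.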
Re: [R] Scatter plot - using colour to group points?
I think the easiest way to do this is to set up a color vector with ifelse and hand that off to the plot command: something like

col = ifelse(TEMP3[,"SITE"] == "BG1", "blue", "green")
# Syntax is ifelse(TEST, OUT_IF_TRUE, OUT_IF_FALSE)

For more complicated schemes, a set of nested ifelse()'s can get you what you need.

There are some other tricks with factors as well, but they require a little more advanced use of R. Just for the record, they'd look something like this:

X = letters[c(1,2,3,3,1,2,1,3,3,1,2,2,1)]
colX = c("red","green","blue")[as.factor(X)]

Hope this helps,

Michael

On Mon, Nov 21, 2011 at 2:17 PM, SarahH sarah@hotmail.co.uk wrote:

> I managed to use col = TEMP3["SITE"] to plot the two different sites (BG1 and EA7) in different colours, but I am struggling to change the colours. I wanted to set up a colour scheme to match the site, so tried
>
> BG1 <- "blue"
> EA7 <- "green"
>
> before the plot function, but the graphic just came out with red and black as before.
>
> There are other datasets in which there are more than two sites so I would really like to learn how to use colour to distinguish between them on a plot.
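For more than two sites, the nested ifelse() idea above could be sketched as follows, next to a named-vector lookup that gives the same answer. The third site "XX9" and the variable names are invented for illustration.

```r
# Invented site labels; "XX9" is a hypothetical third site.
site <- c("BG1", "EA7", "XX9", "EA7", "BG1", "XX9")

# Nested ifelse(): each FALSE branch holds the next test.
col_nested <- ifelse(site == "BG1", "blue",
              ifelse(site == "EA7", "green", "red"))

# The same mapping as a lookup in a named character vector.
site_cols  <- c(BG1 = "blue", EA7 = "green", XX9 = "red")
col_lookup <- unname(site_cols[site])

identical(col_nested, col_lookup)
```

The lookup form may be easier to maintain as the number of sites grows, since adding a site means adding one name-colour pair instead of another level of nesting.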
Re: [R] Scatter plot - using colour to group points?
I got the colour vector with ifelse to work, great! Thank you.

Is it possible to use the ifelse colour vector with other plot types? For example with type="l"? I tried, but the graphic came back with blue lines for both sites and also a straight line connecting the start and end points of the data.

Thanks
Sarah

Michael Weylandt wrote

> I think the easiest way to do this is to set up a color vector with ifelse and hand that off to the plot command: something like
>
> col = ifelse(TEMP3[,"SITE"] == "BG1", "blue", "green")
> # Syntax is ifelse(TEST, OUT_IF_TRUE, OUT_IF_FALSE)
>
> For more complicated schemes, a set of nested ifelse()'s can get you what you need.

--
View this message in context: http://r.789695.n4.nabble.com/Scatter-plot-using-colour-to-group-points-tp4092794p4093337.html
Re: [R] Scatter plot - using colour to group points?
Another approach would be to use ggplot2. The code can look a bit daunting to begin with, but ggplot2 is a very versatile graphing package and well worth learning.

Simple example:

library(ggplot2)
mydata <- data.frame(site=c("A","A","A","B","B","B"), time1 = 1:6, t1=c(23,24,13,7,19,12), t2=c(7,4,6,8,5,9))
p <- ggplot(mydata, aes(x=time1)) + geom_point(aes(y=t1, colour=site)) + geom_point(aes(y=t2, colour=site))
p <- p + scale_x_continuous('Time') + scale_y_continuous('Temperature')
p

--- On Mon, 11/21/11, SarahH sarah@hotmail.co.uk wrote:

From: SarahH sarah@hotmail.co.uk
Subject: [R] Scatter plot - using colour to group points?
To: r-help@r-project.org
Received: Monday, November 21, 2011, 2:17 PM

> There are other datasets in which there are more than two sites so I would really like to learn how to use colour to distinguish between them on a plot.
>
> Any direction would be very greatly received!
Re: [R] Scatter plot - using colour to group points?
I don't think you can do different colors for a single line (not an ifelse thing, just a "what would that mean?" sort of thing), but a plot type like "b", "o" or "h" will work the same way.

Michael

On Mon, Nov 21, 2011 at 4:23 PM, SarahH sarah@hotmail.co.uk wrote:

> I got the colour vector with ifelse to work, great! Thank you.
>
> Is it possible to use the ifelse colour vector with other plot types? For example with type="l"? I tried but the graphic came back with blue lines for both sites and also a straight line connecting the start and end point of the data?
>
> Thanks
> Sarah
Re: [R] Scatter plot - using colour to group points?
On Nov 21, 2011, at 10:18 PM, R. Michael Weylandt wrote:

> I don't think you can do different colors for a single line (not an ifelse thing, just a "what would that mean?" sort of thing), but a plot type like "b", "o" or "h" will work the same way.

I think Jim Lemon has a multicolored line function in package:plotrix.

--
David.

David Winsemius, MD
West Hartford, CT
Re: [R] Scatter plot - using colour to group points?
On 11/22/2011 05:00 PM, David Winsemius wrote:

> On Nov 21, 2011, at 10:18 PM, R. Michael Weylandt wrote:
>
>> I don't think you can do different colors for a single line (not an ifelse thing, just a "what would that mean?" sort of thing), but a plot type like "b", "o" or "h" will work the same way.
>
> I think Jim Lemon has a multicolored line function in package:plotrix.

Hi David (and everybody else),

The color.scale.lines function will display multicolored lines; just force the colors to what you want using the col argument.

Jim
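The thread does not show the arguments of plotrix's color.scale.lines, so the sketch below does not use it. As a base-graphics alternative, and purely as an illustration, segments() accepts one colour per segment, which gives a similar multicoloured line. The data and the colouring rule (rising versus falling) are invented.

```r
# Invented series of 8 points.
x <- 1:8
y <- c(3, 5, 4, 6, 8, 7, 9, 10)
n <- length(x)

# One colour per segment (n - 1 segments): red where the series rises over
# that segment, blue where it falls.
seg_cols <- ifelse(diff(y) > 0, "red", "blue")

# Empty axes first, then one coloured segment between each pair of
# neighbouring points.
plot(x, y, type = "n", xlab = "Time", ylab = "Temperature")
segments(x0 = x[-n], y0 = y[-n], x1 = x[-1], y1 = y[-1],
         col = seg_cols, lwd = 2)
```

Any other rule that yields a vector of n - 1 colours would work the same way, for example a site-based colour vector with its last element dropped.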