[R] averaging rows on a data.frame according to a factor
Dear all,

I apologize for the newbie question, but I'm stuck. I have a data frame in the following form:

    dat <- as.data.frame(cbind(c("a","a","a","b","b"), c(1,2,3,3,2), c(4,3,5,4,4)))

I need a way to generate a new data frame with the average for each level of the factor. The result should look like:

    res <- as.data.frame(cbind(c("a","b"), c(2,2.5), c(4,4)))

Any help would be greatly appreciated.

Jonathan

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
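For the row-averaging question above, one possible approach is base R's aggregate(). This is only a sketch: the column names g, x and y are made up for illustration, and the data are built with data.frame() rather than cbind() so that the numeric columns are not coerced to character.

```r
# Hypothetical column names (g, x, y); the original post leaves them unnamed.
dat <- data.frame(g = c("a", "a", "a", "b", "b"),
                  x = c(1, 2, 3, 3, 2),
                  y = c(4, 3, 5, 4, 4))

# Average every numeric column within each level of g.
res <- aggregate(cbind(x, y) ~ g, data = dat, FUN = mean)
print(res)
```

The formula interface keeps the grouping column in the result, which matches the shape of the desired output in the post.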
[R] histogram with bars colored according to a vector of values
Dear all,

Let's say I have the following data.frame:

    dat <- data.frame(x=rnorm(100), y=rnorm(100,2))

and I plot a histogram of variable x, something like:

    hist(dat$x, breaks=-5:5)

Now, I'd like to color each bar according to the mean of y for the cases that fall in that bar. For instance, the color of the bar between -2 and -1 should reflect the mean of variable y for the corresponding cases.

Any suggestions?

John
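One way to approach the histogram question above, as a rough sketch: bin x with cut() using the same breaks as hist(), take the mean of y per bin with tapply(), and map those means onto a color ramp. The blue-to-red palette, the 100 color steps, and the variable names brks, bin_mean and cols are arbitrary choices for illustration.

```r
set.seed(1)
dat <- data.frame(x = rnorm(100), y = rnorm(100, 2))
brks <- -5:5

# Compute the histogram without drawing it yet.
h <- hist(dat$x, breaks = brks, plot = FALSE)

# Assign each case to a bar; include.lowest = TRUE matches hist()'s default binning.
bin <- cut(dat$x, breaks = brks, include.lowest = TRUE)

# Mean of y per bar (NA for empty bars).
bin_mean <- as.vector(tapply(dat$y, bin, mean))

# Map the means linearly onto a 100-step palette (arbitrary choice).
pal <- colorRampPalette(c("blue", "red"))(100)
rng <- range(bin_mean, na.rm = TRUE)
idx <- round(1 + 99 * (bin_mean - rng[1]) / diff(rng))
cols <- pal[idx]  # empty bars get NA, i.e. no fill

plot(h, col = cols, main = "x, bars colored by mean of y")
```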
[R] help with PCA
Dear all,

If I do a PCA like this:

    dat <- matrix(rnorm(30), ncol=3)
    res <- prcomp(dat)

Now, imagine that I got new data that I want to project onto the original PC axes. How do I do that?

Thanks!

John
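For the PCA question above, prcomp objects have a predict() method that applies the stored centering (and scaling, if any) and rotation to new data. A minimal sketch, assuming the new data have the same three columns in the same order; newdat is a made-up name:

```r
set.seed(1)
dat <- matrix(rnorm(30), ncol = 3)
res <- prcomp(dat)

# Hypothetical new observations with the same three variables.
newdat <- matrix(rnorm(15), ncol = 3)

# Project the new rows onto the original principal axes.
proj <- predict(res, newdata = newdat)

# The same projection done by hand: center with the original means,
# then multiply by the rotation (loadings) matrix.
manual <- scale(newdat, center = res$center, scale = res$scale) %*% res$rotation
```

Both proj and manual hold the new scores, one row per new observation and one column per component.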
[R] multiple values in one column
I have some data files in which some fields have multiple values. For example:

    first  last   sex  major
    John   Smith  M    ANTH
    Jane   Doe    F    HIST,BIOL

What's the best R-like way to handle these data (Jane's major in my example), so that I can do things like summarize the other fields by them (e.g., sex by major)?

Right now I'm processing the files (in Excel, since they're spreadsheets) by duplicating lines with two values in the major field, eliminating one value per row. I suspect there's a nifty R way to do this.

Thanks in advance!

John Muccigrosso
Re: [R] multiple values in one column
On Apr 6, 2012, at 9:09 AM, John D. Muccigrosso wrote:

> I have some data files in which some fields have multiple values. For example:
>
>     first  last   sex  major
>     John   Smith  M    ANTH
>     Jane   Doe    F    HIST,BIOL
>
> What's the best R-like way to handle these data (Jane's major in my example), so that I can do things like summarize the other fields by them (e.g., sex by major)?
>
> Right now I'm processing the files (in Excel, since they're spreadsheets) by duplicating lines with two values in the major field, eliminating one value per row. I suspect there's a nifty R way to do this.

I've gotten a few responses, for which I'm grateful, but either I don't quite see how they answer my question, or I didn't phrase my question well, both of which are equally possible. :-)

So, given the data as above, let's call it students, I have no problem turning it into:

    first  last   sex  major
    John   Smith  M    ANTH
    Jane   Doe    F    HIST
    Jane   Doe    F    BIOL

What I then do with this is things like

    table(students$sex, students$major)

So, three steps:

1. Get data with multiple values per field.
2. Turn it into a data frame with only one value per field (by duplicating lines).
3. Do things like table().

I'd like to be able to skip #2.

Thanks.

John Muccigrosso
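One way to get the table without building the duplicated-rows data frame, as a sketch: split the major field with strsplit(), then repeat each student's sex once per major and tabulate directly. The students data frame below just re-types the example from the post; majors and tab are made-up names.

```r
students <- data.frame(first = c("John", "Jane"),
                       last  = c("Smith", "Doe"),
                       sex   = c("M", "F"),
                       major = c("ANTH", "HIST,BIOL"),
                       stringsAsFactors = FALSE)

# One character vector of majors per student.
majors <- strsplit(students$major, ",", fixed = TRUE)

# Repeat sex once per major, then cross-tabulate against the flattened majors.
tab <- table(sex   = rep(students$sex, lengths(majors)),
             major = unlist(majors))
print(tab)
```

The expansion still happens, but only for the two vectors passed to table(), so the original one-row-per-student data frame stays as it is.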
[R] circles()
I cannot for the life of me figure this out: what's the parameter for filling circles made with circles() with color? col changes the line color, but all I see in the help is a reference to additional graphic parameters, and I find no examples via Google.

Thanks!

John Muccigrosso
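The post does not say which package circles() comes from, so the following is only a guess at a workable alternative, not an answer about that function. If base graphics will do, symbols() draws circles and takes bg for the fill color and fg for the outline. The coordinates, radii and color names below are invented for illustration.

```r
# Invented example data: centers, radii, and one fill color per circle.
x <- c(1, 2, 3)
y <- c(1, 2, 1)
r <- c(0.2, 0.4, 0.3)
fill <- c("tomato", "steelblue", "gold")

# inches = FALSE makes the radii use the x-axis units instead of being rescaled.
symbols(x, y, circles = r, inches = FALSE, fg = "black", bg = fill)
```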
[R] barplot and NA
Am I wrong that barplot is supposed to just skip NAs, and continue with the rest of the data in a matrix column? That's how I read various posts on the subject. But that's not what happens for me with R64.app (on a Mac, obviously). For example:

    d0 <- as.matrix(c(2,3,4))
    d1 <- as.matrix(c(2,3,NA))
    d2 <- as.matrix(c(2,NA,4))
    d3 <- as.matrix(c(NA,3,4))
    barplot(d0)
    barplot(d1)
    barplot(d2)
    barplot(d3)

generates four bar plots. The first has one bar with three visible bands, as expected. The second has two bands; still OK. But the third has only one band (at 2) and the fourth has none.

So it appears that barplot is barfing on those NAs and stopping its plot at those points. Is that the expected behavior?

Thanks.

John
Re: [R] barplot and NA
On 12 Mar 2012, at 12:47, S Ellison wrote:

> Yes, to the extent that the default barplot plots the height of the bar so far as the sum of the values so far, starting at the first. For your first vector, no problem; for your second, the highest value is undefined; for the third, the sum is undefined after the second value (an NA); and so on.
>
> Try adding beside=TRUE to the barplots, as in barplot(d3, beside=TRUE), and you will see all the known values plotted as you'd expect.

That makes sense, but since I do want a stacked bar plot, I'll need to change the NAs to 0 (which of course I've already done).

This should be made clear in the documentation, no? It's possible that barplot could do something like a na.rm=T internally and avoid this problem, but it doesn't, so NA is deadly in stacked plots. To be honest, if I hadn't scaled all my bars to 1 to show percentages, I wouldn't have noticed how some were leaving out a small category or two.

Thanks for the help.

John Muccigrosso
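The behavior discussed in this thread can be reproduced with a running sum, and the zero-replacement workaround is a one-liner. A small sketch using d2 from the original post; d2_filled is a made-up name:

```r
d2 <- as.matrix(c(2, NA, 4))

# A running sum of the column becomes NA from the first NA onward, which is
# consistent with the stacked bar stopping after the first band.
running <- cumsum(d2[, 1])
print(running)

# Workaround for stacked plots: treat missing segments as zero height.
d2_filled <- d2
d2_filled[is.na(d2_filled)] <- 0
barplot(d2_filled)

# Alternative from the reply: side-by-side bars show every known value.
barplot(d2, beside = TRUE)
```

Replacing NA with 0 is only appropriate when a missing segment really means "nothing here"; otherwise the stacked totals will understate the true heights.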