Re: [R] Mutliple sets of data in one dataset....Need a loop?
Thank you, Mario.

Biostudent asked how one could perform repetitive tasks, e.g., plotting, with subsets of data. I originally provided a flexible example based on lapply. Mario suggested a variation that permits flexible control of options. This reply shows how Mario's objective and naming of each set of results can be accomplished very simply, within the context of the approach I suggested.

In practice, I often use a list of levels of my grouping variable, rather than my list of data, as an argument to lapply(). Then I use the levels as subscripts and labels. For example:

#List of unique values for grouping variable
#that is not necessarily a factor
names <- as.list(unique(df$Experiment))

#List of colors, same length as 'names'
#In actual application, color1 , color2, etc.
#would be character strings, numbers, or
#color codes.
clr <- as.list(c(color1, color2, ...))
names(clr) <- names

#List of dataframes; 1 for each unique value of grouping variable
df.lst <- lapply(names,function(name)subset(df,Experiment==name))

#Name components of the list
#Permits indexing by level of the grouping variable
names(df.lst) <- names

#Now--if I didn't mistype something--lapply() can be used
#to perform repetitive tasks without sacrificing flexibility.
#For example, to send plots to a pdf with 1 page for each
#component, vary the color of points in each plot, and print
#the value of the grouping variable at the top of each plot:

pdf("plot.pdf")
lapply(names,function(nms){
  plot(df.lst[[nms]][,2], df.lst[[nms]][,3],col=clr[[nms]])
  mtext(nms)})
dev.off()

Glen Sargeant
Research Wildlife Biologist

--
View this message in context: http://n4.nabble.com/Mutliple-sets-of-data-in-one-dataset-Need-a-loop-tp1018503p1100167.html
Sent from the R help mailing list archive at Nabble.com.
__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
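Glen's recipe above leaves color1, color2, ... unspecified. The following is a minimal runnable sketch of the same idea using the sample data from the original question. The data frame name DF, the colour choices, the names lvls and out, and the temporary output file are assumptions made for illustration, not part of Glen's message.

```r
# Sample data copied from the original question in this thread
DF <- data.frame(
  Experiment = rep(c("X", "Y", "Z"), each = 3),
  Score1 = c(-0.85, -1.21, 1.05, -1.12, -0.27, -0.93, 1.1, 2.4, -1.0),
  Score2 = c(-0.02, -0.02, 0.09, -0.07, -0.07, -0.08, -0.03, 0.09, 0.09),
  stringsAsFactors = FALSE
)

# Levels of the grouping variable as a plain character vector
lvls <- unique(DF$Experiment)

# One colour per level (arbitrary choices), named by level
clr <- setNames(c("red", "blue", "darkgreen"), lvls)

# One data frame per level, named by level
df.lst <- setNames(lapply(lvls, function(l) subset(DF, Experiment == l)), lvls)

# One page per level; the level picks both the colour and the label
out <- tempfile(fileext = ".pdf")
pdf(out)
invisible(lapply(lvls, function(l) {
  plot(df.lst[[l]]$Score1, df.lst[[l]]$Score2, col = clr[[l]],
       xlab = "Score1", ylab = "Score2")
  mtext(l)
}))
dev.off()
```

Iterating over the levels rather than over the data is what lets a single index select the subset, its colour, and its label.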
Re: [R] Mutliple sets of data in one dataset....Need a loop?
Great example, Glen!

I simply want to add a small thing that could be useful to someone. Suppose in your last step you want to change the line color for each chart. Using a for loop, it is simple to use the integer index to access the df.lst elements and set the color:

#'colors' is a vector with one color per element of df.lst
for(i in seq_along(df.lst)) plot(df.lst[[i]]$x, df.lst[[i]]$y, col=colors[i])

To do it 'lapply-style' use mapply:

mapply(function(d, i) plot(d$x, d$y, col=colors[i]), df.lst, seq_along(df.lst))

Ciao!
mario

Glen Sargeant wrote:
> One way to plot subsets of data identified by a grouping variable is to use
> lapply() on a list of subsets. The approach is worth mentioning because
> similar tactics are useful for many problems.
>
> #List of unique values for grouping variable
> #that is not necessarily a factor
> names <- as.list(unique(df$Experiment))
>
> #List of dataframes; 1 for each unique value of grouping variable
> df.lst <- lapply(names,function(name)subset(df,Experiment==name))
>
> #Name components of the list
> #Not necessary in this case... but permits indexing by level
> #of the grouping variable
> names(df.lst) <- names
>
> #Now you can use lapply() to carry out the same operation on
> #each component of your list. For example, to send plots to
> #a pdf with 1 page for each component:
>
> pdf("plot.pdf")
> lapply(df.lst,function(df)plot(df[,2],df[,3]))
> dev.off()
>
> Glen Sargeant
> Research Wildlife Biologist

--
Ing. Mario Valle
Data Analysis and Visualization Group            | http://www.cscs.ch/~mvalle
Swiss National Supercomputing Centre (CSCS)      | Tel: +41 (91) 610.82.60
v. Cantonale Galleria 2, 6928 Manno, Switzerland | Fax: +41 (91) 610.82.82
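Mario's mapply() suggestion can be tried on the sample data from the original question. This is only a sketch: the data frame name DF, the vector cols, the column names Score1 and Score2 (standing in for Mario's x and y), and the temporary output file are assumptions for illustration.

```r
# Sample data copied from the original question in this thread
DF <- data.frame(
  Experiment = rep(c("X", "Y", "Z"), each = 3),
  Score1 = c(-0.85, -1.21, 1.05, -1.12, -0.27, -0.93, 1.1, 2.4, -1.0),
  Score2 = c(-0.02, -0.02, 0.09, -0.07, -0.07, -0.08, -0.03, 0.09, 0.09),
  stringsAsFactors = FALSE
)

# Named list with one data frame per experiment
df.lst <- split(DF, DF$Experiment)

# One colour per list element (arbitrary choices)
cols <- c("red", "blue", "darkgreen")

# mapply() walks the data, the colours, and the labels in parallel,
# so no integer index is needed inside the function
out <- tempfile(fileext = ".pdf")
pdf(out)
n <- mapply(function(d, colr, label) {
  plot(d$Score1, d$Score2, col = colr, main = label,
       xlab = "Score1", ylab = "Score2")
  nrow(d)  # return the number of points plotted on this page
}, df.lst, cols, names(df.lst))
dev.off()
```

Passing the colours themselves, rather than an index into them, keeps the function free of references to outside vectors.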
Re: [R] Mutliple sets of data in one dataset....Need a loop?
> but I have thousands of results so it would be really handy to find a way of
> doing this quickly

> it's a little difficult to follow those examples

Given your data in data.frame DF, maybe add the following to your list to investigate:

> dat = data.table(DF)
> dat[, cor(Score1,Score2), by="Experiment"]
     Experiment         V1
[1,]          X  0.9889524
[2,]          Y  0.3041195
[3,]          Z -0.1346107

To do a plot instead, just replace "cor" with "plot" or whatever else you want to do within each group.

Since you said you have thousands of results, data.table is faster for that. In terms of ease of use, you could try plyr too, which you may well prefer.

> those examples as all seem so different

If you look and search crantastic, users are putting their comments there. That might help you make a decision more quickly and avoid you needing to post to r-help and wait for a reply, assuming there is a package that already does what you need.

Searching the history of r-help would have found many solutions to your problem this time, but it seems you are looking for advice on the best way. This changes over time and depends on lots of factors, including what you really want to do. Once you have worked out which packages work best for you, put your votes/comments onto crantastic and it should help everyone who follows in your path. I guess you should then update your votes/comments as time progresses too.

Btw, plyr is ranked #2 on crantastic and is designed specifically for your task! Making yourself aware of the most popular packages would have helped you. If you need speed, try data.table.

When it comes to current, up-to-date advice on the most appropriate package, crantastic could be fantastic, assuming of course that you, the user, contribute to it.

HTH

"BioStudent" wrote in message news:1264072645590-1049653.p...@n4.nabble.com...
>
> Hi Thanks for all your help
>
> Its a little difficult to follow those examples as all seem so different
> and
> its hard to see how I do what I want to my data from the help files but
> i'll
> try...
> --
> View this message in context:
> http://n4.nabble.com/Mutliple-sets-of-data-in-one-dataset-Need-a-loop-tp1018503p1049653.html
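For comparison, the per-group correlation can also be computed without any add-on package. The sketch below is a base R analogue of the grouped call above, not data.table itself; the data frame name DF follows the message, and the values come from the sample data in the original question.

```r
# Sample data copied from the original question in this thread
DF <- data.frame(
  Experiment = rep(c("X", "Y", "Z"), each = 3),
  Score1 = c(-0.85, -1.21, 1.05, -1.12, -0.27, -0.93, 1.1, 2.4, -1.0),
  Score2 = c(-0.02, -0.02, 0.09, -0.07, -0.07, -0.08, -0.03, 0.09, 0.09),
  stringsAsFactors = FALSE
)

# split() makes one data frame per experiment;
# sapply() then computes one correlation per piece
r <- sapply(split(DF, DF$Experiment), function(g) cor(g$Score1, g$Score2))

# Named numeric vector with one entry per experiment
print(round(r, 7))
```

The result is a named vector, so a single group can be looked up directly, e.g. r["Y"].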
Re: [R] Mutliple sets of data in one dataset....Need a loop?
Hi

Thanks for all your help.

It's a little difficult to follow those examples, as they all seem so different, and it's hard to see from the help files how to do what I want with my data, but I'll try...

--
View this message in context: http://n4.nabble.com/Mutliple-sets-of-data-in-one-dataset-Need-a-loop-tp1018503p1049653.html
Re: [R] Mutliple sets of data in one dataset....Need a loop?
One way to plot subsets of data identified by a grouping variable is to use lapply() on a list of subsets. The approach is worth mentioning because similar tactics are useful for many problems.

#List of unique values for grouping variable
#that is not necessarily a factor
names <- as.list(unique(df$Experiment))

#List of dataframes; 1 for each unique value of grouping variable
df.lst <- lapply(names,function(name)subset(df,Experiment==name))

#Name components of the list
#Not necessary in this case... but permits indexing by level
#of the grouping variable
names(df.lst) <- names

#Now you can use lapply() to carry out the same operation on
#each component of your list. For example, to send plots to
#a pdf with 1 page for each component:

pdf("plot.pdf")
lapply(df.lst,function(df)plot(df[,2],df[,3]))
dev.off()

Glen Sargeant
Research Wildlife Biologist

--
View this message in context: http://n4.nabble.com/Mutliple-sets-of-data-in-one-dataset-Need-a-loop-tp1018503p1018714.html
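As a possible shortcut for the list-building steps above, base R's split() returns an already-named list of data frames in one call. The sketch below uses the sample data from the original question and puts all groups on a single page, as the question requested. The name DF, the temporary output file, and the page dimensions are assumptions for illustration.

```r
# Sample data copied from the original question in this thread
DF <- data.frame(
  Experiment = rep(c("X", "Y", "Z"), each = 3),
  Score1 = c(-0.85, -1.21, 1.05, -1.12, -0.27, -0.93, 1.1, 2.4, -1.0),
  Score2 = c(-0.02, -0.02, 0.09, -0.07, -0.07, -0.08, -0.03, 0.09, 0.09),
  stringsAsFactors = FALSE
)

# One data frame per experiment, named by experiment
df.lst <- split(DF, DF$Experiment)

# One panel per experiment, side by side on a single page
out <- tempfile(fileext = ".pdf")
pdf(out, width = 9, height = 4)
op <- par(mfrow = c(1, length(df.lst)))
invisible(lapply(names(df.lst), function(nm) {
  d <- df.lst[[nm]]
  plot(d$Score1, d$Score2, main = nm, xlab = "Score1", ylab = "Score2")
}))
par(op)
dev.off()
```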
Re: [R] Mutliple sets of data in one dataset....Need a loop?
You'll probably want to look at the 'by' function:

d=data.frame(sex=rep(1:2,50),x=rnorm(100))
d$y=d$x+rnorm(100)
head(d)
cor(d)
by(d[,-1],d['sex'],function(df)cor(df))

You might also want to look at the doBy package.

--
View this message in context: http://n4.nabble.com/Mutliple-sets-of-data-in-one-dataset-Need-a-loop-tp1018503p1018616.html
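The same by() pattern can cover the regression part of the original question. The following sketch fits one simple linear regression per experiment on the sample data from the question; the name DF and the choice of Score2 as the response are assumptions for illustration.

```r
# Sample data copied from the original question in this thread
DF <- data.frame(
  Experiment = rep(c("X", "Y", "Z"), each = 3),
  Score1 = c(-0.85, -1.21, 1.05, -1.12, -0.27, -0.93, 1.1, 2.4, -1.0),
  Score2 = c(-0.02, -0.02, 0.09, -0.07, -0.07, -0.08, -0.03, 0.09, 0.09),
  stringsAsFactors = FALSE
)

# Fit Score2 ~ Score1 within each experiment and keep the coefficients
res <- by(DF, DF$Experiment, function(g) coef(lm(Score2 ~ Score1, data = g)))

# Stack the per-group coefficient vectors into one matrix:
# one row per experiment, columns for intercept and slope
coefs <- do.call(rbind, res)
print(coefs)
```

Returning the whole lm object instead of coef() would keep everything needed for summary() or predict() on each group.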
Re: [R] Mutliple sets of data in one dataset....Need a loop?
On Jan 20, 2010, at 11:07 AM, BioStudent wrote:

> Hi
>
> I'm hoping someone can help me. I am a relative newbie to R.
>
> I have data that is in a similar format to this...
>
> Experiment  Score1  Score2
> X           -0.85   -0.02
> X           -1.21   -0.02
> X            1.05    0.09
> Y           -1.12   -0.07
> Y           -0.27   -0.07
> Y           -0.93   -0.08
> Z            1.1    -0.03
> Z            2.4     0.09
> Z           -1.0     0.09
>
> Now I can easily have a look at the overall correlation of score 1 and 2 by
> doing this
> plot(data[,2], data[,3]) or
> fit <- lm(data[,2] ~ data[,3])
>
> BUT! I really want to look at the correlations within each experiment type,
> so ideally a multiple plot per page of each correlation within an
> experiment - and/or a way of looping through the data to get the simple
> linear regression for each experiment for scores 1 and 2.
>
> This looks like an easy example but I have thousands of results so it would
> be really handy to find a way of doing this quickly!!

library(lattice)
?xyplot # a group specification
?panel.lmline

There are quite a few examples at the xyplot help page.

--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT
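One way the xyplot() and panel.lmline() pointers above might be combined is sketched below, using the sample data from the question. The name DF, the panel layout, and the temporary output file are assumptions for illustration; the xyplot help page remains the authoritative source for the options.

```r
library(lattice)

# Sample data copied from the original question in this thread
DF <- data.frame(
  Experiment = factor(rep(c("X", "Y", "Z"), each = 3)),
  Score1 = c(-0.85, -1.21, 1.05, -1.12, -0.27, -0.93, 1.1, 2.4, -1.0),
  Score2 = c(-0.02, -0.02, 0.09, -0.07, -0.07, -0.08, -0.03, 0.09, 0.09)
)

# One panel per experiment; each panel draws the points
# and then a least-squares line through them
p <- xyplot(Score2 ~ Score1 | Experiment, data = DF,
            layout = c(3, 1),
            panel = function(x, y, ...) {
              panel.xyplot(x, y, ...)
              panel.lmline(x, y, ...)
            })

# Lattice plots are objects; print() draws them on the open device
out <- tempfile(fileext = ".pdf")
pdf(out, width = 9, height = 4)
print(p)
dev.off()
```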
[R] Mutliple sets of data in one dataset....Need a loop?
Hi

I'm hoping someone can help me. I am a relative newbie to R.

I have data that is in a similar format to this...

Experiment  Score1  Score2
X           -0.85   -0.02
X           -1.21   -0.02
X            1.05    0.09
Y           -1.12   -0.07
Y           -0.27   -0.07
Y           -0.93   -0.08
Z            1.1    -0.03
Z            2.4     0.09
Z           -1.0     0.09

Now I can easily have a look at the overall correlation of score 1 and 2 by doing this

plot(data[,2], data[,3])

or

fit <- lm(data[,2] ~ data[,3])

BUT! I really want to look at the correlations within each experiment type, so ideally a multiple plot per page of each correlation within an experiment - and/or a way of looping through the data to get the simple linear regression for each experiment for scores 1 and 2.

This looks like an easy example but I have thousands of results so it would be really handy to find a way of doing this quickly!!

Let me know if you need more explaining...

E

--
View this message in context: http://n4.nabble.com/Mutliple-sets-of-data-in-one-dataset-Need-a-loop-tp1018503p1018503.html