Re: [R] Mutliple sets of data in one dataset....Need a loop?

2010-01-22 Thread Glen Sargeant

Thank you, Mario.

Biostudent asked how one could perform repetitive tasks, e.g., plotting,
with subsets of data.  I originally provided a flexible example based on
lapply.  Mario suggested a variation that permits flexible control of
options.  This reply shows how Mario's objective and naming of each set of
results can be accomplished very simply, within the context of the approach
I suggested.

In practice, I often use a list of levels of my grouping variable, rather
than my list of data, as an argument to lapply().  Then I use the levels as
subscripts and labels.  For example:

#List of unique values for grouping variable
#that is not necessarily a factor
#(as.character() keeps a factor's integer codes from being used as names)
names <- as.list(as.character(unique(df$Experiment)))

#List of colors, same length as 'names'
#In actual application, color1, color2, etc.
#would be character strings, numbers, or
#color codes.
clr <- as.list(c(color1, color2, ...))
names(clr) <- names 

#List of dataframes; 1 for each unique value of grouping variable 
df.lst <- lapply(names,function(name)subset(df,Experiment==name)) 

#Name components of the list 
#Permits indexing by level of the grouping variable
names(df.lst) <- names 

#Now--if I didn't mistype something--lapply() can be used 
#to perform repetitive tasks without sacrificing flexibility.
#For example, to send plots to a pdf with 1 page for each 
#component, vary the color of points in each plot, and print 
#the value of the grouping variable at the top of each plot: 
pdf("plot.pdf") 
lapply(names,function(nms){
  plot(df.lst[[nms]][,2], df.lst[[nms]][,3],col=clr[[nms]])
  mtext(nms)}) 
dev.off() 
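
For anyone who wants to run the above end to end, here is a minimal self-contained sketch. It assumes the example data from the original post; the colors and the output file name (a temporary file) are arbitrary choices of mine, and I use 'lvls' instead of 'names' only to avoid confusion with the names() function.

```r
# Example data copied from the original post
df <- data.frame(
  Experiment = rep(c("X", "Y", "Z"), each = 3),
  Score1 = c(-0.85, -1.21, 1.05, -1.12, -0.27, -0.93, 1.1, 2.4, -1.0),
  Score2 = c(-0.02, -0.02, 0.09, -0.07, -0.07, -0.08, -0.03, 0.09, 0.09),
  stringsAsFactors = FALSE
)

# Levels as character strings, usable as both subscripts and labels
lvls <- unique(as.character(df$Experiment))

# One color per level (arbitrary choices), named by level
clr <- c("red", "blue", "darkgreen")
names(clr) <- lvls

# One data frame per level, named by level
df.lst <- lapply(lvls, function(lv) subset(df, Experiment == lv))
names(df.lst) <- lvls

# One page per level; invisible() suppresses the printed list of NULLs
out.file <- file.path(tempdir(), "plot.pdf")
pdf(out.file)
invisible(lapply(lvls, function(lv) {
  d <- df.lst[[lv]]
  plot(d$Score1, d$Score2, col = clr[[lv]], xlab = "Score1", ylab = "Score2")
  mtext(lv)
}))
dev.off()
```

Because both lists are indexed by level rather than by position, the colors stay attached to the right subset even if the order of the levels changes.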


-
Glen Sargeant
Research Wildlife Biologist
-- 
View this message in context: 
http://n4.nabble.com/Mutliple-sets-of-data-in-one-dataset-Need-a-loop-tp1018503p1100167.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Mutliple sets of data in one dataset....Need a loop?

2010-01-22 Thread Mario Valle
Great example, Glen!
I simply want to add a small thing that could be useful to someone.

Suppose in your last step you want to change the line color for each chart.
Using a for loop, it is simple to use the integer index to access the df.lst
elements and set the color (here 'colors' is a vector with one color per element):

   for(i in seq_along(df.lst)) plot(df.lst[[i]]$x, df.lst[[i]]$y, col=colors[i])

To do it 'lapply-style' use mapply:

   mapply(function(d, i) plot(d$x, d$y, col=colors[i]), df.lst, seq_along(df.lst))
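
A runnable sketch of the same idea, in case it helps. The toy data, the colors, and the temporary output file are made up for illustration. It uses Map(), which is mapply() with SIMPLIFY = FALSE, and passes the colors themselves instead of an integer index.

```r
# Toy list of data frames with columns x and y (hypothetical data)
df.lst <- list(
  a = data.frame(x = 1:5, y = c(2, 4, 5, 4, 6)),
  b = data.frame(x = 1:5, y = c(9, 7, 6, 4, 1))
)
cols <- c("red", "blue")  # one color per list element

out.file <- file.path(tempdir(), "mapply-plot.pdf")
pdf(out.file)
# Map() walks the three arguments in parallel: data frame, color, name
invisible(Map(function(d, colr, nm) {
  plot(d$x, d$y, col = colr, main = nm)
}, df.lst, cols, names(df.lst)))
dev.off()
```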

Ciao!
mario

Glen Sargeant wrote:
> One way to plot subsets of data identified by a grouping variable is to use
> lapply() on a list of subsets.  The approach is worth mentioning because
> similar tactics are useful for many problems. 
> 
> #List of unique values for grouping variable
> #that is not necessarily a factor
> names <- as.list(unique(df$Experiment))
> 
> #List of dataframes; 1 for each unique value of grouping variable
> df.lst <- lapply(names,function(name)subset(df,Experiment==name))
> 
> #Name components of the list
> #Not necessary in this case... but permits indexing by level
> #of the grouping variable
> names(df.lst) <- names
> 
> #Now you can use lapply() to carry out the same operation on
> #each component of your list.  For example, to send plots to
> #a pdf with 1 page for each component:
> 
> pdf("plot.pdf")
> lapply(df.lst,function(df)plot(df[,2],df[,3]))
> dev.off() 
> 
> 
> 
> 
> -
> Glen Sargeant
> Research Wildlife Biologist

-- 
Ing. Mario Valle
Data Analysis and Visualization Group| http://www.cscs.ch/~mvalle
Swiss National Supercomputing Centre (CSCS)  | Tel:  +41 (91) 610.82.60
v. Cantonale Galleria 2, 6928 Manno, Switzerland | Fax:  +41 (91) 610.82.82



Re: [R] Mutliple sets of data in one dataset....Need a loop?

2010-01-21 Thread Matthew Dowle
> but I have thousands of results so it would be really handy to find a way of 
> doing this quickly
> its a little difficult to follow those examples

Given your data in data.frame DF, maybe add the following to your list to 
investigate :

> library(data.table)
> dat = data.table(DF)
> dat[, cor(Score1,Score2), by="Experiment"]
     Experiment         V1
[1,]          X  0.9889524
[2,]          Y  0.3041195
[3,]          Z -0.1346107

To do a plot instead just replace "cor" with "plot" or whatever else you 
want to do within each group.
Since you said you have thousands of results,  data.table is faster for 
that.

In terms of ease of use,  you could try plyr too,  which you may well 
prefer.

> those examples as all seem so different

If you look and search crantastic, users are putting their comments there. 
That might help you make a decision more quickly and avoid you needing to 
post to r-help and wait for a reply,  assuming there is a package that 
already does what you need. Searching the history of r-help would have found 
many solutions to your problem this time, but it seems you are looking for 
advice on the best way. This changes over time and depends on lots of 
factors, including what you really want to do. Once you have worked out 
which packages work best for you, put your votes/comments onto crantastic 
and it should help everyone who follows in your path.  I guess you should 
then update your votes/comments as time progresses too.

Btw, plyr is ranked #2 on crantastic and is designed specifically for your 
task !!  Making yourself aware of the most popular packages would have 
helped you. If you need speed, try data.table.  When it comes to current, 
up to date advice on the most appropriate package, crantastic could be 
fantastic, assuming of course that you, the user, contribute to it.

HTH

"BioStudent"  wrote in message 
news:1264072645590-1049653.p...@n4.nabble.com...
>
> Hi Thanks for all your help
>
> Its a little difficult to follow those examples as all seem so different and
> its hard to see how I do what I want to my data from the help files but i'll
> try...
> -- 
> View this message in context: 
> http://n4.nabble.com/Mutliple-sets-of-data-in-one-dataset-Need-a-loop-tp1018503p1049653.html
> Sent from the R help mailing list archive at Nabble.com.
>



Re: [R] Mutliple sets of data in one dataset....Need a loop?

2010-01-21 Thread BioStudent

Hi, thanks for all your help.

It's a little difficult to follow those examples, as they all seem so different,
and it's hard to see from the help files how to do what I want with my data, but
I'll try...
-- 
View this message in context: 
http://n4.nabble.com/Mutliple-sets-of-data-in-one-dataset-Need-a-loop-tp1018503p1049653.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Mutliple sets of data in one dataset....Need a loop?

2010-01-20 Thread Glen Sargeant

One way to plot subsets of data identified by a grouping variable is to use
lapply() on a list of subsets.  The approach is worth mentioning because
similar tactics are useful for many problems. 

#List of unique values for grouping variable
#that is not necessarily a factor
#(as.character() keeps a factor's integer codes from being used as names)
names <- as.list(as.character(unique(df$Experiment)))

#List of dataframes; 1 for each unique value of grouping variable
df.lst <- lapply(names,function(name)subset(df,Experiment==name))

#Name components of the list
#Not necessary in this case... but permits indexing by level
#of the grouping variable
names(df.lst) <- names

#Now you can use lapply() to carry out the same operation on
#each component of your list.  For example, to send plots to
#a pdf with 1 page for each component:

pdf("plot.pdf")
lapply(df.lst,function(df)plot(df[,2],df[,3]))
dev.off() 
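
Since the original question asked for several plots on one page, here is a hedged variation on the same approach. It assumes the example data from the original post, uses split() as a shortcut for building the named list of subsets, and uses par(mfrow) to put one panel per group on a single page. The output file name and page size are arbitrary.

```r
# Example data copied from the original post
df <- data.frame(
  Experiment = rep(c("X", "Y", "Z"), each = 3),
  Score1 = c(-0.85, -1.21, 1.05, -1.12, -0.27, -0.93, 1.1, 2.4, -1.0),
  Score2 = c(-0.02, -0.02, 0.09, -0.07, -0.07, -0.08, -0.03, 0.09, 0.09)
)

# split() returns a list of data frames named by level of the grouping variable
df.lst <- split(df, df$Experiment)

out.file <- file.path(tempdir(), "plot-grid.pdf")
pdf(out.file, width = 9, height = 3)
par(mfrow = c(1, length(df.lst)))  # one row, one panel per group
for (nm in names(df.lst)) {
  d <- df.lst[[nm]]
  plot(d$Score1, d$Score2, main = nm, xlab = "Score1", ylab = "Score2")
}
dev.off()
```

With thousands of groups a single page will not do, but the same loop fills successive pages of the pdf once each page's grid is full.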




-
Glen Sargeant
Research Wildlife Biologist
-- 
View this message in context: 
http://n4.nabble.com/Mutliple-sets-of-data-in-one-dataset-Need-a-loop-tp1018503p1018714.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Mutliple sets of data in one dataset....Need a loop?

2010-01-20 Thread David Freedman

You'll probably want to look at the 'by' function

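#Simulated data: a grouping variable and two correlated scores
d=data.frame(sex=rep(1:2,50),x=rnorm(100))
d$y=d$x+rnorm(100)
head(d)
#Overall correlation matrix (includes 'sex')
cor(d)
#Correlation matrix of x and y within each level of 'sex'
by(d[,-1],d['sex'],function(df)cor(df))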

You might also want to look at the doBy package
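
Applied to the data from the original post, the by() approach might look like the sketch below. The data frame is retyped from that post, and regressing Score2 on Score1 is my assumption about which way round the regression is wanted.

```r
# Example data copied from the original post
df <- data.frame(
  Experiment = rep(c("X", "Y", "Z"), each = 3),
  Score1 = c(-0.85, -1.21, 1.05, -1.12, -0.27, -0.93, 1.1, 2.4, -1.0),
  Score2 = c(-0.02, -0.02, 0.09, -0.07, -0.07, -0.08, -0.03, 0.09, 0.09)
)

# One simple linear regression per experiment
fits <- by(df, df$Experiment, function(d) lm(Score2 ~ Score1, data = d))

# Collect intercepts and slopes into one matrix, one row per experiment
coefs <- t(sapply(fits, coef))
coefs

# Within-experiment correlations
r <- sapply(split(df, df$Experiment), function(d) cor(d$Score1, d$Score2))
r
```

Each element of 'fits' is an ordinary lm object, so summary(fits[["X"]]) and friends work as usual.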
-- 
View this message in context: 
http://n4.nabble.com/Mutliple-sets-of-data-in-one-dataset-Need-a-loop-tp1018503p1018616.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Mutliple sets of data in one dataset....Need a loop?

2010-01-20 Thread David Winsemius


On Jan 20, 2010, at 11:07 AM, BioStudent wrote:

> Hi
>
> I'm hoping someone can help me I am a relative newbie to R.
>
> I have data that is in a similar format to this...
>
> Experiment Score1 Score2
> X -0.85 -0.02
> X -1.21 -0.02
> X  1.05  0.09
> Y -1.12 -0.07
> Y -0.27 -0.07
> Y -0.93 -0.08
> Z 1.1 -0.03
> Z 2.4 0.09
> Z -1.0 0.09
>
> Now I can easily have a look at the overall correlation of score 1 and 2 by
> doing this
> plot(data[,2], data[,3])   or
> fit <- lm(data[,2] ~ data[,3])
>
> BUT! I really want to look at the correlations within each experiment type
> so ideally a multiple plot per page of each correlation within an experiment
> - and/or a way of looping through the data to get the simple linear
> regression for each experiment for scores 1 and 2. This looks like an easy
> example but I have thousands of results so it would be really handy to find
> a way of doing this quickly!!


library(lattice)
?xyplot  # a group specification
?panel.lmline

There are quite a few examples at the xyplot help page.
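
As a starting point, a minimal sketch along those lines, assuming the example data quoted above. The layout and the temporary output file are arbitrary choices. It draws one panel per experiment and overlays the least-squares line in each.

```r
library(lattice)

# Example data copied from the original post
df <- data.frame(
  Experiment = factor(rep(c("X", "Y", "Z"), each = 3)),
  Score1 = c(-0.85, -1.21, 1.05, -1.12, -0.27, -0.93, 1.1, 2.4, -1.0),
  Score2 = c(-0.02, -0.02, 0.09, -0.07, -0.07, -0.08, -0.03, 0.09, 0.09)
)

# "| Experiment" conditions on the grouping variable: one panel per level
p <- xyplot(Score2 ~ Score1 | Experiment, data = df,
            layout = c(3, 1),
            panel = function(x, y, ...) {
              panel.xyplot(x, y, ...)  # the points
              panel.lmline(x, y)       # regression line fitted within the panel
            })

# Lattice plots are objects; print() draws them on the open device
out.file <- file.path(tempdir(), "xyplot.pdf")
pdf(out.file)
print(p)
dev.off()
```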




--

David Winsemius, MD
Heritage Laboratories
West Hartford, CT



[R] Mutliple sets of data in one dataset....Need a loop?

2010-01-20 Thread BioStudent

Hi

I'm hoping someone can help me; I am a relative newbie to R.

I have data that is in a similar format to this...

Experiment Score1 Score2
X           -0.85  -0.02
X           -1.21  -0.02
X            1.05   0.09
Y           -1.12  -0.07
Y           -0.27  -0.07
Y           -0.93  -0.08
Z            1.1   -0.03
Z            2.4    0.09
Z           -1.0    0.09

Now I can easily have a look at the overall correlation of score 1 and 2 by
doing this
plot(data[,2], data[,3])   or
fit <- lm(data[,2] ~ data[,3])

BUT! I really want to look at the correlations within each experiment type,
so ideally multiple plots per page, one for each correlation within an
experiment, and/or a way of looping through the data to get the simple linear
regression for each experiment for scores 1 and 2. This looks like an easy
example, but I have thousands of results, so it would be really handy to find
a way of doing this quickly!!

Let me know if you need more explaining...

E
-- 
View this message in context: 
http://n4.nabble.com/Mutliple-sets-of-data-in-one-dataset-Need-a-loop-tp1018503p1018503.html
Sent from the R help mailing list archive at Nabble.com.
