Re: [R] Data.frame manipulation
Thanks again, Dennis and Petr! The solution using the plyr package was perfect:

    ddply(data, .(id, mod1), summarize, es = mean(es), mod2 = head(mod2, 1))

Take care,

AC

On Thu, Jan 28, 2010 at 11:26 PM, Petr PIKAL wrote:

> If I remember it correctly in my suggestion I used something like
>
> aggregate(x[,-columns.mod1 and mod2], by = x[, columns.mod1 and mod2, mean)
>
> Which shall use mod2 as aggregating variable.
>
> Does it result in output you wanted?
>
> Regards
> Petr

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] Data.frame manipulation
Hi

r-help-boun...@r-project.org wrote on 28.01.2010 17:40:01:

> Thank you, Dennis and Petr.
>
> One more question: when aggregating to one es per id, how would I go about
> keeping the other variables in the data.frame (e.g., keeping the value for
> the first row of the other variables, such as mod2) e.g.:
>
> # Dennis provided this example (notice how mod2 is removed from the output):
>
> > with(x, aggregate(list(es = es), by = list(id = id, mod1 = mod1), mean))
>   id mod1   es
> 1  3    1 0.20
> 2  1    2 0.30
> 3  2    4 0.15
>
> # How can I get this output (taking the first row of the other variable in
> the data.frame):

If I remember correctly, my suggestion used something like

    # columns 3 and 4 of x are mod1 and mod2
    aggregate(x[, -(3:4)], by = x[, 3:4], FUN = mean)

which uses mod2 as an aggregating variable.

Does it give the output you wanted?

Regards
Petr

> id   es  mod1  mod2
> 1   .30   2    wai
> 2   .15   4    other
> 3   .20   1    itas
>
> Thank you,
>
> AC
Re: [R] Data.frame manipulation
Hi:

On Thu, Jan 28, 2010 at 8:40 AM, AC Del Re wrote:

> Thank you, Dennis and Petr.
>
> One more question: when aggregating to one es per id, how would I go about
> keeping the other variables in the data.frame (e.g., keeping the value for
> the first row of the other variables, such as mod2) e.g.:
>
> # Dennis provided this example (notice how mod2 is removed from the output):
>
> > with(x, aggregate(list(es = es), by = list(id = id, mod1 = mod1), mean))
>   id mod1   es
> 1  3    1 0.20
> 2  1    2 0.30
> 3  2    4 0.15
>
> # How can I get this output (taking the first row of the other variable in
> the data.frame):
>
> id   es  mod1  mod2
> 1   .30   2    wai
> 2   .15   4    other
> 3   .20   1    itas

Using ddply from the plyr package:

> library(plyr)
> ddply(x, .(id, mod1), summarize, es = mean(es), mod2 = head(mod2, 1))
  id mod1   es  mod2
1  1    2 0.30   wai
2  2    4 0.15 other
3  3    1 0.20  itas

mod2 = head(...) selects the first instance of mod2 in each id/mod1 combination. It appears from the help page that aggregate only allows one summary function per call; if so, it wouldn't be able to do this. You could, however, do this in the doBy package with a custom summary function.

HTH,
Dennis
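For readers without plyr, the same "mean of es plus first row of everything else" idea can be sketched in base R with split(), lapply() and rbind(). This is an illustration rather than part of the original reply. It assumes the data frame x holds the values from the printed table in the original post (the c(...) assignments there differ slightly), and the name first_with_mean is made up for the example.

```r
# Example data, assumed to match the table printed in the original post.
x <- data.frame(
  id   = c(1, 2, 2, 3, 3, 3),
  es   = c(0.3, 0.1, 0.2, 0.1, 0.2, 0.3),
  mod1 = c(2, 4, 4, 1, 1, 1),
  mod2 = c("wai", "other", "calpas", "itas", "wai", "wai"),
  stringsAsFactors = FALSE
)

# Split the rows by id, reduce each piece to a single row
# (mean of es, first value of the other columns), then stack the pieces.
pieces <- lapply(split(x, x$id), function(g) {
  data.frame(
    id   = g$id[1],
    es   = mean(g$es),
    mod1 = g$mod1[1],
    mod2 = g$mod2[1],
    stringsAsFactors = FALSE
  )
})
first_with_mean <- do.call(rbind, pieces)
first_with_mean
```

The anonymous function plays the role of the summarize arguments in the ddply() call: any per-group expression can go there, which is what a single aggregate() call with one FUN does not offer directly.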
Re: [R] Data.frame manipulation
Thank you, Dennis and Petr.

One more question: when aggregating to one es per id, how would I go about keeping the other variables in the data.frame (e.g., keeping the value for the first row of the other variables, such as mod2)? E.g.:

# Dennis provided this example (notice how mod2 is removed from the output):

> with(x, aggregate(list(es = es), by = list(id = id, mod1 = mod1), mean))
  id mod1   es
1  3    1 0.20
2  1    2 0.30
3  2    4 0.15

# How can I get this output (taking the first row of the other variable in the data.frame):

id   es  mod1  mod2
1   .30   2    wai
2   .15   4    other
3   .20   1    itas

Thank you,

AC

On Thu, Jan 28, 2010 at 1:29 AM, Petr PIKAL wrote:

> E.g. aggregate
>
> aggregate(data[, -(3:4)], data[,3:4], mean)
>
> doBy or tapply or ddply from plyr library or
>
> Regards
> Petr
Re: [R] Data.frame manipulation
Thank you Dennis--this is perfect!!

AC

On Thu, Jan 28, 2010 at 12:24 AM, Dennis Murphy wrote:

> Hi:
>
> There are several ways to do this, but these are the most commonly used:
> aggregate() and the ddply() function in package plyr.
>
> (1) plyr solution (using x as the name of your input data frame):
>
> > library(plyr)
> > ddply(x, .(id, mod1), summarize, es = mean(es))
>   id mod1   es
> 1  1    2 0.30
> 2  2    4 0.15
> 3  3    1 0.20
> > ddply(x, .(id, mod1, mod2), summarize, es = mean(es))
>   id mod1   mod2   es
> 1  1    2    wai 0.30
> 2  2    4 calpas 0.20
> 3  2    4  other 0.10
> 4  3    1   itas 0.10
> 5  3    1    wai 0.25
>
> (2) aggregate() function in base R:
>
> > with(x, aggregate(list(es = es), by = list(id = id, mod1 = mod1), mean))
>   id mod1   es
> 1  3    1 0.20
> 2  1    2 0.30
> 3  2    4 0.15
> > with(x, aggregate(list(es = es), by = list(id = id, mod1 = mod1, mod2 = mod2),
> +                   mean))
>   id mod1   mod2   es
> 1  2    4 calpas 0.20
> 2  3    1   itas 0.10
> 3  2    4  other 0.10
> 4  3    1    wai 0.25
> 5  1    2    wai 0.30
>
> Note that enclosing the variable names in lists and 'equating' them
> maintains the variable name in the output. Here's what happens if you don't:
>
> > with(x, aggregate(es, list(id, mod1), mean))
>   Group.1 Group.2    x
> 1       3       1 0.20
> 2       1       2 0.30
> 3       2       4 0.15
>
> ddply() is a little less painful and sorts the output for you automatically.
>
> HTH,
> Dennis
>
> On Wed, Jan 27, 2010 at 7:34 PM, AC Del Re wrote:
>
>> I am particularly interested in simply reducing the additional variables
>> in the data.frame to the first row of the corresponding id variable.
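The naming issue Dennis describes can also be sidestepped with the formula interface of aggregate(), which carries the column names through without the list(name = name) wrapping. A minimal sketch, assuming x holds the values from the printed table in the original post; the result names by_id_mod1 and by_all are made up for the example.

```r
# Example data, assumed to match the table printed in the original post.
x <- data.frame(
  id   = c(1, 2, 2, 3, 3, 3),
  es   = c(0.3, 0.1, 0.2, 0.1, 0.2, 0.3),
  mod1 = c(2, 4, 4, 1, 1, 1),
  mod2 = c("wai", "other", "calpas", "itas", "wai", "wai"),
  stringsAsFactors = FALSE
)

# Left of ~ : the variable to summarise; right of ~ : the grouping variables.
by_id_mod1 <- aggregate(es ~ id + mod1, data = x, FUN = mean)
by_id_mod1

# Adding mod2 to the grouping gives one row per id/mod1/mod2 combination.
by_all <- aggregate(es ~ id + mod1 + mod2, data = x, FUN = mean)
by_all
```

The output columns are named id, mod1 and es rather than Group.1, Group.2 and x. Rows come back ordered by the grouping variables, not by id, so sort afterwards if the order matters.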
Re: [R] Data.frame manipulation
Hi

r-help-boun...@r-project.org wrote on 28.01.2010 04:35:29:

> id<-c(1,2,2,3,3,3)
> es<-c(.3,.1,.3,.1,.2,.3)
> mod1<-c(2,4,4,1,1,1)
> mod2<-c("wai","other","calpas","wai","itas","other")
> data<-as.data.frame(cbind(id,es,mod1,mod2))

Do not use cbind. Its output is a matrix, and in this case a character matrix. The resulting data frame will consist of factors, as you can check with

    str(data)

Build the data frame directly instead:

    data<-data.frame(id=id,es=es,mod1=mod1,mod2=mod2)

> # I would like to reduce the entire data.frame like this:

E.g. aggregate

    aggregate(data[, -(3:4)], data[,3:4], mean)
      mod1   mod2 id  es
    1    4 calpas  2 0.3
    2    1   itas  3 0.2
    3    1  other  3 0.3
    4    4  other  2 0.1
    5    1    wai  3 0.1
    6    2    wai  1 0.3

Or use doBy, tapply, or ddply from the plyr package.

Regards
Petr

> id   es  mod1  mod2
> 1   .30   2    wai
> 2   .15   4    other
> 3   .20   1    itas
>
> # If possible, I would also like the option of this (collapsing on id and
> mod2):
>
> id   es   mod1  mod2
> 1   .30    2    wai
> 2   0.1    4    other
> 2   0.2    4    calpas
> 3   0.1    1    itas
> 3   0.25   1    wai
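Petr's warning about cbind() is easy to check directly. The sketch below uses the vectors from the original post; the names via_cbind and direct are made up for the example. Whether the coerced columns end up as factors or as character vectors depends on the R version (the stringsAsFactors default changed in R 4.0.0), but in neither case are they numeric.

```r
id   <- c(1, 2, 2, 3, 3, 3)
es   <- c(0.3, 0.1, 0.3, 0.1, 0.2, 0.3)
mod1 <- c(2, 4, 4, 1, 1, 1)
mod2 <- c("wai", "other", "calpas", "wai", "itas", "other")

# cbind() builds a matrix, and a matrix has a single type, so the
# numeric vectors are coerced to character alongside mod2.
via_cbind <- as.data.frame(cbind(id, es, mod1, mod2))

# data.frame() keeps each column's own type.
direct <- data.frame(id = id, es = es, mod1 = mod1, mod2 = mod2)

str(via_cbind)
str(direct)

is.numeric(via_cbind$es)  # FALSE: es is no longer a number
is.numeric(direct$es)     # TRUE
```

With via_cbind, a call such as mean(via_cbind$es) does not return a usable number, which is why the aggregation examples in this thread need the data frame built directly.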
[R] Data.frame manipulation
Hi All,

I'm conducting a meta-analysis and have taken a data.frame with multiple rows per study (for each effect size) and performed a weighted average of effect size for each study. This results in a reduced # of rows. I am particularly interested in simply reducing the additional variables in the data.frame to the first row of the corresponding id variable. For example:

id<-c(1,2,2,3,3,3)
es<-c(.3,.1,.3,.1,.2,.3)
mod1<-c(2,4,4,1,1,1)
mod2<-c("wai","other","calpas","wai","itas","other")
data<-as.data.frame(cbind(id,es,mod1,mod2))

data

  id  es mod1   mod2
1  1 0.3    2    wai
2  2 0.1    4  other
3  2 0.2    4 calpas
4  3 0.1    1   itas
5  3 0.2    1    wai
6  3 0.3    1    wai

# I would like to reduce the entire data.frame like this:

id   es  mod1  mod2
1   .30   2    wai
2   .15   4    other
3   .20   1    itas

# If possible, I would also like the option of this (collapsing on id and mod2):

id   es   mod1  mod2
1   .30    2    wai
2   0.1    4    other
2   0.2    4    calpas
3   0.1    1    itas
3   0.25   1    wai

Any help is much appreciated!

AC Del Re
Re: [R] data.frame manipulation: Unbinding strings in a row
Here is a quick hack:

> x <- read.table(textConnection("ID Shop Items
+ ID1 A1 item1,item2,item3
+ ID2 A2 item4,item5
+ ID3 A1 item1,item3,item4"), header=TRUE)
> y <- lapply(1:nrow(x), function(.row){
+     .items <- strsplit(as.character(x$Items[.row]), ',')[[1]]
+     data.frame(ID=rep(x$ID[.row], length(.items)), Shop=rep(x$Shop[.row], length(.items)),
+         Item=.items)
+ })
> do.call(rbind,y)
   ID Shop  Item
1 ID1   A1 item1
2 ID1   A1 item2
3 ID1   A1 item3
4 ID2   A2 item4
5 ID2   A2 item5
6 ID3   A1 item1
7 ID3   A1 item3
8 ID3   A1 item4

On Jan 10, 2008 6:40 AM, francogrex <[EMAIL PROTECTED]> wrote:

> But I would like to unbind the strings in col(2) items so that it will look
> like this:
>
> ID   Shop  Items
> ID1  A1    item1
> ID1  A1    item2
> ID1  A1    item3
> ID2  A2    item4
> ID2  A2    item5
> ID3  A1    item1
> ID3  A1    item3
> ID3  A1    item4
>
> Meaning each item is on a different row but still maintain the ties with the
> IDs and the Shops.

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?
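The row-by-row loop above can also be written without lapply(), and the stated end goal (counting items per shop) then follows from table(). The sketch below is one possible variant, not Jim's code. It assumes the column names ID, Shop and Items, keeps the ", " separators from the original question, and the names long and counts are made up for the example.

```r
x <- data.frame(
  ID    = c("ID1", "ID2", "ID3"),
  Shop  = c("A1", "A2", "A1"),
  Items = c("item1, item2, item3", "item4, item5", "item1, item3, item4"),
  stringsAsFactors = FALSE
)

# Split each Items string on commas, allowing optional spaces around them.
items <- strsplit(x$Items, " *, *")

# Number of items per original row, used to repeat ID and Shop.
n <- lengths(items)

long <- data.frame(
  ID   = rep(x$ID, n),
  Shop = rep(x$Shop, n),
  Item = unlist(items),
  stringsAsFactors = FALSE
)
long

# Cross-tabulate items against shops to count purchases.
counts <- table(Item = long$Item, Shop = long$Shop)
counts
```

counts["item1", "A1"] then gives the number of times item1 was bought in shop A1. as.data.frame(counts) turns the table back into one row per item/shop pair if a long format is preferred.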
[R] data.frame manipulation: Unbinding strings in a row
Hi all,

I have a data.frame I received with data that look like this (comma-separated strings in the last column):

ID   Shop  Items
ID1  A1    item1, item2, item3
ID2  A2    item4, item5
ID3  A1    item1, item3, item4

But I would like to unbind the strings in the Items column so that it will look like this:

ID   Shop  Items
ID1  A1    item1
ID1  A1    item2
ID1  A1    item3
ID2  A2    item4
ID2  A2    item5
ID3  A1    item1
ID3  A1    item3
ID3  A1    item4

Meaning each item is on a different row but still maintains the ties with the IDs and the Shops.

The final purpose is to count how many times a particular item has been bought in a particular shop, like:

item1-A1 = 2
item2-A1 = 1
item3-A1 = 2

Any ideas? Thanks