[R] by function does not separate output from function with multiple parts
Colleagues,

I have written an R function (see the fully annotated code below) with which I want to process a data frame within levels of the variable StepType. My program works: it processes the data within levels of StepType. However, the usual headers that separate the output by levels of StepType appear at the end of the listing rather than being used as separators, i.e. I get

Regression results StepType First
Contrast results StepType First
Regression results StepType Second
Contrast results StepType Second

and only after the results are displayed do I get the usual separators:

mydata$StepType: First
NULL
--
mydata$StepType: Second
NULL

What I want is output that includes the separators, i.e.

mydata$StepType: First
Regression results StepType First
Contrast results StepType First
--
mydata$StepType: Second
Regression results StepType Second
Contrast results StepType Second

Can you help me get the separators included in the printed output?

Thank you,
John

# Create Dataframe #
mydata <- structure(list(
  HipFlex = c(19.44, 4.44, 3.71, 1.95, 2.07, 1.55, 0.44, 0.23, 2.15, 0.41,
    2.3, 0.22, 2.08, 4.61, 4.19, 5.65, 2.73, 1.46, 10.02, 7.41,
    6.91, 5.28, 9.56, 2.46, 6, 3.85, 6.43, 3.73, 1.08, 1.43,
    1.82, 2.22, 0.34, 5.11, 0.94, 0.98, 2.04, 1.73, 0.94, 18.41,
    0.77, 2.31, 0.22, 1.06, 0.13, 0.36, 2.84, 5.2, 2.39, 2.99),
  jSex = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L,
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
    1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L),
    levels = c("Male", "Female"), class = "factor")),
  row.names = c(NA, 50L), class = "data.frame")
mydata[,"StepType"] <- rep(c("First","Second"),25)
mydata
# END Create Dataframe #

# Define function to be run #
DoReg <- function(x){
  fit0 <- lm(as.numeric(HipFlex) ~ jSex, data=x)
  print(summary(fit0))
  cat("\nMale\n")
  print(contrast(fit0, list(jSex="Male")))
  cat("\nFemale\n")
  print(contrast(fit0, list(jSex="Female")))
  cat("\nDifference\n")
  print(contrast(fit0, a=list(jSex="Male"), b=list(jSex="Female")))
}
# END Define function to be run #

# Run function within levels of StepType #
by(mydata,mydata$StepType,DoReg)
# END Run function within levels of StepType #

John David Sorkin M.D., Ph.D.
Professor of Medicine, University of Maryland School of Medicine;
Associate Director for Biostatistics and Informatics, Baltimore VA Medical Center Geriatrics Research, Education, and Clinical Center;
PI Biostatistics and Informatics Core, University of Maryland School of Medicine Claude D. Pepper Older Americans Independence Center;
Senior Statistician University of Maryland Center for Vascular Research;
Division of Gerontology and Palliative Care,
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
Cell phone 443-418-5382

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
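A note on why the headers arrive last, with a possible workaround. by() calls the function once per group and collects whatever each call returns; the group headers are written only when the resulting "by" object is printed at top level, after all the calls have finished. Anything the function prints itself therefore appears first. The sketch below is only an illustration under stated assumptions: `toy`, `do_reg` and the header format are made-up stand-ins (not the poster's data or DoReg), and it uses base R only. It splits the data by StepType and writes a header before each call.

```r
# Toy data standing in for the poster's data frame (values are made up).
set.seed(1)
toy <- data.frame(
  y        = rnorm(20),
  g        = factor(rep(c("Male", "Female"), each = 10)),
  StepType = rep(c("First", "Second"), 10)
)

# Same shape as DoReg(): prints as a side effect and returns nothing useful.
do_reg <- function(x) {
  fit <- lm(y ~ g, data = x)
  print(coef(fit))
  invisible(NULL)
}

# Write the header first, then the group's output, one level at a time.
groups <- split(toy, toy$StepType)
for (level in names(groups)) {
  cat("\nStepType: ", level, "\n", sep = "")
  do_reg(groups[[level]])
  cat(strrep("-", 40), "\n")
}
```

If only the trailing block of headers and NULLs is unwanted, wrapping the original call as invisible(by(...)) suppresses the printing of the "by" object while leaving the output from inside the function untouched.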
Re: [R] running crossvalidation many times MSE for Lasso regression
Dear Rui,

Thank you very much for your response and your R code.

Best,
Sacha

On Tuesday, 24 October 2023 at 16:37:56 UTC+2, Rui Barradas wrote:

At 20:12 on 23/10/2023, varin sacha via R-help wrote:
> Dear R-experts,
>
> I really thank you all a lot for your responses. So, here is the error (and
> warning) messages at the end of my R code.
>
> Many thanks for your help.
>
> Error in UseMethod("predict") :
>   no applicable method for 'predict' applied to an object of class
>   "c('matrix', 'array', 'double', 'numeric')"
> > mean(unlist(lst))
> [1] NA
> Warning message:
> In mean.default(unlist(lst)) :
>   argument is not numeric or logical: returning NA
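For readers following the thread: the error message says that predict() was handed a plain matrix. In the posted code the first predict() call already returns the matrix of predictions, and the second call then tries to predict from that matrix. Separately, lambda.min is a component of a cv.glmnet() fit, not of a glmnet() fit. The sketch below is one possible repair, not necessarily the code Rui posted. It assumes the glmnet package is installed, uses simulated stand-in data rather than the poster's vectors, and runs 100 repetitions instead of 1000 to keep it quick.

```r
library(glmnet)

set.seed(42)
n <- 45
X <- cbind(x1 = rnorm(n), x2 = rnorm(n))   # stand-in predictors (simulated)
y <- 1 + 2 * X[, "x1"] + rnorm(n)          # stand-in response (simulated)

p   <- 0.667
mse <- numeric(100)                        # the thread uses 1000 repetitions

for (i in seq_along(mse)) {
  train <- sample(seq_len(n), floor(p * n), replace = FALSE)

  # cv.glmnet(), not glmnet(), is what provides lambda.min
  cv_fit <- cv.glmnet(X[train, ], y[train], alpha = 1)

  # predict() is called once, on the fitted model; it returns a matrix
  y_pred <- predict(cv_fit, newx = X[-train, ], s = "lambda.min")

  mse[i] <- mean((y[-train] - y_pred)^2)
}

mean(mse)
```

Storing the results in a numeric vector rather than a list also means mean() can be applied directly, without unlist().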
Re: [R] How to Calculate the Mean by Multiple Groups in R
A variation is to remove Well and then we can use dot to refer to the remaining columns.

aggregate(cbind(OD, ODnorm) ~ . , subset(df, select = - Well), mean)

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
Re: [R] running crossvalidation many times MSE for Lasso regression
At 20:12 on 23/10/2023, varin sacha via R-help wrote:

Dear R-experts,

I really thank you all a lot for your responses. So, here is the error (and warning) messages at the end of my R code.

Many thanks for your help.

Error in UseMethod("predict") :
  no applicable method for 'predict' applied to an object of class "c('matrix', 'array', 'double', 'numeric')"
> mean(unlist(lst))
[1] NA
Warning message:
In mean.default(unlist(lst)) :
  argument is not numeric or logical: returning NA

On Monday, 23 October 2023 at 19:59:15 UTC+2, Ben Bolker wrote:

For what it's worth it looks like spm2 is specifically for *spatial* predictive modeling; presumably its version of CV is doing something spatially aware.

I agree that glmnet is old and reliable. One might want to use a tidymodels wrapper to create pipelines where you can more easily switch among predictive algorithms (see the `parsnip` package), but otherwise sticking to glmnet seems wise.

On 2023-10-23 4:38 a.m., Martin Maechler wrote:

Jin Li on Mon, 23 Oct 2023 15:42:14 +1100 writes:

> If you are interested in other validation methods (e.g., LOO or n-fold)
> with more predictive accuracy measures, the function, glmnetcv, in the spm2
> package can be directly used, and some reproducible examples are
> also available in ?glmnetcv.

... and once you open that can of w..: the glmnet package itself contains a function cv.glmnet() which we (our students) use when teaching.

What's the advantage of the spm2 package? At least, the glmnet package is authored by the same who originated and first published (as in "peer reviewed" ..) these algorithms.

> On Mon, Oct 23, 2023 at 10:59 AM Duncan Murdoch wrote:
>> On 22/10/2023 7:01 p.m., Bert Gunter wrote:
>> > No error message shown. Please include the error message so that it is
>> > not necessary to rerun your code. This might enable someone to see the
>> > problem without running the code (e.g. downloading packages, etc.)
>>
>> And it's not necessarily true that someone else would see the same error
>> message.
>>
>> Duncan Murdoch
>>
>> > -- Bert
>> >
>> > On Sun, Oct 22, 2023 at 1:36 PM varin sacha via R-help wrote:
>> >>
>> >> Dear R-experts,
>> >>
>> >> Here below my R code with an error message. Can somebody help me to fix this error?
>> >> Really appreciate your help.
>> >>
>> >> Best,
>> >>
>> >> # MSE CROSSVALIDATION Lasso regression
>> >>
>> >> library(glmnet)
>> >>
>> >> x1=c(34,35,12,13,15,37,65,45,47,67,87,45,46,39,87,98,67,51,10,30,65,34,57,68,98,86,45,65,34,78,98,123,202,231,154,21,34,26,56,78,99,83,46,58,91)
>> >> x2=c(1,3,2,4,5,6,7,3,8,9,10,11,12,1,3,4,2,3,4,5,4,6,8,7,9,4,3,6,7,9,8,4,7,6,1,3,2,5,6,8,7,1,1,2,9)
>> >> y=c(2,6,5,4,6,7,8,10,11,2,3,1,3,5,4,6,5,3.4,5.6,-2.4,-5.4,5,3,6,5,-3,-5,3,2,-1,-8,5,8,6,9,4,5,-3,-7,-9,-9,8,7,1,2)
>> >> T=data.frame(y,x1,x2)
>> >>
>> >> z=matrix(c(x1,x2), ncol=2)
>> >> cv_model=glmnet(z,y,alpha=1)
>> >> best_lambda=cv_model$lambda.min
>> >> best_lambda
>> >>
>> >> # Create a list to store the results
>> >> lst<-list()
>> >>
>> >> # This statement does the repetitions (looping)
>> >> for(i in 1 :1000) {
>> >>
>> >> n=45
>> >>
>> >> p=0.667
>> >>
>> >> sam=sample(1 :n,floor(p*n),replace=FALSE)
>> >>
>> >> Training =T [sam,]
>> >> Testing = T [-sam,]
>> >>
>> >> test1=matrix(c(Testing$x1,Testing$x2),ncol=2)
>> >>
>> >> predictLasso=predict(cv_model, newx=test1)
>> >>
>> >> ypred=predict(predictLasso,newdata=test1)
>> >> y=T[-sam,]$y
>> >>
>> >> MSE = mean((y-ypred)^2)
>> >> MSE
>> >> lst[i]<-MSE
>> >> }
>> >> mean(unlist(lst))
>> >> ##
Re: [R] How to Calculate the Mean by Multiple Groups in R
Thank you

On Tue, Oct 24, 2023 at 3:01 PM peter dalgaard wrote:
>
> Also,
>
> > aggregate(cbind(OD, ODnorm) ~ Time + Target + Conc, data = df, FUN = "mean")
>   Time Target Conc   OD ODnorm
> 1    1   BACT    1 765.  108.3
> 2    1   BACT    2 745.   88.3
> 3    1   BACT    3 675.   18.0
>
> (You might wish for "cbind(OD,ODnorm) ~ . - Well", but aggregate.formula is
> not smart enough for that.)
>
> -pd

--
Best regards,
Luigi
Re: [R] How to Calculate the Mean by Multiple Groups in R
Also,

> aggregate(cbind(OD, ODnorm) ~ Time + Target + Conc, data = df, FUN = "mean")
  Time Target Conc   OD ODnorm
1    1   BACT    1 765.  108.3
2    1   BACT    2 745.   88.3
3    1   BACT    3 675.   18.0

(You might wish for "cbind(OD,ODnorm) ~ . - Well", but aggregate.formula is not smart enough for that.)

-pd

--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk Priv: pda...@gmail.com
Re: [R] How to Calculate the Mean by Multiple Groups in R
Thank you, the last is exactly what I was looking for.

On Tue, Oct 24, 2023 at 2:41 PM Sarah Goslee wrote:
>
> Or using a different form, that might be more straightforward to you:
>
> > aggregate(df[, c("OD", "ODnorm")], by = df[, c("Time", "Target", "Conc")], data = df, FUN = "mean")
>   Time Target Conc   OD ODnorm
> 1    1   BACT    1 765.  108.3
> 2    1   BACT    2 745.   88.3
> 3    1   BACT    3 675.   18.0
>
> Sarah

--
Best regards,
Luigi
Re: [R] How to Calculate the Mean by Multiple Groups in R
Hi,

I think you're misunderstanding which set of variables go on either side of the formula.

Is this what you're looking for?

> aggregate(OD ~ Time + Target + Conc, data = df, FUN = "mean")
  Time Target Conc   OD
1    1   BACT    1 765.
2    1   BACT    2 745.
3    1   BACT    3 675.
> aggregate(ODnorm ~ Time + Target + Conc, data = df, FUN = "mean")
  Time Target Conc ODnorm
1    1   BACT    1  108.3
2    1   BACT    2   88.3
3    1   BACT    3   18.0

Or using a different form, that might be more straightforward to you:

> aggregate(df[, c("OD", "ODnorm")], by = df[, c("Time", "Target", "Conc")], data = df, FUN = "mean")
  Time Target Conc   OD ODnorm
1    1   BACT    1 765.  108.3
2    1   BACT    2 745.   88.3
3    1   BACT    3 675.   18.0

Sarah

--
Sarah Goslee (she/her)
http://www.numberwright.com
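The point about the two sides of the formula can be checked with a self-contained run: the measurements to be averaged go on the left of `~`, and the grouping variables go on the right. The sketch below rebuilds the data frame from the original question and combines both measurements with cbind(); the name `res` is only for illustration.

```r
df <- data.frame(
  Time   = rep(1, 9),
  Well   = paste("A", 1:9, sep = ""),
  OD     = c(666, 815, 815, 702, 739, 795, 657, 705, 663),
  Target = rep("BACT", 9),
  Conc   = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
  ODnorm = c(9, 158, 158, 45, 82, 138, 0, 48, 6),
  stringsAsFactors = FALSE
)

# Left of ~ : what to summarise. Right of ~ : what to group by.
# Well is deliberately left out, so the triplicates collapse into one row.
res <- aggregate(cbind(OD, ODnorm) ~ Time + Target + Conc, data = df, FUN = mean)
res
```

Each Conc level holds three wells, so the result has three rows, one mean of OD and one mean of ODnorm per group.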
[R] How to Calculate the Mean by Multiple Groups in R
Hello,
I have a data frame with different groups (Time, Target, Conc) and each entry has a triplicate value of the measurements OD and ODnorm. How can I merge the triplicates into a single mean value?
I tried the following:
```
df = data.frame(Time=rep(1, 9), Well=paste("A", 1:9, sep=""),
                OD=c(666, 815, 815, 702, 739, 795, 657, 705, 663),
                Target=rep("BACT", 9),
                Conc=c(1,1,1,2,2,2,3,3,3),
                ODnorm=c(9, 158, 158, 45, 82, 138, 0, 48, 6),
                stringsAsFactors = FALSE)
aggregate(.~ODnorm, df, mean)

> aggregate(.~ODnorm, df, mean)
  ODnorm Time Well OD Target Conc
1      0   NA   NA NA     NA   NA
2      6   NA   NA NA     NA   NA
3      9   NA   NA NA     NA   NA
4     45   NA   NA NA     NA   NA
5     48   NA   NA NA     NA   NA
6     82   NA   NA NA     NA   NA
7    138   NA   NA NA     NA   NA
8    158   NA   NA NA     NA   NA

aggregate(cbind(Time, Target, Conc) ~ ODnorm, df, mean)
  ODnorm Time Target Conc
1      0   NA     NA   NA
2      6   NA     NA   NA
3      9   NA     NA   NA
4     45   NA     NA   NA
5     48   NA     NA   NA
6     82   NA     NA   NA
7    138   NA     NA   NA
8    158   NA     NA   NA
```

Thank you.
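A likely reading of the all-NA output, offered as a sketch rather than a definitive diagnosis: both attempts put character columns (Well and/or Target) on the left-hand side together with numeric ones. Binding numeric and character columns with cbind() produces a character matrix, and mean() of character data returns NA with a warning. The small vectors below are stand-ins for the columns of `df`.

```r
Time   <- rep(1, 3)        # numeric, like df$Time
Target <- rep("BACT", 3)   # character, like df$Target

# Mixing numeric and character columns in cbind() gives a character matrix,
# so even the Time column is now text.
lhs <- cbind(Time, Target)
is.character(lhs)

# mean() of character data warns and returns NA.
res <- suppressWarnings(mean(lhs[, "Time"]))
is.na(res)
```

This is also why grouping on ODnorm gives eight rows: ODnorm is a measurement with eight distinct values, not a grouping variable.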
Re: [R] Create new data frame with conditional sums
This seems to work. A couple of fine points, including handling duplicated Pct values right, which is easier if you do the reversed cumsum.

> dd2 <- dummydata[order(dummydata$Pct),]
> dd2$Cum <- rev(cumsum(rev(dd2$Totpop)))
> use <- !duplicated(dd2$Pct)
> approx(dd2$Pct[use], dd2$Cum[use], ctof, method="constant", f=1, rule=2)
$x
 [1] 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09 0.10 0.11 0.12 0.13 0.14
[16] 0.15

$y
 [1] 43800 43800 39300 39300 31000 26750 22750 17800 12700 12700  8000  8000
[13]  8000  3900  3900  3900

> On 14 Oct 2023, at 17:10 , Bert Gunter wrote:
>
> Well, here's one way to do it:
> (dat is your example data frame)
>
> Cutoff <- seq(0, .15, .01)
> Pop <- with(dat, sapply(Cutoff, \(p)sum(Totpop[Pct >= p])))
>
> I think there must be a more efficient way to do it with cumsum(), though.
>
> Cheers,
> Bert
>
> On Sat, Oct 14, 2023 at 12:53 AM Jason Stout, M.D. wrote:
>>
>> This seems like it should be simple but I can't get it to work properly.
>> I'm starting with a data frame like this:
>>
>>    Tract  Pct Totpop
>>        1 0.05   4000
>>        2 0.03   3500
>>        3 0.01   4500
>>        4 0.12   4100
>>        5 0.21   3900
>>        6 0.04   4250
>>        7 0.07   5100
>>        8 0.09   4700
>>        9 0.06   4950
>>       10 0.03   4800
>>
>> And I want to end up with a data frame with two columns, a "Cutoff" column
>> that is a simple sequence of equally spaced cutoffs (let's say in this case
>> from 0-0.15 by 0.01) and a "Pop" column which equals the sum of "Totpop" in
>> the prior data frame in which "Pct" is greater than or equal to "cutoff."
>> So in this toy example, this is what I want for a result:
>>
>>    Cutoff   Pop
>> 1    0.00 43800
>> 2    0.01 43800
>> 3    0.02 39300
>> 4    0.03 39300
>> 5    0.04 31000
>> 6    0.05 26750
>> 7    0.06 22750
>> 8    0.07 17800
>> 9    0.08 12700
>> 10   0.09 12700
>> 11   0.10  8000
>> 12   0.11  8000
>> 13   0.12  8000
>> 14   0.13  3900
>> 15   0.14  3900
>> 16   0.15  3900
>>
>> I can do this with a for loop but it seems there should be an easier,
>> vectorized way that would be more efficient. Here is a reproducible example:
>>
>> dummydata<-data.frame(Tract=seq(1,10,by=1),
>>                       Pct=c(0.05,0.03,0.01,0.12,0.21,0.04,0.07,0.09,0.06,0.03),
>>                       Totpop=c(4000,3500,4500,4100,3900,4250,5100,4700,4950,4800))
>> dfrm<-data.frame(matrix(ncol=2,nrow=0,dimnames=list(NULL,c("Cutoff","Pop"))))
>> for (i in seq(0,0.15,by=0.01)) {
>>   temp<-sum(dummydata[dummydata$Pct>=i,"Totpop"])
>>   dfrm[nrow(dfrm)+1,]<-c(i,temp)
>> }
>>
>> Jason Stout, MD, MHS
>> Division of Infectious Diseases
>> Dept of Medicine
>> Duke University
>> Box 102359-DUMC
>> Durham, NC 27710
>> FAX 919-681-7494

--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk Priv: pda...@gmail.com
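The two approaches in this thread can be put side by side in one self-contained script. The sketch below takes the data from the original post, computes the sums directly with sapply(), then again with the reversed cumulative sum and approx(), and checks that they agree. One assumption is specific to this sketch: the cutoffs are built as (0:15) / 100 rather than with seq(0, 0.15, by = 0.01), so that each cutoff is the same floating-point number as the literals in Pct and the >= comparisons and exact matches behave as intended. The names `cutoff`, `pop_direct`, `pop_cumsum` and `result` are illustrative.

```r
dummydata <- data.frame(
  Tract  = 1:10,
  Pct    = c(0.05, 0.03, 0.01, 0.12, 0.21, 0.04, 0.07, 0.09, 0.06, 0.03),
  Totpop = c(4000, 3500, 4500, 4100, 3900, 4250, 5100, 4700, 4950, 4800)
)

# Built by division so each cutoff equals the corresponding literal in Pct
# (an assumption of this sketch; the thread uses seq(0, 0.15, by = 0.01)).
cutoff <- (0:15) / 100

# Direct definition: total population of tracts with Pct >= cutoff.
pop_direct <- sapply(cutoff, function(p) sum(dummydata$Totpop[dummydata$Pct >= p]))

# Reversed cumulative sum: after sorting by Pct, Cum[i] is the total
# population of rows i..n, i.e. of tracts with Pct >= Pct[i].
dd2     <- dummydata[order(dummydata$Pct), ]
dd2$Cum <- rev(cumsum(rev(dd2$Totpop)))
use     <- !duplicated(dd2$Pct)   # first row of each tie carries the full sum

# Step function: between two Pct values take the value at the right-hand one
# (f = 1); outside the observed range use the nearest end value (rule = 2).
pop_cumsum <- approx(dd2$Pct[use], dd2$Cum[use], xout = cutoff,
                     method = "constant", f = 1, rule = 2)$y

result <- data.frame(Cutoff = cutoff, Pop = pop_cumsum)
stopifnot(isTRUE(all.equal(pop_direct, pop_cumsum)))
result
```

The sapply() version does one pass over the data per cutoff, while the cumsum() version sorts once and then looks each cutoff up, which is the efficiency gain the thread was after.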