[R] by function does not separate output from function with multiple parts

2023-10-24 Thread Sorkin, John
Colleagues,

I have written an R function (see fully annotated code below), with which I 
want to process a dataframe within levels of the variable StepType. My program 
works, it processes the data within levels of StepType, but the usual headers 
that separate the output by levels of StepType are at the end of the listing 
rather than being used as separators, i.e. I get

Regression results StepType First
Contrast results StepType First
Regression results StepType Second
Contrast results StepType Second

and only after the results are displayed do I get the usual separators:
mydata$StepType: First
NULL
-- 
mydata$StepType: Second
NULL


What I want to get is output that includes the separators i.e., 

mydata$StepType: First
Regression results StepType First
Contrast results StepType First
-- 
mydata$StepType: Second
Regression results StepType Second
Contrast results StepType Second

Can you help me get the separators included in the printed output?
Thank you, 
John


# Create Dataframe #

mydata <- structure(list(HipFlex = c(19.44, 4.44, 3.71, 1.95, 2.07, 1.55, 
  0.44, 0.23, 2.15, 0.41, 2.3, 0.22, 2.08, 4.61, 4.19, 5.65, 2.73, 
  1.46, 10.02, 7.41, 6.91, 5.28, 9.56, 2.46, 6, 3.85, 6.43, 3.73, 
  1.08, 1.43, 1.82, 2.22, 0.34, 5.11, 0.94, 0.98, 2.04, 1.73, 0.94, 
  18.41, 0.77, 2.31, 0.22, 1.06, 0.13, 0.36, 2.84, 5.2, 2.39, 2.99),
   jSex = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
  1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
  1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 
  2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L), levels = c("Male", "Female"), class = 
"factor")), 
  row.names = c(NA, 50L), class = "data.frame")

mydata[,"StepType"] <- rep(c("First","Second"),25)
mydata

# END Create Dataframe #



# Define function to be run#

DoReg <- function(x){
  # contrast() is not part of base R; it must come from an add-on package
  # that is loaded before this function is called.
  fit0 <- lm(as.numeric(HipFlex) ~ jSex, data=x)
  print(summary(fit0))

  cat("\nMale\n")
  print(contrast(fit0, list(jSex="Male")))

  cat("\nFemale\n")
  print(contrast(fit0, list(jSex="Female")))

  cat("\nDifference\n")
  print(contrast(fit0,
                 a=list(jSex="Male"),
                 b=list(jSex="Female")))
}

# END Define function to be run#


#
# Run function within levels of StepType #
#
by(mydata,mydata$StepType,DoReg)
#
# END Run function within levels of StepType #
#
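
A note on the ordering described above: by() first calls the function once per group, so anything the function prints appears immediately, and only afterwards is the returned "by" object auto-printed. Its print method is what writes the "mydata$StepType: First" headers, each followed by the function's return value (NULL here). One possible workaround, sketched below, is to print the header yourself before each call. The toy data frame `demo` and the helper `fit_group` are hypothetical stand-ins (not the poster's data or DoReg), and only lm() is used so the sketch needs no add-on package.

```r
# Toy data standing in for mydata (hypothetical values, not the poster's data)
demo <- data.frame(
  HipFlex  = c(2.1, 3.4, 1.9, 4.2, 2.8, 3.9, 1.2, 5.0),
  jSex     = factor(rep(c("Male", "Female"), each = 4)),
  StepType = rep(c("First", "Second"), 4)
)

# Hypothetical stand-in for DoReg(): prints results, returns the fit invisibly
fit_group <- function(x) {
  fit <- lm(HipFlex ~ jSex, data = x)
  print(coef(summary(fit)))
  invisible(fit)
}

# Split by group, then print the separator before each group's output
groups <- split(demo, demo$StepType)
fits <- list()
for (lev in names(groups)) {
  cat("\ndemo$StepType: ", lev, "\n", sep = "")
  fits[[lev]] <- fit_group(groups[[lev]])
  cat(strrep("-", 40), "\n")
}
```

Returning the fit invisibly also keeps the fitted models available in `fits` for later use, instead of discarding them after printing.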




John David Sorkin M.D., Ph.D.
Professor of Medicine, University of Maryland School of Medicine;
Associate Director for Biostatistics and Informatics, Baltimore VA Medical 
Center Geriatrics Research, Education, and Clinical Center; 
PI Biostatistics and Informatics Core, University of Maryland School of 
Medicine Claude D. Pepper Older Americans Independence Center;
Senior Statistician University of Maryland Center for Vascular Research;
Division of Gerontology and Palliative Care,
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
Cell phone 443-418-5382



__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] running crossvalidation many times MSE for Lasso regression

2023-10-24 Thread varin sacha via R-help
Dear Rui,

I really thank you a lot for your response and your R code.

Best,

Sacha



Re: [R] How to Calculate the Mean by Multiple Groups in R

2023-10-24 Thread Gabor Grothendieck
A variation is to remove Well and then we can use dot to refer to the
remaining columns.

  aggregate(cbind(OD, ODnorm)  ~ . , subset(df, select = - Well), mean)
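
As a rough check of the idea above, the sketch below rebuilds the data frame from the original question and compares the dot form (after dropping Well) with a formula that names the grouping variables explicitly. The object names `explicit` and `dotted` are made up for illustration.

```r
# Data frame from the original question
df <- data.frame(Time = rep(1, 9), Well = paste0("A", 1:9),
                 OD = c(666, 815, 815, 702, 739, 795, 657, 705, 663),
                 Target = rep("BACT", 9),
                 Conc = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
                 ODnorm = c(9, 158, 158, 45, 82, 138, 0, 48, 6),
                 stringsAsFactors = FALSE)

# Grouping variables named explicitly on the right-hand side
explicit <- aggregate(cbind(OD, ODnorm) ~ Time + Target + Conc,
                      data = df, FUN = mean)

# Drop Well first, then let "." stand for every remaining column
dotted <- aggregate(cbind(OD, ODnorm) ~ .,
                    data = subset(df, select = -Well), FUN = mean)

# The two results are expected to agree
all.equal(explicit, dotted)
```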





-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com



Re: [R] running crossvalidation many times MSE for Lasso regression

2023-10-24 Thread Rui Barradas

At 20:12 on 23/10/2023, varin sacha via R-help wrote:

Dear R-experts,

I really thank you all a lot for your responses. So, here is the error (and 
warning) messages at the end of my R code.

Many thanks for your help.


Error in UseMethod("predict") :
   no applicable method for 'predict' applied to an object of class
   "c('matrix', 'array', 'double', 'numeric')"

mean(unlist(lst))

[1] NA
Warning message:
In mean.default(unlist(lst)) :
   argument is not numeric or logical: returning NA








On Monday, 23 October 2023 at 19:59:15 UTC+2, Ben Bolker wrote:





   For what it's worth it looks like spm2 is specifically for *spatial*
predictive modeling; presumably its version of CV is doing something
spatially aware.

   I agree that glmnet is old and reliable.  One might want to use a
tidymodels wrapper to create pipelines where you can more easily switch
among predictive algorithms (see the `parsnip` package), but otherwise
sticking to glmnet seems wise.

On 2023-10-23 4:38 a.m., Martin Maechler wrote:

Jin Li
       on Mon, 23 Oct 2023 15:42:14 +1100 writes:


       > If you are interested in other validation methods (e.g., LOO or n-fold)
       > with more predictive accuracy measures, the function, glmnetcv, in the
       > spm2 package can be directly used, and some reproducible examples are
       > also available in ?glmnetcv.

... and once you open that can of w..:  the  glmnet package itself
contains a function  cv.glmnet()  which we (our students) use when teaching.

What's the advantage of the spm2 package ?
At least, the glmnet package is authored by the same who originated and
first published (as in "peer reviewed" ..) these algorithms.



       > On Mon, Oct 23, 2023 at 10:59 AM Duncan Murdoch wrote:

       >> On 22/10/2023 7:01 p.m., Bert Gunter wrote:
       >> > No error message shown. Please include the error message so that it
       >> > is not necessary to rerun your code. This might enable someone to see
       >> > the problem without running the code (e.g. downloading packages, etc.)
       >>
       >> And it's not necessarily true that someone else would see the same
       >> error message.
       >>
       >> Duncan Murdoch
       >>
       >> >
       >> > -- Bert
       >> >
       >> > On Sun, Oct 22, 2023 at 1:36 PM varin sacha via R-help
       >> >  wrote:
       >> >>
       >> >> Dear R-experts,
       >> >>
       >> >> Here below my R code with an error message. Can somebody help me
       >> >> to fix this error?
       >> >> Really appreciate your help.
       >> >>
       >> >> Best,
       >> >>
       >> >> 
       >> >> # MSE CROSSVALIDATION Lasso regression
       >> >>
       >> >> library(glmnet)
       >> >>
       >> >>
       >> >>
       >> >> x1=c(34,35,12,13,15,37,65,45,47,67,87,45,46,39,87,98,67,51,10,30,65,34,57,68,98,86,45,65,34,78,98,123,202,231,154,21,34,26,56,78,99,83,46,58,91)
       >> >>
       >> >> x2=c(1,3,2,4,5,6,7,3,8,9,10,11,12,1,3,4,2,3,4,5,4,6,8,7,9,4,3,6,7,9,8,4,7,6,1,3,2,5,6,8,7,1,1,2,9)
       >> >>
       >> >> y=c(2,6,5,4,6,7,8,10,11,2,3,1,3,5,4,6,5,3.4,5.6,-2.4,-5.4,5,3,6,5,-3,-5,3,2,-1,-8,5,8,6,9,4,5,-3,-7,-9,-9,8,7,1,2)
       >> >> T=data.frame(y,x1,x2)
       >> >>
       >> >> z=matrix(c(x1,x2), ncol=2)
       >> >> cv_model=glmnet(z,y,alpha=1)
       >> >> best_lambda=cv_model$lambda.min
       >> >> best_lambda
       >> >>
       >> >>
       >> >> # Create a list to store the results
       >> >> lst<-list()
       >> >>
       >> >> # This statement does the repetitions (looping)
       >> >> for(i in 1 :1000) {
       >> >>
       >> >> n=45
       >> >>
       >> >> p=0.667
       >> >>
       >> >> sam=sample(1 :n,floor(p*n),replace=FALSE)
       >> >>
       >> >> Training =T [sam,]
       >> >> Testing = T [-sam,]
       >> >>
       >> >> test1=matrix(c(Testing$x1,Testing$x2),ncol=2)
       >> >>
       >> >> predictLasso=predict(cv_model, newx=test1)
       >> >>
       >> >>
       >> >> ypred=predict(predictLasso,newdata=test1)
       >> >> y=T[-sam,]$y
       >> >>
       >> >> MSE = mean((y-ypred)^2)
       >> >> MSE
       >> >> lst[i]<-MSE
       >> >> }
       >> >> mean(unlist(lst))
       >> >> ##
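
For readers following the thread: the reported error appears to come from calling predict() a second time on the output of the first predict(), which is a plain matrix. In addition, glmnet() itself does not compute lambda.min; that value comes from cv.glmnet(), the function mentioned earlier in the thread. The sketch below is one possible rework under those assumptions, not Rui's reply. The names `dat`, `n_rep` and `mse` are introduced here for illustration, the data frame is renamed from T (which masks the shorthand for TRUE), the model is refit on each training split, and 100 repetitions are used instead of 1000 to keep the run short.

```r
library(glmnet)

x1 <- c(34,35,12,13,15,37,65,45,47,67,87,45,46,39,87,98,67,51,10,30,65,34,57,
        68,98,86,45,65,34,78,98,123,202,231,154,21,34,26,56,78,99,83,46,58,91)
x2 <- c(1,3,2,4,5,6,7,3,8,9,10,11,12,1,3,4,2,3,4,5,4,6,8,7,9,4,3,6,7,9,8,4,7,
        6,1,3,2,5,6,8,7,1,1,2,9)
y  <- c(2,6,5,4,6,7,8,10,11,2,3,1,3,5,4,6,5,3.4,5.6,-2.4,-5.4,5,3,6,5,-3,-5,
        3,2,-1,-8,5,8,6,9,4,5,-3,-7,-9,-9,8,7,1,2)
dat <- data.frame(y, x1, x2)

set.seed(123)          # arbitrary seed, only for reproducibility
n     <- nrow(dat)
p     <- 0.667         # training fraction, as in the original post
n_rep <- 100           # assumption: fewer repetitions than the original 1000
mse   <- numeric(n_rep)

for (i in seq_len(n_rep)) {
  sam     <- sample(n, floor(p * n))
  x_train <- as.matrix(dat[sam,  c("x1", "x2")])
  x_test  <- as.matrix(dat[-sam, c("x1", "x2")])

  # Cross-validated lasso (alpha = 1) fitted on the training rows only
  cv_fit <- cv.glmnet(x_train, dat$y[sam], alpha = 1)

  # A single predict() call, at the lambda that minimises the CV error
  y_pred <- predict(cv_fit, newx = x_test, s = "lambda.min")
  mse[i] <- mean((dat$y[-sam] - y_pred)^2)
}

mean(mse)
```

Storing the results in a preallocated numeric vector rather than a list avoids the unlist() step at the end.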

Re: [R] How to Calculate the Mean by Multiple Groups in R

2023-10-24 Thread Luigi Marongiu
Thank you



-- 
Best regards,
Luigi



Re: [R] How to Calculate the Mean by Multiple Groups in R

2023-10-24 Thread peter dalgaard
Also,

> aggregate(cbind(OD, ODnorm) ~ Time + Target + Conc, data = df, FUN = "mean")
  Time Target Conc       OD    ODnorm
1    1   BACT    1 765.3333 108.33333
2    1   BACT    2 745.3333  88.33333
3    1   BACT    3 675.0000  18.00000

(You might wish for "cbind(OD,ODnorm) ~ . - Well", but aggregate.formula is not 
smart enough for that.)

-pd


-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com



Re: [R] How to Calculate the Mean by Multiple Groups in R

2023-10-24 Thread Luigi Marongiu
Thank you, the last is exactly what I was looking for.




-- 
Best regards,
Luigi



Re: [R] How to Calculate the Mean by Multiple Groups in R

2023-10-24 Thread Sarah Goslee
Hi,

I think you're misunderstanding which set of variables go on either
side of the formula.

Is this what you're looking for?

> aggregate(OD ~ Time + Target + Conc, data = df, FUN = "mean")
  Time Target Conc       OD
1    1   BACT    1 765.3333
2    1   BACT    2 745.3333
3    1   BACT    3 675.0000
> aggregate(ODnorm ~ Time + Target + Conc, data = df, FUN = "mean")
  Time Target Conc    ODnorm
1    1   BACT    1 108.33333
2    1   BACT    2  88.33333
3    1   BACT    3  18.00000

Or using a different form, that might be more straightforward to you:

> aggregate(df[, c("OD", "ODnorm")], by = df[, c("Time", "Target", "Conc")], 
> data = df, FUN = "mean")
  Time Target Conc       OD    ODnorm
1    1   BACT    1 765.3333 108.33333
2    1   BACT    2 745.3333  88.33333
3    1   BACT    3 675.0000  18.00000
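
A short sketch of why the original attempts most likely returned only NA: with the measured values on the right-hand side, the left-hand side mixes numeric and character columns, cbind() coerces them all to character, and mean() of a character vector is NA. The object name `lhs` is introduced here for illustration.

```r
# Data frame from the original question
df <- data.frame(Time = rep(1, 9), Well = paste0("A", 1:9),
                 OD = c(666, 815, 815, 702, 739, 795, 657, 705, 663),
                 Target = rep("BACT", 9),
                 Conc = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
                 ODnorm = c(9, 158, 158, 45, 82, 138, 0, 48, 6),
                 stringsAsFactors = FALSE)

# cbind() of numeric and character columns yields a character matrix
lhs <- with(df, cbind(Time, Target, Conc))
is.character(lhs)

# mean() of a character vector returns NA (with a warning)
suppressWarnings(mean(df$Target))

# Measurements on the left, grouping variables on the right
aggregate(cbind(OD, ODnorm) ~ Conc, data = df, FUN = mean)
```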

Sarah




-- 
Sarah Goslee (she/her)
http://www.numberwright.com



[R] How to Calculate the Mean by Multiple Groups in R

2023-10-24 Thread Luigi Marongiu
Hello,
I have a data frame with different groups (Time, Target, Conc) and
each entry has a triplicate value of the measurements OD and ODnorm.
How can I merge the triplicates into a single mean value?
I tried the following:
```
df = data.frame(Time=rep(1, 9), Well=paste("A", 1:9, sep=""),
OD=c(666, 815, 815, 702, 739, 795, 657, 705, 663),
Target=rep("BACT", 9),
Conc=c(1,1,1,2,2,2,3,3,3),
ODnorm=c(9, 158, 158,  45,  82, 138,   0,  48,   6),
stringsAsFactors = FALSE)
aggregate(.~ODnorm, df, mean)

> aggregate(.~ODnorm, df, mean)
  ODnorm Time Well OD Target Conc
1      0   NA   NA NA     NA   NA
2      6   NA   NA NA     NA   NA
3      9   NA   NA NA     NA   NA
4     45   NA   NA NA     NA   NA
5     48   NA   NA NA     NA   NA
6     82   NA   NA NA     NA   NA
7    138   NA   NA NA     NA   NA
8    158   NA   NA NA     NA   NA

> aggregate(cbind(Time, Target, Conc) ~ ODnorm, df, mean)
  ODnorm Time Target Conc
1      0   NA     NA   NA
2      6   NA     NA   NA
3      9   NA     NA   NA
4     45   NA     NA   NA
5     48   NA     NA   NA
6     82   NA     NA   NA
7    138   NA     NA   NA
8    158   NA     NA   NA
```

Thank you.



Re: [R] Create new data frame with conditional sums

2023-10-24 Thread peter dalgaard
This seems to work. A couple of fine points, including handling duplicated Pct 
values right, which is easier if you do the reversed cumsum.

> dd2 <- dummydata[order(dummydata$Pct),]
> dd2$Cum <- rev(cumsum(rev(dd2$Totpop)))
> use <- !duplicated(dd2$Pct)
> approx(dd2$Pct[use], dd2$Cum[use], ctof, method="constant", f=1, rule=2)
$x
 [1] 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09 0.10 0.11 0.12 0.13 0.14
[16] 0.15

$y
 [1] 43800 43800 39300 39300 31000 26750 22750 17800 12700 12700  8000  8000
[13]  8000  3900  3900  3900


> On 14 Oct 2023, at 17:10 , Bert Gunter  wrote:
> 
> Well, here's one way to do it:
> (dat is your example data frame)
> 
> Cutoff <- seq(0, .15, .01)
> Pop <- with(dat, sapply(Cutoff, \(p)sum(Totpop[Pct >= p])))
> 
> I think there must be a more efficient way to do it with cumsum(), though.
> 
> Cheers,
> Bert
> 
> On Sat, Oct 14, 2023 at 12:53 AM Jason Stout, M.D.  
> wrote:
>> 
>> This seems like it should be simple but I can't get it to work properly.  
>> I'm starting with a data frame like this:
>> 
>> Tract  Pct   Totpop
>> 1      0.05  4000
>> 2      0.03  3500
>> 3      0.01  4500
>> 4      0.12  4100
>> 5      0.21  3900
>> 6      0.04  4250
>> 7      0.07  5100
>> 8      0.09  4700
>> 9      0.06  4950
>> 10     0.03  4800
>> 
>> And I want to end up with a data frame with two columns, a "Cutoff" column 
>> that is a simple sequence of equally spaced cutoffs (let's say in this case 
>> from 0-0.15 by 0.01) and a "Pop" column which equals the sum of "Totpop" in 
>> the prior data frame in which "Pct" is greater than or equal to "cutoff."  
>> So in this toy example, this is what I want for a result:
>> 
>>    Cutoff   Pop
>> 1    0.00 43800
>> 2    0.01 43800
>> 3    0.02 39300
>> 4    0.03 39300
>> 5    0.04 31000
>> 6    0.05 26750
>> 7    0.06 22750
>> 8    0.07 17800
>> 9    0.08 12700
>> 10   0.09 12700
>> 11   0.10  8000
>> 12   0.11  8000
>> 13   0.12  8000
>> 14   0.13  3900
>> 15   0.14  3900
>> 16   0.15  3900
>> 
>> I can do this with a for loop but it seems there should be an easier, 
>> vectorized way that would be more efficient.  Here is a reproducible example:
>> 
>> dummydata<-data.frame(Tract=seq(1,10,by=1),
>>   Pct=c(0.05,0.03,0.01,0.12,0.21,0.04,0.07,0.09,0.06,0.03),
>>   Totpop=c(4000,3500,4500,4100,3900,4250,5100,4700,4950,4800))
>> dfrm<-data.frame(matrix(ncol=2,nrow=0,dimnames=list(NULL,c("Cutoff","Pop"))))
>> for (i in seq(0,0.15,by=0.01)) {
>>   temp<-sum(dummydata[dummydata$Pct>=i,"Totpop"])
>>   dfrm[nrow(dfrm)+1,]<-c(i,temp)
>> }
>> 
>> Jason Stout, MD, MHS
>> Division of Infectious Diseases
>> Dept of Medicine
>> Duke University
>> Box 102359-DUMC
>> Durham, NC 27710
>> FAX 919-681-7494
>> 
>> 

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com
