Thanks David. This is working perfectly!

On Fri, Jul 29, 2016 at 9:00 PM, David Winsemius <dwinsem...@comcast.net> wrote:
>
> > On Jul 29, 2016, at 5:52 PM, David Winsemius <dwinsem...@comcast.net> wrote:
> >
> >> On Jul 29, 2016, at 5:08 PM, Jun Shen <jun.shen...@gmail.com> wrote:
> >>
> >> Thanks Jeff/David for the reply. I wasn't clear in the previous
> >> message. The problem of using na.omit is it will omit the whole row where
> >> there is at least one NA, even when some variables do have non-NA values.
> >
> > Did you actually run the example I offered, or did you just guess at
> > what would happen and complain? When applied only to a vector there is no
> > such thing as a "column".
> >
> > What you are describing would only have happened if `na.omit` were
> > applied to an object that was a dataframe. That was not what was offered in
> > the example.
>
> And then I looked at the code again and realized you were not looping over
> the columns as I thought was happening. So what you want is:
>
> do.stats <- function(data, stats.func, summary.var)
>   as.data.frame(signif(sapply(stats.func, function(func)
>     mapply(func, lapply(data[summary.var], na.omit))), 3))
>
> --
> David
>
> >
> > --
> > David.
> >>
> >> For example: let's define a new function
> >> N <- function(x) length(x[!is.na(x)])
> >>
> >> test <- data.frame(ID=1:100, CL=rnorm(100), V1=rnorm(100), V2=rnorm(100), ALPHA=rnorm(100))
> >> test$CL[1] <- NA
> >>
> >> do.stats(test, stats.func=c('mean','sd','median','min','max','N'),
> >>          summary.var=c('CL','V1', 'V2','ALPHA'))
> >>
> >> gives
> >>
> >>          mean    sd  median   min  max  N
> >> CL    -0.0232 0.918 -0.0786 -2.14 3.14 99
> >> V1    -0.0410 0.936 -0.1160 -2.86 2.67 99
> >> V2    -0.1760 0.978 -0.1490 -2.31 2.15 99
> >> ALPHA -0.1380 0.960 -0.2160 -2.41 2.20 99
> >>
> >> One non-missing value in each of V1, V2 and ALPHA is omitted.
> >>
> >> On Fri, Jul 29, 2016 at 2:29 AM, David Winsemius <dwinsem...@comcast.net> wrote:
> >>
> >>> On Jul 28, 2016, at 7:37 PM, Jun Shen <jun.shen...@gmail.com> wrote:
> >>>
> >>> Because in reality the NA may appear in one variable but not others. For
> >>> example, for ID=1, CL may be NA but not the others; for ID=2, V1 may be NA,
> >>> etc. To keep all the IDs and all the variables in one data frame, it's
> >>> inevitable to see some NA
> >>
> >> That doesn't seem to acknowledge Newmiller's advice. In particular this
> >> would have seemed to be an obvious response to that suggestion:
> >>
> >> do.stats <- function(data, stats.func, summary.var)
> >>   as.data.frame(signif(sapply(stats.func, function(func)
> >>     mapply(func, na.omit(data[summary.var]))), 3))
> >>
> >> And please also heed the advice in the Posting Guide to use plain text.
> >>
> >> --
> >> David.
> >>
> >>>
> >>> On Thu, Jul 28, 2016 at 10:22 PM, Jeff Newmiller <jdnew...@dcn.davis.ca.us> wrote:
> >>>
> >>>> Why not remove it yourself before passing it to those functions?
> >>>> --
> >>>> Sent from my phone. Please excuse my brevity.
> >>>>
> >>>> On July 28, 2016 5:51:47 PM PDT, Jun Shen <jun.shen...@gmail.com> wrote:
> >>>>> Dear list,
> >>>>>
> >>>>> I wrote a small function to calculate multiple stats on multiple variables
> >>>>> and export in a format exactly the way I want. Everything seems fine until
> >>>>> NA appears in the data.
> >>>>>
> >>>>> Here is my function:
> >>>>>
> >>>>> do.stats <- function(data, stats.func, summary.var)
> >>>>>   as.data.frame(signif(sapply(stats.func, function(func)
> >>>>>     mapply(func, data[summary.var])), 3))
> >>>>>
> >>>>> A test dataset:
> >>>>> test <- data.frame(ID=1:100, CL=rnorm(100), V1=rnorm(100), V2=rnorm(100), ALPHA=rnorm(100))
> >>>>>
> >>>>> a command like the following
> >>>>> do.stats(test, stats.func=c('mean','sd','median','min','max'),
> >>>>>          summary.var=c('CL','V1', 'V2','ALPHA'))
> >>>>>
> >>>>> gives me
> >>>>>
> >>>>>          mean    sd  median   min  max
> >>>>> CL     0.1030 0.917  0.0363 -2.32 2.47
> >>>>> V1    -0.0545 1.070 -0.2120 -2.21 2.70
> >>>>> V2     0.0600 1.000  0.0621 -2.80 2.62
> >>>>> ALPHA -0.0113 0.919  0.0284 -2.35 2.31
> >>>>>
> >>>>> However if I have a NA in the data
> >>>>> test$CL[1] <- NA
> >>>>>
> >>>>> The same command run gives me
> >>>>>
> >>>>>          mean    sd  median   min  max
> >>>>> CL         NA    NA      NA    NA   NA
> >>>>> V1    -0.0545 1.070 -0.2120 -2.21 2.70
> >>>>> V2     0.0600 1.000  0.0621 -2.80 2.62
> >>>>> ALPHA -0.0113 0.919  0.0284 -2.35 2.31
> >>>>>
> >>>>> I know this is because those functions (mean, sd etc.) all have
> >>>>> na.rm=F by default. How can I pass na.rm=T to all these functions
> >>>>> without manually redefining those stats functions?
> >>>>>
> >>>>> Appreciate any comment.
> >>>>>
> >>>>> Thanks for your help.
> >>>>>
> >>>>> Jun
> >>>>>
> >>>>> [[alternative HTML version deleted]]
> >>>>>
> >>>>> ______________________________________________
> >>>>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >>>>> https://stat.ethz.ch/mailman/listinfo/r-help
> >>>>> PLEASE do read the posting guide
> >>>>> http://www.R-project.org/posting-guide.html
> >>>>> and provide commented, minimal, self-contained, reproducible code.
>
> David Winsemius
> Alameda, CA, USA
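
For reference, here is a minimal sketch of the two na.omit placements discussed in the thread, plus one possible way to pass na.rm=TRUE directly, which was the original question. The seed and the names do.stats.narm, out, out.narm and vars are illustrative assumptions, not taken from the posts. The MoreArgs variant only works when every function in stats.func accepts an na.rm argument, so the user-defined N is left out of that call.

```r
# Per-column NA removal, as in the final version posted in the thread:
# na.omit is applied to each vector separately, so each variable keeps
# all of its own non-missing values.
do.stats <- function(data, stats.func, summary.var) {
  cols <- lapply(data[summary.var], na.omit)
  res <- sapply(stats.func, function(func) mapply(func, cols))
  as.data.frame(signif(res, 3))
}

# Hypothetical alternative: hand na.rm = TRUE to every function through
# mapply's MoreArgs. This fails for functions without an na.rm argument.
do.stats.narm <- function(data, stats.func, summary.var) {
  res <- sapply(stats.func, function(func)
    mapply(func, data[summary.var], MoreArgs = list(na.rm = TRUE)))
  as.data.frame(signif(res, 3))
}

# Count of non-missing values, as defined in the thread
N <- function(x) length(x[!is.na(x)])

set.seed(1)  # assumed seed, only so the example is reproducible
test <- data.frame(ID = 1:100, CL = rnorm(100), V1 = rnorm(100),
                   V2 = rnorm(100), ALPHA = rnorm(100))
test$CL[1] <- NA

vars <- c("CL", "V1", "V2", "ALPHA")

out <- do.stats(test,
                stats.func = c("mean", "sd", "median", "min", "max", "N"),
                summary.var = vars)

out.narm <- do.stats.narm(test,
                          stats.func = c("mean", "sd", "median", "min", "max"),
                          summary.var = vars)

# Whole-data-frame na.omit drops the entire first row, which is why every
# variable showed N = 99 in the earlier attempt.
nrow(na.omit(test[vars]))

print(out)
print(out.narm)
```

With the per-column version, N is 99 for CL and 100 for the other three variables, and the five shared statistics agree between the two functions.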