On Apr 23, 2010, at 3:48 PM, Maxim wrote:

I have a very simple question, but I'm obviously not able to solve the
problem on my own.

I have a data.frame like

sample(c("A","B","C"),size=20,replace = T)->type

rnorm(20)->value

data.frame(ty=type,val=value)->test

There must be some built-in functions that will do descriptive
statistics with tabular output. In the end I would like to have something like

  number of samples  mean  sd  .............
A                 5
B                 9
C                 6

So I need a function that counts the number of occurrences of factors in
type and then does something like the *summary* function, but factor
specific.

I tried:
vector()->Median
vector()->SD
vector()->Mean

as.data.frame(table(type))->int
for (count in c(1:(nrow(int))))
    {
subset(test, ty==as.character(int$type[count])) -> subtest
median(subtest$val)->Median[count]
sd(subtest$val)->SD[count]
mean(subtest$val)->Mean[count]
}


cbind(int,Median,SD,Mean)

> require(Design) # loads Hmisc, which has one of many versions of describe()
> describe(test)
test

 2  Variables      20  Observations
-------------------------------------------------------------------------
ty
      n missing  unique
     20       0       3

A (4, 20%), B (5, 25%), C (11, 55%)
-------------------------------------------------------------------------
val
        n   missing    unique      Mean       .05       .10       .25
       20         0        20   0.07383 -0.865776 -0.815317 -0.707465
      .50       .75       .90       .95
 0.005735  0.634226  1.270066  1.771820

lowest : -1.7965 -0.8168 -0.8152 -0.8040 -0.7170
highest:  0.6790  1.0680  1.2149  1.7665  1.8729
-------------------------------------------------------------------------


> require(doBy)
> summaryBy(value~ty, test, FUN=list(length, mean, min, max, sd, median))
  ty value.length  value.mean  value.min value.max  value.sd
1  A            4 -0.03442822 -0.8151531  1.766502 1.2258221
2  B            5  0.34541927 -0.8167919  1.214906 0.7647165
3  C           11 -0.01025352 -1.7964684  1.872865 1.0109676
  value.median
1  -0.54453098
2   0.57020532
3  -0.06826249


The by() function which is an application of tapply can also be used.
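
A minimal sketch of that approach using only base R, assuming a data frame shaped like `test` from the question. The seed and the names `stats` and `describe_group` are illustrative additions, not part of the original post:

```r
# Rebuild a data frame like the one in the question (seed added for repeatability)
set.seed(1)
test <- data.frame(ty  = sample(c("A", "B", "C"), size = 20, replace = TRUE),
                   val = rnorm(20))

# by(): run summary() on val separately for each level of ty
by(test$val, test$ty, summary)

# tapply(): one statistic per group, returned with the group labels as names
tapply(test$val, test$ty, mean)

# Several statistics at once: split val by group, compute a named vector
# for each piece, then transpose so that each group is a row
describe_group <- function(v) {
  c(n = length(v), mean = mean(v), sd = sd(v), median = median(v))
}
stats <- t(sapply(split(test$val, test$ty), describe_group))
stats
```

The last step yields a matrix with one row per level of `ty` and one column per statistic, which is close to the table layout asked for; `aggregate(val ~ ty, data = test, FUN = mean)` is another base option when a single statistic is enough.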

This works, but isn't it much too complicated? I bet such functionality
is built into the base packages, but I cannot find it.


Maxim


David Winsemius, MD
West Hartford, CT

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.