Dear readers,

lm returns all the information needed to reconstruct summary statistics by 
group. t.test, however, returns only the group means, not the group SDs or 
even the group Ns. These cannot be reconstructed from the test statistic and 
df, because the df are already pooled, except under the very strict assumption 
of equal group sizes and equal variances.
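
To make the point about pooled df concrete, here is a minimal sketch using 
simulated data (the vector x and the two splits are made up for illustration 
and are unrelated to the example further down). Two different group-size 
splits give the same df, so the df alone cannot identify the group Ns:

```r
# Hypothetical illustration data: 60 simulated observations
set.seed(1)
x <- rnorm(60)

# Same 60 observations, split into groups of different sizes
t_a <- t.test(x[1:30], x[31:60], var.equal = TRUE)  # n1 = 30, n2 = 30
t_b <- t.test(x[1:10], x[11:60], var.equal = TRUE)  # n1 = 10, n2 = 50

# With var.equal = TRUE the df is n1 + n2 - 2 in both cases
df_a <- unname(t_a$parameter)
df_b <- unname(t_b$parameter)
identical(df_a, df_b)
```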

I need these summary statistics for a package that performs Bayesian inference 
for frequentist analyses, through normal approximation of the posterior. To 
make the package as user-friendly as possible, I would like it to have S3 
methods for commonly used frequentist analyses in R, such as lm and t.test.

As per the R-project feature request guidelines, I would like to gauge how 
people feel about adding functionality to t.test so that it returns either 
the model data, as lm() does, or full summary statistics per group. I can 
mask the t.test function with an enhanced version, but I feel there is 
value in making this functionality available to all.

Thank you sincerely for your input. Below is a reproducible example 
illustrating the problem.

Best,
Caspar

d <- structure(list(
  Petal.Length = c(
    1.4, 1.4, 1.3, 1.5, 1.4, 1.7, 1.4, 1.5, 1.4, 1.5,
    1.5, 1.6, 1.4, 1.1, 1.2, 1.5, 1.3, 1.4, 1.7, 1.5,
    1.7, 1.5, 1, 1.7, 1.9, 1.6, 1.6, 1.5, 1.4, 1.6,
    1.6, 1.5, 1.5, 1.4, 1.5, 1.2, 1.3, 1.4, 1.3, 1.5,
    1.3, 1.3, 1.3, 1.6, 1.9, 1.4, 1.6, 1.4, 1.5, 1.4,
    4.7, 4.5, 4.9, 4, 4.6, 4.5, 4.7, 3.3, 4.6, 3.9,
    3.5, 4.2, 4, 4.7, 3.6, 4.4, 4.5, 4.1, 4.5, 3.9,
    4.8, 4, 4.9, 4.7, 4.3, 4.4, 4.8, 5, 4.5, 3.5,
    3.8, 3.7, 3.9, 5.1, 4.5, 4.5, 4.7, 4.4, 4.1, 4,
    4.4, 4.6, 4, 3.3, 4.2, 4.2, 4.2, 4.3, 3, 4.1),
  Species = structure(rep(c(1L, 2L), each = 50),
                      .Label = c("setosa", "versicolor"),
                      class = "factor")),
  row.names = c(NA, 100L), class = "data.frame")
# lm model
m_lm <- lm(Petal.Length ~ Species, d)
# Extract group means:
aggregate(m_lm$model$Petal.Length, list(m_lm$model$Species), mean)
# Extract group SDs:
aggregate(m_lm$model$Petal.Length, list(m_lm$model$Species), sd)
# Extract group Ns:
table(m_lm$model$Species)

# t.test model
m_t <- t.test(d$Petal.Length[1:50], d$Petal.Length[51:100], var.equal = TRUE)
# Extract group means:
m_t$estimate
# Extract group SDs:
# Not available
# Extract group Ns:
# Not available
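
For reference, the masking workaround mentioned above could look roughly like 
the sketch below. The function name t_test_stats and the element names 
group_sd and group_n are hypothetical, chosen only for this illustration; 
they are not part of t.test or the "htest" class. The sketch covers only the 
two-vector interface, not the formula interface, and uses the built-in iris 
data so that it runs on its own:

```r
# Hypothetical wrapper: runs stats::t.test on two numeric vectors and
# attaches per-group SDs and Ns to the returned "htest" object.
t_test_stats <- function(x, y, ...) {
  out <- stats::t.test(x, y, ...)
  # Drop missing values before computing the summaries
  x <- x[!is.na(x)]
  y <- y[!is.na(y)]
  # Extra list elements (names are assumptions made for this sketch)
  out$group_sd <- c(x = stats::sd(x), y = stats::sd(y))
  out$group_n  <- c(x = length(x), y = length(y))
  out
}

setosa     <- iris$Petal.Length[iris$Species == "setosa"]
versicolor <- iris$Petal.Length[iris$Species == "versicolor"]
m_t2 <- t_test_stats(setosa, versicolor, var.equal = TRUE)

m_t2$estimate  # group means, as before
m_t2$group_sd  # group SDs
m_t2$group_n   # group Ns
```

Because the result keeps class "htest", existing print methods should still 
work, but every package that needs these statistics would have to ship its own 
wrapper, which is why I think built-in support would be preferable.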

Dr. Caspar J. van Lissa
Assistant professor of developmental data science
Utrecht University, dept. Methodology & Statistics
Sjoerd Groenmangebouw C1.01, 3584CH Utrecht, the Netherlands.
Secretariat: +31 30 253 4438



______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
