One product of Project MOSAIC (funded by an NSF CCLI grant) is the mosaic package for R, which aims to make R easier to use, especially in teaching situations. We're not quite ready to declare that we've reached version 1.0, but version 0.4 is a fairly large step in that direction. You can find out more about the package on CRAN or by installing it, but here are some of the highlights (some example code appears at the end of this message):

* extensions of syntax to promote consistency across functions and make wider use of the formula interface
* simplified ways of creating and plotting functions, including extracting model fits as functions
* a tally() function that combines features of table() and xtabs() and more in a common syntax
* expanded syntax for summary functions like mean(), median(), max(), sd(), var(), etc. that accepts formulas and data frames
* a do() function that simplifies resampling-based statistical analysis
* numerical integration and differentiation to support using calculus techniques in R
* first drafts of vignettes on teaching resampling and calculus in R
* some functions that add extra features to familiar functions (e.g., xchisq.test(), xhistogram(), xpnorm(), ...)
* some data sets

If you are using mosaic and discover bugs, or have suggestions for future development, consider submitting an issue on our github development site:

http://github.com/rpruim/mosaic/issues/

You can also look there to see what's already on our to-do list.

---rjp (on behalf of the development team that includes Danny Kaplan and Nick Horton)

========================================================================
Randall Pruim                        phone:  616.526.7113
Dept. of Mathematics and Statistics  email:  rpr...@calvin.edu
Calvin College                       office: NH 284
1740 Knollcrest Circle SE            URL:    http://www.calvin.edu/~rpruim/
Grand Rapids, MI 49546-4403          FAX:    616.526.6501
---------------------------------------------

Here are the promised code examples to give you a feel for what mosaic makes possible:

> mean(age, data=HELPrct)
[1] 35.65342
> mean(~age, data=HELPrct)
[1] 35.65342
> mean(age ~ sex, data=HELPrct)
  female     male
36.25234 35.46821
> mean(age ~ sex & treat, data=HELPrct)
 female.no    male.no female.yes   male.yes
  37.56364   35.90173   34.86538   35.03468
> interval(binom.test( ~ eruptions > 3, faithful))
probability of success                  lower                  upper
             0.6433824              0.5832982              0.7003038
> pval(binom.test( ~ eruptions > 3, faithful))
     p.value
2.608528e-06
> xchisq.test(phs)   # physicians health study example (data entry omitted)

        Pearson's Chi-squared test with Yates' continuity correction

data:  phs
X-squared = 24.4291, df = 1, p-value = 7.71e-07

   104.00   10933.00
(  146.52) (10890.48)
[12.34]    [ 0.17]
<-3.51>    < 0.41>

   189.00   10845.00
(  146.48) (10887.52)
[12.34]    [ 0.17]
< 3.51>    <-0.41>

key:
        observed
        (expected)
        [contribution to X-squared]
        <residual>

> model <- lm(length ~ width + sex, KidsFeet)
> L <- makeFun(model)
> L( 9.0, 'B')
       1
24.80017
> L( 9.0, 'B', interval='confidence')
       fit      lwr      upr
1 24.80017 24.30979 25.29055
> xyplot( length ~ width, groups= sex, KidsFeet )   # scatter plot with different symbols for boys and girls
> plotFun(L(x,'B') ~ x, add=TRUE)          # add model fit (for boys) to plot
> plotFun(L(x,'G') ~ x, add=TRUE, lty=2)   # add model fit (for girls) to plot
> rflip(10)   # flip a coin 10 times

Flipping 10 coins [ Prob(Heads) = 0.5 ] ...

T H T H H H T H H T

Result: 6 heads.
> do(2) * rflip(10)   # do that twice; notice that do() extracts interesting info
   n heads tails
1 10     4     6
2 10     6     4
> ladyTastingTea <- do(5000) * rflip(10)   # simulate 5000 ladies tasting tea
> tally(~heads, ladyTastingTea)

    0     1     2     3     4     5     6     7     8     9    10 Total
    5    52   221   573  1032  1227  1027   606   198    52     7  5000
> tally(~heads, ladyTastingTea, format='proportion')

     0      1      2      3      4      5      6      7      8      9     10  Total
0.0010 0.0104 0.0442 0.1146 0.2064 0.2454 0.2054 0.1212 0.0396 0.0104 0.0014 1.0000

# do() extracts useful information from lm objects so that randomization tests are easy.
> do(2) * lm( length ~ width + shuffle(sex), data=KidsFeet )
  Intercept    width       sexG    sigma r-squared
1  9.646822 1.693137 -0.3057453 1.026824 0.4246224
2 11.416739 1.453416  0.4860068 1.013323 0.4396534
> tally( ~ sex & substance, HELPrct )
        substance
sex      alcohol cocaine heroin Total
  female      36      41     30   107
  male       141     111     94   346
  Total      177     152    124   453
> tally( ~ sex | substance, HELPrct )   # auto switch to proportions for conditional distributions
        substance
sex        alcohol   cocaine    heroin
  female 0.2033898 0.2697368 0.2419355
  male   0.7966102 0.7302632 0.7580645
  Total  1.0000000 1.0000000 1.0000000
> favstats(age ~ sex & substance, data=HELPrct)
               min Q1 median Q3 max     mean       sd   n missing
female.alcohol  23 33   37.0 45  58 39.16667 7.980333  36       0
male.alcohol    20 32   38.0 42  58 37.95035 7.575644 141       0
female.cocaine  24 31   34.0 38  49 34.85366 6.195002  41       0
male.cocaine    23 30   33.0 37  60 34.36036 6.889772 111       0
female.heroin   21 29   34.0 39  55 34.66667 8.035839  30       0
male.heroin     19 27   32.5 39  53 33.05319 7.973568  94       0
> D(sin(a*x) ~ x)   # return derivative as a function with parameter a
function (x, a)
cos(a * x) * a