# plyr

plyr is a set of tools for a common set of problems: you need to __split__ up a big data structure into homogeneous pieces, __apply__ a function to each piece and then __combine__ all the results back together. For example, you might want to:
* fit the same model to each patient subset of a data frame
* quickly calculate summary statistics for each group
* perform group-wise transformations like scaling or standardising

It's already possible to do this with base R functions (like split and the apply family of functions), but plyr makes it all a bit easier with:

* totally consistent names, arguments and outputs
* convenient parallelisation through the foreach package
* input from and output to data.frames, matrices and lists
* progress bars to keep track of long running operations
* built-in error recovery, and informative error messages
* labels that are maintained across all transformations

Considerable effort has been put into making plyr fast and memory efficient, and in many cases plyr is as fast as, or faster than, the built-in functions.

You can find out more at http://had.co.nz/plyr/, including a 20 page introductory guide, http://had.co.nz/plyr/plyr-intro.pdf. You can ask questions about plyr (and data-manipulation in general) on the plyr mailing list. Sign up at http://groups.google.com/group/manipulatr

Version 1.4 (2011-01-03)
------------------------------------------------------------------------------

* `count` now takes an additional parameter `wt_var` which allows you to compute weighted sums. This is as fast as, or faster than, `tapply` or `xtabs`.

* Really fix bug in `names.quoted`

* `.` now captures the environment in which it was evaluated. This should fix an esoteric class of bugs which probably no-one ever encountered, but will form the basis for an improved version of `ggplot2::aes`.
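
As a rough illustration of the new `wt_var` parameter, the sketch below compares an unweighted and a weighted `count`. The `survey` data frame and its column names are made up for this example; only `count`, `vars` and `wt_var` come from plyr itself.

```r
library(plyr)

# Hypothetical toy data: one row per respondent, with a sampling weight.
survey <- data.frame(
  region = c("north", "north", "south", "south", "south"),
  weight = c(2, 3, 1, 1, 4)
)

# Unweighted: `freq` is the number of rows in each region.
unweighted <- count(survey, vars = "region")

# Weighted: `freq` is the sum of `weight` within each region.
weighted <- count(survey, vars = "region", wt_var = "weight")

print(unweighted)
print(weighted)
```

Both calls return a data frame with one row per distinct value of `region` and a `freq` column; only the meaning of `freq` changes when `wt_var` is supplied.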

Version 1.3.1 (2010-12-30)
------------------------------------------------------------------------------

* Fix bug in `names.quoted` that interfered with ggplot2

Version 1.3 (2010-12-28)
------------------------------------------------------------------------------

NEW FEATURES

* new function `mutate` that works like transform to add new columns or overwrite existing columns, but computes new columns iteratively so later transformations can use columns created by earlier transformations. (It's also about 10x faster.) (Fixes #21)

BUG FIXES

* split column names are no longer coerced to valid R names.

* `quickdf` now adds names if missing

* `summarise` preserves variable names if explicit names not provided (Fixes #17)

* `arrays` with names should be sorted correctly once again (also fixed a bug in the test case that prevented me from catching this automatically)

* `m_ply` no longer possesses .parallel argument (mistakenly added)

* `ldply` (and hence `adply` and `ddply`) now correctly passes on .parallel argument (Fixes #16)

* `id` uses a better strategy for converting to integers, making it possible to use for cases with larger potential numbers of combinations

--
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/