On 11/08/13 15:02, Roy Smith wrote:
In article <mailman.479.1376221844.1251.python-l...@python.org>,
  Skip Montanaro <s...@pobox.com> wrote:

See the Rationale of PEP 450 for more reasons why “install NumPy” is not
a feasible solution for many use cases, and why having ‘statistics’ as a
pure-Python, standard-library package is desirable.

I read that before posting but am not sure I agree. I don't see the
screaming need for this package.  Why can't it continue to live on
PyPI, where, once again, it is available as "pip install ..."?

My previous comments on this topic were along the lines of "installing
numpy is a non-starter if all you need are simple mean/std-dev".  You
do, however, make a good point here.  Running "pip install statistics"
is a much lower barrier to entry than getting numpy going, especially if
statistics is pure python and thus has no dependencies on compiler tool
chains which may be missing.

Still, I see two classes of function in PEP-450.  Class 1 is the really
basic stuff:

* mean
* std-dev

Class 2 are the more complicated things like:

* linear regression
* median
* mode
* functions for calculating the probability of random variables
   from the normal, t, chi-squared, and F distributions
* inference on the mean
* anything that differentiates between population and sample

I could see leaving class 2 stuff in an optional pure-python module to
be installed by pip, but for (as the PEP phrases it), the simplest and
most obvious statistical functions (into which I lump mean and std-dev),
having them in the standard library would be a big win.
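
For illustration only, here is a minimal pure-Python sketch of the two "class 1" functions named above. The names mean and std_dev and the sample flag are my own assumptions and are not taken from PEP 450. The flag shows why even std-dev touches the population versus sample distinction listed under class 2.

```python
import math


def mean(data):
    """Arithmetic mean of a non-empty iterable of numbers."""
    data = list(data)
    if not data:
        raise ValueError("mean requires at least one data point")
    # math.fsum limits the rounding error that plain sum() accumulates.
    return math.fsum(data) / len(data)


def std_dev(data, sample=True):
    """Standard deviation (hypothetical helper, not the PEP 450 API).

    sample=True divides by n - 1 (sample estimate);
    sample=False divides by n (population value).
    """
    data = list(data)
    n = len(data)
    min_points = 2 if sample else 1
    if n < min_points:
        raise ValueError("not enough data points")
    m = mean(data)
    sum_sq = math.fsum((x - m) ** 2 for x in data)
    return math.sqrt(sum_sq / (n - 1 if sample else n))
```

With data = [2, 4, 4, 4, 5, 5, 7, 9] the mean is 5 and std_dev(data, sample=False) is 2; the sample version divides by 7 instead of 8 and so comes out slightly larger.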

I would probably move other descriptive statistics (median, mode, correlation, ...) into Class 1.

I roll my own statistical tests as I need them - simply to avoid having a dependency on R. But I generally do end up with a dependency on scipy because I need scipy.stats.distributions. So I guess a distinct library for probability distributions would be handy - but maybe it should not be in the standard library.
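
As a rough sketch of where that dependency comes from: the normal distribution is the one case that the standard library can handle by itself, because math.erf is available. The function name normal_cdf is my own assumption and not part of any existing package.

```python
import math


def normal_cdf(x, mu=0.0, sigma=1.0):
    """P(X <= x) for X ~ Normal(mu, sigma), using only the math module.

    Hypothetical helper for illustration; not an existing library API.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / (sigma * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))
```

The t, chi-squared and F distributions need incomplete gamma and beta functions, which the math module does not provide, so this approach does not extend to them without more work.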

Once we move on to statistical modelling (e.g. linear regression) I think the case for inclusion in the standard library becomes weaker still. Cheers.

