[issue36018] Add a Normal Distribution class to the statistics module

2020-01-27 Thread Raymond Hettinger


Raymond Hettinger  added the comment:


New changeset 41f4dc3bcf30cb8362a062a26818311c704ea89f by Raymond Hettinger 
(Miss Islington (bot)) in branch '3.8':
bpo-36018: Minor fixes to the NormalDist() examples and recipes. (GH-18226) 
(GH-18227)
https://github.com/python/cpython/commit/41f4dc3bcf30cb8362a062a26818311c704ea89f


--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue36018] Add a Normal Distribution class to the statistics module

2020-01-27 Thread miss-islington


Change by miss-islington :


--
pull_requests: +17607
pull_request: https://github.com/python/cpython/pull/18227




[issue36018] Add a Normal Distribution class to the statistics module

2020-01-27 Thread Raymond Hettinger


Raymond Hettinger  added the comment:


New changeset 01bf2196d842fc20667c5336e0a7a77eb4fdc25c by Raymond Hettinger in 
branch 'master':
bpo-36018: Minor fixes to the NormalDist() examples and recipes. (GH-18226)
https://github.com/python/cpython/commit/01bf2196d842fc20667c5336e0a7a77eb4fdc25c


--




[issue36018] Add a Normal Distribution class to the statistics module

2020-01-27 Thread Raymond Hettinger


Change by Raymond Hettinger :


--
pull_requests: +17606
pull_request: https://github.com/python/cpython/pull/18226




[issue36018] Add a Normal Distribution class to the statistics module

2020-01-25 Thread Raymond Hettinger


Raymond Hettinger  added the comment:


New changeset eebcff8c071b38b53bd429892524ba8518cbeb98 by Raymond Hettinger 
(Miss Islington (bot)) in branch '3.8':
bpo-36018: Add another example for NormalDist() (GH-18191) (GH-18192)
https://github.com/python/cpython/commit/eebcff8c071b38b53bd429892524ba8518cbeb98


--




[issue36018] Add a Normal Distribution class to the statistics module

2020-01-25 Thread Raymond Hettinger


Raymond Hettinger  added the comment:


New changeset 10355ed7f132ed10f1e0d8bd64ccb744b86b1cce by Raymond Hettinger in 
branch 'master':
bpo-36018: Add another example for NormalDist() (#18191)
https://github.com/python/cpython/commit/10355ed7f132ed10f1e0d8bd64ccb744b86b1cce


--




[issue36018] Add a Normal Distribution class to the statistics module

2020-01-25 Thread miss-islington


Change by miss-islington :


--
pull_requests: +17575
pull_request: https://github.com/python/cpython/pull/18192




[issue36018] Add a Normal Distribution class to the statistics module

2020-01-25 Thread Raymond Hettinger


Change by Raymond Hettinger :


--
pull_requests: +17574
pull_request: https://github.com/python/cpython/pull/18191




[issue36018] Add a Normal Distribution class to the statistics module

2019-09-08 Thread Raymond Hettinger


Raymond Hettinger  added the comment:


New changeset cc1bdf91d53b1a4751be84ef607e24e69a327a9b by Raymond Hettinger in 
branch '3.8':
[3.8] bpo-36018: Address more reviewer feedback (GH-15733) (GH-15734)
https://github.com/python/cpython/commit/cc1bdf91d53b1a4751be84ef607e24e69a327a9b


--




[issue36018] Add a Normal Distribution class to the statistics module

2019-09-08 Thread Raymond Hettinger


Change by Raymond Hettinger :


--
pull_requests: +15389
pull_request: https://github.com/python/cpython/pull/15734




[issue36018] Add a Normal Distribution class to the statistics module

2019-09-08 Thread Raymond Hettinger


Raymond Hettinger  added the comment:


New changeset 4db25d5c39e369f4b55eab52dc8f87f390233892 by Raymond Hettinger in 
branch 'master':
bpo-36018: Address more reviewer feedback (GH-15733)
https://github.com/python/cpython/commit/4db25d5c39e369f4b55eab52dc8f87f390233892


--




[issue36018] Add a Normal Distribution class to the statistics module

2019-09-08 Thread Raymond Hettinger


Change by Raymond Hettinger :


--
pull_requests: +15388
pull_request: https://github.com/python/cpython/pull/15733




[issue36018] Add a Normal Distribution class to the statistics module

2019-07-21 Thread Steven D'Aprano


Steven D'Aprano  added the comment:

I have a query about the documentation:

   The default *method* is "exclusive" and is used for data sampled
   from a population that can have more extreme values than found 
   in the samples. ...
   Setting the *method* to "inclusive" is used for describing 
   population data or for samples that include the extreme points.

In all my reading about quantile calculation methods, this is the first time 
I've come across this recommendation. Do you have a source for it or a 
justification?

Thanks.
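
For readers following the question, here is a minimal sketch of how the two methods differ on a small data set. The data values are made up for illustration; the only API assumed is statistics.quantiles() with its documented *method* argument (Python 3.8 or later).

```python
from statistics import quantiles

# Illustrative data only; not taken from the thread.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# Default method: treats the data as a sample from a larger population
# that may contain values beyond the observed minimum and maximum.
exclusive = quantiles(data, n=4)

# Treats the observed minimum and maximum as the 0th and 100th percentiles.
inclusive = quantiles(data, n=4, method='inclusive')

print(exclusive)  # [2.5, 5.0, 7.5]
print(inclusive)  # [3.0, 5.0, 7.0]
```

The exclusive quartiles are spread wider than the inclusive ones, which is the behaviour the quoted documentation is trying to describe.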

--




[issue36018] Add a Normal Distribution class to the statistics module

2019-07-21 Thread Raymond Hettinger


Change by Raymond Hettinger :


--
pull_requests: +14678
pull_request: https://github.com/python/cpython/pull/14898




[issue36018] Add a Normal Distribution class to the statistics module

2019-05-01 Thread Raymond Hettinger


Raymond Hettinger  added the comment:


New changeset 671d782f8dc52942dc8c48a513bf24ff8465b112 by Raymond Hettinger in 
branch 'master':
bpo-36018: Update example to show mean and stdev (GH-13047)
https://github.com/python/cpython/commit/671d782f8dc52942dc8c48a513bf24ff8465b112


--




[issue36018] Add a Normal Distribution class to the statistics module

2019-05-01 Thread Raymond Hettinger


Change by Raymond Hettinger :


--
pull_requests: +12966




[issue36018] Add a Normal Distribution class to the statistics module

2019-04-30 Thread Raymond Hettinger


Raymond Hettinger  added the comment:


New changeset b0a2c0fa83f9b79616ccf451687096542de1e6f8 by Raymond Hettinger in 
branch 'master':
bpo-36018: Test idempotence. Test two methods against one-another. (GH-13021)
https://github.com/python/cpython/commit/b0a2c0fa83f9b79616ccf451687096542de1e6f8


--




[issue36018] Add a Normal Distribution class to the statistics module

2019-04-30 Thread Raymond Hettinger


Change by Raymond Hettinger :


--
pull_requests: +12943




[issue36018] Add a Normal Distribution class to the statistics module

2019-04-23 Thread Raymond Hettinger


Raymond Hettinger  added the comment:


New changeset fb8c7d53326d137785ca311bfc48c8284da46770 by Raymond Hettinger in 
branch 'master':
bpo-36018: Make "seed" into a keyword only argument (GH-12921)
https://github.com/python/cpython/commit/fb8c7d53326d137785ca311bfc48c8284da46770


--




[issue36018] Add a Normal Distribution class to the statistics module

2019-04-23 Thread Raymond Hettinger


Change by Raymond Hettinger :


--
pull_requests: +12849




[issue36018] Add a Normal Distribution class to the statistics module

2019-03-11 Thread Steven D'Aprano


Steven D'Aprano  added the comment:

> I'm assuming you meant 5.374 rather than 5.372 in the first Nspire result.

Yes, that was a typo, sorry.

Thanks for checking into the results.

--




[issue36018] Add a Normal Distribution class to the statistics module

2019-03-11 Thread Mark Dickinson


Mark Dickinson  added the comment:

Below is the full transcript from Pari/GP: note that I converted the float 
inputs to exact Decimal equivalents, assuming IEEE 754 binary64. Summary: both 
Python results look fine; it's Nspire that's inaccurate here.


mirzakhani:~ mdickinson$ /opt/local/bin/gp
GP/PARI CALCULATOR Version 2.11.1 (released)
i386 running darwin (x86-64/GMP-6.1.2 kernel) 64-bit version
compiled: Jan 24 2019, Apple LLVM version 10.0.0 (clang-1000.11.45.5)
threading engine: single
(readline v8.0 enabled, extended help enabled)

Copyright (C) 2000-2018 The PARI Group

PARI/GP is free software, covered by the GNU General Public License, and comes 
WITHOUT ANY WARRANTY WHATSOEVER.

Type ? for help, \q to quit.
Type ?17 for how to get moral (and possibly technical) support.

parisize = 800, primelimit = 50
? \p 200
   realprecision = 211 significant digits (200 digits displayed)
? ncdf(x, mu, sig) = (2 - erfc((x - mu) / sig / sqrt(2))) / 2
%1 = (x,mu,sig)->(2-erfc((x-mu)/sig/sqrt(2)))/2
? ncdf(5.37366604491419275291264057159423828125, 2, 1.3000444089209850062616169452667236328125)
%2 = 
0.99527574392076815760565921436860970675961162900034485433923192853608778325191325235412640687571628164064779657215907190523884572141701976336760387216713270956350229484865180142256611330976179584951493
? ncdf(-0.2300099920072216264088638126850128173828125, 2, 1.3000444089209850062616169452667236328125)
%3 = 
0.043137367078910025352120502108682523151629166877357644882244088336773338416883044522024586619860574718679715351558322591944140762629090301623352497457372937783778706411712862062109829239761761597057063
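
A rough Python counterpart of the session above, offered only as a sketch: ncdf() mirrors the GP definition but is evaluated in binary64, so it can agree with the 200-digit results only to about 16 significant digits. Decimal(float) is used to show the exact value stored for a float literal; the name exact_x is just for illustration.

```python
from decimal import Decimal
from math import erfc, sqrt
from statistics import NormalDist

def ncdf(x, mu, sig):
    # Same formula as in the GP session, but in double precision.
    return (2 - erfc((x - mu) / sig / sqrt(2))) / 2

# Decimal(float) converts exactly, exposing the binary64 value behind 5.374.
exact_x = Decimal(5.374)
print(exact_x)

# Compare the hand-written formula with NormalDist.cdf for both inputs.
for x in (5.374, -0.23):
    print(x, ncdf(x, 2, 1.3), NormalDist(2, 1.3).cdf(x))
```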

--




[issue36018] Add a Normal Distribution class to the statistics module

2019-03-11 Thread Mark Dickinson


Mark Dickinson  added the comment:

According to GP/Pari, the correct value for the first result, to the first 
few dozen places, is:

0.995275743920768157605659214368609706759611629000344854339231928536087783251913252354...

I'm assuming you meant 5.374 rather than 5.372 in the first Nspire result.

--




[issue36018] Add a Normal Distribution class to the statistics module

2019-03-11 Thread Steven D'Aprano

Steven D'Aprano  added the comment:

I've done some spot checks of NormalDist.pdf and .cdf and compared the results 
to those returned by my TI Nspire calculator.

So far, the PDF has matched that of the Nspire to 12 decimal places (the limit 
the calculator will show), but the CDF differs on or about the 8th decimal 
place:


py> x = statistics.NormalDist(2, 1.3)
py> x.cdf(5.374)
0.9952757439207682
# Nspire normCdf(-∞, 5.372, 2, 1.3) returns 0.995275710979
# difference of 3.294176820212158e-08


py> x.cdf(-0.23)
0.04313736707891003
# Nspire normCdf(-∞, -0.23, 2, 1.3) returns 0.043137332077
# difference of 3.500191003008579e-08

Wolfram Alpha doesn't help me decide which is correct, as it doesn't show 
enough decimal places.

https://www.wolframalpha.com/input/?i=CDF[+NormalDistribution[2,+1.3],+5.374+]

https://www.wolframalpha.com/input/?i=CDF[+NormalDistribution[2,+1.3],+-0.23+]


Do we care about this difference? Should I raise a new ticket for it?
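
A minimal sketch of this kind of spot check, done without a calculator: the PDF is compared against the textbook density formula, and the size of the CDF gap is recomputed from the figures quoted above. The helper name normal_pdf is hypothetical, and the Nspire values are simply copied from this message.

```python
from math import exp, isclose, pi, sqrt
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    # Textbook density of N(mu, sigma**2), written independently of NormalDist.
    return exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * sqrt(2 * pi))

dist = NormalDist(2, 1.3)
for x in (5.374, -0.23):
    # The two PDF computations should agree to near machine precision.
    assert isclose(dist.pdf(x), normal_pdf(x, 2, 1.3), rel_tol=1e-12)

# Gap between the Python and Nspire CDF values quoted above.
gap_hi = 0.9952757439207682 - 0.995275710979
gap_lo = 0.04313736707891003 - 0.043137332077
print(gap_hi, gap_lo)
```

Both gaps come out at a few parts in 10**8, matching the differences reported in the comment lines.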

--




[issue36018] Add a Normal Distribution class to the statistics module

2019-02-28 Thread miss-islington


miss-islington  added the comment:


New changeset 9add4b3317629933d88cf206a24b15e922fa8941 by Miss Islington (bot) 
(Raymond Hettinger) in branch 'master':
bpo-36018: Add documentation link to "random variable" (GH-12114)
https://github.com/python/cpython/commit/9add4b3317629933d88cf206a24b15e922fa8941


--




[issue36018] Add a Normal Distribution class to the statistics module

2019-02-28 Thread Raymond Hettinger


Change by Raymond Hettinger :


--
pull_requests: +12120




[issue36018] Add a Normal Distribution class to the statistics module

2019-02-28 Thread miss-islington


miss-islington  added the comment:


New changeset ef17fdbc1c274dc84c2f611c40449ab84824607e by Miss Islington (bot) 
(Raymond Hettinger) in branch 'master':
bpo-36018: Add special value tests and make minor tweaks to the docs (GH-12096)
https://github.com/python/cpython/commit/ef17fdbc1c274dc84c2f611c40449ab84824607e


--




[issue36018] Add a Normal Distribution class to the statistics module

2019-02-28 Thread Raymond Hettinger


Change by Raymond Hettinger :


--
pull_requests: +12103




[issue36018] Add a Normal Distribution class to the statistics module

2019-02-24 Thread Raymond Hettinger


Change by Raymond Hettinger :


--
resolution:  -> fixed
stage: patch review -> resolved
status: open -> closed




[issue36018] Add a Normal Distribution class to the statistics module

2019-02-24 Thread miss-islington


miss-islington  added the comment:


New changeset 9e456bc70e7bc9ee9726d356d7167457e585fd4c by Miss Islington (bot) 
(Raymond Hettinger) in branch 'master':
bpo-36018: Add properties for mean and stdev (GH-12022)
https://github.com/python/cpython/commit/9e456bc70e7bc9ee9726d356d7167457e585fd4c


--




[issue36018] Add a Normal Distribution class to the statistics module

2019-02-24 Thread Raymond Hettinger


Change by Raymond Hettinger :


--
pull_requests: +12054




[issue36018] Add a Normal Distribution class to the statistics module

2019-02-23 Thread miss-islington


miss-islington  added the comment:


New changeset 79fbcc597dfd039d3261fffcb519b5ec5a18df9d by Miss Islington (bot) 
(Raymond Hettinger) in branch 'master':
bpo-36018: Make __pos__ return a distinct instance of NormDist (GH-12009)
https://github.com/python/cpython/commit/79fbcc597dfd039d3261fffcb519b5ec5a18df9d


--
nosy: +miss-islington




[issue36018] Add a Normal Distribution class to the statistics module

2019-02-23 Thread Raymond Hettinger


Raymond Hettinger  added the comment:

Steven, Davin, Michael:  Thanks for the encouragement and taking the time to 
review this code.

--




[issue36018] Add a Normal Distribution class to the statistics module

2019-02-23 Thread Raymond Hettinger


Change by Raymond Hettinger :


--
pull_requests: +12040




[issue36018] Add a Normal Distribution class to the statistics module

2019-02-23 Thread Raymond Hettinger


Raymond Hettinger  added the comment:

> Why use object.__setattr__(self, 'mu', mu) instead of 
> self.mu = mu in the __init__ method?

The idea was that instances should be immutable and hashable, but this added 
unnecessary complexity, so I took it out prior to the check-in.

> Should __pos__ return a copy rather than the instance itself?

Yes.  I'll fix that straight away.

> The choice of using mu versus xbar was deliberate

I concur with that choice and also prefer to stick with mu and sigma:

1) It's too late to change it elsewhere in the statistics and random modules.
2) Having attribute names the same as function names in the same module is 
confusing.
3) I had already user tested this API in some Python courses.
4) The variable names match the various external sources I've linked to in the 
docs.
5) Python historically hasn't shied away from Greek letter names (math: pi, 
tau, gamma; random: alpha, beta, lambd, mu, sigma).
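
A small sketch of the __pos__ behaviour after the fix, assuming a Python version that includes the changes referenced in this thread (GH-12009 for __pos__, GH-12022 for the mean and stdev properties). The parameter values are illustrative.

```python
from statistics import NormalDist

x = NormalDist(2, 1.3)
y = +x

print(y == x)           # True: same mu and sigma
print(y is x)           # False: unary plus returns a separate instance
print(x.mean, x.stdev)  # the parameters, exposed as properties
```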

--




[issue36018] Add a Normal Distribution class to the statistics module

2019-02-23 Thread Davin Potts


Davin Potts  added the comment:

Steven: Your point about population versus sample makes sense and your point 
that altering their names would be a breaking change is especially important.  
I think that pretty well puts an end to my suggestion of alternative names and 
says the current pattern should be kept with NormalDist.

I particularly like the idea of using the TI Nspire and Casio Classpad to guide 
or help confirm what symbols might be recognizable to secondary students or 1st 
year university students.


Raymond: As an idea for examples demonstrating the code, what about an example 
where a plot of pdf is created, possibly for comparison with cdf?  This would 
require something like matplotlib but would help to visually communicate the 
concepts of pdf, perhaps with different sigma values?
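
As a text-only sketch of that idea, the loop below tabulates pdf and cdf on a small grid for several sigma values. The grid and the sigma values are arbitrary choices for illustration, and no plotting library is assumed.

```python
from statistics import NormalDist

xs = [-2.0, -1.0, 0.0, 1.0, 2.0]   # illustrative grid
sigmas = (0.5, 1.0, 2.0)           # illustrative spreads

for sigma in sigmas:
    dist = NormalDist(0.0, sigma)
    pdf_row = ' '.join(f'{dist.pdf(x):6.3f}' for x in xs)
    cdf_row = ' '.join(f'{dist.cdf(x):6.3f}' for x in xs)
    print(f'sigma={sigma}: pdf {pdf_row} | cdf {cdf_row}')

# Height of each density at the mean: it drops as sigma grows.
peaks = [NormalDist(0.0, s).pdf(0.0) for s in sigmas]
print(peaks)
```

The same lists could be handed to a plotting package such as matplotlib; that step is left out here because it is a third-party dependency.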

--




[issue36018] Add a Normal Distribution class to the statistics module

2019-02-23 Thread Steven D'Aprano


Steven D'Aprano  added the comment:

Davin: the choice of using mu versus xbar was deliberate, as they represent 
different quantities: the population mean versus a sample mean. But reading 
over the docs with fresh eyes, I can now see that the distinction is not as 
clear as I intended.

I think that changing the names now would be a breaking change, but even if it 
wasn't, I don't want to change the names. The distinction between population 
parameters (mu) and sample statistics (xbar) is important and I think the 
function parameters should reflect that.

As for the new NormalDist class, we aren't limited by backwards compatibility, 
but I would still argue for the current names mu and sigma. As well as matching 
the population parameters of the distribution, they also match the names used 
in calculators such as the TI Nspire and Casio Classpad (two very popular CAS 
calculators used by secondary school students).

See #36099. If you would like to suggest some doc changes, please feel free to 
do so.
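
To make the distinction concrete, a minimal sketch with made-up data: the same precomputed mean is passed as mu to the population function and as xbar to the sample function, and the two results differ by the usual n versus n - 1 divisor.

```python
from statistics import mean, pvariance, variance

# Illustrative data only; not taken from the thread.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
m = mean(data)

# Treat the data as the whole population: m plays the role of mu.
pop_var = pvariance(data, mu=m)

# Treat the data as a sample: m plays the role of xbar.
sample_var = variance(data, xbar=m)

print(m, pop_var, sample_var)
```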

--




[issue36018] Add a Normal Distribution class to the statistics module

2019-02-23 Thread Steven D'Aprano


Steven D'Aprano  added the comment:

Karthikeyan: thanks for the hint about Github.

Raymond: thanks for the diff. Some comments:

Why use object.__setattr__(self, 'mu', mu) instead of self.mu = mu in the 
__init__ method?

Should __pos__ return a copy rather than the instance itself?

The rest looks good to me, and I look forward to using it.

--




[issue36018] Add a Normal Distribution class to the statistics module

2019-02-23 Thread Davin Potts


Davin Potts  added the comment:

There is an inconsistency worth paying attention to in the choice of names of 
the input parameters.

Currently in the statistics module, pvariance() and pstdev() each accept a 
parameter named "mu", while variance() and stdev() each accept a parameter 
named "xbar".  The docs 
describe both "mu" and "xbar" as "it should be the mean of data".  I suggest it 
is worth rationalizing the names used within the statistics module for 
consistency before reusing "mu" or "xbar" or anything else in NormalDist.

Using the names of mathematical symbols that are commonly used to represent a 
concept is potentially confusing because those symbols are not always 
*universally* used.  For example, students are often introduced to new concepts 
in introductory mathematics texts where concepts such as "mean" appear in 
formulas and equations not as "mu" but as "xbar" or simply "m" or other simple 
(and hopefully "friendly") names/symbols.  As a mathematician, if I am told a 
variable is named, "mu", I still feel the need to ask what it represents.  
Sure, I can try guessing based upon context but I will usually have more than 
one guess that I could make.

Rather than continue down a path of using various 
mathematical-symbols-written-out-in-English-spelling, one alternative would be 
to use less ambiguous, more informative variable names such as "mean".  It 
might be worth considering a change to the parameter names of "mu" and "sigma" 
in NormalDist to names like "mean" and "stddev", respectively.  Or perhaps 
"mean" and "standard_deviation".  Or perhaps "mean" and "variance" would be 
easier still (recognizing that variance can be readily computed from standard 
deviation in this particular context).  In terms of consistency with other 
packages that users are likely to also use, scipy.stats functions/objects 
commonly refer to these concepts as "mean" and "var".

I like the idea of making NormalDist readily approachable for students as well 
as those more familiar with these concepts.  The offerings in scipy.stats are 
excellent but they are not always the most approachable things for new students 
of statistics.

--




[issue36018] Add a Normal Distribution class to the statistics module

2019-02-23 Thread Raymond Hettinger


Raymond Hettinger  added the comment:

Okay, it's in for the second alpha.  Please continue to make API or 
implementation suggestions.  Nothing is set in stone.

--




[issue36018] Add a Normal Distribution class to the statistics module

2019-02-23 Thread Raymond Hettinger


Raymond Hettinger  added the comment:


New changeset 11c79531655a4aa3f82c20ff562ac571f40040cc by Raymond Hettinger in 
branch 'master':
bpo-36018: Add the NormalDist class to the statistics module (GH-11973)
https://github.com/python/cpython/commit/11c79531655a4aa3f82c20ff562ac571f40040cc


--




[issue36018] Add a Normal Distribution class to the statistics module

2019-02-23 Thread Raymond Hettinger


Raymond Hettinger  added the comment:

Thanks for all the positive feedback.  If there are no objections, I would like 
to push this so it will be in the second alpha release so that it can get 
exercised.  We can still make adjustments afterwards.

--




[issue36018] Add a Normal Distribution class to the statistics module

2019-02-22 Thread Karthikeyan Singaravelan


Karthikeyan Singaravelan  added the comment:

@steven.daprano A bit off topic, but you can also append .patch to the PR URL 
to generate a patch file with all the commits made in the PR up to the latest 
commit, while .diff provides the current diff against master. Both are plain 
text and can be downloaded with wget and viewed in an editor, in case it helps.

https://github.com/python/cpython/pull/11973.patch
https://github.com/python/cpython/pull/11973.diff

--
nosy: +xtreak




[issue36018] Add a Normal Distribution class to the statistics module

2019-02-22 Thread Raymond Hettinger


Change by Raymond Hettinger :


Removed file: https://bugs.python.org/file48162/normdist_22feb2019.diff




[issue36018] Add a Normal Distribution class to the statistics module

2019-02-22 Thread Raymond Hettinger


Change by Raymond Hettinger :


Added file: https://bugs.python.org/file48163/normdist_22feb2019.diff




[issue36018] Add a Normal Distribution class to the statistics module

2019-02-22 Thread Raymond Hettinger


Change by Raymond Hettinger :


Removed file: https://bugs.python.org/file48161/normdist_22feb2019.diff




[issue36018] Add a Normal Distribution class to the statistics module

2019-02-22 Thread Raymond Hettinger


Change by Raymond Hettinger :


Added file: https://bugs.python.org/file48162/normdist_22feb2019.diff




[issue36018] Add a Normal Distribution class to the statistics module

2019-02-22 Thread Raymond Hettinger


Raymond Hettinger  added the comment:

I've made both suggested changes, "examples"->"samples" and set the defaults to 
the standard normal distribution.

To bypass Github, I've attached a diff to this tracker issue.  Let me know what 
you think :-)

--
Added file: https://bugs.python.org/file48161/normdist_22feb2019.diff




[issue36018] Add a Normal Distribution class to the statistics module

2019-02-22 Thread Steven D'Aprano


Steven D'Aprano  added the comment:

Thanks Raymond.

Apologies for commenting here instead of at the PR.

While I've been fighting with more intermittently broken than usual 
internet access, Github has stopped supporting my browser. I can't 
upgrade the browser without upgrading the OS, and I can't upgrade the OS 
without new hardware, and that will take money I don't have at the moment.

So the bottom line is that while I can read *part* of the diffs on 
Github, that's about all I can do. I can't comment there, I can't fork, 
I can't make pull requests, half the pages don't load for me and the 
other half don't work properly when they do load. I can't even do a git 
clone.

So right now, the only thing I can do is comment on your extensive 
documentation in statistics.rst. That's very nicely done.

The only thing that strikes me as problematic is the default value for 
sigma, namely 0.0. The PDF for normal curve divides by sigma, so if 
that's zero, things are undefined. So I think that sigma ought to be 
strictly positive.

I also think it would be nice to default to the standard normal curve, 
with mu=0.0 and sigma=1.0. That will make it easy to work with Z scores.
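
A short sketch of both points, assuming the defaults that were adopted afterwards (mu=0.0, sigma=1.0) and assuming, as I understand the released class, that pdf() raises StatisticsError when sigma is zero. The test-score parameters are purely illustrative.

```python
from statistics import NormalDist, StatisticsError

std = NormalDist()            # standard normal under the adopted defaults
print(std.mean, std.stdev)    # 0.0 1.0

# Z score of an observation under some other normal distribution.
scores = NormalDist(mu=100, sigma=15)   # illustrative parameters
z = (130 - scores.mean) / scores.stdev
print(z, std.cdf(z))

# A zero sigma leaves the density undefined (division by sigma).
try:
    NormalDist(0.0, 0.0).pdf(0.0)
    pdf_rejected = False
except StatisticsError:
    pdf_rejected = True
print(pdf_rejected)
```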

Thanks again for this class, and my apologies for my inability to 
follow the preferred workflow.

--




[issue36018] Add a Normal Distribution class to the statistics module

2019-02-22 Thread Raymond Hettinger


Raymond Hettinger  added the comment:

Okay the PR is ready.

If you all are mostly comfortable with it, it would be great to get this in for 
the second alpha so that people have a chance to work with it.

--
nosy: +davin




[issue36018] Add a Normal Distribution class to the statistics module

2019-02-21 Thread Raymond Hettinger


Change by Raymond Hettinger :


--
keywords: +patch
pull_requests: +11999
stage:  -> patch review




[issue36018] Add a Normal Distribution class to the statistics module

2019-02-19 Thread Raymond Hettinger


Raymond Hettinger  added the comment:

I'll work up a PR for this.  

We can continue to tease out the best method names. I've had success with 
"examples" and "from_samples" when developing this code in the classroom.  Both 
names had the virtue of being easily understood and never being misunderstood.

Intellectually, the name fit() makes sense because we are using data to create
best-fit model parameters.  So, technically this is probably the most accurate
terminology.  However, it doesn't match how I think about the problem, which
is more along the lines of "use sampling data to make a random variable
with a normal distribution".  Another minor issue is that class methods are
typically (but not always) recognizable by their from-prefix (e.g.
dict.fromkeys, datetime.fromtimestamp, etc).
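
To illustrate the from-prefix convention, here is a small sketch. The two
standard-library calls are real; the Gaussian class is a hypothetical
stand-in, not the class from the PR, and it assumes that fitting means taking
the sample mean and sample standard deviation.

```python
from datetime import datetime, timezone
from statistics import mean, stdev

# Existing alternate constructors that follow the from-prefix convention.
counts = dict.fromkeys(["a", "b"], 0)
epoch = datetime.fromtimestamp(0, tz=timezone.utc)


class Gaussian:
    """Hypothetical stand-in used only to show the naming pattern."""

    def __init__(self, mu, sigma):
        self.mu = mu
        self.sigma = sigma

    @classmethod
    def from_samples(cls, data):
        """Build an instance from observed data instead of from parameters."""
        data = list(data)  # allow one-shot iterators
        return cls(mean(data), stdev(data))


g = Gaussian.from_samples([1.0, 2.0, 3.0])
print(counts, epoch.year, g.mu, g.sigma)
```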

"NormalDist" seems more self-explanatory to me than just "Normal".  Also, the
noun form seems "more complete" than a dangling adjective (reading "normal"
immediately raises the question "normal what?").  FWIW, MS Excel also calls
their variant NORM.DIST (formerly spelled without the dot).

--




[issue36018] Add a Normal Distribution class to the statistics module

2019-02-19 Thread Michael Selik


Michael Selik  added the comment:

+1. This would be useful for quick analyses, avoiding the overhead of
installing scipy and looking through its documentation.

Given that it's in the statistics namespace, I think the name can be simply 
``Normal`` rather than ``NormalDist``.  Also, instead of ``.from_examples`` 
consider naming the classmethod ``.fit``.

--
nosy: +selik




[issue36018] Add a Normal Distribution class to the statistics module

2019-02-18 Thread Steven D'Aprano


Steven D'Aprano  added the comment:

I like this idea!

Should the "examples" method be re-named "samples"? That's the word used in the 
docstring, and it matches the from_samples method.

--




[issue36018] Add a Normal Distribution class to the statistics module

2019-02-17 Thread Raymond Hettinger


Change by Raymond Hettinger :


Added file: https://bugs.python.org/file48148/gauss_demo.py




[issue36018] Add a Normal Distribution class to the statistics module

2019-02-17 Thread Raymond Hettinger


Change by Raymond Hettinger :


--
nosy: +mark.dickinson, tim.peters




[issue36018] Add a Normal Distribution class to the statistics module

2019-02-17 Thread Raymond Hettinger


New submission from Raymond Hettinger :

Attached is a class that I've found useful for doing practical statistics work
with normal distributions.  It provides a nice, high-level API that makes
short work of everyday statistical problems.

-- Examples 

# Simple scaling and translation
temperature_february = NormalDist(5, 2.5)    # Celsius
print(temperature_february * (9/5) + 32)     # Fahrenheit


# Classic probability problems
# https://blog.prepscholar.com/sat-standard-deviation
# The mean score on a SAT exam is 1060 with a standard deviation of 195
# What percentage of students score between 1100 and 1200?
sat = NormalDist(1060, 195)
fraction = sat.cdf(1200) - sat.cdf(1100)
print(f'{fraction * 100 :.1f}% score between 1100 and 1200')


# Combination of normal distributions by summing variances
birth_weights = NormalDist.from_samples([2.5, 3.1, 2.1, 2.4, 2.7, 3.5])
drug_effects = NormalDist(0.4, 0.15)
print(birth_weights + drug_effects)


# Statistical calculation estimates using simulations
# Estimate the distribution of X * Y / Z
n = 100_000
X = NormalDist(350, 15).examples(n)
Y = NormalDist(47, 17).examples(n)
Z = NormalDist(62, 6).examples(n)
print(NormalDist.from_samples(x * y / z for x, y, z in zip(X, Y, Z)))


# Naive Bayesian Classifier
# https://en.wikipedia.org/wiki/Naive_Bayes_classifier#Sex_classification

height_male = NormalDist.from_samples([6, 5.92, 5.58, 5.92])
height_female = NormalDist.from_samples([5, 5.5, 5.42, 5.75])
weight_male = NormalDist.from_samples([180, 190, 170, 165])
weight_female = NormalDist.from_samples([100, 150, 130, 150])
foot_size_male = NormalDist.from_samples([12, 11, 12, 10])
foot_size_female = NormalDist.from_samples([6, 8, 7, 9])

prior_male = 0.5
prior_female = 0.5
posterior_male = (prior_male * height_male.pdf(6) *
                  weight_male.pdf(130) * foot_size_male.pdf(8))
posterior_female = (prior_female * height_female.pdf(6) *
                    weight_female.pdf(130) * foot_size_female.pdf(8))
print('Predict', 'male' if posterior_male > posterior_female else 'female')
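
The arithmetic behind the scaling, SAT, and summing examples can be checked by
hand. The sketch below does not use the attached class: scale_shift,
add_independent, and normal_cdf are hypothetical helpers, and it assumes that
from_samples estimates sigma with the sample standard deviation.

```python
from math import erf, hypot, sqrt
from statistics import mean, stdev


def scale_shift(mu, sigma, a, b):
    """Parameters of a*X + b when X is normal with the given mu and sigma."""
    return a * mu + b, abs(a) * sigma


def add_independent(mu1, sigma1, mu2, sigma2):
    """Parameters of X + Y for independent normals: means and variances add."""
    return mu1 + mu2, hypot(sigma1, sigma2)


def normal_cdf(x, mu, sigma):
    """Cumulative distribution function expressed with the error function."""
    return 0.5 * (1.0 + erf((x - mu) / (sigma * sqrt(2.0))))


# Celsius to Fahrenheit: the mean is scaled and shifted, sigma is only scaled.
fahrenheit = scale_shift(5, 2.5, 9 / 5, 32)

# Birth weights plus an independent drug effect.
weights = [2.5, 3.1, 2.1, 2.4, 2.7, 3.5]
combined = add_independent(mean(weights), stdev(weights), 0.4, 0.15)

# Fraction of SAT scores between 1100 and 1200.
fraction = normal_cdf(1200, 1060, 195) - normal_cdf(1100, 1060, 195)

print(fahrenheit, combined, f"{fraction * 100:.1f}%")
```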

--
assignee: steven.daprano
components: Library (Lib)
files: gauss.py
messages: 335792
nosy: rhettinger, steven.daprano
priority: normal
severity: normal
status: open
title: Add a Normal Distribution class to the statistics module
type: enhancement
versions: Python 3.8
Added file: https://bugs.python.org/file48147/gauss.py
