[issue39094] Add a default to statistics.mean and related functions

2019-12-20 Thread Yoni Lavi


Yoni Lavi  added the comment:

Thanks for the good feedback everyone and apologies for the unresponsiveness 
over the past day.

I understand that my use cases may not reflect wider usage patterns and am not 
looking to argue against the closing. But anyway, for future reference, I'll 
add two real-life usage examples, which I should have included originally 
(again, apologies for the delay, things have been hectic).

The context is that I'm involved in running a coding bootcamp, and these are 
two recent cases where I needed a default of zero:

1. (Separately from the final grade calculations) We are interested in students' 
average grades on their projects as an indicator of their skills gained and 
their striving for excellence. When calculating this indicator, we use an 
average of 0 for a student who hasn't yet submitted anything.

2. When providing tutoring support, we classify the "complexity" of each 
student issue, and then one of our indicators involves the average complexity 
of questions in a particular slice of time and the programme (this is 
particularly interesting around changes to the content). For this as well, a 
slice of time/programme/tutor during which there were no issues would be 
considered as having a complexity of 0.

Again, not disputing the decision to close, just adding these examples for 
future reference.
Thanks

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue39094] Add a default to statistics.mean and related functions

2019-12-20 Thread Steven D'Aprano


Steven D'Aprano  added the comment:

I agree with Raymond's comments, except that because I'm sometimes a bit
of a pedant, I have to make one minor correction: max and min can be
descriptive statistics.

The sample minimum is the 1st order statistic, and the sample maximum is
the N-th order statistic:

https://www2.stat.duke.edu/courses/Spring12/sta104.1/Lectures/Lec15.pdf

This doesn't invalidate the rest of what Raymond says.

Yoni Lavi, thank you for the suggestion, but I'm going to close this ticket. If 
you think you have a really strong argument for the feature, please feel free 
to make it here, and we will rethink the closure. But I don't want to give you 
false hope: it would have to be a very strong argument.

--
resolution:  -> rejected
stage: patch review -> resolved
status: open -> closed




[issue39094] Add a default to statistics.mean and related functions

2019-12-19 Thread Raymond Hettinger


Raymond Hettinger  added the comment:

Thought experiment: Suppose someone proposed, "math.log(x) should take an 
optional default argument because it is inconvenient to catch a ValueError if 
the input is non-positive".   Or more generally, what if someone proposed, 
"every function in Python that can raise a ValueError should offer a default 
argument."  One could imagine a use case for both of these proposals but that 
doesn't mean that the API extensions would be warranted.

Also, ISTM the analogy to min() and max() is imperfect.  Those aren't 
descriptive statistics.  For min() and max() we can know a priori that a 
probability is never lower than 0.0 or greater than 1.0 for example.

Lastly, in common cases where the input is a sequence (rather than just an 
iterator), we already have a ternary operator that does the job nicely:

   central_value = mean(data) if data else 'unknown'

For the less common case, a try/except is not an undue burden; after all, it is 
a basic core language feature.
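
A minimal sketch of the try/except route for the less common iterator case. The helper name `mean_or` is hypothetical and not part of the statistics module; it relies only on the fact that statistics.mean raises StatisticsError on empty input.

```python
from statistics import StatisticsError, mean


def mean_or(data, fallback):
    """Return mean(data), or fallback when data is empty (hypothetical helper)."""
    try:
        return mean(data)
    except StatisticsError:
        # mean() raises StatisticsError when it receives no data points.
        return fallback


# Works for iterators, where "if data" cannot detect emptiness.
central_value = mean_or(iter([2, 4, 6]), "unknown")
empty_value = mean_or(iter([]), "unknown")
```

Unlike the ternary form, this does not need the input to support truth testing, so it covers generators and other one-shot iterators.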

--




[issue39094] Add a default to statistics.mean and related functions

2019-12-19 Thread Steven D'Aprano


Steven D'Aprano  added the comment:

TL;DR: I'm not likely to accept this feature request without at least one of 
(1) a practical use-case, (2) prior art in other statistics software, or (3) a 
strong mathematical justification for why this is meaningful and useful.


I'm not categorically against this idea, but it seems a bit fishy to me. If you 
have no data, how do you know what default value to give that would be 
appropriate for your (non-existent) observations?

It might help if you could show a real-life example of how, and why, you would 
use this, and how you would choose the default?

Another possibility would be to find prior art: another language, library or 
stats calculator which already offers this feature.

Alternatively, a mathematical/statistical justification for a default. For 
example, the empty sum is normally taken as 0 and the empty product as 1. R 
returns either a NAN or NA for the empty mean (depending on precisely how you 
calculate it).

While I'm personally sympathetic to the nuisance factor of having to wrap code 
in try...except blocks (my *personal* preference would have been for mean to 
return NAN on empty input) I think you will need to make a stronger case than 
just the analogy with min and max.
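
To make the conventions mentioned above concrete, here is a short sketch in Python. The empty sum and empty product are built-in behaviour; `mean_or_nan` is a hypothetical wrapper illustrating the "return NAN on empty input" preference, not something the statistics module offers.

```python
import math
from statistics import StatisticsError, mean

empty_sum = sum([])            # the empty sum is 0
empty_product = math.prod([])  # the empty product is 1 (math.prod needs Python 3.8+)


def mean_or_nan(data):
    """Return mean(data), or NaN when there is no data (hypothetical helper)."""
    try:
        return mean(data)
    except StatisticsError:
        return math.nan
```

A NaN result propagates through later arithmetic, so callers would still need math.isnan() to detect the empty case.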

--




[issue39094] Add a default to statistics.mean and related functions

2019-12-19 Thread STINNER Victor


STINNER Victor  added the comment:

> I've tried to think of other solutions, such as a generic wrapper for such 
> functions or a helper to check whether an iterable is empty, and they all 
> turn out to be very clunky to use and un-Pythonic.

So the main use case would be to detect an empty iterable in an efficient 
fashion? Something like the following code?

sentinel = object()
avg = mean(data, default=sentinel)
if avg is sentinel:
   ... # special code path

Why not add a statistics.StatisticsError subclass for an empty data set (e.g. 
StatisticsEmptyError)? Something like:

try:
   avg = mean(data)
except statistics.StatisticsEmptyError:
   ... # special code path, ex: avg = default

Or is there another use case for the proposed default parameter?
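
A rough sketch of how such a subclass could behave. Both `StatisticsEmptyError` and the wrapper `strict_mean` are hypothetical; neither exists in the statistics module, and the wrapper only emulates what the library itself would have to do.

```python
import statistics


class StatisticsEmptyError(statistics.StatisticsError):
    """Hypothetical subclass raised only when there are no data points."""


def strict_mean(data):
    """Like statistics.mean, but signal empty input with the subclass."""
    data = list(data)  # materialise so that iterators can be checked too
    if not data:
        raise StatisticsEmptyError("mean requires at least one data point")
    return statistics.mean(data)


try:
    avg = strict_mean(iter([]))
except StatisticsEmptyError:
    avg = 0  # special code path, e.g. avg = default
```

Because the hypothetical class derives from StatisticsError, existing code that catches StatisticsError (or ValueError) would keep working.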

--
nosy: +vstinner




[issue39094] Add a default to statistics.mean and related functions

2019-12-19 Thread Mark Dickinson


Mark Dickinson  added the comment:

What would the proposal look like for `statistics.stdev`? There you need at 
least two data points to compute a result, and a user might want to do 
different things for an empty dataset versus a single data point.
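
The difference between the two cases can be seen in a short sketch. The helper `summarize` is hypothetical and only illustrates that mean needs one data point while stdev needs two.

```python
from statistics import StatisticsError, mean, stdev


def summarize(data):
    """Return (mean, stdev), with None for whichever cannot be computed."""
    data = list(data)
    try:
        centre = mean(data)    # needs at least one data point
    except StatisticsError:
        centre = None
    try:
        spread = stdev(data)   # needs at least two data points
    except StatisticsError:
        spread = None
    return centre, spread


no_data = summarize([])
one_point = summarize([5.0])
two_points = summarize([5.0, 7.0])
```

A single `default` argument on stdev could not tell the empty dataset apart from the single data point, which is the distinction raised above.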

--
nosy: +mark.dickinson




[issue39094] Add a default to statistics.mean and related functions

2019-12-18 Thread Tal Einat


Tal Einat  added the comment:

It seems to me that this would follow the same argument as in issue #18111: The 
real issue is that there's no good way to check if an arbitrary iterable is 
empty, unlike with sequences. Currently, callers need to wrap with try/except 
to handle empty iterators properly, or do non-trivial iterator "magic" to check 
whether the iterator is empty before passing it in.
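
For illustration, the iterator "magic" might look like the following sketch. The helper name `mean_with_default` and the `_EMPTY` sentinel are hypothetical; the approach peeks at the first item and then stitches it back on with itertools.chain.

```python
from itertools import chain
from statistics import mean

_EMPTY = object()  # private sentinel; cannot collide with real data


def mean_with_default(iterable, default):
    """Return the mean of iterable, or default if it yields nothing."""
    it = iter(iterable)
    first = next(it, _EMPTY)   # peek at the first item without a try/except
    if first is _EMPTY:
        return default
    # Put the consumed item back in front of the remaining items.
    return mean(chain([first], it))


result = mean_with_default((x for x in [1, 2, 3]), 0)
empty_result = mean_with_default(iter([]), 0)
```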

I've tried to think of other solutions, such as a generic wrapper for such 
functions or a helper to check whether an iterable is empty, and they all turn 
out to be very clunky to use and un-Pythonic.

Since we provide first-class support for iterators, and many builtins return 
iterators, it seems prudent to provide the tools to handle the empty case 
elegantly and simply.

--




[issue39094] Add a default to statistics.mean and related functions

2019-12-18 Thread Raymond Hettinger


Raymond Hettinger  added the comment:

I vote -1.  We don't have defaults for stdev() or median() or mode().  And it 
isn't clear what one would use for a meaningful default value in most cases.  
Also, I'm not seeing anything like this in Pandas, Excel, etc.  So, I recommend 
keeping the current simple and clean APIs.

--




[issue39094] Add a default to statistics.mean and related functions

2019-12-18 Thread Yoni Lavi


Change by Yoni Lavi :


--
keywords: +patch
pull_requests: +17124
stage:  -> patch review
pull_request: https://github.com/python/cpython/pull/17657




[issue39094] Add a default to statistics.mean and related functions

2019-12-18 Thread Karthikeyan Singaravelan


Change by Karthikeyan Singaravelan :


--
nosy: +rhettinger, steven.daprano, taleinat




[issue39094] Add a default to statistics.mean and related functions

2019-12-18 Thread Yoni Lavi


New submission from Yoni Lavi :

I would like to put forward an argument in favour of a `default` parameter in 
the statistics.mean function and related functions.

What motivated me to open this is that my code would more often than not 
include a check (or try-except) whenever I calculate a mean and add a 
default/sentinel value, and I felt that there should be a better way.

Please also note that we have a precedent for this in a similar parameter added 
to min & max in 3.4 (https://bugs.python.org/issue18111)
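
For reference, the min/max precedent behaves as in this short sketch. It uses only the existing built-in `default` keyword; the variable names are illustrative.

```python
# With default=, an empty iterable returns the default instead of raising.
lowest = min([], default=0)
highest = max(iter([]), default=None)

# Without default=, an empty iterable raises ValueError.
try:
    min([])
except ValueError:
    raised = True
else:
    raised = False
```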

--
components: Library (Lib)
messages: 358653
nosy: Yoni Lavi
priority: normal
severity: normal
status: open
title: Add a default to statistics.mean and related functions
type: enhancement
versions: Python 3.9
