Raymond Hettinger <[email protected]> added the comment:
The attraction to having a first_tie=False flag is that it is backwards
compatible with the current API.
However on further reflection, people would almost never want the existing
behavior of raising an exception rather than returning a useful result.
So, we could make first_tie the default behavior and not have a flag at all.
We would also add a separate multimode() function with a clean signature,
always returning a list. FWIW, MS Excel evolved to this solution as well (it
has MODE.SGNL and MODE.MULT).
This would give us a clean, fast API with no flags:
mode(Iterable) -> scalar
multimode(Iterable) -> list
ISTM this is an all round win:
* Fixes the severe usability problems with mode(). Currently, it can't be used
without a try/except. The except-clause can't distinguish between an empty data
condition and no unique mode condition, nor can the except clause access the
dataset counts to get to a useful answer.
* It makes mode() memory friendly and faster. Currently, mode() has to convert
the counter items into a list() and run a full sort. However, if we only need
the first mode, we can feed the items iterator to max() and make only a single
pass, never having to create a list.
However, there would be an incompatibility in the API change. It is possible
that a person really does want an exception instead of a value when there are
multiple modes. In that case, this would break their code so that they have to
switch to multimode() as a remediation. OTOH, it is also possible that people
have existing code that uses mode() but their code has a latent bug because
they weren't using a try/except and haven't yet encountered a multi-modal
dataset.
So here are some options:
* Change mode() to return the first tie instead of raising an exception. This
is a behavior change but leaves you with the cleanest API going forward.
* Add a Deprecation warning to the current behavior of mode() when it finds
multimodal data. Then change the behavior in Python 3.9. This is
uncomfortable for one release, but does get us to the cleanest API and best
performance.
* Change mode() to have a flag where first_tie defaults to False. This is
backwards compatible, but it doesn't fix latent bugs, it leaves us with
on-going API complexities, and the default case remains the least usable option.
* Leave mode() as-is and just document its limitations.
For any of those options, we should still add a separate multimode() function.
----------
_______________________________________
Python tracker <[email protected]>
<https://bugs.python.org/issue35892>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe:
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com