[issue47234] PEP-484 "numeric tower" approach makes it hard/impossible to specify contracts in documentation

2022-04-08 Thread Thomas Fischbacher


Thomas Fischbacher  added the comment:

Re AlexWaygood:

If these PEP-484 related things were so obvious that they would admit a compact 
description of the problem in 2-3 lines, these issues would likely have been 
identified much earlier. We would not be seeing them now, given that Python by 
and large is a somewhat mature language.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue47234] PEP-484 "numeric tower" approach makes it hard/impossible to specify contracts in documentation

2022-04-08 Thread Alex Waygood


Alex Waygood  added the comment:

Please try to make your messages more concise.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue47234] PEP-484 "numeric tower" approach makes it hard/impossible to specify contracts in documentation

2022-04-08 Thread Thomas Fischbacher


Thomas Fischbacher  added the comment:

This is not a partial duplicate of https://bugs.python.org/issue47121 about 
math.isfinite().
The problem there is about a specific function on which the documentation may 
be off -
I'll comment separately on that.


The problem here is: There is a semantic discrepancy between what the
term 'float' means "at run time", such as in a check like:

issubclass(type(x), float)

(I am deliberately writing it that way, given that isinstance() can, in general 
[but actually not for float], lie.)

and what the term 'float' means in a statically-checkable type annotation like:

def f(x: float) -> ... : ...

...and this causes headaches.


The specific example ('middle_mean') illustrates the sort of weird
situations that arise due to this. (I discovered this recently when
updating some of our company's Python onboarding material, where the
aspiration naturally is to be extremely accurate with all claims.)

So, basically, there is a choice to make between these options:

Option A: Give up on the idea that "we want to be able to reason with
stringency about the behavior of code" / "we accept that there will be
gaps between what code does and what we can reason about".  (Not
really an option, especially with an eye on "writing secure code
requires being able to reason out everything with stringency".)

Option B: Accept the discrepancy and tell people that they have to be
mindful about float-the-dynamic-type being a different concept from
float-the-static-type.

Option C: Realizing that having "float" mean different things for
dynamic and static typing was not a great idea to begin with, and get
everybody who wants to state things such as "this function parameter
can be any instance of a real number type" to use the type
`numbers.Real` instead (which may well need better support by
tooling), respectively express "can be int or float" as `Union[int,
float]`.

Also, there is Option D: PEP-484 has quite a lot of other problems
where the design does not meet rather natural requirements, such as:
"I cannot introduce a newtype for 'a mapping where I know the key to
be a particular enum-type, but the value is type-parametric'
(so the new type would also be 1-parameter type-parametric)", and
this float-mess is merely one symptom of "maybe PEP-484 was approved
too hastily and should have been also scrutinized by people
from a community with more static typing experience".


Basically, Option B would spell out as: 'We expect users who use
static type annotations to write code like this, and expect them to be
aware of the fact that the four places where the term "float" occurs
refer to two different concepts':

def foo(x: float) -> float:
  """Returns the foo of the number `x`.

  Args:
x: float, the number to foo.

  Returns:
float, the value of the foo-function at `x`.
  """
  ...

...which actually is shorthand for...:

def foo(x: float  # Note: means float-or-int
  ) -> float  # Note: means float-or-int
  :
  """Returns the foo of the number `x`.

  Args:
x: the number to foo, an instance of the `float` type.

  Returns:
The value of the foo-function at `x`,
as an instance of the `float` type.
  """
  ...

Option C (and perhaps D) appear - to me - to be the only viable
choices here. The pain with Option C is that it invalidates/changes
the meaning of already-written code that claims to follow PEP-484,
and the main point of Option D is all about: "If we have to cause
a new wound and open up the patient again, let's try to minimize
the number of times we have to do this."

Option C would amount to changing the meaning of...:

def foo(x: float) -> float:
  """Returns the foo of the number `x`.

  Args:
x: float, the number to foo.

  Returns:
float, the value of the foo-function at `x`.
  """
  ...

to "static type annotation float really means instance-of-float here"
(I do note that issubclass(numpy.float64, float), so passing a
numpy-float64 is expected to work here, which is good), and ask people
who would want to have functions that can process more generic real
numbers to announce this properly. So, we would end up with basically
a list of different things that a function-sketch like the one above
could turn into - depending on the author's intentions for
the function, some major cases being perhaps:

(a) ("this is supposed to strictly operate on float")
def foo(x: float) -> float:
  """Returns the foo of the number `x`.

  Args:
x: the number to foo.

  Returns:
the value of the foo-function at `x`.
  """

(b) ("this will eat any kind of real number")

def foo(x: numbers.Real) -> numbers.Real:
  """Returns the foo of the number `x`.

  Args:
x: the number to foo.

  Returns:
the value of the foo-function at `x`.
  """

(c) ("this will eat any kind of real number, but the result will always be 
float")

def foo(x: numbers.Real) -> float:
  """Returns the foo of the number `x`.

  Args:
x: the number to foo.

  Returns:
the value of 

[issue47234] PEP-484 "numeric tower" approach makes it hard/impossible to specify contracts in documentation

2022-04-06 Thread Ken Jin


Change by Ken Jin :


--
nosy:  -kj

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue47234] PEP-484 "numeric tower" approach makes it hard/impossible to specify contracts in documentation

2022-04-06 Thread Guido van Rossum


Change by Guido van Rossum :


--
nosy:  -gvanrossum

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue47234] PEP-484 "numeric tower" approach makes it hard/impossible to specify contracts in documentation

2022-04-06 Thread Steven D'Aprano


Change by Steven D'Aprano :


--
nosy: +steven.daprano

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue47234] PEP-484 "numeric tower" approach makes it hard/impossible to specify contracts in documentation

2022-04-05 Thread Raymond Hettinger


Raymond Hettinger  added the comment:

This is a partial duplicate of an issue you already filed: 
https://bugs.python.org/issue47121 where math.isfinite(10**1000) raises an 
OverflowError even though it type checks.

Here was one of the comments:
"""
Types relationships are useful for verifying which methods are available, but 
they don't make promises about the range of valid values.  For example 
math.sqrt(float) -> float promises which types are acceptable but doesn't 
promise that negative inputs won't raise an exception.  Likewise, "n: int=10; 
len(range(n))" is type correct but will raise an OverflowError for "n = 
10**100".
"""

--
nosy: +rhettinger

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue47234] PEP-484 "numeric tower" approach makes it hard/impossible to specify contracts in documentation

2022-04-05 Thread Jelle Zijlstra


Jelle Zijlstra  added the comment:

Thanks for your report, but I would appreciate a more concise explanation. Let 
me try to rephrase the problem.

Given this function:

def mean(x: list[float]) -> float:
return sum(x) / len(x)

We want to provide a guarantee that if x is a nonempty list containing only 
floats, the function returns successfully and returns a float.

But the type system currently doesn't give this guarantee, because PEP 484 
specifies that ints are compatible with floats, and `mean([0.0, 1.25, 10**1000, 
0.0])` will throw OverflowError.

---

We generally discuss issues with the general type system over at 
https://github.com/python/typing/issues, but here are a few thoughts:

- The type system doesn't generally try to cover exceptions. Your function 
could also raise MemoryError, or KeyboardInterrupt, and the type system can't 
tell you.
- The concrete proposal here would be to make int no longer implicitly 
compatible with float. That might be nice, but at this point it would be a big 
break in backwards compatibility, so I'm not sure we can do it.

--
nosy: +AlexWaygood, JelleZijlstra, gvanrossum, kj

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue47234] PEP-484 "numeric tower" approach makes it hard/impossible to specify contracts in documentation

2022-04-05 Thread Thomas Fischbacher

New submission from Thomas Fischbacher :

Here is a major general problem with python-static-typing as it is
described by PEP-484: The approach described in
https://peps.python.org/pep-0484/#the-numeric-tower negatively impacts
our ability to reason about the behavior of code with stringency.

I would like to clarify one thing in advance: this is a real problem
if we subscribe to some of the important ideas that Dijkstra
articulated in his classic article "On the role of scientific thought"
(e.g.: https://www.cs.utexas.edu/users/EWD/transcriptions/EWD04xx/EWD447.html).
Specifically, this part:

"""
Let me try to explain to you, what to my taste is characteristic
for all intelligent thinking. It is, that one is willing to study in
depth an aspect of one's subject matter in isolation for the sake of
its own consistency, all the time knowing that one is occupying
oneself only with one of the aspects. We know that a program must be
correct and we can study it from that viewpoint only; we also know
that it should be efficient and we can study its efficiency on another
day, so to speak. In another mood we may ask ourselves whether, and if
so: why, the program is desirable. But nothing is gained —on the
contrary!— by tackling these various aspects simultaneously. It is
what I sometimes have called "the separation of concerns", which, even
if not perfectly possible, is yet the only available technique for
effective ordering of one's thoughts, that I know of. This is what I
mean by "focussing one's attention upon some aspect": it does not mean
ignoring the other aspects, it is just doing justice to the fact that
from this aspect's point of view, the other is irrelevant. It is being
one- and multiple-track minded simultaneously.
"""

So, "code should be easy to reason about".

Now, let us look at this function - I am here (mostly) following the
Google Python style guide (https://google.github.io/styleguide/pyguide.html) 
for now:

=== Example 1, original form ===

def middle_mean(xs):
  """Compute the average of the nonterminal elements of `xs`.

  Args:
`xs`: a list of floating point numbers.

  Returns:
A float, the mean of the elements in `xs[1:-1]`.

  Raises:
ValueError: If `len(xs) < 3`.
  """
  if len(xs) < 3:
raise ValueError('Need at least 3 elements to compute middle mean.')
  return sum(xs[1:-1]) / (len(xs) - 2)
==

Let's not discuss performance, or whether it makes sense to readily
generalize this to operate on other sequences than lists, but focus,
following Dijkstra, on one specific concern here: Guaranteed properties.

Given the function as it is above, I can make statements that are
found to be correct when reasoning with mathematical rigor, such as
this specific one that we will come back to:

=== Theorem 1 ===
  If we have an object X that satisfies these properties...:

  1. type(X) is list
  2. len(X) == 4
  3. all(type(x) is float for x in X)

  ...then we are guaranteed that `middle_mean(X)` evaluates to a value Y
  which satisfies:

  - type(Y) is float
  - Y == (X[1] + X[2]) * 0.5 or math.isnan(Y)
===

Now, following PEP-484, we would want to re-write our function, adding type 
annotations.
Doing this mechanically would give us:

=== Example 1, with mechanically added type information ===
def middle_mean(xs: List[float]) -> float:
  """Compute the average of the nonterminal elements of `xs`.

  Args:
`xs`: a list of floating point numbers.

  Returns:
A float, the mean of the elements in `xs[1:-1]`.

  Raises:
ValueError: If `len(xs) < 3`.
  """
  if len(xs) < 3:
raise ValueError('Need at least 3 elements to compute middle mean.')
  return sum(xs[1:-1]) / (len(xs) - 2)
==

(We are also deliberately not discussing another question here: given
this documentation and type annotation, should the callee be
considered to be permitted to mutate the input list?)


So, given the above form, we now find that there seems to be quite a
bit of redundancy here. After all, we have the type annotation but
also repeat some typing information in the docstring. Hence, the
obvious proposal here is to re-write the above definition again, obtaining:

=== Example 1, "cleaned up" ===
def middle_mean(xs: List[float]) -> float:
  """Compute the average of the nonterminal elements of `xs`.

  Args:
`xs`: numbers to average, with terminals ignored.

  Returns:
The mean of the elements in `xs[1:-1]`.

  Raises:
ValueError: If `len(xs) < 3`.
  """
  if len(xs) < 3:
raise ValueError('Need at least 3 elements to compute middle mean.')
  return sum(xs[1:-1]) / (len(xs) - 2)
==

But now, what does this change mean for the contract? Part of the "If
arguments have these properties, then these are the guarantees"
contract now is no longer spelled out by the docstring, but via type
annotations. We naturally would expect this to be straightforward,
so, we would like to have:

=== Theorem 1b ===
  If we have an object X that satisfies these properties...:

  1. Static