[Python-ideas] Re: Native support for units [was: custom literals]

Stephen J. Turnbull Thu, 07 Apr 2022 07:50:52 -0700

Brian McCall writes:

 > Stephen J Turnbull and Paul Moore have asked why the "need" for
 > something other than a library (or perhaps a better library). There
 > are a number of examples that show simple unit calculations and
 > it's easy to argue based on those that nothing further is needed.


Nobody is arguing that, though.  We ask *because* we believe you
perceive need (and we don't put it in quotes, for exactly the reason
you give -- we do not define "need" as "if I don't get it I will
literally die").  You don't need to prove your need to Python core.

You need to show two things.  (1) The particular solution proposed
solves problems that already available approaches don't (including
"too inconvenient to consistently use in practice" which might apply
to the library approach vs the user-defined literal approach).  (2)
There are enough people with this need to outweigh the slight costs to
all Python users of maintaining the facility, documenting it, and
learning it, and possibly preventing a more valuable use of the same
syntax.

 > In regulated environments, risk analysis and mitigation is very
 > much affected by whether a feature has native support

That you say this after citing Paul of all people gave me a chuckle,
as he's probably the most prominent advocate of a comprehensive
standard library on these lists.  If anybody will be sympathetic to your
needs on these grounds, he will.  (I don't expect you to know that,
but it might be useful to know you have a potential ally.)

It's important to distinguish between several levels of "native
support" here.

0.  Adding units to all numeric objects.

    I don't think level 0 is a very good idea.  I think it was Mac
    Lane who pointed out that the solution to a problem expressed in
    mathematics starts by giving a formal statement of the problem,
    manipulates the form according to syntactic rules, and finally
    interprets the final form as a semantic solution to the problem.
    Similarly, as long as the expected units of the result, and the
    units output by a Python program are the same, the numbers don't
    need to have units.  I suspect that from the point of view of
    astropy, attaching units to ndarray rather than atomic numeric
    types is optimal for this reason, not a workaround.

1.  Syntax, aka language support (the original suggestion of
    user-defined literals fits here because either the way numbers are
    parsed or the way '_' is parsed would have to change).

    It's very difficult to get *any* syntax change in.  In particular,
    changing '_' from an identifier component to an operator for
    combining numeric literals would invalidate *tons* of code
    (including internationalization code that is the 0.454kg nearest
    my heart).  I can't imagine that being possible.  There was a fair
    amount of pushback to using it as the non-capturing marker in the
    match statement.  Changing its interpretation only in a numeric
    literal is probably possible, since as far as I know it's
    currently an error to follow a sequence of digits with _ and then
    a letter (only more digits may follow, as in 1_000) without
    interposing an operator or whitespace.

    NOTE: The big problem I see with this is that I don't see any
    practical way to use syntax for non-numeric types like ndarray.
    The problem is that in, say, economics we use a lot of m x 2
    arrays where the two columns have different units (i.e., a quantity
    unit and a $/quantity unit).  The sensible way to express this
    isn't some kind of appended syntax as with numbers, but rather a
    sequence of units corresponding to the sequence of columns.

    If I'm right about that, the only time the literal syntax would be
    used is when using Python interactively as a calculator.  Any
    time you read from a string or file, you have to interpose a
    parsing step anyway, so it makes sense to handle the construction
    of quantities there.

2.  Built-in, that is pre-loaded in the interpreter.  These are
    implemented as library functionality in the builtins module.

    I would definitely be against level 2.  It would pollute the
    builtin namespace to the great detriment of all non-numeric
    applications.

3.  Modules in the standard library.  Always available, but must be
    explicitly imported to be used.

    I think adding a module that provides units for the stdlib numeric
    types is a very interesting idea.  I doubt I'd ever use it, and
    I'm not sure that the requirements can be well-enough specified at
    the moment to go to +1 on it.  (As the Zen says, "Although never
    is often better than *right* now.")
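
The claim under level 1 about '_' next to digits can be checked by
asking the parser directly.  The sketch below is only an illustration:
it calls the built-in compile(), so nothing is evaluated, and the
helper name parses() is made up here.  On Python 3.6 and later, digits
may follow the underscore (PEP 515 digit separators), while a letter
after it is rejected, which is the gap a unit suffix would occupy.

```python
# Probe how the current parser treats '_' next to digits.
# compile() only parses the text; no names are looked up.

def parses(source):
    """Return True if `source` is a syntactically valid expression."""
    try:
        compile(source, "<probe>", "eval")
    except SyntaxError:
        return False
    return True

# digit separator, unit-like suffix, trailing underscore,
# juxtaposition with whitespace, explicit operator
probes = ["1_000", "1_m", "1_", "1 _m", "1 * _m"]
results = {src: parses(src) for src in probes}

for src, ok in results.items():
    print(f"{src!r:10} {'valid' if ok else 'SyntaxError'}")
```

Only the forms that are rejected today are candidates for a new
meaning; anything that already parses, such as "1 * _m", would keep
its current interpretation.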

For me, level 4 (available on PyPI) is good enough.  I can think of
cases where it would be useful for class demos to be able to
interrogate a computation for its use of units, but that's about it.
So I depend on descriptions of proponents' use cases to sort out the
question of level 1 vs. level 3.
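
To make the NOTE under level 1 concrete, here is a minimal sketch of
units attached per column at the parsing step rather than per number.
Everything in it (the Table class, parse_table(), the text format with
a header line of unit labels) is an assumption made for illustration,
not the API of any existing library.

```python
# Hypothetical sketch: units live on the table, one label per column,
# and are attached while parsing the input text.
from dataclasses import dataclass

@dataclass
class Table:
    units: tuple   # one unit label per column, e.g. ("kg", "$/kg")
    rows: list     # list of equal-length tuples of plain floats

    def column(self, j):
        """Return (values, unit) for column j."""
        return [row[j] for row in self.rows], self.units[j]

def parse_table(text):
    """Parse a header line of unit labels followed by numeric rows."""
    lines = [ln.split() for ln in text.strip().splitlines()]
    units = tuple(lines[0])
    rows = [tuple(float(x) for x in ln) for ln in lines[1:]]
    if any(len(row) != len(units) for row in rows):
        raise ValueError("row width does not match number of units")
    return Table(units, rows)

data = """
kg   $/kg
10   2.50
4    3.00
"""
table = parse_table(data)
quantities, q_unit = table.column(0)
prices, p_unit = table.column(1)

# kg * $/kg = $; the numbers themselves stay plain floats.
revenue = sum(q * p for q, p in zip(quantities, prices))
print(table.units, revenue)
```

No literal syntax is involved anywhere: the units arrive as strings in
the data, and the numbers never carry them individually.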

 > > The Mars Climate Orbiter [...] navigation error arose because a
 > > NASA subcontractor (Lockheed Martin) used Imperial units
 > > (pound-seconds) instead of the metric system.

These examples don't need to be repeated over and over again, though.
What is needed is to show that a particular proposal is especially
effective in preventing them compared to others.  This is what I
haven't seen.

 > One last pontification before I get to my example relating to
 > units. We already have examples of features that have both a native
 > implementation and library extensions. int and float are primitives
 > in Python. They are more than enough for most users, but limiting
 > for quite a few other users. So modules like fractions and decimal
 > provide extended support, and libraries like numpy provide even
 > more data types for task-specific needs.

??  That analogy looks like an argument for users who need units to
get facilities suited to their needs in libraries like numpy and
units.

 > That one was on me, and since I am a capable programmer, I ought to
 > have been using a units package.

This is the big question, to me: will people be so much more likely to
use units with syntactic support than they are with something like the
units package plus a simple facility for getting the great majority of
units they use with one import or function call?

 > echo What literals would look like
 > python -c "
 > # h  = 6.62607015e-34m2kg / 1s

That's pretty unreadable IMO.  And how do you distinguish k(m2) from
(km)2?  Is the latter always going to be the natural reading?
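
For a sense of what is at stake in the two readings, the arithmetic
can be spelled out.  The variable names below are made up for
illustration.  The SI writing convention treats a prefixed symbol as a
single unit, so km2 means (km)2, but a parser for literals would still
have to commit to one rule.

```python
KILO = 1e3

# (km)2: the prefix is part of the unit, then the power applies.
km_squared_in_m2 = (KILO * 1.0) ** 2

# k(m2): the power applies to the metre, then the prefix scales it.
kilo_m2_in_m2 = KILO * (1.0 ** 2)

print(km_squared_in_m2, kilo_m2_in_m2)
print("ratio:", km_squared_in_m2 / kilo_m2_in_m2)
```

The two readings differ by a factor of 1000 for squares, and the gap
grows with the exponent.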

 > echo 'What bracket syntax might look like'
 > python -c "
 > # n  = (200[lx] * 0.25 * 0.63 * 500[nm] * 30[ms] * (0.00345[mm])**2) / (2 * h*c * 683[lm/W] * 2.4**2)
 > "

This is quite readable for me.  The unit "sticks to" the number
nicely.  I guess the parentheses in "(0.00345[mm])**2" are
unnecessary?

This also has the k(m2) vs (km)2 issue, if it's real.

 > echo units
 > python -c "
 > from units import unit
 > h  = unit('m**2*kg/s')    (6.62607015e-34)
 > [...]
 > n  = unit('lx')(200) * 0.25 * 0.63 * unit('nm')(550) * unit('mm')(0.00345)**2 * unit('ms')(30) / (2 * 2.4**2 * h*c) / unit('lm/W')(683)
 > print(n)
 > "
 > 
 > # Pros - Has units, and no syntax change required
 > # Cons - How do you get the final answer?

I don't understand the question:

>>> unit('m')(1) * unit('kg')(1)
Quantity(1, ComposedUnit([LeafUnit('m', False), LeafUnit('kg', False)], [], 1))
>>> str(unit('m')(1) * unit('kg')(1))
'1.00 m * kg'

 > You need to know that units became astropy.units and see below

This isn't a con, because you're proposing to change the Python
distribution.  Some form of some library would be added to the stdlib.
If you can't do that, there's no way you're going to get syntax to
support not doing it. :-)

 > # Cons - less compact
 > # Cons - Not dead simple. Multiple adjacent ()() is going to be
 > # unpopular with the crowd that uses Python as if it were
 > # Fortran/Matlab

Both of these can be addressed with a prologue:

mps = unit('m/s')
lx = unit('lx')

and so on.  I would expect that various fields would develop their own
abbreviations that would be commonly used.  You missed one "con",
here.  This won't handle SI prefixes in a natural way.  I guess the
unit() constructor can do it, but it seems pretty nasty from the point
of view of a Pythonista to have to define all of m, km, mm, g, kg, and
mg when you could get away with m and g, and k and m, and obviously it
gets a lot worse as we go to the full suites of prefixes and units.
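
One way a constructor could handle prefixes without a separate
definition for every combination is to split the symbol into a prefix
and a base unit.  This is only a sketch: scale_of(), PREFIXES and
BASES are hypothetical names, the tables are deliberately tiny, and
none of this is claimed to be how the units package works.

```python
# Hypothetical sketch: derive prefixed units from a short prefix table
# and a short base-unit table instead of defining each combination.
PREFIXES = {"k": 1e3, "m": 1e-3, "u": 1e-6, "n": 1e-9}
BASES = {"m", "g", "s"}

def scale_of(symbol):
    """Return (factor, base) for a possibly prefixed unit symbol."""
    if symbol in BASES:          # a bare unit wins: 'm' is metre
        return 1.0, symbol
    prefix, base = symbol[:1], symbol[1:]
    if prefix in PREFIXES and base in BASES:
        return PREFIXES[prefix], base
    raise ValueError(f"unknown unit {symbol!r}")

for sym in ["m", "km", "mm", "g", "kg", "mg"]:
    print(sym, scale_of(sym))
```

Note the rule needed for 'm', which is both a prefix and a unit; any
real implementation has to decide such cases explicitly.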

 > echo astropy.units again, but importing everything into globals
 > python -c "
 > from astropy.units import *
 > [...]
 > # Cons - Can't use "m" as a counter (and other namespace collisions)

You could limit the name collisions by having field-specific
submodules.  Losing 'm' would be painful, though.  I guess you could
use "meter" instead, and ask those who want the abbreviated unit to
assign m = meter.

But most of the comments above aren't particularly important.  The key
question remains:

How often would the syntax prevent an error that adding an appropriate
units library to the stdlib would not?

Among other things, in a later post you wrote

 > Oops, my examples have some other problems:
 >
 > > # λ  = 550e-9nm
 >
 > should be 
 >
 > > # λ  = 550nm

I can easily see myself making this mistake, hearing "lambda equals
550 nanometers", writing "λ = 550e-9", thinking "I really should add
the explicit units", and appending "nm" because that's the unit I
heard.  Explicit units won't fix that, nor will syntax for explicit
units.  So you need to make sure that when people say "I make units
mistakes a lot" they mean "I think 1000kg as I enter '1000' into a
text box that expects lb", not "my SI prefixes frequently duplicate my
engineering notation exponents".
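
The point can be shown with a toy quantity type.  This is a sketch
under stated assumptions: quantity() and to_metres() are made-up
helpers, and a quantity is just a (value, unit) pair.  Both values
below carry the same unit, so no unit check can tell them apart, yet
they differ by nine orders of magnitude.

```python
NANO = 1e-9

def quantity(value, unit):
    """Toy quantity: a (value, unit) pair."""
    return (value, unit)

def to_metres(q):
    """Convert a toy quantity in nm to a plain float in metres."""
    value, unit = q
    if unit != "nm":
        raise ValueError(f"expected nm, got {unit!r}")
    return value * NANO

intended = quantity(550, "nm")      # what was meant
mistaken = quantity(550e-9, "nm")   # exponent and prefix both applied

same_unit = intended[1] == mistaken[1]
ratio = to_metres(intended) / to_metres(mistaken)
print(same_unit, ratio)
```

The unit check in to_metres() passes for both values, whether the
unit was attached by a library call or by a literal suffix.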

_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/RCA4GKJMME2RXH2U4WYM7X6ZSKUIQLPL/
Code of Conduct: http://python.org/psf/codeofconduct/
