Brian McCall writes:

> Stephen J Turnbull and Paul Moore have asked why the "need" for
> something other than a library (or perhaps a better library). There
> are a number of examples that show simple unit calculations and
> it's easy to argue based on those that nothing further is needed.

Nobody is arguing that, though. We ask *because* we believe you
perceive need (and we don't put it in quotes, for exactly the reason
you give -- we do not define "need" as "if I don't get it I will
literally die").

You don't need to prove your need to Python core. You need to show
two things.

(1) The particular solution proposed solves problems that already
available approaches don't (including "too inconvenient to
consistently use in practice", which might apply to the library
approach vs the user-defined literal approach).

(2) There are enough people with this need to outweigh the slight
costs to all Python users of maintaining the facility, documenting
it, and learning it, and possibly preventing a more valuable use of
the same syntax.

> In regulated environments, risk analysis and mitigation is very
> much affected by whether a feature has native support

That you say this after citing Paul of all people gave me a chuckle,
as he's probably the most prominent advocate of a comprehensive
standard library on these lists. If anybody will be sympathetic to
your needs on these grounds, he will. (I don't expect you to know
that, but it might be useful to know you have a potential ally.)

It's important to distinguish between several levels of "native
support" here.

0. Adding units to all numeric objects.

I don't think level 0 is a very good idea. I think it was Mac Lane
who pointed out that the solution to a problem expressed in
mathematics starts by giving a formal statement of the problem,
manipulates the form according to syntactic rules, and finally
interprets the final form as a semantic solution to the problem.
Similarly, as long as the expected units of the result and the units
output by a Python program are the same, the numbers don't need to
have units. I suspect that from the point of view of astropy,
attaching units to ndarray rather than atomic numeric types is
optimal for this reason, not a workaround.

1.
Syntax, aka language support (the original suggestion of user-defined
literals fits here because either the way numbers are parsed or the
way '_' is parsed would have to change).

It's very difficult to get *any* syntax change in. In particular,
changing '_' from an identifier component to an operator for
combining numeric literals would invalidate *tons* of code (including
internationalization code that is the 0.454kg nearest my heart). I
can't imagine that being possible. There was a fair amount of
pushback to using it as the non-capturing marker in the match
statement.

Changing its interpretation only in a numeric literal is probably
possible, since as far as I know it's currently an error to follow a
sequence of digits with _ (other than as a digit separator, as in
1_000) without interposing an operator or whitespace.

NOTE: The big problem I see with this is that I don't see any
practical way to use syntax for non-numeric types like ndarray. The
problem is that in, say, economics we use a lot of m x 2 arrays where
the two columns have different units (i.e., a quantity unit and a
$/quantity unit). The sensible way to express this isn't some kind of
appended syntax as with numbers, but rather a sequence of units
corresponding to the sequence of columns. If I'm right about that,
the only time the literal syntax would be used is when using Python
interactively as a calculator. Any time you read from a string or
file, you have to interpose a parsing step anyway, so it makes sense
to handle the construction of quantities there.

2. Built-in, that is pre-loaded in the interpreter. These are
implemented as library functionality in the builtins module.

I would definitely be against level 2. It would pollute the builtin
namespace to the great detriment of all non-numeric applications.

3. Modules in the standard library. Always available, but must be
explicitly imported to be used.

I think adding a module that provides units for the stdlib numeric
types is a very interesting idea.
I doubt I'd ever use it, and I'm not sure that the requirements can
be well-enough specified at the moment to go to +1 on it. (As the
Zen says, "Although never is often better than *right* now.")

For me, level 4 (available on PyPI) is good enough. I can think of
cases where it would be useful for class demos to be able to
interrogate a computation for its use of units, but that's about it.
So I depend on descriptions of proponents' use cases to sort out the
question of level 1 vs. level 3.

> > The Mars Climate Orbiter [...] navigation error arose because a
> > NASA subcontractor (Lockheed Martin) used Imperial units
> > (pound-seconds) instead of the metric system.

These examples don't need to be repeated over and over again, though.
What is needed is to show that a particular proposal is especially
effective in preventing them compared to others. This is what I
haven't seen.

> One last pontification before I get to my example relating to
> units. We already have examples of features that have both a native
> implementation and library extensions. int and float are primitives
> in Python. They are more than enough for most users, but limiting
> for quite a few other users. So modules like fractions and decimal
> provide extended support, and libraries like numpy provide even
> more data types for task-specific needs.

?? That analogy looks like an argument for users who need units to
get facilities suited to their needs in libraries like numpy and
units.

> That one was on me, and since I am a capable programmer, I ought to
> have been using a units package.

This is the big question, to me: will people be so much more likely
to use units with syntactic support than they are with something like
the units package plus a simple facility for getting the great
majority of units they use with one import or function call?

> echo What literals would look like
> python -c "
> # h = 6.62607015e-34m2kg / 1s

That's pretty unreadable IMO.
And how do you distinguish k(m2) from (km)2? Is the latter always
going to be the natural reading?

> echo 'What bracket syntax might look like'
> python -c "
> # n = (200[lx] * 0.25 * 0.63 * 500[nm] * 30[ms] * (0.00345[mm])**2) / (2 *
> h*c * 683[lm/W] * 2.4**2)
> "

This is quite readable for me. The unit "sticks to" the number
nicely. I guess the parentheses in "(0.00345[mm])**2" are
unnecessary? This also has the k(m2) vs (km)2 issue, if it's real.

> echo units
> python -c "
> from units import unit
> h = unit('m**2*kg/s') (6.62607015e-34)
> [...]
> n = unit('lx')(200) * 0.25 * 0.63 * unit('nm')(550) *
> unit('mm')(0.00345)**2 * unit('ms')(30) / (2 * 2.4**2 * h*c) /
> unit('lm/W')(683)
> print(n)
> "
>
> # Pros - Has units, and no syntax change required
> # Cons - How do you get the final answer?

I don't understand the question:

    >>> unit('m')(1) * unit('kg')(1)
    Quantity(1, ComposedUnit([LeafUnit('m', False), LeafUnit('kg', False)], [], 1))
    >>> str(unit('m')(1) * unit('kg')(1))
    '1.00 m * kg'

> You need to know that units became astropy.units and see below

This isn't a con, because you're proposing to change the Python
distribution. Some form of some library would be added to the
stdlib. If you can't do that, there's no way you're going to get
syntax to support not doing it. :-)

> # Cons - less compact
> # Cons - Not dead simple. Multiple adjacent ()() is going to be
> # unpopular with the crowd that uses Python as if it were
> # Fortran/Matlab

Both of these can be addressed with a prologue

    mps = unit('m/s')
    lx = unit('lx')

and so on. I would expect that various fields would develop their
own abbreviations that would be commonly used.

You missed one "con", here. This won't handle SI prefixes in a
natural way.
I guess the unit() constructor can do it, but it seems pretty nasty
from the point of view of a Pythonista to have to define all of m,
km, mm, g, kg, and mg when you could get away with m and g, and k and
m, and obviously it gets a lot worse as we go to the full suites of
prefixes and units.

> echo astropy.units again, but importing everything into globals
> python -c "
> from astropy.units import *
> [...]
> # Cons - Can't use "m" as a counter (and other namespace collisions)

You could limit the name collisions by having field-specific
submodules. Losing 'm' would be painful, though. I guess you could
use "meter" instead, and ask those who want the abbreviated unit to
assign m = meter.

But most of the comments above aren't particularly important. The
key question remains: How often would the syntax prevent an error
that adding an appropriate units library to the stdlib would not?

Among other things, in a later post you wrote

> Oops, my examples have some other problems:
>
> > # λ = 550e-9nm
>
> should be
>
> > # λ = 550nm

I can easily see myself making this mistake, hearing "lambda equals
550 nanometers", writing "λ = 550e-9", thinking "I really should add
the explicit units", and appending "nm" because that's the unit I
heard. Explicit units won't fix that, nor will syntax for explicit
units. So you need to make sure that when people say "I make units
mistakes a lot" they mean "I think 1000kg as I enter '1000' into a
text box that expects lb", not "my SI prefixes frequently duplicate
my engineering notation exponents".

Python-ideas mailing list -- python-ideas@python.org
Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/RCA4GKJMME2RXH2U4WYM7X6ZSKUIQLPL/
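
The prologue idea, the SI-prefix complaint, and the 550e-9nm slip
discussed above can be put side by side in a small sketch. This is
not the API of the units package or of astropy.units: Quantity, Unit,
PREFIXES and prefixed() are hypothetical names made up for
illustration, only four prefixes are included, and only addition is
checked. It assumes the "m and g, and k and m" approach of crossing a
few base units with a prefix table.

```python
from dataclasses import dataclass

# Hypothetical subset of SI prefixes; "" stands for the unprefixed unit.
PREFIXES = {"k": 1e3, "": 1.0, "m": 1e-3, "n": 1e-9}


@dataclass(frozen=True)
class Quantity:
    """A number already converted to its base unit, plus that unit's name."""
    value: float
    base: str

    def __add__(self, other):
        # This is the class of mistake a units library can catch.
        if not isinstance(other, Quantity) or other.base != self.base:
            raise TypeError("cannot add quantities with different units")
        return Quantity(self.value + other.value, self.base)


@dataclass(frozen=True)
class Unit:
    """A scale factor relative to a named base unit."""
    scale: float
    base: str

    def __call__(self, number):
        # unit(number) style, as in the quoted units example.
        return Quantity(number * self.scale, self.base)

    def __rmul__(self, number):
        # number * unit style, which reads closer to "550 nm".
        return self(number)


def prefixed(base):
    """Build every prefixed variant of one base unit from PREFIXES."""
    return {prefix + base: Unit(scale, base) for prefix, scale in PREFIXES.items()}


# Prologue: two base units crossed with the prefix table, rather than
# spelling out m, km, mm, g, kg, mg, ... one by one.
units = {**prefixed("m"), **prefixed("g")}
nm, km, kg = units["nm"], units["km"], units["kg"]

wavelength = 550 * nm   # what was meant: 550 nanometers, stored in meters
oops = 550e-9 * nm      # accepted without complaint, roughly 5.5e-16 m

try:
    km(1) + kg(1)       # length plus mass is rejected
except TypeError as exc:
    mismatch = str(exc)
```

The mismatched addition is rejected, but the duplicated exponent in
"oops" goes through, with or without dedicated syntax, because the
expression is dimensionally valid.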