In the previous thread (Custom C++ literals), ChrisA raised some good questions, some of which I can actually answer :D

> Part of the problem here is that Python has to be many many things.
> Which set of units is appropriate? For instance, in a lot of contexts,
> it's fine to simply attach K to the end of something to mean "a
> thousand", while still keeping it unitless; but in other contexts,
> 273K clearly is a unit of temperature. (Although I think the solution
> there is to hard-disallow prefixes without units, as otherwise there'd
> be all manner of collisions.) Is it valid to refer to fifteen
> Angstroms as 15A, or do you have to say 15Å, or 15e-10m and accept
> that it's now a float not an int? Similarly, what if you want to write
> a Python script that works in natural units - the Planck length, mass,
> time, and temperature?

I think if you look into the CGPM standards (they're the grand pooh-bahs who decide what SI units are), you'd find that a lot of these potential collisions have already been encountered and resolved. Under SI, there is no ambiguity regarding K. K means Kelvin and only Kelvin, whereas k means 1000.

Some units, like Å, do pose challenges. We often substitute u for μ, which works fine since there don't seem to be any SI units that start with u. But we can't do likewise for Å, since A is already reserved for Amperes. The easy way out is to say Å is not SI, so it's out. But I would rather not see this feature limited to SI units only (although SI should be preferred). A somewhat gentler approach would be to let Å be Å. Unicode letters are allowed in Python these days. I use theta, mu, lambda - the whole bunch of them - in my code all the time. If someone wants to use Å badly enough, let them use the Unicode for it; otherwise use nm.

Units like the Planck length are valid, and I don't see any reason to exclude them. The problem is that neither CGPM nor anyone else, as far as I can tell, has created a unit for the Planck length (or for other similar quantities) that is lexicographically distinct from other units.
And creating one would only be worth the trouble if all of the physicists who might use it could immediately recognize it. Not that I speak for them, but I'm guessing the folks who run SciPy or astropy could help answer these sorts of questions, rather than trying to get a Python steering committee to work with a possibly more bureaucratic organization like CGPM.

Regarding precision, this is not something that many scientists and engineers understand as well as computer scientists and engineers do. I'd rather see units available for integers as well as floats. I think that as long as a unit is defined, it makes sense to allow integer quantities of it. If they are to be built-in types, as I would prefer, then I suppose unfortunately one would not be able to define fractions of these units as new units. But again, most of this work is done with floats anyway, so if units were only available for floats, I would still see this as a big step forward.

Related to these questions, there is the question of what to do about mixed systems. Should 2.54 cm / 1 in evaluate to 2.54 cm/in, or should it evaluate to 1? I'd much rather it evaluate to 1, but if anyone else has a stronger opinion, I would not let a dispute over such a thing stand in the way of getting units. Regarding 1m / 1mm, though, I have a much stronger opinion. It should be 1000, without any units.

There is yet another question related to the interpretation of K as 1000 vs Kelvin. As I said, SI is clear that K means Kelvin, but what about Python users who are not familiar with SI? What about those in the financial industry? To them, K means 1000, and they might not even know what Kelvin is. Now, unless adding a suffix K to a bare number is supported later on, a financial person would have to go pretty far out of their way, or be looking at the wrong code, to be confused by something referring to Kelvin.
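
To make the mixed-system simplification concrete, here is a minimal sketch in today's Python, with no new syntax. Everything in it is an assumption for illustration: the `UNITS` table, the `Quantity` class, and the `q()` helper are hypothetical names, not part of any existing library. A ratio of two lengths collapses to a plain number, so 1m / 1mm gives 1000, and a length in centimetres divided by the same length in inches gives 1. Exact fractions are used so that integer quantities stay exact.

```python
from dataclasses import dataclass
from fractions import Fraction

# Hypothetical unit table: suffix -> (scale to the base unit, dimension exponents).
# Only length is modeled; a real package would cover far more.
UNITS = {
    "m": (Fraction(1), {"length": 1}),
    "km": (Fraction(1000), {"length": 1}),
    "cm": (Fraction(1, 100), {"length": 1}),
    "mm": (Fraction(1, 1000), {"length": 1}),
    "in": (Fraction(254, 10000), {"length": 1}),
}


@dataclass(frozen=True)
class Quantity:
    value: Fraction  # magnitude expressed in base units
    dims: tuple      # sorted (dimension, exponent) pairs

    def __truediv__(self, other):
        # Subtract the divisor's exponents, then drop dimensions that cancel.
        exps = dict(self.dims)
        for name, exp in other.dims:
            exps[name] = exps.get(name, 0) - exp
        dims = tuple(sorted((n, e) for n, e in exps.items() if e != 0))
        result = self.value / other.value
        # A fully cancelled ratio is returned as a plain, unitless number.
        return result if not dims else Quantity(result, dims)


def q(number, suffix):
    """Build a Quantity from a number and a unit suffix (stand-in for a literal)."""
    scale, dims = UNITS[suffix]
    return Quantity(Fraction(str(number)) * scale, tuple(sorted(dims.items())))


print(q(1, "m") / q(1, "mm"))     # prints 1000
print(q(2.54, "cm") / q(1, "in"))  # prints 1
```

Converting everything to a base unit on construction is only one possible design; keeping the original unit and converting lazily would preserve "in/cm" style results for anyone who prefers them.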
It would indeed be a mistake to assume that everyone who uses Python wants, and can live with, SI units, or even that they would all be using the same set of units! Which brings me to the next part of ChrisA's reply...

>
> Purity and practicality are at odds here. Practicality says that you
> should be able to have "miles" as a unit, purity says that the only
> valid units are pure SI fundamentals and everything else is
> transformed into those. Leaving it to libraries would allow different
> Python programs to make different choices.
>
> But I would very much like to see a measure of language support for
> "number with alphabetic tag", without giving it any semantic meaning
> whatsoever. Python currently has precisely one such tag, and one
> conflicting piece of syntax: "10j" means "complex(imag=10)", and
> "10e1" means "100.0". (They can of course be combined, 10e1j does
> indeed mean 100*sqrt(-1).) This is what could be expanded.
>

As I mentioned above, I am not a purist. I keep a set of Thorlabs thread adapters handy in my lab so that I can screw imperial cage plates onto metric posts.

I think I diverge from (or perhaps just don't understand) the statement on "semantic meaning". To me, the semantic meaning of the units seems pretty essential. Wherever possible, units should be simplified in a prescribed manner: 1W * 1s = 1J, 10km / 1cm = 1000000. The meaning of these suffixes should be explicit, not implicit. Also, see above about the precision of unit-aware data types. Floating point only would be fine, but I don't see why integers cannot be supported as well.

> C++ does things differently, since it can actually compile things in,
> and declarations earlier in the file can redefine how later parts of
> the file get parsed. In Python, I think it'd make sense to
> syntactically accept *any* suffix, and then have a run-time
> translation table that can have anything registered; if you use a
> suffix that isn't registered, it's a run-time error.
> Something like this:
>
> import sys
> # sys.register_numeric_suffix("j", lambda n: complex(imag=n))
> sys.register_numeric_suffix("m", lambda n: unit(n, "meter"))
> sys.register_numeric_suffix("mol", lambda n: unit(n, "mole"))
>
> (For backward compatibility, the "j" suffix probably still has to be
> handled at compilation time, which would mean you can't actually do
> that first one.)
>
> Using it would look something like this:
>
> def spread():
>     """Calculate the thickness of avocado when spread on
>     a single slice of bread"""
>     qty = 1.5mol
>     area = 200mm * 200mm
>     return qty / area
>
> Unfortunately, these would no longer be "literals" in the same way
> that imaginary numbers are, but let's call them "unit displays". To
> evaluate a unit display, you take the literal (1.5) and the unit
> (stored as a string, "mol"), and do a lookup into the core table
> (CPython would probably have an opcode for this, rather than doing it
> with a method that could be overridden, but it would basically be
> "sys.lookup_unit(1.5, 'mol')" or something). Whatever it gives back is
> the object you use.
>
> Does this seem like a plausible way to go about it?

As far as registering units goes, I think registering individual units is a bit much. Of course, several of these statements could be put inside a module or package to make things easier. But I also don't like that it means the syntax of the "literals" needs to be allowed during parsing, with the interpreter left to figure out whether the unit was registered.

I do think it is reasonable to require programmers to "opt in" to using SI or other units, and possibly even to specify which set or sets of units they intend to use. But if their constants are ill-formed, then that should still be caught during parsing and raise a SyntaxError. How that would be implemented behind the scenes, I don't know, but from a syntax point of view, I am envisioning something like a namespace statement with a new keyword (I propose `measure`).
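
As a rough illustration of the run-time lookup ChrisA describes, here is a minimal sketch that uses strings in place of the proposed literals. Note the assumptions: `register_numeric_suffix` and `evaluate_unit_display` are hypothetical module-level helpers written for this example (nothing like them exists in `sys` today), and a plain `(number, name)` tuple stands in for the `unit(...)` object in the quoted code.

```python
import re

# Hypothetical registry: suffix -> factory. Not a real Python API.
_SUFFIXES = {}

# A number (int or simple decimal) followed immediately by an alphabetic tag.
_UNIT_DISPLAY = re.compile(r"(\d+(?:\.\d+)?)([A-Za-z]+)")


def register_numeric_suffix(suffix, factory):
    """Record how a suffix turns a bare number into an object."""
    _SUFFIXES[suffix] = factory


def evaluate_unit_display(text):
    """Evaluate text such as '1.5mol' against the registry."""
    match = _UNIT_DISPLAY.fullmatch(text)
    if match is None:
        # Malformed shape: this part could be rejected by a parser.
        raise SyntaxError(f"not a number followed by a suffix: {text!r}")
    number, suffix = match.groups()
    if suffix not in _SUFFIXES:
        # Well-formed but unregistered: under a run-time table this only
        # surfaces when the line actually executes.
        raise LookupError(f"unregistered numeric suffix: {suffix!r}")
    value = float(number) if "." in number else int(number)
    return _SUFFIXES[suffix](value)


register_numeric_suffix("m", lambda n: (n, "meter"))
register_numeric_suffix("mol", lambda n: (n, "mole"))

print(evaluate_unit_display("1.5mol"))  # prints (1.5, 'mole')
```

The two error paths show the distinction at issue: a malformed token can be caught early, while an unregistered suffix cannot be detected until the table is consulted, unless the set of units is known at parse time.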
By "namespace statement" I mean statements like `nonlocal` and `global`, not something like `argparse.Namespace`. Consider the following example as of today:

```
A = 1
global A
A = 2
```

This will generate a syntax error during parsing:

SyntaxError: name 'A' is assigned to before global declaration

Similarly, what I envision is something like this:

```
length = 12m
```

SyntaxError: invalid syntax

```
measure SI
length = 12m
width = 10mm
area = length * width
print(area)
```

... with no SyntaxErrors and a result of "0.12 m2".

After the "measure SI" statement, all literals that are formed with SI units are considered valid syntax and are evaluated accordingly. Prior to "measure SI", only the unitless primitives are allowed.

Clearly this works differently from the `global` and `nonlocal` statements, which change the namespace in which a name is bound. Also, the choice of keyword matters, because making "measure" a keyword would probably break a lot of existing code (3to4.py!!!). But it is dead simple, and it does behave in a way that is actually quite similar to modifying the existing namespace.

This ended up being a much longer reply than I anticipated, but I hope it helps.
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/QUD5GG3CBORW5OJ45DVNSACFZQG6SOXN/