Re: [Python-ideas] real numbers with SI scale factors: next steps

2016-09-01 Thread Stephen J. Turnbull
Guido van Rossum writes:
 > On Wed, Aug 31, 2016 at 8:57 PM, Stephen J. Turnbull wrote:

 > > That seems to be the right approach: in system administration, these
 > > numbers are used mostly to understand resource usage, and
 > > underestimates are almost never what you want,
 > 
 > That would seem to apply to "space used" but not to "space available".

True, but I don't think the implications are symmetric.  I buy storage
to handle space (expected to be) used, not space available.  But when
I find myself caring about the "slop" in space available, the fact
that I care about that is already very bad news.  Time to head for
Fry's Electronics!

As I wrote before, I don't think the same argument applies to
scientific computing.

___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] real numbers with SI scale factors: next steps

2016-09-01 Thread Random832
On Thu, Sep 1, 2016, at 02:17, Greg Ewing wrote:
> I don't think a space should be automatic. The typographical
> recommendation is to put a thin non-breaking space between
> the value and the unit, but this is not possible with a
> monospaced font, so some people might decide that it's
> better without a space, or they might want to use a
> character other than 0x20. Better to let the user put the
> space in the format string if wanted.

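If the space needs to go between the number and the (prefixed) unit, as
in "1 kA", there's no good way to do this from the format string, since
the format code emits the number and the prefix together. I think this
is an argument for a separate function that returns a tuple of
(formatted number, prefix).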
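
A minimal sketch of such a function, assuming a hypothetical name `si_split` and a small prefix table (micro to giga); nothing like it exists in the standard library. It returns the number and the prefix separately, so the caller decides what goes between them:

```python
# Hypothetical helper for illustration only; finite values assumed.
PREFIXES = {-6: "u", -3: "m", 0: "", 3: "k", 6: "M", 9: "G"}

def si_split(value):
    """Return (formatted number, prefix) so the caller picks the separator."""
    # Let the 'e' format find the decimal exponent, then round it down
    # to a multiple of 3.
    exponent = int("{:e}".format(value).split("e")[1])
    exp3 = 3 * (exponent // 3)
    if exp3 not in PREFIXES:
        # Outside the table: give up on scaling and return no prefix.
        return "{:g}".format(value), ""
    return "{:g}".format(value / 10.0 ** exp3), PREFIXES[exp3]

number, prefix = si_split(32e7)
print(number + "\u202f" + prefix + "Hz")  # narrow no-break space as separator
print(number + prefix + "Hz")             # or no separator at all
```

The separator stays entirely in the caller's hands, which is the point of returning a tuple rather than a finished string.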

Incidentally, do we have a good primitive to return (string of digits,
exponent or position of decimal point) a la C's ecvt/fcvt? This would be
something that might be useful in allowing users to build their own
formatting code. It can be worked around though ("guess" the exponent,
scale with multiplication or division, round to an integer to get the
string of digits) so I guess it's not that important.
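
As a rough illustration of that workaround, the existing 'e' format can do the guessing and rounding for us. The function name `digits_and_point` is made up for this sketch, and it only loosely follows ecvt (no sign flag, finite non-zero values assumed):

```python
def digits_and_point(value, ndigits):
    """Return (digit string, decimal point position), loosely like C's ecvt."""
    # The 'e' format yields d.ddd...e+XX with ndigits significant digits.
    mantissa, exponent = "{:.{}e}".format(abs(value), ndigits - 1).split("e")
    # The point in d.ddd sits after the first digit, hence exponent + 1.
    return mantissa.replace(".", ""), int(exponent) + 1

print(digits_and_point(1234.5, 3))   # ('123', 4)
print(digits_and_point(0.00027, 2))  # ('27', -3)
```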

> I'm inclined to think it should be the number of significant
> digits, not decimal places, to give a more consistent
> precision as the magnitude of the number changes.
> 
> For example, if you're displaying some resistor values that
> are accurate to 2 digits, you would want to see 2.7k,
> 27k, 270k, but not 27.0k or 270.0k as those would suggest
> spurious precision.

What I was getting at is that there are two different use cases possible
here.

> This would also help with fitting the value into a fixed
> width, since you would know that a precision of n would
> use at most n+1 characters for the numeric part.

Exactly n+1, surely? And on the other hand a fixed number of decimal
places allows easy alignment by right-justifying the text within a field
(and will use at most n+4 characters for the numeric part).


Re: [Python-ideas] real numbers with SI scale factors: next steps

2016-09-01 Thread Greg Ewing

On 2016-08-31 17:19, Guido van Rossum wrote:
> I guess we need to debate what it should do if the value is
> way out of range of the SI scale system -- what's it going to do when
> I pass it 1e50? I propose that it should fall back to 'g' style then,
> but use "engineering" style where exponents are always a multiple of
> 3.)

An alternative would be to use the largest or smallest scale
factor available, and use e format to make up the difference.
Not sure whether that would be better or worse.
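
To make the two options concrete, here is a sketch of both. The names `h_format` and `h_format_clamped` and the use of "u" for micro are assumptions for illustration; this is not a proposed implementation, and it only handles finite values:

```python
SI_PREFIXES = {-24: "y", -21: "z", -18: "a", -15: "f", -12: "p", -9: "n",
               -6: "u", -3: "m", 0: "", 3: "k", 6: "M", 9: "G", 12: "T",
               15: "P", 18: "E", 21: "Z", 24: "Y"}

def _exp3(value):
    # Decimal exponent as reported by the 'e' format, rounded down to a
    # multiple of 3.
    exponent = int("{:e}".format(value).split("e")[1])
    return 3 * (exponent // 3)

def h_format(value):
    """SI prefix when one exists, otherwise an engineering-style exponent."""
    exp3 = _exp3(value)
    scaled = value / 10.0 ** exp3
    if exp3 in SI_PREFIXES:
        return "{:g}{}".format(scaled, SI_PREFIXES[exp3])
    return "{:g}e{:+d}".format(scaled, exp3)

def h_format_clamped(value):
    """The alternative: clamp to the largest or smallest prefix and let
    the 'g' format's exponent make up the difference."""
    exp3 = max(-24, min(24, _exp3(value)))
    return "{:g}{}".format(value / 10.0 ** exp3, SI_PREFIXES[exp3])

print(h_format(4700.0))        # 4.7k
print(h_format(1e50))          # 100e+48
print(h_format_clamped(1e50))  # 1e+26Y
```

Seeing "100e+48" next to "1e+26Y" may help decide which one is better or worse.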

--
Greg


Re: [Python-ideas] real numbers with SI scale factors: next steps

2016-08-31 Thread Greg Ewing

Nikolaus Rath wrote:
> There's also the important nitpick if 32e7 is best rendered as 320 M or
> 0.32 G. There's valid applications for both.

If you want 0.32 G it's probably because you're showing it
alongside other values >= 1 G, so you're really getting into
the business of letting the user choose the prefix.

The default should be 320 M, I think. (Unless it's a
capacitor value, where there's a long-standing convention
in some circles to use uF or pF, but never nF. :-)

--
Greg


Re: [Python-ideas] real numbers with SI scale factors: next steps

2016-08-31 Thread Greg Ewing

Random832 wrote:
> One thing to consider is that this is very likely to be used with a unit
> (e.g. "%hA" intending to display in amperes), so maybe it should put a
> space after it? Though really people are probably going to want "1 A" vs
> "1 kA" in that case, rather than "1 A" vs "1kA".

I don't think a space should be automatic. The typographical
recommendation is to put a thin non-breaking space between
the value and the unit, but this is not possible with a
monospaced font, so some people might decide that it's
better without a space, or they might want to use a
character other than 0x20. Better to let the user put the
space in the format string if wanted.


> Engineering or SI-scale-factor format suggests a third
> possibility: number of decimal places to be shown after the displayed
> decimal point, e.g. "%.1h" % 1.2345 * 10 ** x for x in range(10): "1.2",
> "12.3", "123.5", "1.2k", "12.3k", "123.5k", "1.2M", "12.3M", "123.5M".

I'm inclined to think it should be the number of significant
digits, not decimal places, to give a more consistent
precision as the magnitude of the number changes.

For example, if you're displaying some resistor values that
are accurate to 2 digits, you would want to see 2.7k,
27k, 270k, but not 27.0k or 270.0k as those would suggest
spurious precision.

This would also help with fitting the value into a fixed
width, since you would know that a precision of n would
use at most n+1 characters for the numeric part.
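
A sketch of the significant-digits reading, using the resistor values above. The name `si_sig` and the short prefix table are assumptions for illustration. It works on the digit string rather than going through 'g', because '{:.2g}'.format(270.0) would switch to exponent notation:

```python
# Hypothetical helper; only covers micro to giga and finite non-zero values.
PREFIXES = {-6: "u", -3: "m", 0: "", 3: "k", 6: "M", 9: "G"}

def si_sig(value, sig=2):
    """Format value with `sig` significant digits and an SI prefix."""
    mantissa, exponent = "{:.{}e}".format(abs(value), sig - 1).split("e")
    exponent = int(exponent)
    shift = exponent % 3                 # digits before the point, minus one
    # Pad with zeros when the integer part needs more digits than `sig`.
    digits = mantissa.replace(".", "").ljust(shift + 1, "0")
    whole, frac = digits[:shift + 1], digits[shift + 1:]
    number = whole + ("." + frac if frac else "")
    sign = "-" if value < 0 else ""
    return sign + number + PREFIXES[exponent - shift]

for ohms in (2.7e3, 27e3, 270e3):
    print(si_sig(ohms, 2))   # 2.7k, 27k, 270k
```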

--
Greg



Re: [Python-ideas] real numbers with SI scale factors: next steps

2016-08-31 Thread Greg Ewing

Steven D'Aprano wrote:
> On Tue, Aug 30, 2016 at 09:08:01PM -0700, Ken Kundert wrote:
>> My thinking was that r stands for real like f stands for float.

The next available letter in the e, f, g sequence would
be 'h'.

If you want it to stand for something, it could be
"human-readable" or "human-oriented". (There's a precedent
for this in the "df" unix utility which has a -H option
producing SI prefixes.)

> I'm talking about choosing between "M" or "mega". The actual unit
> itself is up to the caller to supply.


Maybe 'h' for abbreviations and 'H' for full prefixes?

--
Greg


Re: [Python-ideas] real numbers with SI scale factors: next steps

2016-08-31 Thread Ken Kundert
All,
Armed with all of your requirements, suggestions and good ideas, I believe 
I am ready to try to put something together.

Thank you all, and once again let me apologize for 'all the drama'.
I'll let you know when I have something.

-Ken


Re: [Python-ideas] real numbers with SI scale factors: next steps

2016-08-31 Thread Guido van Rossum
On Wed, Aug 31, 2016 at 8:57 PM, Stephen J. Turnbull wrote:
> Random832 writes:
>
>  > Also, interesting quirk - it always rounds up. 1025 bytes is "1.1K", and
>  > in SI mode, 1001 bytes is "1.1k"
>
> That seems to be the right approach: in system administration, these
> numbers are used mostly to understand resource usage, and
> underestimates are almost never what you want, while quite large
> overestimates are tolerable, and are typically limited because the
> actual precision of calculations is much higher than that of the
> "human-readable" output.

That would seem to apply to "space used" but not to "space available".

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-ideas] real numbers with SI scale factors: next steps

2016-08-31 Thread Stephen J. Turnbull
Random832 writes:

 > Also, interesting quirk - it always rounds up. 1025 bytes is "1.1K", and
 > in SI mode, 1001 bytes is "1.1k"

That seems to be the right approach: in system administration, these
numbers are used mostly to understand resource usage, and
underestimates are almost never what you want, while quite large
overestimates are tolerable, and are typically limited because the
actual precision of calculations is much higher than that of the
"human-readable" output.

I don't know if that would be true in general-purpose programming.  I
suspect not.



Re: [Python-ideas] real numbers with SI scale factors: next steps

2016-08-31 Thread Nikolaus Rath
On Aug 31 2016, Guido van Rossum wrote:
> On Wed, Aug 31, 2016 at 5:21 AM, Nick Coghlan wrote:
>> On 31 August 2016 at 17:07, Chris Angelico wrote:
>>> On Wed, Aug 31, 2016 at 2:08 PM, Ken Kundert wrote:
>>>> > What's the mnemonic here? Why "r" for scale factor?
>>>>
>>>> My thinking was that r stands for real like f stands for float.
>>>> With the base 2 scale factors, b stands for binary.
>>>
>>> "Real" has historically often been a synonym for "float", and it
>>> doesn't really say that it'll be shown in engineering notation. But
>>> then, we currently have format codes 'e', 'f', and 'g', and I don't
>>> think there's much logic there beyond "exponential", "floating-point",
>>> and... "general format"? I think that's a back-formation, frankly, and
>>> 'g' was used simply because it comes nicely after 'e' and 'f'. (C's
>>> decision, not Python's, fwiw.) I'll stick with 'r' for now, but it
>>> could just as easily become 'h' to avoid confusion with %r for repr.
>>
>> "h" would be a decent choice - it's not only a continuation of the
>> e/f/g pattern, it's also very commonly used as a command line flag for
>> "human-readable output" in system utilities that print numbers.
>
> I like it. So after all the drama we're just talking about adding an
> 'h' format code that's like 'g' but uses SI scale factors instead of
> exponents. I guess we need to debate what it should do if the value is
> way out of range of the SI scale system -- what's it going to do when
> I pass it 1e50? I propose that it should fall back to 'g' style then,
> but use "engineering" style where exponents are always a multiple of
> 3.)

There's also the important nitpick if 32e7 is best rendered as 320 M or
0.32 G. There's valid applications for both.

Best,
-Nikolaus

-- 
GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F
Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

 »Time flies like an arrow, fruit flies like a Banana.«

Re: [Python-ideas] real numbers with SI scale factors: next steps

2016-08-31 Thread Eric V. Smith
On 08/31/2016 01:07 PM, MRAB wrote:
> On 2016-08-31 17:19, Guido van Rossum wrote:
>> On Wed, Aug 31, 2016 at 5:21 AM, Nick Coghlan  wrote:
>>> "h" would be a decent choice - it's not only a continuation of the
>>> e/f/g pattern, it's also very commonly used as a command line flag for
>>> "human-readable output" in system utilities that print numbers.
>>
>> I like it. So after all the drama we're just talking about adding an
>> 'h' format code that's like 'g' but uses SI scale factors instead of
>> exponents. I guess we need to debate what it should do if the value is
>> way out of range of the SI scale system -- what's it going to do when
>> I pass it 1e50? I propose that it should fall back to 'g' style then,
>> but use "engineering" style where exponents are always a multiple of
>> 3.)

Would you also want h to work with integers?

>>> The existing "alternate form" marker in string formatting could be
>>> used to request the use of the base 2 scaling prefixes rather than the
>>> base 10 ones: "#h".
>>
>> Not sure about this one.
>>

'#' already has a meaning for float's 'g' format:

>>> format(1.0, 'g')
'1'
>>> format(1.0, '#g')
'1.0'

So I think you'd want to pick another type character to mean base 2
scaling, or another character other than #. But it gets cryptic pretty
quickly.

You could indeed use type == 'b' for floats to mean base 2 scaling,
since it has no current meaning, but I'm not sure that's a great idea
because 'b' means binary for integers, and if you want to also be able
to scale ints (see above), then there's a conflict. Maybe type == 'z'?
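
The conflict is easy to see with a small probe. `accepts_type_code` is just a throwaway helper written for this illustration:

```python
def accepts_type_code(value, code):
    """Return True if format() accepts `code` for this value's type."""
    try:
        format(value, code)
    except ValueError:
        return False
    return True

print(format(100, "b"))                # 1100100 (binary, ints only)
print(accepts_type_code(100, "b"))     # True
print(accepts_type_code(100.0, "b"))   # False: no meaning for floats today
```

So 'b' is free for floats, but giving it a scaling meaning there would leave ints unable to follow suit.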

Or, use something like '@' (or whatever) instead of '#' to mean "the
other alternate form", base 2 scaling.

> Does the 'type' have to be a single character?

As a practical matter, yes, it should just be a single character. You
could make a special case for 'h' and 'hb', but I would not recommend
that. Explaining it in the documentation would be confusing.

Eric.




Re: [Python-ideas] real numbers with SI scale factors: next steps

2016-08-31 Thread Random832
On Wed, Aug 31, 2016, at 13:43, Random832 wrote:
> And the actual -h behavior of those system utilities you mentioned is
> "123k", "1.2M", "12M", with the effect being that the value always fits
> within a four-character field width, but this isn't a fixed number of
> decimal places *or* significant digits.

I just did some testing... it can go to five characters when binary
prefixes are used for e.g. "1023K".

Also, interesting quirk - it always rounds up. 1025 bytes is "1.1K", and
in SI mode, 1001 bytes is "1.1k"
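
For reference, a sketch that reproduces the outputs I saw, using exact integer ceiling division. The function name `human_ceil` is made up, and I'm not claiming this matches what any particular utility does internally:

```python
def human_ceil(nbytes, base=1000):
    """Format a non-negative integer byte count, always rounding up."""
    # "k" in SI mode and "K" in 1024 mode, matching the outputs quoted above.
    suffixes = ["", "k" if base == 1000 else "K", "M", "G", "T"]
    power = 0
    while power < len(suffixes) - 1 and nbytes >= base ** (power + 1):
        power += 1
    if power == 0:
        return str(nbytes)
    divisor = base ** power
    # -(-a // b) is ceiling division, exact for integers.
    tenths = -(-nbytes * 10 // divisor)
    if tenths < 100:
        # Below 10 units: keep one decimal place.
        return "{}.{}{}".format(tenths // 10, tenths % 10, suffixes[power])
    # 10 units or more: whole numbers only (may reach e.g. "1023K").
    return "{}{}".format(-(-nbytes // divisor), suffixes[power])

print(human_ceil(1001))                    # 1.1k
print(human_ceil(1025, base=1024))         # 1.1K
print(human_ceil(1023 * 1024, base=1024))  # 1023K
```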


Re: [Python-ideas] real numbers with SI scale factors: next steps

2016-08-31 Thread Random832
On Wed, Aug 31, 2016, at 12:19, Guido van Rossum wrote:
> On Wed, Aug 31, 2016 at 5:21 AM, Nick Coghlan  wrote:
> > "h" would be a decent choice - it's not only a continuation of the
> > e/f/g pattern, it's also very commonly used as a command line flag for
> > "human-readable output" in system utilities that print numbers.
> 
> I like it. So after all the drama we're just talking about adding an
> 'h' format code that's like 'g' but uses SI scale factors instead of
> exponents. I guess we need to debate what it should do if the value is
> way out of range of the SI scale system -- what's it going to do when
> I pass it 1e50? I propose that it should fall back to 'g' style then,
> but use "engineering" style where exponents are always a multiple of
> 3.)

One thing to consider is that this is very likely to be used with a unit
(e.g. "%hA" intending to display in amperes), so maybe it should put a
space after it? Though really people are probably going to want "1 A" vs
"1 kA" in that case, rather than "1 A" vs "1kA".

Also, maybe consider that "1*10^50" [or, slightly less so, 1.0*10**50]
is more human-readable than "1e+50". Er, with engineering style it'd be
100e+48 etc, but same basic issue.

Also, is it really necessary to use single-character codes not shared
with any other language? The only rationale here seems to be a desire to
support everything in % and its limited grammar rather than requiring
anyone to use format. If this feature is only supported in format a more
verbose description of the desired format could be used. What if, for
example, you want engineering style without SI scale factors?

What should the "precision" field mean? %f takes a number of places
after the decimal point whereas %e/%g takes a number of significant
digits. Engineering or SI-scale-factor format suggests a third
possibility: number of decimal places to be shown after the displayed
decimal point, e.g. "%.1h" % 1.2345 * 10 ** x for x in range(10): "1.2",
"12.3", "123.5", "1.2k", "12.3k", "123.5k", "1.2M", "12.3M", "123.5M".

And the actual -h behavior of those system utilities you mentioned is
"123k", "1.2M", "12M", with the effect being that the value always fits
within a four-character field width, but this isn't a fixed number of
decimal places *or* significant digits.
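
To compare the precision meanings side by side: the first three lines are existing behaviour (note that for 'e' the precision counts digits after the mantissa's point, so ".2e" shows three significant digits, while for 'g' it is the total). `si_fixed` is a hypothetical sketch of the third possibility, limited to values from 1 up to the giga range:

```python
print(format(1234.5, ".2f"))  # 1234.50   places after the point
print(format(1234.5, ".2e"))  # 1.23e+03  places after the mantissa's point
print(format(1234.5, ".2g"))  # 1.2e+03   significant digits

PREFIXES = {0: "", 3: "k", 6: "M", 9: "G"}

def si_fixed(value, places):
    """Hypothetical: fixed decimal places after the *displayed* point."""
    exponent = int("{:e}".format(value).split("e")[1])
    # Clamp to the small table above; values below 1 stay unscaled.
    exp3 = min(9, max(0, 3 * (exponent // 3)))
    return "{:.{}f}{}".format(value / 10.0 ** exp3, places, PREFIXES[exp3])

# Right-justifying lines the decimal points up when every value
# carries a one-character prefix.
for x in (4.7e3, 47e3, 470e3):
    print(si_fixed(x, 1).rjust(7))
```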

> > The existing "alternate form" marker in string formatting could be
> > used to request the use of the base 2 scaling prefixes rather than the
> > base 10 ones: "#h".

If base 2 scaling prefixes are used, should "engineering style" mean
2**[multiple of 10] instead of 10**[multiple of 3]?
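
A sketch of what that could look like. The IEC-style "Ki"/"Mi" spellings, the "*2**n" fallback notation and the name `binary_format` are all my assumptions, not anything proposed so far:

```python
import math

BINARY_PREFIXES = {0: "", 10: "Ki", 20: "Mi", 30: "Gi", 40: "Ti"}

def binary_format(value):
    """Scale by 2**(multiple of 10); fall back to an explicit power of 2."""
    if value == 0:
        return "0"
    # frexp gives value = m * 2**e with 0.5 <= |m| < 1,
    # so floor(log2(|value|)) is e - 1.
    exp10 = 10 * ((math.frexp(value)[1] - 1) // 10)
    scaled = value / 2.0 ** exp10
    if exp10 in BINARY_PREFIXES:
        return "{:g}{}".format(scaled, BINARY_PREFIXES[exp10])
    return "{:g}*2**{}".format(scaled, exp10)

print(binary_format(1536))           # 1.5Ki
print(binary_format(3 * 2.0 ** 70))  # 3*2**70
```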

> Not sure about this one.


Re: [Python-ideas] real numbers with SI scale factors: next steps

2016-08-31 Thread MRAB

On 2016-08-31 17:19, Guido van Rossum wrote:
> On Wed, Aug 31, 2016 at 5:21 AM, Nick Coghlan wrote:
>> On 31 August 2016 at 17:07, Chris Angelico wrote:
>>> On Wed, Aug 31, 2016 at 2:08 PM, Ken Kundert wrote:
>>>> > What's the mnemonic here? Why "r" for scale factor?
>>>>
>>>> My thinking was that r stands for real like f stands for float.
>>>> With the base 2 scale factors, b stands for binary.
>>>
>>> "Real" has historically often been a synonym for "float", and it
>>> doesn't really say that it'll be shown in engineering notation. But
>>> then, we currently have format codes 'e', 'f', and 'g', and I don't
>>> think there's much logic there beyond "exponential", "floating-point",
>>> and... "general format"? I think that's a back-formation, frankly, and
>>> 'g' was used simply because it comes nicely after 'e' and 'f'. (C's
>>> decision, not Python's, fwiw.) I'll stick with 'r' for now, but it
>>> could just as easily become 'h' to avoid confusion with %r for repr.
>>
>> "h" would be a decent choice - it's not only a continuation of the
>> e/f/g pattern, it's also very commonly used as a command line flag for
>> "human-readable output" in system utilities that print numbers.
>
> I like it. So after all the drama we're just talking about adding an
> 'h' format code that's like 'g' but uses SI scale factors instead of
> exponents. I guess we need to debate what it should do if the value is
> way out of range of the SI scale system -- what's it going to do when
> I pass it 1e50? I propose that it should fall back to 'g' style then,
> but use "engineering" style where exponents are always a multiple of
> 3.)
>
>> The existing "alternate form" marker in string formatting could be
>> used to request the use of the base 2 scaling prefixes rather than the
>> base 10 ones: "#h".
>
> Not sure about this one.


Does the 'type' have to be a single character?

If not, how about 'hb' for binary scaling?



Re: [Python-ideas] real numbers with SI scale factors: next steps

2016-08-31 Thread MRAB

On 2016-08-31 05:08, Ken Kundert wrote:
>> What's the mnemonic here? Why "r" for scale factor?
>
> My thinking was that r stands for real like f stands for float.
> With the base 2 scale factors, b stands for binary.


'b' already means binary:

>>> '{:b}'.format(100)
'1100100'



Re: [Python-ideas] real numbers with SI scale factors: next steps

2016-08-31 Thread Guido van Rossum
On Wed, Aug 31, 2016 at 5:21 AM, Nick Coghlan wrote:
> On 31 August 2016 at 17:07, Chris Angelico wrote:
>> On Wed, Aug 31, 2016 at 2:08 PM, Ken Kundert wrote:
>>> > What's the mnemonic here? Why "r" for scale factor?
>>>
>>> My thinking was that r stands for real like f stands for float.
>>> With the base 2 scale factors, b stands for binary.
>>
>> "Real" has historically often been a synonym for "float", and it
>> doesn't really say that it'll be shown in engineering notation. But
>> then, we currently have format codes 'e', 'f', and 'g', and I don't
>> think there's much logic there beyond "exponential", "floating-point",
>> and... "general format"? I think that's a back-formation, frankly, and
>> 'g' was used simply because it comes nicely after 'e' and 'f'. (C's
>> decision, not Python's, fwiw.) I'll stick with 'r' for now, but it
>> could just as easily become 'h' to avoid confusion with %r for repr.
>
> "h" would be a decent choice - it's not only a continuation of the
> e/f/g pattern, it's also very commonly used as a command line flag for
> "human-readable output" in system utilities that print numbers.

I like it. So after all the drama we're just talking about adding an
'h' format code that's like 'g' but uses SI scale factors instead of
exponents. I guess we need to debate what it should do if the value is
way out of range of the SI scale system -- what's it going to do when
I pass it 1e50? I propose that it should fall back to 'g' style then,
but use "engineering" style where exponents are always a multiple of
3.)

> The existing "alternate form" marker in string formatting could be
> used to request the use of the base 2 scaling prefixes rather than the
> base 10 ones: "#h".

Not sure about this one.

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-ideas] real numbers with SI scale factors: next steps

2016-08-31 Thread Nick Coghlan
On 31 August 2016 at 17:07, Chris Angelico wrote:
> On Wed, Aug 31, 2016 at 2:08 PM, Ken Kundert wrote:
>> > What's the mnemonic here? Why "r" for scale factor?
>>
>> My thinking was that r stands for real like f stands for float.
>> With the base 2 scale factors, b stands for binary.
>
> "Real" has historically often been a synonym for "float", and it
> doesn't really say that it'll be shown in engineering notation. But
> then, we currently have format codes 'e', 'f', and 'g', and I don't
> think there's much logic there beyond "exponential", "floating-point",
> and... "general format"? I think that's a back-formation, frankly, and
> 'g' was used simply because it comes nicely after 'e' and 'f'. (C's
> decision, not Python's, fwiw.) I'll stick with 'r' for now, but it
> could just as easily become 'h' to avoid confusion with %r for repr.

"h" would be a decent choice - it's not only a continuation of the
e/f/g pattern, it's also very commonly used as a command line flag for
"human-readable output" in system utilities that print numbers.

The existing "alternate form" marker in string formatting could be
used to request the use of the base 2 scaling prefixes rather than the
base 10 ones: "#h".

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-ideas] real numbers with SI scale factors: next steps

2016-08-31 Thread Steven D'Aprano
On Tue, Aug 30, 2016 at 09:08:01PM -0700, Ken Kundert wrote:
> > What's the mnemonic here? Why "r" for scale factor?
> 
> My thinking was that r stands for real like f stands for float.

Hmmm. Do you know many mathematicians who use SI prefixes when talking 
about real numbers? I don't think "real number" is relevant to SI 
prefixes.


> With the base 2 scale factors, b stands for binary.

Well, obviously :-)


> > (1) Why no support for choosing a particular scale? If this only
> > auto-scales, I'm not interested.
>
> Auto-scaling is kind of the point. There is really little need for a special
> mechanism if you're going to specify the scale factor yourself.

The point is not to have to repeat yourself. If I have to scale numbers 
in lots of places, I don't want to have to re-write the same code in 
each of them. I want to call a function.

Understand that I'm not against auto-scaling. I think it is a good idea. 
But I strongly disagree that it is the *only* way to do this. If there's 
code in the std lib to format numbers to some scale, I should be able to 
loop through a bunch of numbers and format them all in a consistent unit 
if I so choose, without having to do my own formatting.

It's not that I don't want you to be able to auto-scale. I just want the
choice of being able to use a consistent scale or not.
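
Something along these lines is what I have in mind. `format_in` and its multiplier table are hypothetical names for illustration, not an existing API; the caller names the prefix and every value is scaled to it:

```python
# Hypothetical helper: the caller, not the formatter, chooses the scale.
MULTIPLIERS = {"u": 1e-6, "m": 1e-3, "": 1.0, "k": 1e3, "M": 1e6, "G": 1e9}

def format_in(value, prefix, spec="g"):
    """Format value scaled to an explicit prefix, e.g. 'k' or 'G'."""
    return format(value / MULTIPLIERS[prefix], spec) + " " + prefix

# A column of distances, all shown in km regardless of magnitude.
for metres in (50e3, 1200.0, 750.0):
    print(format_in(metres, "k", ".2f") + "m")

print(format_in(32e7, "M") + "Hz")  # 320 MHz
print(format_in(32e7, "G") + "Hz")  # 0.32 GHz
```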

[...]
> If you wanted to force the second number to be in km, you use a %f format and 
> scale the argument:
> 
> >>> print('Attenuation = {:.1f} dB at {:.1f} km.'.format(-13.7, 50e3/1e3))
> Attenuation = -13.7 dB at 50 km.

*shrug* Well, you could do exactly the same thing. You only need a short 
function that determines the scale you want, and then scale it yourself. 
The point of making this a standard function is so that we don't have to 
keep re-writing the same code.


> > (2) Support for full prefix names, so we can format (say) "kilograms" as
> > well as "kg"?
>
> This assumes that somehow this code can access the units so that it can
> switch between long form 'grams' and short form 'g'. That is a huge
> expansion in the complexity for what seems like a small benefit.

No, I'm talking about choosing between "M" or "mega". The actual unit
itself is up to the caller to supply.

You have definitely prodded my interest in the output side of this. I'm 
rather busy at the moment, but in the coming weeks I think I'll brush 
the cobwebs off byteformat and see what can be done.

https://pypi.python.org/pypi/byteformat

in case you want to have a play with it.


-- 
Steve


Re: [Python-ideas] real numbers with SI scale factors: next steps

2016-08-31 Thread Ken Kundert
Thanks Chris.

I had misunderstood Steve's request, and I was thinking of something much more 
complicated.

Your code is very helpful.

-Ken


On Wed, Aug 31, 2016 at 05:07:11PM +1000, Chris Angelico wrote:
> On Wed, Aug 31, 2016 at 2:08 PM, Ken Kundert wrote:
> > > What's the mnemonic here? Why "r" for scale factor?
> >
> > My thinking was that r stands for real like f stands for float.
> > With the base 2 scale factors, b stands for binary.
> 
> "Real" has historically often been a synonym for "float", and it
> doesn't really say that it'll be shown in engineering notation. But
> then, we currently have format codes 'e', 'f', and 'g', and I don't
> think there's much logic there beyond "exponential", "floating-point",
> and... "general format"? I think that's a back-formation, frankly, and
> 'g' was used simply because it comes nicely after 'e' and 'f'. (C's
> decision, not Python's, fwiw.) I'll stick with 'r' for now, but it
> could just as easily become 'h' to avoid confusion with %r for repr.
> 
> >> (2) Support for full prefix names, so we can format (say) "kilograms" as
> >> well as "kg"?
> >
> > This assumes that somehow this code can access the units so that it can
> > switch between long form 'grams' and short form 'g'. That is a huge
> > expansion in the complexity for what seems like a small benefit.
> >
> 
> AIUI, it's just giving the full word.
> 
> class ScaledNumber(float):
>     # Multiplier that converts a base-unit value to the given prefix.
>     invert = {"μ": 1e6, "m": 1e3, "": 1, "k": 1e-3, "M": 1e-6}
>     words = {"μ": "micro", "m": "milli", "": "", "k": "kilo", "M": "mega"}
>     aliases = {"u": "μ"}
>     def autoscale(self):
>         if self < 1e-6: return None
>         if self < 1e-3: return "μ"
>         if self < 1: return "m"
>         if self < 1e3: return ""
>         if self < 1e6: return "k"
>         if self < 1e9: return "M"
>         return None
>     def __format__(self, fmt):
>         if fmt == "r" or fmt == "R":
>             scale = self.autoscale()
>             # No prefix (or out of range): fall back to plain 'f'.
>             fmt = fmt + scale if scale else "f"
>         if fmt.startswith("r"):
>             scale = self.aliases.get(fmt[1], fmt[1])
>             return "%g%s" % (self * self.invert[scale], scale)
>         if fmt.startswith("R"):
>             scale = self.aliases.get(fmt[1], fmt[1])
>             return "%g %s" % (self * self.invert[scale], self.words[scale])
>         # super() already binds self, so pass only the format spec.
>         return super().__format__(fmt)
> 
> >>> range = ScaledNumber(50e3)
> >>> print('Attenuation = {:.1f} dB at {:r}m.'.format(-13.7, range))
> Attenuation = -13.7 dB at 50km.
> >>> print('Attenuation = {:.1f} dB at {:R}meters.'.format(-13.7, range))
> Attenuation = -13.7 dB at 50 kilometers.
> >>> print('Attenuation = {:.1f} dB at {:rM}m.'.format(-13.7, range))
> Attenuation = -13.7 dB at 0.05Mm.
> >>> print('Attenuation = {:.1f} dB at {:RM}meters.'.format(-13.7, range))
> Attenuation = -13.7 dB at 0.05 megameters.
> 
> It's a minor flexibility, but could be very useful. As you see, it's
> still not at all unit-aware; but grammatically, these formats only
> make sense if followed by an actual unit name. (And not an SI base
> unit, necessarily - you have to use "gram", not "kilogram", lest you
> get silly constructs like "microkilogram" for milligram.)
> 
> Note that this *already works*. You do have to use an explicit class
> for your scaled numbers, since Python doesn't want you monkey-patching
> the built-in float type, but if you were to request that
> float.__format__ grow support for this, it'd be a relatively
> non-intrusive change. This class could live on PyPI until one day
> becoming subsumed into core, or just be a permanent third-party float
> formatting feature.
> 
> ChrisA

Re: [Python-ideas] real numbers with SI scale factors: next steps

2016-08-31 Thread Chris Angelico
On Wed, Aug 31, 2016 at 2:08 PM, Ken Kundert wrote:
> > What's the mnemonic here? Why "r" for scale factor?
>
> My thinking was that r stands for real like f stands for float.
> With the base 2 scale factors, b stands for binary.

"Real" has historically often been a synonym for "float", and it
doesn't really say that it'll be shown in engineering notation. But
then, we currently have format codes 'e', 'f', and 'g', and I don't
think there's much logic there beyond "exponential", "floating-point",
and... "general format"? I think that's a back-formation, frankly, and
'g' was used simply because it comes nicely after 'e' and 'f'. (C's
decision, not Python's, fwiw.) I'll stick with 'r' for now, but it
could just as easily become 'h' to avoid confusion with %r for repr.

>> (2) Support for full prefix names, so we can format (say) "kilograms" as well
>> as "kg"?
>
> This assumes that somehow this code can access the units so that it can switch
> between long form 'grams' and short form 'g'. That is a huge expansion in the
> complexity for what seems like a small benefit.
>

AIUI, it's just giving the full word.

class ScaledNumber(float):
    # Multiplier that converts a base-unit value to the given prefix.
    invert = {"μ": 1e6, "m": 1e3, "": 1, "k": 1e-3, "M": 1e-6}
    words = {"μ": "micro", "m": "milli", "": "", "k": "kilo", "M": "mega"}
    aliases = {"u": "μ"}
    def autoscale(self):
        if self < 1e-6: return None
        if self < 1e-3: return "μ"
        if self < 1: return "m"
        if self < 1e3: return ""
        if self < 1e6: return "k"
        if self < 1e9: return "M"
        return None
    def __format__(self, fmt):
        if fmt == "r" or fmt == "R":
            scale = self.autoscale()
            # No prefix (or out of range): fall back to plain 'f'.
            fmt = fmt + scale if scale else "f"
        if fmt.startswith("r"):
            scale = self.aliases.get(fmt[1], fmt[1])
            return "%g%s" % (self * self.invert[scale], scale)
        if fmt.startswith("R"):
            scale = self.aliases.get(fmt[1], fmt[1])
            return "%g %s" % (self * self.invert[scale], self.words[scale])
        # super() already binds self, so pass only the format spec.
        return super().__format__(fmt)

>>> range = ScaledNumber(50e3)
>>> print('Attenuation = {:.1f} dB at {:r}m.'.format(-13.7, range))
Attenuation = -13.7 dB at 50km.
>>> print('Attenuation = {:.1f} dB at {:R}meters.'.format(-13.7, range))
Attenuation = -13.7 dB at 50 kilometers.
>>> print('Attenuation = {:.1f} dB at {:rM}m.'.format(-13.7, range))
Attenuation = -13.7 dB at 0.05Mm.
>>> print('Attenuation = {:.1f} dB at {:RM}meters.'.format(-13.7, range))
Attenuation = -13.7 dB at 0.05 megameters.

It's a minor flexibility, but could be very useful. As you see, it's
still not at all unit-aware; but grammatically, these formats only
make sense if followed by an actual unit name. (And not an SI base
unit, necessarily - you have to use "gram", not "kilogram", lest you
get silly constructs like "microkilogram" for milligram.)

Note that this *already works*. You do have to use an explicit class
for your scaled numbers, since Python doesn't want you monkey-patching
the built-in float type, but if you were to request that
float.__format__ grow support for this, it'd be a relatively
non-intrusive change. This class could live on PyPI until one day
becoming subsumed into core, or just be a permanent third-party float
formatting feature.

ChrisA

Re: [Python-ideas] real numbers with SI scale factors: next steps

2016-08-31 Thread Paul Moore
On 31 August 2016 at 05:08, Ken Kundert  wrote:
> Auto-scaling is kind of the point. There is really little need for a special
> mechanism if you're going to specify the scale factor yourself.
>
> >>> print('Attenuation = {:.1f} dB at {:r}m.'.format(-13.7, 50e3))
> Attenuation = -13.7 dB at 50 km.
>
> If you wanted to force the second number to be in km, you use a %f format and
> scale the argument:
>
> >>> print('Attenuation = {:.1f} dB at {:.1f} km.'.format(-13.7, 50e3/1e3))
> Attenuation = -13.7 dB at 50.0 km.

This argument can just as easily be used against your proposal:

If you want auto-scaling you use a %s format and a suitable library function:

>>> print('Attenuation = {:.1f} dB at {}m.'.format(-13.7, scale(50e3)))
Attenuation = -13.7 dB at 50 km.

Anything that's going to be included in the language has to consider
other requirements than just your own.

> This is suddenly a much bigger project than what I was envisioning.

You're going to have to write the scaling code one way or the other.
Writing it in Python and publishing it as a library is *far* easier
than writing it in C and hooking it into the format mechanism. You can
leave others to offer pull requests to your library to add extra types
of formatting.

IMO, it's probably time to write some code. Publish a library on PyPI
(call it a "prototype" if you like) implementing the scale() function
above, publicise it here and elsewhere, and see what reception it
gets.
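
As a starting point, here is a minimal sketch of what such a scale()
function might look like. It is not code from this thread: the name
`scale` comes from the example above, but the `digits` parameter, the
prefix table, and the "number, space, prefix" return format are all
assumptions made for illustration.

```python
import math

# SI prefixes keyed by power-of-ten exponent (multiples of 3 only).
_PREFIXES = {
    -24: "y", -21: "z", -18: "a", -15: "f", -12: "p", -9: "n", -6: "µ",
    -3: "m", 0: "", 3: "k", 6: "M", 9: "G", 12: "T", 15: "P", 18: "E",
    21: "Z", 24: "Y",
}


def scale(value, digits=4):
    """Return value as 'mantissa prefix', e.g. 50e3 -> '50 k' (hypothetical API)."""
    if value == 0 or not math.isfinite(value):
        # Zero, infinities and NaN have no meaningful prefix.
        return "{:g} ".format(value)
    # Round the decimal exponent down to a multiple of 3.
    exponent = int(math.floor(math.log10(abs(value)) / 3) * 3)
    # Clamp to the range of prefixes in the table.
    exponent = max(-24, min(24, exponent))
    mantissa = value / 10 ** exponent
    return "{:.{}g} {}".format(mantissa, digits, _PREFIXES[exponent])


print('Attenuation = {:.1f} dB at {}m.'.format(-13.7, scale(50e3)))
```

The caller appends the unit, so the function stays unit-agnostic. One
known gap in this sketch: rounding can push the mantissa up to 1000
(for example 999.99 at four digits), which a real library would want
to renormalise.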

Paul


Re: [Python-ideas] real numbers with SI scale factors: next steps

2016-08-30 Thread Ken Kundert
> What's the mnemonic here? Why "r" for scale factor?

My thinking was that r stands for real like f stands for float.
With the base 2 scale factors, b stands for binary.

> (1) Why no support for choosing a particular scale? If this only auto-scales, 
> I'm not interested.

Auto-scaling is kind of the point. There is really little need for a special 
mechanism if you're going to specify the scale factor yourself.

>>> print('Attenuation = {:.1f} dB at {:r}m.'.format(-13.7, 50e3))
Attenuation = -13.7 dB at 50 km.

If you wanted to force the second number to be in km, you use a %f format and 
scale the argument:

>>> print('Attenuation = {:.1f} dB at {:.1f} km.'.format(-13.7, 50e3/1e3))
Attenuation = -13.7 dB at 50.0 km.

> (2) Support for full prefix names, so we can format (say) "kilograms" as well 
> as "kg"?

This assumes that somehow this code can access the units so that it can switch 
between long form 'grams' and short form 'g'. That is a huge expansion in the 
complexity for what seems like a small benefit.

> (3) Scientific notation and engineering notation?
> 
> (4) 1e5 versus 1×10^5 notation?

Ah, okay. But all of these require auto-scaling. And I was still thinking that 
we need to provide input and output capability (i.e., we still need to be able 
to convert whatever format we output back from strings into floats). Are you 
thinking that we should parse 1×10^5? And why 1×10^5 and not 1×10⁵?

> (5) Is this really something that format() needs to understand? We can get 
> a *much* richer and more powerful interface by turning it into a generalised 
> numeric pretty-printing library, at the cost of a little less convenience.

This is suddenly a much bigger project than what I was envisioning.

-Ken


Re: [Python-ideas] real numbers with SI scale factors: next steps

2016-08-30 Thread Chris Angelico
On Wed, Aug 31, 2016 at 12:05 PM, Steven D'Aprano  wrote:
> (5) Is this really something that format() needs to understand? We can
> get a *much* richer and more powerful interface by turning it into a
> generalised numeric pretty-printing library, at the cost of a little less
> convenience.

Or just have a subclass of int or float that defines __format__, and
can do whatever it likes - including specifying the scale, if you so
choose. Say, something like:

{:s} -- autoscale, prefix
{:S} -- autoscale, full word
{:sM} -- scale to mega, print "M"
{:SM} -- scale to mega, print "Mega"
etc

ChrisA


Re: [Python-ideas] real numbers with SI scale factors: next steps

2016-08-30 Thread Steven D'Aprano
On Tue, Aug 30, 2016 at 01:34:27PM -0700, Ken Kundert wrote:

> 3. A change to the various string formatting mechanisms to allow outputting
>    real numbers with SI scale factors:

This is somewhat similar to a library I wrote for formatting bytes:

https://pypi.python.org/pypi/byteformat

Given that feature freeze for 3.6 is two weeks away, I don't think that 
this proposal will appear before 3.7. So I'm interested, but I'm less 
interested *right now*. So for now I'll limit myself to only a few 
observations.

 
>   >>> print('Speed of light in a vacuum: {:r}m/s.'.format(2.9979e+08))
>   Speed of light in a vacuum: 299.79 Mm/s.

Do you think that {:r} might be confused with {!r}?

What's the mnemonic here? Why "r" for scale factor?

 
>   >>> print('Speed of sound in water: %rm/s.' % 1481)
>   Speed of sound in water: 1.481 km/s.

I doubt that you'll get any new % string formatting codes. That's a 
legacy interface, *not* deprecated but unlikely to get new features 
added, and it is intended to closely match the C printf codes.


A few more questions:

(1) Why no support for choosing a particular scale? If this only 
auto-scales, I'm not interested.

(2) Support for full prefix names, so we can format (say) "kilograms" as 
well as "kg"?

(3) Scientific notation and engineering notation?

(4) 1e5 versus 1×10^5 notation?

(5) Is this really something that format() needs to understand? We can 
get a *much* richer and more powerful interface by turning it into a 
generalised numeric pretty-printing library, at the cost of a little less 
convenience.


> 3. Allowing numbers to be formatted with SI prefixes is useful and not
>    controversial.

I wouldn't quite go that far. You made an extremely controversial 
request (new syntax for scaling prefixes + ignored units) and nearly all 
the attention was on that.

For what it's worth, I have no need for a format code which *only* 
auto-selects the scaling factor. If I don't have at least the option to 
choose which scaling factor I get, and hence the prefix, this is of 
little or no use to me, I likely wouldn't use it, and as far as I am 
concerned the nuisance value of having yet another format string code to 
learn outweighs the benefit.



-- 
Steve


Re: [Python-ideas] real numbers with SI scale factors: next steps

2016-08-30 Thread Barry Warsaw
On Aug 30, 2016, at 02:16 PM, Guido van Rossum wrote:

>Given that something like this gets proposed from time to time, I
>wonder if it would make sense to actually write up (1) and (2) as a
>PEP that is immediately marked rejected. The PEP should make it clear
>*why* it is rejected. This would be a handy reference doc to have
>around the next time the idea comes up.

There certainly is precedent: e.g. PEPs 404 and 666. :)

Cheers,
-Barry



Re: [Python-ideas] real numbers with SI scale factors: next steps

2016-08-30 Thread Sven R. Kunze

Thanks a lot for this comprehensive summary. :) Find my comments below.


On 30.08.2016 22:34, Ken Kundert wrote:

> Okay, let's try to wrap this up. In summary I proposed three things:
>
> 1. A change to the Python lexer to accept SI literals as an alternative to,
>    but not a replacement for, E-notation. As an optional feature, simple
>    units could be added to the end but would be largely ignored. So the
>    following would be accepted:
>
>       freq = 2.4GHz
>       r = 1k
>       l = 10nm
>
>    The idea in accepting units was to allow them to be specified when
>    convenient as additional documentation on the meaning of the number.
>
>    Objections:
>    a. Acceptance of the abbreviation for Exa (E) overlaps with E-notation
>       (1E+1 could represent 1e18 + 1 or 10). A suggestion to change the
>       prefix from E to X conflicts with a proposal to use X, W, and V to
>       represent 10^27, 10^30, and 10^33
>       (en.wikipedia.org/wiki/Metric_prefix)

I think this results from the possibility of omitting the SI units.

>    b. Allowing the units to be specified will lead some users to assume
>       a dimensional analysis is being performed when in fact the units are
>       ignored. This false sense of security could lead to bugs.

The same can be said for variable annotations, for which a PEP is in the works.

>    c. The proposal only supports simple units, not compound units such as
>       m/s. So even if hooks were provided to allow access to the units to
>       support an add-on dimensional analysis capability, an additional
>       mechanism would have to be provided to support compound units.

I get the feeling that SI syntax should only work when the hook is provided.

So this could be the dealbreaker: enabling it only when the hook is
provided changes the syntax/semantics of valid Python code depending on
the presence of some hidden hook. Enabling the syntax regardless of a
working hook has the side effects you describe above.

So, no matter how it is done, it always has some downside.

>    d. Many people objected to allowing the use of naked scale factors as
>       a perversion of the standard.

Remove this and it also solves 1.a.

> 2. A change to the float() function so that it accepts SI scale factors and
>    units. This extension naturally follows from the first: the float
>    function should accept anything the Python parser accepts. For example:
>
>       freq = float('2.4GHz')
>       r = float('1k')
>       l = float('10nm')
>
>    Objections:
>    a. The Exa objection from the above proposal is problematic here as well.
>    b. Things that used to be errors are now no longer errors. This could
>       cause problems if a program was counting on float('1k') to be an
>       error.
>
> 3. A change to the various string formatting mechanisms to allow outputting
>    real numbers with SI scale factors:
>
>       >>> print('Speed of light in a vacuum: {:r}m/s.'.format(2.9979e+08))
>       Speed of light in a vacuum: 299.79 Mm/s.
>
>       >>> print('Speed of sound in water: %rm/s.' % 1481)
>       Speed of sound in water: 1.481 km/s.
>
>    Objections:
>    No objections were raised that I recall, however here is something else
>    to consider:
>
>    a. Should we also provide a mechanism for the binary scale factors (Ki,
>       Mi, ..., Yi)? For example: '{:b}B'.format(2**30) --> 1 GiB.
>
> On proposed extension 1 (native support for SI literals) my conclusion is
> that we did not reach any sense of consensus and there was considerable
> opposition to my proposal. There was much less discussion on extensions
> 2 & 3, so it is hard to say whether consensus was reached.
>
> So, given all this, I would like to make the following recommendations:
> 1. No action should be taken.
> 2. The main justification to modifying float() was to make it consistent
>    with the extended Python language. Without extension 1, this
>    justification goes away. However the need to be able to easily convert
>    strings of numbers with SI scale factors into floats still exists. This
>    should be handled by adding a library or extending an existing library.
> 3. Allowing numbers to be formatted with SI prefixes is useful and not
>    controversial. The 'r' and 'b' format codes should be added to the
>    various string formatting mechanisms.
>
> What do you think?

I like your conclusion. What seems to be missing is a technical note on
why this won't happen the way you proposed it (maybe the hook + missing
stdlib package for SI units). :)

Isn't there already a package available for recommendation 3?


Sven



Re: [Python-ideas] real numbers with SI scale factors: next steps

2016-08-30 Thread Guido van Rossum
Given that something like this gets proposed from time to time, I
wonder if it would make sense to actually write up (1) and (2) as a
PEP that is immediately marked rejected. The PEP should make it clear
*why* it is rejected. This would be a handy reference doc to have
around the next time the idea comes up.

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-ideas] real numbers with SI scale factors: next steps

2016-08-30 Thread Paul Moore
On 30 August 2016 at 21:34, Ken Kundert  wrote:
> So, given all this, I would like to make the following recommendations:
> 1. No action should be taken.
> 2. The main justification to modifying float() was to make it consistent with
>    the extended Python language. Without extension 1, this justification goes
>    away. However the need to be able to easily convert strings of numbers with
>    SI scale factors into floats still exists. This should be handled by adding
>    a library or extending an existing library.
> 3. Allowing numbers to be formatted with SI prefixes is useful and not
>    controversial. The 'r' and 'b' format codes should be added to the various
>    string formatting mechanisms.
>
> What do you think?

Thanks for the summary (which I mostly elided) which I think was fair.

Regarding (3), the only one that remains proposed, I think it would be
useful to see a 3rd-party library implementation of the formatting
operation proposed. This would allow any corner cases or controversial
points to be ironed out before proposing it for direct incorporation
in the string formatting mini-language. Furthermore, in Python 3.6, it
will be possible to write

f"The value is {si_format(the_val)}"

directly, using PEP 498 f-strings. The combination of a 3rd party
function and f-strings may even make special formatting support
unnecessary - but that will be easier to establish with practical
experience. And there's little or no downside - the proposed feature
won't be possible before 3.7, so we may as well use the lifetime of the
3.6 release to gain that experience.

Paul


[Python-ideas] real numbers with SI scale factors: next steps

2016-08-30 Thread Ken Kundert
Okay, let's try to wrap this up. In summary I proposed three things:

1. A change to the Python lexer to accept SI literals as an alternative to, but 
   not a replacement for, E-notation. As an optional feature, simple units 
   could be added to the end but would be largely ignored. So the following 
   would be accepted:

  freq = 2.4GHz
  r = 1k
  l = 10nm

   The idea in accepting units was to allow them to be specified when 
   convenient as additional documentation on the meaning of the number.

   Objections:
   a. Acceptance of the abbreviation for Exa (E) overlaps with E-notation (1E+1 
      could represent 1e18 + 1 or 10). A suggestion to change the prefix from 
      E to X conflicts with a proposal to use X, W, and V to represent 10^27, 
      10^30, and 10^33 (en.wikipedia.org/wiki/Metric_prefix)
   b. Allowing the units to be specified will lead some users to assume 
      a dimensional analysis is being performed when in fact the units are 
      ignored. This false sense of security could lead to bugs.
   c. The proposal only supports simple units, not compound units such as m/s.  
      So even if hooks were provided to allow access to the units to support an 
      add-on dimensional analysis capability, an additional mechanism would 
      have to be provided to support compound units.
   d. Many people objected to allowing the use of naked scale factors as 
      a perversion of the standard.

2. A change to the float() function so that it accepts SI scale factors and 
   units. This extension naturally follows from the first: the float function 
   should accept anything the Python parser accepts.  For example:

  freq = float('2.4GHz')
  r = float('1k')
  l = float('10nm')

   Objections:
   a. The Exa objection from the above proposal is problematic here as well.
   b. Things that used to be errors are now no longer errors. This could cause 
      problems if a program was counting on float('1k') to be an error.


3. A change to the various string formatting mechanisms to allow outputting 
   real numbers with SI scale factors:

  >>> print('Speed of light in a vacuum: {:r}m/s.'.format(2.9979e+08))
  Speed of light in a vacuum: 299.79 Mm/s.

  >>> print('Speed of sound in water: %rm/s.' % 1481)
  Speed of sound in water: 1.481 km/s.

   Objections:
   No objections were raised that I recall, however here is something else to 
   consider:

   a. Should we also provide a mechanism for the binary scale factors (Ki, Mi, 
      ..., Yi)? For example: '{:b}B'.format(2**30) --> 1 GiB.
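
To make the binary case concrete, here is a rough sketch of the scaling
such a 'b' code would have to perform, written as an ordinary function.
The name `binary_scale` and its `digits` parameter are hypothetical; the
proposal itself only gives the '{:b}B' example above.

```python
# IEC binary prefixes, one per factor of 1024.
_BINARY_PREFIXES = ["", "Ki", "Mi", "Gi", "Ti", "Pi", "Ei", "Zi", "Yi"]


def binary_scale(value, digits=4):
    """Return value scaled by powers of 1024, e.g. 2**30 -> '1 Gi'."""
    magnitude = abs(value)
    index = 0
    # Divide down until the mantissa is below 1024 or prefixes run out.
    while magnitude >= 1024 and index < len(_BINARY_PREFIXES) - 1:
        magnitude /= 1024
        index += 1
    sign = "-" if value < 0 else ""
    return "{}{:.{}g} {}".format(sign, magnitude, digits,
                                 _BINARY_PREFIXES[index])


print('{}B'.format(binary_scale(2**30)))
```

Unlike the decimal prefixes, there are no binary prefixes below 1, so
values smaller than 1024 are simply printed unscaled.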

On proposed extension 1 (native support for SI literals) my conclusion is that 
we did not reach any sense of consensus and there was considerable opposition 
to my proposal.  There was much less discussion on extensions 2 & 3, so it is 
hard to say whether consensus was reached.

So, given all this, I would like to make the following recommendations:
1. No action should be taken.
2. The main justification to modifying float() was to make it consistent with 
   the extended Python language. Without extension 1, this justification goes 
   away. However the need to be able to easily convert strings of numbers with 
   SI scale factors into floats still exists. This should be handled by adding 
   a library or extending an existing library.
3. Allowing numbers to be formatted with SI prefixes is useful and not 
   controversial. The 'r' and 'b' format codes should be added to the various 
   string formatting mechanisms.
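
For recommendation 2, a library function along these lines might be
enough. This is only a sketch under stated assumptions: the name
`from_si`, the (value, unit) return tuple, and the choice to reject
E-notation outright (which sidesteps the Exa ambiguity from objection
1a) are mine, not part of any existing library.

```python
import re

# Multipliers for the SI prefixes; "u" is accepted as an ASCII alias for micro.
_SI_FACTORS = {
    "y": 1e-24, "z": 1e-21, "a": 1e-18, "f": 1e-15, "p": 1e-12, "n": 1e-9,
    "u": 1e-6, "µ": 1e-6, "μ": 1e-6, "m": 1e-3, "": 1.0,
    "k": 1e3, "M": 1e6, "G": 1e9, "T": 1e12, "P": 1e15, "E": 1e18,
    "Z": 1e21, "Y": 1e24,
}

_PATTERN = re.compile(
    r"^\s*([-+]?(?:\d+\.?\d*|\.\d+))"   # plain decimal number, no exponent
    r"\s*([yzafpnuµμmkMGTPEZY]?)"        # optional SI prefix
    r"([A-Za-z]*)\s*$"                   # optional simple unit, returned as-is
)


def from_si(text):
    """Parse e.g. '2.4GHz' into (2.4e9, 'Hz'); the unit is not interpreted."""
    match = _PATTERN.match(text)
    if match is None:
        raise ValueError("not a number with an SI scale factor: %r" % (text,))
    number, prefix, unit = match.groups()
    return float(number) * _SI_FACTORS[prefix], unit


print(from_si("2.4GHz"))
print(from_si("1k"))
```

Because naked scale factors are allowed, the first letter after the
number is always read as a prefix when it can be: '10m' parses as 10
milli, not 10 metres, and '100Pa' as 100 peta with unit 'a'. That is
the same ambiguity raised in objection 1d, and a real library would
need a policy for it.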

What do you think?

-Ken