[Python-Dev] Re: Documenting Python's float.__str__()

2020-01-22 Thread Vedran Čačić
> ints print the same in just about every single programming language that
> uses base ten Arabic-Hindu digits 0...9. It's kind of a universal.

Not actually true: C64's BASIC V2 printed positive numbers with a space in
front.
-- 
Vedran Čačić
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/ALURGURCJTE5VPYXUTTXDWYHFB22BQBT/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Documenting Python's float.__str__()

2020-01-21 Thread Karl O. Pinc
On Tue, 21 Jan 2020 09:01:29 -0600
"Karl O. Pinc"  wrote:

> I guess I will advocate for _some_ specification built into Python's
> definition.  Otherwise everybody should _always_ build their own
> formatter; lest they wake up one morning and find that int zero prints
> as "+0".

Having made a suggestion I've followed up with a pull request.
https://github.com/python/cpython/pull/18111

I think I have come up with a very minimal and sane
set of restrictions on the default Numeric string
representations.  Having done that, I'm less interested
in spending a lot more time on this.

I'd be happy to explain my wording choices, and equally
happy to have the pull request immediately rejected.

The pull request is presently failing the check for
news.  (I'm not entirely clear on how to satisfy the
requirement, or whether I could come up with a good
news entry.  I'll wait to resolve this if it looks like
the patch is going anywhere.)

There should probably also be unit tests.  But again,
I'll wait to see if this is going anywhere.

FYI, it was remarkably easy to build the docs.  But the
contribution process goes through an annoying number
of corporations (github, the contributor signature...)
and login steps.

(The contributor signature needs to clear at your end.)

Regards,

Karl 
Free Software:  "You don't pay back, you pay forward."
 -- Robert A. Heinlein
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/FDC772QSZB5IE7TY4DQILHWBZS2WYKKQ/


[Python-Dev] Re: Documenting Python's float.__str__()

2020-01-21 Thread Tim Peters
[Serhiy Storchaka]
> This is not the only difference between '.17g' and repr().
>
> >>> '%.17g' % 1.23456789
> '1.2345678899999999'
> >>> format(1.23456789, '.17g')
> '1.2345678899999999'
> >>> repr(1.23456789)
> '1.23456789'

More amazingly ;-), repr() isn't even always the same as a %g format
specifying exactly the same number of digits as repr() produces.

That's because repr(x) currently returns the shortest string such that
eval(repr(x)) == x, but %g rounds correctly to the given number of
digits.  Not always the same thing!

>>> x = 2.0 ** 89
>>> print(repr(x))
6.189700196426902e+26
>>> print("%.16g" % x) # repr produced 16 digits
6.189700196426901e+26

The repr() output is NOT correctly rounded.  To see which one is
correctly rounded, here's an easy way:

>>> import decimal
>>> decimal.Decimal(x)
Decimal('618970019642690137449562112')

The "37449562112" is rounded off, and is less than half a unit in the
last place, so correct rounding truncates the last digit to 1.

But there is no string with 16 digits other than the incorrectly
rounded one repr() returns that gives x back.  In particular, the
correctly rounded 16 digit string does not:

>>> 6.189700196426901e+26 # 16-digit correctly rounded fails
6.189700196426901e+26
>>> x == _
False

To my mind it's idiotic(*) that "shortest string" requires incorrect
rounding in some cases.

In Python's history, eval(repr(x)) == x is something that was always
intended, so long as the writing and reading was done by the same
Python instance on the same machine.  Maybe it's time to document that
;-)

But CPython goes far beyond that now, also supplying correct rounding,
_except_ for repr's output, where - for reasons already illustrated -
"correct rounding" and "shortest" can't always both be satisfied.

(*) What wouldn't be idiotic?  For repr(x) to return the shortest
_correctly rounded_ string such that eval(repr(x)) == x.  In the
example, that would require repr(x) to produce a 17-digit output (and
17 is the most that's ever needed for a Python float on boxes with
IEEE doubles).  But "shortest string" was taken ultra literally by the
people who first worked out routines capable of doing that, so has
become a de facto standard now.
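
A rough sketch of the alternative described in the footnote, for anyone who
wants to experiment. This is not what CPython's repr() does; the helper name
shortest_correctly_rounded is made up for illustration. It relies on 'g'
formatting being correctly rounded, as stated above, and simply tries
increasing precisions until the string converts back to the same float.

```python
def shortest_correctly_rounded(x: float) -> str:
    """Return the shortest correctly rounded 'g' string that round-trips.

    Hypothetical helper for illustration; this is not how repr() works.
    """
    for digits in range(1, 18):
        s = format(x, f".{digits}g")  # 'g' rounds correctly to `digits` digits
        if float(s) == x:
            return s
    # Only reached for NaN, which never compares equal to itself.
    return format(x, ".17g")


x = 2.0 ** 89
print(repr(x))                        # shortest string that round-trips
print(shortest_correctly_rounded(x))  # this x needs all 17 digits
```

For most floats the two results carry the same digits; they differ only in
cases like 2.0 ** 89, where the shortest round-tripping string is not the
correctly rounded one.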
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/F3BOGGTGJPZS3RR7FKG7YE6GYADHYI76/


[Python-Dev] Re: Documenting Python's float.__str__()

2020-01-21 Thread Eric V. Smith

On 1/21/2020 2:02 PM, Serhiy Storchaka wrote:
> 21.01.20 12:37, Eric V. Smith wrote:
>> Yes (I wrote a lot of that), but '.17g' doesn't mean to always show
>> 17 digits. See
>> https://github.com/python/cpython/blob/master/Python/pystrtod.c#L825
>> where the repr (which is format_code == 'r') is translated to
>> format_code = 'g' and precision = 17.
>>
>> But I was wrong about them being equivalent: 'g' will drop the
>> trailing '.0' if it exists, and repr() will not (via flags =
>> Py_DTSF_ADD_DOT_0).
>
> This is not the only difference between '.17g' and repr().
>
> >>> '%.17g' % 1.23456789
> '1.2345678899999999'
> >>> format(1.23456789, '.17g')
> '1.2345678899999999'
> >>> repr(1.23456789)
> '1.23456789'

Huh. That's interesting. Thanks!

Eric
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/BJKU6GO65UOGCHQN2BCDTFH3XWXIYRD7/


[Python-Dev] Re: Documenting Python's float.__str__()

2020-01-21 Thread Serhiy Storchaka

21.01.20 12:37, Eric V. Smith wrote:
> Yes (I wrote a lot of that), but '.17g' doesn't mean to always show 17
> digits. See
> https://github.com/python/cpython/blob/master/Python/pystrtod.c#L825
> where the repr (which is format_code == 'r') is translated to format_code
> = 'g' and precision = 17.
>
> But I was wrong about them being equivalent: 'g' will drop the trailing
> '.0' if it exists, and repr() will not (via flags = Py_DTSF_ADD_DOT_0).

This is not the only difference between '.17g' and repr().

>>> '%.17g' % 1.23456789
'1.2345678899999999'
>>> format(1.23456789, '.17g')
'1.2345678899999999'
>>> repr(1.23456789)
'1.23456789'
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/I6EXOVZPGZO2M7CBU7TWXJQ5FYGCFW6O/


[Python-Dev] Re: Documenting Python's float.__str__()

2020-01-21 Thread Eric V. Smith

On 1/21/2020 1:32 PM, Chris Angelico wrote:
> On Wed, Jan 22, 2020 at 4:03 AM Eric V. Smith wrote:
>> The reason repr adds the '.0' that 'g' does not is to avoid this problem:
>>
>> >>> type(eval(repr(17.0))) == type(17.0)
>> True
>> >>> type(eval(format(17.0, '.17g'))) == type(17.0)
>> False
>
> The OP wasn't asking about eval, though, but about float. If you're
> depending on the ability to eval the repr of a float, you also have to
> concern yourself with inf and nan, which are not builtin names. But I
> believe float(repr(x)) == x for any float x.

Nonetheless, it's why repr adds the '.0' that 'g' does not.

Eric
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/7OMWVNHERBVTS56KD4ENQ4PHYFXXPHFV/


[Python-Dev] Re: Documenting Python's float.__str__()

2020-01-21 Thread Chris Angelico
On Wed, Jan 22, 2020 at 4:03 AM Eric V. Smith  wrote:
> The reason repr adds the '.0' that 'g' does not is to avoid this problem:
>
>  >>> type(eval(repr(17.0))) == type(17.0)
> True
>  >>> type(eval(format(17.0, '.17g'))) == type(17.0)
> False
>

The OP wasn't asking about eval, though, but about float. If you're
depending on the ability to eval the repr of a float, you also have to
concern yourself with inf and nan, which are not builtin names. But I
believe float(repr(x)) == x for any float x.
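
A small sketch of the difference between the two round-trips, under the
assumption of IEEE doubles. The sample values are arbitrary; NaN is handled
separately because it never compares equal to itself.

```python
import math

values = [17.0, 0.1, -0.0, 1e300, 5e-324, float("inf"), float("-inf")]

# float() accepts everything repr() produces, including 'inf' and '-inf'.
for x in values:
    assert float(repr(x)) == x

# NaN also parses, but NaN != NaN, so it needs an isnan() check instead.
nan_back = float(repr(float("nan")))
print(math.isnan(nan_back))

# eval() has no name 'inf' to look up, so it fails where float() succeeds.
try:
    eval(repr(float("inf")), {})
    eval_failed = False
except NameError:
    eval_failed = True
print(eval_failed)
```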

ChrisA
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/LA3UGPBZYV3ODLEIUDUP26J43IHS56LI/


[Python-Dev] Re: Documenting Python's float.__str__()

2020-01-21 Thread Eric V. Smith

On 1/21/2020 11:52 AM, Steven D'Aprano wrote:
>>>> I don't really care whether there's documentation for __str__() or
>>>> __repr__() or something else.  I'm just thinking that there should
>>>> be some way to guarantee a well defined "useful" float output
>>>> formatting.
>>>
>>> https://docs.python.org/3/library/stdtypes.html#printf-style-string-formatting
>>>
>>> https://docs.python.org/3/library/string.html#format-string-syntax
>>
>> Thanks.  For some reason nobody in #python pointed me to the 'g' format
>> type.  That resolves my issue.
>>
>> Unfortunately, because 'g' can strip the trailing ".0" floats
>> formatted with it no longer satisfy the float->str->float
>> immutability property.
>
> I don't see why. Any string you get back from %g ought to convert back
> to a float without loss of precision, the trailing '.0' should not
> affect it. Can you give an example where it does?
>
> It seems to work for me.
>
> py> x = 94.0
> py> float('%g' % x) == x
> True


The reason repr adds the '.0' that 'g' does not is to avoid this problem:

>>> type(eval(repr(17.0))) == type(17.0)
True
>>> type(eval(format(17.0, '.17g'))) == type(17.0)
False

Eric
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/HMOBPGLJBPVTHSTDCXVUHKSJHJKC2Z6D/


[Python-Dev] Re: Documenting Python's float.__str__()

2020-01-21 Thread Steven D'Aprano
On Tue, Jan 21, 2020 at 09:01:29AM -0600, Karl O. Pinc wrote:

> Understood.  But you still might want to document, or even define in the
> language, that you're outputting the shortest unambiguous
> representation.

I'm not even sure I would want to do that. That would make it a language 
guarantee and force all implementations to follow. Jython and IronPython 
may prefer to follow the repr used by Java and .Net; if not those two 
implementations, other implementations might want to do something 
similar.


> I guess I will advocate for _some_ specification built into Python's
> definition.  Otherwise everybody should _always_ build their own
> formatter; lest they wake up one morning and find that int zero prints
> as "+0".

We're not talking about ints, we're talking about floats. There's only 
one reasonable way to print ints that everyone expects, and that doesn't 
include putting a spurious sign on zero. As far as I know, ints print 
the same in just about every single programming language that uses base 
ten Arabic-Hindu digits 0...9. It's kind of a universal.


> > > I don't really care whether there's documentation for __str__() or
> > > __repr__() or something else.  I'm just thinking that there should
> > > be some way to guarantee a well defined "useful" float output
> > > formatting.  
> > 
> > https://docs.python.org/3/library/stdtypes.html#printf-style-string-formatting
> > 
> > https://docs.python.org/3/library/string.html#format-string-syntax
> 
> Thanks.  For some reason nobody in #python pointed me to the 'g' format
> type.  That resolves my issue.
> 
> Unfortunately, because 'g' can strip the trailing ".0" floats
> formatted with it no longer satisfy the float->str->float
> immutability property.

I don't see why. Any string you get back from %g ought to convert back 
to a float without loss of precision, the trailing '.0' should not 
affect it. Can you give an example where it does?

It seems to work for me.

py> x = 94.0
py> float('%g' % x) == x
True


Do you care about having the shortest representation, or consistent 
representation?

If you want a consistent representation, then I understand that %.17e 
is guaranteed to round-trip exactly for all floats (C doubles):

py> '%.17e' % 94.0
'9.40000000000000000e+01'

If you care about length, "94" is shorter than "94.0" and it still 
losslessly converts back to the float 94.0:

py> '%.17g' % 94.0
'94'

repr() will (I think) round trip, but it won't necessarily be the 
shortest, and it won't be consistent.
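
A quick spot check of the round-trip claims, offered as a sketch rather than
a proof. It assumes IEEE doubles and draws floats from random 64-bit
patterns; the helper name, seed, and sample size are arbitrary choices.

```python
import math
import random
import struct


def random_finite_double(rng: random.Random) -> float:
    """Draw a double from a random 64-bit pattern, skipping inf and NaN."""
    while True:
        (x,) = struct.unpack("<d", struct.pack("<Q", rng.getrandbits(64)))
        if math.isfinite(x):
            return x


rng = random.Random(12345)  # fixed seed so the run is repeatable
samples = [random_finite_double(rng) for _ in range(10_000)]

# Collect any value that fails to survive one of the three conversions.
failures = [
    x for x in samples
    if float("%.17e" % x) != x
    or float("%.17g" % x) != x
    or float(repr(x)) != x
]
print(len(failures))
```

An empty failure list is consistent with all three formats round-tripping;
they differ only in length and in how uniform the output looks.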


-- 
Steven
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/WEHXLXEZ7PI5DOUEVCHQJ63NRBVWEMLC/


[Python-Dev] Re: Documenting Python's float.__str__()

2020-01-21 Thread Karl O. Pinc
On Tue, 21 Jan 2020 21:09:57 +1100
Steven D'Aprano  wrote:

> On Mon, Jan 20, 2020 at 09:59:07PM -0600, Karl O. Pinc wrote:
> 
> > It would be nice if the output format for float was documented, to
> > the extent this is possible.  
> 
> I don't think we should make any promises about the repr() of floats. 
> We've already changed the format at least twice:
> 
> - once to switch to the shortest unambiguous representation;
> - and once to shift to a more consistent output for NANs.
> 
> (NANs on Windows prior to 2.6 used to be displayed as '1.#IND', if I 
> recall correctly.)
> 
> We may never want to change output format again, but if we document a 
> certain format that will be read by people as a guarantee, and that 
> closes the door to any change without a long and tedious deprecation 
> period.

Understood.  But you still might want to document, or even define in the
language, that you're outputting the shortest unambiguous
representation.  Or other such broad principles like IEEE 754
representation compatibility.  This is a suggestion, I don't want to
advocate.

> If anyone wants a guaranteed output format for floats, they ought to
> use the various string formatting operations, which offer guaranteed 
> formatting outputs. Or build your own formatter.
> 
> I think that the most we should promise is that (with the exception
> of NANs) float -> repr -> float should round-trip with no change in
> value.

That would be nice, and is the sort of general principle I'm thinking
of.

Another one might be "a sign is only printed for negative numbers".

I guess I will advocate for _some_ specification built into Python's
definition.  Otherwise everybody should _always_ build their own
formatter; lest they wake up one morning and find that int zero prints
as "+0".

As mentioned, parts of this discussion could also apply to other
numeric types.

> > I don't really care whether there's documentation for __str__() or
> > __repr__() or something else.  I'm just thinking that there should
> > be some way to guarantee a well defined "useful" float output
> > formatting.  
> 
> https://docs.python.org/3/library/stdtypes.html#printf-style-string-formatting
> 
> https://docs.python.org/3/library/string.html#format-string-syntax

Thanks.  For some reason nobody in #python pointed me to the 'g' format
type.  That resolves my issue.

Unfortunately, because 'g' can strip the trailing ".0" floats
formatted with it no longer satisfy the float->str->float
immutability property.  I can always:

  out = f'{num:g}'
  print(out if 'e' in out or '.' in out else f'{out}.0')

sort of logic.  (With handling for INF and NAN.)
A cleaner format would be nice but this works.
(The #g format leaves multiple trailing zeros, which is
too different from the "minimal" form __repr__() produces.)
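
Spelled out as a function, the logic above might look like the following
sketch. The name format_g_keep_float is hypothetical, and the INF/NAN
handling is one possible reading of "with handling for INF and NAN": those
strings are passed through unchanged.

```python
import math


def format_g_keep_float(num: float) -> str:
    """Format with 'g', then add '.0' when the result would read as an int.

    Hypothetical helper name; note that plain 'g' keeps only 6 significant
    digits, so this is about the look of the output, not exact round-trips.
    """
    out = f"{num:g}"
    if math.isinf(num) or math.isnan(num):
        return out  # 'inf', '-inf', 'nan' are left untouched
    if "e" in out or "." in out:
        return out
    return f"{out}.0"


for value in (94.0, 0.5, 1e20, -0.0, float("inf"), float("nan")):
    print(format_g_keep_float(value))
```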

FYI.  It wouldn't hurt to have the PyOS_double_to_string() docs
https://docs.python.org/3/c-api/conversion.html point out that "format"
uses the codes as defined in your formatting links above.  Digging
around got me to PyOS_double_to_string() whereupon I was left in
the dark about the meaning of the "format" codes.

Thank you all for the help.

Regards,

Karl 
Free Software:  "You don't pay back, you pay forward."
 -- Robert A. Heinlein
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/MP5OKKVGWLCCYJE7EQ2DOPXFHACGTRN4/


[Python-Dev] Re: Documenting Python's float.__str__()

2020-01-21 Thread Eric V. Smith

On 1/21/2020 4:32 AM, Serhiy Storchaka wrote:
> 21.01.20 10:37, Eric V. Smith wrote:
>> For what it's worth, float's repr internally uses a format of '.17g'.
>> So, format(value, '.17g') will be equal to repr(f), where f is any
>> float.
>
> It was in Python 2, but since Python 3.1 it returns the shortest
> unambiguous representation, which may be shorter than 17 digits.
>
> https://docs.python.org/3/whatsnew/3.1.html#other-language-changes
> https://bugs.python.org/issue1580

Yes (I wrote a lot of that), but '.17g' doesn't mean to always show 17 
digits. See 
https://github.com/python/cpython/blob/master/Python/pystrtod.c#L825 
where the repr (which is format_code =='r') is translated to format_code 
= 'g' and precision = 17.


But I was wrong about them being equivalent: 'g' will drop the trailing 
'.0' if it exists, and repr() will not (via flags = Py_DTSF_ADD_DOT_0).


And I'm almost positive that 2.7 also uses short float repr, but doesn't 
have str == repr for floats. But 2.7 is dead to me (despite the fact 
that I use it extensively for one client!), so I'm not going to bother 
double checking.


Eric

___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/FILPSQ3FOJF5IZNL4BEZSKJJS6RNKT4A/


[Python-Dev] Re: Documenting Python's float.__str__()

2020-01-21 Thread Steven D'Aprano
On Mon, Jan 20, 2020 at 09:59:07PM -0600, Karl O. Pinc wrote:

> It would be nice if the output format for float was documented, to the
> extent this is possible.

I don't think we should make any promises about the repr() of floats. 
We've already changed the format at least twice:

- once to switch to the shortest unambiguous representation;
- and once to shift to a more consistent output for NANs.

(NANs on Windows prior to 2.6 used to be displayed as '1.#IND', if I 
recall correctly.)

We may never want to change output format again, but if we document a 
certain format that will be read by people as a guarantee, and that 
closes the door to any change without a long and tedious deprecation 
period.

If anyone wants a guaranteed output format for floats, they ought to use 
the various string formatting operations, which offer guaranteed 
formatting outputs. Or build your own formatter.

I think that the most we should promise is that (with the exception of 
NANs) float -> repr -> float should round-trip with no change in value.


> I don't really care whether there's documentation for __str__() or
> __repr__() or something else.  I'm just thinking that there should be
> some way to guarantee a well defined "useful" float output formatting.

https://docs.python.org/3/library/stdtypes.html#printf-style-string-formatting

https://docs.python.org/3/library/string.html#format-string-syntax


-- 
Steven
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/M5NUDXVHCPZNIYXBTNPCGITF4WXNVYHI/


[Python-Dev] Re: Documenting Python's float.__str__()

2020-01-21 Thread Serhiy Storchaka

21.01.20 10:37, Eric V. Smith wrote:
> For what it's worth, float's repr internally uses a format of '.17g'.
> So, format(value, '.17g') will be equal to repr(f), where f is any float.


It was in Python 2, but since Python 3.1 it returns the shortest 
unambiguous representation, which may be shorter than 17 digits.


https://docs.python.org/3/whatsnew/3.1.html#other-language-changes
https://bugs.python.org/issue1580
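
A short comparison of the two behaviours, as a sketch that assumes a build
where sys.float_repr_style is 'short' (the common case). The sample values
are arbitrary.

```python
# Map each float to (repr, '.17g' formatting) so the two can be compared.
pairs = {x: (repr(x), format(x, ".17g")) for x in (0.1, 1.0, 1e22)}

for short, seventeen in pairs.values():
    print(f"{short:>8}  {seventeen}")
```

Here 0.1 shows the extra digits that '.17g' exposes, 1.0 shows the dropped
'.0', and 1e22 is a case where both happen to agree.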
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/22IWLRREAHKAR3LKSFUWFUT4PRPZ3JG2/


[Python-Dev] Re: Documenting Python's float.__str__()

2020-01-21 Thread Eric V. Smith

On 1/20/2020 10:59 PM, Karl O. Pinc wrote:
> Hello,
>
> There appears to be extremely minimal documentation on how floats are
> formatted on output.  All I really see is that float.__str__() is
> float.__repr__().  So that means that float->str->float does not
> result in a different value.
>
> It would be nice if the output format for float was documented, to the
> extent this is possible.  #python suggested that I propose a patch,
> but I see no way to write a documentation patch without having any
> clue about what Python promises, whether in the CPython implementation
> or as part of a specification for Python.
>
> What are the promises Python makes about the str() of a float?  Will
> it produce 1.0 today and 1.0e0 or +1.0 tomorrow?  When is the result
> in exponential notation and when not?  Does any of this depend on the
> underlying OS or hardware or Python implementation?  Etc.

For what it's worth, float's repr internally uses a format of '.17g'. 
So, format(value, '.17g') will be equal to repr(f), where f is any float.


I think (but don't exactly recall, it's been a while) that you'll get 
different values if sys.float_repr_style is 'short' or not. I don't know 
if any current systems don't support 'short'.
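
For anyone who wants to check their own build, a minimal sketch: the sys
module exposes the style as sys.float_repr_style, which is either 'short'
or 'legacy'.

```python
import sys

style = sys.float_repr_style  # 'short' or 'legacy'
print(style, repr(0.1))
```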


I don't know if this is documented. I'm also not sure if this is 
considered a CPython implementation detail or not, but I would argue 
that it is.


Eric
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/N2O4V4MUG6VYT4HRFRFOIH6NJHVPA6DQ/