Re: [Tutor] Limitation of int() in converting strings

Steven D'Aprano Mon, 31 Dec 2012 21:09:58 -0800

On 23/12/12 04:38, Oscar Benjamin wrote:

On 22 December 2012 01:34, Steven D'Aprano<st...@pearwood.info>  wrote:

On 18/12/12 01:36, Oscar Benjamin wrote:

I think it's unfortunate that Python's int() function combines two
distinct behaviours in this way. In different situations int() is used
to:
1) Coerce an object of some type other than int into an int without
changing the value of the integer that the object represents.


The second half of the sentence (starting from "without changing") is not
justified. You can't safely make that assumption. All you know is that
calling int() on an object is intended to convert the object to an int,
in whatever way is suitable for that object. In some cases, that will
be numerically exact (e.g. int("1234") will give 1234), in other cases it
will not be.


If I were to rewrite that sentence I would replace the word 'integer'
with 'number', but otherwise I'm happy with it. Your reference to
"numerically exact" shows that you understood exactly what I meant.


Yes. And it is a demonstrable fact that int is *not* intended to coerce
objects to int "without changing the value of the number", because
changing the value of the number is precisely what int() does, in some
circumstances.

If you would like to argue that it would have been better if int did
not do this, then I might even agree with you. There is certainly
precedent: if I remember correctly, you cannot convert floating point
values to integers directly in Pascal, you first have to truncate them
to an integer-valued float, then convert.

{ excuse my sloppy Pascal syntax, it has been a few years }
var
  i: integer;
  x: real;
begin
  i := integer(trunc(x));
end;


So I'm not entirely against the idea that Python should have had separate
int() and trunc() functions, with int raising an exception on (non-whole
number?) floats.

But back to Python as it actually is, rather than how it might have been.
There's no rule that int() must be numerically lossless. It is lossless
with strings, and refuses to convert strings-that-look-like-floats to ints.
And that makes sense: in an int, the "." character is just as illegal as
the characters "k" or "&" or "Ω". int will raise on "123k456", so why
wouldn't it raise on "123.456"?

But that (good, conservative) design decision isn't required or enforced.
Hence my reply that you cannot safely make the assumption that int() on a
non-numeric type will be numerically exact.
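
To make the contrast concrete, here is a minimal sketch (the helper name
try_int is made up for illustration, it is not part of Python) showing
that int() rejects strings that are not integer literals, while the same
digits passed as a float are silently truncated:

```python
def try_int(text):
    """Hypothetical helper: return int(text), or None if int() rejects it."""
    try:
        return int(text)
    except ValueError:
        return None

# Strings: only integer literals (surrounding whitespace allowed) convert.
results = {s: try_int(s) for s in ["1234", " 42 ", "123k456", "123.456"]}
print(results)

# Floats: the fraction part is dropped without any complaint.
truncated = int(123.456)
print(truncated)
```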

2) Round an object with a non-integer value to an integer value.



int() does not perform rounding (except in the most generic sense that any
conversion from real-valued number to integer is "rounding"). That is what
the round() function does. int() performs truncating: it returns the
integer part of a numeric value, ignoring any fraction part:


I was surprised by your objection to my use of the word "rounding"
here. So I looked it up on Wikipedia:
http://en.wikipedia.org/wiki/Rounding#Rounding_to_integer

That section describes "round toward zero (or truncate..." which is
essentially how I would have put it, and also how you put it below:


Well, yes. I explicitly referred to the generic sense where any conversion
from real-valued to whole number is "rounding". But I think that it is a
problematic, ambiguous term that needs qualification:

* sometimes truncation is explicitly included as a kind of rounding;

* sometimes truncation is used in opposition to rounding.


For example, I think that in everyday English, most people would be
surprised to hear you describe "rounding 9.9999999 to 9". In the
absence of an explicit rounding direction ("round down", "round up"),
some form of "round to nearest" is assumed in everyday English, and
as such is used in contrast to merely cutting off whatever fraction
part is there (truncation).

Hence the need for qualification.
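
A short sketch of the difference, using the 9.9999999 example above. The
variable names are only for illustration:

```python
import math

x = 9.9999999

truncated = int(x)      # cuts off the fraction part
nearest = round(x)      # round to nearest

# For negative values, truncation goes toward zero while floor goes down.
neg_truncated = int(-x)
neg_floored = math.floor(-x)

print(truncated, nearest, neg_truncated, neg_floored)
```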

So you shouldn't think of int(number) as "convert number to an int", since
that is ambiguous. There are at least six common ways to convert arbitrary
numbers to ints:


This is precisely my point. I would prefer it if int(obj) would fail
on non-integers, leaving me with the option of calling an appropriate
rounding function. After catching RoundError (or whatever) you would
know that you have a number type object that can be passed to round,
ceil, floor etc.
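
For illustration only, the behaviour described here could be approximated
in user code along these lines. Neither strict_int nor RoundError exists in
Python; both are hypothetical names for this sketch, and it assumes the
argument is a number rather than a string:

```python
import math

class RoundError(ValueError):
    """Hypothetical exception for this sketch; not part of Python."""

def strict_int(number):
    """Convert a number to int only if no value is lost (a sketch)."""
    result = int(number)
    if result != number:
        raise RoundError("%r is not a whole number" % (number,))
    return result

whole = strict_int(3.0)

# On failure the caller chooses the rounding mode explicitly.
try:
    value = strict_int(2.5)
except RoundError:
    value = math.floor(2.5)

print(whole, value)
```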


Well, I guess that comes down to the fact that Python is mostly aimed at
mathematically and numerically naive users who would be scared off at a
plethora of rounding modes :-)

Python provides truncation via the int and math.trunc functions, floor and
ceiling via math.floor and math.ceil, and round to nearest via round.
In Python 2, ties are rounded up, which is biased; in Python 3, the
unbiased banker's rounding is used.


I wasn't aware of this change. Thanks for that.



Actually, I appear to have been wrong: in Python 2, ties are rounded
away from zero rather than up. Positive arguments round up, negative
arguments round down:

py> round(1.5), round(2.5)
(2.0, 3.0)
py> round(-1.5), round(-2.5)
(-2.0, -3.0)
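
For comparison, a sketch of how the same ties behave under Python 3, where
round() with one argument returns an int and rounds ties to the nearest
even integer, alongside the other rounding functions mentioned above:

```python
import math

ties = [1.5, 2.5, -1.5, -2.5]

rounded = [round(t) for t in ties]         # ties go to the even neighbour
truncated = [math.trunc(t) for t in ties]  # toward zero, same as int()
floored = [math.floor(t) for t in ties]    # toward negative infinity
ceiled = [math.ceil(t) for t in ties]      # toward positive infinity

print(rounded, truncated, floored, ceiled)
```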

Instead, you should consider int(number) to be one of a pair of functions,
"return integer part", "return fraction part", where unfortunately the
second function isn't provided directly. In general though, you can get
the fractional part of a number with "x % 1". For floats, math.modf also
works.


Assuming that you know you have an object that supports algebraic
operations in a sensible way then this works, although the
complementary function for "x % 1" would be "x // 1" or
"math.floor(x)" rather than "int(x)".


Again, I was mistaken. x%1 is not suitable to get the fraction part of a
number in Python: it returns the wrong result for negative values. You need
math.modf:


py> import math
py> x = -99.25
py> x % 1  # want -0.25
0.75
py> math.modf(x)
(-0.25, -99.0)
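
Put another way, there are two consistent pairings, sketched below:
floor goes with "x % 1", and truncation (int or math.trunc) goes with
math.modf. Either pair adds back up to the original value. The value
-99.25 is exactly representable as a float, so the comparisons are exact:

```python
import math

x = -99.25

# Pair 1: floor and modulo. The fraction part is always non-negative.
floor_part = math.floor(x)
mod_part = x % 1

# Pair 2: truncation and modf. The fraction part keeps the sign of x.
frac_part, whole_part = math.modf(x)
int_part = int(x)

print(floor_part, mod_part)
print(int_part, frac_part)
print(floor_part + mod_part == x, int_part + frac_part == x)
```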




--
Steven