Re: [Python-Dev] Fraction arithmetic (Was: Decimal ... float comparisons in py3k)

2010-03-20 Thread Mark Dickinson
On Sat, Mar 20, 2010 at 11:41 AM, Paul Moore p.f.mo...@gmail.com wrote:
 On 20 March 2010 04:20, Nick Coghlan ncogh...@gmail.com wrote:
 In the case of floats and Decimals, there's no ambiguity here that
 creates any temptation to guess - to determine a true/false result for a
 comparison, floats can be converted explicitly to Decimals without any
 loss of accuracy. For Fractions, the precedent has already been set by
 allowing implicit (potentially lossy) conversion to binary floats - a
 lossy conversion to Decimal wouldn't be any worse.

 Hmm, given that a float can be converted losslessly to a fraction, why
 was the decision taken to convert the fraction to a float rather than
 the other way round?

I'm not sure of the actual reason for this decision, but one argument
I've seen used for other languages is that it's desirable for the
inexactness of the float type to be contagious:  rationals are
perceived as exact, while floats are perceived as approximations.

Note that this only applies to arithmetic operations:  for
comparisons, an exact conversion *is* performed.  This is much like
what currently happens with ints and floats in the core:  a mixed-type
arithmetic operation between an int and a float first converts the int
to a float (possibly changing the value in the process).  A mixed-type
comparison makes an exact comparison without doing such a conversion.
For example (in any non-ancient version of Python):

>>> n = 2**53 + 1
>>> x = 2.**53
>>> n > x   # compares exact values;  no conversion performed
True
>>> n - x   # converts n to a float before subtracting
0.0

 I don't see a PEP for the fractions module, and my google-fu has
 failed to find anything. Was there a discussion on this?

There's PEP 3141 (http://www.python.org/dev/peps/pep-3141/), which was
the motivation for adding the fractions module in the first place, and
there's the issue tracker item for the fractions module
(http://bugs.python.org/issue1682).

Mark
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Decimal - float comparisons in py3k.

2010-03-20 Thread Mark Dickinson
On Sat, Mar 20, 2010 at 3:17 PM, Case Vanhorsen cas...@gmail.com wrote:
 On Sat, Mar 20, 2010 at 4:06 AM, Mark Dickinson dicki...@gmail.com wrote:
 What external modules are there that rely on existing hash behaviour?

 I'm only aware of  gmpy and SAGE.

 And exactly what behaviour do they rely on?

 Instead of calculating hash(long(mpz)), they calculate hash(mpz)
 directly. It avoids creation of a temporary object that could be quite
 large and is faster than the two-step process. I would need to modify
 the code so that it continues to produce the same result.

Does gmpy only do this for Python 2.6?  Or does it use different
algorithms for 2.4/2.5 and 2.6?  As far as I can tell, there was no
reasonable way to compute long_hash directly at all before the
algorithm was changed for 2.6, unless you imitate exactly what Python
was doing (break up into 15-bit pieces, and do all the rotation and
addition exactly the same way), in which case you might as well be
calling long_hash directly.
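For what it's worth, the post-2.6 integer hash is simple enough to reproduce in a few lines. This is a sketch of the idea, not gmpy's actual code: reduce modulo a fixed Mersenne prime, which an extension can do digit-by-digit on its own representation instead of building a temporary Python long (recent py3k exposes the modulus as `sys.hash_info.modulus`):

```python
import sys

def long_hash(n):
    """Sketch: hash of a Python int via modular reduction (CPython >= 2.6).

    CPython reduces the integer modulo 2**61 - 1 on 64-bit builds
    (2**31 - 1 on 32-bit ones), negates the result for negative n, and
    maps -1 to -2, since -1 is reserved as an error indicator.
    """
    M = sys.hash_info.modulus   # exposed by recent py3k
    h = abs(n) % M
    if n < 0:
        h = -h
    return -2 if h == -1 else h

# Matches the built-in hash, without any 15-bit-piece imitation.
assert long_hash(10**100) == hash(10**100)
assert long_hash(-1) == hash(-1) == -2
```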

Mark


Re: [Python-Dev] Decimal - float comparisons in py3k.

2010-03-20 Thread Mark Dickinson
On Sat, Mar 20, 2010 at 7:56 PM, Guido van Rossum gu...@python.org wrote:
 I propose to reduce all hashes to the hash of a normalized fraction,
 which we can define as a combination of the hashes for the numerator
 and the denominator. Then all we have to do is figure out fairly efficient
 ways to convert floats and decimals to normalized fractions (not
 necessarily Fractions). I may be naive but this seems doable: for a
 float, the denominator is always a power of 2 and removing factors of
 2 from the denominator is easy (just right-shift until the last bit is
 zero). For Decimal, the unnormalized denominator is always a power of
 10, and the normalization is a bit messier, but doesn't seem
 excessively so. The resulting numerator and denominator may be large
 numbers, but for typical use of Decimal and float they will rarely be
 excessively large, and I'm not too worried about slowing things down
 when they are (everything slows down when you're using really large
 integers anyway).
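The float half of this sketch could look like the following (illustrative only; `normalized_float` is a made-up name, and IEEE 754 doubles are assumed):

```python
import math

def normalized_float(x):
    """Return (numerator, denominator) in lowest terms for a finite float.

    The denominator of a float is always a power of 2, so normalization
    is just right-shifting the mantissa until its last bit is nonzero.
    """
    m, e = math.frexp(x)        # x == m * 2**e with 0.5 <= abs(m) < 1, or m == 0
    m = int(m * 2**53)          # scale to an integer mantissa (53-bit doubles)
    e -= 53
    if m == 0:
        return (0, 1)
    while not m & 1:            # remove factors of 2 from the denominator
        m >>= 1
        e += 1
    return (m * 2**e, 1) if e >= 0 else (m, 2**-e)

assert normalized_float(0.75) == (3, 4)
assert normalized_float(0.1) == (0.1).as_integer_ratio()
```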

I *am* worried about slowing things down for large Decimals:  if you
can't put Decimal('1e1234567') into a dict or set without waiting for
an hour for the hash computation to complete (because it's busy
computing 10**1234567), I consider that a problem.

But it's solvable!  I've just put a patch on the bug tracker:

http://bugs.python.org/issue8188

It demonstrates how hashes can be implemented efficiently and
compatibly for all numeric types, even large Decimals like the above.
It needs a little tidying up, but it works.
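The key trick is to reduce modulo a fixed prime and use three-argument pow, so 10**1234567 never has to be computed in full. A rough sketch of the idea (not the patch itself; real code must also handle special values and the reserved hash -1):

```python
import sys
from decimal import Decimal

M = sys.hash_info.modulus   # 2**61 - 1 on 64-bit builds; a Mersenne prime

def decimal_hash(d):
    """Sketch: hash a finite Decimal c * 10**exp without computing 10**exp.

    pow(10, e, M) takes time logarithmic in e, so even Decimal('1e1234567')
    hashes quickly.  Since M is prime and doesn't divide 10, Fermat's
    little theorem lets a negative exponent be folded into exp % (M - 1).
    """
    sign, digits, exp = d.as_tuple()
    c = int(''.join(map(str, digits)))
    h = (c * pow(10, exp % (M - 1), M)) % M
    return -h if sign else h    # real code would also map -1 to -2

# Agrees with the built-in hash on integral values, large or small.
assert decimal_hash(Decimal(12345)) == hash(12345)
assert decimal_hash(Decimal('1e1234567')) == hash(Decimal('1e1234567'))
```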

Mark


Re: [Python-Dev] Mixing float and Decimal -- thread reboot

2010-03-20 Thread Mark Dickinson
On Sat, Mar 20, 2010 at 11:20 PM, Greg Ewing
greg.ew...@canterbury.ac.nz wrote:
 * Decimal and float really belong side-by-side in the
 tower, rather than one above the other. Neither of them is
 inherently any more precise or exact than the other.

Except that float is fixed-width (typically 53 bits of precision),
while Decimal allows a user-specified, arbitrarily large, precision;
so in that sense the two floating-point types aren't on an equal
footing.

Mark


Re: [Python-Dev] Mixing float and Decimal -- thread reboot

2010-03-22 Thread Mark Dickinson
On Sun, Mar 21, 2010 at 10:50 PM, Greg Ewing
greg.ew...@canterbury.ac.nz wrote:
 Raymond Hettinger wrote:

 Since decimal also allows arbitrary sizes, all long ints can be
 exactly represented (this was even one of the design goals
 for the decimal module).

 There may be something we need to clarify here. I've been
 imagining that the implicit conversions to Decimal that
 we're talking about would be done to whatever precision
 is set in the context. Am I wrong about that? Is the intention
 to always use enough digits to get an exact representation?

I've been thinking about this, too.

Currently, Decimal + integer -> Decimal converts the integer
losslessly, with no reference to the Decimal context.

But the Fraction type is going to mess this up:  for Decimal +
Fraction -> Decimal, I don't see any other sensible option than to
convert the Fraction using the current context, since lossless
conversion isn't generally possible.

So with the above two conventions, we'd potentially end up with
Decimal('1.23') + 314159 giving a different result from
Decimal('1.23') + Fraction(314159, 1)  (depending on the context).

It may be that we should reconsider Decimal + int interactions, making
the implicit int -> Decimal conversion lossy.  Decimal would then match
the way that float behaves:  float + int and float + Fraction both do
a lossy conversion of the non-float argument to float.  But then
we're changing established behaviour of the Decimal module, which
could cause problems for existing users.
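The status quo is easy to demonstrate (a quick sanity check with a small context precision, not a spec):

```python
from decimal import Decimal, getcontext

getcontext().prec = 6

# int -> Decimal conversion is exact, regardless of context precision...
assert Decimal(10**20) == Decimal('100000000000000000000')

# ...but the *result* of an arithmetic operation is rounded to the
# context precision, so low-order digits of the integer operand can
# still be lost in the addition itself.
assert Decimal(10**20) + 1 == Decimal('1.00000E+20')
```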

Note that comparisons are a separate issue:  those always need to be
done exactly (at least for equality, and once you're doing it for
equality it makes sense to make the other comparisons exact as well),
else the rule that x == y implies hash(x) == hash(y) would become
untenable.  Again, this is the pattern that already exists for
int-float and Fraction-float interactions: comparisons are exact,
but arithmetic operations involve a lossy conversion.

Mark


Re: [Python-Dev] Mixing float and Decimal -- thread reboot

2010-03-22 Thread Mark Dickinson
On Mon, Mar 22, 2010 at 5:56 PM, Raymond Hettinger
raymond.hettin...@gmail.com wrote:

 On Mar 22, 2010, at 10:00 AM, Guido van Rossum wrote:

   Decimal + float -> Decimal

 If everybody associated with the Decimal implementation wants this I
 won't stop you; as I repeatedly said my intuition about this one (as
 opposed to the other two above) is very weak.

 That's my vote.
 I believe Nick chimed-in in agreement.
 Mark, do you concur?

Yes;  Decimal + float -> Decimal works for me; the greater flexibility
of the Decimal type is the main plus for me.  I don't think my own
intuition is particularly strong here, either.  It would be
interesting to hear from major users of the decimal module (I have to
confess to not actually using it myself very much).

I argued earlier for Decimal + float -> Decimal on the basis that the
float -> Decimal conversion can be done exactly;  but having thought
about Decimal + Fraction it's no longer clear to me that this makes
sense.  Having Decimal + float -> Decimal round the float using the
current Decimal context still seems like a reasonable option.

Just for the record, I'd also prefer Decimal + Fraction -> Decimal.

I don't want to let the abstractions of the numeric tower get in the
way of the practicalities:  we should modify the abstractions if
necessary!  In particular, it's not clear to me that all numeric types
have to be comparable with each other.  It might make sense for
Decimal + complex mixed-type operations to be disallowed, for example.

Mark


Re: [Python-Dev] Mixing float and Decimal -- thread reboot

2010-03-22 Thread Mark Dickinson
On Mon, Mar 22, 2010 at 7:00 PM, Raymond Hettinger
raymond.hettin...@gmail.com wrote:

 On Mar 22, 2010, at 11:26 AM, Mark Dickinson wrote:

 Just for the record, I'd also prefer Decimal + Fraction -> Decimal.


 Guido was persuasive on why float + Fraction -> float,
 so this makes sense for the same reasons.

 For the implementation, is there a way to avoid the double rounding
 in   myfloat + myfrac.numerator / myfrac.denominator?

 Perhaps translate it to:

      f = Fractions.from_decimal(myfloat) + myfract   # Lossless, exact addition
      return f.numerator / f.denominator              # Only one decimal context rounding applied.

I'm not sure;  I see a couple of problems with this.  (1) It's fine
for the basic arithmetic operations that Fraction already supports,
but what about all the other Decimal methods that don't have Fraction
counterparts?  (2) It bothers me that the Decimal -> Fraction
conversion can be inefficient in cases like Decimal('1e<large>');
currently, all Decimal operations are relatively efficient (no
exponential-time behaviour) provided only that the coefficients don't
get too large;  large exponents aren't a problem.

I think getting this to work would involve a lot of extra code and
significant 'cleverness'.  I'd prefer the simple-to-implement and
simple-to-explain option of rounding the Fraction before performing
the operation, even if this means that the whole operation involves
two rounding operations.  It's not so different from what currently
happens for Fraction+float, or even int+float.
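The simple option could be as small as this sketch (`fraction_to_decimal` is an illustrative name, not a proposed API):

```python
from decimal import Decimal, getcontext
from fractions import Fraction

def fraction_to_decimal(f):
    """Round a Fraction to the current decimal context.

    Decimal(int) construction is exact even for huge numerators, so the
    single division is the only rounding step in the conversion itself.
    """
    return Decimal(f.numerator) / Decimal(f.denominator)

getcontext().prec = 5
assert str(fraction_to_decimal(Fraction(1, 3))) == '0.33333'
assert fraction_to_decimal(Fraction(7, 2)) == Decimal('3.5')
```

A Decimal + Fraction operation built on this then rounds twice: once in the conversion, once in the operation, matching what already happens for Fraction + float.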

Mark


Re: [Python-Dev] Mixing float and Decimal -- thread reboot

2010-03-22 Thread Mark Dickinson
On Mon, Mar 22, 2010 at 8:02 PM, Pierre B. pierre...@hotmail.com wrote:
 Sorry to intervene out of the blue, but I find the suggested rule for
 fractional to decimal conversion not as clean as I'd expect.

 If fractions are converted to decimals when doing arithmetics, would it be
 worthwhile to at least provide a minimum of fractional conversion integrity?
 What I have in mind is the following rule:

 When doing conversion from fraction to decimal, always generate a whole
 number of repetitions of the repeating digits, and always at least two.

 Examples, with a precision of 5 in Decimal:

 1/2 -> 0.5

 1/3 -> 0.3

 1/11 -> 0.090909
 # Note that we produced 6 digits, because
 # the repeating pattern contains 2 digits.

 1/7 -> 0.142857142857
  # Always at least two full patterns.

And for 1/123123127?  The decimal expansion of this fraction has a
period of over 15 million!

Sorry, but this doesn't seem like a feasible or desirable strategy.
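For the curious: the period is the multiplicative order of 10 modulo the denominator (after removing factors of 2 and 5), which is why it can be enormous. A small sketch:

```python
def period_length(n):
    """Length of the repeating part of the decimal expansion of 1/n.

    This is the multiplicative order of 10 modulo n once factors of
    2 and 5 (which only produce a non-repeating prefix) are removed.
    """
    while n % 2 == 0:
        n //= 2
    while n % 5 == 0:
        n //= 5
    if n == 1:
        return 0          # expansion terminates, e.g. 1/2 -> 0.5
    k, r = 1, 10 % n
    while r != 1:
        r = r * 10 % n
        k += 1
    return k

assert period_length(2) == 0    # 0.5
assert period_length(11) == 2   # 0.090909...
assert period_length(7) == 6    # 0.142857142857...
# period_length(123123127) would iterate over 15 million times.
```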

Mark


Re: [Python-Dev] Mixing float and Decimal -- thread reboot

2010-03-22 Thread Mark Dickinson
On Mon, Mar 22, 2010 at 8:33 PM, Raymond Hettinger
raymond.hettin...@gmail.com wrote:
 While we're on the topic, I think you should consider allowing the Fraction()
 constructor to accept a decimal input.

 This corresponds to common schoolbook problems and simple client requests:
   Express 3.5 as a fraction.

     >>> Fraction(Decimal('3.5'))
     Fraction(7, 2)

 Unlike typical binary floats which use full precision, it is not uncommon
 to have decimal floats with only a few digits of precision where the
 expression as a fraction is both useful and unsurprising.

Sounds fine to me.  Fraction already accepts decimal floating-point
strings, so the implementation of this would be trivial (convert to
string, then call the Fraction constructor).
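Indeed, a sketch of the whole conversion, relying only on behaviour Fraction already has:

```python
from decimal import Decimal
from fractions import Fraction

# A Decimal's string form is exactly the kind of decimal literal that the
# Fraction constructor already parses, so the conversion is exact and tiny.
assert Fraction(str(Decimal('3.5'))) == Fraction(7, 2)
assert Fraction(str(Decimal('0.1'))) == Fraction(1, 10)
```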

Mark


Re: [Python-Dev] Mixing float and Decimal -- thread reboot

2010-03-22 Thread Mark Dickinson
On Mon, Mar 22, 2010 at 8:44 PM, Guido van Rossum gu...@python.org wrote:
 On Mon, Mar 22, 2010 at 12:33 PM, Raymond Hettinger
 raymond.hettin...@gmail.com wrote:
 While we're on the topic, I think you should consider allowing the Fraction()
 constructor to accept a decimal input.

 This corresponds to common schoolbook problems and simple client requests:
   Express 3.5 as a fraction.

      >>> Fraction(Decimal('3.5'))
      Fraction(7, 2)

 There is already a Fraction.from_decimal() class method.

So there is;  I'd failed to notice that!  So Decimal -> Fraction
conversion is implemented twice in the fractions module---once in
__new__ and once in from_decimal.  Hmm.

Mark


Re: [Python-Dev] Mixing float and Decimal -- thread reboot

2010-03-23 Thread Mark Dickinson
On Tue, Mar 23, 2010 at 12:33 AM, Greg Ewing
greg.ew...@canterbury.ac.nz wrote:
 Mark Dickinson wrote:

 It might make sense for
 Decimal + complex mixed-type operations to be disallowed, for example.

 As long as you're allowing Decimal-float comparisons,
 Decimal-complex comparison for equality has an obvious
 interpretation.

Agreed.  Decimal-to-complex equality and inequality comparisons should
be permitted.  Order comparisons would raise TypeError (just as
complex-to-complex and float-to-complex comparisons do at the moment.)

Mark


Re: [Python-Dev] Mixing float and Decimal -- thread reboot

2010-03-23 Thread Mark Dickinson
On Tue, Mar 23, 2010 at 12:09 PM, Stefan Krah ste...@bytereef.org wrote:
 Facundo Batista facundobati...@gmail.com wrote:
 On Fri, Mar 19, 2010 at 5:50 PM, Guido van Rossum gu...@python.org wrote:

  As a downside, there is the worry that inadvertent mixing of Decimal
  and float can compromise the correctness of programs in a way that is
  hard to detect. But the anomalies above indicate that not fixing the

 Decimal already has something that we can use in this case, and fits
 very nice here: Signals.

 I like the simplicity of having a single signal (e.g. CoercionError), but
 a strictness context flag could offer greater control for people who only
 want pure decimal/integer operations.

Sounds worth considering.

 For example:

  strictness 0: completely promiscuous behaviour

  strictness 1: current py3k behaviour

  strictness 2: current py3k behaviour + pure equality comparisons

Can you explain what you mean by "+ pure equality comparisons" here?
If I'm understanding correctly, this is a mode that's *more* strict
than the current py3k behaviour;  what's it disallowing that the
current py3k behaviour allows?

  strictness 3: current py3k behaviour + pure equality comparisons +
                disallow NaN equality comparisons [1]

Sorry, no.  I think there are good reasons for the current NaN
equality behaviour:  2.0 really *isn't* a NaN, and Decimal(2) ==
Decimal('nan') should return False rather than raising an exception.
And the decimal module provides compare and compare_signal for those
who want complete standards-backed control here.  Besides, this seems
to me to be an orthogonal issue to the issue of mixing Decimal with
other numeric types.

Mark


Re: [Python-Dev] Mixing float and Decimal -- thread reboot

2010-03-23 Thread Mark Dickinson
On Tue, Mar 23, 2010 at 3:09 PM, Stefan Krah ste...@bytereef.org wrote:
 Mark Dickinson dicki...@gmail.com wrote:
 [Stefan]
 
   strictness 2: current py3k behaviour + pure equality comparisons

  Can you explain what you mean by "+ pure equality comparisons" here?
 If I'm understanding correctly, this is a mode that's *more* strict
 than the current py3k behaviour;  what's it disallowing that the
 current py3k behaviour allows?

 It's disallowing all comparisons between e.g. float and decimal. The idea
 is that the context can provide a cheap way of enforcing types for people
 who like it:

 >>> DefaultContext.strictness = 2
 >>> Decimal(9) == 9.0
 Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
   File "/home/stefan/svn/py3k/Lib/decimal.py", line 858, in __eq__
     other = _convert_other(other)
   File "/home/stefan/svn/py3k/Lib/decimal.py", line 5791, in _convert_other
     raise TypeError("Unable to convert %s to Decimal" % other)
 TypeError: Unable to convert 9.0 to Decimal

Hmm.  It seems to me that deliberately making an __eq__ method between
hashable types raise an exception isn't something that should be done
lightly, since it can *really* screw up sets and dicts.  For example,
with your proposal, {9.0, Decimal(x)} would either raise or not,
depending on whether Decimal(x) happened to hash equal to 9.0 (if they
don't hash equal, then __eq__ will never be called).  If the hash is
regarded as essentially a black box (which is what it should be for
most users) then you can easily end up with code that almost always
works, but *very* occasionally and unpredictably raises an exception.

 And I think that an sNaN should really signal by default.

Agreed, notwithstanding the above comments.  Though to avoid the
problems described above, I think the only way to make this acceptable
would be to prevent hashing of signaling nans.  (Which the decimal
module currently does; it also prevents hashing of quiet NaNs, but I
can't see any good rationale for that.)
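For reference, this is observable in current CPython (which, unlike the decimal module of the time, does allow hashing quiet NaNs):

```python
from decimal import Decimal

# A signaling NaN is unhashable, so it can never get into a set or dict
# in the first place; a quiet NaN hashes without complaint.
try:
    hash(Decimal('sNaN'))
except TypeError as exc:
    print('unhashable:', exc)

hash(Decimal('NaN'))   # no error in current CPython
```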

Mark


Re: [Python-Dev] Mixing float and Decimal -- thread reboot

2010-03-23 Thread Mark Dickinson
On Tue, Mar 23, 2010 at 5:48 PM, Adam Olsen rha...@gmail.com wrote:
 >>> a = Decimal('nan')
 >>> a != a

 They don't follow the behaviour required for being hashable.

What's this required behaviour?  The only rule I'm aware of is that if
a == b then hash(a) == hash(b).  That's not violated here.

Note that containment tests check identity before equality, so there's
no problem with putting (float) nans in sets or dicts:

>>> x = float('nan')
>>> s = {x}
>>> x in s
True

Mark


Re: [Python-Dev] Mixing float and Decimal -- thread reboot

2010-03-23 Thread Mark Dickinson
On Tue, Mar 23, 2010 at 6:07 PM, Adam Olsen rha...@gmail.com wrote:
 On Tue, Mar 23, 2010 at 12:04, Mark Dickinson dicki...@gmail.com wrote:
 Note that containment tests check identity before equality, so there's
 no problem with putting (float) nans in sets or dicts:

 >>> x = float('nan')
 >>> s = {x}
 >>> x in s
 True

 Ergh, I thought that got changed.  Nevermind then.

Hmm.  I think you're right:  it did get changed at some point early in
py3k's history;  I seem to recall that the identity-checking behaviour
got restored before 3.1 was released, though.  There was an issue
about this somewhere, but I'm failing to find it.

Mark


Re: [Python-Dev] Mixing float and Decimal -- thread reboot

2010-03-24 Thread Mark Dickinson
On Wed, Mar 24, 2010 at 5:36 AM, Stephen J. Turnbull step...@xemacs.org wrote:
 Steven D'Aprano writes:

   As usual though, NANs are unintuitive:
  
    >>> d = {float('nan'): 1}
    >>> d[float('nan')] = 2
    >>> d
    {nan: 1, nan: 2}
  
  
   I suspect that's a feature, not a bug.

Right:  distinct nans (i.e., those with different id()) are treated as
distinct set elements or dict keys.

 I don't see how it can be so.  Aren't all of those entries garbage?
 To compute a histogram of results for computations on a series of
 cases would you not have to test each result for NaN-hood, then hash
 on a proxy such as the string "NaN"?

So what alternative behaviour would you suggest, and how would you implement it?

I agree that many aspects of the current treatment of nans aren't
ideal, but as far as I can see that's unavoidable.  For sane
containment testing, Python's == operator needs to give an equivalence
relation.  Meanwhile IEEE 754 requires that nans compare unequal to
themselves, breaking reflexivity.  So there have to be some
compromises somewhere.

The current compromise at least has the virtue that it doesn't require
special-casing nans anywhere in the general containment-testing and
hashing machinery.

One alternative would be to prohibit putting nans into sets and dicts
by making them unhashable;  I'm not sure what that would gain, though.
And there would still be some unintuitive behaviour for containment
testing of nans in lists.

Mark


Re: [Python-Dev] Mixing float and Decimal -- thread reboot

2010-03-24 Thread Mark Dickinson
On Wed, Mar 24, 2010 at 11:47 AM, Nick Coghlan wrote:
 Interning NaN certainly seems like it should be sufficient to eliminate
 the set/dict membership weirdness.

 That is, make it so that the first two lines of the following return
 True, while the latter two lines continue to return False:

 >>> float("nan") is float("nan")
 False
 >>> dec("nan") is dec("nan")
 False
 >>> float("nan") == float("nan")
 False
 >>> dec("nan") == dec("nan")
 False

Yes;  that could be done.  Though as Steven points out, not all NaNs
are equivalent (possibility of different payloads and different
signs), so float nans with different underlying bit patterns, and
Decimal nans with different string representations, would ideally be
interned separately.  For floats it might be possible to get away with
pretending that there's only one nan.  For decimal, I don't think
that's true, since the payload and sign are part of the standard, and
are very visible (e.g. in the repr of the nan).

The obvious way to do this nan interning for floats would be to put
the interning code into PyFloat_FromDouble.  I'm not sure whether this
would be worth the cost in terms of added code (and possibly reduced
performance, since the nan check would be done every time a float was
returned), but I'd be willing to review a patch.

Mark


Re: [Python-Dev] Mixing float and Decimal -- thread reboot

2010-03-24 Thread Mark Dickinson
On Wed, Mar 24, 2010 at 6:26 PM, Alexander Belopolsky
alexander.belopol...@gmail.com wrote:
 Mark, I wonder if you could describe an algorithm off the top of your
 head that relies on NaN == NaN being false.


No, I certainly couldn't!  And I often wonder if the original IEEE 754
committee, given 20/20 foresight, would have made the same decisions
regarding comparisons of nans.  It's certainly not one of my favourite
features of IEEE 754.  (Though sqrt(-0.) -> -0. ranks lower for me.
Grr.)

A bogus application that I've often seen mentioned is that it allows
checking whether a float 'x' is a nan by doing 'x == x';  but the
proper way to do this is to have an 'isnan' function or method, so
this isn't particularly convincing.
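In code, the two styles side by side:

```python
import math

x = float('nan')
assert x != x          # the equality trick: only a nan compares unequal to itself
assert math.isnan(x)   # the proper, explicit test
```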

Slightly more convincing is history:  this is the way that nan
comparisons behave in other languages (Fortran, C) used for numerics.
If Python were to do something different then a naively translated
algorithm from another language would fail.  It's the behaviour that
numerically-aware people expect, and I'd expect to get complaints from
those people if it changed.

Mark


[Python-Dev] Why is nan != nan?

2010-03-24 Thread Mark Dickinson
[Changing the subject line, since we're way off the original topic]

On Wed, Mar 24, 2010 at 7:04 PM, Alexander Belopolsky
alexander.belopol...@gmail.com wrote:
 On Wed, Mar 24, 2010 at 2:50 PM, Mark Dickinson dicki...@gmail.com wrote:
 ..
  If Python were to do something different then a naively translated
 algorithm from another language would fail.  It's the behaviour that
 numerically-aware people expect, and I'd expect to get complaints from
 those people if it changed.

 Numerically-aware people are likely to be aware of the differences in
 languages that they use.

Sure.  But I'd still expect them to complain.  :)

Here's an interesting recent blog post on this subject, from the
creator of Eiffel:

http://bertrandmeyer.com/2010/02/06/reflexivity-and-other-pillars-of-civilization/

Mark


Re: [Python-Dev] Mixing float and Decimal -- thread reboot

2010-03-24 Thread Mark Dickinson
Slight change of topic.  I've been implementing the extra comparisons
required for the Decimal type and found an anomaly while testing.
Currently in py3k, order comparisons (but not ==, !=) between a
complex number and another complex, float or int raise TypeError:

>>> z = complex(0, 0)
>>> z < int()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unorderable types: complex() < int()
>>> z < float()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unorderable types: complex() < float()
>>> z < complex()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unorderable types: complex() < complex()

But Fraction is the odd man out:  a comparison between a Fraction and
a complex raises a TypeError for complex numbers with nonzero
imaginary component, but returns a boolean value if the complex number
has zero imaginary component:

>>> z < Fraction()
False
>>> complex(0, 1) < Fraction()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unorderable types: complex() < Fraction()

I'm tempted to call this Fraction behaviour a bug, but maybe it arises
from the numeric integration themes of PEP 3141.  Any ideas?

Mark


Re: [Python-Dev] Mixing float and Decimal -- thread reboot

2010-03-24 Thread Mark Dickinson
On Wed, Mar 24, 2010 at 8:56 PM, Raymond Hettinger
raymond.hettin...@gmail.com wrote:
 FWIW, my viewpoint on this is softening over time
 and I no longer feel a need to push for a new context flag.

 It is probably simplest for users if implicit coercions didn't come
 with control knobs.  We already have Fraction + float -> float
 occurring without any exceptions or warnings, and nothing
 bad has happened as a result.

I agree with this;  I'd be happy to avoid the control knobs.

Mark


Re: [Python-Dev] Why is nan != nan?

2010-03-24 Thread Mark Dickinson
On Wed, Mar 24, 2010 at 10:30 PM, Alexander Belopolsky
alexander.belopol...@gmail.com wrote:
 On Wed, Mar 24, 2010 at 6:21 PM, Raymond Hettinger
 raymond.hettin...@gmail.com wrote:
 ..
 If we want to be able to reason about our programs,
 then we need to rely on equality relations being
 reflexive, symmetric, and transitive.  Otherwise,
 containers and functions can't even make minimal
 guarantees about what they do.

 +1

 ..  We should probably draw the
 line at well-defined numeric contexts such as the decimal module
 and stop trying to propagate NaN awareness throughout the
 entire object model.

 I am not sure what this means in practical terms.   Should
 float('nan') == float('nan') return True or should float('nan') raise
 an exception to begin with?   I would prefer the former.


Neither is necessary, because Python doesn't actually use == as the
equivalence relation for containment testing:  the actual equivalence
relation is:  x equivalent to y iff id(x) == id(y) or x == y.  This
restores the missing reflexivity (besides being a useful
optimization).
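A sketch of that relation (not the actual C code, which inlines the identity check in the containment machinery):

```python
def equivalent(x, y):
    """The relation containment testing actually uses: identity OR equality."""
    return x is y or x == y

nan = float('nan')
assert nan != nan                          # == alone is not reflexive for nans
assert equivalent(nan, nan)                # but the containment relation is
assert not equivalent(nan, float('nan'))   # distinct nan objects stay distinct
```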

Mark


Re: [Python-Dev] Why is nan != nan?

2010-03-24 Thread Mark Dickinson
On Wed, Mar 24, 2010 at 10:36 PM, Alexander Belopolsky
alexander.belopol...@gmail.com wrote:
 On Wed, Mar 24, 2010 at 6:31 PM, Mark Dickinson dicki...@gmail.com wrote:
 ..
 Neither is necessary, because Python doesn't actually use == as the
 equivalence relation for containment testing:  the actual equivalence
 relation is:  x equivalent to y iff id(x) == id(y) or x == y.  This
 restores the missing reflexivity (besides being a useful
 optimization).

 No, it does not:

 >>> float('nan') in [float('nan')]
 False

Sure, but just think of it as having two different nans there.  (You
could imagine thinking of the id of the nan as part of the payload.)
There's no ideal solution here;  IMO, the compromise that currently
exists is an acceptable one.

Mark


Re: [Python-Dev] Why is nan != nan?

2010-03-24 Thread Mark Dickinson
On Wed, Mar 24, 2010 at 10:52 PM, Alexander Belopolsky
alexander.belopol...@gmail.com wrote:
 On Wed, Mar 24, 2010 at 6:47 PM, Mark Dickinson dicki...@gmail.com wrote:
 ..
 There's no ideal solution here;  IMO, the compromise that currently
 exists is an acceptable one.

 I don't see a compromise.   So far I failed to find a use case that
 benefits from NaN violating reflexivity.


So if I understand correctly, you propose that float('nan') ==
float('nan') return True.  Would you also suggest extending this
behaviour to Decimal nans?


Re: [Python-Dev] Why is nan != nan?

2010-03-24 Thread Mark Dickinson
On Wed, Mar 24, 2010 at 11:11 PM, Alexander Belopolsky
alexander.belopol...@gmail.com wrote:
 On Wed, Mar 24, 2010 at 7:02 PM, Mark Dickinson dicki...@gmail.com wrote:
 ..

 So if I understand correctly, you propose that float('nan') ==
 float('nan') return True.  Would you also suggest extending this
 behaviour to Decimal nans?


 yes


Okay.  So now it seems to me that there are many decisions to make:
should any Decimal nan compare equal to any other?  What if the two
nans have different payloads or signs?  How about comparing a
signaling nan with either an arbitrary quiet nan, or with the exact
quiet nan that corresponds to the signaling nan?  How do decimal nans
compare with float nans?  Should Decimal.compare(Decimal('nan'),
Decimal('nan')) return 0 rather than nan?  If not, how do you justify
the difference between == and compare?  If so, how do you justify the
deviation from the standard on which the decimal module is based?

In answering all these questions, you effectively end up developing
your own standard, and hoping that all the answers you chose are
sensible, consistent, and useful.

Alternatively, we could do what we're currently doing:  make use of
*existing* standards to answer these questions, and rely on the
expertise of the many who've thought about this in depth.

You say that you don't see any compromise:  I say that there's value
in adhering to (de facto and de jure) standards, and I see a
compromise between standards adherence and Python pragmatics.

Mark


Re: [Python-Dev] Why is nan != nan?

2010-03-25 Thread Mark Dickinson
On Thu, Mar 25, 2010 at 11:22 AM, Nick Coghlan ncogh...@gmail.com wrote:
 Mark Dickinson wrote:
 Here's an interesting recent blog post on this subject, from the
 creator of Eiffel:

 http://bertrandmeyer.com/2010/02/06/reflexivity-and-other-pillars-of-civilization/

 Interesting. So the natural tweak that would arise from that perspective
 is for us to restore reflexivity by declaring that any given NaN is
 equal to *itself* but not to any other NaN (even one with the same payload).

 With NaN (in general) not being interned, that would actually fit the
 idea of a NaN implicitly carrying the operation that created the NaN as
 part of its definition of equivalence.

 So, I'm specifically putting that proposal on the table for both float
 and Decimal NaNs in Python:

  Not a Number is not a single floating point value. Instead each
  instance is a distinct value representing the precise conditions that
  created it. Thus, two NaN values x and y will compare equal iff they
  are the exact same NaN object (i.e. if isnan(x) then x == y iff
  x is y).

In other words, this would make explicit, at the level of ==, what
Python's already doing under the hood (e.g. in
PyObject_RichCompareBool) for membership testing---at least for nans.

 As stated above, such a change would allow us to restore reflexivity
 (eliminating a bunch of weirdness) while still honouring the idea of NaN
 being a set of values rather than a single value.

+0.2 from me.  I could happily live with this change;  but could also
equally live with the existing weirdness.

It's still a little odd for an immutable type to care about object
identity, but I guess oddness comes with the floating-point territory.
 :)

Mark


Re: [Python-Dev] Why is nan != nan?

2010-03-25 Thread Mark Dickinson
On Thu, Mar 25, 2010 at 11:22 AM, Nick Coghlan ncogh...@gmail.com wrote:
 So, I'm specifically putting that proposal on the table for both float
 and Decimal NaNs in Python:

  Not a Number is not a single floating point value. Instead each
  instance is a distinct value representing the precise conditions that
  created it. Thus, two NaN values x and y will compare equal iff they
  are the exact same NaN object (i.e. if isnan(x) then x == y iff
  x is y).

I'd also suggest that the language make no guarantees about whether
two distinct calls to float('nan') or Decimal('nan') (or any other
function call returning a nan) return identical values or not, but
leave implementations free to do what's convenient or efficient.

For example, with the current decimal module:   Decimal('nan') returns
a new nan each time, but Decimal(-1).sqrt() always returns the same
nan object (assuming that InvalidOperation isn't trapped).  I think
it's fine to regard this as an implementation detail.

Python 2.6.2 (r262:71600, Aug 26 2009, 09:40:44)
[GCC 4.2.1 (SUSE Linux)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from decimal import *
>>> getcontext().traps[InvalidOperation] = 0
>>> x, y = Decimal('nan'), Decimal('nan')
>>> id(x), id(y)
(47309953516000, 47309930620880)
>>> x, y = Decimal(-1).sqrt(), Decimal(-1).sqrt()
>>> id(x), id(y)
(9922272, 9922272)

Mark


Re: [Python-Dev] Why is nan != nan?

2010-03-25 Thread Mark Dickinson
On Thu, Mar 25, 2010 at 12:36 PM, Jesus Cea j...@jcea.es wrote:

 On 03/25/2010 07:54 AM, Georg Brandl wrote:
 >>> float('nan') in [float('nan')]
 False

 Sure, but just think of it as having two different nans there.  (You
 could imagine thinking of the id of the nan as part of the payload.)

 That's interesting.  Thinking of each value created by float('nan') as
 a different nan makes sense to my naive mind, and it also explains
 nicely the behavior present right now.  Each nan comes from a different
 operation and therefore is a different non-number.

 Infinities are not equal for a good reason, for example.

Well, that depends on your mathematical model.  The underlying
mathematical model for IEEE 754 floating-point is the doubly extended
real line:  that is, the set of all real numbers augmented with two
extra elements infinity and -infinity, with the obvious total
order.  This is made explicit in section 3.2 of the standard:

The mathematical structure underpinning the arithmetic in this
standard is the extended reals, that is, the set
of real numbers together with positive and negative infinity.

This is the same model that one typically uses in a first course in
calculus when studying limits of functions; it's an appropriate model
for dealing with computer approximations to real numbers and
continuous functions.  So the model has precisely two infinities, and
1/0, 2/0 and (1/0)**(1/0) all give the same infinity.   The decision
to make 1/0 infinity rather than -infinity is admittedly a little
arbitrary.  For floating-point (but not for calculus!), it makes sense
in the light of the decision to have both positive and negative
floating-point zeros;  1/(-0) is -infinity, of course.

Other models of reals + one or more infinities are possible, of
course, but they're not relevant to IEEE 754 floating point.  There's
a case for using a floating-point model with a single infinity,
especially for those who care more about algebraic functions
(polynomials, rational functions) than transcendental ones;  however,
IEEE 754 doesn't make provisions for this.

Mark


Re: [Python-Dev] Why is nan != nan?

2010-03-25 Thread Mark Dickinson
On Thu, Mar 25, 2010 at 12:39 PM, Jesus Cea j...@jcea.es wrote:

 But IEEE 754 was created by pretty clever guys and sure they had a
 reason for define things in the way they are. Probably we are missing
 something.

Well, if we are, then nobody seems to know what!  See the Bertrand
Meyer blog post that was linked to up-thread.

Mark


Re: [Python-Dev] Why is nan != nan?

2010-03-25 Thread Mark Dickinson
On Thu, Mar 25, 2010 at 2:08 PM, Nick Coghlan ncogh...@gmail.com wrote:
 Jesus Cea wrote:
 But IEEE 754 was created by pretty clever guys and sure they had a
 reason for defining things the way they are. Probably we are missing
 something.

 Yes, this is where their implementable in a hardware circuit focus
 comes in. They were primarily thinking of a floating point
 representation where the 32/64 bits are *it* - you can't have multiple
 NaNs because you don't have the bits available to describe them.

I'm not so sure about this:  standard 64-bit binary IEEE 754 doubles
allow for 2**53-2 different nans (2**52-2 signaling nans, 2**52 quiet
nans):  anything with bit pattern (msb to lsb)

x11111111111xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

is an infinity or a nan, and there are only 2 infinities.
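The count of nan encodings follows directly from the bit pattern (a quick arithmetic check, added for illustration):

```python
# IEEE 754 binary64 encodings with the exponent field all ones:
# 2 sign choices times 2**52 mantissa patterns.  For each sign, the
# all-zero mantissa encodes an infinity; every other pattern is a nan.
all_special = 2 * 2**52
infinities = 2
nans = all_special - infinities
assert nans == 2**53 - 2   # matches the figure quoted in the message
```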

Mark


Re: [Python-Dev] Why is nan != nan?

2010-03-25 Thread Mark Dickinson
On Thu, Mar 25, 2010 at 2:26 PM, Antoine Pitrou solip...@pitrou.net wrote:
 On Thu, 25 Mar 2010 07:19:24 -0700, Curt Hagenlocher wrote:
 Wait, what? I haven't been paying much attention, but this is backwards.
 There are multiple representations of NaN in the IEEE encoding; that's
 actually part of the problem with saying that NaN == NaN or NaN != NaN.
 If you want to ignore the payload in the NaN, then you're not just
 comparing bits any more.

 This sounds a bit sophistic, if the (Python) user doesn't have access to
 the payload anyway.

Well, you can get at the payload using the struct module, if you care
enough.  But yes, it's true that Python doesn't take much care with
the payload:  e.g., ideally, an operation on a nan (3.0 + nan,
sqrt(nan), ...) should return exactly the same nan, to make sure that
information in the payload is preserved.  Python doesn't bother, for
floats (though it does for decimal).

Mark


Re: [Python-Dev] Why is nan != nan?

2010-03-25 Thread Mark Dickinson
On Thu, Mar 25, 2010 at 2:42 PM, Mark Dickinson dicki...@gmail.com wrote:
 On Thu, Mar 25, 2010 at 2:26 PM, Antoine Pitrou solip...@pitrou.net wrote:
 This sounds a bit sophistic, if the (Python) user doesn't have access to
 the payload anyway.

 Well, you can get at the payload using the struct module, if you care
 enough.  But yes, it's true that Python doesn't take much care with
 the payload:  e.g., ideally, an operation on a nan (3.0 + nan,
 sqrt(nan), ...) should return exactly the same nan, to make sure that
 information in the payload is preserved.  Python doesn't bother, for
 floats (though it does for decimal).

Hmm. I take it back.  I was being confused by the fact that sqrt(nan)
returns a nan with a new identity;  but it does apparently preserve
the payload.  An example:

>>> from struct import pack, unpack
>>> from math import sqrt
>>> x = unpack('<d', pack('<Q', (2047 << 52) + 12345))[0]
>>> y = sqrt(x)
>>> bin(unpack('<Q', pack('<d', x))[0])
'0b111111111110000000000000000000000000000000000000011000000111001'
>>> bin(unpack('<Q', pack('<d', y))[0])
'0b111111111111000000000000000000000000000000000000011000000111001'

Here you see that the payload has been preserved.  The bit patterns
aren't quite identical:  the incoming nan was actually a signaling
nan, which got silently (because neither Python nor C understands
signaling nans) 'silenced' by setting bit 51.  So the output is the
corresponding quiet nan, with the same sign and payload.

Mark


Re: [Python-Dev] Why is nan != nan?

2010-03-25 Thread Mark Dickinson
On Thu, Mar 25, 2010 at 3:01 PM, Curt Hagenlocher c...@hagenlocher.org wrote:
 On Thu, Mar 25, 2010 at 7:54 AM, Mark Dickinson dicki...@gmail.com wrote:

 Hmm. I take it back.  I was being confused by the fact that sqrt(nan)
 returns a nan with a new identity;  but it does apparently preserve
 the payload.  An example:

 I played with this some a few months ago, and both the FPU and the C
 libraries I tested will preserve the payload. I imagine Python just
 inherits their behavior.

Pretty much, yes.  I think we've also taken care to preserve payloads
in functions that have been added to the math library as well (e.g.,
the gamma function).  Not that that's particularly hard:  it's just a
matter of making sure to do if (isnan(x)) return x; rather than if
(isnan(x)) return standard_python_nan;.  If that's not true, then
there's a minor bug to be corrected.

Mark


Re: [Python-Dev] Why is nan != nan?

2010-03-25 Thread Mark Dickinson
On Thu, Mar 25, 2010 at 3:05 PM, Nick Coghlan ncogh...@gmail.com wrote:
 Mark Dickinson wrote:
 On Thu, Mar 25, 2010 at 2:08 PM, Nick Coghlan ncogh...@gmail.com wrote:
 Jesus Cea wrote:
 But IEEE 754 was created by pretty clever guys and sure they had a
 reason for defining things the way they are. Probably we are missing
 something.
 Yes, this is where their implementable in a hardware circuit focus
 comes in. They were primarily thinking of a floating point
 representation where the 32/64 bits are *it* - you can't have multiple
 NaNs because you don't have the bits available to describe them.

 I'm not so sure about this:  standard 64-bit binary IEEE 754 doubles
 allow for 2**53-2 different nans (2**52-2 signaling nans, 2**52 quiet
 nans):  anything with bit pattern (msb to lsb)

 x11111111111xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

 is an infinity or a nan, and there are only 2 infinities.

 I stand corrected :)

 It still seems to me that the problems mostly arise when we're trying to
 get floats and Decimals to behave like Python *objects* (i.e. with
 reflexive equality) rather than like IEEE defined numbers.

 It's an extra element that isn't part of the problem the numeric
 standards are trying to solve.

Agreed.  We don't have to be missing something;  rather, the IEEE
folks (quite understandably) almost certainly didn't anticipate this
kind of usage.  So I'll concede that it's reasonable to consider
deviating from the standard in the light of this.

Mark


Re: [Python-Dev] Mixing float and Decimal -- thread reboot

2010-03-27 Thread Mark Dickinson
On Thu, Mar 25, 2010 at 1:15 AM, Jeffrey Yasskin jyass...@gmail.com wrote:
 On Wed, Mar 24, 2010 at 2:09 PM, Mark Dickinson dicki...@gmail.com wrote:
 Slight change of topic.  I've been implementing the extra comparisons
 required for the Decimal type and found an anomaly while testing.
 Currently in py3k, order comparisons (but not ==, !=) between a
 complex number and another complex, float or int raise TypeError:

 >>> z = complex(0, 0)
 >>> z < int()
 Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
 TypeError: unorderable types: complex() < int()
 >>> z < float()
 Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
 TypeError: unorderable types: complex() < float()
 >>> z < complex()
 Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
 TypeError: unorderable types: complex() < complex()

 But Fraction is the odd man out:  a comparison between a Fraction and
 a complex raises a TypeError for complex numbers with nonzero
 imaginary component, but returns a boolean value if the complex number
 has zero imaginary component:

 >>> z < Fraction()
 False
 >>> complex(0, 1) < Fraction()
 Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
 TypeError: unorderable types: complex() < Fraction()

 I'm tempted to call this Fraction behaviour a bug, but maybe it arises
 from the numeric integration themes of PEP 3141.  Any ideas?

 I'd call it a bug.


Thanks, Jeffrey (and everyone else who answered).  Fixed in r79456
(py3k) and r79455 (trunk).

Mark


Re: [Python-Dev] Why is nan != nan?

2010-03-27 Thread Mark Dickinson
On Fri, Mar 26, 2010 at 11:16 PM, Raymond Hettinger
raymond.hettin...@gmail.com wrote:
 Of the ideas I've seen in this thread, only two look reasonable:
 * Do nothing.  This is attractive because it doesn't break anything.
 * Have float.__eq__(x, y) return True whenever x and y are
    the same NaN object.  This is attractive because it is a
    minimal change that provides a little protection for
    simple containers.
 I support either of those options.

Yes;  those are the only two options I've seen that seem workable.  Of
the two, I prefer the first (do nothing), but would be content with
second.

I'd be interested to know whether there's any real-life code that's
suffering as a result of nan != nan.  While the nan weirdnesses
certainly exist, I'm having difficulty imagining them turning up in
real code.

Casey Duncan's point that there can't be many good uses for floats as
dict keys or set members is compelling, though there may be
type-agnostic applications that care (e.g., memoizing).  Similarly,
putting floats into a list must be very common, but I'd guess that
checking whether a given float is in a list doesn't happen that often.
 I suspect that (nan+container)-related oddities turn up infrequently
enough to make it not worth fixing.
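The memoizing case is easy to demonstrate (an illustrative sketch, not from the original message): a nan key can be looked up again via the same object, but a freshly created nan misses.

```python
nan = float('nan')
cache = {nan: 'value'}
assert cache[nan] == 'value'       # same object: identity shortcut finds it
assert float('nan') not in cache   # a distinct nan object is never found
```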

By the way, for those suggesting that any operation producing a nan
raise an exception instead:  Python's math module actually does go out
of its way to protect naive users from nans.  You can't get a nan out
of any of the math module functions without having put a nan in in the
first place.  Invalid operations like math.sqrt(-1), math.log(-1),
consistently raise ValueError rather than return a nan.  Ideally I'd
like to see this behaviour extended to arithmetic as well, so that
e.g., float('inf')/float('inf') raises instead of producing
float('nan') (and similarly 1e300 * 1e300 raises OverflowError instead
of producing an infinity), but there are backwards compatibility
concerns.  But even then, I'd still want it to be possible to produce
nans deliberately when necessary, e.g., by directly calling
float('nan').

Python also needs to be able to handle floating-point data generated
from other sources;  for this alone it should be at least able to read
and write infinities and nans.

Mark


Re: [Python-Dev] Optional delta argument for assertAlmostEqual

2010-03-27 Thread Mark Dickinson
On Sat, Mar 27, 2010 at 12:59 AM, Michael Foord
fuzzy...@voidspace.org.uk wrote:
 Hello all,

 A user has suggested an optional argument to
 unittest.TestCase.assertAlmostEqual for specifying a maximum difference
 between the expected and actual values, instead of using rounding.

+1.

Mark


Re: [Python-Dev] Mixing float and Decimal -- thread reboot

2010-04-02 Thread Mark Dickinson
On Mon, Mar 22, 2010 at 7:52 PM, Guido van Rossum gu...@python.org wrote:
 On Mon, Mar 22, 2010 at 11:36 AM, Raymond Hettinger
 raymond.hettin...@gmail.com wrote:
 One other thought.

 The Decimal constructor should now accept floats as a possible input type.
 Formerly, we separated that out to Decimal.from_float() because
 decimals weren't interoperable with floats.

 Not sure this follows; Fraction(1.1) raises an exception, you have to
 use Fraction.from_float().

Is there any good reason for this, other than a parallel with Decimal?
 It seems to me that Raymond's arguments for allowing direct
construction of a Decimal from a float apply equally well to the
Fraction type.

If we're going to allow Decimal(1.1), I'd like to allow Fraction(1.1)
to succeed as well (giving the equivalent of
Fraction.from_float(1.1)).

The main argument against allowing this (for both Fraction and
Decimal) seems to be that the result of Decimal(1.1) or Fraction(1.1)
could be confusing.  But it's an immediate, explicit confusion, which
can be quickly resolved by pointing the confusee to the section on
floating-point in the appendix, so I don't find this objection
particularly compelling.
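The "immediate, explicit confusion" is easy to see with Fraction.from_float (an illustrative check, not from the original message): the float literal 1.1 is not the rational number 11/10.

```python
from fractions import Fraction

# from_float recovers the exact binary value of the double nearest 1.1
f = Fraction.from_float(1.1)
assert f == Fraction(2476979795053773, 2251799813685248)
assert f != Fraction(11, 10)   # not the decimal value the literal suggests
assert float(f) == 1.1         # the float -> Fraction conversion is lossless
```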

Mark


Re: [Python-Dev] [buildbots] 'stop build' button causing subsequent builds to fail?

2010-04-02 Thread Mark Dickinson
On Fri, Apr 2, 2010 at 3:54 PM, Stefan Krah ste...@bytereef.org wrote:

 It looks like the 'stop build' button can a) cause subsequent builds to fail
 and b) cause pending builds to be deleted from the queue.


 a) http://www.python.org/dev/buildbot/builders/ARM Linux EABI 3.x/builds/18
   was apparently interrupted by a 'stop build' for a previous build.

Actually, I think that was me being impatient.  I was trying to get
some information about the float.fromhex test failure on ARM
(bugs.python.org/issue8265) and didn't want to wait days.  :)

 b) I stopped http://www.python.org/dev/buildbot/builders/sparc solaris10 gcc 
 3.x/builds/558
   and a pending build vanished (I'm certain that I used 'stop build' and not 
 'cancel all').

Don't know about this one.

Mark


Re: [Python-Dev] ffi junk messages

2010-04-07 Thread Mark Dickinson
On Wed, Apr 7, 2010 at 1:39 PM, Jeroen Ruigrok van der Werven
asmo...@in-nomine.org wrote:
 Before I file a bug report, is anyone else seeing this (in my case on
 FreeBSD 8):

 Modules/_ctypes/libffi/src/x86/sysv.S:360: Error: junk at end of line, first 
 unrecognized character is `@'
 Modules/_ctypes/libffi/src/x86/sysv.S:387: Error: junk at end of line, first 
 unrecognized character is `@'
 Modules/_ctypes/libffi/src/x86/sysv.S:423: Error: junk at end of line, first 
 unrecognized character is `@'

It's on the buildbots, too.  See:

http://www.python.org/dev/buildbot/builders/x86%20FreeBSD%20trunk/builds/208/steps/compile/logs/stdio

Mark


Re: [Python-Dev] Very Strange Argument Handling Behavior

2010-04-16 Thread Mark Dickinson
On Fri, Apr 16, 2010 at 9:04 AM, Hagen Fürstenau ha...@zhuliguan.net wrote:
 This behavior seems pretty strange to me, indeed PyPy gives the
 TypeError for both attempts.  I just wanted to confirm that it was in
 fact intentional.

 Oleg already answered why f(**{1:3}) raises a TypeError. But your
 question seems to be rather why dict(**{1:3}) doesn't.

 For functions implemented in Python, non-string arguments are always
 rejected, but C functions (like the dict constructor) don't have to
 reject them. I don't see any benefit in allowing them, but it's probably
 not worth breaking code by disallowing them either.

dict(x, **y) as an expression version of x.update(y) seems to be
fairly well known[1], so disallowing non-string keyword arguments
seems likely to break existing code, as well as (probably?) harming
performance.  So I can't see CPython changing here.  I'm not sure
whether other implementations should be required to follow suit,
though---maybe this should be regarded as an implementation-defined
detail?
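The idiom in question, for readers unfamiliar with it (an illustrative sketch):

```python
# dict(x, **y): an expression-level copy-then-update, relying on
# keyword-argument unpacking of (string) keys.
x = {'a': 1, 'b': 2}
y = {'b': 3, 'c': 4}
merged = dict(x, **y)
assert merged == {'a': 1, 'b': 3, 'c': 4}   # y's entries win on collision
assert x == {'a': 1, 'b': 2}                # x itself is left untouched
```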

Mark

[1] 
http://stackoverflow.com/questions/38987/how-can-i-merge-two-python-dictionaries-as-a-single-expression
)


Re: [Python-Dev] Very Strange Argument Handling Behavior

2010-04-16 Thread Mark Dickinson
On Fri, Apr 16, 2010 at 3:33 PM, Guido van Rossum gu...@python.org wrote:
 On Fri, Apr 16, 2010 at 7:15 AM, Nick Coghlan ncogh...@gmail.com wrote:
 I would agree with leaving it implementation defined - I don't think
 either PyPy or CPython should be forced to change their current
 behaviour in relation to this. A minor note in the language reference to
 that effect may be worthwhile just to make that stance official.

 That is just going to cause some programs to have a portability
 surprise. I think one or the other should be fixed. I am fine with
 declaring dict({}, **{1:3}) illegal, since after all it is abuse of
 the ** mechanism. We should deprecate it in at least one version
 though.

Okay;  I'll open an issue for deprecation in 3.2 and removal in 3.3.

Can this sneak in under the 'incorrect language semantics' exemption
for PEP 3003 (the moratorium PEP)?  If not, then deprecation
presumably has to wait for 3.3.

Mark


Re: [Python-Dev] Very Strange Argument Handling Behavior

2010-04-16 Thread Mark Dickinson
On Fri, Apr 16, 2010 at 3:57 PM, Antoine Pitrou solip...@pitrou.net wrote:
 Mark Dickinson dickinsm at gmail.com writes:

 Okay;  I'll open an issue for deprecation in 3.2 and removal in 3.3.

 Can this sneak in under the 'incorrect language semantics' exemption
 for PEP 3003 (the moratorium PEP)?  If not, then deprecation
 presumably has to wait for 3.3.

 It seems that in spirit the moratorium applies more to language additions than
 to removals/limitations. The goal being that alternate implementation stop
 chasing a moving target in terms of features.

 So IMVHO it is fine for 3.2.

Removing it certainly seems in keeping with the goal of making life
easier for alternate implementations.  (Out of curiosity, does anyone
know what IronPython does here?)

I've opened http://bugs.python.org/issue8419

Mark


Re: [Python-Dev] bbreport

2010-04-17 Thread Mark Dickinson
On Sat, Apr 17, 2010 at 7:41 PM, Victor Stinner
victor.stin...@haypocalc.com wrote:
 Ezio and Florent are developing a tool called bbreport to collect buildbot
 results and generate short reports to the command line. It's possible to
 filter results by Python branch, builder name, etc. I send patches to link
 failed tests to existing issues to see quickly known failures vs new failures.
 This tool becomes really useful to analyze buildbot results!

Seconded.  I've been using this for a few days, and found it
especially useful to be able to get a quick summary of exactly *which*
tests are failing on the various buildbots.

 bbreport requires Python trunk (2.7) and color output only works on UNIX/BSD
 OS (ie. not Windows).

Does it really need trunk?  I've been running it under 2.6 without
problems, but I probably haven't explored all the options fully.

Mark


Re: [Python-Dev] PEP 328, relative imports and Python 2.7

2010-04-21 Thread Mark Dickinson
On Wed, Apr 21, 2010 at 2:40 PM, Barry Warsaw ba...@python.org wrote:
 While talking about Python 2.6 -> 2.7 transitions, the subject of relative and
 absolute imports has come up.  PEP 328 states that absolute imports will be
 enabled by default in Python 2.7, however I cannot verify that this has
 actually happened.

I'm fairly sure it hasn't.  I brought this up on python-dev in
February (around Feb 2nd;  thread entitled 'Absolute imports in Python
2.x'), but for some reason I can only find the tail end of that thread
on mail.python.org:

http://mail.python.org/pipermail/python-dev/2010-February/097458.html

 Python 2.7?  If not, given that we're into beta, I don't think we can do it
 now, so I would suggest updating the PEP.

Agreed.  There's also the question of whether deprecation warnings or
-3 warnings should be raised;  see

http://bugs.python.org/issue7844

Mark


Re: [Python-Dev] PEP 328, relative imports and Python 2.7

2010-04-21 Thread Mark Dickinson
On Wed, Apr 21, 2010 at 2:56 PM, Mark Dickinson dicki...@gmail.com wrote:
 On Wed, Apr 21, 2010 at 2:40 PM, Barry Warsaw ba...@python.org wrote:
 While talking about Python 2.6 -> 2.7 transitions, the subject of relative 
 and
 absolute imports has come up.  PEP 328 states that absolute imports will be
 enabled by default in Python 2.7, however I cannot verify that this has
 actually happened.

 I'm fairly sure it hasn't.  I brought this up on python-dev in
 February (around Feb 2nd;  thread entitled 'Absolute imports in Python
 2.x'), but for some reason I can only find the tail end of that thread
 on mail.python.org:

 http://mail.python.org/pipermail/python-dev/2010-February/097458.html

Ah, here's a better link to a different archive of the previous discussion.

http://www.mail-archive.com/python-dev@python.org/msg45275.html

Mark


Re: [Python-Dev] Did I miss the decision to untabify all of the C code?

2010-05-09 Thread Mark Dickinson
On Thu, May 6, 2010 at 4:52 AM, Joao S. O. Bueno jsbu...@python.org.br wrote:
 On Wed, May 5, 2010 at 9:59 PM, Eric Smith e...@trueblade.com wrote:
 That's my point. Since it's basically unreviewable, is it smart to do it
 during a beta?

 Hello folks -
 I don't think these modifications are that unreviewable: the
 generated binaries have to be exactly the same with the untabified
 files don't they? So it's a matter of stashing the binaries, applying
 the patches, rebuild and check to see if the binaries match. Any
 possible script defects undetected by this would be only (C code)
 indentation, which could be fixed later.

That's not foolproof, though:  there are lots of sections of code that
will only get compiled on certain platforms, or with certain configure
options, etc.
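
For what it's worth, the stash-and-compare step itself is easy to script; here's a minimal sketch (directory names are placeholders) that hashes the object files under two build trees:

```python
import hashlib
from pathlib import Path

def tree_digests(root):
    """Map relative path -> SHA-256 hex digest for every .o file under root."""
    root = Path(root)
    return {
        path.relative_to(root): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in root.rglob("*.o")
    }

# Usage (paths are placeholders for two otherwise-identical builds):
#     assert tree_digests("build-before") == tree_digests("build-after")
```

But matching digests still only vouch for the code paths that this particular configuration actually compiled.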

Mark


Re: [Python-Dev] Incorrect length of collections.Counter objects / Multiplicity function

2010-05-20 Thread Mark Dickinson
On Tue, May 18, 2010 at 11:00 PM, Gustavo Narea m...@gustavonarea.net wrote:
 I've checked the new collections.Counter class and I think I've found a bug:

  >>> from collections import Counter
  >>> c1 = Counter([1, 2, 1, 3, 2])
  >>> c2 = Counter([1, 1, 2, 2, 3])
  >>> c3 = Counter([1, 1, 2, 3])
  >>> c1 == c2 and c3 not in (c1, c2)
 True
  >>> # Perfect, so far. But... There's always a but:
 ...
  >>> len(c1)
 3

This is the intended behaviour;  it also agrees with what you get when
you iterate
over a Counter object:

>>> list(c1)
[1, 2, 3]

As I understand it, there are other uses for Counter objects besides
treating them
as multisets;  I think the choices for len() and iter() reflected
those other uses.
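
To make the distinction concrete, here's the multiset-cardinality view alongside the dict view (a small illustrative session, not taken from the original report):

```python
from collections import Counter

c1 = Counter([1, 2, 1, 3, 2])

assert len(c1) == 3                   # number of *distinct* elements (dict keys)
assert sum(c1.values()) == 5          # multiset cardinality, with multiplicity
assert sorted(c1) == [1, 2, 3]        # iteration visits each key once
assert sorted(c1.elements()) == [1, 1, 2, 2, 3]   # elements with repetition
```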

 Is this the intended behavior? If so, I'd like to propose a proper multiset
 implementation for the standard library (preferably called Multiset; should
 I create a PEP?).

Feel free!  The proposal should probably go to python-list or
python-ideas rather
than here, though.

See also this recent thread on python-list, and in particular the messages
from Raymond Hettinger in that thread:

http://mail.python.org/pipermail/python-list/2010-March/thread.html

Mark


Re: [Python-Dev] Incorrect length of collections.Counter objects / Multiplicity function

2010-05-20 Thread Mark Dickinson
On Thu, May 20, 2010 at 10:18 PM, Mark Dickinson dicki...@gmail.com wrote:
 See also this recent thread on python-list, and in particular the messages
 from Raymond Hettinger in that thread:

 http://mail.python.org/pipermail/python-list/2010-March/thread.html

Sorry, bad thread link.  Try:

http://mail.python.org/pipermail/python-list/2010-March/1238730.html

instead.


Re: [Python-Dev] variable name resolution in exec is incorrect

2010-05-26 Thread Mark Dickinson
On Wed, May 26, 2010 at 10:15 AM, Colin H hawk...@gmail.com wrote:
   issue991196 was closed being described as intentional.  I've added
 a comment in that issue which argues that this is a serious bug (also
 asserted by a previous commenter - Armin Rigo), because it creates a
 unique, undocumented, oddly behaving scope that doesn't apply closures
 correctly. At the very least I think this should be acknowledged as a
 plain old bug (rather than a feature), and then a discussion about
 whether it will be fixed or not.

Here's a quick recap of the issue so that people don't have to go
searching through the bug archive.  In Python 2.x, we get the
following behaviour:

>>> code = """\
... y = 3
... def f():
...     return y
... f()
... """
>>> exec code in {}       # works fine
>>> exec code in {}, {}   # dies with a NameError
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<string>", line 4, in <module>
  File "<string>", line 3, in f
NameError: global name 'y' is not defined

The issue is whether the second example should work, given that two
different dictionaries have been passed.

The cause of the NameError can be seen by looking at the bytecode: y
is bound using STORE_NAME, which stores y into the locals dictionary
(which here is *not* the same as the globals dictionary) but the
attempt to retrieve the value of y uses LOAD_GLOBAL, which only looks
in the globals.

>>> co = compile(code, 'mycode', 'exec')
>>> dis.dis(co)
  1           0 LOAD_CONST               0 (3)
              3 STORE_NAME               0 (y)

  2           6 LOAD_CONST               1 (<code object f at
0xa22b40, file "mycode", line 2>)
              9 MAKE_FUNCTION            0
             12 STORE_NAME               1 (f)

  4          15 LOAD_NAME                1 (f)
             18 CALL_FUNCTION            0
             21 POP_TOP
             22 LOAD_CONST               2 (None)
             25 RETURN_VALUE
>>> dis.dis(co.co_consts[1])  # disassembly of 'f'
  3           0 LOAD_GLOBAL              0 (y)
              3 RETURN_VALUE

This is a long way from my area of expertise (I'm commenting here
because it was me who sent Colin here in the first place), and it's
not clear to me whether this is a bug, and if it is a bug, how it
could be resolved.  What would the impact be of having the compiler
produce 'LOAD_NAME' rather than 'LOAD_GLOBAL' here?
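
For the record, the same asymmetry is reproducible in Python 3, where exec is a function; a minimal sketch:

```python
code = """\
y = 3
def f():
    return y
f()
"""

exec(code, {})            # one dict for both globals and locals: works fine
try:
    exec(code, {}, {})    # separate dicts: y lands in locals, f() looks in globals
except NameError as exc:
    print("NameError:", exc)
```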

Mark


Re: [Python-Dev] Language Summit notes

2014-04-17 Thread Mark Dickinson
On Wed, Apr 16, 2014 at 11:26 PM, Antoine Pitrou solip...@pitrou.net wrote:

 What does this mean exactly? Under OS X and Linux, Python is typically
 installed by default.


Under OS X, at least, I think there are valid reasons to not want to use
the system-supplied Python.  On my up-to-date OS X 10.9.2 machine, I see
Python 2.7.5, NumPy 1.6.2, Matplotlib 1.1.1 and Twisted 12.2.0.  For at
least Matplotlib and NumPy, those versions are pretty old (mid 2012), and
I'd be wary of updating them on the *system* Python: I have no idea what I
might or might not break.

-- 
Mark


Re: [Python-Dev] is the concept of 'reference ownership' no long applicable in Python 3.4?

2014-04-17 Thread Mark Dickinson
On Thu, Apr 17, 2014 at 4:34 PM, Jianfeng Mao j...@rocketsoftware.com wrote:

  I noticed the following changes in the C API manuals from 3.3.5 (and
 earlier versions) to 3.4. I don’t know if these changes are deliberate and
 imply that we C extension developers no longer need to care about
 ‘reference ownership’ because of some improvements in 3.4. Could anyone
 clarify it?


AFAIK there's been no deliberate change to the notion of reference
ownership.  Moreover, any such change would break existing C extensions, so
it's highly unlikely that anything's changed here, behaviour-wise.

This looks like a doc build issue: when I build the documentation locally
for the default branch, I still see the expected "Return value: New
reference." lines.  Maybe something went wrong with refcounts.dat or the
Sphinx refcounting extension when building the 3.4 documentation?  Larry:
any ideas?

Mark


Re: [Python-Dev] is the concept of 'reference ownership' no long applicable in Python 3.4?

2014-04-17 Thread Mark Dickinson
On Thu, Apr 17, 2014 at 5:33 PM, Mark Dickinson dicki...@gmail.com wrote:

 This looks like a doc build issue: when I build the documentation locally
 for the default branch, I still see the expected "Return value: New
 reference." lines.


Opened http://bugs.python.org/issue21286 for this issue.

-- 
Mark


Re: [Python-Dev] PEP 4000 to explicitly declare we won't be doing a Py3k style compatibility break again?

2014-08-18 Thread Mark Dickinson
[Moderately off-topic]

On Sun, Aug 17, 2014 at 3:39 AM, Steven D'Aprano st...@pearwood.info
wrote:

 I used to refer to Python 4000 as the hypothetical compatibility break
 version. Now I refer to Python 5000.


I personally think it should be Python 500, or Py5M.  When we come to
create the mercurial branch, that should of course, following tradition, be
called p5ym.

-- 
Mark


Re: [Python-Dev] Drastically improving list.sort() for lists of strings/ints

2016-09-11 Thread Mark Dickinson
On Sun, Sep 11, 2016 at 7:43 PM, Elliot Gorokhovsky
 wrote:
> So I suppose the thing to do is to benchmark stable radix sort against 
> timsort and see if it's still worth it.

Agreed; it would definitely be interesting to see benchmarks for the
two-array stable sort as well as the American Flag unstable sort.
(Indeed, I think it would be hard to move the proposal forward without
such benchmarks.)

Apart from the cases already mentioned by Chris, one of the situations
you'll want to include in the benchmarks is the case of a list that's
already almost sorted (e.g., an already sorted list with a few extra
unsorted elements appended). This is a case that does arise in
practice, and that Timsort performs particularly well on.
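
A sketch of how that benchmark case might be set up (sizes and names are illustrative only):

```python
import random
import timeit

def almost_sorted(n, extras):
    """An already-sorted list with a few unsorted elements appended."""
    data = list(range(n))
    data += [random.randrange(n) for _ in range(extras)]
    return data

data = almost_sorted(100_000, 10)
per_sort = timeit.timeit("sorted(data)", globals={"data": data}, number=20) / 20
print(f"almost-sorted input: {per_sort * 1e3:.2f} ms per sort")
```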

-- 
Mark


Re: [Python-Dev] Drastically improving list.sort() for lists of strings/ints

2016-09-11 Thread Mark Dickinson
> I am interested in making a non-trivial improvement to list.sort() [...]

Would your proposed new sorting algorithm be stable? The language
currently guarantees stability for `list.sort` and `sorted`.
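
For reference, the guarantee in question is that equal elements keep their original relative order — something any replacement algorithm (an LSD radix sort qualifies; an American-flag-style MSD sort does not) would have to preserve:

```python
pairs = [("b", 2), ("a", 1), ("b", 1), ("a", 2)]
pairs.sort(key=lambda kv: kv[0])   # sort on the first field only

# Ties keep their original order: ("b", 2) still precedes ("b", 1).
assert pairs == [("a", 1), ("a", 2), ("b", 2), ("b", 1)]
```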

-- 
Mark


Re: [Python-Dev] 64 bit units in PyLong

2017-07-05 Thread Mark Dickinson
On Mon, Jul 3, 2017 at 5:52 AM, Siyuan Ren  wrote:
> The current PyLong implementation represents arbitrary precision integers in
> units of 15 or 30 bits. I presume the purpose is to avoid overflow in
> addition , subtraction and multiplication. But compilers these days offer
> intrinsics that allow one to access the overflow flag, and to obtain the
> result of 64 bit multiplication as a 128 bit number. Or at least on x86-64,
> which is the dominant platform.  Any reason why it is not done?

Portability matters, so any use of these intrinsics would likely also
have to be accompanied by fallback code that doesn't depend on them,
as well as some buildsystem complexity to figure out whether those
intrinsics are supported or not. And then Objects/longobject.c
would suffer in terms of simplicity and readability, so there would
have to be some clear gains to offset that. Note that the typical
Python workload does not involve thousand-digit integers: what would
matter would be performance of smaller integers, and it seems
conceivable that 64-bit limbs would speed up those operations simply
because so many more integers would become single-limb and so there
would be more opportunities to take fast paths, but there would need
to be benchmarks demonstrating that.

Oh, and you'd have to rewrite the power algorithm, which currently
depends on the size of a limb in bits being a multiple of 5. :-)
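
The constraint being traded off can be seen with nothing more than bit counting (plain Python, purely illustrative):

```python
d15 = (1 << 15) - 1   # largest 15-bit digit
d30 = (1 << 30) - 1   # largest 30-bit digit
d64 = (1 << 64) - 1   # largest hypothetical 64-bit limb

assert (d15 * d15).bit_length() == 30    # product fits a 32-bit type
assert (d30 * d30).bit_length() == 60    # product fits a plain uint64_t
assert (d64 * d64).bit_length() == 128   # needs __uint128_t or an intrinsic
```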

-- 
Mark


Re: [Python-Dev] Informal educator feedback on PEP 572 (was Re: 2018 Python Language Summit coverage, last part)

2018-07-01 Thread Mark Dickinson
On Fri, Jun 22, 2018 at 7:28 PM, Chris Barker via Python-Dev <
python-dev@python.org> wrote:

>
> But once it becomes a more common idiom, students will see it in the wild
> pretty early in their path to learning python. So we'll need to start
> introducing it earlier than later.
>
> I think this reflects that the "smaller" a language is, the easier it is
> to learn.
>

For what it's worth, Chris's thoughts are close to my own here. I and
several of my colleagues teach week-long Python courses for Enthought. The
target audience is mostly scientists and data scientists (many of whom are
coming from MATLAB or R or IDL or Excel/VBA or some other development
environment, but some of whom are new to programming altogether), and our
curriculum is Python, NumPy, SciPy, Pandas, plus additional course-specific
bits and pieces (scikit-learn, NLTK, seaborn, statsmodels, GUI-building,
Cython, HPC, etc., etc.).

There's a constant struggle to keep the Python portion of the course large
enough to be coherent and useful, but small enough to allow time for the
other topics. To that end, we separate the Python piece of the course into
"core topics" that are essential for the later parts, and "advanced topics"
that can be covered if time allows, or if we get relevant questions. I
can't see a way that the assignment expression wouldn't have to be part of
the core topics. async stuff only appears in async code, and it's easy to
compartmentalize; in contrast, I'd expect that once the assignment
expression took hold we'd be seeing it in a lot of code, independent of the
domain.

And yes, I too see enough confusion with "is" vs == already, and don't
relish the prospect of teaching := in addition to those.

That's with my Python-teaching hat on. With my Python-developer hat on, my
thoughts are slightly different, but that's off-topic for this thread, and
I don't think I have anything to say that hasn't already been said many
times by others, so I'll keep quiet about that bit. :-)

-- 
Mark


Re: [Python-Dev] Symmetry arguments for API expansion

2018-03-12 Thread Mark Dickinson
On Mon, Mar 12, 2018 at 4:49 PM, Raymond Hettinger <
raymond.hettin...@gmail.com> wrote:

> What is the proposal?
> * Add an is_integer() method to int(), Decimal(), Fraction(), and Real().
> Modify Rational() to provide a default implementation.
>

From the issue discussion, it sounds to me as though the OP would be
content with adding is_integer to int and Fraction (leaving the decimal
module and the numeric tower alone).


> Starting point: Do we need this?
> * We already have a simple, traditional, portable, and readable way to
> make the test:  int(x) == x
>

As already pointed out in the issue discussion, this solution isn't
particularly portable (it'll fail for infinities and nans), and can be
horribly inefficient in the case of a Decimal input with large exponent:

In [1]: import decimal
In [2]: x = decimal.Decimal('1e9')
In [3]: %timeit x == int(x)
1.42 s ± 6.27 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
In [4]: %timeit x == x.to_integral_value()
230 ns ± 2.03 ns per loop (mean ± std. dev. of 7 runs, 100 loops each)
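
The portability half of that claim is easy to demonstrate (the Decimal exponent below is just an illustrative large value, not the one from the session above):

```python
from decimal import Decimal

# int(x) == x raises outright for non-finite floats ...
for x in (float("inf"), float("nan")):
    try:
        int(x) == x
        assert False, "not reached"
    except (OverflowError, ValueError):
        pass

# ... while to_integral_value() tests integrality of a Decimal without
# materializing an enormous integer:
d = Decimal("1e100000")
assert d == d.to_integral_value()
```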

> * In the context of ints, the test x.is_integer() always returns True.
> This isn't very useful.
>

It's useful in the context of duck typing, which I believe is a large part
of the OP's point. For a value x that's known to be *either* float or int
(which is not an uncommon situation), it makes x.is_integer() valid without
needing to know the specific type of x.

> * It conflicts with a design goal for the decimal module to not invent new
> functionality beyond the spec unless essential for integration with the
> rest of the language.  The reasons included portability with other
> implementations and not trying to guess what the committee would have
> decided in the face of tricky questions such as whether
> Decimal('1.01').is_integer()
> should return True when the context precision is only three decimal places
> (i.e. whether context precision and rounding traps should be applied before
> the test and whether context flags should change after the test).
>

I don't believe there's any ambiguity here. The correct behaviour looks
clear: the context isn't used, no flags are touched, and the method returns
True if and only if the value is finite and an exact integer. This is
analogous to the existing is-sNaN, is-signed, is-finite, is-zero,
is-infinite tests, none of which are affected by (or affect) context.

-- 
Mark


Re: [Python-Dev] Symmetry arguments for API expansion

2018-03-13 Thread Mark Dickinson
On Mon, Mar 12, 2018 at 9:18 PM, Tim Peters  wrote:

> [Guido]
> >  as_integer_ratio() seems mostly cute (it has Tim Peters all
> > over it),
>
> Nope!  I had nothing to do with it.  I would have been -0.5 on adding
> it had I been aware at the time.
>

Looks like it snuck into the float type as part of the fractions.Fraction
work in https://bugs.python.org/issue1682 . I couldn't find much related
discussion; I suspect that the move was primarily for optimization (see
https://github.com/python/cpython/commit/3ea7b41b5805c60a05e697211d0bfc14a62a19fb).
Decimal.as_integer_ratio was added here: https://bugs.python.org/issue25928 .

I do have significant uses of `float.as_integer_ratio` in my own code, and
wouldn't enjoy seeing it being deprecated/ripped out, though I guess I'd
cope.

Some on this thread have suggested that things like is_integer and
as_integer_ratio should be math module functions. Any suggestions for how
that might be made to work? Would we special-case the types we know about,
and handle only those (so the math module would end up having to know about
the fractions and decimal modules)? Or add a new magic method (e.g.,
__as_integer_ratio__) for each case we want to handle, like we do for
math.__floor__, math.__trunc__ and math.__ceil__? Or use some form of
single dispatch, so that custom types can register their own handlers? The
majority of current math module functions simply convert their arguments to
a float, so a naive implementation of math.is_integer in the same style
wouldn't work: it would give incorrect results for a non-integral Decimal
instance that ended up getting rounded to an integral value by the float
conversion.
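
To make the single-dispatch variant concrete, here's one possible shape for it (entirely hypothetical; `is_integer` below is a free-standing sketch, not a proposed math-module API):

```python
import math
from decimal import Decimal
from fractions import Fraction
from functools import singledispatch

@singledispatch
def is_integer(x):
    raise TypeError(f"can't test integrality of {type(x).__name__}")

@is_integer.register
def _(x: int) -> bool:
    return True

@is_integer.register
def _(x: float) -> bool:
    return math.isfinite(x) and x == math.floor(x)

@is_integer.register
def _(x: Fraction) -> bool:
    return x.denominator == 1

@is_integer.register
def _(x: Decimal) -> bool:
    return x.is_finite() and x == x.to_integral_value()

assert is_integer(10) and is_integer(Fraction(4, 2))
assert not is_integer(Decimal("2.5")) and not is_integer(float("nan"))
```

Custom numeric types could then register their own handlers without the math module having to know about them in advance.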

Mark


Re: [Python-Dev] Deprecating float.is_integer()

2018-03-21 Thread Mark Dickinson
I'd prefer to see `float.is_integer` stay. There _are_ occasions when one
wants to check that a floating-point number is integral, and on those
occasions, using `x.is_integer()` is the one obvious way to do it. I don't
think the fact that it can be misused should be grounds for deprecation.

As far as real uses: I didn't find uses of `is_integer` in our code base
here at Enthought, but I did find plenty of places where it _could_
reasonably have been used, and where something less readable like `x % 1 ==
0` was being used instead. For evidence that it's generally useful: it's
already been noted that the decimal module uses it internally. The mpmath
package defines its own "isint" function and uses it in several places: see
https://github.com/fredrik-johansson/mpmath/blob/2858b1000ffdd8596defb50381dcb83de2b6/mpmath/ctx_mp_python.py#L764.
MPFR also has an mpfr_integer_p predicate:
http://www.mpfr.org/mpfr-current/mpfr.html#index-mpfr_005finteger_005fp.

A concrete use-case: suppose you wanted to implement the beta function (
https://en.wikipedia.org/wiki/Beta_function) for real arguments in Python.
You'll likely need special handling for the poles, which occur only for
some negative integer arguments, so you'll need an is_integer test for
those. For small positive integer arguments, you may well want the accuracy
advantage that arises from computing the beta function in terms of
factorials (giving a correctly-rounded result) instead of via the log of
the gamma function. So again, you'll want an is_integer test to identify
those cases. (Oddly enough, I found myself looking at this recently as a
result of the thread about quartile definitions: there are links between
the beta function, the beta distribution, and order statistics, and the
(k-1/3)/(n+1/3) expression used in the recommended quartile definition
comes from an approximation to the median of a beta distribution with
integral parameters.)
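
A toy version of that beta function makes both uses explicit (a deliberately simplified sketch: it ignores sign handling for negative non-integer arguments, and is not SciPy's algorithm):

```python
import math

def beta(x, y):
    # Pole detection: the beta function blows up at non-positive integers.
    if (x <= 0 and float(x).is_integer()) or (y <= 0 and float(y).is_integer()):
        raise ZeroDivisionError("pole of the beta function")
    if float(x).is_integer() and float(y).is_integer():
        # Small positive integers: the factorial route gives a
        # correctly-rounded result.
        a, b = int(x), int(y)
        return (math.factorial(a - 1) * math.factorial(b - 1)
                / math.factorial(a + b - 1))
    # General case via the log of the gamma function.
    return math.exp(math.lgamma(x) + math.lgamma(y) - math.lgamma(x + y))

assert beta(2, 3) == 1 / 12
```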

Or, you could look at the SciPy implementation of the beta function, which
does indeed do the C equivalent of is_integer in many places:
https://github.com/scipy/scipy/blob/11509c4a98edded6c59423ac44ca1b7f28fba1fd/scipy/special/cephes/beta.c#L67

In sum: it's an occasionally useful operation; there's no other obvious,
readable spelling of the operation that does the right thing in all cases,
and it's _already_ in Python! In general, I'd think that deprecation of an
existing construct should not be done lightly, and should only be done when
there's an obvious and significant benefit to that deprecation. I don't see
that benefit here.

-- 
Mark


Re: [Python-Dev] Deprecating float.is_integer()

2018-03-21 Thread Mark Dickinson
On Wed, Mar 21, 2018 at 8:49 PM, David Mertz  wrote:

> For example, this can be true (even without reaching inf):
>
> >>> x.is_integer()
> True
> >>> (math.sqrt(x**2)).is_integer()
> False
>

If you have a moment to share it, I'd be interested to know what value of
`x` you used to achieve this, and what system you were on. This can't
happen under IEEE 754 arithmetic.
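
For the record, a quick empirical check of that claim on an IEEE 754 machine (which covers CPython on all mainstream platforms): because sqrt is required to be correctly rounded, the roundtrip recovers x exactly even when x**2 itself had to be rounded.

```python
import math
import random

for _ in range(10_000):
    x = float(random.randrange(1, 2**52))   # an exactly-representable integer
    assert x.is_integer()
    assert math.sqrt(x**2) == x             # so sqrt(x**2).is_integer() holds
```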

-- 
Mark


[Python-Dev] Re: What is __int__ still useful for?

2021-10-15 Thread Mark Dickinson
I'd propose that we relegate `__trunc__` to the same status as `__floor__`
and `__ceil__`: that is, have `__trunc__` limited to being support for
`math.trunc`, and nothing more. Right now the `int` constructor potentially
looks at all three of `__int__`, `__index__` and `__trunc__`, so the
proposal would be to remove that special role of `__trunc__` and reduce the
`int` constructor to only looking at `__int__` and `__index__`.

Obviously that's a backwards incompatible change, but a fairly mild one,
with an obvious place to insert a `DeprecationWarning` and a clear
transition path for affected code: code that relies on `int` being able to
use `__trunc__` would need to add a separate implementation of `__int__`.
(We made this change recently for the `Fraction` type in
https://bugs.python.org/issue44547.)
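
Under the proposal, the division of labour would look like this (a sketch; the class is invented for illustration):

```python
import math

class Celsius:
    def __init__(self, degrees):
        self.degrees = degrees

    def __int__(self):            # used by int()
        return int(self.degrees)

    def __trunc__(self):          # used by math.trunc(), and nothing else
        return math.trunc(self.degrees)

c = Celsius(21.9)
assert int(c) == 21          # via __int__
assert math.trunc(c) == 21   # via __trunc__
```

On current Pythons this already behaves as shown whenever `__int__` is defined; the proposal only changes what happens for classes that define `__trunc__` *without* `__int__`.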

I opened an issue for this proposal a few weeks back:
https://bugs.python.org/issue44977

Mark




On Thu, Oct 14, 2021 at 11:50 AM Serhiy Storchaka 
wrote:

> 14.10.21 12:24, Eryk Sun пише:
> > Maybe an alternate constructor could be added -- such as
> > int.from_number() -- which would be restricted to calling __int__(),
> > __index__(), and __trunc__().
>
> See thread "More alternate constructors for builtin type" on Python-ideas:
>
> https://mail.python.org/archives/list/python-id...@python.org/thread/5JKQMIC6EUVCD7IBWMRHY7DRTTNSBOWG/
>
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/WX6246JW43A25MJJ6YRBLTN3GCQQQXZF/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: What is __int__ still useful for?

2021-10-15 Thread Mark Dickinson
Meta: apologies for failing to trim the context in the previous post.

-- 
Mark


[Python-Dev] Re: Is anyone using 15-bit PyLong digits (PYLONG_BITS_IN_DIGIT=15)?

2021-12-31 Thread Mark Dickinson
On Fri, Dec 31, 2021 at 12:40 PM Skip Montanaro 
wrote:

> Perhaps I missed it, but maybe an action item would be to add a
> buildbot which configures for 15-bit PyLong digits.
>

Yep, good point. I was wrong to say that  "15-bit builds don't appear to be
exercised by the buildbots": there's a 32-bit Gentoo buildbot that's
(implicitly) using 15-bit digits, and the GitHub Actions Windows/x86 build
also uses 15-bit digits. I don't think we have anything that's explicitly
using the `--enable-big-digits` option, though.

-- 
Mark


[Python-Dev] Re: Is anyone using 15-bit PyLong digits (PYLONG_BITS_IN_DIGIT=15)?

2021-12-31 Thread Mark Dickinson
Thanks all! So to summarize:

- 15-bit digits are still very much in use, and deprecating the option
would likely be premature at this point
- the main users are old 32-bit (x86), which it's difficult to care about
too much, and new 32-bit (principally ARM microarchitectures), which we
*do* care about

So my first suspicion is just downright wrong. In particular, the
decade-old logic that chooses 15-bit digits whenever SIZEOF_VOID_P < 8 is
still in place (albeit with a recent modification for WebAssembly).

For the second suspicion, that "There are few machines where using 15-bit
digits is faster than using 30-bit digits.", we need more data.

It looks as though the next step would be to run some integer-intensive
benchmarks on 32-bit ARM, with both --enable-big-digits=15 and
--enable-big-digits=30. If those show a win (or at least, not a significant
loss) for 30-bit digits, then there's a case for at least making 30-bit
digits the default, which would be a first step towards eventually
dropping 15-bit digit support entirely.

GPS: I'm not immediately seeing the ABI issue. If you're able to dig up
more information on that, I'd be interested to see it.

Mark


On Fri, Dec 31, 2021 at 3:33 AM Tim Peters  wrote:

> >> The reason for digits being a multiple of 5 bits should be revisited vs
> >> its original intent
>
> > I added that. The only intent was to make it easier to implement
> > bigint exponentiation easily ...
>
> That said, I see the comments in longintrepr.h note a stronger constraint:
>
> """
> the marshal code currently expects that PyLong_SHIFT is a multiple of 15
> """
>
> But that's doubtless also shallow.
>


[Python-Dev] Is anyone using 15-bit PyLong digits (PYLONG_BITS_IN_DIGIT=15)?

2021-12-30 Thread Mark Dickinson
tl;dr: I'd like to deprecate and eventually remove the option to use 15-bit 
digits in the PyLong implementation. Before doing so, I'd like to find out 
whether there's anyone still using 15-bit PyLong digits, and if so, why they're 
doing so.

History: the use of 30-bit digits in PyLong was introduced for Python 3.1 and 
Python 2.7, to improve performance of int (Python 3) / long (Python 2) 
arithmetic. At that time, we retained the option to use 15-bit digits, for two 
reasons:

- (1) use of 30-bit digits required C99 features (uint64_t and friends) at a 
time when we hadn't yet committed to requiring C99
- (2) it wasn't clear whether 30-bit digits would be a performance win on 
32-bit operating systems

Twelve years later, reason (1) no longer applies, and I suspect that:

- No-one is deliberately using the 15-bit digit option.
- There are few machines where using 15-bit digits is faster than using 30-bit 
digits.

But I don't have solid data on either of these suspicions, hence this post.

Removing the 15-bit digit option would simplify the code (there's significant 
mental effort required to ensure we don't break things for 15-bit builds when 
modifying Objects/longobject.c, and 15-bit builds don't appear to be exercised 
by the buildbots), remove a hidden compatibility trap (see b.p.o. issue 35037), 
widen the applicability of the various fast paths for arithmetic operations, 
and allow for some minor fast-path small-integer optimisations based on the 
fact that we'd be able to assume the presence of *two* extra bits in the C 
integer type rather than just one. As an example of the latter: if `a` and `b` 
are PyLongs that fit in a single digit, then with 15-bit digits and a 16-bit 
`digit` and `sdigit` type, `a + b` can't currently safely (i.e., without 
undefined behaviour from overflow) be computed with the C type `sdigit`. With 
30-bit digits and a 32-bit `digit` and `sdigit` type, `a + b` is safe.
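
The single-digit headroom argument, spelled out in Python terms (widths as in the C `sdigit` typedefs described above):

```python
def fits_signed(value, bits):
    """True if `value` is representable in a signed two's-complement type."""
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))

max15 = (1 << 15) - 1   # largest single-digit value with 15-bit digits
max30 = (1 << 30) - 1   # largest single-digit value with 30-bit digits

assert not fits_signed(max15 + max15, 16)   # 16-bit sdigit: a + b can overflow
assert fits_signed(max30 + max30, 32)       # 32-bit sdigit: a + b is always safe
assert fits_signed(-max30 - max30, 32)      # ... and so is a - b
```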

Mark


*References*

Related b.p.o. issue: https://bugs.python.org/issue45569
MinGW compatibility issue: https://bugs.python.org/issue35037
Introduction of 30-bit digits: https://bugs.python.org/issue4258


[Python-Dev] Re: Is anyone using 15-bit PyLong digits (PYLONG_BITS_IN_DIGIT=15)?

2022-01-02 Thread Mark Dickinson
On Sat, Jan 1, 2022 at 9:05 PM Antoine Pitrou  wrote:

> Note that ARM is merely an architecture with very diverse
> implementations having quite differing performance characteristics.  [...]
>

Understood. I'd be happy to see timings on a Raspberry Pi 3, say. I'm not
too worried about things like the RPi Pico - that seems like it would be
more of a target for MicroPython than CPython.

Wikipedia thinks, and the ARM architecture manuals seem to confirm, that
most 32-bit ARM instruction sets _do_ support the UMULL
32-bit-by-32-bit-to-64-bit multiply instruction. (From
https://en.wikipedia.org/wiki/ARM_architecture#Arithmetic_instructions:
"ARM supports 32-bit × 32-bit multiplies with either a 32-bit result or
64-bit result, though Cortex-M0 / M0+ / M1 cores don't support 64-bit
results.") Division may still be problematic.

-- 
Mark


[Python-Dev] Re: Is anyone using 15-bit PyLong digits (PYLONG_BITS_IN_DIGIT=15)?

2022-01-16 Thread Mark Dickinson
On Sat, Jan 15, 2022 at 8:12 PM Tim Peters  wrote:

> Something is missing here, but can't guess what without seeing the
> generated machine code. But I trust Mark will do that.
>

Welp, there goes my weekend. :-)

> $ python -m timeit -n 150 -s "x = 10**1000" "x//10"
> 150 loops, best of 5: 376 nsec per loop
>
> Which actually makes little sense to me. [...] Under 4 nsec per iteration
> seems close to impossibly fast on a 3.8GHz box, given the presence of any
> division instruction.
>
> However, dividing by 10 is not a worst case on this box. Dividing by
> 100 is over 3x slower:
>
> $ python -m timeit -n 150 -s "x = 10**1000" "x//100"
> 150 loops, best of 5: 1.25 usec per loop

Now *that* I certainly wasn't expecting. I don't see the same effect on
macOS / Clang, whether compiling with --enable-optimizations or not; this
appears to be a GCC innovation. And indeed, as Tim suggested, it turns out
that there's no division instruction present in the loop for the
division-by-10 case - we're doing division via multiplication by the
reciprocal. In Python terms, we're computing `x // 10` as
`(x * 0xcccccccccccccccd) >> 67`. Here's the tell-tale snippet of the
assembly output from the second compilation (the one that makes use of the
generated profile information) of longobject.c at commit
09087b8519316608b85131ee7455b664c00c38d2 on a Linux box, with GCC 11.2.0.
I added a couple of comments, but it's otherwise unaltered:

.loc 1 1632 36 view .LVU12309
movl %r13d, %r11d
salq $2, %rbp
cmpl $10, %r13d # compare divisor 'n' with 10, and
jne .L2797  # go to the slow version if n != 10
leaq 1(%r10), %r9 # from here on, the divisor is 10
addq %rbp, %r8
.LVL3442:
.loc 1 1632 36 view .LVU12310
addq %rbp, %rdi
.LVL3443:
.loc 1 1632 36 view .LVU12311
.LBE8049:
.loc 1 1624 15 view .LVU12312
xorl %r13d, %r13d
.LVL3444:
.loc 1 1624 15 view .LVU12313
movabsq $-3689348814741910323, %r11 # magic constant 0xcccccccccccccccd for division by 10

and then a few lines later:

.loc 1 1630 9 is_stmt 1 view .LVU12316
.loc 1 1631 9 view .LVU12317
.loc 1 1631 39 is_stmt 0 view .LVU12318
movl (%r8,%r10,4), %r14d # move top digit of the dividend into the low word of r14
.LVL3446:
.loc 1 1632 9 is_stmt 1 view .LVU12319
movq %r14, %rax # set up for division: top digit is now in rax
.loc 1 1633 13 is_stmt 0 view .LVU12320
movq %r14, %r13
mulq %r11 # here's the division by 10: multiply by the magic constant
shrq $3, %rdx # and divide by 8 (via a shift)

and then it all gets a bit repetitive and boring - there's a lot of loop
unrolling going on.

So gcc is anticipating divisions by 10 and introducing special-case
divide-by-reciprocal-multiply code for that case, and presumably the
profile generated for the PGO backs up this being a common enough case, so
we end up with the above code in the final compilation.
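As a sanity check on that reading of the assembly, the identity is easy to verify from Python (a quick sketch using the same magic constant and shift as the generated code):

```python
# GCC's divide-by-10: x // 10 == (x * magic) >> 67 for any 64-bit x,
# where magic = ceil(2**67 / 10) = 0xcccccccccccccccd.
magic = 0xCCCCCCCCCCCCCCCD
assert magic == (2**67 + 9) // 10
# As a signed 64-bit value this is the movabsq immediate above:
assert magic - 2**64 == -3689348814741910323

for x in (0, 1, 9, 10, 99, 123456789, 2**64 - 1):
    assert (x * magic) >> 67 == x // 10
```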

TIL ...

-- 
Mark
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/VDII5EBMXLNO4U3BSSNWAW2ETLNG6YUN/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 682: Format Specifier for Signed Zero

2022-03-06 Thread Mark Dickinson
PEP 682 (Format Specifier for Signed Zero) has been accepted! Please see
https://discuss.python.org/t/accepting-pep-682-format-specifier-for-signed-zero/14088

Thanks to all involved,

Mark
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/2ZYYMWOT2HQ4Q3PT6RNRC5F3DI2VEGTO/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Should we require IEEE 754 floating-point for CPython?

2022-02-07 Thread Mark Dickinson
On Mon, Feb 7, 2022 at 5:11 PM Victor Stinner  wrote:

> I made a change to require C99  "NAN" constant [...]


There's a separate discussion topic lurking here. It's equally in need of
discussion here (IMO), but it's orthogonal to the "should we require C99"
discussion. I've changed the subject line accordingly to try to avoid
derailing that discussion.

Unlike the other things Victor mentions ("copysign", "round", etc.), the
NAN macro is not required to be present by C99. Instead, the standard says
that "NAN is defined if and only if the implementation supports quiet NaNs
for the float type" (C99 §7.12p5).

Victor is proposing in GH-31160 to require the presence of the NAN macro in
order for CPython to build, which under C99 is equivalent
to requiring that the C float type supports quiet NaNs. That's not the same
as requiring IEEE 754 floating-point, but it's not far off - there aren't
many non-IEEE 754 floating-point formats that support NaNs. (Historically,
there are essentially none, but it seems quite likely that there will be at
least some non-IEEE 754 formats in the future that support NaNs; Google's
bfloat16 format is one example.)
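For reference, the quiet-NaN behaviour in question is observable from Python itself (a minimal illustration, assuming a build where NaNs are available):

```python
import math

nan = float("nan")

# Quiet NaNs propagate silently through arithmetic rather than trapping,
# and compare unequal to everything - including themselves.
assert math.isnan(nan + 1.0)
assert nan != nan
assert not (nan < 1.0) and not (nan > 1.0)
```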

So there are (at least) three questions here:

- Should we require the presence of NaNs in order for CPython to build?
- Should we require IEEE 754 floating-point for CPython-the-implementation?
- Should we require IEEE 754 floating-point for Python-the-language?

For the first two, I'd much prefer either to not require NaNs, or to go the
whole way and require IEEE 754 for CPython. Requiring NaNs but not IEEE 754
feels like an awkward halfway house: in practice, it would be just as
restrictive as requiring IEEE 754, but without the benefits of making that
requirement explicit (e.g., being able to get rid of non-IEEE 754 paths in
existing code, and being able to tell users that they can reasonably expect
IEEE 754-conformant behaviour).
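For what it's worth, here's one quick way to check from Python whether a given build's float looks like an IEEE 754 binary64 double (a sketch; sys.float_info describes the underlying C double):

```python
import struct
import sys

# IEEE 754 binary64: 53-bit significand, exponent range [-1021, 1024],
# stored in 8 bytes.
is_ieee754_double = (
    sys.float_info.mant_dig == 53
    and sys.float_info.max_exp == 1024
    and sys.float_info.min_exp == -1021
    and struct.calcsize("d") == 8
)
print(is_ieee754_double)
```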

Note that on the current main branch there's a Py_NO_NAN macro that
builders can define to indicate that NaNs aren't supported, but the Python
build is currently broken if Py_NO_NAN is defined (see
https://bugs.python.org/issue46656). If the answer to the first question is
"No", then we need to fix the build under Py_NO_NAN. That's not a big deal
- perhaps a couple of hours of work.

-- 
Mark
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/GUIR2HZHFV2TDS7GUQHAHFSA4IC3QLMZ/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Is anyone using 15-bit PyLong digits (PYLONG_BITS_IN_DIGIT=15)?

2022-01-14 Thread Mark Dickinson
On Sun, Jan 2, 2022 at 10:35 AM Mark Dickinson  wrote:

> Division may still be problematic.
>

On that note: Python divisions are somewhat crippled even on x64. Assuming
30-bit digits, the basic building block that's needed for multi-precision
division is a 64-bit-by-32-bit unsigned integer division, emitting a 32-bit
quotient (and ideally also a 32-bit remainder). And there's an x86/x64
instruction that does exactly that, namely DIVL. But without using inline
assembly, current versions of GCC and Clang apparently can't be persuaded
to emit that instruction from the longobject.c source - they'll use DIVQ (a
128-bit-by-64-bit division, albeit with the top 64 bits of the dividend set
to zero) on x64, and the __udivdi3 intrinsic on x86.

I was curious to find out what the potential impact of the failure to use
DIVL was, so I ran some timings. A worst-case target is division of a large
(multi-digit) integer by a single-digit integer (where "digit" means digit
in the sense of PyLong digit, not decimal digit), since that involves
multiple CPU division instructions in a fairly tight loop.

Results: on my laptop (2.7 GHz Intel Core i7-8559U, macOS 10.14.6,
non-optimised non-debug Python build), a single division of 10**1000 by 10
takes ~1018ns on the current main branch and ~722ns when forced to use the
DIVL instruction (by inserting inline assembly into the inplace_divrem1
function). IOW, forcing use of DIVL instead of DIVQ, in combination
with getting the remainder directly from the DIV instruction instead of
computing it separately, gives a 41% speedup in this particular worst case.
I'd expect the effect to be even more marked on x86, but haven't yet done
those timings.

For anyone who wants to play along, here's the implementation of the
inplace_divrem1 (in longobject.c) that I was using:

static digit
inplace_divrem1(digit *pout, digit *pin, Py_ssize_t size, digit n)
{
    digit remainder = 0;

    assert(n > 0 && n <= PyLong_MASK);
    while (--size >= 0) {
        twodigits dividend = ((twodigits)remainder << PyLong_SHIFT) | pin[size];
        digit quotient, high, low;
        high = (digit)(dividend >> 32);
        low = (digit)dividend;
        __asm__("divl %2\n"
                : "=a" (quotient), "=d" (remainder)
                : "r" (n), "a" (low), "d" (high)
                );
        pout[size] = quotient;
    }
    return remainder;
}


I don't know whether we *really* want to open the door to using inline
assembly for performance reasons in longobject.c, but it's interesting to
see the effect.
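For anyone following along without a C toolchain, here's a Python model of the same single-digit division loop, with divmod standing in for the DIVL instruction (a sketch for illustration, not CPython code):

```python
PyLong_SHIFT = 30
PyLong_MASK = (1 << PyLong_SHIFT) - 1

def divrem1(digits, n):
    """Divide a little-endian array of 30-bit digits by a single digit n,
    returning (quotient digits, remainder). Each iteration divides a
    60-bit dividend by n, just as inplace_divrem1 does."""
    assert 0 < n <= PyLong_MASK
    remainder = 0
    out = [0] * len(digits)
    for i in reversed(range(len(digits))):
        dividend = (remainder << PyLong_SHIFT) | digits[i]
        out[i], remainder = divmod(dividend, n)  # the DIVL step
    return out, remainder

# 10**10 spans two 30-bit digits: 10**10 == (9 << 30) + 336323584
q, r = divrem1([336323584, 9], 7)
assert (q[1] << 30) + q[0] == 10**10 // 7
assert r == 10**10 % 7
```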

-- 
Mark
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/ZWGPO3TMCI7WNLC3EMS26DIKI5D3ZWMK/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Is anyone using 15-bit PyLong digits (PYLONG_BITS_IN_DIGIT=15)?

2022-01-16 Thread Mark Dickinson
On Sun, Jan 16, 2022 at 4:11 PM Terry Reedy  wrote:

>
>
> https://stackoverflow.com/questions/41183935/why-does-gcc-use-multiplication-by-a-strange-number-in-implementing-integer-divi
>
> and
>
>
> https://stackoverflow.com/questions/30790184/perform-integer-division-using-multiplication
>
> have multiple discussions of the technique for machine division by
> invariant (small) ints and GCC's use thereof (only suppressed with -Os?).
>

Yes, it's an old and well-known technique, and compilers have been using it
for division by a known-at-compile-time constant for many decades. What's
surprising here is the use by GCC in a situation where the divisor is
*not* known
at compile time - that GCC essentially guesses that a divisor of 10 is
common enough to justify special-casing.

There's also the libdivide library[1], which caters to situations where you
have a divisor not known at compile time but you know you're going to be
using it often enough to compensate for the cost of computing the magic
multiplier dynamically at run time.

[1] https://libdivide.com
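The core idea is simple enough to sketch in a few lines of Python (the round-up variant; note that in C the computed multiplier may not fit in a machine word for some divisors, which is where libdivide's extra machinery comes in):

```python
def magic_for(d, bits=64):
    """Multiplier m and shift s such that x // d == (x * m) >> s for all
    0 <= x < 2**bits. Round-up scheme: m may itself exceed `bits` bits."""
    assert d > 0
    s = bits + (d - 1).bit_length()   # bits + ceil(log2(d))
    m = (2**s + d - 1) // d           # ceil(2**s / d)
    # Correctness condition: the rounding error e = m*d - 2**s must
    # satisfy e * (2**bits - 1) < 2**s.
    assert (m * d - 2**s) * (2**bits - 1) < 2**s
    return m, s

for d in (3, 7, 10, 17, 1000):
    m, s = magic_for(d)
    for x in (0, 1, d - 1, d, 123456789, 2**64 - 1):
        assert (x * m) >> s == x // d
```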

-- 
Mark
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/PPF6TOGH6QJXGKYTYVVAQC4D3D3HT7R4/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Is anyone using 15-bit PyLong digits (PYLONG_BITS_IN_DIGIT=15)?

2022-01-16 Thread Mark Dickinson
On Sun, Jan 16, 2022 at 12:08 PM Mark Dickinson  wrote:

> So gcc is anticipating divisions by 10 and introducing special-case
> divide-by-reciprocal-multiply code for that case, and presumably the
> profile generated for the PGO backs up this being a common enough case, so
> we end up with the above code in the final compilation.
>

Nope, that's not what's happening. This analysis is backwards, and unfairly
attributes to GCC the apparently arbitrary choice to optimise division by
10. But it's not GCC's fault; it's ours. What's *actually* happening is
that GCC is simply recording values for n used in calls to divrem1 (via the
-fprofile-values option, which is implied by -fprofile-generate, which is
used as a result of the --enable-optimizations configure script option).
It's then noticing that in our profile task (which consists of a selection
of Lib/test/test_*.py test files) we most often do divisions by 10, and so
it optimizes that case.

To test this hypothesis I added a large number of tests for division by 17
in test_long.py, and then recompiled from scratch (again with
--enable-optimizations). Here are the results:

root@341b5fd44b23:/home/cpython# ./python -m timeit -n 100 -s
"x=10**1000; y=10" "x//y"

100 loops, best of 5: 1.14 usec per loop

root@341b5fd44b23:/home/cpython# ./python -m timeit -n 100 -s
"x=10**1000; y=17" "x//y"

100 loops, best of 5: 306 nsec per loop

root@341b5fd44b23:/home/cpython# ./python -m timeit -n 100 -s
"x=10**1000; y=1" "x//y"

100 loops, best of 5: 1.14 usec per loop

root@341b5fd44b23:/home/cpython# ./python -m timeit -n 100 -s
"x=10**1000; y=2" "x//y"

100 loops, best of 5: 1.15 usec per loop

As expected, division by 17 is now optimised; division by 10 is as slow as
division by other small scalars.

-- 
Mark
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/2MOQCVMEQBV7PATT47GUYHS42QIJHTRK/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Is anyone using 15-bit PyLong digits (PYLONG_BITS_IN_DIGIT=15)?

2022-01-16 Thread Mark Dickinson
On Sun, Jan 16, 2022 at 9:28 PM Guido van Rossum  wrote:

> Does the optimization for //10 actually help in the real world? [...]
>

Yep, I don't know. If 10 is *not* the most common small divisor in real
world code, it must at least rank in the top five. I might hazard a guess
that division by 2 would be more common, but I've no idea how one would go
about establishing that.

The reason that the divisor of 10 is turning up from the PGO isn't a
particularly convincing one - it looks as though it's a result of our
testing the builtin int-to-decimal-string conversion by comparing with an
obviously-correct repeated-division-by-10 algorithm.
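That reference algorithm amounts to something like the following (my sketch of the idea, not the actual test-suite code):

```python
def to_decimal(n):
    """Obviously-correct int-to-decimal-string conversion by repeated
    division by 10 - each loop iteration is one division by 10 that the
    profile run would record."""
    if n < 0:
        return "-" + to_decimal(-n)
    digits = []
    while True:
        n, r = divmod(n, 10)
        digits.append("0123456789"[r])
        if n == 0:
            break
    return "".join(reversed(digits))

assert to_decimal(0) == "0"
assert to_decimal(10**20 + 5) == str(10**20 + 5)
```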

Then again I'm not sure what's *lost* even if this optimization is
> pointless -- surely it doesn't slow other divisions down enough to be
> measurable.
>

Agreed. That at least is testable. I can run some timings (but not tonight).

-- 
Mark
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/OMZI5SQ2SQ7SYN4PCDKIXQQIKGXVJTO5/
Code of Conduct: http://python.org/psf/codeofconduct/

