Re: for / while else doesn't make sense

2016-05-23 Thread Mark Dickinson
Ben Bacarisse  bsb.me.uk> writes:
> [1] Not being a Python expert I don't know how you show that actual
> value of a float.  What is the Pythonic way to do that?

I don't know about Pythonic, but here are some options.

1. Convert the float to Decimal, and print the result. This shows
   the exact binary value that's stored, but displays it in decimal.
   Be aware that the result will be hundreds of digits long for
   very large or very small floats.

   >>> print(Decimal(pi))
   3.141592653589793115997963468544185161590576171875

2. If you're happy with a hexadecimal representation, use the
   float.hex method. Again, this shows the exact value stored.

   >>> print(pi.hex())
   0x1.921fb54442d18p+1

3. To get an equivalent fraction, convert to the fractions.Fraction
   type or use the as_integer_ratio method.

   >>> from fractions import Fraction
   >>> print(Fraction(pi))
   884279719003555/281474976710656
   >>> print(pi.as_integer_ratio())
   (884279719003555, 281474976710656)
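
For completeness, here's the whole thing as a self-contained snippet (the
imports are the only pieces not shown above):

from decimal import Decimal
from fractions import Fraction
from math import pi

print(Decimal(pi))            # exact stored value, shown in decimal
print(pi.hex())               # exact stored value, shown in hex
print(Fraction(pi))           # exact stored value, as a fraction
print(pi.as_integer_ratio())  # same thing, as a (numerator, denominator) tuple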

-- 
Mark (who should probably take a numerical methods class someday)

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: GCD in Fractions

2014-09-24 Thread Mark Dickinson
Mark Lawrence  yahoo.co.uk> writes:
> Somebody got there first http://bugs.python.org/issue22477

I think there's good reason to suspect that Brian Gladman and
blindanagram are one and the same. :-)

>>> sorted("BrianGladman".lower()) == sorted("blindanagram")
True


 




-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Bug in Decimal??

2014-05-19 Thread Mark Dickinson
 isp.com> writes:

> I've tested on all platforms I know of and confirmed it. The wrong digit
> occurs in the middle of the number.  Propagation error would have a bad digit
> near the end, and garbage after that. Here there's a perfect sequence of
> numbers, but with one single digit changed in the middle of the number.  No
> error propagation in a series expansion can do that.

I can see how it might be surprising if you don't think about it too hard, but
I'm afraid that you're wrong here: error propagation is *exactly* what's
causing the effects you're seeing.

Here's another way of looking at it: if you truncate the Taylor series about 0
for (1 + x) / (1 - x) to k (>= 1) terms, you get the polynomial (1 + x - 2x^k)
/ (1 - x).  For example, taking k to be 3, we're getting (1 + x - 2x^3) / (1 -
x).  Given that the particular value of x you're testing with is a small
power of 10, rounding your intermediate result to the working precision has
exactly the effect of truncating the series at some k.
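
If you want to see that concretely, here's a small experiment (the working
precision of 16 digits and x = 1e-5 are illustrative choices of mine, not the
OP's values):

from decimal import Decimal, getcontext
from fractions import Fraction

getcontext().prec = 16
x = Decimal('1e-5')
partial = (1 + x) / (1 - x)        # rounded to 16 significant digits

# At this precision the rounding keeps the series only up to the 2*x**3
# term, i.e. it gives exactly (1 + x - 2*x**4) / (1 - x).
xf = Fraction(1, 10**5)
truncated = (1 + xf - 2 * xf**4) / (1 - xf)
print(partial)                              # 1.000020000200002
print(Fraction(str(partial)) == truncated)  # True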

Now you can compute and compare (by hand, via Wolfram alpha, or however you
like) the Taylor series expansions for log((1 + x) / (1 - x)) and log((1 + x -
2x^3) / (1 - x)).  For the first you'll see:

  2x + 2/3 x^3  + 2/5 x^5 - 2/7 x^7 + 2/9 x^9 - ...

and for the second you'll get:

  2x - 4/3 x^3 + 2 x^4 - 8/5 x^5 + 16/7 x^7 - ...

The difference between the two series is:

  -2x^3 + 2x^4 - 2x^5 + 2x^7 - 4x^8 + ...

So again with x a small power of 10, you're going to see a single-digit error
from the -2x^3 term, another single-digit error further along from the 2x^4
term, and so on.

Here's a simpler example of the same phenomenon.  Note how the error propagation
leads to a single incorrect digit in the *middle* of the digit string.

Python 3.4.0 (default, Mar 25 2014, 11:07:05) 
[GCC 4.2.1 Compatible Apple LLVM 5.1 (clang-503.0.38)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from decimal import Decimal
>>> x = Decimal('1e-15')
>>> y = (1 - 2 * x) / (1 - x)
>>> 2 * x + (y - 1) * (1 - x)  # Mathematically, expect to get 'x' back.
Decimal('1.001E-15')
>>> x
Decimal('1E-15')

-- 
Mark


-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Bug in Decimal??

2014-05-04 Thread Mark Dickinson
On Tue, 29 Apr 2014 19:37:17 -0700, pleasedontspam wrote:

> Hello, I believe I found a bug in the Decimal library. The natural logarithm
> results seem to be off in a certain range, as compared with Wolfram Alpha.

I had a quick look: this isn't a bug - it's just the result of propagation of
the error in "partial" to "final".

In more detail: we've got a working precision of 2016 significant figures.  For
any small x, we have (1 + x) / (1 - x) = 1 + 2x + 2x^2 + 2x^3 + ...  For your
value of x, Decimal('1e-1007'), we've got enough precision to store
1 + 2x + 2x^2 exactly, but that's all.

So partial has an absolute error of around 2x^3, or 2e-3021.

And since the derivative of the natural log function is almost exactly 1 at the
point we're interested in, we expect the absolute error in the output to be
close to 2e-3021, too.

And that's *precisely* what you're seeing: the Decimal module is giving you a
result that's exactly `Decimal('2e-1007') - Decimal('1.3e-3021')`, while the
result you were expecting is `Decimal('2e-1007') + Decimal('0.7e-3021')`.  A
difference of exactly `Decimal('2e-3021')`, as expected.
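
If you want to see the effect directly, something along these lines reproduces
it (the reference precision of 3100 digits is an arbitrary choice of mine,
just comfortably more than 3021):

from decimal import Decimal, localcontext

x = Decimal('1e-1007')

with localcontext() as ctx:
    ctx.prec = 2016
    partial = (1 + x) / (1 - x)   # absolute error around 2e-3021
    result = partial.ln()

with localcontext() as ctx:
    ctx.prec = 3100               # enough digits for a reference value
    reference = ((1 + x) / (1 - x)).ln()
    print(result - reference)     # about -2E-3021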

-- 
Mark


-- 
https://mail.python.org/mailman/listinfo/python-list


Re: 'complex' function with string argument.

2014-03-17 Thread Mark Dickinson
Jayanth Koushik  gmail.com> writes:

> "Note: When converting from a string, the string must not contain whitespace
> around the central + or - operator. For example, complex('1+2j') is fine, but
> complex('1 + 2j') raises ValueError."
> 
> Why is this so?

See http://bugs.python.org/issue9574 for a bit more context.  To begin with,
it's not at all clear what *should* be allowed.  If "1 + 2j" is valid, what
about "+ 2j"?  How about "+ 2"?  What about things like "+1.0 + -2.3j"?
Ultimately, I closed that issue because the proposed change seemed like
unnecessary code churn, and there wasn't a clear consensus that the change was
desirable (or even what that change should be).  If it ain't broke, etc.

-- 
Mark


-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Float to String

2012-10-31 Thread Mark Dickinson
  3ds.com> writes:

> When formatting a float using the exponential format, the rounding is
> different in Python-2.6 and Python-2.7. See example below.  Is this
> intentional?

Yes, in a sense.  Python <= 2.6 uses the OS-provided functionality (e.g., the C
library's strtod, dtoa and sprintf functions) to do float-to-string and
string-to-float conversions, and hence behaves differently from platform to
platform.  In particular, it's common for near halfway cases (like the one
you're looking at here) and tiny numbers to give different results on different
platforms.  Python >= 2.7 has its own built-in code for performing
float-to-string and string-to-float conversions, so those conversions are
platform-independent and always correctly rounded.  (Nitpick: it's still
theoretically possible for Python 2.7 to use the OS code if it can't determine
the floating-point format, or if it can't find a way to ensure the proper FPU
settings, but I don't know of any current platforms where that's the case.)
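
One visible consequence of having correctly-rounded conversions (a quick
sketch of my own, not the OP's example): in 2.7 and later, repr of a float
round-trips exactly on every platform.

import random

for _ in range(100000):
    x = random.uniform(-1e308, 1e308)
    assert float(repr(x)) == x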

> Is there any way of forcing the Python-2.6 behavior (for compatibility
> reasons when testing)?

Not easily, no.

--
Mark


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Extracting bit fields from an IEEE-784 float

2012-07-30 Thread Mark Dickinson
On Monday, July 30, 2012 3:16:05 PM UTC+1, Grant Edwards wrote:
> The last ones I worked on that where the FP format wasn't IEEE were
> 
> the DEC VAX and TI's line if 32-bit floating-point DSPs.  I don't
> 
> think Python runs on the latter, but it might on the former.

For current hardware, there's also IBM big iron:  the IBM System z10 apparently 
has hardware support for IBM's hexadecimal floating-point format in addition to 
IEEE 754 binary *and* decimal floating-point.  But IIUC, a typical Linux 
installation on one of these machines uses the IEEE 754 stuff, not the 
hexadecimal bits.  So unlikely to be an issue for Python.

-- 
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Extracting bit fields from an IEEE-784 float

2012-07-30 Thread Mark Dickinson
On Monday, July 30, 2012 1:44:04 AM UTC+1, Steven D'Aprano wrote:
> 1) Are there any known implementations or platforms where Python floats 
> are not C doubles? If so, what are they?

Well, IronPython is presumably using .NET Doubles, while Jython will be using 
Java Doubles---in either case, that's specified to be the IEEE 754 binary64 
type.

For CPython, and I guess PyPy too, we're using C doubles, which in theory are 
in whatever format the platform provides, but in practice are always IEEE 754 
binary64 again.

So you're pretty safe assuming IEEE 754 binary64 format.  If you ever meet a 
current Python running on a system that *doesn't* use IEEE 754 for its C 
doubles, please let me know---there are a lot of interesting questions that 
would come up in that case.   
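
For what it's worth, here's a quick way to check that assumption on any given
box (a sketch; the byte pattern is what IEEE 754 binary64 prescribes for 1.0):

import struct
import sys

print(sys.float_info.mant_dig)   # 53 for binary64
print(struct.pack('<d', 1.0) == b'\x00' * 6 + b'\xf0\x3f')   # True for IEEE 754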

> 2) If the platform byte-order is reversed, do I need to take any special 
> 
> action? I don't think I do, because even though the float is reversed, so 
> 
> will be the bit mask. Is this correct?

True; on almost all current platforms, the endianness of int types matches the 
endianness of float types.But to be safe, why not use ' 3) Any other problems with the way I am doing this?

You might consider whether you want to use 'http://mail.python.org/mailman/listinfo/python-list


Re: Numeric root-finding in Python

2012-02-12 Thread Mark Dickinson
On Feb 12, 6:41 am, Steven D'Aprano  wrote:

>             err = -a/b  # Estimate of the error in the current w.
>             if abs(err) <= 1e-16:
>                 break

If the result you're expecting is around -1.005, this exit condition
is rather optimistic:  the difference between the two Python floats
either side of this value is already 2.22e-16, so you're asking for
less than half a ulp of error!

As to the rest;  your error estimate simply doesn't have enough
precision.  The main problem is in the computation of a, where you're
subtracting two almost identical values.  The absolute error incurred
in computing w*exp(w) is of the same order of magnitude as the
difference 'w*exp(w) - x' itself, so err has lost essentially all of
its significant bits, and is at best only a crude indicator of the
size of the error.  The solution would be to compute the quantities
'exp(w), w*exp(w), and w*exp(w) - x' all with extended precision.  For
the other quantities, there shouldn't be any major issues---after all,
you only need a few significant bits of 'delta' for it to be useful,
but with the subtraction that generates a, you don't even get those
few significant bits.
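
By way of illustration, here's one way to get that residual with extra
precision via the decimal module (a sketch of mine, not the OP's code; the 50
digits are an arbitrary choice):

from decimal import Decimal, getcontext

def residual(w, x, prec=50):
    """w*exp(w) - x, computed to 'prec' significant digits."""
    getcontext().prec = prec
    wd = Decimal(w)               # float -> Decimal conversion is exact
    return wd * wd.exp() - Decimal(x)

# e.g., for the numbers in this thread:
# residual(-1.005722239691523,
#          -0.36787344117144249455719773322925902903079986572265625)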

> (The correct value for w is approximately -1.00572223991.)

Are you sure?  Wolfram Alpha gives me the following value for W(-1,
-0.36787344117144249455719773322925902903079986572265625):

-1.005722239691522978...

so it looks as though the values you're getting are at least
alternating around the exact value.


--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: surprising result all (generator) (bug??)

2012-01-31 Thread Mark Dickinson
On Jan 31, 7:24 am, Neal Becker  wrote:
> In [31]: all is numpy.all
> Out[31]: True
>
> Excellent detective work, Mark!  But it still is unexpected, at least to me.

Agreed that it's a bit surprising.  It's a consequence of NumPy's
general mechanisms for converting arbitrary inputs to arrays:

>>> from numpy import asarray
>>> asarray([i > 0 for i in range(10)])
array([False,  True,  True,  True,  True,  True,  True,  True,  True,
True], dtype=bool)
>>> asarray(i > 0 for i in range(10))
array(<generator object <genexpr> at 0x4688aa8>, dtype=object)

So in the second case you get a 0-dimensional array of type 'object',
whose only element is that generator object;  not surprisingly, the
generator object is considered true.

As to why it's this way:  best to ask on the NumPy mailing lists.
It's probably something that's quite difficult to change without
breaking lots of code, though.

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: surprising result all (generator) (bug??)

2012-01-31 Thread Mark Dickinson
On Jan 31, 6:40 am, Neal Becker  wrote:
> I was just bitten by this unexpected behavior:
>
> In [24]: all ([i > 0 for i in xrange (10)])
> Out[24]: False
>
> In [25]: all (i > 0 for i in xrange (10))
> Out[25]: True

What does:

>>> import numpy
>>> all is numpy.all

give you?

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: struct calcsize discrepency?

2011-12-05 Thread Mark Dickinson
On Dec 5, 8:09 am, Chris Angelico  wrote:
> May be, yes, but since calcsize() is returning 12 when the elements
> are put in the other order, it would seem to be not counting such
> padding.

Indeed.  That's arguably a bug in the struct module, and one that
people have had to find workarounds for in the past.  See note 3 at:

http://docs.python.org/library/struct.html#byte-order-size-and-alignment

If it weren't for backwards compatibility issues, I'd say that this
should be fixed.
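
For anyone following along, the discrepancy looks like this on a typical
64-bit box (sizes are platform-dependent, so treat the numbers as
illustrative):

import struct

# Native ('@') alignment pads *between* fields but not at the end:
print(struct.calcsize('id'))   # 16: 4-byte int, 4 bytes padding, 8-byte double
print(struct.calcsize('di'))   # 12: 8-byte double, 4-byte int, no trailing pad

# A C compiler would give sizeof == 16 for the equivalent struct in either
# order, since it pads the end out to the largest member alignment.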

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: struct calcsize discrepency?

2011-12-05 Thread Mark Dickinson
On Dec 4, 3:17 pm, Chris Angelico  wrote:
> On Mon, Dec 5, 2011 at 1:51 AM, Dave Angel  wrote:
> > In C, the padding to the largest alignment occurs at the
> > end of a structure as well as between items.  Otherwise, an array of the
> > struct would not be safely aligned.  if you have an 8byte item followed by a
> > 4 byte item, the total size is 16.
>
> That's padding of the array, not of the structure.

That's a strange way to think of it, especially since the padding also
happens for a single struct object when there's no array present.  I
find it cleaner to think of C as having no padding in arrays, but
padding at the end of a struct.  See C99 6.7.2.1p15: 'There may be
unnamed padding at the end of a structure or union.'  There's no
mention in the standard of padding for arrays.
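
The same rule is easy to observe from Python via ctypes, which follows the C
layout rules (a quick sketch; the sizes assume a typical platform where
doubles are 8-byte aligned):

import ctypes

class S(ctypes.Structure):
    _fields_ = [('a', ctypes.c_double), ('b', ctypes.c_int)]

print(ctypes.sizeof(S))        # 16: the 4 bytes of trailing padding are included
print(ctypes.sizeof(S * 4))    # 64: arrays add no padding of their own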

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: unpack('>f', b'\x00\x01\x00\x00')

2011-12-02 Thread Mark Dickinson
On Dec 1, 10:21 am, Hrvoje Niksic  wrote:
> Chris Rebert  writes:
> > C does not have a built-in fixed-point datatype, so the `struct`
> > module doesn't handle fixed-point numbers directly.
>
> The built-in decimal module supports fixed-point arithmetic, but the
> struct module doesn't know about it.  A bug report (or patch) by someone
> who works with binary representations of fixed-point would be a good
> start to improve it.

Not really:  the decimal module is for floating-point, not fixed-
point, though its semantics for significant trailing zeros can help
give the appearance of fixed-point arithmetic for simple operations
(addition, subtraction) that stay within suitable bounds.  And decimal
fixed-point isn't so much use for a binary fixed-point format, anyway.
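
To illustrate the distinction (my example, not from the original report):
trailing zeros make simple sums *look* like fixed-point, but the exponent
floats as soon as you multiply.

from decimal import Decimal

print(Decimal('1.20') + Decimal('2.30'))   # 3.50   -- looks like fixed-point
print(Decimal('1.20') * Decimal('2.30'))   # 2.7600 -- exponent follows the data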

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Implement comparison operators for range objects

2011-10-14 Thread Mark Dickinson
On Oct 12, 8:24 pm, Ian Kelly  wrote:
> On Wed, Oct 12, 2011 at 11:51 AM, MRAB  wrote:
> >> Aside:
>
> >> I'm astonished to see that range objects have a count method! What's the
> >> purpose of that? Any value's count will either be 0 or 1, and a more
> >> appropriate test would be `value in range`:
>
> >>  >>> 17 in range(2, 30, 3) # like r.count(17) => 1
> >> True
> >>  >>> 18 in range(2, 30, 3) # like r.count(18) => 0
> >> False
>
> > In Python 2, range returns a list, and lists have a .count method.
> > Could that be the reason?
>
> Python 2 xrange objects do not have a .count method.  Python 3 range
> objects do have a .count method.  The addition is curious, to say the
> least.

See http://bugs.python.org/issue9213

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Generating equally-spaced floats with least rounding error

2011-09-24 Thread Mark Dickinson
On Sep 24, 6:01 pm, Steven D'Aprano  wrote:
> Ah, I knew it was too easy!
>
> >>> from fractions import Fraction as F
> >>> start, stop, n = 1, 3.1, 7
> >>> [float(F(start) + i*(F(stop)-F(start))/n) for i in range(n+1)]
>
> [1.0, 1.3, 1.6, 1.9001, 2.2, 2.5, 2.8003, 3.1]

I believe that's still giving correctly-rounded results.  Note that
the stop value of 3.1 isn't exactly 3.1 here:  it's

3.100000000000000088817841970012523233890533447265625

So the 4th value above is the closest float to
4/7 * 1.0 + 3/7 * 3.100000000000000088817841970012523233890533447265625.

--
Mark

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Generating equally-spaced floats with least rounding error

2011-09-24 Thread Mark Dickinson
On Sep 24, 5:46 pm, Steven D'Aprano  wrote:
> Speed is important, but secondary to correctness. To be honest, I never even
> thought of using fractions as an intermediate result -- I was thinking of
> generating lists of Fractions. I think I can use this approach.

Speed could be improved greatly by using integer arithmetic (with some
help from the float.as_integer_ratio method) instead of using Fraction
directly, though it would require writing quite a lot more code;
Fraction's many and slow gcd calls for normalization are a problem
here, and since all denominators will be powers of 2 they're largely
unnecessary.
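
For the curious, here's roughly the shape of that idea (a sketch of mine, not
timed; it still leans on Fraction at the very end, purely for the
correctly-rounded conversion back to float):

from fractions import Fraction

def spaced(start, stop, n):
    # express start and stop exactly as integer / power-of-two
    sn, sd = start.as_integer_ratio()
    tn, td = stop.as_integer_ratio()
    num_start, num_stop, den = sn * td, tn * sd, sd * td
    return [float(Fraction(num_start * (n - i) + num_stop * i, den * n))
            for i in range(n + 1)]

print(spaced(0.0, 2.1, 7))   # [0.0, 0.3, 0.6, 0.9, 1.2, 1.5, 1.8, 2.1]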

> But out of curiosity, how would you do it using nothing but floats? Is there
> a way?

Hmm.  Beyond writing my own multiple-precision integer library using
floats as the base type, I can't think of anything obvious. :-)

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Generating equally-spaced floats with least rounding error

2011-09-24 Thread Mark Dickinson
On Sep 24, 2:53 pm, Steven D'Aprano  wrote:
> I'm trying to generate a sequence of equally-spaced numbers between a lower
> and upper limit. Given arbitrary limits, what is the best way to generate a
> list of equally spaced floats with the least rounding error for each point?
>
> For example, suppose I want to divide the range 0 to 2.1 into 7 equal
> intervals, then the end-points of each interval are:
>
> (0.0)---(0.3)---(0.6)---(0.9)---(1.2)---(1.5)---(1.8)---(2.1)
>
> and I'd like to return the values in the brackets. Using Decimal or Fraction
> is not an option, I must use floats. If the exact value isn't representable
> as a float, I'm okay with returning the nearest possible float.

Can you explain why you're constrained not to use Fraction?  Speed?

Using Fraction for intermediate calculations actually works perfectly
for this, since conversions from float to Fraction are exact, while
conversions from Fraction to float are correctly rounded.  So if
you're using Python, you're not too bothered about efficiency, and you
want provably correctly-rounded results, why not use Fraction?

>>> from fractions import Fraction
>>> start, stop, n = 0.0, 2.1, 7
>>> [float(Fraction(start) + i * (Fraction(stop) - Fraction(start)) / n) for i in range(n+1)]
[0.0, 0.3, 0.6, 0.9, 1.2, 1.5, 1.8, 2.1]

--
Mark

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: random.randint() slow, esp in Python 3

2011-09-24 Thread Mark Dickinson
On Sep 24, 10:38 am, Ian Kelly  wrote:
> On Sat, Sep 24, 2011 at 12:55 AM, Steven D'Aprano  wrote:
> > If you want unbiased, random (or at least pseudo-random) integers chosen
> > from an uniform distribution with proper error checking, you should use
> > randint or randrange.
>
> >> but if you just
> >> want a pile of more-or-less random integers, int(random.random()*top)
> >> is the best option.
>
> > "I only need random-ish ints, but I need 'em yesterday!!!"
>
> > If you want biased, not-quite-uniformly distributed integers with no error
> > checking, then the above will save you a few nanoseconds per call. So if
> > your application needs a million such ints, you might save a full five
> > seconds or so. Sounds like premature optimization to me, but what do I
> > know? If you have an application where the quality of randomness is
> > unimportant and generating random ints is a genuine performance bottleneck,
> > then go right ahead.
>
> Peeking under the hood, after all the error-checking and extra
> features, randrange basically just does int(random.random() * top).
> Any bias present in the latter will also be present in the former.
>
> Cheers,
> Ian

In Python 2.x that's true, and indeed randrange has some fairly bad
behaviour for large arguments ( see http://bugs.python.org/issue9025
).  In Python 3.2 and up, randrange is more careful:  it doesn't
introduce any extra bias beyond that already present in the random
source (Mersenne Twister).  If you look at the source, you'll see that
Python 2.x only calls _getrandbits for large arguments, while Python
3.2+ calls _getrandbits for *every* invocation of randrange.  (All
assuming that we're using the default MT random number generator.)
This may well explain the difference in timings observed by the OP.
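
To make the bias concrete, here's a small Python 3.2+ illustration of my own:
random.random() only carries 53 random bits, so for a range as big as 2**60
the quick recipe can only ever produce multiples of 2**7, while randrange
draws fresh bits.

import random

n = 2 ** 60
quick = [int(random.random() * n) for _ in range(1000)]
print(all(v % 128 == 0 for v in quick))    # True: low 7 bits are always zero

proper = [random.randrange(n) for _ in range(1000)]
print(all(v % 128 == 0 for v in proper))   # False, with overwhelming probability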

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Operator commutativity

2011-09-21 Thread Mark Dickinson
On Sep 21, 2:07 am, Steven D'Aprano  wrote:
> After playing around with various combinations of C1, C2, D1 and D2, it
> seems to me that the rule is:
>
> If the right-hand argument is a subclass of the left-hand argument, AND also
> defines __radd__ directly rather than inheriting it, then its __radd__
> method is called before the left-hand argument's __add__ method.
>
> which strikes me as a strangely specific and not very useful rule. I suspect
> it might be an accident of implementation rather than a deliberate feature.

I'm 99.9% sure it's deliberate rather than an accident of
implementation.  See the note in the docs at:

http://docs.python.org/reference/datamodel.html#emulating-numeric-types

Suppose that you're subclassing int, (class MyInt(int): ...) and you
want to intercept additions of the form 3 + MyInt(4)  (perhaps because
you want MyInt to be 'contagious', so that an arithmetic operation
that combines an int and a MyInt returns a MyInt).  How would you
achieve this without this rule?
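
A minimal sketch of what that looks like (the class and method bodies here are
mine, purely for illustration):

class MyInt(int):
    def __add__(self, other):
        return MyInt(int(self) + int(other))
    def __radd__(self, other):
        return MyInt(int(other) + int(self))

# Because MyInt is a subclass of int *and* overrides __radd__, it gets first
# crack at the addition even when it appears on the right-hand side:
print(type(3 + MyInt(4)))    # <class '__main__.MyInt'>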

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: help needed on decimal formatting issue

2011-09-19 Thread Mark Dickinson
On Sep 19, 1:42 pm, Robin Becker  wrote:
> I'm not really very used to the decimal module so I'm asking here if any one 
> can
> help me with a problem in a well known third party web framework
>
> The code in question is
>
> def format_number(value, max_digits, decimal_places):
>      """
>      Formats a number into a string with the requisite number of digits and
>      decimal places.
>      """
>      if isinstance(value, decimal.Decimal):
>          context = decimal.getcontext().copy()
>          context.prec = max_digits
>          return u'%s' % str(
>           value.quantize(decimal.Decimal(".1") ** decimal_places,
>                         context=context))
>      else:
>          return u"%.*f" % (decimal_places, value)

What's the meaning of the 'max_digits' argument here?  Are you
guaranteed that the incoming value is smaller than 10**max_digits in
absolute value?

If so, then a precision of max_digits + decimal_places + 1 should be
enough.  The +1 is there to cover the corner case where a value close
to 10**max_digits is rounded up to 10**max_digits by the quantize
operation.
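
Here's the corner case I have in mind, for concreteness (the numbers are made
up: max_digits=3, decimal_places=2):

import decimal

context = decimal.getcontext().copy()
context.prec = 3 + 2 + 1    # max_digits + decimal_places + 1
value = decimal.Decimal('999.996')
print(value.quantize(decimal.Decimal('0.01'), context=context))  # 1000.00

# With prec = 3 + 2 = 5 the same quantize call would raise
# decimal.InvalidOperation, because 1000.00 needs six digits.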

BTW, that's a fairly horrible way of creating the first argument to
the quantize method, too.  It would be more efficient to do something
like:

>>> decimal_places = 2
>>> decimal.Decimal('0.{}1'.format('0'*(decimal_places-1)))
Decimal('0.01')

(perhaps with suitable special-casing for the case where
decimal_places <= 0;  not sure whether this applies in your context).

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: way to calculate 2**1000 without expanding it?

2011-09-18 Thread Mark Dickinson
On Sep 16, 9:17 pm, Arnaud Delobelle  wrote:
> Ah go on, let's make a codegolf contest out of it.
> My entry:
>
> >>> sum(map(int,str(2**1000)))

You could save another character by replacing "2**1000" with "2<<999"

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: recursive algorithm for balls in numbered boxes

2011-09-11 Thread Mark Dickinson
On Sep 11, 1:43 am, "Dr. Phillip M. Feldman"
 wrote:
> I've written a recursive class that creates an iterator to solve a general
> formulation of the combinatorics problem known as "balls in numbered boxes"
> (also known as "indistinguishable balls in distinguishable boxes").  The
> code has been extensively tested and appears to work, but isn't terribly
> elegant.  Any suggestions about how to improve it will be appreciated.
>
> Also, I'd like to get this functionality into the Python's `itertools`
> module (the present set of combinatorics functions in `itertools` does not
> include "balls in boxes").  Does anyone know whom I should contact about
> this?

Note that for the version without size limits on individual boxes, the
itertools.combination function already provides most of what's
needed.  For example:

import itertools

def balls_in_boxes(nballs, nboxes):
    n, k = nballs + nboxes - 1, nboxes - 1
    for comb in itertools.combinations(range(n), k):
        yield [y - x - 1 for y, x in zip(comb + (n,), (-1,) + comb)]

print "5 balls in 3 boxes"
for combination in balls_in_boxes(5, 3):
    print combination
    assert len(combination) == 3
    assert sum(combination) == 5


This is a well-known trick:  to divide 5 (unlabeled) balls amongst 3
(labeled) boxes, you write down sequences of 5 o's and 2 x's, where
the o's represent the 5 balls and the 'x's represent dividers:

ooxooxo  -> [2, 2, 1]
xoooxoo  -> [0, 3, 2]

And 'combinations(7, 2)' yields successively all the possible
different placements for the 2 dividers in the 7 symbols.


This question seems to come up often enough (without the box size
limit twist) that maybe it would be useful to include something like
this recipe in the itertool documentation.


For getting this into itertools, I'd suggest opening a feature request
on bugs.python.org and assigning it to Raymond Hettinger.

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Representation of floats (-> Mark Dickinson?)

2011-09-07 Thread Mark Dickinson
On Sep 7, 4:58 am, casevh  wrote:
> IIRC, Python
> 3.2 changed (for floats) __str__ to call __repr__.

Yes, exactly:  str and repr of a float are identical in Python 3.2 +

I'm also puzzled by the

2.7.2 (default, Jun 12 2011, 15:08:59) [MSC v.1500 32 bit (Intel)]
[...]
>>> 1.1 * 1.1
1.21

in jmf's message.  Cut-and-paste typo?

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Large number multiplication

2011-07-08 Thread Mark Dickinson
On Jul 7, 9:30 am, Ulrich Eckhardt 
wrote:
> That said, I'm sure that the developers would accept a patch that switches
> to a different algorithm if the numbers get large enough. I believe it
> already doesn't use Karatsuba for small numbers that fit into registers,
> too.

I'm far from sure that such a patch would be accepted. :-)  Indeed,
Tim Peters has been heard to mention that if he were to do it all
again, he probably wouldn't even have implemented Karatsuba [1].
Asymptotically fast integer multiplication is a specialist need that's
already available in 3rd-party Python libraries (gmpy).  IMO, it's not
appropriate to include it in a general-purpose programming language,
and the burden of maintaining such code would outweigh the benefits.

One thing that has been considered in the past is making it possible
to use GMP directly for Python's implementation of long integers;  see
Victor Stinner's efforts in this direction [2].  Licensing concerns,
and the fact that Python's implementation is faster for small
integers, ended up killing this issue.

[1] http://mail.python.org/pipermail/python-dev/2008-November/083355.html
[2] http://bugs.python.org/issue1814

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Fun and games with lambda

2011-06-17 Thread Mark Dickinson
On Jun 17, 5:10 pm, Steven D'Aprano  wrote:
>
> print((lambda f:((lambda p:p[0]+'.'+p[1:])(str((lambda Q:2*Q[0]*Q[0]//Q
> [3])((lambda F:(lambda S:f(lambda T,_:((T[0]+T[1])//2,S((T[0]*T[1])//
> F),2*T[2],(T[3]-(T[2]*(((T[0]+T[1])//2)**2-(S((T[0]*T[1])//F))**2))//F)),
> [0]*13,(F,(F*F)//S(2*F),2,F//2)))(lambda n:f(lambda x,_:(x-x//2+(n*F)//
> (2*x)),[0]*15,n//2)))(10**(5010[:5000])))(reduce))

Very nice, but a little unnatural.  Can't you find room to stick an
extra factor of 2 in there somewhere?

(See also: http://bugs.python.org/issue12345 )

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Recursion or iteration (was Fibonacci series recursion error)

2011-05-15 Thread Mark Dickinson
On May 15, 8:20 pm, Mark Dickinson  wrote:
> On May 15, 4:32 am, rusi  wrote:
>
> > On May 15, 2:19 am, Ian Kelly  wrote:
> > > Yup, linear.  Assuming you optimize the even case so that it doesn't
> > > actually call fib(n//2) twice, the call tree can be approximated as a
> > > balanced binary tree with height log(n).  The total number of nodes in
> > > the tree is thus O(2 ** log(n)) = O(n).
>
> > It would be linear if the base of the log were 2.
> > I am not sure it is.
> > You see the naive fib has a complexity which is fib itself. [Earlier
> > discussion with Steven]
> > fib is exponential but with radix < 2 [phi = (1 + sqrt(5))/2 ]
> > This would suggest that this algo is slightly better than linear.
>
> Nope.  It's linear, just as Ian Kelly said.  If g(n) is the total
> number of fib calls made for fib(n), then it's easy to show (e.g., by
> induction) that:
>
> (a) g(n) is an increasing function of n, and
> (b) g(2^k) = 2^k - 1 for all k >= 1.
>
> Hence g(n) is O(n).

Hmm.  It's even easier:  g(n) is:

  * 1 if n == 1
  * n if n > 1, n odd
  * n-1 if n > 1, n even

So definitely linear. :-)

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Recursion or iteration (was Fibonacci series recursion error)

2011-05-15 Thread Mark Dickinson
On May 15, 4:32 am, rusi  wrote:
> On May 15, 2:19 am, Ian Kelly  wrote:
> > Yup, linear.  Assuming you optimize the even case so that it doesn't
> > actually call fib(n//2) twice, the call tree can be approximated as a
> > balanced binary tree with height log(n).  The total number of nodes in
> > the tree is thus O(2 ** log(n)) = O(n).
>
> It would be linear if the base of the log were 2.
> I am not sure it is.
> You see the naive fib has a complexity which is fib itself. [Earlier
> discussion with Steven]
> fib is exponential but with radix < 2 [phi = (1 + sqrt(5))/2 ]
> This would suggest that this algo is slightly better than linear.

Nope.  It's linear, just as Ian Kelly said.  If g(n) is the total
number of fib calls made for fib(n), then it's easy to show (e.g., by
induction) that:

(a) g(n) is an increasing function of n, and
(b) g(2^k) = 2^k - 1 for all k >= 1.

Hence g(n) is O(n).

Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Recursion or iteration (was Fibonacci series recursion error)

2011-05-13 Thread Mark Dickinson
On May 11, 11:06 pm, Hans Mulder  wrote:
> On 03/05/2011 09:52, rusi wrote:
>
> > [If you believe it is, then try writing a log(n) fib iteratively :D ]
>
> It took me a while, but this one seems to work:
>
> from collections import namedtuple
>
> Triple = namedtuple('Triple', 'hi mid lo')
> Triple.__mul__ = lambda self, other: Triple(
>      self.hi * other.hi + self.mid * other.mid,
>      self.hi * other.mid + self.mid * other.lo,
>      self.mid * other.mid + self.lo * other.lo,
> )
> [...]

You can even get away with pairs rather than triples:



from collections import namedtuple

Pair = namedtuple('Pair', 'z o')
Pair.__mul__ = lambda self, other: Pair(
    self.z * other.z + self.o * other.o,
    self.z * other.o + self.o * other.z + self.o * other.o,
)

def fib(n):
    f = Pair(0, 1)
    a = Pair(1, 0)
    while n:
        if n & 1:
            a *= f
        f *= f
        n >>= 1
    return a.o



I don't see this (or Hans' version) as cheating at all.  This really
*is* the power algorithm, just in a different number system from the
usual one.  For those with a bit of abstract algebra, the above
algorithm is just computing x^n in the ring Z[x] / (x^2 - x - 1).  A
pair 'Pair(a, b)' represents the element 'a + bx' (more precisely, the
image of 'a + bx' under the natural quotient map Z[x] -> Z[x] / (x^2 -
x - 1)) of that ring.  And this *can* be generalised to other
sequences given by a linear recurrence.

Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Fibonacci series recursion error

2011-05-02 Thread Mark Dickinson
On May 2, 5:24 pm, Steven D'Aprano  wrote:

> As far as I'm concerned, there are only two definitions of Fibonacci
> numbers that have widespread agreement in the mathematical community:
>
> 0,1,1,2,3 ... (starting from n=0)
> 1,1,2,3,5 ... (starting from n=1)
>
> Any other definition is rather, shall we say, idiosyncratic.

And a concrete reason for preferring the above definition (in either
form) is that divisibility properties of the sequence are much neater
with this choice:

gcd(F_m, F_n) = F_{gcd(m, n)}
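
A quick brute-force check of that identity, with the F_0 = 0, F_1 = 1
convention (a sketch only; math.gcd needs Python 3.5+):

from math import gcd

def fib(n):
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

assert all(gcd(fib(m), fib(n)) == fib(gcd(m, n))
           for m in range(1, 30) for n in range(1, 30))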

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Some questions on pow and random

2011-03-27 Thread Mark Dickinson
On Mar 27, 3:00 pm, joy99  wrote:
> (i) Suppose we have 8 which is 2^3 i.e., 3 is the power of 2, which we
> are writing in Python as,
> variable1=2
> variable2=3
> result=pow(variable1,variable2)
>
> In my first problem p(x) a list of float/decimals and f(x) is another
> such.
> Here,
> variable1=p(x)
> variable2=f(x)
> so that we can write, pow(variable1,variable2) but as it is a list not
> a number and as the size is huge, so would it pow support it?

No:  pow won't work on lists.  It will work on (a) numbers (pow(2, 3) -> 8),
or (b) numpy arrays, e.g.:

>>> import numpy as np
>>> x = np.array([0.1, 0.5, 0.4])
>>> y = np.array([3, 24, 18])
>>> pow(x, y)
array([  1.00000000e-03,   5.96046448e-08,   6.87194767e-08])
>>> x ** y # exactly equivalent
array([  1.00000000e-03,   5.96046448e-08,   6.87194767e-08])

> (ii) The second question is, if I have another set of variables,
>
> variable1=random.random()
> variable2=random.random()

In this case 'variable1' and 'variable2' are Python floats, so yes,
you can multiply them directly.  (BTW, you can always experiment
directly at the Python interactive prompt to answer this sort of
question.)

Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Some questions on pow and random

2011-03-27 Thread Mark Dickinson
On Mar 27, 11:07 am, joy99  wrote:

> (b) Suppose we have two distributions p(x1) and p(x2), of the Model M,
> the E of EM algorithm, without going into much technical details is,
> P0(x1,x2), P1(x1,x2) 
>
> Now I am taking random.random() to generate both x1 and x2 and trying
> to multiply them, is it fine? Or should I take anything else?

Sorry, it's unclear to me what you're asking here.  Can you rephrase
this as a question about Python's random.random() function?

If you're asking whether it's okay to regard your generated x1 and x2
as independent, then the answer is yes.

--
Mark

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Some questions on pow and random

2011-03-27 Thread Mark Dickinson
On Mar 27, 11:07 am, joy99  wrote:
> (i) By standard definition of Likelihood Estimation, we get if x ∈ X,
> where X is a countable set of types, p is probability and f is
> frequency.
> L(f;p) = ∏ p(x)^f(x)
>
> My question is python provides two functions,
> (a)     pow for power.
> (b)     reduce(mul, list)
>
> Now, how to combine them? If any one can suggest any help?

If I'm understanding your question correctly, it sounds as though
you've got a vector p = (p_1, p_2, ..., p_n) of probabilities and a
corresponding vector f = (f_1, f_2, ..., f_n) of integer frequencies,
and you want to compute the product of the quantities p_i ** f_i.  Is
that right?

Assuming that p and f are represented by Python lists, you might do
something like this:

>>> p = [0.1, 0.5, 0.4]
>>> f = [3, 24, 18]
>>> product = 1.0
>>> for pi, fi in zip(p, f):
...     product *= pi**fi
...
>>> product
4.0960000000000005e-18

I wouldn't particularly recommend using 'reduce' for this, since it
doesn't lead to readable code, but if you must you can do:

>>> import operator
>>> reduce(operator.mul, map(pow, p, f), 1)
4.0960000000000005e-18

or:

>>> reduce(operator.mul, [pi**fi for pi, fi in zip(p, f)], 1)
4.0960000000000005e-18

> As p(x)f(x), would be huge would pow support it?

You'll run into problems with underflow to zero fairly quickly.  The
usual solution is to work with log likelihood instead of likelihood
itself, in which case you'd want a sum instead:

>>> from math import log
>>> sum(fi*log(pi) for pi, fi in zip(p, f))
-40.03652078615561

If you really need the likelihood itself, you might try using the
Decimal module, which allows a much wider exponent range than Python's
builtin float.
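
For example (a sketch with made-up frequencies a hundred times larger than
yours), the product survives comfortably as a Decimal even though it would
underflow to zero as a float:

from decimal import Decimal

p = [Decimal('0.1'), Decimal('0.5'), Decimal('0.4')]
f = [300, 2400, 1800]
product = Decimal(1)
for pi, fi in zip(p, f):
    product *= pi ** fi
print(product)   # about 1.7E-1739; a float would have underflowed to 0.0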

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Guido rethinking removal of cmp from sort method

2011-03-26 Thread Mark Dickinson
On Mar 25, 2:00 pm, Stefan Behnel  wrote:
> Westley Martínez, 25.03.2011 14:39:
>
> > On Fri, 2011-03-25 at 07:11 +0100, Stefan Behnel wrote:
> >> Steven D'Aprano, 25.03.2011 06:46:
> >>> On Thu, 24 Mar 2011 18:32:11 -0700, Carl Banks wrote:
>
>  It's probably the least justified builtin other than pow.
>
> >>> I don't know about that. Correctly, efficiently and *quickly*
> >>> implementing the three-argument version of pow is exactly the sort of
> >>> thing that should be in the built-ins, or at least the standard library.
>
> >> I think that touches it already. We have a "math" module, so why is there a
> >> *builtin* for doing special math? How much code is there really that only
> >> uses pow() and does not import "math"?
>
> > pow() and math.pow() are different.
>
> I don't find that a good excuse for pow() being a builtin. Why not just
> rename it to "math.powmod" instead?

+1 (modulo bikeshedding about the name).  I've long thought that three-
argument pow belongs in the standard library rather than the core.
And then the existence of the ** operator obviates the need for two-
argument pow.

Worse than pow itself is the fact that the __pow__ special method
takes an optional 3rd argument, which apart from unnecessarily
complicating Python's internal, also leads to horrors like Decimal's
(IMO inappropriate) support for 3-argument pow.
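
For anyone unfamiliar with it, the three-argument form does modular
exponentiation without building the huge intermediate power:

>>> pow(3, 10**6, 7)    # same value as 3**1000000 % 7, but fast
4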

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python fails on math

2011-03-09 Thread Mark Dickinson
On Feb 25, 12:52 am, Grant Edwards  wrote:
> So I think the C standard actually
> forces the compiler to convert results to 64-bits at the points where
> a store to a temporary variable happens.

I'm not sure that this is true.  IIRC, C99 + Annex F forces this, but
C99 by itself doesn't.

> Indeed.  Though C-python _will_ (AFAIK) store results to variables
> everywhere the source code says to

Agreed.

That doesn't rescue Python from the pernicious double-rounding
problem, though:  it still bugs me that you get different results for
e.g.,

>>> 1e16 + 2.9
1.0000000000000002e+16

depending on the platform.  OS X, Windows, 64-bit Linux give the
above;  32-bit Linux generally gives 1.0000000000000004e+16 instead,
thanks to using the x87 FPU with its default 64-bit precision.
(Windows uses the x87 too, but changes the precision to 53-bit
precision.)

In theory this is prohibited too, under C99 + Annex F.


--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: builtin max() and weak ordering

2011-03-03 Thread Mark Dickinson
On Mar 3, 10:38 am, Stephen Evans  wrote:
> The CPython builtins min() and max() both return the first satisfactory 
> object found. This behaviour is undocumented. Examining the source in 
> bltinmodule.c shows that min() and max() both result in a call to min_max() 
> with Py_LT and Py_GT passed respectively.
>
> The behaviour of min() is correct. But with weak orderings of equivalent 
> objects, surely max() should be using Py_GE ? (the converse of Py_LT). Should 
> I be reporting a bug, submitting a patch, making feature request or 
> suggesting a documentation update ?

See:

http://bugs.python.org/issue9802

(and feel free to add comments to that issue).
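
For reference, the current behaviour: with equal keys, both functions return
the *first* of the tied items.

>>> items = [(1, 'first'), (1, 'second')]
>>> max(items, key=lambda item: item[0])
(1, 'first')
>>> min(items, key=lambda item: item[0])
(1, 'first')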

Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: An amazing one-minute bit of fun at the interactive prompt

2011-02-21 Thread Mark Dickinson
On Feb 20, 8:08 am, Raymond Hettinger  wrote:
> [...]
> >>> n * e
>
> 3.1415926
>
> Compute π ≈ n * e by counting Mandelbrot set iterations :-)

Very neat!  Is it supposed to be obvious why this gives an
approximation to pi?  If so, I'll think about it a bit more;  if not,
do you have any references?

Maybe you should have saved this for March 14th...

Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: round in 2.6 and 2.7

2010-12-29 Thread Mark Dickinson
On Dec 28, 9:47 pm, "Martin v. Loewis"  wrote:
> >> "Float-to-string and string-to-float conversions are correctly rounded.
> >> The round() function is also now correctly rounded."
>
> >> Not sure that this is correct English; I think it means that the
> >> round() function is now correct.
>
> > Well, the correct result of the example the OP gave would be 9.9
> > exactly.  But since 9.9 isn't exactly representable as a Python float,
> > we necessarily get an approximation.  The language above is intended
> > to convey that it's the 'correctly rounded' approximation
>
> I see. Shouldn't it say then "The round() function
> gives/produces/returns correctly rounded results now", instead of saying
> that
> the round function *is* correctly rounded? ISTM that the round
> function cannot be rounded (correctly or not):
>
> py> round(round)
> Traceback (most recent call last):
>   File "", line 1, in 
> TypeError: a float is required
>
> But then, I'm not a native speaker (of English).

I dare say that you're just as much a native speaker as anyone else
when it comes to the language of floating-point arithmetic.  Not sure
English comes into it that much. :-)

Sure, I agree it's probably a slight abuse of language to refer to a
function as 'correctly rounded', but I think it's a fairly standard
abuse. From the two nearest references to hand: IEEE 754-2008 has a
whole section entitled 'Recommended correctly rounded functions' (nit:
shouldn't there be a hyphen in that title?), and persists in referring
to 'correctly rounded' conversions.  The more formal uses in that
document do indeed refer to single values, though. ("A conforming
function shall return results correctly rounded ...") The 'Handbook of
Floating-Point Arithmetic' by Muller et. al. gives a definition of a
correctly rounded function in section 2.2.  "When the exact result of
a function is rounded according to a given rounding mode ..., one says
that the function is *correctly rounded*."

It's not so dissimilar from other mathematical abuses, like describing
a real-valued function as 'positive', when what you actually mean is
that all its values are positive.

It's also slightly unsettling usage for me, not least because the
statement 'f is correctly rounded' for a floating-point function f is
really a statement about *two* functions:  namely, f (a function from
floating-point numbers to floating-point numbers) and the true
mathematical function that it's based on;  the identity of the
underlying mathematical function is left implicit.

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: round in 2.6 and 2.7

2010-12-27 Thread Mark Dickinson
On Dec 23, 6:57 pm, Hrvoje Niksic  wrote:
> I stumbled upon this.  Python 2.6:
>
> >>> round(9.95, 1)
>
> 10.0
>
> So it seems that Python is going out of its way to intuitively round
> 9.95, while the repr retains the unnecessary digits.

No, Python's not doing anything clever here.  Python 2.6 uses a simple
rounding algorithm that frequently gives the wrong answer for halfway
or near-halfway cases.  It's just luck that in this particular case it
gives the apparently-correct (but actually incorrect) answer.
Martin's already explained that the 2.7 behaviour is correct, and
agrees with string formatting.  However, note that there's still a
disconnect between these two operations in Python 2.7:

>>> round(1.25, 1)
1.3
>>> format(1.25, '.1f')
'1.2'

That's because 'round' in Python 2.x (including 2.7) still rounds
exact halfway cases away from zero, while string formatting rounds
them to the value with even last digit.  In Python 3.x, even this
discrepancy is fixed---everything does round-halfway-to-even.

> Is the change to round() expected?

Expected, and intentional. :-)

[Martin]
> "Float-to-string and string-to-float conversions are correctly rounded.
> The round() function is also now correctly rounded."
>
> Not sure that this is correct English; I think it means that the
> round() function is now correct.

Well, the correct result of the example the OP gave would be 9.9
exactly.  But since 9.9 isn't exactly representable as a Python float,
we necessarily get an approximation.  The language above is intended
to convey that it's the 'correctly rounded' approximation---that is,
the closest Python float to the true value of 9.9 (with halfway cases
rounded to even, as usual).

Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Some syntactic sugar proposals

2010-12-02 Thread Mark Dickinson
On Nov 15, 12:46 pm, Tim Chase  wrote:
> On 11/15/2010 12:39 AM, Dmitry Groshev wrote:
>
> > x in range optimisation
>
> I've often thought this would make a nice O(1)-test lookup on an
> xrange() generator.

An O(1) test for 'x in <range object>' is implemented in Python 3.2,
at least provided that x has type 'int', 'long' or 'bool'.  (If x is
an instance of a subclass of int or long, then there's a risk that the
semantics of the membership test have been changed by an explicitly
overridden __eq__, so Python wimps out and falls back to the O(n)
algorithm in that case.)

Python 3.2a4+ (py3k:86635:86636M, Nov 21 2010, 19:22:18)
[GCC 4.2.1 (Apple Inc. build 5664)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> -1 in range(10**9)
False
>>> 5 in range(0, 10**100, 2)
False
>>> 10**99 in range(0, 10**100, 2)
True

IIRC, there wasn't sufficient interest to get it backported to Python
2.7 in time for its release.  Though as a pure optimization, one could
argue that it would still be possible to get this into Python 2.7.2.

Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Weibull distr. random number generation

2010-11-20 Thread Mark Dickinson
On Nov 19, 3:21 pm, Dimos  wrote:
> I would like to use the random module, and if I understand well the Random 
> class, to create  1300 decimal numbers in python, by providing the 2 Weibull 
> parameters (b,c). How this can be done???
>
> import random
> print random
> random.weibullvariate(b,c)
> How can I set the population size n=1300 in decimals?

random.weibullvariate(b, c) generates a single sample from the
specified Weibull distribution.  If you want 1300 samples, you'd use
this within a loop.  For example, here's code to put 10 Weibull
variates with scale parameter 12345.0 and shape parameter 6.0 into a
list:

>>> import random
>>> samples = []
>>> for i in xrange(10):
... samples.append(random.weibullvariate(12345.0, 6.0))
...
>>> print samples
[15553.186762792948, 14304.175032317309, 9015.5053691933044,
12585.469181732506, 9436.2677219460638, 13350.89758791281,
5687.4330250037565, 12172.747202474553, 9687.8685933610814,
11699.040541029028]

A more experienced Python programmer might use a list comprehension to
achieve the same effect in a single line:

>>> samples = [random.weibullvariate(12345.0, 6.0) for _ in xrange(10)]
>>> print samples
[10355.396846416865, 14689.507803932587, 11491.850991569485,
14128.56397290655, 12592.739991974759, 9076.7752548878998,
11868.012238422616, 12016.784656753523, 14724.818462506191,
13253.477389116439]

Is this the sort of thing you were looking for?

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: what's the precision of fractions.Fraction?

2010-11-20 Thread Mark Dickinson
On Nov 19, 3:29 pm, RJB  wrote:
> Does Fractions remove common factors the way it should?
>
> If it does and you want to find the closest fraction with a smaller
> denominator i think tou'll need some number theory and continued
> fractions.

Or perhaps just use the existing Fraction.limit_denominator method
(which does indeed use some number theory and continued fractions):

>>> from fractions import Fraction
>>> from math import pi
>>> Fraction.from_float(pi).limit_denominator(1000)
Fraction(355, 113)

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: weakref.proxy behaviour in python 3.0

2010-08-21 Thread Mark Dickinson
On Aug 21, 5:06 pm, Nicholas Cole  wrote:
> On Sat, Aug 21, 2010 at 3:31 PM, Mark Dickinson  wrote:
>
> [SNIP]
>
> > So my guess is that the change was unintentional.
>
> > It's probably worth a bug report.  Even if the behaviour isn't going
> > to change in either 2.x or 3.x (and it probably isn't), it might be
> > possible to clarify the docs.
>
> I think the docs should be fixed: it would be good to have a list of
> key examples where the behaviour is different.  Although the new
> behaviour is better, it certainly tripped me up badly.
>
> I'm happy to fill a report out, but since you seem to know much more
> about the internals, I wonder if a bug report written by you would be
> more useful!

http://bugs.python.org/issue9658

Please do log in and add any extra comments you feel appropriate.

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: weakref.proxy behaviour in python 3.0

2010-08-21 Thread Mark Dickinson
On Aug 21, 1:13 pm, Nicholas Cole  wrote:
> I've searched for information on this without success.  Has
> weakref.proxy changed in Python 3?  I couldn't see any note in the
> documentation, but the following code behaves differently on Python
> 2.6.1 and Python 3:
>
> import weakref
> class Test(object): pass
>
> realobject = Test()
> pobject = weakref.proxy(realobject)
> l = [pobject,]
>
> print(realobject in l)   # On python 2.6 this prints False, on Python 3 True.

So the change underlying what you're seeing is that comparisons in 3.x
'unwrap' the proxy, while in 2.x they don't:

Python 2.7 (r27:82500, Aug 15 2010, 14:21:15)
[GCC 4.2.1 (Apple Inc. build 5664)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import weakref
>>> s = set()
>>> s == weakref.proxy(s)
False

Python 3.1.2 (r312:79147, Aug 20 2010, 20:06:00)
[GCC 4.2.1 (Apple Inc. build 5664)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import weakref
>>> s = set()
>>> s == weakref.proxy(s)
True

It looks to me as though this could be a long-standing defect in
Python 2.x, unwittingly(?) corrected in Python 3.x when 3-way
comparisons were removed in favour of rich comparisons.  See

  http://svn.python.org/view?view=rev&revision=51533

For 2.7, the proxy source (in Objects/weakrefobject.c) defines a
'proxy_compare' function that's used in the 'tp_compare' slot for
proxy objects, and that proxy_compare function has code to unwrap the
proxy objects when necessary so that comparisons are done on the real
underlying objects.

*But* C-level tp_compare slots only ever get called when both objects
have the same type, so when comparing a real object with the proxy for
that object, that's never.

In 3.x, that's replace with a proxy_richcompare function for the
tp_richcompare slot.

So my guess is that the change was unintentional.

It's probably worth a bug report.  Even if the behaviour isn't going
to change in either 2.x or 3.x (and it probably isn't), it might be
possible to clarify the docs.

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Assert statements in python 3.1

2010-08-20 Thread Mark Dickinson
On Aug 20, 6:13 am, genxtech  wrote:
> This is more of a curiosity question then anything else...  I was just
> wondering why in version 3 of python assertions weren't converted to
> use parenthesis, since print was.
>
> I am just asking because it seems the following line of code would
> seem more readable as a function:
>    assert 2 + 2 == 5, "Only for very large values of 2."

Well, part of the idea of asserts is that when you're running with
optimizations turned on (python -O), asserts should be disabled.  But
if assert were a normal function then in

assert(expensive_check)

the argument expensive_check would be evaluated both with 'python' and
with 'python -O'.
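
A tiny demonstration (save it as, say, check.py -- the filename is arbitrary):

def expensive_check():
    print("running the expensive check")
    return True

assert expensive_check(), "invariant violated"

# $ python check.py      -> prints "running the expensive check"
# $ python -O check.py   -> prints nothing: the assert and its argument
#                           are skipped entirely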

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: How to convert bytearray into integer?

2010-08-16 Thread Mark Dickinson
On Aug 16, 8:36 pm, Mark Dickinson  wrote:
> On Aug 16, 8:08 pm, Jacky  wrote:
> > My concern is that struct may need to parse the format string,
> > construct the list, and de-reference index=0 for this generated list
> > to get the int out.
>
> > There should be some way more efficient?
>
> Well, you can improve on the struct solution by using the
> struct.Struct class to avoid parsing the format string repeatedly:
>
> >>> import struct
> >>> S = struct.Struct('<i')
> >>> S.unpack_from(buffer(bytearray([1,2,3,4,5])))
>
> (67305985,)
>
> This doesn't make a huge difference on my machine (OS X 10.6.4, 64-bit
> build of Python 2.6) though;  it's probably more effective for long
> format strings.

Sorry, this was inaccurate:  this makes almost *no* significant
difference on my machine for large test runs (1 and up).  For
small ones, though, it's faster.  The reason is that the struct module
caches (up to 100, in the current implementation) previously used
format strings, so with your tests you're only ever parsing the format
string once anyway.  Internally, the struct module converts that
format string to a Struct object, and squirrels that Struct object
away into its cache, which is implemented as a dict from format
strings to Struct objects.  So the next time that the format string is
used it's simply looked up in the cache, and the Struct object
retrieved.

By the way, in Python 3.2 there's yet another fun way to do this,
using int.from_bytes.

>>> int.from_bytes(bytearray([1,2,3,4]), 'little')
67305985

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: How to convert bytearray into integer?

2010-08-16 Thread Mark Dickinson
On Aug 16, 8:08 pm, Jacky  wrote:
> Hi Thomas,
>
> Thanks for your comments!  Please check mine inline.
>
> On Aug 17, 1:50 am, Thomas Jollans  wrote:
>
> > On Monday 16 August 2010, it occurred to Jacky to exclaim:
>
> > > Hi there,
>
> > > Recently I'm facing a problem to convert 4 bytes on an bytearray into
> > > an 32-bit integer.  So far as I can see, there're 3 ways:
> > > a) using struct module,
>
> > Yes, that's what it's for, and that's what you should be using.
>
> My concern is that struct may need to parse the format string,
> construct the list, and de-reference index=0 for this generated list
> to get the int out.
>
> There should be some way more efficient?

Well, you can improve on the struct solution by using the
struct.Struct class to avoid parsing the format string repeatedly:

>>> import struct
>>> S = struct.Struct('<i')
>>> S.unpack_from(buffer(bytearray([1,2,3,4,5])))
(67305985,)

This doesn't make a huge difference on my machine (OS X 10.6.4, 64-bit
build of Python 2.6) though;  it's probably more effective for long
format strings. Adding:

def test_struct2(buf, offset, S=struct.Struct('<i')):
    return S.unpack_from(buffer(buf), offset)[0]

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Floating numbers

2010-08-13 Thread Mark Dickinson
On Aug 13, 3:03 pm, jmfauth  wrote:
> A quick question.
>
> I understand how to get these numbers
>
> 34.5231263880373081783294677734375
>
> and
>
> 47 (from 2**47)
>
> and the sign.
>
> with the decimal module, but I fail to find this one
>
> 4858258098025923
>
> Possible?

See the float.as_integer_ratio method.
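
For the numbers in this thread (assuming the usual IEEE 754 doubles), that
looks like:

>>> (34.52).as_integer_ratio()
(4858258098025923, 140737488355328)
>>> 2**47
140737488355328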

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Floating numbers

2010-08-13 Thread Mark Dickinson
On Aug 12, 9:43 pm, Bradley Hintze 
wrote:
> Hi all.
>
> Is there a way I can keep my floating point number as I typed it? For
> example, I want 34.52 to be 34.52 and NOT 34.520002.

Nitpick: unless you're on very unusual hardware, you're missing some
zeros here.  On my machine, under Python 2.6, the float 34.52 displays
as 34.520000000000003, and the value stored internally is actually
34.52000000000000312638803734444081783294677734375;  so that's within
1 part in 10**16 of the value you entered.

Why do you care?  That's a serious question, and its answer goes a
long way to determining what you should do.

 - If you're doing calculations with this number, then the difference
between the number Python stores and 34.52 is so miniscule that in
normal situations it's not going to matter.  In particular, if it
represents some physical quantity then any error in the representation
will be swamped by the inherent measurement error.  IOW, it's not
worth worrying about.

 - If you're printing this number, and you just want the output to
look nice (why?  perhaps because you're showing this to other
people?), then use float formatting operations to limit the number of
decimal places you're printing.  For example, '%.6f' % my_float, or
format(my_float, '.6f'), will give my_float to 6 places after the
decimal point.  Or, as others have mentioned, it just so happens that
Python 2.7 and 3.x will output a nice representation for this float
automatically.  That wouldn't necessarily be true if the result were
coming from a calculation, though, so you shouldn't rely on repr
producing nice results in those versions of Python.

 - If you *really* need a number that represents the *exact* value
34.52, then use the decimal module, or perhaps consider using a simple
home-brewed fixed-point representation.  One situation where you might
care is when doing financial calculations, and in particular when
rounding a quantity to a smaller number of decimal digits.  Here
binary floats can give unpredictable results in halfway cases. (E.g.,
round(2.675, 2) might give 2.68 or 2.67, depending on what version of
Python you're using, and also possibly depending on your platform.)
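
A quick sketch of that last option with the decimal module (note that
constructing a Decimal directly from a float needs Python 2.7 or later):

>>> from decimal import Decimal
>>> Decimal('34.52')     # the literal is stored exactly
Decimal('34.52')
>>> Decimal(34.52)       # constructing from the float shows the binary approximation
Decimal('34.52000000000000312638803734444081783294677734375')
>>> Decimal('2.675').quantize(Decimal('0.01'))   # decimal rounding is predictable
Decimal('2.68')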

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: ctypes' c_longdouble: underflow error (bug?)

2010-07-16 Thread Mark Dickinson
On Jul 16, 2:53 pm, kj  wrote:
> This is extremely confusing.  From my naive reading of the
> documentation, I would have expected that the following two blocks
> would produce identical results (expl is one of the standard C math
> library exponential functions, with signature "long double expl(long
> double)"):
>
> MATH.expl.argtypes = [c_longdouble]
> MATH.expl.restype = c_longdouble
> print MATH.expl(0)
>
> MATH.expl.argtypes = [c_double]
> MATH.expl.restype = c_double
> print MATH.expl(0)
>
> ...but no, they don't: the first one prints out the correct result,
> 1.0, while the second one prints out 0.0, of all things.  (In fact,
> with the second (mis)configuration, the value returned by MATH.expl
> is always equal to its argument, go figure.)

This is just a case of garbage in, garbage out.  In the second case
you're telling ctypes that the signature of expl is

double expl(double)

which just isn't true.  ctypes has no way of telling that in fact expl
takes a long double rather than a double.

> I find these results perplexing because, based on the docs, I
> expected that they *both* would be analogous to doing the following
> in C:
>
> printf("%f\n", (double) expl((double) 0.0)); /* prints out 1.00 */

No, not really.  Assuming that expl has been declared somewhere (e.g.
you've included math.h), the C compiler knows that expl takes a long
double, so it'll convert the "(double) 0.0" argument to a long double
before passing it to expl.

Your second Python+ctypes example would be equivalent to doing
something like this in C:

#include <stdio.h>
double expl(double x);  /* deliberate misdeclaration of expl */

int main(void) {
  printf("%f\n", expl(0.0));
  return 0;
}

which indeed produces 0.0 on my machine (gcc 4.2 / OS X 10.6).  And
the compiler helpfully issues a warning:

test.c:1: warning: conflicting types for built-in function ‘expl’

Actually, the warning is a bit surprising, because there's only one
(wrong) declaration for expl, so what is there for it to conflict
with?  The answer is that there are some commonly used library
functions that gcc has builtin versions for, so already knows the
signature of;  expl is one of these.  This is backed up by the fact
that when compiling with -fno-builtin-expl, no warning is issued.

> i.e., in *both* cases, expl would get passed a double (which gets
> automatically cast into a long double),

But how on earth would ctypes *know* it's supposed to convert
(nitpick: not cast) to a long double?  The function signature isn't
available to ctypes;  it's only through you setting .argtypes
and .restype that ctypes knows anything about it.

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: round issue

2010-07-12 Thread Mark Dickinson
Emile van Sebille  fenx.com> writes:

> 
> On 7/12/2010 2:52 AM Robin Becker said...
> 
> > What value should round(-9.85,1) return?
> 
> Per round's definition, -9.9.

No.  The float that's represented by the literal '-9.85' *isn't*
exactly -9.85, for all the usual binary floating-point reasons.
The value that gets stored is a tiny amount smaller (i.e., closer
to zero) than -9.85, so according to the definition it should
round *down*, to -9.8.  (Or rather, to a float that's very close
to, but not exactly equal to, -9.8.)

>  String interpolation for %n.mf doesn't 
> appear to define it's rounding behavior, so a peek at the source would 
> answer what's being done.

In Python 2.6, the string interpolation delegates to the system,
so it does whatever C string formatting does.  Usually that's
rounding to nearest, with exact halfway cases rounded to
even.  In Python 2.7 and 3.x, Python has its own code for
string formatting, and again it rounds to nearest, rounding
ties to even.

> It does look inconsistent however, and it seems to me rounding and 
> interpolation should behave similarly.

Agreed.  In this case it's a minor bug that round(-9.85, 1)
produces -9.9 instead of -9.8;  both string formatting
and round should give -9.8.  This bug is fixed in Python
2.7 and in Python 3.x.

Note that in 2.7 there's still a legitimate difference: round
rounds halfway cases away from 0, while string formatting
rounds them to even.  So the following results are correct:

Python 2.7 (r27:82500, Jul 11 2010, 22:38:53) 
[GCC 4.2.1 (Apple Inc. build 5659)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> round(1.25, 1)
1.3
>>> '%.1f' % 1.25
'1.2'

(1.25 *is* an exact halfway case, since it's exactly
representable as a binary float.)

In Python 3.x, round always does round-half-to-even, so
string formatting and round should agree (and if they don't,
it's definitely a bug: please report it!)

With all this said, asking for *decimal* rounding of
*binary* approximations to *decimal* halfway cases to give
the results you expect is ... optimistic, to say the least.
Use the decimal module if you care about which way
your (almost) halfway cases get rounded.

[I already replied to this earlier through Google groups, but
I'm not sure whether it went through properly.  Apologies
for the duplication, if so.]

-- 
Mark


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: ValueError: invalid literal for float(): -1.#IND (pickle.py)

2010-07-12 Thread Mark Dickinson
Alexander Eisenhuth  stacom-software.de> writes:

>File "C:\Python25\lib\pickle.py", line 954, in load_float
>  self.append(float(self.readline()[:-1]))
> ValueError: invalid literal for float(): -1.#IND
> 
> - I'm not sure what -1.#IND means. Can somebody assist?

It's the Windows way of representing a NaN (not a number).
A NaN is typically what you get when you try to perform an
invalid floating-point operation, like taking the square root
of a negative number, or dividing 0.0 by 0.0.  Some people
also use NaNs to represent uninitialized values, or as
placeholders for missing data.

> - As pickle write the data I'm a bit confused, that is can't
> be unpickled it. Is that a bug or a feature?

Well, it's certainly not ideal.  It shouldn't be a problem in
Python 2.6 or 2.7, though;  unfortunately,  Python 2.5 is no
longer receiving bugfixes, so it's not going to change there.

> BTW: I'm tied to version 2.5 of python

Have you tried using pickle protocol 1 or 2, instead of pickle
protocol 0?  That may well solve your  problem.  (Those
protocols write out the binary form of a float directly, instead
of reading and writing a string representation.)

-- 
Mark


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: round issue

2010-07-12 Thread Mark Dickinson
On Jul 12, 10:52 am, Robin Becker  wrote:
> What value should round(-9.85,1) return? Is the result explainable in python 
> (ie
> without resort to the internal FP representations etc etc)?

As you observe, the closest float to -9.85 is actually just a little
smaller (i.e., closer to 0) than -9.85:

Python 2.7 (r27:82500, Jul 11 2010, 22:38:53)
[GCC 4.2.1 (Apple Inc. build 5659)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import decimal
>>> decimal.Decimal(-9.85)
Decimal('-9.8499999999999996447286321199499070644378662109375')

So you're right: round(-9.85, 1) *should* return -9.8.  The 2.x (for x
<= 6) version of round is a bit deficient in that respect.
Internally, it's doing the obvious thing:  namely, multiplying by
10.0, rounding to the nearest integer, then dividing by 10.0.  The
problem is that this not-quite-9.85 value, when multiplied by 10,
becomes (as a result of rounding error) *exactly* 98.5, which then
gets rounded *up* to 99 instead of down to 98.
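
You can see that intermediate step directly:

>>> -9.85 * 10.0            # rounding error makes the product *exactly* -98.5
-98.5
>>> -9.85 * 10.0 == -98.5
True
>>> round(-98.5)            # 2.x round sends exact halves away from zero
-99.0

Dividing that -99 by 10 then gives the -9.9 you're seeing.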

This is fixed in Python 2.7, and in Python 3.x.  (The code that was
introduced for the new short float repr made it easy to fix.)

That said, if your client really *means* -9.85 (rather than some
binary approximation to it), and wants it to round in a predictable
manner, the right way to fix this would be to use Decimal instead of
float to represent the number.  That way, you can also specify what
rounding mode you want (instead of relying on the default round-half-
away-from-zero in 2.x or round-half-to-even in 3.x.)

>>> decimal.Decimal('-9.85').quantize(decimal.Decimal('0.1'), 
>>> rounding=decimal.ROUND_HALF_UP)
Decimal('-9.9')
>>> decimal.Decimal('-9.85').quantize(decimal.Decimal('0.1'), 
>>> rounding=decimal.ROUND_HALF_EVEN)
Decimal('-9.8')

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: integer >= 1 == True and integer.0 == False is bad, bad, bad!!!

2010-07-11 Thread Mark Dickinson
On Jul 11, 6:38 am, rantingrick  wrote:
> Seems kinda dumb to build a tuple just so a conditional wont blow
> chunks! This integer bool-ing need to be fixed right away!

Okay.  What fix do you propose?  Would your fix maintain the identity
"0 == False"? For bonus points, explain how you'd deal with any
backwards compatibility problems that your fix introduces.

Have you considered forking Python?  That may be the way forward here.

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python -- floating point arithmetic

2010-07-08 Thread Mark Dickinson
On Jul 8, 9:52 pm, Wolfram Hinderer 
wrote:
> JFTR, it works because a+b == a+b (while I don't think that a+b == b+a
> holds for all a and b).

Actually, that's one of the few identities that's safe.  Well, for non-
NaN IEEE 754 floating-point, at any rate.  And assuming that there's
no use of extended precision to compute intermediate results (in which
case one side could conceivably be computed with different precision
from the other;  but that applies to a+b == a+b, too).  And assuming
that no FPE signals occur and halt the program... (e.g., for overflow,
or from doing -inf + inf).  Okay, maybe not that safe, then.  :)

For NaNs, there are two problems:  first, if exactly one of a and b is
a NaN then a+b and b+a will both be NaNs, so a + b == b + a will be
false, even though they're identical.  Second, if a and b are *both*
NaNs, and you're feeling really fussy and care about signs and
payloads, then a + b and b + a won't even necessarily be bitwise
identical:  a + b will have the sign and payload of a, while b + a has
the sign and payload of b.  You can see something similar with the
Decimal module:

>>> Decimal('NaN123') + Decimal('-NaN456')
Decimal('NaN123')
>>> Decimal('-NaN456') + Decimal('NaN123')
Decimal('-NaN456')

Or more directly (just for fun), using the struct module to create
particular nans:

>>> import struct
>>> x = struct.unpack('<d', '\x01\x00\x00\x00\x00\x00\xf8\x7f')[0]
>>> y = struct.unpack('<d', '\x02\x00\x00\x00\x00\x00\xf8\xff')[0]
>>> x
nan
>>> y
nan
>>> struct.pack('<d', x + y)
'\x01\x00\x00\x00\x00\x00\xf8\x7f'
>>> struct.pack('<d', y + x)
'\x02\x00\x00\x00\x00\x00\xf8\xff'

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python -- floating point arithmetic

2010-07-08 Thread Mark Dickinson
On Jul 8, 2:59 pm, Stefan Krah  wrote:
> pow() is trickier. Exact results have to be weeded out before
> attempting the correction loop for correct rounding, and this is
> complicated.
>
> For example, in decimal this expression takes a long time (in cdecimal
> the power function is not correctly rounded):
>
> Decimal('100.0') ** Decimal('-557.71e-74288')

Hmm.  So it does.  Luckily, this particular problem is easy to deal
with.  Though I dare say that you have more up your sleeve. :)?

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python -- floating point arithmetic

2010-07-08 Thread Mark Dickinson
On Jul 8, 3:29 pm, Adam Skutt  wrote:
> On Jul 8, 9:22 am, Mark Dickinson  wrote:
> > On Jul 8, 2:00 pm, Adam Skutt  wrote:
> > > For some computations, the number of bits required to
> > > get the desired precision can quickly overwhelm the finite limitations
> > > of your machine (e.g., you run out of RAM first or the time to compute
> > > the answer is simply unacceptable).
>
> > Perhaps in theory.  In practice, though, it's very rare to need to
> > increase precision more than once or twice beyond an initial first
> > guesstimate, and the amount of extra precision needed is small.  That
> > increase is unlikely to cause problems unless you were operating right
> > up against your machine's limits in the first place.
>
> I suspect your platitude isn't especially comforting for those who
> need more computing capability than we can currently construct.
> However, I wouldn't call the amount of extra needed precision "small"

I think that's because we're talking at cross-purposes.

To clarify, suppose you want to compute some value (pi;  log(2);
AGM(1, sqrt(2)); whatever...) to 1000 significant decimal places.
Then typically the algorithm (sometimes known as Ziv's onion-peeling
method) looks like:

(1) Compute an initial approximation to 1002 digits (say), with known
absolute error (given by a suitable error analysis); for the sake of
argument, let's say that you use enough intermediate precision to
guarantee an absolute error of < 0.6 ulps.

(2) Check to see whether that approximation unambiguously gives you
the correctly-rounded 1000 digits that you need.

(3) If not, increase the target precision (say by 3 digits) and try
again.

It's the precision increase in (3) that I was calling small, and
similarly it's step (3) that isn't usually needed more than once or
twice.  (In general, for most functions and input values;  I dare say
there are exceptions.)

Step (1) will often involve using significantly more than the target
precision for intermediate computations, depending very much on what
you happen to be trying to compute.  IIUC, it's the extra precision in
step (1) that you don't want to call 'small', and I agree.

IOW, I'm saying that the extra precision required *due to the table-
maker's dilemma* generally isn't a concern.
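
If it helps to see the shape of that loop in code, here's a toy version
using the decimal module.  It substitutes a much cruder stopping rule for
the error analysis described above (recompute at a higher precision and
stop when two successive answers agree), so it illustrates the retry
structure rather than a watertight correctness argument:

from decimal import Decimal, localcontext

def ln2(digits):
    prec = digits + 5
    last = None
    while True:
        with localcontext() as ctx:
            ctx.prec = prec
            approx = Decimal(2).ln()
        with localcontext() as ctx:
            ctx.prec = digits
            rounded = +approx        # unary plus rounds to the target precision
        if rounded == last:
            return rounded
        last, prec = rounded, prec + 5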

I actually agree with much of what you've said.  It was just the
"impossible" claim that went over the top (IMO).  The MPFR library
amply demonstrates that computing many transcendental functions to
arbitrary precision, with correctly rounded results, is indeed
possible.

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python -- floating point arithmetic

2010-07-08 Thread Mark Dickinson
On Jul 8, 2:00 pm, Adam Skutt  wrote:
> On Jul 8, 7:23 am, Mark Dickinson  wrote:> On Jul 8, 
> 11:58 am, Adam Skutt  wrote:
>
> > > accurately.  Moreover, in general, it's impossible to even round
> > > operations involving transcendental functions to an arbitrary fixed-
> > > precision, you may need effectively infinite precision in order to the
> > > computation.
>
> > Impossible?  Can you explain what you mean by this?  Doesn't the
> > decimal module do exactly that, giving correctly-rounded exp() and
> > log() results to arbitrary precision?
>
> You run into the table-maker's dilemma: there's no way to know in
> advance how many digits you need in order to have n bits of precision
> in the result.

Sure.  But it's a bit of a stretch to go from not knowing what
resources you'll need in advance to calling something 'impossible'. :)

> For some computations, the number of bits required to
> get the desired precision can quickly overwhelm the finite limitations
> of your machine (e.g., you run out of RAM first or the time to compute
> the answer is simply unacceptable).

Perhaps in theory.  In practice, though, it's very rare to need to
increase precision more than once or twice beyond an initial first
guesstimate, and the amount of extra precision needed is small.  That
increase is unlikely to cause problems unless you were operating right
up against your machine's limits in the first place.

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python -- floating point arithmetic

2010-07-07 Thread Mark Dickinson
On Jul 7, 1:05 pm, david mainzer  wrote:
> Dear Python-User,
>
> today i create some slides about floating point arithmetic. I used an
> example from
>
> http://docs.python.org/tutorial/floatingpoint.html
>
> so i start the python shell on my linux machine:
>
> d...@maxwell $ python
> Python 2.6.5 (release26-maint, May 25 2010, 12:37:06)
> [GCC 4.3.4] on linux2
> Type "help", "copyright", "credits" or "license" for more information.>>> >>> 
> sum = 0.0
> >>> >>> for i in range(10):
>
> ...     sum += 0.1
> ...>>> >>> sum
> 0.99989
>
> But thats looks a little bit wrong for me ... i must be a number greater
> then 1.0 because 0.1 = 
> 0.1000000000000000055511151231257827021181583404541015625000
> in python ... if i print it.

So you've identified one source of error here, namely that 0.1 isn't
exactly representable (and you're correct that the value stored
internally is actually a little greater than 0.1).  But you're
forgetting about the other source of error in your example: when you
do 'sum += 0.1', the result typically isn't exactly representable, so
there's another rounding step going on.  That rounding step might
produce a number that's smaller than the actual exact sum, and if
enough of your 'sum += 0.1' results are rounded down instead of up,
that would easily explain why the total is still less than 1.0.
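
Incidentally, if what you want is the correctly-rounded sum, math.fsum
(available in 2.6 and later) keeps track of those per-addition errors for
you:

>>> sum(0.1 for i in range(10)) == 1.0
False
>>> import math
>>> math.fsum(0.1 for i in range(10)) == 1.0
True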

>
> So i create an example program:
>
> sum = 0.0
> n = 10
> d = 1.0 / n
> print "%.60f" % ( d )
> for i in range(n):
>     print "%.60f" % ( sum )
>     sum += d
>
> print sum
> print "%.60f" % ( sum )
>
>  RESULTs --
> 0.100000000000000005551115123125782702118158340454101562500000
> 0.000000000000000000000000000000000000000000000000000000000000
> 0.100000000000000005551115123125782702118158340454101562500000
> 0.200000000000000011102230246251565404236316680908203125000000
> 0.300000000000000044408920985006261616945266723632812500000000
> 0.400000000000000022204460492503130808472633361816406250000000
> 0.500000000000000000000000000000000000000000000000000000000000
> 0.599999999999999977795539507496869191527366638183593750000000
> 0.699999999999999955591079014993738383054733276367187500000000
> 0.799999999999999933386618522490607574582099914550781250000000
> 0.899999999999999911182158029987476766109466552734375000000000
> 1.0
> 0.999999999999999888977697537484345957636833190917968750000000
>
> and the jump from 0.50*** to 0.5999* looks wrong
> for me ... do i a mistake or is there something wrong in the
> representation of the floating points in python?

Look at this more closely:  you're adding

0.5000

to

0.1000000000000000055511151231257827021181583404541015625

The *exact* result is, of course

0.6000000000000000055511151231257827021181583404541015625

However, that's not a number that can be exactly represented as a C
double (which is how Python stores floats internally).  This value
falls between the two (consecutive) representable values:

0.59999999999999997779553950749686919152736663818359375

and

0.600000000000000088817841970012523233890533447265625

But of these two, the first is closer to the exact value than the
second, so that's the result that you get.

You can convince yourself of these results by using the fractions
module to do exact arithmetic:

>>> from fractions import Fraction
>>> tenth = Fraction.from_float(0.1)
>>> half = Fraction.from_float(0.5)
>>> point_six = Fraction.from_float(0.6)   # 0.59999999999999998
>>> point_six_plus = Fraction.from_float(0.6 + 2**-53)  # next float up, 0.60000000000000009
>>> sum = tenth + half  # exact value of the sum
>>> point_six < sum < point_six_plus  # lies between point_six and point_six_plus
True
>>> sum - point_six < point_six_plus - sum  # but it's closer to point_six
True


> my next question, why could i run
>
> print "%.66f" % ( sum )
>
> but not
>
> print "%.67f" % ( sum )

That's a historical artefact resulting from use of a fixed-length
buffer somewhere deep in Python's internals.  This restriction is
removed in Python 2.7 and Python 3.x.

> can anybody tell me how python internal represent a float number??

In CPython, it's stored as a C double, which typically means in IEEE
754 binary64 format.  (Of course, since it's a Python object, it's not
just storing the C double itself;  it also has fields for the object
type and the reference count, so a Python float typically takes 16
bytes of memory on a 32-bit machine, and 24 bytes on a 64-bit
machine.)
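
You can check that overhead yourself with sys.getsizeof (2.6 and later);
the exact figure depends on your build:

>>> import sys
>>> sys.getsizeof(1.0)   # 24 on a typical 64-bit build, 16 on 32-bit
24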

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: SyntaxError not honoured in list comprehension?

2010-07-05 Thread Mark Dickinson
On Jul 5, 7:12 pm, jmfauth  wrote:
> BTW, if I understand correctly the module tokenize import
> the module token. So your example becomes:
>
> >>> from cStringIO import StringIO
> >>> import tokenize
> >>> for tok in tokenize.generate_tokens(StringIO("print9.0").readline):
>
>         print tokenize.tok_name[tok[0]], tok[1]

Ah yes;  you're right.  Thanks!

Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: SyntaxError not honoured in list comprehension?

2010-07-05 Thread Mark Dickinson
On Jul 4, 11:02 pm, John Machin  wrote:
> On Jul 5, 1:08 am, Thomas Jollans  wrote:
>
> > On 07/04/2010 03:49 PM, jmfauth wrote:
> > >   File "", line 1
> > >     print9.0
> > >            ^
> > > SyntaxError: invalid syntax
>
> > somewhat strange, yes.
>
> There are two tokens, "print9" (a name) and ".0" (a float constant) --
> looks like SyntaxError to me.

Yep.  Looks that way to me, too.

Python 2.7.0+ (release27-maint:82569, Jul  5 2010, 08:35:08)
[GCC 4.0.1 (Apple Inc. build 5490)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from cStringIO import StringIO
>>> import tokenize, token
>>> for tok in tokenize.generate_tokens(StringIO("print9.0").readline):
...     print token.tok_name[tok[0]], tok[1]
...
NAME print9
NUMBER .0
ENDMARKER

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: SyntaxError not honoured in list comprehension?

2010-07-04 Thread Mark Dickinson
On Jul 4, 9:55 am, Mark Dickinson  wrote:
> Why?  If Python itself has no problem parsing this code, why should it
> be so difficult for editors?  Python's grammar is fairly simple:  it's
> LL(1) (unlike C's, for example), so can be parsed with only 1 token of
> lookahead.

Bah.  Ignore the bit about C.  I was thinking that the dangling else
issue made this a problem, but now I'm not sure that's true.

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: SyntaxError not honoured in list comprehension?

2010-07-04 Thread Mark Dickinson
On Jul 4, 9:31 am, jmfauth  wrote:
> Python all versions.
>
> It's not a bug, but I'm suprised the following does
> not raise a SyntaxError (missing space between
> '9' and 'for').
>
>
>
> >>> [9for c in 'abc']
> [9, 9, 9]
>
> Side effect: If this behaviour is considered as correct,
> it makes a correct Python code styling (IDLE, editors, ...)
> practically impossible to realise.

Why?  If Python itself has no problem parsing this code, why should it
be so difficult for editors?  Python's grammar is fairly simple:  it's
LL(1) (unlike C's, for example), so can be parsed with only 1 token of
lookahead.

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: a +b ?

2010-06-14 Thread Mark Dickinson
On Jun 13, 5:46 pm, exar...@twistedmatrix.com wrote:
> On 04:25 pm, wuwe...@gmail.com wrote:
>
>
>
> >Steven D'Aprano  wrote:
> >>No, I think your code is very simple. You can save a few lines by
> >>writing
> >>it like this:
>
> >>s = input('enter two numbers: ')
> >>t = s.split()
> >>print(int(t[0]) + int(t[1]))  # no need for temporary variables a and
> >>b
>
> >Not that we're playing a round of code golf here, but this is a
> >slightly nicer take on your version:
>
> >one, two = input('enter two numbers: ').split()
> >print(int(one) + int(two))
>
> >I like names over subscripts, but that's just me :)
>
> Fore!
>
>     print(sum(map(int, input('enter two numbers: ').split())))
>
> Jean-Paul

58 characters.  You could remove the space after 'int' to make it 57.
Here's an evil 56-character version...

print(eval(input('enter two numbers: ').replace(*' +')))

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: math.erfc OverflowError

2010-06-13 Thread Mark Dickinson
On Jun 13, 12:56 am, geremy condra  wrote:
> On Sat, Jun 12, 2010 at 4:40 PM, Robert Kern  wrote:
> > On 2010-06-12 17:49 , geremy condra wrote:
>
> >> In Python3.2, calling math.erfc with a value in [-27.2, -30) raises
> >> an OverflowError: math range error. This is inconsistent with the
> >> erfc function from scipy (scipy.special.erfc) as well as with the C99
> >> function by the same name, both of which return 2. I suspect that
> >> this is the result of the cutoff for the use of the continuing fraction
> >> approximation of erfc beginning when abs(x)>  30, but I'm not sure.
> >> Is this desired behavior or should I file a bug report?
>
> > Bug, I think.
>
> Bug filed, http://bugs.python.org/issue8986.

A bug indeed.  Many thanks for catching and reporting this, Geremy;
it would have been embarrassing for Python 2.7 to be released with
this bug in it.  (Well, I would have been embarrassed, anyway; not
sure about anyone else.)

In case anyone cares, the bug was due to a misinterpreted "errno"
value:  there's an "exp(-x*x)" term in the formulas for erf(x) and
erfc(x), and that term underflows to zero at abs(x) =
sqrt(1075*log(2)), which is around 27.297.  Some system math libraries
(unfortunately not including the ones I tested on;  I *did* test,
honest!) set errno when the result underflows to zero;  this errno value was then
being misinterpreted as an indication of overflow by Python's general
libm wrappings.
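
The underflow itself is easy to reproduce; it was the interpretation of
the resulting errno that went wrong:

>>> import math
>>> math.exp(-27.5 ** 2)    # just past the ~27.297 cutoff, so this underflows
0.0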

Anyway, it's fixed now.

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Decimal problem

2010-06-10 Thread Mark Dickinson
On Jun 10, 8:45 pm, durumdara  wrote:
> ne 91, in fixed_conv_out_precise
>     from decimal import Decimal
> ImportError: cannot import name Decimal

Is it possible that you've got another file called decimal.py
somewhere in Python's path?  What happens if you start Python manually
and type 'from decimal import Decimal' at the prompt?

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Mixing Decimal and float

2010-06-02 Thread Mark Dickinson
On Jun 2, 9:24 am, "B.V."  wrote:
> Hi,
>
> In order to solve some issues due to operations between Decimal and
> float, we wanted to implement a class that inherits from both float
> and Decimal.
>
> Typically, we wrote:
>
> class Float(Decimal, float):

Can you explain exactly what issues you want to solve, and how you
want your Float class to behave?  Do I understand correctly that you
want your Float class to be able to represent both floats and
Decimals?

> But we also need to do:
> isinstance(Float('1'), float) == True
> isinstance(Float('1'), Decimal) == True

Can you explain why you need this?

Should isinstance(Float('1.1'), float) and isinstance(Float('1.1'),
Decimal) also both be true, or would only one of those be true?  (And
by the way, what value would Float('1.1') have?  float('1.1') and
Decimal('1.1') are different values.)

I don't think your approach can succeed;  I'd suggest just subclassing
'object' and abandoning the 'isinstance' requirements.  Or perhaps
creating a subclass of Decimal that interacts nicely with floats.  You
might also want to investigate the numbers ABC, though that's new in
Python 2.6.

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Question about permutations (itertools)

2010-05-31 Thread Mark Dickinson
On May 31, 3:04 pm, Vincent Davis  wrote:
> For example If I want all possible ordered lists of 0,1 of length 3
> (0,0,0)
> (0,0,1)
> (0,1,1)
> (1,1,1)
> (1,0,1)
> (1,1,0)
> (1,0,0)
> I don't see a way to get this directly from the itertools. But maybe I
> am missing something.

In this case, you're missing itertools.product:

>>> list(itertools.product([0, 1], repeat=3))
[(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1), (1, 0, 0), (1, 0, 1), (1,
1, 0), (1, 1, 1)]

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: A Friday Python Programming Pearl: random sampling

2010-05-29 Thread Mark Dickinson
On May 29, 3:43 pm, Bryan  wrote:
> Mark Dickinson wrote:
> > N.B.  I don't claim any originality for the algorithm; only for the
> > implementation: the algorithm is based on an algorithm attributed to
> > Robert Floyd, and appearing in Jon Bentley's 'Programming Pearls' book
>
> Actually it is the sequel, /More Programming Pearls/.

Thanks for the correction.  I confess that I've yet to read either
book;  I'll have to try to track them down.

> > (though that algorithm produces a set, so doesn't worry about the
> > ordering of the sample).
>
> Bentley presents a version of the Floyd algorithm that provides random
> order, but it requires a set data type with some idea of order, as in
> "insert j in s after t".

Ah, nice.  The dict values, of course, exactly provide the necessary
idea of order, so I guess this amounts to pretty much the same thing.

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


A Friday Python Programming Pearl: random sampling

2010-05-28 Thread Mark Dickinson
For a lazy Friday evening, here's a Python algorithm that seemed so
cute that I just had to share it with everyone.  I'm sure it's well
known to many here, but it was new to me.  Skip directly to the
'sample2' function to see the algorithm and avoid the commentary...

Suppose that you want to select a number of elements, k, say, from a
population, without replacement.  E.g., selecting 3 elements from
range(30) might give you:

[13, 3, 27]

Order matters, so the above is considered distinct from [3, 13, 27].
And you want to be sure that each possible selection has equal
probability of occurring (to within the limits of the underlying
PRNG).

One solution is to select elements from the population one-by-one,
keep track of the indices of already-selected elements in a set, and
if you end up selecting something that's already in your set, simply
try again.  Something like this (code stolen and adapted from
Random.sample in Python's standard library 'random' module):

from random import randrange

def sample1(population, k):
    n = len(population)
    result = [None] * k
    selected = set()
    for i in range(k):
        j = randrange(n)
        # retry until we get something that's not already selected
        while j in selected:
            j = randrange(n)
        selected.add(j)
        result[i] = population[j]
    return result


N.B.  The above is Python 3 code; for Python 2, replace range with
xrange.

All that's required of 'population' here is that it implements __len__
and __getitem__.  The method works well for k significantly smaller
than n, but as k approaches n the number of reselections required
increases.  So for larger k, Random.sample uses a different method:
roughly, make a copy of 'population', do a partial in-place shuffle of
that copy that randomizes the first k elements, and return those.
This second method isn't so great when k is small and n is huge, since
it ends up being O(n) from the list copy, but it works out that the
two methods complement each other nicely.

Looking at the above code, I was idly wondering whether there was a
way to alter 'sample1' to avoid the need for resampling, thus giving a
single algorithm that works reasonably efficiently regardless of the
population size and requested sample size.  And it turns out that
there is.  The code below is similar to 'sample1' above, except that
instead of using a set to keep track of indices of already-selected
members of the population, it uses a dict; for an index i
(corresponding to a member of the population), d[i] gives the position
that population[i] will occupy in the resulting sample.


from random import randrange

def sample2(population, k):
    n = len(population)

    d = {}
    for i in reversed(range(k)):
        j = randrange(i, n)
        if j in d:
            d[i] = d[j]
        d[j] = i

    result = [None] * k
    for j, i in d.items():
        result[i] = population[j]
    return result


Note that no resampling is required, and that there's no copying of
the population list.  The really clever bit is the 'if j in d: ...'
block.  If you stare at the algorithm for long enough (and it does
take some staring), you can convince yourself that after the first
'for' loop, d can be any of the n*(n-1)*...*(n-k+1)
mappings-with-no-repeated-elements from some set of k elements of
range(n) to range(k), and that each one of these mappings is equally
likely to occur.  In a sense, this d is the inverse of the desired
sample, which would be a map with no repetitions from range(k) to
range(n).  So inverting d, and replacing d's keys by the corresponding
population elements, gives the random sample.
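
A quick (and decidedly non-rigorous) empirical check, using the sample2
function above:

from collections import Counter

counts = Counter(tuple(sample2(range(5), 2)) for _ in range(100000))
print(len(counts))                                 # 20 == 5 * 4 ordered pairs
print(min(counts.values()), max(counts.values()))  # frequencies should be roughly equal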

N.B.  I don't claim any originality for the algorithm; only for the
implementation: the algorithm is based on an algorithm attributed to
Robert Floyd, and appearing in Jon Bentley's 'Programming Pearls' book
(though that algorithm produces a set, so doesn't worry about the
ordering of the sample).  But I was struck by its beauty and
simplicity, and thought it deserved to be better known.

Happy Friday!

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Can upper() or lower() ever change the length of a string?

2010-05-24 Thread Mark Dickinson
On May 24, 1:13 pm, Steven D'Aprano  wrote:
> Do unicode.lower() or unicode.upper() ever change the length of the
> string?
>
> The Unicode standard allows for case conversions that change length, e.g.
> sharp-S in German should convert to SS:
>
> http://unicode.org/faq/casemap_charprop.html#6
>
> but I see that Python doesn't do that:
>
> >>> s = "Paßstraße"
> >>> s.upper()
>
> 'PAßSTRAßE'
>
> The more I think about this, the more I think that upper/lower/title case
> conversions should change length (at least sometimes) and if Python
> doesn't do so, that's a bug. Any thoughts?

Digging a bit deeper, it looks like these methods are using the
Simple_{Upper,Lower,Title}case_Mapping functions described at
http://www.unicode.org/Public/5.1.0/ucd/UCD.html fields 12, 13 and 14
of the unicode data;  you can see this in the source in Tools/unicode/
makeunicodedata.py, which is the Python code that generates the
database of unicode properties.  It contains code like:

if record[12]:
    upper = int(record[12], 16)
else:
    upper = char
if record[13]:
    lower = int(record[13], 16)
else:
    lower = char
if record[14]:
    title = int(record[14], 16)

... and so on.

I agree that it might be desirable for these operations to product the
multicharacter equivalents.  That idea looks like a tough sell,
though:  apart from backwards compatibility concerns (which could
probably be worked around somehow), it looks as though it would
require significant effort to implement.

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Can upper() or lower() ever change the length of a string?

2010-05-24 Thread Mark Dickinson
On May 24, 1:13 pm, Steven D'Aprano  wrote:
> Do unicode.lower() or unicode.upper() ever change the length of the
> string?

>From looking at the source, in particular the fixupper and fixlower
functions in Objects/unicode.c [1], it looks like not:  they do a
simple character-by-character replacement.

[1] http://svn.python.org/view/python/trunk/Objects/unicodeobject.c?view=markup
--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: function that counts...

2010-05-19 Thread Mark Dickinson
On May 19, 9:30 pm, superpollo  wrote:
> René 'Necoro' Neumann wrote:
> > An idea would be:
>
>  def prttn(m, n):
> > ...        return sum(1 for x in range(n) if sum(map(int, str(x))) == m)
>
> TypeError: 'int' object is not callable
>
> on 2.5.4

The TypeError is almost certainly because you've created an integer
'sum' variable in your script/interpreter session, hiding the built-in
sum function.
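
That is, something along these lines has happened earlier in the session:

>>> sum = 10                    # oops: this hides the builtin
>>> sum(1 for x in range(5))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'int' object is not callable
>>> del sum                     # the builtin is visible again
>>> sum(1 for x in range(5))
5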

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Fastest way to calculate leading whitespace

2010-05-09 Thread Mark Dickinson
On May 9, 6:13 am, Steven D'Aprano  wrote:
> On Sat, 08 May 2010 13:46:59 -0700, Mark Dickinson wrote:
> >> However, s[:-len(t)] should be both faster and correct.
>
> > Unless len(t) == 0, surely?
>
> Doh! The hazards of insufficient testing. Thanks for catching that.

I have a love-hate relationship with the negative index semantics for
exactly this reason: code like 'x[-n]' always seems smelly to me.
It's often not what the code author actually wanted, except when n is
guaranteed strictly positive for some reason.  'x[-1]' is fine, of
course.
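
For example, the case that bit here is n == 0 (an all-whitespace string):

>>> s = "    "        # all whitespace, so lstrip() removes everything
>>> t = s.lstrip()
>>> s[:-len(t)]       # -0 is just 0, so this is s[:0], not the whole string
''
>>> s[:len(s) - len(t)]
'    '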

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Fastest way to calculate leading whitespace

2010-05-08 Thread Mark Dickinson
On May 8, 8:46 pm, Steven D'Aprano  wrote:
> On Sat, 08 May 2010 12:15:22 -0700, Wolfram Hinderer wrote:
> > On 8 Mai, 20:46, Steven D'Aprano 
> > wrote:
>
> >> def get_leading_whitespace(s):
> >>     t = s.lstrip()
> >>     return s[:len(s)-len(t)]
>
> >> >>> c = get_leading_whitespace(a)
> >> >>> assert c == leading_whitespace
>
> >> Unless your strings are very large, this is likely to be faster than
> >> any other pure-Python solution you can come up with.
>
> > Returning s[:-1 - len(t)] is faster.
>
> I'm sure it is. Unfortunately, it's also incorrect.
>
> >>> z = "*abcde"
> >>> z[:-1-5]
> ''
> >>> z[:len(z)-5]
>
> '*'
>
> However, s[:-len(t)] should be both faster and correct.

Unless len(t) == 0, surely?

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: long int computations

2010-05-06 Thread Mark Dickinson
On May 3, 9:49 pm, s...@sig.for.address (Victor Eijkhout) wrote:
> Jerry Hill  wrote:
> > >>> from __future__ import division
> > >>> long1/long2
> > 0.5
>
> Beautiful. Thanks so much guys.

And if for some reason you don't want to use the 'from __future__'
import, then you can do long1.__truediv__(long2):

>>> n = 765*10**1000 + 1
>>> n.__truediv__(n+1)
1.0

If you care about speed at all, I'd avoid the Fractions solution;  it
does an expensive and slow gcd computation.

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Engineering numerical format PEP discussion

2010-04-27 Thread Mark Dickinson
On Apr 27, 2:16 am, Keith  wrote:
> On Apr 26, 8:47 pm, MRAB  wrote:
>
> > "t" for "powers of a thousand", perhaps? (Or "m"?)
>
> Both of those letters are fine.  I kinda like "m" for the whole Greco-
> Roman angle, now that you point it out :-)

By the way, there's already a feature request open for this:

http://bugs.python.org/issue8060

That might be a good place to hash out the precise semantics.  If you
could provide unit tests and/or an implementation that would likely
help move the issue along.

Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Engineering numerical format PEP discussion

2010-04-26 Thread Mark Dickinson
On Apr 26, 6:47 am, Keith  wrote:
> From that document it appears that my decimal.Decimal(1234567) example
> shows that the module has a bug:
>
> Doc says:
> [0,123,3] ===>  "123E+3"
>
> But Python does:>>> import decimal
> >>> decimal.Decimal(123000).to_eng_string()
>
> '123000'

That's not a bug.  The triple [0,123,3] is Decimal('123e3'), which is
not the same thing as Decimal('123000').  The former has an exponent
of 3 (in the language of the specification), while the latter has an
exponent of 0.

>>> decimal.Decimal('123e3').to_eng_string()
'123E+3'
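
If you really do want the engineering-style output for the value 123000,
you can give it exponent 3 explicitly, for example by quantizing:

>>> decimal.Decimal('123000').quantize(decimal.Decimal('1e3'))
Decimal('1.23E+5')
>>> decimal.Decimal('123000').quantize(decimal.Decimal('1e3')).to_eng_string()
'123E+3'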

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Engineering numerical format PEP discussion

2010-04-26 Thread Mark Dickinson
On Apr 26, 4:36 am, Keith  wrote:
> I am considering writing a PEP for the inclusion of an engineering
> format specifier, and would appreciate input from others.

> [...]

> I am thinking that if we simply added something like %n (for eNgineer)
> to the list of format specifiers that we could make life easier for
> engineers:
>
> ("%n" % 12345)  == "12.345e+03"
> ("%n" %  1234)  == "1.234e+03"
> ("%n" %   123)  == "123e+00"
> ("%n" % 1.2345e-5)  == "12.345e+06"

I don't think there's much chance of getting changes to old-style
string formatting accepted;  you might be better off aiming at the new-
style string formatting.  (And there, the 'n' modifier is already
taken for internationalization, so you'd have to find something
different. :)

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: ctypes question

2010-04-14 Thread Mark Dickinson
On Apr 14, 7:09 pm, Brendan Miller  wrote:
> I'm using python 2.5.2.
>
> I have a ctypes function with argtypes like this:
>
> _create_folder.argyptes = [c_void_p, c_int]

Is that line a cut-and-paste?  If so, try 'argtypes' instead of
'argyptes'.  :)

> The issue I am having is that I can call it like this
>
> _create_folder(some_pointer, "asdf")
>
> and it won't raise a TypeError. Why would it accept a string for an
> integer argument?

I get the expected TypeError in 2.5.4 (OS X) (using printf, for lack
of anything better):

Python 2.5.4 (r254:67916, Feb 11 2010, 00:50:55)
[GCC 4.2.1 (Apple Inc. build 5646)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from ctypes import *
>>> libc = CDLL("libc.dylib")
>>> libc

>>> printf = libc.printf
>>> printf.argtypes = [c_char_p, c_int]
>>> printf("%d\n", 53)
53
3
>>> printf("%d\n", "asdf")
Traceback (most recent call last):
  File "", line 1, in 
ctypes.ArgumentError: argument 2: : wrong
type

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Why these don't work??

2010-04-08 Thread Mark Dickinson
On Apr 8, 7:10 pm, "M. Hamed" 
wrote:
> I'm trying the following statements that I found here and there on
> Google, but none of them works on my Python 2.5, are they too old? or
> newer?
>
> "abc".reverse()

This isn't valid Python in any version that I'm aware of.  Where did
you see it?  It wouldn't make a lot of sense anyway, since by analogy
with list.reverse you'd expect it to reverse the given string in
place.  But that's not possible in Python, because strings are
immutable.

Maybe you're looking for something like:

>>> reversed("abc")


which works in versions of Python >= 2.4.

> import numpy

For this to work, you need to have numpy installed.  numpy is a third-
party package that isn't part of the standard Python distribution;
for more information, see:

http://numpy.scipy.org/

The best method for installing numpy would depend on your system, and
on where you got Python from.  On OS X, the system Python comes with
numpy as standard, for example.  On Linux, there's probably a python26-
numpy package (or something with a similar name) that you can
install.  On Windows:  no idea.  :)

Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Writing tests for the Python bug tracker

2010-03-20 Thread Mark Dickinson
On Mar 20, 6:52 am, Steven D'Aprano  wrote:
> I've found this:
>
> http://docs.python.org/library/test.html
>
> and I've written a small test:
>
> $ cat test_unicode_interpolation.py
> # For testing http://bugs.python.org/issue8128
>
> import test.test_support
> import unittest
>
> class K(unicode):
>     def __str__(self): return "Surprise!"
>
> class UnicodeInterpolationTest(unittest.TestCase):
>     def test_interpolation(self):
>         self.assertEquals(u'%s' % K('some text'), 'Surprise!')
>
> def test_main():
>     test.test_support.run_unittest(UnicodeInterpolationTest)
>
> if __name__ == "__main__":
>     test_main()

This looks like a fine start to me.  I have a feeling that the current
fashion is for assertEqual rather than assertEquals, but I might be
wrong.  :)

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Writing tests for the Python bug tracker

2010-03-20 Thread Mark Dickinson
On Mar 20, 6:23 am, Steven D'Aprano  wrote:
> I have two reported bugs in the bug tracker waiting on tests:
>
> http://bugs.python.org/issue8128
> http://bugs.python.org/issue4037
>
> Are there any guidelines for writing tests for the standard library and
> language?

Not that I can think of, beyond those you've already mentioned.  I
mostly just copy the style of existing tests (though there are
definitely some test_*** files that aren't particularly well written).

For quick questions, you might get good answers by asking on the
#python-dev freenode IRC channel:  a good few of the people interested
in testing (esp. Michael Foord, Ezio Melotti) can often be found
there.

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Calculating very large exponents in python

2010-03-10 Thread Mark Dickinson
On Mar 9, 1:54 pm, casevh  wrote:
> After a few hours, the remaining factors are
>
> 6060517860310398033985611921721
>
> and
>
> 9941808367425935774306988776021629111399536914790551022447994642391
>
> casevh

Whoops---I missed this.  I'm too slow!  But at least my answers agree
with yours.  (Factoring 10**78+1 took around 7 seconds using GP/Pari
on a 2.5 GHz MacBook;  factoring the remaining quotient n / (10**78+1)
was much quicker.)

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Calculating very large exponents in python

2010-03-10 Thread Mark Dickinson
On Mar 9, 6:39 am, casevh  wrote:
> [also replying to Geremy since the OP's message doesn't appear...]
>
> On Mar 8, 11:05 am, geremy condra  wrote:
>
>
>
>
>
> > On Mon, Mar 8, 2010 at 2:15 AM, Fahad Ahmad  wrote:
> > > Thanks Geremy,
>
> > > That has been an absolute bump... GOD i cant sit on my chair, it 
> > > has
> > > worked even on 512 bit number and with no time..
> > > superb i would say.
>
> > > lastly, i am using the code below to calculate Largest Prime factor of a
> > > number:
>
> > > print
> > > ('''===
> > >  ­'''
> > >    '''  CALCULATE  HIGHEST PRIME
> > > FACTOR  '''
>
> > > '''
> > >  ­===''')
>
> > > #!/usr/bin/env python
> > > def highest_prime_factor(n):
> > >    if isprime(n):
> > >   return n
> > >    for x in xrange(2,n ** 0.5 + 1):
> > >   if not n % x:
> > >  return highest_prime_factor(n/x)
> > > def isprime(n):
> > >    for x in xrange(2,n ** 0.5 + 1):
> > >   if not n % x:
> > >  return False
> > >    return True
> > > if  __name__ == "__main__":
> > >    import time
> > >    start = time.time()
> > >    print highest_prime_factor(1238162376372637826)
> > >    print time.time() - start
>
> > > the code works with a bit of delay on the number : "1238162376372637826" 
> > > but
> > > extending it to
> > > (109026109913291424366305511581086089650628117463925776754560048454991130443047109026109913291424366305511581086089650628117463925776754560048454991130443047)
> > >  makes python go crazy. Is there any way just like above, i can have it
> > > calculated it in no time.
>
> > > thanks for the support.
>
> > If you're just looking for the largest prime factor I would suggest using
> > a fermat factorization attack. In the example you gave, it returns
> > nearly immediately.
>
> > Geremy Condra
>
> For a Python-based solution, you might want to look at pyecm (http://
> sourceforge.net/projects/pyecm/)
>
> On a system with gmpy installed also, pyecm found the following
> factors:
>
> 101, 521, 3121, 9901, 36479, 300623, 53397071018461,
> 1900381976777332243781
>
> There still is a 98 digit unfactored composite:
>
> 60252507174568243758911151187828438446814447653986842279796823262165159406500174226172705680274911
>
> Factoring this remaining composite using ECM may not be practical.
>
> casevh

The complete factorization is: 101 x 521 x 3121 x 9901 x 36479 x
300623 x 53397071018461 x 1900381976777332243781 x
6060517860310398033985611921721 x
9941808367425935774306988776021629111399536914790551022447994642391

It helps if you notice that the digits of the original 156-digit
number come from concatenating a 78-digit string to itself, giving an
immediate factor of 10**78 + 1.  (Oops.  Perhaps this was supposed to
be a secret back door to the OP's crypto scheme.  I've given it away
now. :))
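
The general fact is easy to check: concatenating any 78-digit string to
itself multiplies it by 10**78 + 1, so for example:

>>> s = '1234567890' * 7 + '12345678'    # an arbitrary 78-digit block
>>> len(s)
78
>>> int(s + s) % (10**78 + 1)
0

The 156-digit number above picks up its factor of 10**78 + 1 in exactly
the same way.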

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Calculating very large exponents in python

2010-03-08 Thread Mark Dickinson
[Replying to Geremy's reply because the OP's post didn't show up in my
newsreader.]
On Mar 7, 8:40 pm, geremy condra  wrote:
> On Sun, Mar 7, 2010 at 1:55 PM, Fahad Ahmad  wrote:
> > Dear All,
>
> > i am writing my crytographic scheme in python, i am just a new user to it.
> > I have written the complete code, the only point i am stuck it is that i am
> > using 256 exponentiation which is normal in crytography but python just
> > hangs on it.
>
> > g**x  [where both g and x are 256 bit numbers , in decimal they are around
> > 77]

No library can solve this problem.  If g and x are both 256-bit
numbers then the result of g**x will have on the order of 10**79 bits,
which matches estimates of the number of particles in the universe.  I
can only imagine that you actually want g**x % m for some m, in which
case three-argument pow is your friend, as Geremy pointed out.
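
For example, with made-up 256-bit values just to show the shape of the
call:

>>> import random
>>> g = random.getrandbits(256)
>>> x = random.getrandbits(256)
>>> m = 2**256 - 189           # any modulus of the right size will do
>>> y = pow(g, x, m)           # fast modular exponentiation; this is instant

Spelled as pow(g, x, m), the computation never materializes the
astronomically large intermediate result g**x.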

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Bizarre arithmetic results

2010-02-23 Thread Mark Dickinson
On Feb 23, 8:11 am, Steven D'Aprano
 wrote:
> Making spaces significant in that fashion is mind-bogglingly awful. Let's
> look at a language that does this:
>
> [st...@sylar ~]$ cat ws-example.rb
> def a(x=4)
>     x+2
> end
>
> b = 1
> print (a + b), (a+b), (a+ b), (a +b), "\n"
>
> [st...@sylar ~]$ ruby ws-example.rb
> 7773

Hmm.  That's pretty nasty, all right.  Not that Python can claim to be
immune to such behaviour:

>>> 3 .real
3
>>> 3. real
  File "", line 1
3. real
  ^
SyntaxError: invalid syntax


Though the fact that one of the cases raises an exception (rather than
silently giving some different behaviour) ameliorates things a bit.

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: formatting a number as percentage

2010-02-21 Thread Mark Dickinson
On Feb 21, 5:53 pm, vsoler  wrote:
> I'm trying to print .7 as 70%
> I've tried:
>
> print format(.7,'%%')
> .7.format('%%')
>
> but neither works. I don't know what the syntax is...

Assuming that you're using Python 2.6 (or Python 3.x):

>>> format(.7, '%')
'70.00%'
>>> format(.7, '.2%')
'70.00%'

Or see TomF's response for how to use this with the str.format method.

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Precision issue in python

2010-02-20 Thread Mark Dickinson
On Sat, Feb 20, 2010 at 2:42 PM, Shashwat Anand
 wrote:
> A quick solution I came out with, no stirling numbers and had tried to avoid
> large integer multiplication as much as possible.
>
> import math
>
> for i in range(int(raw_input())):
>     n, k, l = [int(i) for i in raw_input().split()]
>     e = sum(math.log10(i) for i in range(1, n+1))
> frac_e = e - math.floor(e)

This isn't going to give you enough accuracy when n gets large (and
the original problem seems to allow n to be as large as 10**8), for
the reasons I described earlier---that is, Python floats only give you
around 16 decimal digits of precision;  your 'e' value will already
have used up some of those 16 digits before the point, leaving you
fewer than 16 digits of precision after the point, so the absolute
error in frac_e will likely be larger than 1e-15.  That's not good
enough for getting the first 15 digits of 10**frac_e accurately.  For
large n, you'll also be accumulating significant error in the sum.

> a = str(10**frac_e * 10**(k - 1)).split('.')[0]

The str(...).split('.') here doesn't do a good job of extracting the
integer part when its argument is >= 1e12, since Python produces a
result in scientific notation.  I think you're going to get strange
results when k >= 13.

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Precision issue in python

2010-02-20 Thread Mark Dickinson
On Feb 20, 3:37 pm, mukesh tiwari 
wrote:
> I don't know if is possible to import this decimal module but kindly
> tell me.Also a bit about log implementation

The decimal module is part of the standard library;  I don't know what
the rules are for SPOJ, but you're already importing the math module,
so at least *some* imports are obviously permitted.  Here's a quick
demonstration of how you might use the decimal module:  you can
probably find ways to tweak it for speed and accuracy.

from decimal import Decimal as D
from decimal import getcontext

getcontext().prec = 100   # working precision = 100 digits

pi = D('3.14159265358979323846264338327950288419716939937510582097494459'
       '2307816406286208998628034825342117067982148086513282306647093844')
half = D('0.5')
log = D.ln

def logfac(x):
    """Approximation to log(x!), using first few terms of Stirling's
    series."""

    x = D(x)
    return log(2*pi)/2 + (x + half)*log(x) - x + \
        1/(12*x) - 1/(360*x**3) + 1/(1260*x**5)

def fac_first_digits(n, k):
    """Get first k decimal digits of n!."""

    log10_nfac = logfac(n)/log(D(10))
    frac = log10_nfac - int(log10_nfac)
    return int(10**(frac - 1 + k))


With the above code I get, for example (with Python 2.6):

>>> fac_first_digits(12345, 15)
344364246918678
>>> from math import factorial
>>> int(str(factorial(12345))[:15])
344364246918678

For small n, you'll need more terms of Stirling's series than I've
used above.  And there's always a small chance that the intermediate
errors involved in the computation might change those first 15
digits;  with a working precision of 100 and lots of terms of
Stirling's series it's a *very* small chance, but it's there.  If you
want something 100% watertight, you'd need to  compute strict upper
and lower bounds for the true result, and increase precision until
those bounds match sufficiently far that you can be sure of the first
k digits being correct.

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Precision issue in python

2010-02-20 Thread Mark Dickinson
On Feb 20, 11:17 am, mukesh tiwari 
wrote:
> Hello everyone. I think it is  related to the precision with double
> arithmetic so i posted here.I am trying with this problem 
> (https://www.spoj.pl/problems/CALCULAT) and the problem say that "Note : for
> all test cases whose N>=100, its K<=15." I know precision of doubles
> in c is 16 digits. Could some one please help me with this precision
> issue.I used stirling (http://en.wikipedia.org/wiki/
> Stirling's_approximation) to calculate the first k digits of N.
> Thank you.

If I understand you correctly, you're trying to compute the first k
digits in the decimal expansion of N!, with bounds of k <= 15 and 100
<= N < 10**8.  Is that right?

Python's floats simply don't give you enough precision for this:
you'd need to find the fractional part of log10(N!) with >= 15 digits
of precision.  But the integral part needs ~ log10(log10(N!)) digits,
so you end up needing to compute log10(N!) with at least 15 +
log10(log10(N!)) ~ 15 + log10(N) + log10(log10(N)) digits (plus a few
extra digits for safety);  so that's at least 25 digits needed for N
close to 10**8.

The decimal module would get you the results you need (are you allowed
imports?).  Or you could roll your own log implementation based on
integer arithmetic.

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python Optimization

2010-02-14 Thread Mark Dickinson
On Feb 14, 4:53 pm, mukesh tiwari 
wrote:
> Hello everyone. I am new to python and previously i did programming in
> c/c++.Could some one please help me to improve the run time for this
> python program as i don't have idea how to optimized this code.This
> code also seems to be more unpythonic so how to make it look like more
> pythonic . I am trying for this problem(https://www.spoj.pl/problems/
> FACT1/).
> Thank you

One other thing:  in the 'brent' function, you're setting m to
randrange(1, n).  What's the purpose of this?  It looks to me as
though m controls the number of Pollard-Rho iterations that are
clumped together at one time (before doing a gcd operation), and using
a random number for this doesn't make a lot of sense to me.

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python Optimization

2010-02-14 Thread Mark Dickinson
On Feb 14, 6:03 pm, a...@pythoncraft.com (Aahz) wrote:
> In article 
> <363498c7-3575-4f1e-ad53-d9cd10c8d...@q16g2000yqq.googlegroups.com>,
> Mark Dickinson   wrote:
>
> >(2) Obvious things: use range rather than xrange in your loops.  
>
> Um, what?  You meant the reverse, surely?

Er, yes I did.  Thanks!

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python Optimization

2010-02-14 Thread Mark Dickinson
On Feb 14, 4:53 pm, mukesh tiwari 
wrote:
> Hello everyone. I am new to python and previously i did programming in
> c/c++.Could some one please help me to improve the run time for this
> python program as i don't have idea how to optimized this code.
> [...]

How much of a speedup do you need?  Are we talking about orders of
magnitude (in which case you might want to consider using something
like the Multiple Polynomial Quadratic Sieve method instead, or as
well), or just a few percent?

(1) Have you profiled the code to see where it's spending most of its
time?  This is an essential first step.

(2) Obvious things: use range rather than xrange in your loops.  Make
sure that all heavily used variables/functions are local to the
function you're using them in.  E.g., you use range, min and abs in
the middle of the 'brent' function.  Instead, start the brent function
by setting _abs, _range, _min = abs, range, min, and then use _abs,
_range, etc. instead.  Lookups of local variables are faster than
globals.
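
Not the OP's function, just the shape of the trick:

def hot_loop(values, n):
    _abs, _min = abs, min   # one global lookup each, done outside the loop
    total = 0
    for v in values:
        total = (total + _abs(v)) % n
    return _min(total, n - total)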

(3) In the inner loop:

for i in range(min(m,r-k)):
    y,q=(y*y+c)%n,q*abs(x-y)%n

you can get probably rid of the abs call.  It *might* also help to
avoid the tuple packing/unpacking (but see (1)).  You could try doing
a couple of steps at a time instead of one (i.e., unroll the loop a
little bit);  one advantage is that you probably don't need to bother
reducing q modulo n at every step;  every other step would be good
enough.  Depending on the relative speed of multiplication and
reduction, and the sizes of the integers involved, this might save
time.
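
A rough sketch of what (3) might look like (my code, not the OP's; it
assumes m is even, and dropping abs is safe because flipping the sign of
x - y only turns q into n - q, which has the same gcd with n):

def two_step_block(x, y, c, n, m):
    q = 1
    for _ in range(m // 2):
        y = (y * y + c) % n
        q *= x - y                # defer the reduction ...
        y = (y * y + c) % n
        q = q * (x - y) % n       # ... and reduce once per two multiplies
    return q % n, y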

(4) Have you tried using Montgomery reduction in the Brent method?
The inner loop of that method involves two reductions modulo n, which
may well be where the biggest bottleneck is.  But see (1). The other
obvious bottleneck is the gcd method;  if profiling shows that that's
the case, there might be ways to speed that up, too.  (E.g., use a
binary gcd algorithm, or use Lehmer's method.)
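
For reference, a plain-Python binary (Stein) gcd along the lines mentioned
in (4) -- my sketch; whether it actually beats the ordinary Euclidean gcd
in pure Python is exactly the kind of thing the profiling in (1) should
decide:

def binary_gcd(a, b):
    if a == 0:
        return b
    if b == 0:
        return a
    shift = 0
    while (a | b) & 1 == 0:   # strip common factors of two
        a >>= 1
        b >>= 1
        shift += 1
    while a & 1 == 0:
        a >>= 1
    while b:
        while b & 1 == 0:
            b >>= 1
        if a > b:
            a, b = b, a
        b -= a
    return a << shift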

Good luck!
--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Bizarre arithmetic results

2010-02-11 Thread Mark Dickinson
On Feb 11, 1:38 pm, Duncan Booth  wrote:
> Tim Chase  wrote:
> > But perhaps Py3 changes evaluation, returning an complex number.
>
> Yes, the change is documented 
> at http://docs.python.org/3.1/reference/expressions.html#the-power-operator
>
> If it is in any of the "What's new in Python x.xx" documents or in a PEP
> somewhere I haven't spotted it.

Not in the 'what's new' documents, as far as I can tell, but this
change was introduced as part of implementing PEP 3141.

http://www.python.org/dev/peps/pep-3141/

Here's an extract from the PEP, describing the 'Complex' abstract base
class:


class Complex(Number):
    """Complex defines the operations that work on the builtin complex
    type.

    In short, those are: conversion to complex, bool(), .real, .imag,
    +, -, *, /, **, abs(), .conjugate(), ==, and !=.

    If it is given heterogenous arguments, and doesn't have special
    knowledge about them, it should fall back to the builtin complex
    type as described below.
    """



    @abstractmethod
    def __pow__(self, exponent):
        """a**b; should promote to float or complex when necessary."""
        raise NotImplementedError
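
In concrete terms the change looks like this (my example; the printed
values are approximate):

z = (-8.0) ** 0.5
print(z)       # roughly 1.73e-16 + 2.83j on Python 3
print(z * z)   # roughly -8, with a tiny imaginary part left over from rounding

whereas on Python 2 the same expression raises ValueError rather than
promoting to complex.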


--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Bizarre arithmetic results

2010-02-11 Thread Mark Dickinson
On Feb 11, 12:44 am, Terrence Cole  wrote:
> Can someone explain to me what python is doing here?

> >>> -0.1 ** 0.1
> -0.7943282347242815

Here you're computing -(0.1 ** 0.1).  The exponentiation operator
binds more strongly than the negation operator.

> >>> a = -0.1; b = 0.1
> >>> a ** b
> (0.7554510437117542+0.2454609236416552j)

Here you're computing (-0.1) ** 0.1.
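
A quick way to see the precedence at work (my example; the Python 3 values
echo the ones quoted above):

print(-0.1 ** 0.1)      # -(0.1 ** 0.1): about -0.7943
print(-(0.1 ** 0.1))    # the same expression, parenthesized explicitly
print((-0.1) ** 0.1)    # about 0.7555+0.2455j on Python 3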

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Ternary plus

2010-02-10 Thread Mark Dickinson
On Feb 10, 8:31 am, Mark Dickinson  wrote:
> And here's how it's used in the decimal.Context module:

Aargh!  decimal.Context *class*, not module.

And it occurs to me that it would have been cleaner to have
Decimal.__add__ call Context.add rather than the other way around.
Then Decimal.__add__ could have stayed a two-argument function, as
 intended.  Oh well.

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Ternary plus

2010-02-10 Thread Mark Dickinson
On Feb 9, 6:47 pm, Martin Drautzburg  wrote:
> BTW I am not really trying to add three objects, I wanted a third object
> which controls the way the addition is done. Sort of like "/" and "//"
> which are two different ways of doing division.

That seems like a reasonable use case for a third parameter to
__add__, though as others have pointed out the only way to pass the
third argument is to call __add__ explicitly.  Here's an extract from
the decimal module:

class Decimal(object):

...

    def __add__(self, other, context=None):
        other = _convert_other(other)
        if other is NotImplemented:
            return other

        if context is None:
            context = getcontext()



...

And here's how it's used in the decimal.Context module:

class Context(object):

...

    def add(self, a, b):
        """Return the sum of the two operands.

        >>> ExtendedContext.add(Decimal('12'), Decimal('7.00'))
        Decimal('19.00')
        >>> ExtendedContext.add(Decimal('1E+2'), Decimal('1.01E+4'))
        Decimal('1.02E+4')
        """
        return a.__add__(b, context=self)
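
So the only way to reach that third parameter is an explicit method call,
or equivalently the Context method.  A small example (note: the pure-Python
decimal module of this era exposes the context argument on __add__; the
C-accelerated decimal in later Pythons doesn't, though Context.add behaves
the same):

from decimal import Decimal, Context

ctx = Context(prec=3)
a, b = Decimal('1.23456'), Decimal('2.34567')
print(a + b)          # uses the current thread's context (prec 28): 3.58023
print(ctx.add(a, b))  # rounded to 3 significant digits: 3.58
# a.__add__(b, context=ctx) is the call ctx.add makes internally
# (pure-Python decimal only).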

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: max / min / smallest float value on Python 2.5

2010-02-07 Thread Mark Dickinson
On Feb 7, 8:45 pm, duncan smith  wrote:

[...]
> interested, but the following pseudo-python gives the idea.  For an
[...]

>              try:
>                  yield rand() < exp(dF / temp)

Practically speaking, the condition rand() < exp(dF / temp) is never
going to be satisfied if dF / temp < -40  (in fact, the output of
rand() is always an exact multiple of 2**-53, so the condition rand()
< exp(-40) is identical to the condition rand() == 0.0, which should
occur for one random sample out of every 9 thousand million million or
so).
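
The numbers, for anyone who wants to check (values approximate):

import math
print(math.exp(-40))   # about 4.25e-18
print(2.0 ** -53)      # about 1.11e-16: random.random() only ever returns
                       # multiples of this, so exp(-40) lies below its
                       # smallest nonzero value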

So assuming that your fitness delta dF can't get smaller than 1e-16 or
so in absolute value (which seems reasonable, given that dF is
presumably the result of subtracting two numbers of 'normal'
magnitude), there would be little point having temp go much smaller
than, say, 1e-20.

IOW, I agree with Steven:  2.2e-308 seems extreme.

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: max / min / smallest float value on Python 2.5

2010-02-07 Thread Mark Dickinson
On Feb 7, 12:52 am, duncan smith 
wrote:
> import platform
> if platform.architecture()[0].startswith('64'):
>      TINY = 2.2250738585072014e-308
> else:
>      TINY = 1.1754943508222875e-38

As Christian said, whether you're using 32-bit or 64-bit shouldn't
make a difference here.  Just use the first TINY value you give.

> I'm not 100% sure how reliable this will be across platforms.  Any ideas
> about the cleanest, reliable way of uncovering this type of information?

In practice, it's safe to assume that your 2.225e-308 value is
reliable across platforms.  That value is the one that's appropriate
for the IEEE 754 binary64 format, and it's difficult these days to
find CPython running on a machine that uses any other format for C
doubles (and hence for Python floats).

The smallest positive *normal* number representable in IEEE 754
binary64 is exactly 2**-1022 (or approximately
2.2250738585072014e-308).  The smallest positive *subnormal* number
representable is exactly 2**-1074, or approximately
4.9406564584124654e-324.  (Subnormals have fewer bits of precision
than normal numbers;  it's the presence of subnormals that allows for
'gradual underflow'.)  Some machines can/will treat subnormal numbers
specially for speed reasons, either flushing a subnormal result of a
floating-point operation to 0, or replacing subnormal inputs to a
floating-point operation with 0, or both.  So for maximal portability,
and to avoid numerical problems, it's best to avoid the subnormal
region.
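
These values are also available directly (Python 2.6 and later); a quick
check, with the exact reprs varying a little between Python versions:

import sys, math
print(sys.float_info.min)         # 2.2250738585072014e-308, smallest positive normal
tiniest = math.ldexp(1.0, -1074)  # 2**-1074, smallest positive subnormal
print(tiniest)                    # shows as 5e-324 (or 4.9406...e-324 on older Pythons)
print(tiniest == sys.float_info.min * sys.float_info.epsilon)   # True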

> The precise issue is that I'm supplying a default value of
> 2.2250738585072014e-308 for a parameter (finishing temperature for a
> simulated annealing algorithm) in an application.  I develop on
> Ubuntu64, but (I am told) it's too small a value when run on a Win32
> server.  I assume it's being interpreted as zero and raising an
> exception.

This is a bit surprising.  What's the precise form of the error you
get?  Do you still get the same error if you replace your TINY value
by something fractionally larger?  (E.g., 2.23e-308.)

--
Mark
-- 
http://mail.python.org/mailman/listinfo/python-list

