Re: [Python-Dev] ctypes and win64

2006-08-19 Thread Tim Peters
[Steve Holden]
>> Reasonable enough, but I suspect that Thomas' suggestion might save us
>> from raising false hopes. I'd suggest that the final release
>> announcement point out that this is the first release containing
>> specific support for 64-bit architectures (if indeed it is)

[Martin v. Löwis]
> It isn't. Python ran on 64-bit Alpha for nearly a decade now (I guess),
> and was released for Win64 throughout Python 2.4. ActiveState has
> been releasing an AMD64 package for some time now.

Python has also been used on 64-bit Crays, and I actually did the
first 64-bit port in 1993 (to a KSR Unix machine -- took less than a
day to get it running fine!  Guido's an excellent C coder.).  Win64 is
the first (and probably, forever more, the only) platform where sizeof(long) <
sizeof(void*), and that caused some Win64-unique pain, and may cause
more.
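
For anyone who wants to see that difference from Python, here's a quick
sketch using ctypes (assuming a build new enough to ship it):

    import ctypes
    # Win64 is LLP64: a C long stays 4 bytes while pointers are 8 bytes.
    print ctypes.sizeof(ctypes.c_long), ctypes.sizeof(ctypes.c_void_p)
    # Win64 prints "4 8"; typical LP64 64-bit Unix boxes print "8 8".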

BTW, at least two of the people at the NFS sprint earlier this year
were compiling and running Python on Win64 laptops.  It's "solid
enough", and surely nobody expects that Win64 users expect 100%
perfection of anything they run <0.5 wink>.


Re: [Python-Dev] Can LOAD_GLOBAL be optimized to a simple array lookup?

2006-08-24 Thread Tim Peters
Note that there are already three PEPs related to speeding dict-based
namespace access; start with:

http://www.python.org/dev/peps/pep-0280/

which references the other two.

The "normal path" through dict lookups got faster since there was a
rash of those, to the extent that more complication elsewhere got much
less attractive.  It's possible that dict lookups got slower again
since /then/, though.
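
As a rough way to see what those PEPs are chasing (the snippet is my own
sketch, numbers purely illustrative):  a module-global access compiles to
LOAD_GLOBAL, a dict probe per use, while a function local compiles to
LOAD_FAST, a simple array index:

    from timeit import Timer

    # 1000 reads of a module global (dict-based namespace)
    glob = Timer("for i in r: x", setup="x = 1; r = range(1000)")
    # 1000 reads of a function local (array-based fast locals)
    loc = Timer("f()", setup="def f(x=1, r=range(1000)):\n    for i in r: x")
    print min(glob.repeat(3)), min(loc.repeat(3))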


Re: [Python-Dev] Need help with test_mutants.py

2006-08-24 Thread Tim Peters
[Guido]
> There's a unit test "test_mutants" which I don't understand. If anyone
> remembers what it's doing, please contact me -- after ripping out
> dictionary ordering in Py3k,

Is any form of dictionary comparison still supported, and, if so, what
does "dict1 cmp_op dict2" mean now?

> it stops working.

Traceback?

> In particular, the code in test_one() requires changes, but I don't
> know how... Please help!

The keys and values of dict1 and dict2 are filled with objects of a
user-defined class whose __cmp__ method randomly mutates dict1 and
dict2.  dict1 and dict2 are initially forced to have the same number
of elements, so in current Python:

c = cmp(dict1, dict2)

triggers a world of pain, with the internal dict code doing fancy
stuff comparing keys and values.  However, every key and value
comparison /may/ mutate the dicts in arbitrary ways, so this is
testing whether the dict comparison implementation blows up
(segfaults, etc) when the dicts it's comparing mutate during
comparison.
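
A stripped-down sketch of the idea (not the real test_mutants code -- the
names and sizes here are made up):

    import random

    mutate = False              # only mutate once the comparison starts

    class Mutator:
        def __hash__(self):
            return 69           # constant hash -> lots of key comparisons
        def __cmp__(self, other):
            if mutate:          # mutate a dict in the middle of comparing it
                d = random.choice((dict1, dict2))
                if d:
                    d.popitem()
            return cmp(id(self), id(other))

    dict1, dict2 = {}, {}
    for i in range(10):
        dict1[Mutator()] = Mutator()
        dict2[Mutator()] = Mutator()

    mutate = True
    try:
        c = cmp(dict1, dict2)   # any ordinary result or exception is fine;
    except Exception:           # the only "failure" is an interpreter crash
        pass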

If it's only ordering comparisons that have gone away for dicts, then,
e.g., replacing

c = cmp(dict1, dict2)

with

c = dict1 == dict2

instead will still meet the test's intent.

No particular /result/ is expected.  The test passes if and only if
Python doesn't crash.  When the test was introduced, it uncovered at
least six distinct failure (crashing) modes across the first 20 times
it was run, so it's well worth keeping around in some form.


Re: [Python-Dev] gcc 4.2 exposes signed integer overflows

2006-08-26 Thread Tim Peters
[David Hopwood]
>> CPython should be fixed anyway. The correct fix is
>> "if (y == -1 && x < 0 && (unsigned long)x == -(unsigned long)x)".

Note that this was already suggested in the bug report.

[Thomas Wouters]
> Why not just "... && x == LONG_MIN"?

In full,

if (y == -1 && x == LONG_MIN)

"should work" too.  In practice we try to avoid numeric symbols from
platform header files because so many platforms have screwed these up
over the centuries (search for LONG_BIT or HUGE_VAL ;-)), and because
it's better (when possible) not to tie the code to the fact that `x` was
specifically declared as type "long" (e.g., just more stuff that will
break if Python decides to make its short int of type PY_LONG_LONG
instead).  In this specific case, there may also have been a desire to
avoid generating a memory load for a fat constant.  However, since
this is integer division, in real life (outside the test suite) we'll
never go beyond the "y == -1" test.
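
For the record, the one overflow case being guarded against is easy to see
from Python (sketch; assumes the usual 2.x setup where a plain int is a C
long under the covers):

    import sys
    x = -sys.maxint - 1    # the most negative C long; the only x where x/-1 overflows
    print x / -1           # CPython detects the overflow and hands back a long
    print x / -1 == sys.maxint + 1    # True; in raw C, x / -1 is undefined here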


Re: [Python-Dev] gcc 4.2 exposes signed integer overflows

2006-08-26 Thread Tim Peters
[David Hopwood]
> (CPython has probably only been tested on 2's complement systems anyway,

Definitely so.  Are there any boxes using 1's-comp or sign-magnitude
integers anymore?  Python assumes 2's-comp in many places.

> but if we're going to be pedantic about depending only on things in the
> C standard...)

No, in that respect we're driven by the silliest decisions made by C
compiler writers ;-)


Re: [Python-Dev] gcc 4.2 exposes signed integer overflows

2006-08-27 Thread Tim Peters
[Anthony Baxter]
> Regardless of whether we consider gcc's behaviour to be correct or not,

It is correct, but more to the point it's, umm, /there/ ;-)

> I do agree we need a fix for this in 2.5 final. That should also be 
> backported to
> release24-maint for the 2.4.4 release, and maybe release23-maint, as Barry
> recently started talking about cutting a 2.3.6.
>
> Can I nominate Tim, with his terrifying knowledge of C compiler esoterica, as
> the person to pick the best fix?

It's a bitch.  Changing to

if (y == -1 && x < 0 && (unsigned long)x == -(unsigned long)x)

is the obvious fix, but violates our "no warnings" policy:  the MS
compiler warns about applying unary minus to an unsigned operand -- it
"looks insane" to /their/ compiler writers ;-).  Elegant patch below
-- LOL.

Would be nice if someone verified it worked on a box where it matters.
 Would also be nice if people checked to see whether their compiler(s)
warn about something else now.

Index: Objects/intobject.c
===
--- Objects/intobject.c (revision 51618)
+++ Objects/intobject.c (working copy)
@@ -564,8 +564,14 @@
"integer division or modulo by zero");
return DIVMOD_ERROR;
}
-   /* (-sys.maxint-1)/-1 is the only overflow case. */
-   if (y == -1 && x < 0 && x == -x)
+   /* (-sys.maxint-1)/-1 is the only overflow case.  x is the most
+    * negative long iff x < 0 and, on a 2's-complement box, x == -x.
+    * However, -x is undefined (by C) if x /is/ the most negative long
+    * (it's a signed overflow case), and some compilers care.  So we cast
+    * x to unsigned long first.  However, then other compilers warn about
+    * applying unary minus to an unsigned operand.  Hence the weird "0-".
+    */
+   if (y == -1 && x < 0 && (unsigned long)x == 0-(unsigned long)x)
return DIVMOD_OVERFLOW;
xdivy = x / y;
xmody = x - xdivy * y;


Re: [Python-Dev] gcc 4.2 exposes signed integer overflows

2006-08-29 Thread Tim Peters
[Thomas Wouters]
>>> Why not just "... && x == LONG_MIN"?

[Tim Peters]
>> it's better (when possible) not to tie the code to that `x` was
>> specifically declared as type "long" (e.g., just more stuff that will
>> break if Python decides to make its short int of type PY_LONG_LONG
>> instead).

[Armin Rigo]
> The proposed "correct fix" breaks this goal too:
>
> > >> "if (y == -1 && x < 0 && (unsigned long)x == -(unsigned long)x)".
>
>^^

Yup, although as noted before the proposed fix actually writes

== 0-(unsigned long)x

at the tail end instead (to avoid compiler warnings at least under MS C).

It doesn't run afoul of the other criterion you snipped from the start
of the quoted paragraph:

In practice we try to avoid numeric symbols from platform header files
because so many platforms have screwed these up over the centuries
(search for LONG_BIT or HUGE_VAL ;-)), 

This is the wrong time in the release process to take a chance on
discovering a flaky LONG_MIN on some box, so I want to keep the code
as much as possible like what's already there (which worked fine for >
10 years on all known boxes) for now.

Speaking of which, I saw no feedback on the proposed patch in

http://mail.python.org/pipermail/python-dev/2006-August/068502.html

so I'll just check that in tomorrow.


Re: [Python-Dev] 32-bit and 64-bit python on Solaris

2006-08-29 Thread Tim Peters
[Laszlo (Laca) Peter]
> I work in the team that delivers python on Solaris.  Recently we've
> been getting requests for delivering python in 64-bit as well
> as in 32-bit.  As you probably know, Solaris can run 64-bit and
> 32-bit binaries on the same system, but of course you can't mix and
> match shared objects with different ISAs.  This seems to apply to
> python bytecode as well: the pyc files generated by a 64-bit build
> of python are incompatible with those generated by the 32-bit python.
> Note the caveat at
> http://aspn.activestate.com/ASPN/docs/ActivePython/2.3/python/lib/module-marshal.html

Which caveat, specifically?  As it says there, the only known problem
was fixed in Python 2.2:

This behavior is new in Python 2.2. In earlier versions, all but the
least-significant 32 bits of the value were lost, and a warning message
was printed.

> I guess my first question is if there are any plans to make the
> bytecodes for different ISAs compatible.  That would make most of
> our problems magically go away (;

I suspect they already have ;-)  There are no plans to make marshal
store a Python long object on a 64-bit box for integers that fit in 64
bits but not in 32 bits, and there would be very little point to
doing so.  As the referenced page says, you get the same numeric value
regardless.  It's /possible/ to write Python code to detect the
difference in type, but real code wouldn't do that.
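
If anyone wants to see the type (not value) difference being talked about,
here's a hedged sketch (behavior as I understand the 2.4-era marshal --
double-check on your own builds):

    import marshal
    data = marshal.dumps(2 ** 40)    # fits in 64 bits but not in 32
    print type(marshal.loads(data)), marshal.loads(data)
    # A 32-bit build round-trips this as a long; a 64-bit build can give a
    # plain int.  Either way the value is 1099511627776 -- same number,
    # possibly different type.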

> ...


Re: [Python-Dev] Cross-platform math functions?

2006-09-04 Thread Tim Peters
[Andreas Raab]
> I'm curious if there is any interest in the Python community to achieve
> better cross-platform math behavior. A quick test[1] shows a
> non-surprising difference between the platform implementations.
> Question: Is there any interest in changing the behavior to produce
> identical results across platforms (for example by utilizing fdlibm
> [2])? Since I have need for a set of cross-platform math functions I'll
> probably start with a math-compatible fdlibm module (unless somebody has
> done that already ;-)

Package a Python wrapper and see how popular it becomes.  Some reasons
against trying to standardize on fdlibm were explained here:

http://mail.python.org/pipermail/python-list/2005-July/290164.html

Bottom line is I suspect that when it comes to bit-for-bit
reproducibility, fewer people care about that x-platform than care
about it x-language on the box they use.  Nothing wrong with different
modules for people with different desires.


Re: [Python-Dev] gcc 4.2 exposes signed integer overflows

2006-09-04 Thread Tim Peters
[Tim Peters]
>> Speaking of which, I saw no feedback on the proposed patch in
>>
>> http://mail.python.org/pipermail/python-dev/2006-August/068502.html
>>
>> so I'll just check that in tomorrow.

[Anthony Baxter]
> This should also be backported to release24-maint and release23-maint. Let me
> know if you can't do the backport...

Done in rev 51711 on the 2.5 branch.

Done in rev 51715 on the 2.4 branch.

Done in rev 51716 on the trunk, although in the LONG_MIN way (which is
less obscure, but a more "radical" code change).

I don't care about the 2.3 branch, so leaving that to someone who
does.  Merge rev 51711 from the 2.5 branch.  It will generate a
conflict on Misc/NEWS.  Easiest to revert Misc/NEWS then and just
copy/paste the little blurb from 2.5 news at the appropriate place:

"""
- Overflow checking code in integer division ran afoul of new gcc
  optimizations.  Changed to be more standard-conforming.
"""


Re: [Python-Dev] Fwd: Problem withthe API for str.rpartition()

2006-09-05 Thread Tim Peters
upto, sep, rest

in whatever order they apply.  I think of a partition-like function as
starting at some position and matching "up to" the first occurrence of
the separator (be that left or right or diagonally, "up to" is
relative to the search direction), and leaving "the rest" alone. The
docs should match that, since my mental model is correct ;-)


Re: [Python-Dev] Cross-platform math functions?

2006-09-06 Thread Tim Peters
[Tim Peters]
>> Package a Python wrapper and see how popular it becomes.  Some reasons
>> against trying to standardize on fdlibm were explained here:
>>
>>http://mail.python.org/pipermail/python-list/2005-July/290164.html

[Andreas Raab]
> Thanks, these are good points. About speed, do you have any good
> benchmarks available?

Certainly not for "typical Python use" -- doubt such a benchmark
exists.  Some people use  sqrt once in a blue moon, others make heavy
use of many libm functions over millions & millions of floats, and in
some apps extremely heavy use is made where speed is everything and
accuracy doesn't much matter at all (e.g., gross plotting).

I'd ask on numeric Python lists, and (e.g.) people working with visualization.

> In my experience fdlibm is quite reasonable for speed in the context of use
> by dynamic languages (i.e., counting allocation overheads, lookup and send
> performance etc)

"Reasonable" for which purpose(s), specifically?  Some people would
certainly care about a 5% slowdown, while most others wouldn't, but
one thing to avoid is pissing off the people who use a thing the most
;-)

> but since I'm not a Python expert I'd appreciate some help with realistic
> benchmarks.

As above, python-dev isn't a likely place to look for such answers.

> ...
> Agreed. Thus my question if someone had already done this ;-)

Not that I know of, although my understanding (which may be wrong) is
that glibc's current math functions started as a copy of fdlibm.


Re: [Python-Dev] datetime's strftime implementation: by design or bug

2006-09-11 Thread Tim Peters
[Eric V. Smith]
> [I hope this belongs on python-dev, since it's about the design of
> something.  But if not, let me know and I'll post to c.l.py.]
>
> I'm willing to file a bug report and patch on this, but I'd like to know
> if it's by design or not.
>
> In datetimemodule.c, the function wrap_strftime() insists that the
> length of a format string be <= 127 chars, by forcing the length into a
> char.  This seems like a bug to me.  wrap_strftime() calls time's
> strftime(), which doesn't have this limitation because it uses size_t.

Yawn ;-)  I'm very surprised the code doesn't verify that the format
size fits in a C char, but there's nothing deep about the assumption.
I expect it would work fine to just change the declarations of
`totalnew` and `usednew` from `char` to `Py_ssize_t` (for 2.5.1 and
2.6; to something else for 2.4.4 (I don't recall which C type
PyString_Size returned then -- probably `int`)), and /also/ change the
resize-and-overflow check.  The current:

    int bigger = totalnew << 1;
    if ((bigger >> 1) != totalnew) { /* overflow */
        PyErr_NoMemory();
        goto Done;
    }

doesn't actually make sense even if it's certain that sizeof(int) is
strictly larger than sizeof(totalnew) (which C guarantees for type
`char`, but is plain false on some boxes if changed to Py_ssize_t).
Someone must have been on heavy drugs when writing that endlessly
tedious wrapper ;-)

> ...


Re: [Python-Dev] .pyc file has different result for value "1.79769313486232e+308" than .py file

2006-09-13 Thread Tim Peters
[Dino Viehland]
> We've noticed a strange occurance on Python 2.4.3 w/ the floating point
> value 1.79769313486232e+308 and how it interacts w/ a .pyc.  Given x.py:
>
> def foo():
>     print str(1.79769313486232e+308)
>     print str(1.79769313486232e+308) == "1.#INF"
>
>
> The 1st time you run this you get the correct value, but if you reload the 
> module
> after a .pyc is created then you get different results (and the generated 
> byte code
> appears to have changed).
> ...

Exhaustively explained in this recent thread:

http://mail.python.org/pipermail/python-list/2006-August/355986.html


Re: [Python-Dev] .pyc file has different result for value "1.79769313486232e+308" than .py file

2006-09-13 Thread Tim Peters
[Dino Viehland]
> FYI I've opened a bug against the VC++ team to fix their round tripping on 
> floating
> point values (doesn't sound like it'll make the next release, but hopefully 
> it'll make it
> someday).

Cool!  That would be helpful to many languages implemented in C/C++
relying on the platform {float, double}<->string library routines.

Note that the latest revision of the C standard ("C99") specifies
strings for infinities and NaNs that conforming implementations must
accept (for example, "inf").  It would be nice to accept those too,
for portability; "most" Python platforms already do.  In fact, this is
the primary reason people running on, e.g., Linux, resist upgrading to
Windows ;-)
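
A tiny sketch of the portability point -- what happens below depends
entirely on the platform C library in 2.x, so treat the comments as
"typical" rather than guaranteed:

    for s in ("inf", "-inf", "nan", "1.#INF"):
        try:
            print s, "->", float(s)
        except ValueError, e:
            print s, "-> rejected:", e
    # glibc-based boxes generally accept "inf"/"nan"; 2.x on Windows generally
    # rejects them, and produces spellings like "1.#INF" that other platforms
    # won't read back.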


Re: [Python-Dev] Testsuite fails on Windows if a space is in the path

2006-09-16 Thread Tim Peters
[Martin v. Löwis]
> ...
> Can somebody remember what the reason is to invoke cmd.exe (or COMSPEC)
> in os.popen?

Absolutely necessary, as any number of shell gimmicks can be used in
the passed string, same as on non-Windows boxes; .e.g.,

>>> import os
>>> os.environ['STR'] = 'SSL'
>>> p = os.popen("findstr %STR% *.py | sort")
>>> print p.read()
build_ssl.py:print " None of these versions appear suitable for building OpenSSL"
build_ssl.py:print "Could not find an SSL directory in '%s'" % (sources,)
build_ssl.py:print "Found an SSL directory at '%s'" % (best_name,)
build_ssl.py:# Look for SSL 2 levels up from pcbuild - ie, same place zlib etc all live.
...

That illustrates envar substitution and setting up a pipe in the
passed string, and people certainly do things like that.

These are the MS docs for cmd.exe's inscrutable quoting rules after /C:

"""
If /C or /K is specified, then the remainder of the command line after
the switch is processed as a command line, where the following logic is
used to process quote (") characters:

1.  If all of the following conditions are met, then quote characters
on the command line are preserved:

- no /S switch
- exactly two quote characters
- no special characters between the two quote characters,
  where special is one of: &<>()@^|
- there are one or more whitespace characters between the
  the two quote characters
- the string between the two quote characters is the name
  of an executable file.

2.  Otherwise, old behavior is to see if the first character is
a quote character and if so, strip the leading character and
remove the last quote character on the command line, preserving
any text after the last quote character.
"""

Your

cmd.exe /c "c:\Program Files\python25\python.exe"

example fit clause #1 above.

cmd.exe /c "c:\Program Files\python25\python.exe" -c "import sys;print
sys.version"

fails the "exactly two quote characters" part of #1, so falls into #2,
and after stripping the first and last quotes leaves the senseless:

cmd.exe /c c:\Program Files\python25\python.exe" -c "import sys;print sys.version

> (i.e. doubling the quotes at the beginning and the end) [works]

And that follows from the above, although not for a reason any sane
person would guess :-(
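
In other words, the workaround is to feed cmd.exe an extra, sacrificial pair
of quotes so that rule #2 strips those instead of the quotes you actually
need (sketch only, reusing the path from the example above):

    import os
    cmd = '""c:\\Program Files\\python25\\python.exe" -c "import sys; print sys.version""'
    print os.popen(cmd).read()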

I personally wouldn't change anything here for 2.5.  It's a minefield,
and people who care a lot already have their own workarounds in place,
which we'd risk breaking.  It remains a minefield for newbies, but
we're really just passing on cmd.exe's behaviors.  People are
well-advised to accept the installer's default directory.


Re: [Python-Dev] Before 2.5 - More signed integer overflows

2006-09-17 Thread Tim Peters
[Armin Rigo]
>> There are more cases of signed integer overflows in the CPython source
>> code base...
>>
>> That's on a 64-bits machine:
>>
>> [GCC 4.1.2 20060715 (prerelease) (Debian 4.1.1-9)] on linux2
>> abs(-sys.maxint-1) == -sys.maxint-1
>>
>> Humpf!  Looks like one person or two need to do a quick last-minute
>> review of all places trying to deal with -sys.maxint-1, and replace them
>> all with the "official" fix from Tim [SF 1545668].

[Anthony Baxter]
> Ick. We're now less than 24 hours from the scheduled release date for 2.5
> final. There seems to be a couple of approaches here:
>
> 1. Someone (it won't be me, I'm flat out with work and paperwriting today)
>reviews the code and fixes it
> 2. We leave it for a 2.5.1. I'm expecting (based on the number of bugs found
>and fixed during the release cycle) that we'll probably need a 2.5.1 in 
> about
>3 months.
> 3. We delay the release until it's fixed.
>
> I'm strongly leaning towards (2) at this point. (1) would probably require
> another release candidate, while (3) would result in another release
> candidate and massive amount of sobbing from a lot of people (including me).

I ignored this since I don't have a box where problems are visible (&
nobody responded to my request to check my last flying-blind "fix" on
a box where it mattered).

Given that these are weird, unlikely-in-real-life endcase bugs
specific to a single compiler, #2 is the natural choice.

BTW, did anyone try compiling Python with -fwrapv on a box where it
matters?  I doubt that Python's speed is affected one way or the
other, and if adding wrapv makes the problems go away, that would be
an easy last-second workaround for all possible such problems (which
of course could get fixed "for real" for 2.5.1, provided someone cares
enough to dig into it).


Re: [Python-Dev] Before 2.5 - More signed integer overflows

2006-09-18 Thread Tim Peters
[Neal Norwitz]
>> I'm getting a crash when running test_builtin and test_calendar (at
>> least) with gcc 4.1.1 on amd64.  It's happening in pymalloc, though I
>> don't know what the cause is.  I thought I tested with gcc 4.1 before,
>> but probably would have been in debug mode.

Neal, in context it was unclear whether you were using trapv at the
time.  Were you?

[Martin v. Löwis]
> Can't really check right now, but it might be that this is just the
> limitation that a debug obmalloc doesn't work on 64-bit systems.
> There is a header at each block with a fixed size of 4 bytes, even
> though it should be 8 bytes on 64-bit systems. This header is there
> only in a debug build.

Funny then how all the 64-bit buildbots manage to pass running debug builds ;-)

As of revs 46637 + 46638 (3-4 months ago), debug-build obmalloc uses
sizeof(size_t) bytes for each of its header and trailer debugging
fields.

Before then, the debug-build obmalloc was "safe" in this respect:  if
it /needed/ to store more than 4 bytes in a debug bookkeeping field,
it assert-failed in a debug build.  That would happen if and only if a
call to malloc/realloc requested >= 2**32 bytes, so was never provoked
by Python's test suite.  As of rev 46638, that limitation should have
gone away.


Re: [Python-Dev] test_itertools fails for trunk on x86 OS X machine

2006-09-21 Thread Tim Peters
[Neal Norwitz]
> It looks like %zd of a negative number is treated as an unsigned
> number on OS X, even though the man page says it should be signed.
>
> """
> The z modifier, when applied to a d or i conversion, indicates that
> the argument is of a signed type equivalent in size to a size_t.
> """

It's not just some man page ;-), this is required by the C99 standard
(which introduced the `z` length modifier -- and it's the `d` or `i`
here that imply `signed`, `z` is only supposed to specify the width of
the integer type, and can also be applied to codes for unsigned
integer types, like %zu and %zx).

> The program below returns -123 on Linux and 4294967173 on OS X.
>
> #include <stdio.h>
> int main()
> {
>     char buffer[256];
>     if (sprintf(buffer, "%zd", (size_t)-123) < 0)
>         return 1;
>     printf("%s\n", buffer);
>     return 0;
> }

Well, to be strictly anal, while the result of

(size_t)-123

is defined, the result of casting /that/ back to a signed type of the
same width is not defined.  Maybe your compiler was "doing you a
favor" ;-)


Re: [Python-Dev] Caching float(0.0)

2006-10-03 Thread Tim Peters
[EMAIL PROTECTED]
> If C90 doesn't distinguish -0.0 and +0.0, how can Python?

With liberal applications of piss & vinegar ;-)

> Can you give a simple example where the difference between the two is apparent
> to the Python programmer?

Perhaps surprisingly, many (well, comparatively many, compared to none)
people have noticed that the platform atan2 cares a lot:

>>> from math import atan2 as a
>>> z = 0.0  # positive zero
>>> m = -z   # minus zero
>>> a(z, z)   # the result here is actually +0.0
0.0
>>> a(z, m)
3.1415926535897931
>>> a(m, z)   # the result here is actually -0.0
0.0
>>> a(m, m)
-3.1415926535897931

It work like that "even on Windows", and these are the results C99's
754-happy appendix mandates for atan2 applied to signed zeroes.  I've
even seen a /complaint/ on c.l.py that atan2 doesn't do the same when

z = 0.0

is replaced by

z = 0

That is, at least one person thought it was "a bug" that integer
zeroes didn't deliver the same behaviors.

Do people actually rely on this?  I know I don't, but given that more
than just 2 people have remarked on it and seem to like it, I expect
that changing this would break /some/ code out there.
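
Since repr() on some of the platforms involved prints both zeroes as plain
"0.0", atan2 itself is the handy way to tell them apart in 2.x (sketch;
math.copysign doesn't exist yet, and on an EDOM-raising platform this won't
help either):

    from math import atan2

    def is_negative_zero(x):
        # For x == +0.0, atan2(x, x) is +0.0; for x == -0.0 it's -pi.
        return x == 0.0 and atan2(x, x) < 0

    print is_negative_zero(0.0), is_negative_zero(-0.0)    # False True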

BTW, on /some/ platforms all those examples trigger EDOM from the
platform libm instead -- which is also fine by C99, for
implementations ignoring C99's optional 754-happy appendix.


Re: [Python-Dev] Caching float(0.0)

2006-10-03 Thread Tim Peters
[EMAIL PROTECTED]
> Can you give a simple example where the difference between the two is apparent
> to the Python programmer?

BTW, I don't recall the details and don't care enough to reconstruct
them, but when Python's front end was first changed to recognize
"negative literals", it treated +0.0 and -0.0 the same, and we did get
bug reports as a result.

A bit more detail, because it's necessary to understand that even
minimally.  Python's grammar doesn't have negative numeric literals;
e.g., according to the grammar,

-1
and
-1.1

are applications of the unary minus operator to the positive numeric
literals 1 and 1.1.  And for years Python generated code accordingly:
LOAD_CONST followed by the unary minus opcode.

Someone (Fred, I think) introduced a front-end optimization to
collapse that to plain LOAD_CONST, doing the negation at compile time.

The code object contains a vector of compile-time constants, and the
optimized code initially didn't distinguish between +0.0 and -0.0.  As
a result, if the first float 0.0 in a code block "looked postive",
/all/ float zeroes in the code block were in effect treated as
positive; and similarly if the first float zero was -0.0, all float
zeroes were in effect treated as negative.

That did break code.  IIRC, it was fixed by special-casing the snot
out of "-0.0", leaving that single case as a LOAD_CONST followed by
UNARY_NEGATIVE.


Re: [Python-Dev] Caching float(0.0)

2006-10-03 Thread Tim Peters
[Tim]
>> Someone (Fred, I think) introduced a front-end optimization to
>> collapse that to plain LOAD_CONST, doing the negation at compile time.

> I did the original change to make negative integers use just LOAD_CONST, but I
> don't think I changed what was generated for float literals.  That could be
> my memory going bad, though.

It is ;-)  Here under Python 2.2.3:

>>> from dis import dis
>>> def f(): return 0.0 + -0.0 + 1.0 + -1.0
...
>>> dis(f)
  0 SET_LINENO   1

  3 SET_LINENO   1
  6 LOAD_CONST   1 (0.0)
  9 LOAD_CONST   1 (0.0)
 12 UNARY_NEGATIVE
 13 BINARY_ADD
 14 LOAD_CONST   2 (1.0)
 17 BINARY_ADD
 18 LOAD_CONST   3 (-1.0)
 21 BINARY_ADD
 22 RETURN_VALUE
 23 LOAD_CONST   0 (None)
 26 RETURN_VALUE

Note there that "0.0", "1.0", and "-1.0" were all treated as literals,
but that "-0.0" still triggered a UNARY_NEGATIVE opcode.  That was
after "the fix".

You don't remember this as well as I do since I probably had to fix
it, /and/ I ate enormous quantities of chopped, pressed, smoked,
preservative-laden bag o' ham at the time.  You really need to do both
to remember floating-point trivia.  Indeed, since I gave up my bag o'
ham habit, I hardly ever jump into threads about fp trivia anymore.
Mostly it's because I'm too weak from not eating anything, though --
how about lunch tomorrow?

> The code changed several times as people with more numeric-fu that myself
> fixed all sorts of border cases.  I've tried really hard to stay away from
> the code generator since then.  :-)

Successfully, too!  It's admirable.


Re: [Python-Dev] Can't check in on release25-maint branch

2006-10-08 Thread Tim Peters
[Skip]
> I checked in a change to Doc/lib/libcsv.tex on the trunk yesterday, then
> tried backporting it to the release25-maint branch but failed due to
> permission problems.  Thinking it might be lock contention, I waited a few
> minutes and tried a couple more times.  Same result.  I just tried again:
...
> Here's my svn status output:
>
> Path: .
> URL: http://svn.python.org/projects/python/branches/release25-maint

As Georg said, looks like you did a read-only checkout.  It /may/
(can't recall for sure, but think so) get you unstuck to do:

svn switch --relocate \
http://svn.python.org/projects/python/branches/release25-maint \
svn+ssh://svn.python.org/python/branches/release25-maint

from your checkout directory.  If that works, it will go fast; if not,
start over with an svn+ssh checkout.


Re: [Python-Dev] Can't check in on release25-maint branch

2006-10-08 Thread Tim Peters
[Skip]
> Thanks Georg & Tim.  That was indeed the problem.  I don't know why I've had
> such a hard time wrapping my head around Subversion.

I have a theory about that:  it's software <0.5 wink>.  If it's any
consolation, at the NFS sprint earlier this year, I totally blanked
out on how to do a merge using SVN, despite that I've merged hundreds
of times when working on ZODB's seemingly infinite collection of
active branches.  Luckily, I was only trying to help someone else do a
merge at the time, so it frustrated them more than me ;-)


[Python-Dev] 2.4 vs Windows vs bsddb

2006-10-09 Thread Tim Peters
I just noticed that the bsddb portion of Python fails to compile on
the 2.4 Windows buildbots, but for some reason the buildbot machinery
doesn't notice the failure:

"""
Compiling...
_bsddb.c
Linking...
   Creating library .\./_bsddb_d.lib and object .\./_bsddb_d.exp
_bsddb.obj : warning LNK4217: locally defined symbol _malloc imported
in function __db_associateCallback
_bsddb.obj : warning LNK4217: locally defined symbol _free imported in
function __DB_consume
_bsddb.obj : warning LNK4217: locally defined symbol _fclose imported
in function _DB_verify
_bsddb.obj : warning LNK4217: locally defined symbol _fopen imported
in function _DB_verify
_bsddb.obj : warning LNK4217: locally defined symbol _strncpy imported
in function _init_pybsddb
_bsddb.obj : error LNK2019: unresolved external symbol __imp__strncat
referenced in function _makeDBError
_bsddb.obj : error LNK2019: unresolved external symbol __imp___assert
referenced in function _makeDBError
./_bsddb_d.pyd : fatal error LNK1120: 2 unresolved externals
...

_bsddb - 3 error(s), 5 warning(s)

 Build: 15 succeeded, 1 failed, 0 skipped
"""

The warnings there are old news, but no idea when the errors started.

The test suite doesn't care that bsddb is missing either, just ending with:

1 skip unexpected on win32:
test_bsddb

Same kind of things when building from my 2.4 checkout.  No clues.


Re: [Python-Dev] 2.4 vs Windows vs bsddb

2006-10-09 Thread Tim Peters
[Tim Peters]
>> I just noticed that the bsddb portion of Python fails to compile on
>> the 2.4 Windows buildbots, but for some reason the buildbot machinery
>> doesn't notice the failure:

[Martin v. Löwis]
> It's been a while that a failure to build some extension modules doesn't
> cause the "compile" step to fail; this just happened with the _ssl.pyd
> module before.

I'm guessing only on the release24-maint branch?

> I'm not sure how build.bat communicates an error, or whether devenv.com
> fails in some way when some build step fails.
>
> Revision 43156 may contribute here, which adds additional commands
> into build.bat after devenv.com is invoked.

More guessing:  devenv gives a non-zero exit code when it fails, and a
.bat script passes on the exit code of the last command it executes.

True or false, after making changes based on those guesses, the 2.4
Windows buildbots now say they fail the compile step.

It was my fault to begin with (boo! /bad/ Timmy), but should have been
unique to the 24 branch (2.5 and trunk fetch Unicode test files all by
themselves).


Re: [Python-Dev] 2.4 vs Windows vs bsddb

2006-10-09 Thread Tim Peters
[Tim]
> I just noticed that the bsddb portion of Python fails to compile on
> the 2.4 Windows buildbots, but for some reason the buildbot machinery
> doesn't notice the failure:

But it does now.  This is the revision that broke the Windows build:

"""
r52170 | andrew.kuchling | 2006-10-05 14:49:36 -0400 (Thu, 05 Oct 2006) | 12 lines

[Backport r50783 | neal.norwitz.  The bytes_left code is complicated,
 but looks correct on a casual inspection and hasn't been modified
 in the trunk.  Does anyone want to review further?]

Ensure we don't write beyond errText.  I think I got this right, but
it definitely could use some review to ensure I'm not off by one
and there's no possible overflow/wrap-around of bytes_left.
Reported by Klocwork #1.

Fix a problem if there is a failure allocating self->db.
Found with failmalloc.
"""

It introduces uses of assert() and strncat(), and the linker can't
resolve them.  I suppose that's because the Windows link step for the
_bsddb subproject explicitly excludes msvcrt (in the release build)
and msvcrtd (in the debug build), but I don't know why that's done.

OTOH, we got a lot more errors (about duplicate code definitions) if
the standard MS libraries aren't explicitly excluded, so that's no
fix.


Re: [Python-Dev] 2.4 vs Windows vs bsddb

2006-10-10 Thread Tim Peters
[Gregory P. Smith]
> It seems bad form to C assert() within a python extension.  crashing
> is bad.  Just code it to not copy the string in that case.  The
> exception type should convey enough info alone and if someone actually
> looks at the string description of the exception they're welcome to
> notice that its missing info and file a bug (it won't happen; the
> strings come from the BerkeleyDB or C library itself).

The proper use of C's assert() in Python (whether core or extension)
is to strongly document a condition the author believes /must/ be
true.  It's a strong sanity-check on the programmer's beliefs about
necessary invariants, things that must be true under all possible
conditions.  For example, it would always be wrong to assert that the
result of calling malloc() with a non-zero argument is non-NULL; it
would be correct (although trivially and unhelpfully so) to assert
that the result is NULL or is not NULL.

Given that, the assert() in question looks fine to me:

    if (_db_errmsg[0] && bytes_left < (sizeof(errTxt) - 4)) {
        bytes_left = sizeof(errTxt) - bytes_left - 4 - 1;
        assert(bytes_left >= 0);

We can't get into the block unless

bytes_left < sizeof(errTxt) - 4

is true.  Subtracting bytes_left from both sides, then swapping LHS and RHS:

sizeof(errTxt) - bytes_left - 4 > 0

which implies

sizeof(errTxt) - bytes_left - 4 >= 1

Subtracting 1 from both sides:

sizeof(errTxt) - bytes_left - 4 - 1 >= 0

And since the LHS of that is the new value of bytes_left, it must be true that

 bytes_left >= 0

Either that, or the original author (and me, just above) made an error
in analyzing what must be true at this point.  From

bytes_left < sizeof(errTxt) - 4

it's not /instantly/ obvious that

bytes_left >= 0

inside the block, so there's value in assert'ing that it's true.  It's
both documentation and an executable sanity check.
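
For the skeptical, the implication is easy to brute-force from Python for
any particular buffer size (the 1024 below is a stand-in, not the real
sizeof(errTxt)):

    SIZEOF_ERRTXT = 1024
    for bytes_left in range(SIZEOF_ERRTXT - 4):    # i.e. bytes_left < sizeof(errTxt) - 4
        assert SIZEOF_ERRTXT - bytes_left - 4 - 1 >= 0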

In any case, assert() statements are thrown away in a release build,
so can't be a cause of abnormal termination then.

> As for the strncat instead of strcat that is good practice.  The
> buffer is way more than large enough for any of the error messages
> defined in the berkeleydb common/db_err.c db_strerror() function but
> the C library could supply its own unreasonably long one in some
> unforseen circumstance.

That's fine -- there "shouldn't have been" a problem with using any
standard C function here.  It was just the funky linker step on
Windows on the 2.4 branch that was hosed.  Martin figured out how to
repair it, and there's no longer any problem here.  In fact, even the
been-there-forever linker warnings in 2.4 on Windows have gone away
now.


Re: [Python-Dev] 2.4 vs Windows vs bsddb

2006-10-10 Thread Tim Peters
[Tim]
>> Given that, the assert() in question looks fine to me:
>> ...
>> Either that, or the original author (and me, just above) made an error
>> in analyzing what must be true at this point.

[David Hopwood]
> You omitted to state an assumption that sizeof(errTxt) >= 4, since size_t
> (and the constant 4) are unsigned. Also bytes_left must initially be 
> nonnegative
> so that the subexpression 'sizeof(errTxt) - bytes_left' cannot overflow.

I don't care, but that's really the /point/:  asserts are valuable
precisely because any inference that's not utterly obvious at first
glance at best stands a good chance of relying on hidden assumptions.
assert() makes key assumptions and key inferences visible, and
verifies them in a debug build of Python.


Re: [Python-Dev] Why spawnvp not implemented on Windows?

2006-10-12 Thread Tim Peters
[Alexey Borzenkov]
>>> Umm... do you mean that spawn*p* on python 2.5 is an absolute no?

[Martin v. Löwis]
>> Yes. No new features can be added to Python 2.5.x; Python 2.5 has
>> already been released.

[Alexey Borzenkov]
> Ugh... that's just not fair. Because of this there will be no spawn*p*
> in python for another two years. x_x

Or the last 15 years.  Yet somehow people still have kids ;-)

> ...
> But the fact that I have to use similar code anywhere I need to use
> spawnlp is not fair.

"Fair" is a very strange word here.  Pain in the ass, sure, but not
fair?  Doesn't make sense.

> ...
> P.S. Although it's a bit stretching, one might also say that
> implementing spawn*p* on windows is not actually a new feature, and
> rather is a bugfix for misfeature.

No.  Introducing any new function is obviously a new feature, which
would become acutely and catastrophically visible as soon as someone
released code using the new function in 2.5.1, and someone tried to
/use/ that new code under 2.5.0.  Micro releases of Python do not
introduce new features -- take that as given.  It's been tried before,
for what appeared to be "very good reasons" at the time, and we lived
to regret it deeply.  It won't happen again.

> Why every other platform can benefit from spawn*p* and only Windows can't?

Just the obvious reason:  because so far nobody cared enough to do the
work of writing code, docs and tests for some of these functions on
Windows.

> This just makes os.spawn*p* useless: it becomes unreliable and can't be
> used in portable code at all.

It's certainly true that it can't be used in portable code, at least
not before Python 2.6.


Re: [Python-Dev] 2.3.6 for the unicode buffer overrun

2006-10-13 Thread Tim Peters
[Thomas Heller]
>> Yes.  But I've switched machines since I last build an installer,
and I do not
>> have all of the needed software installed any longer, for example the Wise
>> Installer.

[Martin v. Löwis]
> Ok. So we are technically incapable of producing the Windows binaries of
>  another 2.3.x release, then?

FYI, I still have the Wise Installer.  But since my understanding is
that the "Unicode buffer overrun" thingie is a non-issue on Windows,
I've got no interest in wrestling with a 2.3.6 for Windows.


Re: [Python-Dev] Segfault in python 2.5

2006-10-18 Thread Tim Peters
[Michael Hudson]
>> I've been reading the bug report with interest, but unless I can
>> reproduce it it's mighty hard for me to debug, as I'm sure you know.

[Mike Klaas]
> Indeed.

Note that I just attached a much simpler pure-Python script that fails
very quickly, on Windows, using a debug build.  Read the new comment
to learn why both "Windows" and "debug build" are essential to it
failing reliably and quickly ;-)

>>> Unfortunately, I've been attempting for hours to reduce the problem to a
>>> completely self-contained script, but it is resisting my efforts due to
>>> timing problems.

Yes, but you did good!  This is still just an educated guess on my
part, but my education here is hard to match ;-):  this new business
of generators deciding to "clean up after themselves" if they're left
hanging appears to have made it possible for a generator to hold on to
a frame whose thread state has been free()'d, after the thread that
created the generator has gone away.  Then when the generator gets
collected as trash, the new exception-based "clean up abandoned
generator" gimmick tries to access the generator's frame's thread
state, but that's just a raw C struct (not a Python object with
reachability-based lifetime), and the thread free()'d that struct when
the thread went away.  The important timing-based vagary here is
whether dead-thread cleanup gets performed before the main thread
tries to clean up the trash generator.

> I've peered at the code, but my knowledge of the python core is
> superficial at best.  The fact that it is occuring as a result of a
> long string of garbage collection/dealloc/etc. and involves threading
> lowers my confidence further.   That said, I'm beginning to think that
> to reproduce this in a standalone script will require understanding
> the problem in greater depth regardless...

Or upgrade to Windows ;-)

>> Are you absolutely sure that the fault does not lie with any extension
>> modules you may be using?  Memory scribbling bugs have been known to
>> cause arbitrarily confusing problems...

Unless I've changed the symptom, it's been reduced to minimal pure
Python.  It does require a thread T, and creating a generator in T,
where the generator object's lifetime is controlled by the main
thread, and where T vanishes before the generator has exited of its
own accord.

Offhand I don't know how to repair it.  Thread states /aren't/ Python
objects, and there's no provision for a thread state to outlive the
thread it represents.

> I've had sufficient experience being arbitrarily confused to never be
> sure about such things, but I am quite confident.  The script I posted
> in the bug report is all stock python save for the operation in <>'s.
> That operation is pickling and unpickling (using pickle, not cPickle)
> a somewhat complicated pure-python instance several times.

FYI, in my whittled script, your `getdocs()` became simply:

def getdocs():
while True:
yield None

and it's called only once, via self.docIter.next().  In fact, the
"while True:" isn't needed there either (given that it's only resumed
once now).
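
For the curious, the shape of the whittled case is roughly this (a hedged
sketch reconstructed from the description above, not the exact attached
script; whether it actually blows up depends on the timing vagaries already
mentioned):

    import threading

    holder = {}

    def worker():
        def getdocs():
            yield None
        holder["it"] = getdocs()
        holder["it"].next()    # leave the generator suspended at its yield

    t = threading.Thread(target=worker)
    t.start()
    t.join()           # the creating thread -- and its thread state -- is gone
    del holder["it"]   # abandoned-generator cleanup happens here, in the main thread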


Re: [Python-Dev] Nondeterministic long-to-float coercion

2006-10-19 Thread Tim Peters
[Raymond Hettinger]
> My colleague got an odd result today that is reproducible on his build
> of Python (RedHat's distribution of Py2.4.2) but not any other builds
> I've checked (including an Ubuntu Py2.4.2 built with a later version of
> GCC).  I hypothesized that this was a bug in the underlying GCC
> libraries, but the magnitude of the error is so large that that seems
> implausible.
>
> Does anyone have a clue what is going-on?
>
> Python 2.4.2 (#1, Mar 29 2006, 11:22:09) [GCC 4.0.2 20051125 (Red Hat
> 4.0.2-8)] on linux2 Type "help", "copyright", "credits" or "license" for
> more information.
> >>> set(-194 * (1/100.0) for i in range(1))
> set([-19400.0, -193995904.0, -193994880.0])

Note that the Hamming distance between -19400.0 and -193995904.0
is 1, and ditto between -193995904.0 and -193994880.0, when viewed as
IEEE-754 doubles.  That is, 193995904.0 is "missing a bit" from
-19400.0, and -193994880.0 is missing the same bit plus an
additional bit.  Maybe clearer, writing a function to show the hex
little-endian representation:

>>> import binascii, struct
>>> def ashex(d):
...     return binascii.hexlify(struct.pack("<d", d))
...
>>> ashex(-19400)
'6920a7c1'
>>> ashex(-193995904)   # "the 2 bit" from "6" is missing, leaving 4
'4920a7c1'
>>> ashex(-193994880)   # and "the 8 bit" from "9" is missing, leaving 1
'4120a7c1'

More than anything else that suggests flaky memory, or "weak bits" in
a HW register or CPU<->FPU path.  IOW, it looks like a hardware
problem to me.

Note that the missing bits here don't coincide with a "natural"
software boundary -- screwing up a bit "in the middle of" a byte isn't
something software is prone to do.

You could try different inputs and see whether the same bits "go
missing", e.g. starting with a double with a lot of 1 bits lit.  Might
also try using these as keys to a counting dict to see how often they
go missing.
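
A sketch of that experiment (the computation below is a stand-in -- plug in
whatever expression misbehaves on the flaky box):

    import binascii, struct

    def ashex(d):
        return binascii.hexlify(struct.pack("<d", d))

    counts = {}
    for i in xrange(1000000):
        v = ashex(-194.0 * (1 / 100.0))    # stand-in computation
        counts[v] = counts.get(v, 0) + 1
    for v, n in sorted(counts.items()):
        print v, n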


Re: [Python-Dev] valgrind

2006-11-06 Thread Tim Peters
[Herman Geza]
>> Here python reads from an already-freed memory area, right?

[Martin v. Löwis]
> It looks like it, yes. Of course, it could be a flaw in valgrind, too.
> To find out, one would have to understand what the memory block is,
> and what part of PyObject_Free accesses it.

When PyObject_Free is handed an address it doesn't control, the "arena
base address" it derives from that address may point at anything the
system malloc controls, including uninitialized memory, memory the
system malloc has allocated to something, memory the system malloc has
freed, or internal system malloc bookkeeping bytes.  The
Py_ADDRESS_IN_RANGE macro has no way to know before reading it up.

So figure out which line of code valgrind is complaining about
(doesn't valgrind usually produce that?).  If it's coming from the
expansion of Py_ADDRESS_IN_RANGE, it's not worth more thought.


Re: [Python-Dev] valgrind

2006-11-07 Thread Tim Peters
[Kristján V. Jónsson]
> ...
> Actually, obmalloc could be improved in this aspect.  Similar code that I 
> once wrote
> computed the block base address, but than looked in its tables to see if it 
> was
> actually a known block before accessing it.

Several such schemes were tried (based on, e.g., binary search and
splay trees), but discarded due to measurable sloth.  The overwhelming
advantage of the current scheme is that it does the check in constant
time, independent of how many distinct arenas (whether one or
thousands makes no difference) pymalloc is managing.

> That way you can have blocks that are larger than the virtual memory block
> of the process.

If you have a way to do the check in constant time, that would be
good.  Otherwise speed rules here.


Re: [Python-Dev] valgrind

2006-11-07 Thread Tim Peters
[Martin v. Löwis]

Thanks for explaining all this!  One counterpoint:

> Notice that on a system with limited memory, you probably don't
> want to use obmalloc, even if it worked. obmalloc uses arenas
> of 256kiB, which might be expensive on the target system.

OTOH, Python allocates a lot of small objects, and one of the reasons
for obmalloc's existence is that it typically uses memory more
efficiently (less bookkeeping space overhead and less fragmentation)
for mounds of small objects than the all-purpose system malloc.

In a current (trunk) debug build, simply starting Python hits an arena
highwater mark of 9, and doing "python -S" instead hits a highwater
mark of 2.  Given how much memory Python needs to do nothing ;-), it's
doubtful that the system malloc would be doing better.


Re: [Python-Dev] Threading, atexit, and logging

2006-12-06 Thread Tim Peters
[Martin v. Löwis]
> In bug #1566280 somebody reported that he gets an
> exception where the logging module tries to write
> to closed file descriptor.
>
> Upon investigation, it turns out that the file descriptor
> is closed because the logging atexit handler is invoked.
> This is surprising, as the program is far from exiting at
> this point.

But the main thread is done, right?  None of this appears to make
sense unless we got into Py_Finalize(), and that doesn't happen until
the main thread has nothing left to do.

> Investigating further, I found that the logging atexit
> handler is registered *after* the threading atexit handler,
> so it gets invoked *before* the threading's atexit.

Ya, and that sucks.  Can't recall details now, but it's not the first
time the vagaries of atexit ordering bit a threaded program.  IMO,
`threading` shouldn't use atexit at all.
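
The ordering problem in miniature (function names illustrative only):
atexit runs handlers in reverse registration order, so whichever module
happens to register last runs first at shutdown:

    import atexit

    def threading_style_wait():
        print "pretend to wait here for non-daemon threads to finish"

    def logging_style_shutdown():
        print "pretend to flush and close the log handlers here"

    atexit.register(threading_style_wait)      # registered first -> runs last
    atexit.register(logging_style_shutdown)    # registered second -> runs first,
                                               # closing files the threads still need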

> Now, threading's atexit is the one that keeps the
> application running, by waiting for all non-daemon threads
> to shut down. As this application does all its work in
> non-daemon threads, it keeps running for quite a while -
> except that the logging module gives errors.
>
> The real problem here is that atexit handlers are
> invoked even though the program logically doesn't exit,
> yet (it's not just that the threading atexit is invoked
> after logging atexit - this could happen to any atexit
> handler that gets registered).
>
> I added a patch to this report which makes the MainThread
> __exitfunc a sys.exitfunc, chaining what is there already.
> This will work fine for atexit (as atexit is still imported
> explicitly to register its sys.exitfunc), but it might break
> if other applications still insist on installing a
> sys.exitfunc.

Well, that's been officially deprecated since 2.4, but who knows?

> What do you think about this approach?

It's expedient :-)  So was using atexit for this to begin with.
Probably "good enough".  I'd rather, e.g., that `threading` stuff an
exit function into a module global, and change Py_Finalize() to look
for that and run it (if present) before invoking call_sys_exitfunc().
That is, break all connection between the core's implementation of
threading and the user-visible `atexit` machinery.

`atexit` is a hack specific to "don't care about order" finalization
functions, and it gets increasingly obscure to try to force it to
respect a specific ordering sometimes (e.g., now you have a patch to
try to fix it by relying on an obscure deprecated feature and hoping
users don't screw with that too -- probably "good enough", but still
sucky).


Re: [Python-Dev] Threading, atexit, and logging

2006-12-06 Thread Tim Peters
[Martin v. Löwis]
>>> Upon investigation, it turns out that the file descriptor
>>> is closed because the logging atexit handler is invoked.
>>> This is surprising, as the program is far from exiting at
>>> this point.

[Tim Peters]
>> But the main thread is done, right?

[Martin]
> Wrong. main.py (which is the __main__ script in the demo
> code) is done, yes.

Fine, but the main thread /has/ entered Py_Finalize().  That's key
here, and wasn't clear originally.

> However, threading.py has machinery to not terminate the main
> thread as long as there are non-daemon threads.

Right.

...

>> IMO, `threading` shouldn't use atexit at all.

> That is (in a way) my proposal (although I suggest to use
> sys.exitfunc instead).

Same thing to me.  I'd rather thread cleanup, which is part of the
Python core, not rely on any of the user-visible (hence also
user-screwable) "do something at shutdown" gimmicks.  Thread cleanup
is only vaguely related to that concept because "cleanup" here implies
waiting for an arbitrarily long time until all non-daemon threads decide
on their own to quit.  That's not something to be cleaned up /at/
shutdown time, it's waiting (potentially forever!) /for/ shutdown
time, and that mismatch is really the source of the problem.

>> It's expedient :-)  So was using atexit for this to begin with.
>> Probably "good enough".  I'd rather, e.g., that `threading` stuff an
>> exit function into a module global, and change Py_Finalize() to look
>> for that and run it (if present) before invoking call_sys_exitfunc().

> Ok, that's what I'll do then.
>
> Yet another alternative would be to have the "daemonic" thread feature
> in the thread module itself (along with keeping track of a list of
> all running non-daemonic thread).

Sorry, I couldn't follow the intent there.  Not obvious to me how
moving this stuff from `threading` into `thread` would make it
easier(?) for the implementation to wait for non-daemon threads to
finish.


Re: [Python-Dev] Threading, atexit, and logging

2006-12-06 Thread Tim Peters
[Tim Peters]
>> Sorry, I couldn't follow the intent there.  Not obvious to me how
>> moving this stuff from `threading` into `thread` would make it
>> easier(?) for the implementation to wait for non-daemon threads to
>> finish.

[Martin v. Löwis]
> Currently, if you create a thread through the thread module
> (rather than threading), it won't participate in the "wait until
> all threads are done" algorithm - you have to use the threading
> module. Moving it into the thread module would allow to cover all
> threads.

True, but that doesn't appear to have any bearing on the bug
originally discussed.  You introduced this as "yet another
alternative" in the context of how to address the original complaint,
but if that was the intent, I still don't get it.

Regardless, I personally view the `thread` module as being "de facto"
deprecated.  If someone /wants/ the ability to create a non-daemon
thread, that the ability is only available via `threading` is an
incentive to move to the newer, saner module.  Besides, if the daemon
distinction were grafted on to `thread` threads too, it would have to
default to daemonic (a different default than `threading` threads)
else be incompatible with current `thread` thread behavior.  I
personally don't want to add new features to `thread` threads in any
case.

> Also, if the interpreter invokes, say, threading._shutdown():
> that's also "user-screwable", as a user may put something else
> into threading._shutdown. To make it non-visible, it has to be
> in C, not Python (and even then it might be still visible to
> extension modules).

The leading underscore makes it officially invisible <0.7 wink>, and
users would have to both seek it out and go out of their way to screw
it.  If some user believes they have a need to mess with
threading._shutdown(), that's fine by me too.

The problem with atexit and sys.exitfunc is that users can get in
trouble now simply by using them in documented ways, because
`threading` also uses them (but inappropriately so, IMO).  Indeed,
that's all the `logging` module did here.

While this next is also irrelevant to the original complaint, I think
it was also a minor mistake to build the newer atexit gimmicks on top
of sys.exitfunc (same reason:  a user can destroy the atexit
functionality quite innocently if they happen to use sys.exitfunc
after they (or code they build on) happen to import `atexit`).
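
The ordering trap itself is easy to demonstrate in isolation (a toy sketch, not the actual threading/logging code):

    import atexit

    def fake_threading_exit():
        print "threading-style handler: wait here for non-daemon threads"

    def fake_logging_exit():
        print "logging-style handler: flush and close the log files"

    atexit.register(fake_threading_exit)   # registered first, e.g. at import time
    atexit.register(fake_logging_exit)     # registered later by another module

    # atexit runs handlers in LIFO order, so the logging-style handler runs
    # first:  the log files get closed before the "wait for threads" handler
    # has even started waiting -- which is exactly the reported failure mode.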


Re: [Python-Dev] Floor division

2007-01-19 Thread Tim Peters
[Raymond Hettinger]
> I bumped into an oddity today:
>
> 6.0 // 0.001 != math.floor(6.0 / 0.001)
>
> In looking at Objects/floatobject.c, I was surprised to find that
> float_floor_division() is implemented in terms of float_divmod().  Does anyone
> know why it takes such a circuitous path?  I had expected something simpler 
> and
> faster:
>
>  return PyFloat_FromDouble(floor(a/b));

To preserve, so far as is possible with floats, that

(*)    a == (a//b)*b + a%b

In this case the binary double closest to decimal 0.001 is

0.001000000000000000020816681711721685132943093776702880859375

which is a little bit larger than 1/1000.  Therefore the mathematical
value of a/b is a little bit smaller than 6/(1/1000) = 6000, and the
true floor of the mathematically correct result is 5999.

a % b is always computed exactly (mathematical result == machine
result) under Python's definition whenever a and b have the same sign
(under C99's and the `decimal` module's definition it's always exact
regardless of signs), and getting the exact value for a%b implies the
exact value for a//b must also be returned (when possible) in order to
preserve identity (*) above.

IOW, since

>>> 6.0 % 0.001
0.00087512

it would be inconsistent to return 6000 for 6.0 // 0.001:

>>> 6.0 - 6000 * 0.001   # this isn't close to the value of a%b
0.0
>>> 6.0 - 5999 * 0.001   # but this is close
0.00044578

Note that two rounding errors occur in computing a - N*b via binary
doubles.  If there were no rounding errors, we'd have

6 % b ==  6.0 - 5999 * b

exactly where

b =
0.001000000000000000020816681711721685132943093776702880859375

is the value actually stored in the machine for 0.001:

>>> import decimal
>>> decimal.getcontext().prec = 1000 # way more than needed
>>> import math
>>> # Set b to exact decimal value of binary double closest to 0.001.
>>> m, e = math.frexp(0.001)
>>> b = decimal.Decimal(int(m*2**53)) / decimal.Decimal(1 << (53-e))
>>> # Then 6%b is exactly equal to 6 - 5999*b
>>> 6 % b == 6 - 5999*b
True
>>> # Confirm that all decimal calculations were exact.
>>> decimal.getcontext().flags[decimal.Inexact]
False
>>> # Confirm that floor(6/b) is 5999.
>>> int(6/b)
5999
>>> print 6/b
5999.87509990972966989180234686226...

All that said, (*) doesn't make a lot of sense for floats to begin
with (for that matter, Python's definition of mod alone doesn't make
much sense for floats when the signs differ -- e.g.,

>>> -1 % 1e100
1e+100
>>> decimal.Decimal(-1) % decimal.Decimal("1e100")
Decimal("-1")

).


Re: [Python-Dev] Floor division

2007-01-21 Thread Tim Peters
[Tim Peters]
>> ...
>> >>> decimal.Decimal(-1) % decimal.Decimal("1e100")
>> Decimal("-1")

[Armin Rigo]
> BTW - isn't that case in contradiction with the general Python rule that
> if b > 0, then a % b should return a number between 0 included and b
> excluded?

Sure.

> We try hard to do that for ints, longs and floats.

But fail in this case for floats:

>>> -1 % 1e100 < 1e100
False

Can't win.  The infinite-precision result is the mathematical
F(1e100)-1, where F(1e100) is the binary float closest to 10**100, but
F(1e100)-1 isn't representable as a float -- it rounds back up to
F(1e100):

>>> -1 % 1e100 == 1e100
True

There simply is no /representable/ float value in [0, 10**100)
congruent to -1 modulo 10**100 (or modulo F(1e100)), so it's
impossible to return a non-surprising (to everyone) result in that
range.  0 and 1e100 are in some sense "the best" answers in that
range, both off by only 1 part in F(1e100), the smallest possible
error among representable floats in that range.  -1/1e100 certainly
isn't 0, so

-1 // 1e100 == -1.0

is required.  Picking -1 % 1e100 == 1e100 then follows, to try to preserve that

a = (a//b)*b + a%b

as closely as is possible.

Ints and longs never have problems here, because the exact % result is
always exactly representable.

That isn't true of floats (whether binary or decimal), but under a
different definition of "mod" the mathematically exact result is
always exactly representable:  a%b takes the sign of `a` rather than
the sign of `b`.  C's fmod (Python's math.fmod), and the proposed
standard for decimal arithmetic implemented by the `decimal` module,
use that meaning for "mod" instead.

>>> math.fmod(-1, 1e100)
-1.0

> The fact that it works differently with Decimal could be unexpected.

Yup.  See "can't win" above :-(

Another good definition of "mod" for floats is to return the
representative of smallest absolute value; i.e., satisfy

abs(a%b) <= abs(b) / 2

The mathematically exact value for that is also exactly representable
(BTW, the proposed standard for decimal arithmetic calls this
"remainder-near", as opposed to "remainder").

It's just a fact that different definitions of mod are most useful
most often depending on data type.  Python's is good for integers and
often sucks for floats.  The C99 and `decimal` definition(s) is/are
good for floats and often suck(s) for integers.  Trying to pretend
that integers are a subset of floats can't always work ;-)


Re: [Python-Dev] Floor division

2007-01-21 Thread Tim Peters
...

[Tim]
>> It's just a fact that different definitions of mod are most useful
>> most often depending on data type.  Python's is good for integers and
>> often sucks for floats.  The C99 and `decimal` definition(s) is/are
>> good for floats and often suck(s) for integers.  Trying to pretend
>> that integers are a subset of floats can't always work ;-)

[Guido]
> That really sucks, especially since the whole point of making int
> division return a float was to make the integers embedded in the
> floats... I think the best solution would be to remove the definition
> of % (and then also for divmod()) for floats altogether, deferring to
> math.fmod() instead.

In Python 2?  I thought this was already (semi)settled for Py3K --
back last May, on the Py3K list, in a near-repetition of the current
thread:

[Tim]
I'd be happiest if P3K floats didn't support __mod__ or __divmod__ at
all.  Floating mod is so rare it doesn't need syntactic support, and
the try-to-be-like-integer __mod__ and __divmod__ floats support now
can deliver surprises to all users incautious enough to use them.

[Guido]
OK, this makes sense. I've added this to PEP 3100.

> The ints aren't really embedded in Decimal, so we don't have to do
> that there (but we could).

Floor division is an odd beast for floats, and I don't know of a good
use for it.  As Raymond's original example in this thread showed, it's
not always the case that

math.floor(x/y) == x // y

The rounding error in computing x/y can cause floor() to see an exact
integer coming in even when the true value of x/y is a little smaller
than that integer (of course it's also possible for x/y to overflow).
This is independent of fp base (it's "just as true" for decimal floats
as for binary floats).

The `decimal` module also has two distinct flavors of "mod", neither
of which match Python's integer-mod definition:

>>> decimal.Decimal(7).__mod__(10)
Decimal("7")
>>> decimal.Decimal(7).remainder_near(10)
Decimal("-3")
>>> decimal.Decimal(-7).__mod__(10)
Decimal("-7")
>>> decimal.Decimal(-7).remainder_near(10)
Decimal("3")

But, again, I think floating mod is so rare it doesn't need syntactic
support, and especially not when there's more than one implementation
-- and `decimal` is a floating type (in Cowlishaw's REXX, this was the
/only/ numeric type, so there was more pressure there to give it a
succinct spelling).

> The floats currently aren't embedded in complex, since f.real and
> f.imag don't work for floats (and math.sqrt(-1.0) doesn't return 1.0j,
> and many other anomalies). That was a conscious decision because
> complex numbers are useless for most folks. But is it still the right
> decision, given what we're trying to do with int and float?

Sounds like a "purity" thing.  The "pure form" of the original
/intent/ was probably just that nobody should get a complex result
from non-complex inputs (they should see an exception instead) unless
they use a `cmath` function to prove they know what they're doing up
front.

I still think that has pragmatic value, since things like sqrt(-1) and
asin(1.05) really are signs of logic errors for most programs.

But under that view, there's nothing surprising about, e.g.,
(math.pi).imag returning 0.0 -- or (3).imag returning 0.0 either.
That sounds fine to me (if someone cares enough to implement it).
Unclear what very_big_int.real should do (it can lose information to
rounding if the int > 2**53, or even overflow -- maybe it should just
return the int!).
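
(As it happens, later CPython releases did give int and float these attributes, and a very big int's .real simply hands back the int; a quick check on any recent Python:)

>>> (3).imag, (3.14).real
(0, 3.14)
>>> huge = 10**5000
>>> huge.real == huge        # stays an int -- no conversion to float, no overflow
True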


Re: [Python-Dev] Floor division

2007-01-22 Thread Tim Peters
[Guido]
> ...
> So you are proposing that Decimal also rip out the % and // operators
> and __divmod__? WFM, but I don't know what Decimal users say (I'm not
> one).

Yes:  it's just as much a floating type as HW binary floats, and all
the same issues come up.  For example, decimal floats are just as
prone to the floor division surprise Raymond started this thread with;
e.g.,

>>> a
Decimal("2.172839486617283948661728393E+29")
>>> b
Decimal("1234567890123456789012345678")
>>> a / b
Decimal("176.0")
>>> a/b == 176
True
>>> a // b
Decimal("175")

That is, floor division of `a` by `b` isn't necessarily the same as
the floor of `a` divided by `b` for decimal floats either, and for
exactly the same reason as when using binary floats:  a/b can suffer a
rounding error due to finite precision, but floor division computes
the floor of the quotient "as if" infinite precision were available.
At least using `decimal` it's easy to /explain/ just by boosting the
precision:

>>> decimal.getcontext().prec *= 2
>>> a / b
Decimal("175.9773179587999814250798308")

This shows quite clearly why a/b rounded up to exactly 176 when
working with half this precision.
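
The same effect is easy to reproduce with values small enough to check by eye (a self-contained sketch; these particular numbers aren't from the original message):

    from decimal import Decimal, getcontext

    getcontext().prec = 28            # the module default
    a = Decimal("9" * 29)             # 29 nines: one digit more than the precision
    b = Decimal(10)
    print a / b                       # rounds up to exactly 1E+28 at 28 digits
    print a // b                      # 9999999999999999999999999999, the true floor
    print a / b == 10**28             # True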

There's also that the decimal __mod__ implementation is like math.fmod
for binary floats, not like Python's int/long __mod__.  Having just
one builtin meaning for numeric `%` as an infix operator is a good
thing, and the int/long meaning is both by far the most common use but
only "works" for types with exactly representable results (ints and
longs built in; rationals if someone adds them; ditto constructive
reals; ... -- but not floats).

> ...
> For ints and floats, real could just return self, and imag could
> return a 0 of the same type as self.

Cool!  Works for me.

> I guess the conjugate() function could also just return self (although I see
> that conjugate() for a complex with a zero imaginary part returns
> something whose imaginary part is -0; is that intentional?

That's wrong, if true:  it should return something with the opposite
sign on the imaginary part, whether or not that equals 0 (+0. and -0.
both "equal 0").

This is harder to check than it should be because it appears there's a
bug in the complex constructor (at least in Python 2.5):  complex(1.,
0.0) and complex(1., -0.0) both appear to create a complex with a +0
imaginary part:

>>> def is_minus_0(x):
...     import math
...     return x == 0.0 and math.atan2(x, x) != 0
>>> is_minus_0(+0.0)  # just showing that "it works"
False
>>> is_minus_0(-0.0)   # ditto
True
>>> is_minus_0(complex(1, 0.0).imag)
False
>>> is_minus_0(complex(1, -0.0).imag)  # should be True
False

OTOH, the complex constructor does respect the sign of the real part:

>>> is_minus_0(complex(0.0, 0.0).real)
False
>>> is_minus_0(complex(-0.0, 0.0).real)
True

complex_new() ends with:

cr.real -= ci.imag;
cr.imag += ci.real;

and I have no idea what that thinks it's doing.  Surely this isn't intended?!:

>>> complex(complex(1.0, 2.0), complex(10.0, 20.0))
(-19+12j)

WTF?  In any case, that's also what's destroying the sign of the
imaginary part in complex(1.0, -0.0).

Knowing that a -0 imaginary part can't be constructed in the obvious way:

>>> is_minus_0(complex(0, 0).conjugate().imag)
True

So conjugate() does flip the sign on a +0 imaginary part, and:

>>> is_minus_0(complex(0, 0).conjugate().conjugate().imag)
False

so it also flips the sign on a -0 imaginary part.  That's all as it should be.

Hmm.  You meant to ask something different, but I actually answered that too ;-)

> I'd rather not have to do that when the input is an int or float, what do you
> think?)

Returning `self` is fine with me for those, although, e.g., it does mean that

(3).conjugate().imag
and
(complex(3)).conjugate().imag

are distinguishable with enough pain.  I don't care.  I think of
integers and floats as "not having" an imaginary part more than as
having a 0 imaginary part (but happy to invent a 0 imaginary part if
someone asks for one).


Re: [Python-Dev] Floor division

2007-01-22 Thread Tim Peters
[Guido]
>> That really sucks, especially since the whole point of making int
>> division return a float was to make the integers embedded in the
>> floats... I think the best solution would be to remove the definition
>> of % (and then also for divmod()) for floats altogether, deferring to
>> math.fmod() instead.

[Nick Maclaren]
> Please, NO!!!
>
> At least not without changing the specification of the math module.
> The problem with it is that it is specified to be a mapping of the
> underlying C library, complete with its  error handling.
> fmod isn't bad, as  goes, BUT:
>
> God alone knows what happens with fmod(x,0.0), let alone fmod(x,-0.0).
> C99 says that it is implementation-defined whether a domain error
> occurs or the function returns zero, but domain errors are defined
> behaviour only in C90 (and not in C99!)  It is properly defined only
> if Annex F is in effect (with all the consequences that implies).
>
> Note that I am not saying that syntactic support is needed, because
> Fortran gets along perfectly well with this as a function.  All I
> am saying is that we want a defined function with decent error
> handling!  Returning a NaN is fine on systems with proper NaN support,
> which is why C99 Annex F fmod is OK.

math.fmod is 15 years old -- whether or not someone likes it has
nothing to do with whether Python should stop trying to use the
current integer-derived meaning of % for floats.

On occasion we've added additional error checking around functions
inherited from C.  But adding code to return a NaN has never been
done.  If you want special error checking added to the math.fmod
wrapper, it would be easiest to "sell" by far to request that it raise
ZeroDivisionError (as integer mod does) for a modulus of 0, or
ValueError (Python's historic mapping of libm EDOM, and what Python's
fmod(1, 0) already does on some platforms).  The `decimal` module
raises InvalidOperation in this case, but that exception is specific
to the `decimal` module for now.

>> For ints and floats, real could just return self, and imag could
>> return a 0 of the same type as self. I guess the conjugate() function
>> could also just return self (although I see that conjugate() for a
>> complex with a zero imaginary part returns something whose imaginary
>> part is -0; is that intentional? I'd rather not have to do that when
>> the input is an int or float, what do you think?)

> I don't see the problem in doing that - WHEN implicit conversion
> to a smaller domain, losing information, raises an exception.

Take it as a pragmatic fact that it wouldn't.  Besides, e.g., the
conjugate of 10**50000 is exactly 10**50000 mathematically.  Why raise
an exception just because it can't be represented as a float?  The
exact result is easily supplied with a few lines of "obviously
correct" implementation code (incref `self` and return it).


Re: [Python-Dev] Floor division

2007-01-22 Thread Tim Peters
[Guido]
>> The ints aren't really embedded in Decimal, so we don't have to do
>> that there (but we could).

[Facundo Batista]
> -0.
>
> If we can't achieve it without disturbing the rest of Python, I'll try
> as much as possible to keep what the Spec proposes.

Which "Spec"?  For example, floor division isn't mentioned at all in
IBM's proposed decimal standard, or in PEP 327, or in the Python
Library Reference section on `decimal`.  It's an artifact of trying to
extend Python's integer mod definition to floats, and for reasons
explained in this thread (for the 27th time ;-)), that definition
doesn't make good sense for floats.  The IBM spec defines `remainder`
and `remainder-near` for floats, and those do make good sense for
floats.  But they're /different/ definitions than Python uses for
integer mod.

Do note that this discussion is only about Python 3.  Under the view
that it was a (minor, but real) design error to /try/ extending
Python's integer mod definition to floats, if __mod__, and __divmod__
and __floordiv__ go away for binary floats in P3K they should
certainly go away for decimal floats in P3K too.  And that's about
syntax, not functionality:  the IBM spec's "remainder" and
"remainder-near" still need to be there, it's "just" that a user would
have to get at "remainder" in a more long-winded way (analogous to
that a P3K user would have to spell out "math.fmod" to get at a mod
function for binary floats).


Re: [Python-Dev] Floor division

2007-01-23 Thread Tim Peters
[Anders J. Munch]
> What design error?  float.__mod__ works perfectly.
>
> >>> -1 % 50
> 49
> >>> -1.0 % 50.0
> 49.0
> >>>

Please read the whole thread.  Maybe you did, but you said nothing
here that indicated you had.  The issues aren't about tiny integers
that happen to be in float format, where the result is exactly
representable as a float too.  Those don't create problems for any
definition of mod.  But even non-tiny exact integers can.

> It's only Decimal.__mod__ that's inconsistent.  float.__mod__ has the
> usual floating-point inaccuracies, but then with float that goes with
> the territory.

No.  Decimal.__mod__ always returns the mathematically exact result.
It has this in common with math.fmod() (in fact, math.fmod() and
Decimal.__mod__() have the same definition, ignoring IEEE endcases).
It's impossible to do so under Python's integer-derived mod
definition.  Read the whole thread for why -- we can't even guarantee
that abs(a%b) < abs(b) for non-zero finite floats under the current
mod, and that has in fact been a source of complaints over the years.
math.fmod and Decimal.__mod__ do guarantee abs(a%b) < abs(b) for all
non-zero finite floats, and moreover guarantee that a%b is an exact
integer multiple of `b` away from `a` (although that integer may not
be representable as a float) -- again it's impossible for the a -
floor(a/b)*b definition to do so for floats.
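
Both halves of that are easy to check with the earlier example:

>>> -1 % 1e100               # floor-based mod: the result is not < abs(b)
1e+100
>>> import math
>>> math.fmod(-1, 1e100)     # exact, and takes the sign of a
-1.0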

> I've had occasion to use it, and it was a pleasant surprise that it
> "just worked", so I didn't have to go to the math module and ponder
> over the difference between modulo or remainder.

There's no such distinction in Python's math module -- fmod is the
only modular reduction function in the math module.  The `decimal`
module has two such functions (see earlier messages in this thread for
examples -- neither matches Python's mod function).


Re: [Python-Dev] Floor division

2007-01-23 Thread Tim Peters
[Guido]
>>> I guess the conjugate() function could also just return self (although I see
>>> that conjugate() for a complex with a zero imaginary part returns
>>> something whose imaginary part is -0; is that intentional?

[TIm Peters]
>> That's wrong, if true:  it should return something with the opposite
>> sign on the imaginary part, whether or not that equals 0 (+0. and -0.
>> both "equal 0").

|[Nick Maclaren]
> Grrk.  Why?  Seriously.

Seriously:  because there's some reason to do so and no good reason
not to.  This is the current complex conjugate implementation:

static PyObject *
complex_conjugate(PyObject *self)
{
Py_complex c;
c = ((PyComplexObject *)self)->cval;
c.imag = -c.imag;
return PyComplex_FromCComplex(c);
}

Complicating that to make a special case of c.imag == 0 is simply
goofy without a /compelling/ reason to do so.  As I read the C99
section on "conj", it's also their intent that the sign on the
imaginary part be flipped regardless of value.

> IEEE 754 signed zeroes are deceptive enough for float, but are
> a gibbering nightmare for complex; Kahan may be
> able to handle them, but mere mortals can't.  Inter alia, the only
> sane forms of infinity for complex numbers are a SINGLE one (the
> compactified model) and to map infinity into NaN (which I prefer,
> as it leads to less nonsense).
>
> And, returning to 'floor' - if one is truncating towards -infinity,
> should floor(-0.0) deliver -1.0, 0.0 or -0.0?

I'd leave a zero argument alone (for ceiling too), and am quite sure
that's "the right" 754-ish behavior.

>> math.fmod is 15 years old -- whether or not someone likes it has
>> nothing to do with whether Python should stop trying to use the
>> current integer-derived meaning of % for floats.

> Eh?  No, it isn't.  Because of the indirection to the C library, it
> is changing specification as we speak!  THAT is all I am getting at;
> not that the answer might not be A math.fmod with defined behaviour.

Couldn't quite parse that, but nearly all of Python's math-module
functions inherit most behavior from the platform libm.  This is often
considered to be a feature:  the functions called from Python
generally act much like they do when called from C or Fortran on the
same platform, easing cross-language development on a single platform.

It's never been Python's intent to define all behavior here, and
largely because Python isn't a math-library development project.  To
the extent that enough people care enough to standardize C's libm
endcase behavior across platforms, Python inherits that too.  Not much
of an inheritance so far ;-)

Do note the flip side:  to the extent that different platform
religions refuse to standardize libm endcase behavior, Python plays
along with whatever libm gods the platform it's running on worships.
That's of value to some too.

>> On occasion we've added additional error checking around functions
>> inherited from C.  But adding code to return a NaN has never been
>> done.  If you want special error checking added to the math.fmod
>> wrapper, it would be easiest to "sell" by far to request that it raise
>> ZeroDivisionError (as integer mod does) for a modulus of 0, or
>> ValueError (Python's historic mapping of libm EDOM, and what Python's
>> fmod(1, 0) already does on some platforms).  The `decimal` module
>> raises InvalidOperation in this case, but that exception is specific
>> to the `decimal` module for now.

> I never said that it should; I said that it is reasonable behaviour on systems
> that support them.  I personally much prefer an exception in this case.

So which one would you prefer?  As explained, there are 3 plausible candidates.

You seem to be having some trouble taking "yes" for an answer here ;-)

> What I was trying to point out is that the current behaviour is
> UNDEFINED (and may give total nonsense).  That is not
> good.

Eh -- I can't get excited about it.  AFAIK, in 15 years nobody has
complained about passing a 0 modulus to math.fmod (possibly because
most newbies use the Windows distro, and it does raise ValueError
there).

>>>> For ints and floats, real could just return self, and imag could
>>>> return a 0 of the same type as self. I guess the conjugate() function
>>>> could also just return self (although I see that conjugate() for a
>>>> complex with a zero imaginary part returns something whose imaginary
>>>> part is -0; is that intentional? I'd rather not have to do that when
>>>> the input is an int or float, what do you think?)

>>> I don't see the problem in doing that - W

Re: [Python-Dev] Floor division

2007-01-23 Thread Tim Peters
[Tim Peters]
>> Please read the whole thread.  Maybe you did, but you said nothing
>> here that indicated you had.  The issues aren't about tiny integers
>> that happen to be in float format, where the result is exactly
>> representable as a float too.  Those don't create problems for any
>> definition of mod.  But even non-tiny exact integers can.

[Anders J. Munch]
> I did read the whole thread, and I saw your -1%1e100 example.  Mixing
> floating-point numbers of very different magnitude can get you in
> trouble - e.g. -1+1e100==1e100.  I don't think -1%1e100 is all that
> worse.

Except that it's very easy to return an exactly correct result in that
case:  -1.  This isn't like floating addition, where rounding errors
/must/ occur at times.  It's never necessary to suffer rounding errors
for a mod function defined with floats in mind.  Indeed, that's why
the C standards define fmod the way they do, and why IBM's proposed
standard for decimal floating arithmetic defines it the same way.

Python's definition of mod makes great sense for integers, but doesn't
generalize nicely.

>>> It's only Decimal.__mod__ that's inconsistent.  float.__mod__ has the
>>> usual floating-point inaccuracies, but then with float that goes with
>>> the territory.

>> No.  Decimal.__mod_  always returns the mathematically exact result.

> I meant inconsistent with integers.

While I was responding to your second sentence there, not your first.
You can tell because I didn't say anything after your first sentence
;-)

No, it's not always true that "with float [inaccuracies] goes with the
territory".  mod "should be" like absolute value and unary minus this
way:  always exact.

> People are familiar with the semantics of % on integers, because they use
> it all the time.

I'm not sure how many people are actually familiar with the semantics
of integer % when mixing signs, in part because there's no consistency
across languages in that area so people with a lot of experience tend
to avoid it.  I agree integer % is heavily used regardless.

> % on float is a natural extension of that and hence unsurprising.

It was natural to /want/ to extend it to floats.  That didn't work
well, and to the contrary it's surprising precisely /because/ the
behavior people enjoy with integers can fail when it's applied to
floats.  Having cases where abs(a%b) >= abs(b) is a
crash-the-space-shuttle level of surprise, especially since I know of
no other language in which that's possible.  It's not possible with
ints or longs in Python either -- or with math.fmod applied to floats.

> % on Decimal is exact and correct, but surprising all the same.

Which is why I don't want binary or decimal floats to support infix
"%" as a spelling in P3K.  I don't believe floating mod is heavily
used, and if so there's scant need for a one-character spelling -- and
if there's a method or function name to look up in the docs, a user
can read about what they're getting.

In fact, I expect the decimal module's "remainder-near" is what most
people using mod on floats actually want most of the time:  they
couldn't care less about the sign of the result, but they do want its
absolute value to as small as possible.  This is because floating
mod's one "natural" mixed-sign use is for argument reduction before
feeding the remainder into a series expansion.


Re: [Python-Dev] Fwd: Re: Floor division

2007-01-23 Thread Tim Peters
...

[Facundo]
>> We'll have to deprecate that functionality, with proper warnings (take
>> not I'm not following the thread that discuss the migration path to 3k).
>>
>> And we'll have to add the method "remainder" to decimal objects (right
>> now we have only "remainder_near" in decimal objects, and both
>> "remainder" and "remainder_near" in context).

[Raymond]
> Whoa. This needs more discussion with respect to the decimal module.  I'm
> not ready to throw-out operator access to the methods provided by the spec.

decimal's __floordiv__ and __divmod__ have nothing to do with the
spec.  They were Python-specific extensions apparently added to try to
fit in better with Python's entirely different (than either of
decimal's) definition of modular reduction.  In that respect they
succeeded by being jarringly perverse:  decimal's __floordiv__(a, b)
does not return the floor of a/b, it returns a/b truncated toward 0:

>>> -1 // 10
-1
>>> Decimal(-1) // Decimal(10)
Decimal("-0")

Even you might agree that's an odd thing for a function named "floor
div" to do ;-)

I say that's not useful.  The only method here that does have
something to do with the spec is decimal's __mod__, an implementation
of the spec's "remainder" operation.  Guido agreed last May to take
away the "%" infix spelling of mod for binary floats, and I can't
think of a reason for why decimal floats should be immune.

>  Also, I do not want the Py2.x decimal module releases to be complexified
> and slowed-down by Py3.0 deprecation warnings.  The module is already slow
> and has ease-of-use issues.

We're not talking about the usual arithmetic operations here, just
"%".  The functionality wouldn't go away, just the infix 1-character
spelling.  Note that the implementation of "remainder" is
extraordinarily expensive, just as the implementation of fmod for
binary floats is extraordinarily expensive:  it's not just the
precision at work, but also the difference in exponents.

+ - * / don't have the latter factor to contend with (hence
"extraordinarily").  Couple this with that floating mod of any kind is
almost surely rarely used, and worry about the overhead of a method or
function spelling seems a waste of tears.

Irony:  I've used decimal's remainder_near in real life, which has no
operator spelling.  I haven't used decimal's `remainder` except to
give examples in this thread :-)


Re: [Python-Dev] Floor division

2007-01-23 Thread Tim Peters
[Armin]
>>> BTW - isn't that case in contradiction with the general Python rule that
>>> if b > 0, then a % b should return a number between 0 included and b
>>> excluded?

[Tim]
>> Sure.

[Armin]
> You're not addressing my point, though, so I was probably not clear
> enough.

"Sure" is the answer to all possible points :-)

> My surprize was not that a % b was equal to b in this case, and
> so not strictly smaller than b (because I expect such things when
> working with floats).

That's unnecessarily pessimistic:  there's no problem defining modular
reduction for floats that's exact.

>  My surprize was that Decimals don't even try to
> be between 0 and b, and sometimes give a result that is far outside the
> range.  This surprized me because I expected floats and Decimals to try
> to give "approximately equal" results...

Sure, you do expect that, and sure, they don't.  Now what?

The `decimal` module is an implementation of IBM's proposed standard
for decimal arithmetic:

http://www2.hursley.ibm.com/decimal/

It requires two mod-like operations, neither of which behave like
Python's "number theoretic" mod.  Nobody involved cared to extend the
proposed standard by adding a third mod-like operation.

For some reason `decimal` implemented __mod__ as the proposed
standard's "remainder" operation.  That's the immediate source of your
surprise.  IMO `decimal` should not have implemented __mod__ at all,
as Python's number-theoretic mod is not part of the proposed standard,
is a poor basis for a floating-point mod regardess, and it was a
mistake to implement decimal % decimal in a way so visibly different
from float % float and integer % integer:  it confuses the meaning of
"%".  That's your complaint, right?

My preferred "solution" is to remove __mod__, __divmod__, and
__floordiv__ from all flavors of floats (both binary and decimal) in
P3K.

That addresses your concern in that "decimal % decimal" would raise an
exception in P3K (it wouldn't be implemented).  A user who wanted some
form of decimal float modular reduction would need to read the docs,
decide which of the two supported such functions they want, and
explicitly ask for it.  Similarly, a user who wanted some from of
binary float modular reduction would need to invoke math.fmod()
explicitly -- or, possibly, if someone contributes the code needed to
implement C99's additional "remainder" function cross-platform, that
too:

IBM's spec          same-as C
----------          ---------
remainder           fmod (C89 & C99)
remainder-near      remainder (C99 & IEEE-754)

It's not a coincidence that all standards addressing modular reduction
for floats converge on essentially those two definitions.

I can't be made to feel guilty about this :-)

For P2 I don't propose any changes -- and in which case my only
response is just "sure - yes - right - decimal's __mod__ in fact does
not act like Python's integer __mod__ in mixed-sign cases -- and
neither does decimal's __floordiv__ or decimal's __divmod__ act like
their integer counterparts in mixed-sign cases".


Re: [Python-Dev] Having trouble committing

2007-01-25 Thread Tim Peters
[Brett Cannon]
> I am trying to commit to the 2.5 branch and I am getting an error:
>
> svn: Commit failed (details follow):
> svn: Can't create directory
> '/data/repos/projects/db/transactions/53566-1.txn': Permission denied
>
> Anyone know what is going on?

Did you do `svn info` in that directory to make sure you have a
writable ("svn+ssh") checkout of that part?


Re: [Python-Dev] Floor division

2007-01-25 Thread Tim Peters
[Guido]
> The only thing I would miss about this is that I am used to write
> certain timing loops that like to sync on whole seconds, by taking
> time.time() % 1.0 which nicely gives me the milliseconds in the
> current second. E.g.
>
> while True:
>   do_something_expensive_once_a_second_on_the_second()
>   now = time.time()
>   time.sleep(1.0 - (now % 1.0))

Indeed, the only use of floating % in the standard library I recall is
the ancient

return (x/30269.0 + y/30307.0 + z/30323.0) % 1.0

from the original Wichmann-Hill random() generator.  Maybe we could
introduce "%" as a unary prefix operator, where %x means "the
fractional part of x" ;-)

Are you opposed to importing `math`?  The infix spelling had important
speed benefits in random.py (random() was the time-critical function
in that module), but the use above couldn't care less.

 time.sleep(1.0 - math.fmod(now, 1.0))

would do the same, except would be easier to reason about because it's
trivially guaranteed that 0.0 <= math.fmod(x, 1.0) < 1.0 for any
finite float x >= 0.0.  The same may or may not be true of % (I would
have to think about that, and craft a proof one way or the other -- if
it is true, it would have to invoke something special about the
modulus 1.0, as the inequality doesn't hold for % for some other
modulus values).

Better, you could use the more obvious:

time.sleep(math.ceil(now) - now)

That "says" as directly as possible that you want the number of
seconds needed to reach the next integer value (or 0, if `now` is
already an integer).

> I guess I could use (now - int(now)) in a pinch,

That would need to be

time.sleep(1.0 - (now - int(now)))

I'd use the `ceil` spelling myself, even today -- but I don't suffer
"it's better if it's syntax or __builtin__" disease ;-)

> assuming int() continues to truncate.

Has anyone suggested to change that?  I'm not aware of any complaints
or problems due to int() truncating.  There have been requests to add
new kinds of round-to-integer functions, but in addition to int().


Re: [Python-Dev] Floor division

2007-01-25 Thread Tim Peters
[Guido]
> ...
> I don't care about the speed, but having to import math (which I
> otherwise almost never need) is a distraction, and (perhaps more so) I
> can never remember whether it's modf() or fmod() that I want.

fractional part of x == fmod(x, 1.0) == modf(x)[0], so you could use
either.  Since modf returns a tuple and fmod returns a float, you'll
get an exception quickly if you pick the wrong one :-)  The name
"modf" certainly sucks.

...

>>> assuming int() continues to truncate.

>> Has anyone suggested to change that?  I'm not aware of any complaints
>> or problems due to int() truncating.  There have been requests to add
>> new kinds of round-to-integer functions, but in addition to int().

> I thought those semantics were kind of poorly specified. But maybe
> that was long ago (when int() just did whatever (int) did in C) and
> it's part of the language now.

"(int)float_or_double" truncates in C (even in K&R C) /provided that/
the true result is representable as an int.  Else behavior is
undefined (may return -1, may cause a HW fault, ...).

So Python uses C's modf() for float->int now, which is always defined
for finite floats, and also truncates.


Re: [Python-Dev] Floor division

2007-01-25 Thread Tim Peters
[Armin]
> Thanks for the clarification.  Yes, it makes sense that __mod__,
> __divmod__ and __floordiv__ on float and decimal would eventually follow
> the same path as for complex (where they make even less sense and
> already raise a DeprecationWarning).

This truly has nothing to do with complex.  All meanings for "mod"
(whether in Python, IEEE-754, C89, C99, or IBM's proposed decimal
standard) are instances of this mathematical schema:

[1]    x%y = x - R(x/y)*y

by definition, where R(z) is a specific way of rounding z to an exact
mathematical ("infinite precision") integer.

For int, long, and float, Python uses R=floor (and again it's
important to note that these are mathematical statements, not
statements about computer arithmetic).

For ints, longs, and (mathematical) reals, that's the usual
"number-theoretic" definition of mod, as, e.g., given by Knuth:

mod(x, y) = x - floor(x/y)*y

It's the only definition of all those mentioned here that guarantees
the result is non-negative when the modulus (y) is positive, and
that's a very nice property for integers.  It's /also/ the only
definition of all those mentioned here where the exact mathematical
result may /not/ be exactly representable as a computer float when x
and y are computer floats.

It's that last point that makes it a poor definition for working with
computer floats:  for any other plausible way of defining "mod", the
exact result /is/ exactly representable as a computer float.  That
makes reasoning much easier, just as you don't have to think twice
about seeing abs() or unary minus applied to a float.  No information
is lost, and you can rely on expected invariants like

[2]    0 <= abs(x%y) < abs(y)    if y != 0 and finite

provided one of the non- R=floor definitions of mod is used for computer floats.
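
A tiny sketch of how the choice of R shows up for doubles (values picked here just for illustration):

>>> import math
>>> -7.0 % 4.0               # R = floor: result takes the sign of y
1.0
>>> math.fmod(-7.0, 4.0)     # R = truncate toward 0: result takes the sign of x
-3.0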

For complex, Python uses R(z) = floor(real_part_of(z)).  AFAICT,
Python just made that up out of thin air.  There are no known use
cases, and it's bizarre.  For example, [2] isn't even approximately
reliable:

>>> x = 5000 + 100j
>>> y = 1j
>>> x % y
(5000+0j)
>>> print abs(x%y), abs(y)
5000.0 1.0

In short, while Python "does something" for complex % complex, what it
does seems more-than-less arbitrary.  That's why it was deprecated
years ago, and nobody complained.

But for computer floats, there are ways to instantiate R in [1] that
work fine, returning a result that is truly (exactly) congruent to x
modulo y, even though x, y and the result are all computer floats.
Two of those ways:

The C89 fmod = C99 fmod = C99 integral "%" = IBM spec "remainder"
picks R(z) = round z to the closest integer in the direction of 0.

The C99 "remainder" = IBM spec "remainder-near" = IEEE-754 REM picks
R(z) = round z to the nearest integer, or if z is exactly halfway
between integers to the nearest even integer.  This one has the nice
property (for floats!) that [2] can be strengthened to:

0 <= abs(x%y) <= abs(y)/2

That's often useful for argument reduction of periodic functions,
which is an important use case for a floating-point mod.  You
typically don't care about the sign of the result in that case, but do
want the absolute value as small as possible.  That's probably why
this was the only definition standardized by 754.

In contrast, I don't know of any use for complex %.


Re: [Python-Dev] Floor division

2007-01-25 Thread Tim Peters
...

[Tim]
>> fractional part of x == fmod(x, 1.0) == modf(x)[0], so you could use
>> either.

[Anders J. Munch]
> Actually, on the off chance that time.time() is negative, he can use
> neither.  It has to be math.ceil, float.__mod__ or divmod.

If time.time() is negative, I expect this would be the least of his worries :-)

Even on most Unixish boxes, Python's time.time() is immune to "the
year 2038 problem" anyway, since it uses POSIX's gettimeofday()
instead of C's time() when it can.


Re: [Python-Dev] Floor division

2007-01-25 Thread Tim Peters
[Armin Rigo]
>> Thanks for the clarification.  Yes, it makes sense that __mod__,
>> __divmod__ and __floordiv__ on float and decimal would eventually follow
>> the same path as for complex (where they make even less sense and
>> already raise a DeprecationWarning).

[Nick Maclaren]
> Yes.  Though them not doing so would also make sense.  The difference
> is that they make no mathematical sense for complex, but the problems
> with float are caused by floating-point (and do not occur for the
> mathematical reals).
>
> There is an argument for saying that divmod should return a long
> quotient and a float remainder,

It could, but who would have a (sane) use for a possibly 2000-bit quotient?

> which is what C99 has specified for remquo (except that it requires
> only the last 3 bits of the quotient for reasons that completely baffle
> me).

C99 has no integral type capable of representing the full quotient,
but the last few bits may suffice for performing argument reduction in
the implementation of a periodic function.  Surely you did that for
trig functions in the bad old days?  For example, express input x as
N*(pi/2) + r, where |r| <= pi/4.  Then N can be expressed as 4*n1 +
n2, with n2 in [0, 1, 2, 3], and:

cos(x) =
cos((4*n1+n2)*(pi/2) + r) =
cos(n1*(2*pi) + n2*pi/2 + r) =  [can ignore integral multiples of 2*pi]
cos(n2*pi/2 + r)

Then the combination of n2 and the sign of r tell you which quadrant
you're in, and various cos-specific rules can deliver the result from
that and cos(r) or sin(r).

The point is that only the last 2 bits of the quotient are needed to
determine n2, and all other bits in N are irrelevant to the result.
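
In Python-ish form the reduction looks roughly like this (a sketch only: it uses double-precision pi, so the caveats about huge arguments discussed later in this thread apply):

    import math

    def cos_via_quadrants(x):
        half_pi = math.pi / 2
        n = int(round(x / half_pi))   # the quotient N; only its last 2 bits matter
        r = x - n * half_pi           # |r| <= pi/4, up to rounding error
        n2 = n % 4
        if n2 == 0:
            return math.cos(r)
        elif n2 == 1:
            return -math.sin(r)
        elif n2 == 2:
            return -math.cos(r)
        else:                         # n2 == 3
            return math.sin(r)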

> Linux misimplemented that the last time I looked.
>
> Personally, I think that it is bonkers, as it is fiendishly expensive
> compared to its usefulness - especially with Decimal!

Ah, but the IBM spec weasels out:  it raises an exception if you try
to compute "remainder" or "remainder-near" of inputs when the
exponents differ by "too much".

This is a bit peculiar to me, because there are ways to compute
"remainder" using a number of operations proportional to the log of
the exponent difference.  It could be that people who spend their life
doing floating point forget how to work with integers ;-)

For example, what about 9e9 % 3.14?

9e9 = q*3.14 + r

if and only if (multiplying both sides by 100)

9e11 = 314*q + 100*r

So what's mod(9 * 10**11, 314)?  Binary-method modular
exponentiation goes fast:

>>> pow(10, 11, 314)
182
>>> _ * 9 % 314
68

So

9e11 = 314*q + 68

exactly for some integer q, so (dividing both sides by 100)

9e9 = 3.14*q + 0.68

exactly for the same integer q.  Done.  It doesn't require 10
long-division steps, it only requires about two dozen modular
multiplications wrt the relatively tiny modulus 314.
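
The same idea as a tiny function (a sketch; assumes c >= 0, B > 0 and e + d >= 0):

    def decimal_mod(c, e, B, d):
        # Exact a % b where a = c * 10**e and b = B / 10**d, all ints.
        # Returns (numerator, scale) with the remainder equal to numerator/scale.
        numerator = c * pow(10, e + d, B) % B
        return numerator, 10 ** d

    # the example above:  a = 9e9 = 9 * 10**9,  b = 3.14 = 314 / 10**2
    print decimal_mod(9, 9, 314, 2)    # (68, 100), i.e. 9e9 % 3.14 == 0.68 exactly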

OTOH, I don't know how to get the last bit of `q` with comparable
efficiency, and that's required to implement the related
"remainder-near" in "halfway cases".

> But it isn't obviously WRONG.

For floats, fmod(x, y) is exactly congruent to x modulo y -- I don't
think it's possible to get more right than exactly right ;-)


Re: [Python-Dev] Floor division

2007-01-25 Thread Tim Peters
[Tim Peters]
>> ...
>> Maybe we could introduce "%" as a unary prefix operator, where
>> %x means "the fractional part of x" ;-)

[Anders J. Munch]
> What'ya talking about?  Obviously it should be a suffix operator ;-)

Na -- that would be confusing ;-)

...

>>  time.sleep(1.0 - math.fmod(now, 1.0))
>>
>> would do the same, except would be easier to reason about because it's
>> trivially guaranteed that 0.0 <= math.fmod(x, 1.0) < 1.0 for any
>> finite float x >= 0.0.  The same may or may not be true of % (I would
>> have to think about that, and craft a proof one way or the other -- if
>> it is true, it would have to invoke something special about the
>> modulus 1.0, as the inequality doesn't hold for % for some other
>> modulus values).

And as you note later, x%y == fmod(x, y) whenever x and y have the
same sign (well, given the way CPython implements float.__mod__
today), so there's actually an easy proof.

> Other modulus values are important:

On an importance scale of 1 to 10, 9 or 10 ;-) ?

> The attraction of Guido's formula is that he could just as easily have
> used 60.0 or 0.001 if minute or millisecond intervals were desired, or
> even som user-specified arbitrary dt.  Then we're comparing dt-now%dt
> to (1.0-int(now/dt))*dt or (math.ceil(now/dt)-now/dt)*dt.

time.time() is never negative in Python (see other reply), so the
trivial respelling dt-fmod(now, dt) does the same.
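
So the generalized loop body could be written, e.g. (a sketch; assumes dt > 0 and a nonnegative clock):

    import math, time

    def sleep_to_next_multiple(dt):
        now = time.time()
        # dt - fmod(now, dt) is in (0.0, dt]: sleeps a full dt if `now` is
        # already an exact multiple, just like the original 1-second version
        time.sleep(dt - math.fmod(now, dt))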

> Fortunately, for all a,b>0, mathematically math.fmod(a,b) is equal to
> a%b, so if the former is exactly representable, so is the latter.

Yup.  Also when `a` and `b` both less than 0.  This /follows/ from
that when `a` and `b` have the same sign, the mathematical a/b is >=
0, so truncation is the same as the floor.  Therefore the mathematical

a - floor(a/b)*b   # Python __mod__
and
a - truncate(a/b)*b  # C fmod

are exactly the same whenever a and b have the same sign.

> Which is borne out in floatobject.c: float_rem and float_divmod just
> pass on the C fmod result if (a < 0) == (b < 0).

Yes.  In fact, I wrote all that code :-)


Re: [Python-Dev] Floor division

2007-01-26 Thread Tim Peters
[Tim (misattributed to Guido)]
>> "(int)float_or_double" truncates in C (even in K&R C) /provided that/
>> the true result is representable as an int.  Else behavior is
>> undefined (may return -1, may cause a HW fault, ...).

[Nick Maclaren]
> Actually, I have used Cs that didn't, but haven't seen any in over
> 10 years.

I believe those.

> C90 is unclear about its intent,

But am skeptical of that.  I don't have a copy of C90 here, but before
I wrote that I checked Kernighan & Ritchie's seminal C book, Harbison
& Steele's generally excellent "C: A Reference Manual" (2nd ed), and a
web version of Plauger & Brodie's "Standard C":

 http://www-ccs.ucsd.edu/c/

They all agree that the Cs they describe (all of which predate C99)
convert floating to integral types via truncation, when possible.

> but C99 is specific that truncation is towards zero.

As opposed to what?  Truncation away from zero?  I read "truncation"
as implying toward 0, although the Plauger & Brodie source is explicit
about "the integer part of X, truncated toward zero" for the sake of
logic choppers ;-)

> This is safe, at least for now.

>> So Python uses C's modf() for float->int now, which is always defined
>> for finite floats, and also truncates.

> Yes.  And that is clearly documented and not currently likely to
> change, as far as I know.

I don't expect to see another C standard in my lifetime, given that
some major C compiler vendors still ignore C99 (and given that my
expected remaining lifetime is much less than that of most people
reading this ;-)).


Re: [Python-Dev] Floor division

2007-01-26 Thread Tim Peters
[Tim Peters]
>> ...
>> [it would be possible for float.__divmod__ to return the exact
>> quotient], but who would have a (sane) use for a possibly 2000-bit
>> quotient?

[Nick Maclaren]
> Well, the 'exact rounding' camp in IEEE 754 seem to think that there
> is one :-)
>
> As you can gather, I can't think of one.  Floating-point is an inherently
> inaccurate representation for anything other than small integers.

Well, bounded.  Python's decimal fp can handle millions of digits, if
you want that and are very patient ;-)

OTOH, I am a fan of analyzing FP operations as if the inputs were in
fact exactly what they claim to be, which 754 went a long way toward
popularizing.  That largely replaced mountains of idiosyncratic
"probabilistic arguments" (and where it seemed no two debaters ever
agreed on the "proper" approach)  with a common approach that
sometimes allows surprisingly sharp analysis.  Since I spent a good
part of my early career as a professional apologist for Seymour Cray's
"creative" floating point, I'm probably much more grateful to leave
sloppy arithmetic behind than most.

...

>> This [that IBM's spec calls for an exception if remainder's inputs'
>> exponents differ by "too much"] is a bit peculiar to me, because
>> there are ways to compute "remainder" using a number of operations
>> proportional to the log of the exponent difference.  It could be that
>> people who spend their life doing floating point forget how to work
>> with integers ;-)

> Aargh!  That is indeed the key!  Given that I claim to know something
> about integer arithmetic, too, how can I have been so STUPID?

You're not alone.  I thought of this a decade ago, but have never seen
it implemented, or even mentioned.  All implementations of fmod I've
seen emulate long binary division one bit at a time; the IBM spec
seems to assume that's how it would be done for decimal too (why else
mandate an exception instead if the exponent delta "is large"?); and
so on.
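
A sketch of the trick being alluded to (dec_mod is a made-up name; it
assumes a non-negative coefficient-and-exponent representation): with
decimals, the remainder falls out of modular exponentiation, never
touching the huge quotient.

    def dec_mod(cx, ex, cy, ey):
        # x = cx * 10**ex, y = cy * 10**ey, with ex >= ey >= 0, cx >= 0, cy > 0.
        # x mod y == ((cx * 10**(ex-ey)) mod cy) * 10**ey, and the inner mod
        # needs only O(log(ex - ey)) multiplications via 3-argument pow.
        return (cx * pow(10, ex - ey, cy)) % cy * 10**ey

    # e.g. 123e5 mod 7e2:
    print(dec_mod(123, 5, 7, 2))   # 300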

> Yes, you are right, and that is the only plausible way to calculate the
> remainder precisely.  You don't get the quotient precisely, which is
> what my (insane) specification would have provided.

And, alas, without the last bit of the quotient IBM's "remainder-near"
(== C99's "remainder" == 754's REM) can't resolve halfway cases in the
mandated way.

> I would nitpick with your example, because you don't want to reduce
> modulo 3.14 but modulo pi

I didn't have pi in mind at all.  I just picked 3.14 as an arbitrary
decimal modulus with few digits, to make the example easy to follow.
Could just as well have been 2.72 (which has nothing to do with e
either ;-)).

> and therefore the modular arithmetic is rather more expensive (given
> Decimal).  However, it STILL doesn't help to make remquo useful!
>
> The reason is that pi is input only to the floating-point precision,
> and so the result of remquo for very large arguments will depend
> more on the inaccuracy of pi as input than on the mathematical
> result.  That makes remquo totally useless for the example you quote.

That's not our choice to make.  Many math libraries still use a
"small" approximation to pi for trig argument reduction, and that's
their choice.  Heck, a 66-bit value for pi is built in to the
Pentium's FSIN and FCOS instructions.

>>> math.sin(math.pi)
1.2246063538223773e-016

That was on Windows with Python 2.5, using the MSVC compiler.  The
result indicates it uses the FSIN instruction.  Same thing but under
the Cygwin Python, whose libm uses "infinite precision pi" trig
reduction:

>>> math.sin(math.pi)
1.2246467991473532e-16

That one is "correct" (close to the best double approximation to the
sine of the best double approximation to pi).  The FSIN result is off
by about 164 billion(!) ULP, but few people care.

Anyway, I simply gave cosine arg reduction as /an example/ of the kind
of reduction strategy for which remquo can be useful.  You said you
were baffled by why C99 only required the last 3 bits.  The example
was a hint about why some functions can indeed find that to be useful,
not an exhaustive rationale.  It's really off-topic for Python-Dev, so
I didn't/don't want to belabor it.

> Yes, I have implemented 'precise' range reduction, and there is no
> substitute for using an arbitrary precision pi value :-(

I have too (for 754 doubles), coincidentally at the same time KC Ng
was implementing it for fdlibm.  I found some bugs in fdlibm's trig
functions at the time by generating billions of random inputs and
comparing his library-in-progress's results to mine.  That was fun.
My users (this was don

Re: [Python-Dev] Python's C interface for types

2007-01-26 Thread Tim Peters
[Nick Maclaren]
> ...
> 2) _PyLong_FromByteArray and _PyLong_AsByteArray aren't in
> the API

They're not in the public API, which is equivalent to that their names
begin with a leading underscore.  They're in the private API :-)

> and have no comments.

The behavior of these functions, including return value and error
conditions, is specified in longobject.h.

> Does that mean that they are unstable, in the sense that they may
> change behaviour in new versions of Python?

They /may/ change, but they won't (== only common sense guarantees
they won't change ;-)).

> And will they be there in 3.0?

Almost certainly so.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] safety of Py_CLEAR and self

2007-02-12 Thread Tim Peters
[Jeremy Hylton]
> I was wondering today how I could convince myself that a sequence of
> Py_CLEAR() calls in a tp_clear function was safe.  Take for example a
> really trivial sequence of code on frame_clear():
>
> Py_CLEAR(f->f_exc_type);
> Py_CLEAR(f->f_exc_value);
> Py_CLEAR(f->f_exc_traceback);
> Py_CLEAR(f->f_trace);
>
> We use Py_CLEAR() so that multiple calls to frame_clear() are safe.
> The first time this runs it sets exc_type to NULL before calling
> DECREF.  This guarantees that a subsequent frame_clear() or
> frame_dealloc() won't attempt to DECREF it a second time.  I think I
> understand why that's desireable and why it works.  The primary risk
> is that via DECREF we may execute an arbitrary amount of Python code
> via weakref callbacks, finalizers, and code in other threads that gets
> resumed while the callbacks and finalizers are running.
>
> My question, specifically, then: Why it is safe to assume that f
> doesn't point to trash after a particular call to Py_CLEAR()?  Any
> particular call to Py_CLEAR() could break the cycle that the object is
> involved in an lead to a call to frame_dealloc().  The frame could get
> returned to an obmalloc pool, returned to the OS, or just re-used by
> another object before we get back to Py_CLEAR().  It seems like such
> behavior would be exceedingly unlikely, but I can't convince myself
> that it is impossible.  Which is it: improbable or impossible?  If it
> is only improbable, should we figure out how to write code that is
> safe against such an improbable event?

As Guido pointed out, tp_clear is called by gc from only one place,
which sticks an incref/decref pair around the call so that the
refcount on `f` can't fall to 0 while frame_clear() is active.

That doesn't mean frame_clear is always safe to call, it only means
that gc's use of the tp_clear slot is safe.  Nobody else "should be"
calling frame_clear (and it so happens nothing else in the core does),
but it's something to be dimly aware of.  For example, IIRC, some of
ZODB's C code internally invokes its XXX_clear() functions directly,
as part of removing persistent object state (unrelated to object
deallocation).  Determining whether those kinds of uses are safe
requires case-by-case analysis.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Py_ssize_t

2007-02-20 Thread Tim Peters
[Raymond Hettinger]
> After thinking more about Py_ssize_t, I'm surprised that we're not hearing 
> about
> 64 bit users having a couple of major problems.
>
> If I'm understanding what was done for dictionaries, the hash table can grow
> larger than the range of hash values.  Accordingly, I would expect large
> dictionaries to have an unacceptably large number of collisions.  OTOH, we
> haven't heard a single complaint, so perhaps my understanding is off.
> ...

As others have noted, it would require a truly gigantic dict for
anyone to notice, and nobody yet has enough RAM to build something
that large.  I added this comment to dictobject.c for 2.5:

Theoretical Python 2.5 headache:  hash codes are only C "long", but
sizeof(Py_ssize_t) > sizeof(long) may be possible.  In that case, and if a
dict is genuinely huge, then only the slots directly reachable via indexing
by a C long can be the first slot in a probe sequence.  The probe sequence
will still eventually reach every slot in the table, but the collision rate
on initial probes may be much higher than this scheme was designed for.
Getting a hash code as fat as Py_ssize_t is the only real cure.  But in
practice, this probably won't make a lick of difference for many years (at
which point everyone will have terabytes of RAM on 64-bit boxes).

Ironically, IIRC we /have/ had a complaint in the other direction:
someone on SF claims to have a box where sizeof(Py_ssize_t) <
sizeof(long).  Something else breaks as a result of that.  I think I
always implicitly assumed sizeof(Py_ssize_t) >= sizeof(long) would
hold.

In any case, hash codes are defined to be of type "long" in the C API,
so there appears no painless way to boost their size on boxes where
sizeof(Py_ssize_t) > sizeof(long).
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Rewrite of cmath module?

2007-03-18 Thread Tim Peters
[Mark Dickinson]
> The current cmath module has some significant numerical problems,
> particularly in the inverse trig and inverse hyperbolic trig
> functions.

IIRC, cmath was essentially whipped up over a weekend by a scientist,
transcribing formulas from a math reference.  Such an approach is
bound to suffer "significant numerical problems", but has the
advantage of taking little time to implement ;-)  A few of the worst
have been improved (e.g., see c_quot(), which contains the original
code in a comment), but no systematic effort has been made.

> ...

> I'm wondering whether anyone would be interested in a rewrite of the
> cmath module.  I have a drop-in replacement written in pure Python,
> based largely on the formulas given by Kahan in his `Much Ado about
> Nothing's Sign Bit' article, which I believe eliminates the vast
> majority of the current numerical problems.

I believe that :-)

> Besides the numerical fixes, the only significant difference from the
> current behaviour is that the branch cuts for asinh have been moved.
> (The docs suggest that the current branch cuts for asinh should be
> regarded as buggy).  I'm offering to translate this into C and add
> appropriate documentation and test cases, but I have a couple of
> questions before I start.

Make the Python implementation available first and solicit feedback
from the Numeric community?

> (1) Is there actually any interest in fixing this module?  I guess
> that the cmath module can't be used that much, otherwise these
> problems would have surfaced before.

That's the rub:  most people don't use cmath, while most who do have
no idea what these functions should return.  If there's a gross
problem, most wouldn't even notice.  The few places in cmath that have
been improved were due to users who did notice.  Of course all "should
be" fixed.

> (2) (Disregard if the answer to question 1 is `No'.)  What should be
> done about branch cuts?  The value of a function on a branch cut is
> going to depend on signs of zeros, so it'll be pretty much impossible
> to make any guarantees about the behaviour.  Even so, there are two
> approaches that seem feasible:

...

Follow the C99 standard whenever possible.  C99 added a complex type
to C, and added complex-valued math functions too.  Alas, C99 hasn't
been widely adopted yet, but Python should be compatible with the
platform C implementation to the extent possible.  Of course the
easiest way to do that is to /use/ the platform C implementation when
possible.

...

> (3) Is it worth trying to be consistent in exceptional cases?  The
> current math module documentation notes that math.log(0) might produce
> -inf on one platform and raise ValueError on another, so being consistent
> about whether a non-representable result gives a NaN, an infinity or a
> Python exception seems as though it would be tricky.  I should also note that
> I've made no effort to do the right thing when the argument to any of the
> functions contains a NaN or an infinity in either the real or imaginary part.

Welcome to the club ;-)  To the extent that your code relies on
real-valued libm functions, you inherit uncertainties from the
platform C's implementation of those too.  There's a long debate about
whether that's a feature (Python math functions act the same way as
the platform C's, and likely the platform Fortran's too) or a bug
(Python acts differently on different platforms).  "Feature" is the
traditional answer to that.

...
> [list of specific bad behaviors]
...

> One more thing: since this is my first post to python-dev I should
> probably introduce myself.  I'm a mathematician, working mostly in
> number theory.  I learnt programming and numerical analysis the hard
> way, coding service-life prediction algorithms for rocket motors in
> Fortran 77 during several long summers.  I've been using Python for
> more than ten years, mainly for teaching but also occasionally for
> research purposes.

Of course a strong background in math is very helpful for this.  If
you also spent several long summers cleaning sewers with your bare
hands, your qualifications for working on libm functions are beyond
reproach ;-)
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Changes to decimal.py

2007-04-13 Thread Tim Peters
[Raymond Hettinger]
> ...
> Likewise, consider soliciting Tim's input on how to implement the ln()
> operation.  That one will be tricky to get done efficiently and correctly.

One low-effort approach is to use a general root-finding algorithm and
build ln(x) on top of exp() via (numerically) solving the equation
exp(ln(x)) == x for ln(x).  That appears to be what Don Peterson did
in his implementation of transcendental functions for Decimal:

http://cheeseshop.python.org/pypi/decimalfuncs/1.4

In a bit of disguised form, that appears to be what Brian Beck and
Christopher Hesse also did:

http://cheeseshop.python.org/pypi/dmath/0.9

The former is GPL-licensed and the latter MIT, so the latter would be
easier to start from for core (re)distribution.

However, the IBM spec requires < 1 ULP worst-case error, and that may
be unreasonably hard to meet with a root-finding approach.  If this
can wait a couple months, I'd be happy to own it.  A possible saving
grace for ln() is that while the mathematical function is one-to-one,
in any fixed precision it's necessarily many-to-one (e.g., log10 of
the representable numbers between 10 and 1e100 must be a representable
number between 1 and 100, and there are a lot more of the former than
of the latter -- many distinct representable numbers must map to the
same representable log).
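
For what it's worth, a bare-bones sketch of the root-finding idea --
Newton's iteration on exp, seeded from the float log (dec_ln is a
made-up name, and this makes no attempt at the spec's < 1 ULP bound):

    from decimal import Decimal, getcontext
    import math

    def dec_ln(x):
        # Newton's iteration:  y <- y - 1 + x * exp(-y).  Assumes x > 0 and
        # that float(x) doesn't overflow (needed only for the seed).
        ctx = getcontext()
        ctx.prec += 10                           # guard digits; not rigorous
        x = Decimal(x)
        y = Decimal(repr(math.log(float(x))))    # ~16 good digits to start
        for _ in range(100):                     # quadratic convergence
            y_next = y - 1 + x * (-y).exp()
            if y_next == y:
                break
            y = y_next
        ctx.prec -= 10
        return +y                                # unary + rounds to context

    print(dec_ln(10))   # 2.302585092994045684017991455  (ln 10)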
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] SystemErrors in generator (just happened in 2.5)

2007-04-16 Thread Tim Peters
[Neal Norwitz]
> There were some SystemErrors on one of the Windows build slaves.

Not always, though -- looks to be rare.

> Does anyone have any ideas what might be the cause?  I looked through about
> 5 previous logs on the same slave and didn't see the same problem.

I'm home today and fired up my slave, and no problem there either.

> I haven't seen this myself and I don't know if it's WIndows specific.  I
> don't know if this has happened on the trunk or just 2.5.

Me neither.

> For the full log see:
>   http://python.org/dev/buildbot/2.5/x86%20XP%202.5/builds/197/step-test/0
>
> n
> --
>
> *** Previous context (I don't think this really matters):
>
> Re-running test 'test_profilehooks' in verbose mode
> test_caught_exception (test.test_profilehooks.ProfileHookTestCase) ... ok
> test_caught_nested_exception (test.test_profilehooks.ProfileHookTestCase) ... 
> ok
> test_distant_exception (test.test_profilehooks.ProfileHookTestCase) ... ok
> test_exception (test.test_profilehooks.ProfileHookTestCase) ... ok
> test_exception_in_except_clause
> (test.test_profilehooks.ProfileHookTestCase) ... ok
> test_exception_propogation (test.test_profilehooks.ProfileHookTestCase) ... ok
> test_generator (test.test_profilehooks.ProfileHookTestCase) ... ok
> test_nested_exception (test.test_profilehooks.ProfileHookTestCase) ... ok
> test_raise (test.test_profilehooks.ProfileHookTestCase) ... ok
> test_raise_reraise (test.test_profilehooks.ProfileHookTestCase) ... ok
> test_raise_twice (test.test_profilehooks.ProfileHookTestCase) ... ok
> test_simple (test.test_profilehooks.ProfileHookTestCase) ... ok
> test_stop_iteration (test.test_profilehooks.ProfileHookTestCase) ... ok
> test_basic_exception (test.test_profilehooks.ProfileSimulatorTestCase) ... ok
> test_caught_exception (test.test_profilehooks.ProfileSimulatorTestCase) ... ok
> test_distant_exception (test.test_profilehooks.ProfileSimulatorTestCase) ... 
> ok
> test_simple (test.test_profilehooks.ProfileSimulatorTestCase) ... ok
>
> --
> Ran 17 tests in 0.016s
>
> OK
>
> *** Here is the problem:
>
> Exception exceptions.SystemError: 'error return without exception set' in 
> <generator object at 0x...> ignored
> Exception exceptions.SystemError: 'error return without exception set' in 
> <generator object at 0x...> ignored
> Exception exceptions.SystemError: 'error return without exception set' in 
> <generator object at 0x...> ignored
> Exception exceptions.SystemError: 'error return without exception set' in 
> <generator object at 0x...> ignored
> Exception exceptions.SystemError: 'error return without exception set' in 
> <generator object at 0x...> ignored


> *** The following messages occur in other successful tests too:
> a DOS box should flash briefly ...

Always happens in test_subprocess, during the Windows-specific
test_creationflags.  This is expected.  When you /watch/ the tests
running on Windows, it's intended to prevent panic when a mysterious
DOS box appears ;-)

> find_library('c') ->  None
> find_library('m') ->  None

Mysterious.  Looks like debug/trace(!) output while running
Lib/ctypes/test/test_loading.py's test_find().

> C:\buildbot_py25\2.5.mcintyre-windows\build\lib\test\test_unicode_file.py:103:
> UnicodeWarning: Unicode equal comparison failed to convert both
> arguments to Unicode - interpreting them as being unequal
>   filename1==filename2
> C:\buildbot_py25\2.5.mcintyre-windows\build\lib\shutil.py:36:
> UnicodeWarning: Unicode equal comparison failed to convert both
> arguments to Unicode - interpreting them as being unequal
>   os.path.normcase(os.path.abspath(dst)))

Those started showing up months ago.

> warning: DBTxn aborted in destructor.  No prior commit() or abort().

Likewise, from the bsddb test, and we've been seeing this one on
Unixish boxes too.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] whitespace normalization

2007-04-25 Thread Tim Peters
[Neal Norwitz]
> ...
> The way to fix the files is to run:  python ./Tools/scripts/reindent.py -r Lib

I apply it to everything in the checkout.  That is, I run reindent.py
from the root of my sandbox, and use "." instead of "Lib".  The goal
is that every .py file (not just under Lib/) that eventually shows up
in a release use whitespace consistently.

> At least that's what I did.  Hopefully I didn't screw anything up. :-)

reindent.py has never been blamed for a "legitimate" screwup.  On rare
occasions it has broken tests, but only when the code mysteriously
relied on significant trailing whitespace in the .py file.  Such
invisible dependence is considered (by me :-)) to be a bug in the
code, not in reindent.py.

The other no-brainer is to run Tools/scripts/svneol.py regularly.
That finds text files that don't have the svn:eol-style property set,
and sets it.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] whitespace normalization

2007-04-25 Thread Tim Peters
[Skip]
> Maybe \s should expand to a single space by the lexer so people who want to
> rely on trailing spaces can be explicit about it.  There already exists
> non-whitespace escape sequences for tabs and newlines.

Trailing whitespace is never significant on a code line, only inside a
multiline string literal.  In the extremely rare cases where one of
those requires trailing spaces, \x20 works fine.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] whitespace normalization

2007-04-25 Thread Tim Peters
[Skip]
> Just a little FYI, python-mode (the one Barry and I manage - dunno about the
> one distributed w/ GNU Emacs these days) is one of those tools that leaves
> trailing whitespace behind when advancing to the next line..

Shouldn't be -- unless the behavior of the Emacs newline-and-indent
has changed.  The pymode version (py-newline-and-indent) relies in
part on newline-and-indent; the intent is documented in the
py-newline-and-indent docstring:

(defun py-newline-and-indent ()
  "...
   In general, deletes the whitespace before
   point, inserts a newline, and takes an educated guess as
   to how you want the new line indented."

IIRC, pymode leaves C-j bound to plain old newline, so if you're in
the habit of starting a new line via C-j no changes of any kind are
made to whitespace.  But that's not the intended way to use pymode.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] head crashing (was: Fwd: [Python-checkins] buildbot warnings in x86 mvlgcc trunk)

2007-05-01 Thread Tim Peters
[Neal Norwitz]
>> In rev 54982 (the first time this crash was seen), I see something
>> which might create a problem.  In python/trunk/Modules/posixmodule.c
>> (near line 6300):
>>
>> +   PyMem_FREE(mode);
>>Py_END_ALLOW_THREADS

Shouldn't do that.

[Brett Cannon]
> The PyMem_MALLOC call that creates 'mode' is also called without explicitly
> holding the GIL.

Or that ;-)

>> Can you call PyMem_FREE() without the GIL held?  I couldn't find it
>> documented either way.

> I believe the GIL does not need to be held, but obviously Tim or someone
> with more memory experience should step in to say definitively.

The GIL should be held.  The relevant docs are in the Python/C API
manual, section "8.1 Thread State and the Global Interpreter Lock":

Therefore, the rule exists that only the thread that has acquired the global
interpreter lock may operate on Python objects or call Python/C
API functions.

PyMem_XYZ is certainly a "Python/C API function".  There are functions
you can call without holding the GIL, and section 8.1 intends to give
an exhaustive list of those.  These are functions that can't rely on
the GIL, like PyEval_InitThreads() (which /creates/ the GIL), and
various functions that create and destroy thread and interpreter
state.

> If you look at Include/pymem.h, PyMem_FREE gets defined as PyObject_FREE in
> a debug build.  PyObject_Free is defined at _PyObject_DebugFree.  That
> function checks that the memory has not been written with the debug bit
> pattern and then calls PyObject_Free.  That call just sticks the memory back
> into pymalloc's memory pool which is implemented without using any Python
> objects.

But pymalloc's pools have a complex internal structure of their own,
and cannot be mucked with safely by multiple threads simultaneously.

> In other words no Python objects are used in pymalloc (to my knowledge) and
> thus is safe to use without the GIL.

Nope.  For example, if two threads simultaneously try to free objects
in the same obmalloc size class, there are a number of potential
thread-race disasters in linking the objects into the same size-class
chain.

In a release build this doesn't matter, since PyMem_XYZ map directly
to the platform malloc/realloc/free, and so inherit the thread safety
(or lack thereof) of the platform C implementations.

If it's necessary to do malloc/free kinds of things without holding
the GIL, then the platform malloc/free must be called directly.
Perhaps that's what posixmodule.c wants to do in this case.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] New operations in Decimal

2007-05-11 Thread Tim Peters
[Raymond Hettinger]
> ...
> My intention for the module is to be fully compliant with the spec and all of 
> its
> tests.  Code written in other languages which support the spec should expect
> to be transferrable to Python and run exactly as they did in the original 
> language.
>
> The module itself makes that promise:
>
> "this module should be kept in sync with the latest updates
> of the IBM specification as it evolves.  Those updates will
> be treated as bug fixes (deviation from the spec is considered
> a compatibility, usability bug)"
>
> If I still have any say in the matter, please consider this a pronouncement.  
> Tim,
> if you're listening, please chime in.

That was one of the major goals in agreeing to adopt an external
standard for decimal:  tedious arguments are left to the standard
creators instead of clogging python-dev.  Glad to see that's working
exactly as planned ;-)

I'm with Raymond on this one, especially given the triviality of
implementing the revised spec's new logical operations.

I personally wish they would have added more transcendental functions
to the spec instead.  That's bread-and-butter stuff for FP
applications, while I don't see much use for the new "bit" operations.
But if I felt strongly enough about that, I'd direct my concerns to
the folks creating this standard.  As slippery slopes go, this less
than a handful of trivial new operations isn't steep enough to
measure, let alone cause a landslide.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Module cleanup improvement

2007-05-22 Thread Tim Peters
[Armin Rigo]
> On another level, would there be interest here for me to revive my old
> attempt at throwing away this messy procedure, which only made sense in
> a world where reference cycles couldn't be broken?

Definitely.

> Nowadays the fact that global variables suddenly become None when the
> interpreter shuts down is a known recipe for getting obscure messages from
> still-running threads, for example.
>
> This has one consequence that I can think about: if we consider a
> CPython in which the cycle GC has been disabled by the user, then many
> __del__'s would not be called any more at interpreter shutdown.  Do we
> care?

I don't believe this is a potential issue in CPython.  The
user-exposed gc.enable() / gc.disable() only affect "automatic" cyclic
gc -- they flip a flag that has no bearing on whether an /explicit/
call to gc.collect() will try to collect trash (it will, unless a
collection is already in progress, & regardless of the state of the
"enabled" flag).  Py_Finalize() calls the C spelling of gc.collect()
(PyGC_Collect), and I don't believe that can be user-disabled.
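
A quick way to see the distinction (illustrative):

    import gc

    gc.disable()              # turns off only *automatic* collections
    print(gc.isenabled())     # False
    print(gc.collect())       # an explicit collection still runs; returns the
                              # number of unreachable objects it found
    gc.enable()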
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Floating point test_pow failure on the alpha Debian buildbot

2007-07-26 Thread Tim Peters
[Nick Coghlan]
> test_pow is failing on the alpha Debian buildbot, complaining that a
> negative number can't be raised to a fractional power. Now, to work
> around some bugs in platform implementations of math.fpow(), pow() does
> its own check to see if the exponent is an integer.
>
> The way pow() does that check is to try "iw == floor(iw)", so to see why
> the exception was being triggered, I put a couple of extra output lines
> into the test and got:
>
> *** Number: 1.2299e+167
> *** Floor: 1.2297e+167
>
> Given that the magnitude of the exponent significantly exceeds the
> precision of an IEEE double, it seems wrong for floor() to be changing
> the mantissa like that

It is wrong -- the machine representation of test_pow's 1.23e167
literal is an exact integer on any current box, and the floor of any
exact integer is the integer itself.

> (and, on my machine, and all of the other buildbots, it doesn't).
>
> I've added an explicit test for this misbehaviour to test_math so at
> least the buildbot gives a clearer indication of what's going wrong, but
> I'm not sure what to do with it beyond that.

This isn't Python's problem -- a bug report should be opened against
the platform C's implementation of floor(), and the test /should/ fail
in Python so long as the platform floor() remains broken.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Return type of round, floor, and ceil in 2.6

2008-01-04 Thread Tim Peters
[EMAIL PROTECTED]
> Thanks for the pointer.  Given that it's [round-to-even] been an ASTM
> standard since 1940 and apparently in fairly common use since the
> early 1900s, I wonder why it's not been more widely used in the past
> in programming languages.

Because "add a half and chop" was also in wide use for even longer, is
also (Wikipedia notwithstanding) part of many standards (for example,
the US IRS requires it if you do your taxes under the "round to whole
dollars" option), and-- probably the real driver --is a little cheaper
to implement for "normal sized" floats.  Curiously, round-to-nearest
can be unboundedly more expensive to implement in some obscure
contexts when floats can have very large exponents (as they can in
Python's "decimal" module -- this is why the proposed decimal standard
allows operations like "remainder-near" to fail if applied to inputs
that are "too far apart":

http://www2.hursley.ibm.com/decimal/daops.html#footnote.8

The "secret reason" is that it can be unboundedly more expensive to
determine the last bit of the quotient (to see whether it's even or
odd) than to determine an exact remainder).
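
The two camps are easy to see side by side with the decimal module's
rounding modes (an illustration, not part of the original message):

    from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP

    one = Decimal("1")
    for s in ("2.5", "3.5", "-2.5"):
        d = Decimal(s)
        print(s,
              d.quantize(one, rounding=ROUND_HALF_EVEN),   # 2, 4, -2
              d.quantize(one, rounding=ROUND_HALF_UP))     # 3, 4, -3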
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Return type of round, floor, and ceil in 2.6

2008-01-05 Thread Tim Peters
[Tim]
>> Because "add a half and chop" was also in wide use for even longer, is
>> also (Wikipedia notwithstanding) part of many standards (for example,
>> the US IRS requires it if you do your taxes under the "round to whole
>> dollars" option), and-- probably the real driver --is a little cheaper
>> to implement for "normal sized" floats.  Curiously, round-to-nearest
>> can be unboundedly more expensive to implement in some obscure
>> contexts when floats can have very large exponents (as they can in
>> Python's "decimal" module -- this is why the proposed decimal standard
>> allows operations like "remainder-near" to fail if applied to inputs
>> that are "too far apart":
>>
>> http://www2.hursley.ibm.com/decimal/daops.html#footnote.8
>>
>> The "secret reason" is that it can be unboundedly more expensive to
>> determine the last bit of the quotient (to see whether it's even or
>> odd) than to determine an exact remainder).

[Guido]
> Wow. Do you have an opinion as to whether we should adopt
> round-to-even at all (as a default)?

Yes:  yes :-)  There's no need to be unduly influenced by "some
obscure contexts when floats can have very large exponents", and the
decimal standard already weasels its way out of the bad consequences
then.  I should clarify that the standard "allows" relevant operations
to fail then in the same sense the IRS "allows" you to pay your taxes:
it's not just allowed, failure is required.

Nearest/even is without doubt the rounding method experts want most
often, which is half of what makes it the best default.  The other
half is that, while newbies don't understand why experts would want
it, the underlying reasons nevertheless act to spare newbies from
subtle numeric problems.

As to what the numerically innocent /expect/, "(at least) all of the
above" is my only guess.  For example (and here I'm making up a very
simple one to show the essence), under the Windows native Python

"%.0f" % 2.5

produces "3", while under glibc-based implementations (including
Cygwin's Python) it produces "2".  Over the years I've seen "bug
reports" filed against both outcomes.  According to the 754 standard,
the glibc-based result (nearest/even rounding) is correct, and the
Microsoft result is wrong.  Why fight it?  All the HW float operations
do nearest/even rounding now too (by default), ditto the decimal
module, and I'm secretly grateful the people who decided on those were
downright eager to oppose Excel's numerically innocent implementors
;-)
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Return type of round, floor, and ceil in 2.6

2008-01-05 Thread Tim Peters
[Mark Dickinson]
> quantize is about as close as it gets.  Note that it's a Decimal method as
> well as a Context method, so you can invoke it directly on a given decimal:
>
>
> >>> Decimal("2.34567").quantize(Decimal("0.01"))
> Decimal("2.35")

This "reads better" in many cases if you define a constant first, like:

PENNIES = Decimal("0.01")

... [lots of code] ...

rounded = some_decimal.quantize(PENNIES)


> I've also occasionally felt a need for a simple rounding function that isn't
> affected by context.  Would others be interested in such a function being
> added to Decimal?  I guess there are two possibly useful operations:  (1)
> round to a particular decimal place ( e.g. nearest ten, nearest hundredth,
> ..) and (2) to round to a particular number of significant digits;  in both
> cases, the user should be able to specify the desired rounding mode.  And
> for each operation, it might also be useful to specify whether the result
> should be padded with zeros to the desired length or not.  ( i.e. when
> rounding 3.399 to 3 significant places, should it produce 3.4 or 3.40?)
>
> Any thoughts?

+1 from me.  Like the 754 standard, the decimal std is trying to
mandate a more-or-less minimal set of core functionality, with no
concern for user interface.  "Convenience functions" can be valuable
additions in such cases, & I agree it's far from obvious to most how
to accomplish rounding using the decimal facilities.

I think it's obvious ;-) that rounding 3.399 to 3 sig. dig. should produce 3.40.
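
A sketch of what such a convenience might look like, built on quantize
(round_sig is a made-up name here, not an existing decimal method):

    from decimal import Decimal, ROUND_HALF_EVEN

    def round_sig(d, figs, rounding=ROUND_HALF_EVEN):
        # Round Decimal d to `figs` significant digits, keeping trailing
        # zeros, so round_sig("3.399", 3) -> Decimal("3.40").
        d = Decimal(d)
        if d == 0:
            return d
        quantum = Decimal(1).scaleb(d.adjusted() - (figs - 1))
        return d.quantize(quantum, rounding=rounding)

    print(round_sig("3.399", 3))    # 3.40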
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Return type of round, floor, and ceil in 2.6

2008-01-05 Thread Tim Peters
[Tim]
>> Curiously, round-to-nearest
>> can be unboundedly more expensive to implement in some obscure
>> contexts when floats can have very large exponents (as they can in
>> Python's "decimal" module -- this is why the proposed decimal standard
>> allows operations like "remainder-near" to fail if applied to inputs
>> that are "too far apart":

[Daniel Stutzbach]
> Just to be clear, this problem doesn't come up in round(), right?

Right!  It's unique to 2-argument mod-like functions.


> Because in round(), you just test the evenness of the last digit
> computed.  There is never a need to compute extra digits just to
> perform the test.

Right, round has to compute the last (retained) digit in any case.

For mod(x, y) (for various definitions of mod), the result is x - n*y
(for various ways of defining an integer n), and there exist efficient
ways to determine the final result without learning anything about the
value of n in the process.  For example, consider Python's pow(10,
1, 136).  It goes very fast to compute the answer 120, but
internally Python never develops any idea about the value of n such
that 10**1 - 136*n = 120.  Is n even or odd?  "remainder-near"
can care, but there's no efficient way I know of to find out (dividing
a 100-million digit integer by 136 to find out isn't efficient ;-)).
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Rational approximation methods

2008-01-20 Thread Tim Peters
What would be useful is a method that generates (i.e., a generator in
the Python sense) the (continued fraction) convergents to a rational.
People wanting specific constraints on a rational approximation
(including, but not limited to, the two you identified) can easily
build them on top of such a generator.

By "useful" I don't mean lots of people will use it ;-)  I mean /some/
people will use it -- a way to generate the sequence of convergents is
a fundamental tool that can be used for all sorts of stuff, by those
with advanced applications.
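
A minimal sketch of such a generator, built on the fractions module
(illustrative only):

    from fractions import Fraction

    def convergents(x):
        # Yield the continued-fraction convergents of Fraction x via the
        # standard recurrence p_k = a_k*p_{k-1} + p_{k-2} (likewise for q).
        p_prev, p = 0, 1
        q_prev, q = 1, 0
        n, d = x.numerator, x.denominator
        while d:
            a, r = divmod(n, d)
            p_prev, p = p, a * p + p_prev
            q_prev, q = q, a * q + q_prev
            yield Fraction(p, q)
            n, d = d, r

    print(list(convergents(Fraction(415, 93))))
    # [Fraction(4, 1), Fraction(9, 2), Fraction(58, 13), Fraction(415, 93)]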
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] int/float freelists vs pymalloc

2008-02-08 Thread Tim Peters
[Neal Norwitz]
> It's not just size.  Architectures may require data aligned on 4, 8,
> or 16 addresses for optimal performance depending on data type.  IIRC,
> malloc aligns by 8 (not sure if that was a particular arch or very
> common).

Just very common.  Because malloc has no idea what the pointer it
returns will be used for, it needs to satisfy the strictest alignment
requirement for any type exposed by the C language.  As an extreme
example, when I worked at Kendall Square Research, our hardware
supported atomic locking on a "subpage" basis, and HW subpage
addresses were all those divisible by 128.  Subpages were exposed via C
extensions, so the KSR malloc had to return 128-byte aligned pointers.

> I don't know if pymalloc handles alignment.

pymalloc ensures 8-byte alignment.  This is one plausible reason to
keep the current int free list:  an int object struct holds 3 4-byte
members on most boxes (type pointer, refcount, and the int's value),
and the int freelist code uses exactly 12 bytes for each on most
boxes.  To keep 8-byte alignment, pymalloc would have to hand out a
16-byte chunk per int object, wasting a fourth of the space (pymalloc
always rounds up a requested size to a multiple of 8, and ensures the
address returned is 8-byte aligned).
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Goodbye

2010-09-22 Thread Tim Peters
Yikes - Mark has done terrific work in some very demanding areas, &
I'd hate to see him feel unwelcome.  So that's my advice:  find a way
to smooth this over.  You're welcome ;-)

[Guido]
>> ...
>> I understand the desire to keep dirty laundry in. I would like to keep
>> it in too. Unfortunately the offending person in this case chose not
>> to; I will not speculate about his motivation. This is not unusual; I
>> can recall several incidents over the past few years (all completely
>> different in every detail of course) where someone blew up publicly
>> and there wasn't much of a chance to keep the incident under wraps. I
>> see it as the risk of doing business in public -- which to me still
>> beats the risk of doing business in back rooms many times over.

[Mark Lawrence]
> If you're referring to me I'm extremely offended.  Yes or no?

Have to confess I can't see what's offensive in what Guido wrote
there.  If you're inclined to feel offended, how about going back to
Guido's:

Which to me sounds defiant and passive-aggressive. I don't
want to go into analyzing, but I expect that Mark has issues
that are beyond what this community can deal with.

Even I felt a bit offended by that one ;-)

speaking-as-one-who-has-issues-no-community-can-deal-with-ly y'rs  - tim
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] PyObject_GC_UnTrack() no longer reliable in 2.7?

2010-09-24 Thread Tim Peters
Looks like 2.7 changes introduced to exempt dicts and tuples from
cyclic gc if they obviously can't be in cycles has some unintended
consequences.  Specifically, if an extension module calls
PyObject_GC_UnTrack() on a dict it _does not want tracked_, Python can
start tracking the dict again.

I assume this is unintended because (a) the docs weren't changed to
warn about this; and, (b) it's wrong ;-)

There are two main reasons an extension module may have been calling
PyObject_GC_UnTrack():

1. For correctness, if the extension is playing games with reference
counts Python isn't expecting.

2. For speed, if the extension is creating dicts (or tuples) it knows
cannot participate in cycles.

This came up when Jim Fulton asked me for advice about assertion
failures coming out of cyclic gc in a debug build when running ZODB's
tests under 2.7.  Turned out to be due to the "#1 reason" above:  ZODB
hand-rolled its own version of weak references long before Python had
them, and has a dict mapping strings ("object IDs") to complex objects
where the referenced objects' refcounts intentionally do _not_ account
for the references due to this dict.

This had been working fine for at least 8 years, thanks to calling
PyObject_GC_UnTrack() on this dict right after it's created.

But under 2.7, new code in Python apparently decides to track this
dict again the first time its __setitem__ is called.  Cyclic gc then
discovers object references due to this dict that aren't accounted for
in the referenced objects' refcounts, and dies early on with an
assertion failure (which it should do - the refcounts are nuts as far
as Python is concerned).

Jim wormed around that for now by calling PyObject_GC_UnTrack() again
every time the dict's content changes, but that was relatively easy in
this case because the dict is an internal implementation detail all
accesses to which are under ZODB's control.

Best if no changes had been needed.  "Better than nothing" if the docs
are changed to warn that the effect of calling PyObject_GC_UnTrack()
may be undone by Python a nanosecond later ;-)
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PyObject_GC_UnTrack() no longer reliable in 2.7?

2010-09-24 Thread Tim Peters
[Tim]
>> I assume this is unintended because (a) the docs weren't changed to
>> warn about this; and, (b) it's wrong ;-)

[Martin v. Löwis]
> It seems Jim is happy with (or has at least accepted) the behavior
> change. Would you still like to see it fixed (or, rather, have the
> 2.6 state restored)?

"it's wrong ;-)" meant what it said - the track/untrack APIs weren't
intended to be hints Python is free to ignore, they were meant to give
the user control over whether and when their objects participated in
cyclic gc.  It's true that their (by far) most common uses are
mandatory, to avoid tracking before a new object is sane, and to
untrack again before it becomes insane when it's being torn down, but
those were not the only intended uses.

That said, the optimization 2.7 introduced probably has value that
shouldn't be dismissed either.

BTW, if it had taken Jim 1000 lines of new code instead of 2 to worm
around the regression in ZODB under Python 2.7, I expect he'd be
singing a different tune ;-)  I view his experience as akin to the
canary in the coal mine, albeit likely a mine with very few miners
worldwide.

> I think it would be possible to have two versions of
> _PyGC_REFS_UNTRACKED, one being, say, -5.
> _PyGC_REFS_UNTRACKED_AND_KEEP_IT_THAT_WAY would be what you get
> when you call PyObject_GC_UnTrack; the code to do automatic
> tracking/untracking based on contents would use some other
> new API (which would be non-public in 2.7.x).

Looks like a promising idea!  gcmodule.c's IS_TRACKED macro would have
to change to check both states, and likewise the debug assert in
visit_reachable().
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Not-a-Number

2011-04-30 Thread Tim Peters
[Greg Ewing]
>> Taking a step back from all this, why does Python allow
>> NaNs to arise from computations *at all*?

[Mark Dickinson]
> History, I think.  There's a c.l.p. message from Tim Peters somewhere
> saying something along the lines that he'd love to make (e.g.,) 1e300
> * 1e300 raise an exception instead of producing an infinity, but dare
> not for fear of the resulting outcry from people who use the current
> behaviour.  Apologies if I've misrepresented what he actually
> said---I'm failing to find the exact message at the moment.
>
> If it weren't for backwards compatibility, I'd love to see Python
> raise exceptions instead of producing IEEE special values:  IOW, to
> act as though the divide-by-zero, overflow and invalid_operation FP
> signals all produce an exception.

Exactly.  It's impossible to create a NaN from "normal" inputs without
triggering div-by-0 or invalid_operation, and if overflow were also
enabled it would likewise be impossible to create an infinity from
normal inputs.  So, 20 years ago, that's how I arranged Kendall Square
Research's default numeric environment:  enabled those three exception
traps by default, and left the underflow and inexact exception traps
disabled by default.  It's not just "naive" users initially baffled by
NaNs and infinities; most of KSR's customers were heavy duty number
crunchers, and they didn't know what to make of them at first either.
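
As things stand, CPython already varies by operation -- a quick
illustration of current behavior (not a proposal):

    print(1e300 * 1e300)                 # inf -- overflow passes silently here
    try:
        1e300 ** 2                       # ...but '**' checks, and raises
    except OverflowError as e:
        print("OverflowError:", e)
    print(float("inf") - float("inf"))   # nan -- invalid operation, silent
    try:
        1.0 / 0.0                        # division checks for zero and raises
    except ZeroDivisionError as e:
        print("ZeroDivisionError:", e)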

But experts do find them very useful (after climbing the 754 learning
curve), so there was also a simple function call (from all the
languages we supported - C, C++, FORTRAN and Pascal), to establish the
754 default all-traps-disabled mode:

> As a bonus, perhaps there could be a mode that allowed 'nonstop'
> arithmetic, under which infinities and nans were produced as per IEEE 754:
>
>    with math.non_stop_arithmetic():
>        ...
>
> But this is python-ideas territory.

All of which is just moving toward the numeric environment 754 was
aiming for from the start:  complete user control over which exception
traps are and aren't currently enabled.  The only quibble I had with
that vision was its baffle-99%-of-users requirement that they _all_ be
disabled by default.

As Kahan wrote, it's called "an exception" because no matter _what_
you do, someone will take exception to your policy ;-)  That's why
user control is crucial in a 754 environment.  He wanted even more
control than 754 recommends (in particular, he wanted the user to be
able to specify _which_ value was returned when an exception
triggered; e.g., in some apps it may well be more useful for overflow
to produce a NaN than an infinity, or to return the largest normal
value with the correct sign).

Unfortunately, the hardware and academic types who created 754 had no
grasp of how difficult it is to materialize their vision in software,
and especially not of how very difficult it is to backstitch a
pleasant wholly conforming environment into an existing language.  As
a result, I'm afraid the bulk of 754's features are still viewed as
"a nuisance" by a vast majority of users :-(
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] "Sort attacks" (was Re: Hash collision security issue (now public))

2012-01-06 Thread Tim Peters
I can't find it now, but I believe Marc-Andre mentioned that CPython's
list.sort() was vulnerable to attack too, because of its O(n log n)
worst-case behavior.

I wouldn't worry about that, because nobody could stir up anguish
about it by writing a paper ;-)

1. O(n log n) is enormously more forgiving than O(n**2).

2. An attacker need not be clever at all:  O(n log n) is not only
sort()'s worst case, it's also its _expected_ case when fed randomly
ordered data.

3. It's provable that no comparison-based sorting algorithm can have
better worst-case asymptotic behavior when fed randomly ordered data.

So if anyone whines about this, tell 'em to go do something useful instead :-)

still-solving-problems-not-in-need-of-attention-ly y'rs  - tim
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] FWD: 26 Python-Dev moderator request(s) waiting

2008-05-29 Thread Tim Peters
[Barry]
> ...
> How about we recruit additional moderators?  Any volunteers?

Sure -- add me as a python-dev admin, send me the password, and go
back to eating in peace :-)
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] C API for gc.enable() and gc.disable()

2008-06-22 Thread Tim Peters
[Antoine Pitrou]
>> Would it be helpful if the GC was informed of memory growth by the
>> Python memory allocator (that is, each time it either asks or gives back
>> a block of memory to the system allocator) ?

[Martin v. Löwis]
> I don't see how. The garbage collector is already informed about memory
> growth; it learns exactly when a container object is allocated or
> deallocated. That the allocator then requests memory from the system
> only confirms what the garbage collector already knew: that there are
> lots of allocated objects. From that, one could infer that it might
> be time to perform garbage collection - or one could infer that all
> the objects are really useful, and no garbage can be collected.

Really the same conundrum we currently face:  cyclic gc is currently
triggered by reaching a certain /excess/ of allocations over
deallocations.  From that we /do/ infer it's time to perform garbage
collection -- but, as some examples here showed, it's sometimes really
the case that the true meaning of the excess is that "all the objects
are really useful, and no garbage can be collected -- and I'm creating
a lot of them".

pymalloc needing to allocate a new arena would be a different way to
track an excess of allocations over deallocations, and in some ways
more sensible (since it would reflect an excess of /bytes/ allocated
over bytes freed, rather than an excess in the counts of objects
allocated-over-freed regardless of their sizes -- an implication is,
e.g., that cyclic gc would be triggered much less frequently by mass
creation of small tuples than of small dicts, since a small tuple
consumes much less memory than a small dict).

Etc. ;-)
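
For reference, the count-based trigger is what gc.get_threshold() and
gc.set_threshold() expose today (illustrative):

    import gc

    print(gc.get_threshold())    # (700, 10, 10) by default: a generation-0
                                 # collection fires once allocations minus
                                 # deallocations of tracked objects exceed 700
    gc.set_threshold(100000)     # make automatic gen-0 collections much rarer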
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Proposed unittest changes

2008-07-13 Thread Tim Peters
[Michael Foord]
>> ...
>> Adding the following new asserts:
>>
>> ...
>>   assertNotIs   (first, second, msg=None)

[Steve Holden]
> Please, let's call this one "assertIsNot".

+1

> I know it's valid Python to say
>
>  if a not is b:

Nope, that's a syntax error.

> but it's a much less natural way of expressing the condition, and (for all I
> know) might even introduce an extra negation operation. "is not" is, I
> believe, treated as a single operator.

"is not" and "not in" are both binary infix operators, not to be
confused with the distinct use of "not" on its own as a unary prefix
operator.  "not is" and "in not" are both gibberish.

>>> 1 is not 2
True
>>> 1 is (not 2)
False
>>> 1 not is 2
SyntaxError: invalid syntax

>>> 1 not in [2]
True
>>> 1 in not [2]
SyntaxError: invalid syntax
>>> 1 in (not [2])
Traceback (most recent call last):
   ...
TypeError: argument of type 'bool' is not iterable
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 3101: floats format 'f' and 'F'

2008-07-16 Thread Tim Peters
[Guido]
> My best guess as to why 'F' is the same as 'f' is that somebody
> (could've been me :-) thought, like several others in this thread,
> that %f never prints an exponent. I agree that making it emit an 'E'
> when an exponent is used is the right thing to do. Do it now!

The C standard doesn't allow for %f (or %F) to produce an exponent.
That appears to be a Python innovation.  People should try their
examples under their native C compiler instead (best I can tell, the
idea that %f/%F can produce an exponent came only from running Python
examples, never from running C examples).

For example,

"""
#include <stdio.h>

int main() {
printf("%f\n", 1e300);
}
"""

Under the Cygwin gcc, that displays (the admittedly atrocious, but
that's why people shouldn't use %f inappropriately to begin with ;-)):

[the full ~300-digit decimal expansion of the double nearest 1e300,
containing ...52504760255204420248704..., followed by ".000000"]

As far as C is concerned, the only difference between %f and %F is:

The F conversion specifier  produces INF, INFINITY, or NAN instead of inf,
infinity, or nan, respectively
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Optionally using GMP to implement long if available

2008-11-03 Thread Tim Peters
[Gregory P. Smith]
>> One optimization that could be done to the existing Python longobject
>> code is to allow it to use larger digits.  Currently it is hardcoded
>> to use 15bit digits.
>>
>> The most common desktop+server CPUs in the world (x86) all support
>> efficient 32bit*32bit -> 64bit multiply so there is no good reason to
>> limit python itself to 15bit digits when on either x86 or x86_64.

[Martin v. Löwis]
> Perhaps Tim Peters should also comment here - but if you can come up
> with a patch that does that in a portable way, I would be in favor.
> The challenge, ISTM, is to make it still compile on all systems
> (regardless of how inefficient it might be on minor platforms).

Eh -- the strong points of Python's long implementation have always
been reliability and portability.  Both are greatly aided by sticking
to operations spellable in standard C89, and by avoiding
platform-specific trickery & #ifdef'ery.  So, no, I'm not keen on
this.  Note that while 32x32->64 multiply is supported by x86 HW, that
doesn't mean there's a uniform way to get at this HW capability across
C compilers.  In contrast, (at least) efficient HW 15x15->30 multiply
is universally spelled in C via "i*j" :-)

A similar gripe applies to schemes to replace the long implementation
by GMP (optionally or not):  it adds complexity to the code.  I like
GMP myself, but am happy using one of the Python GMP wrappers when I
/want/ GMP (as timings in other messages show, using GMP is a speed
loser before ints grow "really big").

Indeed, if I had it to do over again, I would balk even at adding
Karatsuba multiplication to Python (it added extra complexity with no
real payback, given that those who care about the speed of very long
int multiplication are far better off using a GMP wrapper anyway).

grouchily y'rs  - tim
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Optionally using GMP to implement long if available

2008-11-04 Thread Tim Peters
Hey, Mark -- let's establish some background here first.  It's a fact
that the product of two N-bit integers can require 2N bits, and also a
fact that lots of HW supports NxN->2N bit integer multiplication
directly.

However, it's unfortunately also a fact that standard C has no
corresponding concept:  "*" in C always means NxN->N multiplication
(single-width product, with possibility of overflow).  I don't know
whether C99 improved this situation -- I know it was proposed to add
some "double-width integer product" /functions/, but I don't know
whether that was adopted.  I do know that "*" remained "single-width".

[Tim Peters]
>> Note that while 32x32->64 multiply is supported by x86 HW, that
>> doesn't mean there's a uniform way to get at this HW capability across
>> C compilers.  In contrast, (at least) efficient HW 15x15->30 multiply
>> is universally spelled in C via "i*j" :-)

[Mark Dickinson]
> If we're talking about standards and portability, doesn't "i*j" fail
> on those (nonexistent?) platforms where the 'int' type is only 16-bits?
> Shouldn't this be something like "(long)i * j" instead?

Sorry, I should have made type declarations explicit.  In Python, the
basic long building block is "digit", which is typedef'ed to C
unsigned short.  C89 guarantees this holds at least 16 bits.  Whenever
two digits are multiplied, the code intends to cast (at least) one of
them to "twodigits" first (if you ever see a spot where this doesn't
happen, it's a bug).  "twodigits" is typedef'ed to C long.  C89
guarantees that a long holds at least 32 bits.

So C guarantees that we're doing (at least) 32x32->32 multiplication
whenever you see code like

digit i, j;            /* digit:  unsigned short, at least 16 bits under C89 */
twodigits k;           /* twodigits:  long, at least 32 bits under C89 */

k = (twodigits)i * j;  /* the cast forces (at least) a 32x32->32 multiply */

In particular, the (at least) 32x32->32 multiply C89 guarantees there
comfortably covers the 15x15->30 case, which is all that longobject.c
intends to rely on.  Along with the cast to twodigits, this is achieved across
all conforming C compilers simply by using infix "*".  The code from
1990 still works fine, on everything from cell phones to archaic Cray
boxes.


> And for 32x32 -> 64, can't this simply be replaced by "(uint64_t)i * j",
> where uint64_t is as in C99?  I'd hope that most compilers would
> manage to turn this into the appropriate 32x32-bit hardware multiply.

1. That's C99, not C89, and therefore less portable.

2. On platforms that support it, this is at least 64x64->64 multiplication,
   potentially much more expensive than the 32x32->64 (or 31x31->62?)
   flavor you /intend/ to move to.

3. There's no way to know exactly what compilers will do with this short of
   staring at generated code.

FYI, in a previous life working in speech recognition, under
Microsoft's Visual Studio 6 the only way we found to get at the
Pentium's 32x32->64 HW ability efficiently was via using inline
assembler.  For example, using various MSVC spellings of "64-bit int"
instead for the inputs usually generated external calls to a
long-winded C library "piece by piece" multiplication routine (which,
at the time, emulated 64x64->128 multiplication, then threw away the
high 64 bits).

Again, the fundamental problem here is the disconnect between what
some HW is capable of and what C allows to express (at least through
C89).  That's why it's impossible here to write portable code in C
that's also efficient.  Even what Python does now is vulnerable on the
efficiency count:  on some weird platforms, "long" is 64 bits, and so
multiplying a pair of twodigits incurs the expense of (usually
non-native) 64x64->64 multiplication.


> I agree that very-long-integer optimizations probably don't really belong in
> Python,

Depends in part on whether Python can attract as many obsessed
maintainers and porters for such gonzo algorithms as GMP attracts ;-)


> but this patch should also provide significant benefits for short
> and medium-sized integers.  I guess I need to go away and do some
> benchmarking...

If you can /get at/ HW 32x32->64 int multiply, of course that would be faster.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Optionally using GMP to implement long if available

2008-11-09 Thread Tim Peters
[Tim Peters]
>> ..
>> Whenever two digits are multiplied, the code intends to
>> cast (at least) one of them to "twodigits" first (if you
>> ever see a spot where this doesn't happen, it's a bug).

[Mark Dickinson]
> There are indeed one or two spots that seem to be missing a
> cast, for example the line "rem -= hi*n" in inplace_divrem1.

Definitely a bug!  Alas, it's not surprising -- there are no platforms
on which this bug has a visible consequence (because `digit` is
currently type `unsigned short`, C coerces `hi` and `n` to `int`
before multiplying, and on all platforms to date a C int is at least
32 bits, so that the multiply is at least 32x32->32 despite the lack
of a `twodigits` cast).

>> ...
>> 3. There's no way to know exactly what compilers will do with
>>this short of staring at generated code.

> Yes---I've done the staring for gcc, so I now have precisely *one*
> data point, which is that various flavours of gcc on x86/x86_64
> *are* clever enough to turn
>
>  (uint64_t)my_uint32 * my_other_uint32
>
> into the appropriate HW instruction.

Nice!  Is this a documented feature?  "The problem" is that it
probably depends on a combination of gcc version and compilation
flags, and because /knowing/ whether it works requires staring at
generated code, there's probably no sane way to automate detection of
when, if, and under what conditions it breaks.  "Serious" packages use
assembler to avoid all this uncertainty.

> Unfortunately I don't have easy access to other compilers or
> platforms right now. :-(

Sorry, neither do I.  If you can dream up a way to automate testing
for generation of the desired HW instructions, you could post the test
program here and I bet a lot of people would run it.  Maybe even if
you just described what to look for "by eyeball" in the generated
code.


> Am still working on the benchmarking, but I'm definitely seeing
> speedup on gcc/x86---between 10% and 100% depending
> on the operations.

Sure -- it's not surprising that HW crunching more bits at a time is
significantly faster.

>> FYI, in a previous life working in speech recognition, under
>> Microsoft's Visual Studio 6 the only way we found to get at the
>> Pentium's 32x32->64 HW ability efficiently was via using inline
>> assembler.

> Urk.  We definitely don't want to go there.  Though I guess this
> is how packages like gmp and GP/Pari manage.

1. I have no idea what versions of Microsoft's compiler
   after MSVC 6 do here; perhaps it's no longer an issue
   (the Windows Python distro no longer uses MSVC 6).

2. If this is thought to be worth having, then on very
   widely used platforms I think a good case /can/ be
   made for incorporating some assembler.

3. GMP is "speed at any cost" -- they use assembler
   even when it's a stupid idea ;-)

> ..
> But maybe it's possible to write portable code (by providing fallbacks)
> that turns out to be efficient on the majority of mainstream systems?

If "it works" under the gcc and Windows compilers du jour on x86
systems, that probably covers over 90% of Python installations.  Good
enough -- stop before it gets pointlessly hard ;-)


> The extent of the ifdef'ery in the patch is really rather small:  one
> (compound) #ifdef in longintrepr.h for defining digit, twodigits, stwodigits
> etc, and a couple more for the places where digits are read and written
> in marshal.c.

But so far it only works with an unknown subset of gcc versions,
right?  These things don't get simpler, alas :-(


>>> I agree that very-long-integer optimizations probably don't really belong in
>>> Python,

>> Depends in part on whether Python can attract as many obsessed
>> maintainers and porters for such gonzo algorithms as GMP attracts ;-)

> Well, you can count me in. :)

Last I looked (which was at least 3 years ago), GMP's source code was
bigger than all of Python's combined.  For a start, I'll have the PSF
draw up a contract obligating you to lifetime servitude :-)
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] extremely slow exit for program having huge (45G) dict (python 2.5.2)

2008-12-20 Thread Tim Peters
[M.-A. Lemburg]
>> These long exit times are usually caused by the garbage collection
>> of objects. This can be a very time consuming task.

[Leif Walsh]
> In that case, the question would be "why is the interpreter collecting
> garbage when it knows we're trying to exit anyway?".

Because user-defined destructors (like __del__ methods and weakref
callbacks) may be associated with garbage, and users presumably want
those to execute.  Doing so requires identifying garbage
and releasing it, same as if the interpreter didn't happen to be
exiting.

BTW, the original poster should try this:  use whatever tools the OS
supplies to look at CPU and disk usage during the long exit.  What I
/expect/ is that almost no CPU time is being used, while the disk is
grinding itself to dust.  That's what happens when a large number of
objects have been swapped out to disk, and exit processing has to page
them all back into memory again (in order to decrement their
refcounts).  Python's cyclic gc (the `gc` module) has nothing to do
with this -- it's typically the been-there-forever refcount-based
non-cyclic gc that accounts for supernaturally long exit times.

If that is the case here, there's no evident general solution.  If you
have millions of objects still alive at exit, refcount-based
reclamation has to visit all of them, and if they've been swapped out
to disk it can take a very long time to swap them all back into memory
again.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] extremely slow exit for program having huge (45G) dict (python 2.5.2)

2008-12-20 Thread Tim Peters
[Mike Coleman]
>> ... Regarding interning, I thought this only worked with strings.

Implementation details.  Recent versions of CPython also, e.g.,
"intern" the empty tuple, and very small integers.

>> Is there some way to intern integers?  I'm probably creating 300M
>> integers more or less uniformly distributed across range(1)?

Interning would /vastly/ reduce memory use for ints in that case, from
gigabytes down to less than half a megabyte.


[Scott David Daniels]
> held = list(range(1))
> ...
>     troublesome_dict[string] = held[number_to_hold]
> ...

More generally, but a bit slower, for objects usable as dict keys,
change code of the form:

x = whatever_you_do_to_get_a_new_object()
use(x)

to:

x = whatever_you_do_to_get_a_new_object()
x = intern_it(x, x)
use(x)

where `intern_it` is defined like so once at the start of the program:

intern_it = {}.setdefault

This snippet may make the mechanism clearer:

>>> intern_it = {}.setdefault
>>> x = 3000
>>> id(intern_it(x, x))
36166156
>>> x = 1000 + 2000
>>> id(intern_it(x, x))
36166156
>>> x = "works for computed strings too"
>>> id(intern_it(x, x))
27062696
>>> x = "works for computed strings t" + "o" * 2
>>> id(intern_it(x, x))
27062696
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] extremely slow exit for program having huge (45G) dict (python 2.5.2)

2008-12-20 Thread Tim Peters
[Leif Walsh]
> ...
> It might be a semantic change that I'm looking for here, but it seems
> to me that if you turn off the garbage collector, you should be able
> to expect that either it also won't run on exit,

It won't then, but "the garbage collector" is the gc module, and that
only performs /cyclic/ garbage collection.  There is no way to stop
refcount-based garbage collection.  Read my message again.
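
A tiny sketch of the distinction (CPython behavior):

    import gc

    class Noisy:
        def __del__(self):
            print("reclaimed")

    gc.disable()   # switches off only the *cyclic* collector (the gc module)
    x = Noisy()
    del x          # prints "reclaimed" immediately:  refcounting still runs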


> or it should have a
> way of letting you tell it not to run on exit.  If I'm running without
> a garbage collector, that assumes I'm at least cocky enough to think I
> know when I'm done with my objects, so I should know to delete the
> objects that have __del__ functions I care about before I exit.  Well,
> maybe; I'm sure one of you could drag out a programmer that would make
> that mistake, but turning off the garbage collector to me seems to
> send the experience message, at least a little.

This probably isn't a problem with cyclic gc (reread my msg).


> Does the garbage collector run any differently when the process is
> exiting?

No.


> It seems that it wouldn't need to do anything more that run
> through all objects in the heap and delete them, which doesn't require
> anything fancy,

Reread my msg -- already explained the likely cause here (if "all the
objects in the heap" have in fact been swapped out to disk, it can
take an enormously long time to just "run through" them all).


> and should be able to sort by address to aid with
> caching.

That one isn't possible.  There is no list of "all objects" to /be/
sorted.  The only way to find all the objects is to traverse the
object graph from its roots, which is exactly what non-cyclic gc does
anyway.


>  If it's already this fast, then I guess it really is the
> sheer number of function calls necessary that are causing such a
> slowdown in the cases we've seen, but I find this hard to believe.

My guess remains that CPU usage is trivial here, and 99.99+% of the
wall-clock time is consumed waiting for disk reads.  Either that, or
that platform malloc is going nuts.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] 2 very interesting projects - Python / Finance

2009-02-17 Thread Tim Peters
[Aahz]
> ...
> This is spam, and you have now jeopardized your correct posting to the
> Python Job Board.  The other website administrators will be informed and
> we will discuss whether spamming python-dev warrants withdrawing it.

To be fair, a python-dev moderator approved the posting, so in their
judgment it wasn't spam.

It was in my judgment, but someone else approved it before I managed
to hit the "discard" button.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Accepting PEP 3154 for 3.4?

2013-11-20 Thread Tim Peters
[Tim]
>> BTW, I'm not a web guy:  in what way is HTTP chunked transfer mode
>> viewed as being flawed?  Everything I ever read about it seemed to
>> think it was A Good Idea.

[Martin]
> It just didn't work for some time, see e.g.
>
> http://bugs.python.org/issue1486335
> http://bugs.python.org/issue1966
> http://bugs.python.org/issue1312980
> http://bugs.python.org/issue3761
>
> It's not that the protocol was underspecified - just the implementation
> was "brittle" (if I understand that word correctly).

"Easily broken in catastrophic ways" is close, like a chunk of peanut
brittle can shatter into a gazillion pieces if you drop it on the
floor.

http://en.wikipedia.org/wiki/Brittle_(food)

Or like the infinite loops in some of the bug reports, "just because"
some server screwed up the protocol a little at EOF.

But for pickling there are a lot fewer picklers than HTTP transport
creators ;-)  So I'm not much worried about that.  Another of the bug
reports amounted just to the fact that urllib, at first, didn't support
chunked transfer mode at all.

> And I believe (and agree with you) that the cause for this "difficult
> to implement" property is that the framing is in putting framing "in the 
> middle"
> of the stack (i.e. not really *below* pickle itself, but into pickle
> but below the opcodes - just like http chunked transfer is "in" http,
> but below the content encoding).

It's certainly messy that way.  But doable, and I expect the people
working on it are more than capable enough to get it right, by at
latest the 4th try ;-)
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Accepting PEP 3154 for 3.4?

2013-11-20 Thread Tim Peters
[Antoine]
> I have made two last-minute changes to the PEP:
>
> - addition of the FRAME opcode, as discussed with Tim, and keeping a
>   fixed 8-byte frame size

Cool!


> - addition of the MEMOIZE opcode, courtesy of Alexandre, which replaces
>   PUT opcodes in protocol 4 and helps shrink the size of pickles

Long overdue - clever idea!


> If there's no further opposition, I'd like to mark this PEP accepted
> (or let someone else do it) in 24 hours, so that the implementation can
> be integrated before Sunday.

I think Guido already spoke on this - but, if he didn't, I will.  Accepted :-)
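
For the record, a toy illustration of why MEMOIZE shrinks pickles --
illustrative pseudocode of the idea, not the pickle module's internals:

    memo = {}
    stack = ["some just-unpickled object"]

    def op_put(index):               # classic PUT:  the index is spelled out
        memo[index] = stack[-1]      # in the stream, costing extra bytes

    def op_memoize():                # protocol 4's MEMOIZE:  index is implicit
        memo[len(memo)] = stack[-1]  # next free slot, no index bytes at all

    op_memoize()                     # stores stack[-1] at index 0, no argument needed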
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Accepting PEP 3154 for 3.4?

2013-11-20 Thread Tim Peters
[Alexandre Vassalotti]
> Looking at the different options available to us:
>
> 1A. Mandatory framing
>   (+) Allows the internal buffering layer of the Unpickler to rely
>   on the presence of framing to simplify its implementation.
>   (-) Forces all implementations of pickle to include support for
>   framing if they want to use the new protocol.
>   (-) Cannot be removed from future versions of the Unpickler
>   without breaking protocols which mandate framing.
> 1B. Optional framing
>   (+) Could allow optimizations to disable framing if beneficial
>   (e.g., when pickling to and unpickling from a string).

Or to slash the size of small pickles (an 8-byte size field can be
more than half the total pickle size; a tiny sketch at the end of this
message makes that concrete).


> 2A. With explicit FRAME opcode
>   (+) Makes optional framing simpler to implement.
>   (+) Makes variable-length encoding of the frame size simpler
>   to implement.
>   (+) Makes framing visible to pickletools.

(+) Adds (a little) redundancy for sanity checking.

>   (-) Adds an extra byte of overhead to each frame.
> 2B. No opcode
>
> 3A. With fixed 8-bytes headers
>  (+) Is simple to implement
>  (-) Adds overhead to small pickles.
> 3B. With variable-length headers
>  (-) Requires Pickler implementation to do extra data copies when
>  pickling to strings.
>
> 4A. Framing baked-in the pickle protocol
>  (+) Enables faster implementations
> 4B. Framing through a specialized I/O buffering layer
>  (+) Could be reused by other modules
>
> I may change my mind as I work on the implementation, but at least for now,
> I think the combination of 1B, 2A, 3A, 4A will be a reasonable compromise
> here.

At this time I'd make the same choices, so don't expect an argument
from me ;-)  Thank you!
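
To make the small-pickle size point concrete, here's a rough sketch of
the 1B+2A+3A layout above -- an opcode byte followed by a fixed 8-byte
little-endian length (the helper and the opcode value are illustrative,
not the pickle module's API):

    import struct

    FRAME = b"\x95"  # one opcode byte per frame (value chosen for illustration)

    def frame(payload):
        return FRAME + struct.pack("<Q", len(payload)) + payload

    tiny = b"N."                        # a minimal pickle body (None, protocol 0)
    print(len(tiny), len(frame(tiny)))  # 2 vs 11:  the header dwarfs a small pickle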
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] flaky tests caused by repr() sort order

2013-11-21 Thread Tim Peters
[Christian Heimes]
> the buildbots are flaky because two repr() tests for userdict and
> functools.partial fail every now and then. The test cases depend on a
> fixed order of keyword arguments in the representation of userdict and
> partial instances. The improved hash randomization of PEP 456 shows its
> power. I haven't seen the issue before because it happens rarely and
> mostly on 32 bit platforms.
>
> http://bugs.python.org/issue19681
> http://bugs.python.org/issue19664
>
> I'm not sure about the best approach for the issues. Either we need to
> change the test case and make it more resilient or the code for repr()
> must sort its dict keys.

Best to change the failing tests.  For example, _they_ can sort the
dict keys if they rely on a fixed order.  Sorting in general is a
dubious idea because it can be a major expense with no real benefit
for most uses.
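
One way to make such a test insensitive to keyword order (the test
below is illustrative, not one of the actual stdlib tests):

    import functools

    def test_partial_repr_any_keyword_order():
        p = functools.partial(max, key=abs, default=0)
        r = repr(p)
        # Don't bake one keyword order into an expected string;
        # just require every piece to be present.
        assert r.startswith("functools.partial(")
        assert "key=" in r and "default=0" in r

    test_partial_repr_any_keyword_order()  # passes regardless of hash seed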
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] [Python-checkins] cpython: asyncio: Change bounded semaphore into a subclass, like

2013-11-23 Thread Tim Peters
[guido]
> http://hg.python.org/cpython/rev/6bee0fdcba39
> changeset:   87468:6bee0fdcba39
> user:Guido van Rossum 
> date:Sat Nov 23 15:09:16 2013 -0800
> summary:
>   asyncio: Change bounded semaphore into a subclass, like 
> threading.[Bounded]Semaphore.
>
> files:
>   Lib/asyncio/locks.py|  36 
>   Lib/test/test_asyncio/test_locks.py |   2 +-
>   2 files changed, 20 insertions(+), 18 deletions(-)
>
>
> diff --git a/Lib/asyncio/locks.py b/Lib/asyncio/locks.py
> --- a/Lib/asyncio/locks.py
> +++ b/Lib/asyncio/locks.py
> @@ -336,22 +336,15 @@

...
> +class BoundedSemaphore(Semaphore):
...
> +def release(self):
> +if self._value >= self._bound_value:
> +raise ValueError('BoundedSemaphore released too many times')
> +super().release()

If there's a lock and parallelism involved, this release()
implementation is vulnerable to races:  any number of threads can see
"self._value < self._bound_value" before one of them manages to call
the superclass release(), and so self._value can become arbitrarily
larger than self._bound_value.  I fixed the same bug in threading's
similar bounded semaphore class a few weeks ago.
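
A minimal sketch of the race-free pattern (a threading illustration,
not the actual asyncio or threading code):  the bound check and the
increment must happen while holding the same lock, so no other thread
can slip in between them.

    import threading

    class BoundedSemaphoreSketch:
        # Illustrative only:  check and increment under one lock.

        def __init__(self, value=1):
            self._cond = threading.Condition(threading.Lock())
            self._value = value
            self._bound = value

        def acquire(self):
            with self._cond:
                while self._value == 0:
                    self._cond.wait()
                self._value -= 1

        def release(self):
            with self._cond:
                if self._value >= self._bound:  # checked under the lock...
                    raise ValueError("released too many times")
                self._value += 1                # ...so no overshoot is possible
                self._cond.notify()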
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Python 3 is five years old

2013-12-05 Thread Tim Peters
[Brett]
> On 2008-12-03, Python 3.0.0 was released by Barry.

Dang - nobody ever tells me anything.  Congratulations!  It's about
time 3.0.0 was released ;-)

> ...
> Thanks to those in the community who stuck by the dev team and had faith
> we knew what we were doing and have continued to help everyone move
> forward and off of Python 2 to realize how much more pleasant Python 3
> is to work with.

I'm doing all my own coding in 3 now - I like it.  I just wish someone
had told me in 2008 ;-)
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

