[Python-Dev] Re: license issues with profiler.py and md5.h/md5c.c

2005-02-17 Thread Fredrik Lundh
Gregory P. Smith wrote:

 I don't quite like the module name 'hashes' that i chose for the
 generic interface (too close to the builtin hash() function).  Other
 suggestions on a module name?  'digest' comes to mind.

hashtools, hashlib, and _hash are common names for helper modules like this.

(you still provide md5 and sha wrappers, I hope)

/F 





Re: [Python-Dev] 2.4 func.__name__ breakage

2005-02-17 Thread Michael Hudson
Tim Peters [EMAIL PROTECTED] writes:

 Rev 2.66 of funcobject.c made func.__name__ writable for the first
 time.  That's great, but the patch also introduced what I'm pretty
 sure was an unintended incompatibility:  after 2.66, func.__name__ was
 no longer *readable* in restricted execution mode. 

Yeah, my bad.

 I can't think of a good reason to restrict reading func.__name__,
 and it looks like this part of the change was an accident.  So,
 unless someone objects soon, I intend to restore that func.__name__
 is readable regardless of execution mode (but will continue to be
 unwritable in restricted execution mode).

 Objections?

Well, I fixed it on reading the bug report and before getting to
python-dev mail :) Sorry if this duplicated your work, but hey, it was
only a two line change...

Cheers,
mwh

-- 
  The only problem with Microsoft is they just have no taste.
  -- Steve Jobs, (From _Triumph of the Nerds_ PBS special)
and quoted by Aahz on comp.lang.python


Re: [Python-Dev] [ python-Bugs-1124637 ] test_subprocess is far too slow (fwd)

2005-02-17 Thread Nick Coghlan
Peter Astrand wrote:
I'd like to have your opinion on this bug. Personally, I'd prefer to keep
test_no_leaking as it is, but if you think otherwise...
One thing that can actually justify test_subprocess taking 20% of the
overall time is that it is a good generic Python stress test - it might
catch some other startup race condition, for example.
test_decimal has a short version which tests basic functionality and always 
runs, but enabling -udecimal also runs the specification tests (which take a 
fair bit longer).

So keeping the basic subprocess tests unconditional, and running the long ones 
only if -uall or -usubprocess are given would seem reasonable.
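As an illustration, here is a minimal sketch (not the actual test_subprocess
code) of how a test module can gate its expensive cases on a regrtest
resource; test_support.is_resource_enabled() and run_unittest() are the
helpers the stdlib tests already use:

    import unittest
    from test import test_support

    class BasicTests(unittest.TestCase):
        def test_sanity(self):
            self.assertEqual(1 + 1, 2)

    class SlowTests(unittest.TestCase):
        def test_lots_of_work(self):
            self.assertEqual(sum(range(10**6)), 499999500000)

    def test_main():
        tests = [BasicTests]
        # Run the slow cases only when -uall or -usubprocess was passed to regrtest.
        if test_support.is_resource_enabled('subprocess'):
            tests.append(SlowTests)
        test_support.run_unittest(*tests)

    if __name__ == '__main__':
        test_main()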

Cheers,
Nick.
--
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---
http://boredomandlaziness.skystorm.net


[Python-Dev] Re: [ python-Bugs-1124637 ] test_subprocess is far too slow (fwd)

2005-02-17 Thread Fredrik Lundh
Nick Coghlan wrote:

 One thing that can actually justify test_subprocess taking 20% of the
 overall time is that it is a good generic Python stress test - it might
 catch some other startup race condition, for example.

 test_decimal has a short version which tests basic functionality and always 
 runs, but 
 enabling -udecimal also runs the specification tests (which take a fair bit 
 longer).

 So keeping the basic subprocess tests unconditional, and running the long 
 ones only if -uall 
 or -usubprocess are given would seem reasonable.

does anyone ever use the -u options when running tests?

/F 





Re: [Python-Dev] Re: [ python-Bugs-1124637 ] test_subprocess is far too slow (fwd)

2005-02-17 Thread Michael Hudson
Fredrik Lundh [EMAIL PROTECTED] writes:

 Nick Coghlan wrote:

 One thing that can actually justify test_subprocess taking 20% of the
 overall time is that it is a good generic Python stress test - it might
 catch some other startup race condition, for example.

 test_decimal has a short version which tests basic functionality and always 
 runs, but 
 enabling -udecimal also runs the specification tests (which take a fair bit 
 longer).

 So keeping the basic subprocess tests unconditional, and running the long 
 ones only if -uall 
 or -usubprocess are given would seem reasonable.

 does anyone ever use the -u options when running tests?

Yes, occasionally.  Esp. with test_compiler, a testall run is an
overnight job, but I try to do it every now and again.

Cheers,
mwh

-- 
  If design space weren't so vast, and the good solutions so small a
  portion of it, programming would be a lot easier.
-- maney, comp.lang.python


Re: [Python-Dev] 2.4 func.__name__ breakage

2005-02-17 Thread Tim Peters
[Michael Hudson]
 ...
 Well, I fixed it on reading the bug report and before getting to
 python-dev mail :) Sorry if this duplicated your work, but hey, it was
 only a two line change...

Na, the real work was tracking it down in the bowels of Zope's C-coded
security machinery -- we'll let you do that part next time <wink>.

Did you add a test to ensure this remains fixed?  A NEWS blurb (at
least for 2.4.1 -- the test failures under 2.4 are very visible in the
Zope world, due to auto-generated test runner failure reports)?
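
For reference, a hypothetical sketch of the kind of regression test being
asked for here (not the test that was actually checked in):

    import unittest

    def sample():
        pass

    class FuncNameTest(unittest.TestCase):
        def test_name_readable_and_writable(self):
            # func.__name__ should be readable, and writable since rev 2.66
            # of funcobject.c.
            self.assertEqual(sample.__name__, 'sample')
            sample.__name__ = 'renamed'
            self.assertEqual(sample.__name__, 'renamed')

    if __name__ == '__main__':
        unittest.main()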


Re: [Python-Dev] 2.4 func.__name__ breakage

2005-02-17 Thread Michael Hudson
Tim Peters [EMAIL PROTECTED] writes:

 [Michael Hudson]
 ...
 Well, I fixed it on reading the bug report and before getting to
 python-dev mail :) Sorry if this duplicated your work, but hey, it was
 only a two line change...

 Na, the real work was tracking it down in the bowels of Zope's C-coded
 security machinery -- we'll let you do that part next time <wink>.

 Did you add a test to ensure this remains fixed?

Yup.

 A NEWS blurb (at least for 2.4.1 -- the test failures under 2.4 are
 very visible in the Zope world, due to auto-generated test runner
 failure reports)?

No, I'll do that now.  I'm not very good at remembering NEWS blurbs...

Cheers,
mwh

-- 
6. The code definitely is not portable - it will produce incorrect 
   results if run from the surface of Mars.
   -- James Bonfield, http://www.ioccc.org/2000/rince.hint


Re: [Python-Dev] Re: [ python-Bugs-1124637 ] test_subprocess is far too slow (fwd)

2005-02-17 Thread Tim Peters
[Fredrik Lundh]
 does anyone ever use the -u options when running tests?

Yes -- I routinely do -uall, under both release and debug builds, but
only on Windows.  WinXP in particular seems to do a good job when
hyper-threading is available -- running the tests doesn't slow down
anything else I'm doing, except during the disk-intensive tests
(test_largefile is a major pig on Windows).


Re: [Python-Dev] 2.4 func.__name__ breakage

2005-02-17 Thread Tim Peters
[sorry for the near-duplicate msgs -- looks like gmail lied when it claimed the
 first msg was still in draft status]

 Did you add a test to ensure this remains fixed?

[mwh]
 Yup.

Bless you.  Did you attach a contributor agreement and mark the test
as being contributed under said contributor agreement, adjacent to
your valid copyright notice <wink>?

 A NEWS blurb ...?

 No, I'll do that now.  I'm not very good at remembering NEWS blurbs...

LOL -- sorry, I'm just imagining what NEWS would look like if we
required a contributor-agreement notification on each blurb.  I
appreciate your work here, and will try to find a drug to counteract
the ones I appear to have overdosed on this morning ...


Re: [Python-Dev] 2.4 func.__name__ breakage

2005-02-17 Thread Michael Hudson
Tim Peters [EMAIL PROTECTED] writes:

 [sorry for the near-duplicate msgs -- looks like gmail lied when it claimed 
 the
  first msg was still in draft status]

 Did you add a test to ensure this remains fixed?

 [mwh]
 Yup.

 Bless you.  Did you attach a contributor agreement and mark the test
 as being contributed under said contributor agreement, adjacent to
 your valid copyright notice <wink>?

Fortunately 2 lines << 25 lines, so I think I'm safe on this one :)

Cheers,
mwh

-- 
  <moshez> glyph: I don't know anything about reality.
-- from Twisted.Quotes


Re: [Python-Dev] [ python-Bugs-1124637 ] test_subprocess is far too slow (fwd)

2005-02-17 Thread Guido van Rossum
 I'd like to have your opinion on this bug. Personally, I'd prefer to keep
 test_no_leaking as it is, but if you think otherwise...
 
 One thing that can actually justify test_subprocess taking 20% of the
 overall time is that it is a good generic Python stress test - it might
 catch some other startup race condition, for example.

A suite of unit tests is a precious thing. We want to test as much as
we can, and as thoroughly as possible; but at the same time we want
the test to run reasonably fast. If the test takes too long, human
nature being what it is, this will actually cause less thorough
testing because developers don't feel like running the test suite
after each small change, and then we get frequent problems where
someone breaks the build because they couldn't wait to run the unit
test.

(For example, where I work we have a Java test suite that takes 25
minutes to run. The build is broken on a daily basis by developers
(including me) who make a small change and check it in believing it
won't break anything.)

The Python test suite already has a way (the -u flag) to distinguish
between regular broad-coverage testing and deep coverage for
specific (or all) areas. Let's keep the really long-running tests out
of the regular test suite.

There used to be a farm of machines that did nothing but run the test
suite (snake-farm). This seems to have stopped (it was run by
volunteers at a Swedish university). Maybe we should revive such an
effort, and make sure it runs with -u all.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


RE: [Python-Dev] [ python-Bugs-1124637 ] test_subprocess is far too slow (fwd)

2005-02-17 Thread Raymond Hettinger
 Let's keep the really long-running tests out
 of the regular test suite.

For test_subprocess, consider adopting the technique used by
test_decimal.  When -u decimal is not specified, a small random
selection of the resource-intensive tests is run.  That way, all of the
tests eventually get run even if no one is routinely using -u all.
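
A rough sketch of that technique, with hypothetical test names (not
test_decimal's actual code):

    import random
    from test import test_support

    ALL_SLOW_CASES = ['case_%03d' % i for i in range(200)]   # hypothetical ids

    def cases_to_run():
        if test_support.is_resource_enabled('decimal'):
            return ALL_SLOW_CASES                  # -udecimal: run everything
        return random.sample(ALL_SLOW_CASES, 10)   # otherwise: a small random subset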


Raymond


[Python-Dev] Five review rule on the /dev/ page?

2005-02-17 Thread Skip Montanaro

I am frantically trying to get ready to be out of town for a week of
vacation.  Someone sent me some patches for datetime and asked me to look at
them.  I begged off but referred him to http://www.python.org/dev/ and made
mention of the five patch review idea.  Can someone make sure that's
explained on the /dev/ site?

Thx,

Skip


Re: [Python-Dev] [ python-Bugs-1124637 ] test_subprocess is far too slow (fwd)

2005-02-17 Thread Walter Dörwald
Guido van Rossum wrote:
[...]
There used to be a farm of machines that did nothing but run the test
suite (snake-farm). This seems to have stopped (it was run by
volunteers at a Swedish university). Maybe we should revive such an
effort, and make sure it runs with -u all.
I've changed the job that produces the data for
http://coverage.livinglogic.de/ to run

    python Lib/test/regrtest.py -uall -T -N

Unfortunately this job currently produces only coverage info; the output
of the test suite is thrown away. It should be easy to fix this, so that
the output gets put into the database.
Bye,
   Walter Dörwald


Re: [Python-Dev] [ python-Bugs-1124637 ] test_subprocess is far too slow (fwd)

2005-02-17 Thread Michael Hudson
Raymond Hettinger [EMAIL PROTECTED] writes:

 Let's keep the really long-running tests out
 of the regular test suite.

 For test_subprocess, consider adopting the technique used by
 test_decimal.  When -u decimal is not specified, a small random
 selection of the resource-intensive tests is run.  That way, all of the
 tests eventually get run even if no one is routinely using -u all.

I do like this strategy but I don't think it applies to this test --
it has to try to create more than 'ulimit -n' processes, if I
understand it correctly.  Which makes me think there might be other
ways to write the test if the resource module is available...
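
For example (a hedged sketch, not from the original mail), the 'ulimit -n'
value mentioned above can be queried directly wherever the resource module
exists:

    try:
        import resource
    except ImportError:
        resource = None   # e.g. on Windows

    if resource is not None:
        # The soft limit is what "ulimit -n" reports in the shell.
        soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
        print(soft)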

Cheers,
mwh

-- 
34. The string is a stark data structure and everywhere it is
passed there is much duplication of process.  It is a perfect
vehicle for hiding information.
  -- Alan Perlis, http://www.cs.yale.edu/homes/perlis-alan/quotes.html


Re: [Python-Dev] Five review rule on the /dev/ page?

2005-02-17 Thread Aahz
On Thu, Feb 17, 2005, Skip Montanaro wrote:

 I am frantically trying to get ready to be out of town for a
 week of vacation.  Someone sent me some patches for datetime
 and asked me to look at them.  I begged off but referred him to
 http://www.python.org/dev/ and made mention of the five patch review
 idea.  Can someone make sure that's explained on the /dev/ site?

This should go into Brett's survey of the Python dev process, not as
official documentation.  It's simply an offer made by some of the
prominent members of python-dev.
-- 
Aahz ([EMAIL PROTECTED])   * http://www.pythoncraft.com/

The joy of coding Python should be in seeing short, concise, readable
classes that express a lot of action in a small amount of clear code -- 
not in reams of trivial code that bores the reader to death.  --GvR


Re: [Python-Dev] builtin_id() returns negative numbers

2005-02-17 Thread Armin Rigo
Hi Tim,

On Mon, Feb 14, 2005 at 10:41:35AM -0500, Tim Peters wrote:
 # This is a puzzle:  there's no way to know the natural width of
 # addresses on this box (in particular, there's no necessary
 # relation to sys.maxint).

Isn't this natural width nowadays available as:

256 ** struct.calcsize('P')

?
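
For instance (a small illustration of the idea, not from the original mail):

    import struct

    ADDRESS_SPACE = 256 ** struct.calcsize('P')   # 2**32 on a 32-bit box, 2**64 on a 64-bit one

    def unsigned_id(obj):
        # Map a possibly-negative id() onto the corresponding unsigned address.
        return id(obj) % ADDRESS_SPACE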


Armin


[Python-Dev] Re: Re: string find(substring) vs. substring in string

2005-02-17 Thread Fredrik Lundh
Raymond Hettinger wrote:

  but refactoring the contains code to use find_internal sounds like a good
  first step. any takers?

 I'm up for it.

excellent!

just fyi, unless my benchmark is mistaken, the Unicode implementation has
the same problem:

str in - 25.8 µsec per loop
unicode in - 26.8 µsec per loop

str.find() - 6.73 µsec per loop
unicode.find() - 7.24 µsec per loop

oddly enough, if I change the target string so it doesn't contain any partial
matches at all, unicode.find() wins the race:

str in - 24.5 µsec per loop
unicode in - 24.6 µsec per loop

str.find() - 2.86 µsec per loop
unicode.find() - 2.16 µsec per loop
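
The numbers read like timeit output; a hedged reconstruction of the kind of
comparison being run (exact strings, loop counts and machines will differ):

    import timeit

    setup = "s = 'a' * 1000 + 'needle'"
    for expr in ["'needle' in s", "s.find('needle')"]:
        usec = timeit.Timer(expr, setup).timeit(100000) / 100000 * 1e6
        print('%-18s %.2f usec per loop' % (expr, usec))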

/F 





Re: [Python-Dev] Five review rule on the /dev/ page?

2005-02-17 Thread Brett C.
[removed pydotorg from people receiving this email]
Aahz wrote:
On Thu, Feb 17, 2005, Skip Montanaro wrote:
I am frantically trying to get ready to be out of town for a
week of vacation.  Someone sent me some patches for datetime
and asked me to look at them.  I begged off but referred him to
http://www.python.org/dev/ and made mention of the five patch review
idea.  Can someone make sure that's explained on the /dev/ site?

This should go into Brett's survey of the Python dev process, not as
official documentation.  It's simply an offer made by some of the
prominent members of python-dev.
I am planning on adding that blurb in there.
Actually, while I have everyone's attention, I might as well throw an idea out 
there about sprucing up yet again the docs on contributing.  I was thinking of 
taking the current dev intro and having it just explain how things basically work 
around here.  So the doc would become more of just a high-level overview of how 
we dev the language.

But I would cut out the helping out section and spin that into another doc that 
would go into some more detail on how to make a contribution.  So this would 
specify in more detail how to report a bug, how to comment on one, etc. (same 
goes for patches).  This is where I would stick the 5-for-1 deal.

Lastly, write up a doc that covers what one with CVS checkin rights needs to do 
when checking in code.  So how one goes about getting checkin rights, getting 
initial checkins OK'ed by others, and then the usual steps taken for a checkin.

Sound worth it to people?  Not really needed so go back and do your homework, 
Brett?  What?

-Brett


Re: [Python-Dev] [ python-Bugs-1124637 ] test_subprocess is far too slow (fwd)

2005-02-17 Thread Marcus Alanen
Guido van Rossum wrote:
The Python test suite already has a way (the -u flag) to distinguish
between regular broad-coverage testing and deep coverage for
specific (or all) areas. Let's keep the really long-running tests out
of the regular test suite.
There used to be a farm of machines that did nothing but run the test
suite (snake-farm). This seems to have stopped (it was run by
volunteers at a Swedish university). Maybe we should revive such an
effort, and make sure it runs with -u all.
Hello Guido and everybody else,
I hacked together a simple distributed unittest runner for our projects. 
Requirements are an NFS-mounted home directory across the slave nodes and 
SSH-based automatic authentication, i.e. no passwords or passphrases 
necessary. It officially works-for-me for around three hosts (see below) 
so that cuts the time down basically to a third (real-life example ~600 
seconds to ~200 seconds, so it does work :-). It also supports 
serialized tests, i.e. tests that must be run one after the other and 
cannot be run in parallel.

http://mde.abo.fi/tools/disttest/
Comes with some problems; my blurb from advogato.org:

Disttest is a distributed unittesting runner. You simply set the 
DISTTEST_HOSTS variable to a space-separated list of hostnames to 
connect to using SSH, and then run disttest. The nodes must all have 
the same filesystem (usually an NFS-mounted /home) and have the Disttest 
program installed. You even gain a bit with just one computer by setting 
the variable to "localhost localhost". :-)

There are currently two annoying problems with it, though. For some 
reason, 1) the unittest program connecting to the X server sometimes 
fails to provide the correct authentication, and 2) sometimes the actual 
connection to the X server can't be established. I think these are 
related to 1) congestion on the shared .Xauthority file, and 2) a too 
small listen() queue on the forwarding port by the SSH daemon. Both 
problems show up when using too many (over 4?) hosts, which is the whole 
point of the program! Sigh.


Error checking probably bad. Anyway, feel free to check it out, modify, 
comment or anything. We're thinking of checking the assumptions in the 
blurb above, but no timetable is set.

My guess is that the NFS-mounted home directory is the showstopper and 
people usually don't have lots of machines hanging around, but that's 
for you to decide.

Disclaimer: I don't know anything of CPython development nor of the 
tests in the CPython test suite. ;-)

Best regards, and a big thank you for Python,
Marcus


RE: [Python-Dev] Windows Low Fragementation Heap yields speedup of ~15%

2005-02-17 Thread Gfeller Martin
Hi,

what immediately comes to mind are Modules/cPickle.c and Modules/cStringIO.c, 
which (I believe) are heavily used by ZODB (which in turn is heavily used by 
the application). 

The lists also get fairly large, although not huge - up to typically 50,000
(complex) objects in the tests I've measured. As I said, I don't speak C, so I 
can only speculate - do the lists at some point grow beyond the upper limit of 
obmalloc, but are handled by the LFH (which has a higher upper limit, if I 
understood Tim Peters correctly)?

Best regards,
Martin





-----Original Message-----
From: Evan Jones [mailto:[EMAIL PROTECTED] 
Sent: Thursday, 17 Feb 2005 02:26
To: Python Dev
Cc: Gfeller Martin; Martin v. Löwis
Subject: Re: [Python-Dev] Windows Low Fragementation Heap yields speedup of ~15%


On Feb 16, 2005, at 18:42, Martin v. Löwis wrote:
 I must admit that I'm surprised. I would have expected
 that most allocations in Python go through obmalloc, so
 the heap would only see large allocations.

 It would be interesting to find out, in your application,
 why it is still an improvement to use the low-fragmentation
 heaps.

Hmm... This is an excellent point. A grep through the Python source 
code shows that the following files call the native system malloc (I've 
excluded a few obviously platform specific files). A quick visual 
inspection shows that most of these are using it to allocate some sort 
of array or string, so it likely *should* go through the system malloc. 
Gfeller, any idea if you are using any of the modules on this list? If 
so, it would be pretty easy to try converting them to call the obmalloc 
functions instead, and see how that affects the performance.

Evan Jones


Demo/pysvr/pysvr.c
Modules/_bsddb.c
Modules/_curses_panel.c
Modules/_cursesmodule.c
Modules/_hotshot.c
Modules/_sre.c
Modules/audioop.c
Modules/bsddbmodule.c
Modules/cPickle.c
Modules/cStringIO.c
Modules/getaddrinfo.c
Modules/main.c
Modules/pyexpat.c
Modules/readline.c
Modules/regexpr.c
Modules/rgbimgmodule.c
Modules/svmodule.c
Modules/timemodule.c
Modules/zlibmodule.c
PC/getpathp.c
Python/strdup.c
Python/thread.c



Re: [Python-Dev] Windows Low Fragementation Heap yields speedup of ~15%

2005-02-17 Thread Tim Peters
[Gfeller Martin]
 what immediately comes to mind are Modules/cPickle.c and
 Modules/cStringIO.c, which (I believe) are heavily used by ZODB (which in turn
 is heavily used by the application).

I probably guessed right the first time wink:  LFH doesn't help with
the lists directly, but helps indirectly by keeping smaller objects
out of the general heap where the list guts actually live.

Say we have a general heap with a memory map like this, meaning a
contiguous range of available memory, where 'f' means a block is free.
 The units of the block don't really matter, maybe one 'f' is one
byte, maybe one 'f' is 4MB -- it's all the same in the end:

ffffffffffffffffffffffffffffffff

Now you allocate a relatively big object (like the guts of a large
list), and it's assigned a contiguous range of blocks marked 'b':

bbbbbbbbbbbbffffffffffffffffffff

Then you allocate a small object, marked 's':

bbbbbbbbbbbbsfffffffffffffffffff

Then you want to grow the big object.  Oops!  It can't extend the block
of b's in-place, because 's' is in the way.  Instead it has to copy
the whole darn thing:

ffffffffffffsbbbbbbbbbbbbbbbbbbb

But if 's' is allocated from some _other_ heap, then the big object
can grow in-place, and that's much more efficient than copying the
whole thing.

obmalloc has two primary effects:  it manages a large number of very
small (<= 256 bytes) memory chunks very efficiently, but it _also_
helps larger objects indirectly, by keeping the very small objects out
of the platform C malloc's way.

LFH appears to be an extension of the same basic idea, raising the
small object limit to 16KB.

Now note that pymalloc and LFH are *bad* ideas for objects that want
to grow.  pymalloc and LFH segregate the memory they manage into
blocks of different sizes.  For example, pymalloc keeps a list of free
blocks each of which is exactly 64 bytes long.  Taking a 64-byte block
out of that list, or putting it back in, is very efficient.  But if an
object that uses a 64-byte block wants to grow, pymalloc can _never_
grow it in-place, it always has to copy it.  That's a cost that comes
with segregating memory by size, and for that reason Python
deliberately doesn't use pymalloc in several cases where objects are
expected to grow over time.

One thing to take from that is that LFH can't be helping list-growing
in a direct way either, if LFH (as seems likely) also needs to copy
objects that grow in order to keep its internal memory segregated by
size.  The indirect benefit is still available, though:  LFH may be
helping simply by keeping smaller objects out of the general heap's
hair.

 The lists also get fairly large, although not huge - up to typically 50,000
 (complex) objects in the tests I've measured.

That's much larger than LFH can handle.  Its limit is 16KB.  A Python
list with 50K elements requires a contiguous chunk of 200KB on a
32-bit machine to hold the list guts.
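
A back-of-the-envelope check of those numbers (assuming a 32-bit build, as
in the rest of this message):

    n = 50000                  # list elements
    pointer_bytes = 4          # per element on a 32-bit build
    guts = n * pointer_bytes
    print(guts)                # 200000 bytes, i.e. ~200KB of contiguous memory
    print(guts > 16 * 1024)    # True: far beyond the 16KB LFH limit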

 As I said, I don't speak C, so I can only speculate - do the lists at some
 point grow beyond the upper limit of obmalloc, but are handled by the LFH
 (which has a higher upper limit, if I understood Tim Peters correctly)?

A Python list object comprises two separately allocated pieces of
memory.  First is a list header, a small piece of memory of fixed
size, independent of len(list).  The list header is always obtained
from obmalloc; LFH will never be involved with that, and neither will
the system malloc.  The list header has a pointer to a separate piece
of memory, which contains the guts of a list, a contiguous vector of
len(list) pointers (to Python objects).  For a list of length n, this
needs 4*n bytes on a 32-bit box.  obmalloc never manages that space,
and for the reason given above:  we expect that list guts may grow,
and obmalloc is meant for fixed-size chunks of memory.

So the list guts will get handled by LFH, until the list needs more
than 4K entries (hitting the 16KB LFH limit).  Until then, LFH
probably wastes time by copying growing list guts from size class to
size class.  Then the list guts finally get copied to the general
heap, and stay there.

I'm afraid the only way you can know for sure is by obtaining detailed
memory maps and analyzing them.


Re: [Python-Dev] license issues with profiler.py and md5.h/md5c.c

2005-02-17 Thread Donovan Baarda
On Wed, 2005-02-16 at 22:53 -0800, Gregory P. Smith wrote:
 fyi - i've updated the python sha1/md5 openssl patch.  it now replaces
 the entire sha and md5 modules with a generic hashes module that gives
 access to all of the hash algorithms supported by OpenSSL (including
 appropriate legacy interface wrappers and falling back to the old code
 when compiled without openssl).
 
  
 https://sourceforge.net/tracker/index.php?func=detail&aid=1121611&group_id=5470&atid=305470
 
 I don't quite like the module name 'hashes' that i chose for the
 generic interface (too close to the builtin hash() function).  Other
 suggestions on a module name?  'digest' comes to mind.

I just had a quick look, and have these comments (pseudo patch review?).
Apologies for the noise on the list...

DESCRIPTION
===

This patch keeps the current md5c.c, md5module.c files and adds the
following: _hashopenssl.c, hashes.py, md5.py, sha.py.

The old md5 and sha extension modules get replaced by hashes.py, md5.py,
and sha.py python modules that leverage off _hash (openssl) or _md5 and
_sha (no openssl) extension modules.

The new _hash extension module wraps the high level openssl EVP
interface, which uses a string parameter to indicate what type of
message digest algorithm to use. The advantage of this is it makes all
openssl supported digests available, and if openssl adds more, we get
them for free. A disadvantage of this is it is an abstraction level
above the actual md5 and sha implementations, and this may add
overheads. These overheads are probably negligible compared to the
actual implementation speedups.

The new _md5 and _sha extension modules are simply re-named versions of
the old md5 and sha modules.

The hashes.py module acts as an import wrapper for _hash, and falls back
to using _md5 and _sha modules if _hash is not available. It provides an
EVP style API (string hash name parameter), that supports only md5 and
sha hashes if openssl is not available.

The new md5.py and sha.py modules simply use hash.py.
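
A rough sketch of the fallback structure described above (names taken from
this description; the actual patch's signatures may differ):

    # hashes.py -- sketch only
    try:
        import _hash                            # OpenSSL-backed EVP wrapper

        def new(name, string=''):
            return _hash.new(name, string)

    except ImportError:                         # built without OpenSSL
        import _md5, _sha                       # renamed legacy modules

        _constructors = {'md5': _md5.new, 'sha': _sha.new}

        def new(name, string=''):
            return _constructors[name](string)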

COMMENTS


The introduction of a hashes module with a new API that supports many
different digests (provided openssl is available) is extending Python,
not just fixing the licenses of md5 and sha modules.

If all we wanted to do was fix the md5 module, a simpler solution would
be to change the md5c.c API to match openssl's implementation, and make
md5module.c use it, conditionally compiling against md5c.c or linking
against openssl in setup.py. A similar approach could be used for sha,
but would require stripping the sha implementation out of shamodule.c

I am mildly concerned about the namespace/filespace clutter
introduced by this implementation... it feels unnecessary, as does the
tangled dependencies between them. With openssl, hashes.py duplicates
the functionality of _hash. Without openssl, md5.py and sha.py duplicate
_md5 and _sha, via a roundabout route through hash.py.

The python wrappers seem overly complicated, with things like

  def new(name, string=None):
      if string:
          return _hash.new(name, string)
      else:
          return _hash.new(name)

being common where the following would suffice:

  def new(name, string=''):
      return _hash.new(name, string)

I think this is because _hash.new() uses an optional string parameter,
but I have a feeling a C update() with a zero-length string is faster than
this Python 'if'. If it was a concern, the C implementation could check
the value of the string length before calling update.

Given the convenience methods for different hashes in hashes.py (which
incidentally look like they are only available when _hash is not
available... something else that needs fixing), the md5.py module could
be simply coded as:

  from hashes import md5
  new = md5

Despite all these nit-picks, it looks pretty good. It is orders of
magnitude better than any of the other non-existent solutions, including
the one I didn't code :-)

-- 
Donovan Baarda [EMAIL PROTECTED]
http://minkirri.apana.org.au/~abo/



[Python-Dev] Prospective Peephole Transformation

2005-02-17 Thread Raymond Hettinger
Based on some ideas from Skip, I had tried transforming the likes of "x
in (1,2,3)" into "x in frozenset([1,2,3])".  When applicable, it
substantially simplified the generated code and converted the O(n)
lookup into an O(1) step.  There were substantial savings even if the
set contained only a single entry.  When disassembled, the bytecode is
not only much shorter, it is also much more readable (corresponding
almost directly to the original source).

The problem with the transformation was that it didn't handle the case
where x was non-hashable and it would raise a TypeError instead of
returning False as it should.  That situation arose once in the email
module's test suite.
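
A small illustration of the incompatibility (not from the original patch
discussion):

    x = []                          # unhashable

    print(x in (1, 2, 3))           # False: tuple membership only compares elements
    try:
        x in frozenset([1, 2, 3])   # set membership hashes the operand first
    except TypeError:
        print('unhashable -> TypeError')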

To get it to work, I would have to introduce a frozenset subtype:

class Searchset(frozenset):
    def __contains__(self, element):
        try:
            return frozenset.__contains__(self, element)
        except TypeError:
            return False

Then, the transformation would be "x in Searchset([1, 2, 3])".  Since
the new Searchset object goes in the constant table, marshal would have
to be taught how to save and restore the object.

This is more complicated than the original frozenset version of the
patch, so I would like to get feedback on whether you guys think it is
worth it.



Raymond Hettinger
