Re: [Python-Dev] Python wins Linux New Media Award for Best Open Source Programming Language

2009-03-06 Thread Lie Ryan

Martin v. Löwis wrote:
>>> The prize was Martin von Löwis of the Python Foundation on behalf of the
>>> Python community itself.
>>
>> This is a funny translation from German-to-English. :-)
>>
>> But yeah, a good one and the prize was presented by Klaus Knopper of Knoppix.
>>
>> Congratulations!
>
> Actually, the prize went to "Python", not to me, and not to the PSF. So
> congratulations to you as well!

The (translated) article says that YOU are the prize? WOW.

Ummm... better not to use an automatic translator for anything mission
critical.


___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Forgotten Py3.0 change to remove Queue.empty() and Queue.full()

2009-03-06 Thread Raymond Hettinger


[Guido van Rossum]
> Based on our experiences so far, I think that of the two options we
> have, both of which are bad, it's better to keep things in 3.1 that we
> were planning to remove but forgot, than to make 3.1 have a whole slew
> of additional removals. We've removed cmp() in 3.0.1, and I think that
> was actually the right thing to do given its prominence and the clear
> decision to remove it, but for smaller stuff that didn't make the cut
> I think we should favor backwards compatibility over cleanup.

To some extent we get both by leaving them in the module
but continuing to leave them out of the docs.



Raymond


Re: [Python-Dev] Forgotten Py3.0 change to remove Queue.empty() and Queue.full()

2009-03-06 Thread Guido van Rossum
On Fri, Mar 6, 2009 at 5:17 PM, Raymond Hettinger wrote:
> I believe there was a thread (in January 2008) with a decision to keep
> qsize() but to drop empty() and full().

Based on our experiences so far, I think that of the two options we
have, both of which are bad, it's better to keep things in 3.1 that we
were planning to remove but forgot, than to make 3.1 have a whole slew
of additional removals. We've removed cmp() in 3.0.1, and I think that
was actually the right thing to do given its prominence and the clear
decision to remove it, but for smaller stuff that didn't make the cut
I think we should favor backwards compatibility over cleanup.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] Forgotten Py3.0 change to remove Queue.empty() and Queue.full()

2009-03-06 Thread Raymond Hettinger

>> Am not following you here.  My suggestion was to remove the two
>> methods in Py3.1 which isn't even in alpha yet.
>
> Your proposal was also to add a warning for 3.0.2. This is what I
> primarily object to.

Okay, that's fine.  Seemed prudent but it isn't essential.

>> This is for a feature
>> that has a simple substitute, was undocumented for Py3.0, and had
>> long been documented in Py2.x as being unreliable.
>
> The latter is not true. It was not documented as unreliable.

You're right.  It was the docstring that said it was unreliable.
The regular docs were more specific about its limitations.

> I still fail to see the rationale for removing these
> two methods.

I believe there was a thread (in January 2008) with a
decision to keep qsize() but to drop empty() and full().


Raymond


Re: [Python-Dev] Forgotten Py3.0 change to remove Queue.empty() and Queue.full()

2009-03-06 Thread Antoine Pitrou
Jesse Noller writes:
> 
> I would tend to agree with Martin,

Agreed as well.

Antoine.




Re: [Python-Dev] Forgotten Py3.0 change to remove Queue.empty() and Queue.full()

2009-03-06 Thread Jesse Noller
On Fri, Mar 6, 2009 at 7:19 PM, "Martin v. Löwis" wrote:
>>> I disagree that our users are served by constantly breaking the
>>> API, and removing stuff just because we can. I can't see how
>>> removing API can possibly serve a user.
>>
>> Am not following you here.  My suggestion was to remove the two
>> methods in Py3.1 which isn't even in alpha yet.
>
> Your proposal was also to add a warning for 3.0.2. This is what I
> primarily object to.
>
>> This is for a feature
>> that has a simple substitute, was undocumented for Py3.0, and had
>> long been documented in Py2.x as being unreliable.
>
> The latter is not true. It was not documented as unreliable. Instead,
> it was correctly documented as not being able, in general, to predict
> the result of a subsequent put operation. In that sense, it is as
> unreliable as the qsize() method, which remains supported and
> documented.
>
> Interestingly enough, the usage of .empty() in test_multiprocessing
> is entirely safe, AFAICT. So whether the API is reliable or unreliable
> very much depends on the application (as is true for many
> multi-threading issues).
>
> >> It seems silly to me that an incomplete patch from a year ago
>> would need to wait another two years to ever see the light of day
>
> Right. So it might be better to revert the patch, and restore the
> documentation. I still fail to see the rationale for removing these
> two methods.
>
> Regards,
> Martin

I would tend to agree with Martin. While it might be nice to
straitjacket the API into completely reliable calls (really, is
there anything like that with threads?), empty() and the like work
just fine when used correctly. I think anyone using Queue with threads
will generally understand that size/empty calls will only be reliable
when put calls are completed. We can continue to warn them about the
issues with using it with continual producers, but as Martin points out,
qsize suffers the same issue.

-jesse


Re: [Python-Dev] Forgotten Py3.0 change to remove Queue.empty() and Queue.full()

2009-03-06 Thread Martin v. Löwis
>> I disagree that our users are served by constantly breaking the
>> API, and removing stuff just because we can. I can't see how
>> removing API can possibly serve a user.
> 
> Am not following you here.  My suggestion was to remove the two
> methods in Py3.1 which isn't even in alpha yet.

Your proposal was also to add a warning for 3.0.2. This is what I
primarily object to.

> This is for a feature
> that has a simple substitute, was undocumented for Py3.0, and had
> long been documented in Py2.x as being unreliable.

The latter is not true. It was not documented as unreliable. Instead,
it was correctly documented as not being able, in general, to predict
the result of a subsequent put operation. In that sense, it is as
unreliable as the qsize() method, which remains supported and
documented.

Interestingly enough, the usage of .empty() in test_multiprocessing
is entirely safe, AFAICT. So whether the API is reliable or unreliable
very much depends on the application (as is true for many
multi-threading issues).
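
A minimal sketch of the distinction drawn above, assuming Python 3, where the module is named `queue` (the thread refers to the `Queue` class). The helper names are hypothetical. The first function calls empty() only after every producer thread has been joined, so no later put can invalidate the answer. The second skips the check and catches `queue.Empty` from get_nowait() instead; that is one possible substitute, not necessarily the one meant in this thread.

```python
import queue
import threading


def drain_after_producers_finish(items):
    """Put items from several threads, then drain once all puts are done."""
    q = queue.Queue()
    producers = [threading.Thread(target=q.put, args=(item,)) for item in items]
    for t in producers:
        t.start()
    for t in producers:
        t.join()
    # Every put has completed and only this thread consumes, so the
    # answer from empty() cannot go stale between the check and the get().
    drained = []
    while not q.empty():
        drained.append(q.get())
    return drained


def drain_without_empty_check(q):
    """Drain a queue without asking empty() first."""
    drained = []
    while True:
        try:
            drained.append(q.get_nowait())
        except queue.Empty:
            # Nothing was available at the moment of this call.
            return drained
```

With producers still running, either approach only reports a snapshot; the same caveat applies to qsize().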

> It seems silly to me that an incomplete patch from a year ago
> would need to wait another two years to ever see the light of day

Right. So it might be better to revert the patch, and restore the
documentation. I still fail to see the rationale for removing these
two methods.

Regards,
Martin


Re: [Python-Dev] draft 3.1 release schedule

2009-03-06 Thread Nick Coghlan
Stephen J. Turnbull wrote:
> A suggestion, though.  View the contribution visualization video based
> on the commit log (the URL was posted here a while back, but I don't
> seem to have it offhand), which shows what a vibrant community this is
> in a very graphic way.

There's one here:
http://www.vimeo.com/1093745

That one runs up until just after the switch to subversion (as indicated
by the big influx of "new" names at the end, which is largely an
artifact of usernames changing from shortened forms to full names).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
---


Re: [Python-Dev] Pickler/Unpickler API clarification

2009-03-06 Thread Collin Winter
On Fri, Mar 6, 2009 at 10:01 AM, Michael Haggerty wrote:
> Antoine Pitrou wrote:
>> On Friday, 6 March 2009 at 13:44 +0100, Michael Haggerty wrote:
>>> Antoine Pitrou wrote:
>>>> Michael Haggerty writes:
>>>>> It is easy to optimize the pickling of instances by giving them
>>>>> __getstate__() and __setstate__() methods.  But the pickler still
>>>>> records the type of each object (essentially, the name of its class) in
>>>>> each record.  The space for these strings constituted a large fraction
>>>>> of the database size.
>>>> If these strings are not interned, then perhaps they should be.
>>>> There is a similar optimization proposal (w/ patch) for attribute names:
>>>> http://bugs.python.org/issue5084
>>> If I understand correctly, this would not help:
>>>
>>> - on writing, the strings are identical anyway, because they are read
>>> out of the class's __name__ and __module__ fields.  Therefore the
>>> Pickler's usual memoizing behavior will prevent the strings from being
>>> written more than once.
>>
>> Then why did you say that "the space for these strings constituted a
>> large fraction of the database size", if they are already shared? Are
>> your objects so tiny that even the space taken by the pointer to the
>> type name grows the size of the database significantly?
>
> Sorry for the confusion.  I thought you were suggesting the change to
> help the more typical use case, when a single Pickler is used for a lot
> of data.  That use case will not be helped by interning the class
> __name__ and __module__ strings, for the reasons given in my previous email.
>
> In my case, the strings are shared via the Pickler memoizing mechanism
> because I pre-populate the memo (using the API that the OP proposes to
> remove), so your suggestion won't help my current code, either.  It was
> before I implemented the pre-populated memoizer that "the space for
> these strings constituted a large fraction of the database size".  But
> your suggestion wouldn't help that case, either.
>
> Here are the main use cases:
>
> 1. Saving and loading one large record.  A class's __name__ string is
> the same string object every time it is retrieved, so it only needs to
> be stored once and the Pickler memo mechanism works.  Similarly for the
> class's __module__ string.
>
> 2. Saving and loading lots of records sequentially.  Provided a single
> Pickler is used for all records and its memo is never cleared, this
> works just as well as case 1.
>
> 3. Saving and loading lots of records in random order, as for example in
> the shelve module.  It is not possible to reuse a Pickler with retained
> memo, because the Unpickler might not encounter objects in the right
> order.  There are two subcases:
>
>   a. Use a clean Pickler/Unpickler object for each record.  In this
> case the __name__ and __module__ of a class will appear once in each
> record in which the class appears.  (This is the case regardless of
> whether they are interned.)  On reading, the __name__ and __module__ are
> only used to look up the class, so interning them won't help.  It is
> thus impossible to avoid wasting a lot of space in the database.
>
>   b. Use a Pickler/Unpickler with a preset memo for each record (my
> unorthodox technique).  In this case the class __name__ and __module__
> will be memoized in the shared memo, so in other records only their ID
> needs to be stored (in fact, only the ID of the class object itself).
> This allows the database to be smaller, but does not have any effect on
> the RAM usage of the loaded objects.
>
> If the OP's proposal is accepted, 3b will become impossible.  The
> technique seems not to be well known, so maybe it doesn't need to be
> supported.  It would mean some extra work for me on the cvs2svn project
> though :-(
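
A minimal sketch of use cases 2 and 3a from the list above, assuming Python 3 and using `fractions.Fraction` purely as a stand-in record type (the helper names are hypothetical). It measures how many bytes each record takes when a clean Pickler is used per record versus one Pickler whose memo is never cleared. It does not attempt the pre-populated memo of case 3b.

```python
import io
import pickle
from fractions import Fraction


def record_sizes_fresh_picklers(objs, protocol=2):
    """Case 3a: a clean Pickler per record, so each record names its class."""
    sizes = []
    for obj in objs:
        buf = io.BytesIO()
        pickle.Pickler(buf, protocol).dump(obj)
        sizes.append(len(buf.getvalue()))
    return sizes


def record_sizes_shared_pickler(objs, protocol=2):
    """Case 2: one Pickler, memo retained across records."""
    buf = io.BytesIO()
    pickler = pickle.Pickler(buf, protocol)
    sizes = []
    for obj in objs:
        start = buf.tell()
        pickler.dump(obj)  # later records refer to the class by memo ID
        sizes.append(buf.tell() - start)
    return sizes, buf.getvalue()


def load_sequentially(data, count):
    """Read the records back in writing order with a single Unpickler."""
    unpickler = pickle.Unpickler(io.BytesIO(data))
    return [unpickler.load() for _ in range(count)]
```

With the shared Pickler the second record should come out smaller than the first, because the class is written as a memo reference. Those records can only be read back in the order they were written, which is why this does not carry over to random access as in shelve.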

Talking it over with Guido, support for the memo attribute will have
to stay. I shall add it back to my patches.

Collin


Re: [Python-Dev] Forgotten Py3.0 change to remove Queue.empty() and Queue.full()

2009-03-06 Thread Tennessee Leeuwenburg
I don't mind whether it's "in" or "out", but as a language user I think it's
best to minimise undocumented interfaces.
According to that principle, if it's "in", then it should also work as
documented (and be documented), and be "supported". If it's "out" then it
should either be removed entirely or be marked "private" (i.e. leading
underscore, unless I'm mistaking my style guidelines).

Cheers,
-T

On Sat, Mar 7, 2009 at 10:19 AM, Raymond Hettinger wrote:

>
> [Martin v. Löwis]
>
>> I disagree that our users are served by constantly breaking the
>> API, and removing stuff just because we can. I can't see how
>> removing API can possibly serve a user.
>>
>
> Am not following you here.  My suggestion was to remove the two
> methods in Py3.1 which isn't even in alpha yet.  This is for a feature
> that has a simple substitute, was undocumented for Py3.0, and had
> long been documented in Py2.x as being unreliable.
>
> It seems silly to me that an incomplete patch from a year ago
> would need to wait another two years to ever see the light of day
> (am presuming that 3.1 goes final this summer and that 3.2 follows
> 18 months later). That being said, I don't really care much.
> We don't actually have to do anything.  It could stay in forever
> and cause no harm.
>
>
> Raymond



-- 
--
Tennessee Leeuwenburg
http://myownhat.blogspot.com/
"Don't believe everything you think"


Re: [Python-Dev] Forgotten Py3.0 change to remove Queue.empty() and Queue.full()

2009-03-06 Thread Raymond Hettinger


[Martin v. Löwis]
> I disagree that our users are served by constantly breaking the
> API, and removing stuff just because we can. I can't see how
> removing API can possibly serve a user.

Am not following you here.  My suggestion was to remove the two
methods in Py3.1 which isn't even in alpha yet.  This is for a feature
that has a simple substitute, was undocumented for Py3.0, and had
long been documented in Py2.x as being unreliable.

It seems silly to me that an incomplete patch from a year ago
would need to wait another two years to ever see the light of day
(am presuming that 3.1 goes final this summer and that 3.2 follows
18 months later). That being said, I don't really care much.
We don't actually have to do anything.  It could stay in forever
and cause no harm.


Raymond 




Re: [Python-Dev] Forgotten Py3.0 change to remove Queue.empty() and Queue.full()

2009-03-06 Thread Martin v. Löwis
> IIRC, that was the rationale for cmp() removal in 3.0.1.

And indeed, that removal already caused a bug report and broke
the efforts of SWIG to support Python 3.0.

I disagree that our users are served by constantly breaking the
API, and removing stuff just because we can. I can't see how
removing API can possibly serve a user.

What's wrong with empty() and full() in the first place?

Regards,
Martin


Re: [Python-Dev] Integrate lxml into the stdlib?

2009-03-06 Thread Martin v. Löwis
> I see. I didn't realize you were talking about adding your own files
> to these directories. I have no idea; the best way to find out is to
> experiment. I could see the default policy of Windows installers go
> either way.

An upgrade installation removes all old files it installed (the old
MSI is still present to know what these files are), then installs
new files.

Microsoft intended version resources to be used in the upgrade, so
the upgrade would only have to replace the files that got a new
version (rather than having to do uninstall-then-install).
Unfortunately, that is incapable of upgrading .py files. So Microsoft
added md5 (I think) hashes that can be used to detect files that
don't need upgrade. I tested it, and it was *very* slow, so I reverted
to the current procedure.

In any case, any additional files present will remain untouched.
They will also remain on uninstallation - so uninstallation might not
be able to remove all folders that installation originally created.

Regards,
Martin


Re: [Python-Dev] Integrate lxml into the stdlib?

2009-03-06 Thread Martin v. Löwis
>> DLLs/sqlite3.dll 557K
> 
> This is sqlite3 itself. I am presuming that the phrase "replace the
> sqlite DLL" above refers to this one

Correct.

> -- although the same argument actually holds for the .pyd file

Not quite. You can download Windows binaries for newer sqlite
versions from sqlite.org, so you don't need a compiler to update
sqlite (which you likely would if _sqlite3.pyd would need to be
replaced). So you can "bypass" Python and its release process for
updates to sqlite.
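
A minimal sketch for checking the effect of such a swap, assuming the standard `sqlite3` module; the function name is hypothetical. `sqlite3.sqlite_version` and `sqlite3.sqlite_version_info` report the version of the SQLite library actually loaded at runtime, so they should change after the DLL is replaced even though Python itself was not rebuilt.

```python
import sqlite3


def sqlite_library_version():
    """Return the runtime SQLite library version as (string, tuple)."""
    # These describe the underlying SQLite library, not the Python
    # wrapper module or the Python release it shipped with.
    return sqlite3.sqlite_version, sqlite3.sqlite_version_info


if __name__ == "__main__":
    text, info = sqlite_library_version()
    print("SQLite library in use:", text, info)
```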

>> libs/_sqlite3.lib 2K
> 
> I think this is a summary of the entry points into one of the above
> DLLs for the benefit of other code wanting to link against it, but I'm
> not sure.

Correct. I don't know why I include them in the MSI - they are there
because they were also shipped with the Wise installer. I see no
use - nobody should be linking against an extension module.

>> I do not know whether upgrades (like 3.0.0 to 3.0.1) would clobber other
>> things added here.
> 
> It would, but not in a harmful way.

If the user had upgraded sqlite, upgrading Python would undo that,
though. So one would have to re-upgrade sqlite afterwards.

Regards,
Martin


Re: [Python-Dev] Integrate lxml into the stdlib?

2009-03-06 Thread Martin v. Löwis
> Interesting. I assume you are referring to Windows here, right? Does that
> "just work" because the DLL is in the same directory?

Correct. Also, because changes to SQLite don't change the API, just the
implementation.

Regards,
Martin


Re: [Python-Dev] patch commit policies (was [issue4308] repr of httplib.IncompleteRead is stupid)

2009-03-06 Thread rdmurray

On Fri, 6 Mar 2009 at 20:57, "Martin v. Löwis" wrote:
>> If it is possible for a hostile outsider to trigger the DOS by sending
>> mail to be processed by an application using the library, and the
>> application can't avoid the DOS without ditching / forking /
>> monkeypatching the library, then I would call the bug a "security bug",
>> period.
>
> IIUC, it would have been straightforward for the mail servers to avoid
> the DOS: simply truncate log lines to 1024 bytes, or something.


I believe that in general things that allow DOS attacks to be staged are
considered security vulnerabilities by the general security community,
albeit of lower priority than exploits.  I believe the logic is that
one would prefer the system administrator not to have to figure out what
caused the DOS and how to fix it after getting hit by it and having had a
service outage as a result.

Normally the "vendor" of a package with the DOS vulnerability would provide
a fix and push it out, and a conscientious sysadmin would install the
"security release" and thus be protected.  In this case the application
vendor can only fix the DOS bug by modifying the library, and that would
fix it only for that application.  The logical place to fix it is at
the source: the library in question.

But since a DOS is lower priority from a security standpoint, I can
see the argument for not burdening the release maintainer with anti-DOS
patches.

We probably should leave it to the release maintainer to decide based
on some assessment of the likely impact of not fixing it.  Which means
it might not get fixed, but that's the reality of limited development
and maintenance resources.

--RDM


Re: [Python-Dev] Windows 2.6.1 installer - msiexec not creating msvcr90.dll

2009-03-06 Thread charlie
Thanks for the reply.

> This question is out of scope for python-dev; use python-l...@python.org
> instead.


The last time I tried a question about the msi installer on python-list, no
one answered, so I thought it might be more appropriate for the dev list.

I actually thought this might be a bug with the new windows installer,
because this used to work with the 2.5.x installers. Before, using the exact
same msiexec command and options, a Python25 directory would be created
containing the required (if not already present) msvcr70.dll.


> My guess is that you have installed "for all users" in the command line,
> so msvcr90.dll went into system32.


I tried passing in the option ALLUSERS=0, which should be the default, and
it still did not work. Also, the dll is not anywhere on the system, and the
Python installed with msiexec will not run without it.

Anyway, I still can't figure out any way to use msiexec to install 2.6.1 and
create the needed msvcr90.dll. Maybe it's not possible?

Thanks again, I'll try this question on python-list.


Re: [Python-Dev] Windows 2.6.1 installer - msiexec not creating msvcr90.dll

2009-03-06 Thread Martin v. Löwis
charlie wrote:
> I am trying to script a Python installation on Windows, using msiexec
> from the windows cmd prompt. I do not want to register extensions.
> 
> I have tried all the combinations I can find on the following page:
> http://www.python.org/download/releases/2.5/msi/
> 
> But, no matter how I run msiexec, it seems that the msvcr90.dll is not
> created in the Python26 directory.
> 
> If I double click the msi installer and run through it manually, the
> msvcr90.dll is created.
> 
> Is there a way to run msiexec that results in msvcr90.dll (and the
> manifest file) getting created?

This question is out of scope for python-dev; use python-l...@python.org
instead.

My guess is that you have installed "for all users" in the command line,
so msvcr90.dll went into system32.

Regards,
Martin


Re: [Python-Dev] Python wins Linux New Media Award for Best Open Source Programming Language

2009-03-06 Thread Martin v. Löwis
>> The prize was Martin von Löwis of the Python Foundation on behalf of the
>> Python community itself.
> 
> This is a funny translation from German-to-English. :-)
> 
> But yeah, a good one and the prize was presented by Klaus Knopper of Knoppix.
> 
> Congratulations!

Actually, the prize went to "Python", not to me, and not to the PSF. So
congratulations to you as well!

It was a nice ceremony; among the 200 jurors, Python was elected
"Best Open Source Programming Language" by a very clear margin
over second place.

Regards,
Martin


Re: [Python-Dev] patch commit policies (was [issue4308] repr of httplib.IncompleteRead is stupid)

2009-03-06 Thread Martin v. Löwis
> If it is possible for a hostile outsider to trigger the DOS by sending
> mail to be processed by an application using the library, and the
> application can't avoid the DOS without ditching / forking /
> monkeypatching the library, then I would call the bug a "security bug",
> period.

IIUC, it would have been straightforward for the mail servers to avoid
the DOS: simply truncate log lines to 1024 bytes, or something.
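
A minimal sketch of that kind of application-side truncation, assuming the application logs through the standard `logging` module (the thread does not say how the affected mail servers log). The class name and constant are hypothetical, and the limit is applied in characters rather than bytes for simplicity.

```python
import logging

MAX_LOG_LINE = 1024  # limit mentioned above; counted in characters here


class TruncatingFormatter(logging.Formatter):
    """Formatter that caps each formatted log line at MAX_LOG_LINE."""

    def format(self, record):
        line = super().format(record)
        if len(line) > MAX_LOG_LINE:
            # Drop everything past the limit so that oversized input
            # cannot produce arbitrarily long log lines.
            line = line[:MAX_LOG_LINE]
        return line


def install_truncating_handler(logger):
    """Attach a stream handler that uses the truncating formatter."""
    handler = logging.StreamHandler()
    handler.setFormatter(TruncatingFormatter("%(levelname)s %(message)s"))
    logger.addHandler(handler)
    return handler
```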

> As for backward compatibility:  any application which is depending on
> getting arbitrarily-long lines in its logfile is already insane, and
> should be scrapped.

That's not the point. The point is that the very old releases don't
get sufficient review for bug fixes, because too few people care
about them. So a systematic, efficient review by a single person of the
entire release must be possible. This is only possible if the number
of changes is kept to an absolute minimum - just the patches targeted
at the audience of these releases.

Regards,
Martin


Re: [Python-Dev] Integrate lxml into the stdlib?

2009-03-06 Thread Guido van Rossum
On Fri, Mar 6, 2009 at 11:08 AM, Terry Reedy wrote:
>>> I do not know whether upgrades (like 3.0.0 to 3.0.1) would clobber other
>>> things added here.
>>
>> It would, but not in a harmful way.
>
> By 'clobber', I meant 'delete', and I do not see how that would not be
> harmful ;-).  I don't know whether the installer creates a new directory (and
> deletes the old), clears and reuses the old, or merely replaces individual
> files.

I see. I didn't realize you were talking about adding your own files
to these directories. I have no idea; the best way to find out is to
experiment. I could see the default policy of Windows installers go
either way.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] draft 3.1 release schedule

2009-03-06 Thread Martin v. Löwis
> I guess I'm saying that I'm surprised people aren't a bit more
> appreciative of the opportunity to review code. 

Not sure what "people" you are referring to here which aren't
appreciative of the opportunity to review code. Committers?
Non-committers?

> I don't think I would even be on this list or
> attempting to put together my first (and almost inconsequentially small)
> patch if it weren't for the fact that I see it as a huge opportunity.
> It's certainly not an attempt to 'push' anything into the language.

And this attitude I like best from contributors. Many people contribute
because they want to help, and don't expect anything in return.

However, many other people contribute because it solves a problem that
they have (scratch your own itch). They keep having the problem even
after they fixed it, in a sense, because they now have to reapply the
patch over and over again - for each Python release, and possibly for
each machine they deploy to (and for some, they can't change the
installed Python). Those people are eager to see their patch integrated,
preferably into the version that is already installed on their machines
(which requires the time machine :-)

> Would you object to my blogging on the topic in line with the comments
> that I have just made? 

Go ahead! I really can't say much about blogging - I don't write blogs,
nor read them.

Regards,
Martin


Re: [Python-Dev] draft 3.1 release schedule

2009-03-06 Thread Martin v. Löwis
> I hope that somebody will pick up the slack here, because review is
> really important to the workflow, and getting more people involved in
> reviewing at some level is more important (because it's less
> glamorous in itself) than attracting coders.

Ok, then let me phrase it this way: if somebody else makes the offer,
I'll continue to support it (so to share the load between us two).

Regards,
Martin



Re: [Python-Dev] Integrate lxml into the stdlib?

2009-03-06 Thread Terry Reedy

Guido van Rossum wrote:
> On Fri, Mar 6, 2009 at 9:54 AM, Terry Reedy wrote:
>
> No, it is expected to "just work" because sqlite3 is (presumably) very
> careful about backwards compatibility, and because the Windows DLL API
> (just like the shared library API in Linux and other systems) is
> designed to allow substitution of newer versions. The linkage
> requirements are roughly that all entry points into a DLL (or shared
> library) that are referenced by the caller (in this case the wrapper
> extension module) are supported in the new version, and have the same
> signature and semantics.
>
>> I have no idea, but my WinXP .../Python30/ install has
>>
>> DLLs/_sqlite3.pyd 52K
>
> This is the wrapper extension module.
>
>> DLLs/sqlite3.dll 557K
>
> This is sqlite3 itself. I am presuming that the phrase "replace the
> sqlite DLL" above refers to this one -- although the same argument
> actually holds for the .pyd file, which is also a DLL (despite its
> different extension).
>
>> libs/_sqlite3.lib 2K
>
> I think this is a summary of the entry points into one of the above
> DLLs for the benefit of other code wanting to link against it, but I'm
> not sure.
>
>> For whatever reason, most other things do not have all three files.
>
> You only see a .pyd and a .dll when there's a Python wrapper extension
> *and* an underlying 3rd party library.

Thanks, I understand now.

>> I do not know whether upgrades (like 3.0.0 to 3.0.1) would clobber other
>> things added here.
>
> It would, but not in a harmful way.

By 'clobber', I meant 'delete', and I do not see how that would not be
harmful ;-).  I don't know whether the installer creates a new directory
(and deletes the old), clears and reuses the old, or merely replaces
individual files.


tjr



Re: [Python-Dev] Integrate lxml into the stdlib?

2009-03-06 Thread Guido van Rossum
On Fri, Mar 6, 2009 at 9:54 AM, Terry Reedy  wrote:
> Stefan Behnel wrote:
>>
>> Martin v. Löwis wrote:

>>>> I do see the point you are making here. Even if lxml gets mature and
>>>> static, that doesn't necessarily apply to the external libraries it
>>>> uses. However, I should note that exactly the same argument also
>>>> applies to sqlite3 and gdbm, which, again, are in the stdlib today,
>>>> with sqlite3 being a fairly recent addition.
>>>
>>> Fortunately, it is possible for users to just replace the sqlite DLL in
>>> a Python installation, with no need of recompiling anything.
>>
>> Interesting. I assume you are referring to Windows here, right? Does that
>> "just work" because the DLL is in the same directory?

No, it is expected to "just work" because sqlite3 is (presumably) very
careful about backwards compatibility, and because the Windows DLL API
(just like the shared library API in Linux and other systems) is
designed to allow substitution of newer versions. The linkage
requirements are roughly that all entry points into a DLL (or shared
library) that are referenced by the caller (in this case the wrapper
extension module) are supported in the new version, and have the same
signature and semantics.
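
A quick way to see which SQLite library a given Python has actually loaded, for instance after swapping the DLL as described above, is to compare the version the sqlite3 module reports with the one the engine reports about itself. This is only a sketch: loaded_sqlite_versions is a made-up helper name, and matching version strings say nothing about whether a replacement DLL really keeps the same entry points and semantics.

```python
import sqlite3


def loaded_sqlite_versions():
    """Return (module_reported, engine_reported) SQLite library versions."""
    conn = sqlite3.connect(":memory:")
    try:
        # Ask the SQLite engine itself which version is running.
        (engine_version,) = conn.execute("SELECT sqlite_version()").fetchone()
    finally:
        conn.close()
    # sqlite3.sqlite_version is the version of the library loaded at run time,
    # not the version of the Python wrapper module.
    return sqlite3.sqlite_version, engine_version


if __name__ == "__main__":
    module_version, engine_version = loaded_sqlite_versions()
    print("sqlite3.sqlite_version:  ", module_version)
    print("SELECT sqlite_version(): ", engine_version)
```

Running it before and after replacing the library file should show the version change without any recompilation of the wrapper.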

> I have no idea, but my WinXP .../Python30/ install has
>
> DLLs/_sqlite3.pyd 52K

This is the wrapper extension module.

> DLLs/sqlite3.dll 557K

This is sqlite3 itself. I am presuming that the phrase "replace the
sqlite DLL" above refers to this one -- although the same argument
actually holds for the .pyd file, which is also a DLL (despite its
different extension).

> libs/_sqlite3.lib 2K

I think this is a summary of the entry points into one of the above
DLLs for the benefit of other code wanting to link against it, but I'm
not sure.

> For whatever reason, most other things do not have all three files.

You only see a .pyd and a .dll when there's a Python wrapper extension
*and* an underlying 3rd party library.

> I do not know whether upgrades (like 3.0.0 to 3.0.1) would clobber other
> things added here.

It would, but not in a harmful way.

>> That would be a nice feature for lxml, too. We could just make the libxml2
>> and libxslt DLLs package data under Windows in that case.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] Pickler/Unpickler API clarification

2009-03-06 Thread Michael Haggerty
Antoine Pitrou wrote:
> Le vendredi 06 mars 2009 à 13:44 +0100, Michael Haggerty a écrit :
>> Antoine Pitrou wrote:
>>> Michael Haggerty  alum.mit.edu> writes:
>>>> It is easy to optimize the pickling of instances by giving them
>>>> __getstate__() and __setstate__() methods.  But the pickler still
>>>> records the type of each object (essentially, the name of its class) in
>>>> each record.  The space for these strings constituted a large fraction
>>>> of the database size.
>>> If these strings are not interned, then perhaps they should be.
>>> There is a similar optimization proposal (w/ patch) for attribute names:
>>> http://bugs.python.org/issue5084
>> If I understand correctly, this would not help:
>>
>> - on writing, the strings are identical anyway, because they are read
>> out of the class's __name__ and __module__ fields.  Therefore the
>> Pickler's usual memoizing behavior will prevent the strings from being
>> written more than once.
> 
> Then why did you say that "the space for these strings constituted a
> large fraction of the database size", if they are already shared? Are
> your objects so tiny that even the space taken by the pointer to the
> type name grows the size of the database significantly?

Sorry for the confusion.  I thought you were suggesting the change to
help the more typical use case, when a single Pickler is used for a lot
of data.  That use case will not be helped by interning the class
__name__ and __module__ strings, for the reasons given in my previous email.

In my case, the strings are shared via the Pickler memoizing mechanism
because I pre-populate the memo (using the API that the OP proposes to
remove), so your suggestion won't help my current code, either.  It was
before I implemented the pre-populated memoizer that "the space for
these strings constituted a large fraction of the database size".  But
your suggestion wouldn't help that case, either.

Here are the main use cases:

1. Saving and loading one large record.  A class's __name__ string is
the same string object every time it is retrieved, so it only needs to
be stored once and the Pickler memo mechanism works.  Similarly for the
class's __module__ string.

2. Saving and loading lots of records sequentially.  Provided a single
Pickler is used for all records and its memo is never cleared, this
works just as well as case 1.

3. Saving and loading lots of records in random order, as for example in
the shelve module.  It is not possible to reuse a Pickler with retained
memo, because the Unpickler might not encounter objects in the right
order.  There are two subcases:

   a. Use a clean Pickler/Unpickler object for each record.  In this
case the __name__ and __module__ of a class will appear once in each
record in which the class appears.  (This is the case regardless of
whether they are interned.)  On reading, the __name__ and __module__ are
only used to look up the class, so interning them won't help.  It is
thus impossible to avoid wasting a lot of space in the database.

   b. Use a Pickler/Unpickler with a preset memo for each record (my
unorthodox technique).  In this case the class __name__ and __module__
will be memoized in the shared memo, so in other records only their ID
needs to be stored (in fact, only the ID of the class object itself).
This allows the database to be smaller, but does not have any effect on
the RAM usage of the loaded objects.
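
The difference between case 2 and case 3a can be seen with a small experiment. The sketch below uses fractions.Fraction purely as a stand-in record class and pickle protocol 2 so that the class name appears as readable text in the output; the function names are invented for the illustration and are not taken from cvs2svn.

```python
import io
import pickle
from fractions import Fraction

RECORDS = [Fraction(n, 7) for n in range(1, 6)]


def dump_with_shared_pickler(records, protocol=2):
    """Case 2: one Pickler, memo never cleared, records written in sequence."""
    buf = io.BytesIO()
    pickler = pickle.Pickler(buf, protocol)
    for record in records:
        pickler.dump(record)
    return buf.getvalue()


def dump_with_fresh_picklers(records, protocol=2):
    """Case 3a: a clean Pickler per record, so every record stands alone."""
    return [pickle.dumps(record, protocol) for record in records]


def load_sequentially(stream, count):
    """Records from a shared Pickler must be read in order by one Unpickler."""
    unpickler = pickle.Unpickler(io.BytesIO(stream))
    return [unpickler.load() for _ in range(count)]


if __name__ == "__main__":
    shared = dump_with_shared_pickler(RECORDS)
    fresh = dump_with_fresh_picklers(RECORDS)
    print("class name occurrences, shared pickler:", shared.count(b"Fraction"))
    print("class name occurrences, fresh picklers:",
          sum(record.count(b"Fraction") for record in fresh))
```

The shared stream names the class once and refers back to it through the memo afterwards, which is exactly why it cannot be read in random order; the independent records each repeat the name.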

If the OP's proposal is accepted, 3b will become impossible.  The
technique seems not to be well known, so maybe it doesn't need to be
supported.  It would mean some extra work for me on the cvs2svn project
though :-(

Michael



Re: [Python-Dev] Integrate lxml into the stdlib?

2009-03-06 Thread Terry Reedy

Stefan Behnel wrote:
> Martin v. Löwis wrote:
>>> I do see the point you are making here. Even if lxml gets mature and
>>> static, that doesn't necessarily apply to the external libraries it uses.
>>> However, I should note that exactly the same argument also applies to
>>> sqlite3 and gdbm, which, again, are in the stdlib today, with sqlite3 being
>>> a fairly recent addition.
>>
>> Fortunately, it is possible for users to just replace the sqlite DLL in
>> a Python installation, with no need of recompiling anything.
>
> Interesting. I assume you are referring to Windows here, right? Does that
> "just work" because the DLL is in the same directory?

I have no idea, but my WinXP .../Python30/ install has

DLLs/_sqlite3.pyd 52K
DLLs/sqlite3.dll 557K
libs/_sqlite3.lib 2K

For whatever reason, most other things do not have all three files.

I do not know whether upgrades (like 3.0.0 to 3.0.1) would clobber other
things added here.

> That would be a nice feature for lxml, too. We could just make the libxml2
> and libxslt DLLs package data under Windows in that case.




[Python-Dev] Summary of Python tracker Issues

2009-03-06 Thread Python tracker

ACTIVITY SUMMARY (02/27/09 - 03/06/09)
Python tracker at http://bugs.python.org/

To view or respond to any of the issues listed below, click on the issue 
number.  Do NOT respond to this message.


 2364 open (+31) / 14890 closed (+17) / 17254 total (+48)

Open issues with patches:   822

Average duration of open issues: 661 days.
Median duration of open issues: 398 days.

Open Issues Breakdown
   open  2337 (+29)
pending    27 ( +2)

Issues Created Or Reopened (51)
_______________________________

subprocess leaves open fds on construction error                 03/03/09
CLOSED http://bugs.python.org/issue5179    reopened ocean-city
       patch

mmap and exception type                                          02/27/09
       http://bugs.python.org/issue5384    created  ocean-city

mmap can crash after resize failure (windows)                    02/27/09
CLOSED http://bugs.python.org/issue5385    created  ocean-city
       patch

mmap can crash with write_byte                                   02/27/09
CLOSED http://bugs.python.org/issue5386    created  ocean-city

mmap.move crashes by integer overflow                            02/27/09
       http://bugs.python.org/issue5387    created  ocean-city

Green-box doc glitch: winhelp version only                       02/27/09
       http://bugs.python.org/issue5388    created  tjreedy

Uninitialized variable may be used in PyUnicode_DecodeUTF7Statef 02/27/09
CLOSED http://bugs.python.org/issue5389    created  gvanrossum

Item 'Python x.x.x' in Add/Remove Programs list still lacks an i 03/05/09
       http://bugs.python.org/issue5390    reopened loewis

mmap: read_byte/write_byte and object type                       02/28/09
       http://bugs.python.org/issue5391    created  ocean-city
       patch

stack overflow after hitting recursion limit twice               02/28/09
       http://bugs.python.org/issue5392    created  gagenellina
       patch

cmath.cos and cmath.cosh have "nResult" typo in help             02/28/09
CLOSED http://bugs.python.org/issue5393    created  mnewman

Distutils in trunk does not work with old Python (2.3 - 2.5)     02/28/09
       http://bugs.python.org/issue5394    created  akitada
       patch

array.fromfile not checking I/O errors                           02/28/09
       http://bugs.python.org/issue5395    created  aguiar

os.read not handling O_DIRECT flag                               02/28/09
       http://bugs.python.org/issue5396    created  aguiar

PEP 372:  OrderedDict                                            03/01/09
CLOSED http://bugs.python.org/issue5397    created  rhettinger
       patch

strftime("%B") returns a String unusable with unicode            03/01/09
CLOSED http://bugs.python.org/issue5398    created  t.steinruecken

wer                                                              03/01/09
CLOSED http://bugs.python.org/issue5399    created  ajaksu2

patches for multiprocessing module on NetBSD                     03/01/09
       http://bugs.python.org/issue5400    created  aniou
       patch

mimetypes.MAGIC_FUNCTION performance problems                    03/01/09
CLOSED http://bugs.python.org/issue5401    created  aronacher
       patch, patch, easy, needs

[Python-Dev] Windows 2.6.1 installer - msiexec not creating msvcr90.dll

2009-03-06 Thread charlie
I am trying to script a Python installation on Windows, using msiexec from
the windows cmd prompt. I do not want to register extensions.

I have tried all the combinations I can find on the following page:
http://www.python.org/download/releases/2.5/msi/

But, no matter how I run msiexec, it seems that msvcr90.dll is not
created in the Python26 directory.

If I double click the msi installer and run through it manually, the
msvcr90.dll is created.

Is there a way to run msiexec that results in msvcr90.dll (and the manifest
file) getting created?

Thanks in advance,
-cjl


Re: [Python-Dev] Pickler/Unpickler API clarification

2009-03-06 Thread Michael Haggerty
Antoine Pitrou wrote:
> Michael Haggerty  alum.mit.edu> writes:
>> It is easy to optimize the pickling of instances by giving them
>> __getstate__() and __setstate__() methods.  But the pickler still
>> records the type of each object (essentially, the name of its class) in
>> each record.  The space for these strings constituted a large fraction
>> of the database size.
> 
> If these strings are not interned, then perhaps they should be.
> There is a similar optimization proposal (w/ patch) for attribute names:
> http://bugs.python.org/issue5084

If I understand correctly, this would not help:

- on writing, the strings are identical anyway, because they are read
out of the class's __name__ and __module__ fields.  Therefore the
Pickler's usual memoizing behavior will prevent the strings from being
written more than once.

- on reading, the strings are only used to look up the class.  Therefore
they are garbage collected almost immediately.

This is a different situation than that of attribute names, which are
stored persistently as the keys in the instance's __dict__.

Michael


Re: [Python-Dev] draft 3.1 release schedule

2009-03-06 Thread Stephen J. Turnbull
Tennessee Leeuwenburg writes:

 > > I hope that somebody will pick up the slack here, because review is
 > > really important to the workflow, and getting more people involved in
 > > reviewing at some level is more important (because it's less
 > > glamorous in itself) than attracting coders.

 > I would have said that participating in a project at that level
 > would basically be the best opportunity for ongoing learning and
 > development available.

It is, and IMO Python is an excellent example of that.  Please don't
get me wrong -- the core developers do a lot of reviewing.  It's just
not as visible, or as clearly available to non-core participants, as
Martin's 1-for-5 offer was.

Many, perhaps most, contributions are one-offs by people to whom
Python is a tool, not their community.  They have little time, and as
far as they know, less expertise to participate in the review process.
Martin's offer was an open invitation, in terms that any contributor
can appreciate, even if they don't take advantage of it right away.

I admire that style.

 > Would you object to my blogging on the topic in line with the
 > comments that I have just made?

It's not my place to say yes or no, to you or on behalf of the
community.

A suggestion, though.  View the contribution visualization video based
on the commit log (the URL was posted here a while back, but I don't
seem to have it offhand), which shows what a vibrant community this is
in a very graphic way.



Re: [Python-Dev] [Tracker-discuss] Adding a "Rietveld this" button?

2009-03-06 Thread Daniel (ajax) Diniz
CC'ing python-dev, as more RFEs might be uncovered :)

Daniel (ajax) Diniz wrote:
> Martin v. Löwis wrote:
>> I think a patch (or full file) would be good enough. We could put it
>> into the tracker itself, and publish it prominently where people
>> upload files.
>
> I'll post it as a patch and a full file at issue 247 when I get to it.

Posted as a full file to
http://psf.upfronthosting.co.za/roundup/meta/issue247 . Guido suggests
the wrapper way, so I won't bother with creating patches now.

> [snip code]
>> Nice! I didn't think of something that complicated (or, rather,
>> complicated in a different way):
>>
>> upload.py --roundup 5428
>
> Just to be clear, this will work as if the user passed the following options:
>
> upload.py --message "[issue5428] Pyshell history management error"
> --cc rep...@bugs.python.org
>
> Right? :)

If that is right, it works :)

>> That could either fill in a description given by -d, or fetch
>> the description from roundup.
>
> Fetching a description is as easy as fetching a title[1], but I can't
> think of a fixed place to look for one (last message? last patch
> description?). Maybe we can add another option that tells the script
> where to fetch description/content from? Something like "
> [-F|--fetch_description] msg83029" ?

Works too, for files or messages :)

Examples:
http://bugs.python.org/issue400608
http://bugs.python.org/issue2771
http://codereview.appspot.com/25073
http://codereview.appspot.com/24075

Thanks again,
Daniel


Re: [Python-Dev] Pickler/Unpickler API clarification

2009-03-06 Thread Antoine Pitrou
Le vendredi 06 mars 2009 à 13:44 +0100, Michael Haggerty a écrit :
> Antoine Pitrou wrote:
> > Michael Haggerty  alum.mit.edu> writes:
> >> It is easy to optimize the pickling of instances by giving them
> >> __getstate__() and __setstate__() methods.  But the pickler still
> >> records the type of each object (essentially, the name of its class) in
> >> each record.  The space for these strings constituted a large fraction
> >> of the database size.
> > 
> > If these strings are not interned, then perhaps they should be.
> > There is a similar optimization proposal (w/ patch) for attribute names:
> > http://bugs.python.org/issue5084
> 
> If I understand correctly, this would not help:
> 
> - on writing, the strings are identical anyway, because they are read
> out of the class's __name__ and __module__ fields.  Therefore the
> Pickler's usual memoizing behavior will prevent the strings from being
> written more than once.

Then why did you say that "the space for these strings constituted a
large fraction of the database size", if they are already shared? Are
your objects so tiny that even the space taken by the pointer to the
type name grows the size of the database significantly?




Re: [Python-Dev] Pickler/Unpickler API clarification

2009-03-06 Thread Daniel Stutzbach
On Fri, Mar 6, 2009 at 5:45 AM, Antoine Pitrou  wrote:

> If these strings are not interned, then perhaps they should be.
> There is a similar optimization proposal (w/ patch) for attribute names:
> http://bugs.python.org/issue5084
>

If I understand correctly, that would help with unpickling, but wouldn't
solve Michael's problem as, without memo, each pickle would still need to
store a copy.

--
Daniel Stutzbach, Ph.D.
President, Stutzbach Enterprises, LLC 


Re: [Python-Dev] Python wins Linux New Media Award for Best Open Source Programming Language

2009-03-06 Thread Senthil Kumaran
On Fri, Mar 6, 2009 at 6:34 PM, Michael Foord  wrote:

> The prize was Martin von Löwis of the Python Foundation on behalf of the
> Python community itself.

This is a funny translation from German-to-English. :-)

But yeah, a good one and the prize was presented by Klaus Knopper of Knoppix.

Congratulations!


-- 
Senthil


[Python-Dev] Python wins Linux New Media Award for Best Open Source Programming Language

2009-03-06 Thread Michael Foord

Hello all,

Not sure if this is the same as the LinuxQuestions award, but it looks 
different:

(German)
http://www.linux-magazin.de/news/cebit_2009_openstreetmap_erntet_zwei_linux_new_media_awards

I particularly like this snippet from the google translation:

The prize was Martin von Löwis of the Python Foundation on behalf of the 
Python community itself.


http://translate.google.co.uk/translate?hl=en&sl=de&u=http://www.linux-magazin.de/news/cebit_2009_openstreetmap_erntet_zwei_linux_new_media_awards&ei=VByxSfSnM9nHjAfb6ojPBQ&sa=X&oi=translate&resnum=1&ct=result&prev=/search%3Fq%3Dcebit%2B2009%2B%2BLinux%2BNew%2BMedia%2BAwards%2Bpython%26hl%3Den%26client%3Dfirefox-a%26rls%3Dorg.mozilla:en-GB:official%26hs%3DP9E

All the best,

Michael Foord
--
http://www.ironpythoninaction.com/


Re: [Python-Dev] Integrate lxml into the stdlib?

2009-03-06 Thread Stefan Behnel
Martin v. Löwis wrote:
>> I do see the point you are making here. Even if lxml gets mature and
>> static, that doesn't necessarily apply to the external libraries it uses.
>> However, I should note that exactly the same argument also applies to
>> sqlite3 and gdbm, which, again, are in the stdlib today, with sqlite3 being
>> a fairly recent addition.
> 
> Fortunately, it is possible for users to just replace the sqlite DLL in
> a Python installation, with no need of recompiling anything.

Interesting. I assume you are referring to Windows here, right? Does that
"just work" because the DLL is in the same directory?

That would be a nice feature for lxml, too. We could just make the libxml2
and libxslt DLLs package data under Windows in that case.

Stefan



Re: [Python-Dev] Pickler/Unpickler API clarification

2009-03-06 Thread Antoine Pitrou
Michael Haggerty  alum.mit.edu> writes:
> 
> It is easy to optimize the pickling of instances by giving them
> __getstate__() and __setstate__() methods.  But the pickler still
> records the type of each object (essentially, the name of its class) in
> each record.  The space for these strings constituted a large fraction
> of the database size.

If these strings are not interned, then perhaps they should be.
There is a similar optimization proposal (w/ patch) for attribute names:
http://bugs.python.org/issue5084

Regards

Antoine.




Re: [Python-Dev] Pickler/Unpickler API clarification

2009-03-06 Thread Michael Haggerty
Collin Winter wrote:
> [...] I've found a few examples of code using the memo attribute ([1], [2],
> [3]) [...]

As author of [2] (current version here [4]) I can tell you my reason.
cvs2svn has to store a vast number of small objects in a database, then
read them in random order.  I spent a lot of time optimizing this part
of the code because it is crucial for the overall performance when
converting large CVS repositories.  The objects are not all of the same
class and sometimes contain other objects, so it is convenient to use
pickling instead of, say, marshaling.

It is easy to optimize the pickling of instances by giving them
__getstate__() and __setstate__() methods.  But the pickler still
records the type of each object (essentially, the name of its class) in
each record.  The space for these strings constituted a large fraction
of the database size.

So I "prime" the picklers/unpicklers by pickling then unpickling a
"primer" that contains the classes that I know will appear, and storing
the resulting memos once in the database.  Then for each record I create
a new pickler/unpickler and initialize its memo to the "primer"'s memo
before using it to read the actual object.  This removes a lot of
redundancy across database records.
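
As a rough illustration of the priming idea, here is a minimal sketch for a current CPython. It is not the cvs2svn code: PrimedSerializer is an invented name, fractions.Fraction stands in for the record classes, and the primer pickler and unpickler are simply kept alive in memory rather than having their memos stored in a database. It also assumes protocol 2 (so memo slots are numbered explicitly in the stream) and relies on the memo attribute being assignable, with the C-accelerated pickler and unpickler copying the memo on assignment.

```python
import io
import pickle
from fractions import Fraction


class PrimedSerializer:
    """Pickle records independently while sharing a fixed 'primer' memo."""

    def __init__(self, classes, protocol=2):
        self.protocol = protocol
        # Pickle the primer once so the pickler's memo knows every class...
        buf = io.BytesIO()
        self._primer_pickler = pickle.Pickler(buf, protocol)
        self._primer_pickler.dump(list(classes))
        # ...and unpickle it once so the unpickler's memo has matching slots.
        self._primer_unpickler = pickle.Unpickler(io.BytesIO(buf.getvalue()))
        self._primer_unpickler.load()

    def dumps(self, obj):
        buf = io.BytesIO()
        pickler = pickle.Pickler(buf, self.protocol)
        # Assumption: assigning the memo proxy copies the primer's memo.
        pickler.memo = self._primer_pickler.memo
        pickler.dump(obj)
        return buf.getvalue()

    def loads(self, data):
        unpickler = pickle.Unpickler(io.BytesIO(data))
        unpickler.memo = self._primer_unpickler.memo
        return unpickler.load()


if __name__ == "__main__":
    serializer = PrimedSerializer([Fraction])
    record = Fraction(3, 4)
    print("plain :", len(pickle.dumps(record, 2)), "bytes")
    print("primed:", len(serializer.dumps(record)), "bytes")
```

A primed record refers to the class by its memo slot instead of spelling out the module and class name, so it can only be read back by an unpickler primed in the same way.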

I only prime my picklers/unpicklers with the classes.  But note that the
same technique could be used for any repeated subcomponents.  This would
have the added advantage that all of the unpickled instances would share
copies of the objects that appear in the primer, which could be a
semantic advantage and a significant savings in RAM in addition to the
space and processing time advantages described above.  It might even be
a useful feature to the "shelve" module.

> So my questions are these:
> 1) Should Pickler/Unpickler objects automatically clear their memos
> when dumping/loading?
> 2) Is memo an intentionally exposed, supported part of the
> Pickler/Unpickler API, despite the lack of documentation and tests?

For my application, either of the following alternatives would also suffice:

- The ability to pickle picklers and unpicklers themselves (including
their memos).  This is, of course, awkward because they are hard-wired
to files.

- Picklers and unpicklers could have get_memo() and set_memo() methods
that return an opaque (but pickleable) memo object.  In other words, I
don't need to muck around inside the memo object; I just need to be able
to save and restore it.

Please note that the memo for a pickler is not equal to the memo of the
corresponding unpickler.

A similar effect could *almost* be obtained without accessing the memos
by saving the pickled primer itself in the database.  The unpickler
would be primed by using it to load the primer before loading the record
of interest.  But AFAIK there is no way to prime new picklers, because
there is no guarantee that pickling the same primer twice will result in
the same id->object mapping in the pickler's memo.

Michael

> [2] - 
> http://google.com/codesearch/p?hl=en#M-DDI-lCOgE/lib/python2.4/site-packages/cvs2svn_lib/primed_pickle.py&q=lang:py%20%5C.memo
[4]
http://cvs2svn.tigris.org/source/browse/cvs2svn/trunk/cvs2svn_lib/serializer.py?view=markup


Re: [Python-Dev] asyncore fixes in Python 2.6 broke Zope's version of medusa

2009-03-06 Thread glyph


On 10:01 am, greg.ew...@canterbury.ac.nz wrote:
> Hrvoje Niksic wrote:
>>    Under Linux, select() may report a socket file descriptor
>>    as "ready for reading",  while  nevertheless
>>    a subsequent read blocks.
>
> Blarg. Linux is broken, then. This should not happen.

You know what else is broken?  MacOS, FreeBSD, Solaris, and every
version of Windows.  I haven't tried running Twisted on AmigaOS but I
bet it has some problems too.


On occasion Linux has been so badly broken that Twisted has motivated a 
fix.  For example, 
http://lkml.indiana.edu/hypermail/linux/kernel/0502.3/1160.html


But even if we ignore difficulties at the OS level (which should, after
all, be worked around rather than catered to in API design) there are
other good reasons why the general async API should be fairly distant
from both the select/poll wrapper and the questions of blocking vs.
non-blocking sockets.  For another example, consider the issue of
non-blocking SSL sockets.  Sometimes, in order to do a "read", you
actually need to do a write and then another read.  Which means that
application code, if it wants to be compatible with SSL, needs to deal
with any failure that may come from a write as coming from a read,
unless you abstract all this away somehow.



Re: [Python-Dev] draft 3.1 release schedule

2009-03-06 Thread Tennessee Leeuwenburg
> Well, that happens.  An alternative to withdrawing entirely, would be
> increasing the price (eg, to ten patches as you originally suggested).
> Or specifying windows in your calendar when the offer is open.  Eg,
> avoid doubling up on release times when you need to make time to build
> installers etc. ... but of course just before release is when people
> will get antsy about their "lost" patches.
>
> I hope that somebody will pick up the slack here, because review is
> really important to the workflow, and getting more people involved in
> reviewing at some level is more important (because it's less
> glamorous in itself) than attracting coders.


It's funny ... I would have thought that one of the most attractive aspects
of offering patches for inclusion was not just getting feature X into the
language, but the opportunity to have your code reviewed by the best of the
best, or similarly to review the code of others and really think about its
strengths and weaknesses. I would have said that participating in a project
at that level would basically be the best opportunity for ongoing learning
and development available.
I guess I'm saying that I'm surprised people aren't a bit more appreciative
of the opportunity to review code. I mean, I wouldn't think that Python was
"just work" for anyone who has the passion to commit back to the core
project. I don't think I would even be on this list or attempting to put
together my first (and almost inconsequentially small) patch if it weren't
for the fact that I see it as a huge opportunity. It's certainly not an
attempt to 'push' anything into the language.

Obviously that's what you found though -- people who weren't really
understanding of how the language gets put together. I can imagine having
held that view in the past myself, also. I can to some extent understand the
perspective of feeling you have some fantastic idea which you'd love to get
implemented; yet the people who can make it happen are too concerned with
their own issues to take the time to roll in your changes.

Would you object to my blogging on the topic in line with the comments that
I have just made? I almost feel silly making that kind of suggestion after
having only been here a short time -- I feel a bit boorish! -- but having
run The Python Papers and also no longer being a 'green' developer at work,
I feel as though I do have something to contribute on the topic even if it
is somewhat immaturely conceived.

Regards,
-Tennessee


Re: [Python-Dev] asyncore fixes in Python 2.6 broke Zope's version of medusa

2009-03-06 Thread Hrvoje Niksic

Greg Ewing wrote:
>> Even if you don't agree that using O_NONBLOCK with select/poll is the
>> best approach to non-blocking, I think there is enough existing practice
>> of doing this to warrant separate consideration of non-blocking sockets
>> (in the OS sense) and select/poll.
>
> I'm not saying there isn't merit in having support for
> non-blocking file descriptors, only that it's not in
> any sense a prerequisite or first step towards a
> select/poll wrapper. They're orthogonal issues, even
> if you might sometimes want to use them together.

In that case we are in agreement.  Looking back, I was somewhat confused
by this paragraph:

> So I don't think it makes sense to talk about having a
> non-blocking API as a separate thing from a select/poll
> wrapper. The select/poll wrapper *is* the non-blocking
> API.

If they're orthogonal, then it does make sense to talk about having a
separate non-blocking socket API and poll API, even if the latter can be
used to implement non-blocking *functionality* (hypothetical Linux
issues aside).
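
To make the combination concrete, here is a small sketch (Python 3 spelling) of the existing practice mentioned above: the socket is put in non-blocking mode and select() is used only to wait. recv_when_ready is a hypothetical helper name, and socket.socketpair() is used just to keep the example self-contained.

```python
import select
import socket


def recv_when_ready(sock, timeout=1.0, bufsize=4096):
    """Wait up to `timeout` seconds for data without ever blocking in recv()."""
    readable, _, _ = select.select([sock], [], [], timeout)
    if not readable:
        return None  # nothing became readable in time
    try:
        return sock.recv(bufsize)
    except BlockingIOError:
        # select() reported the socket as ready, but no data was available
        # after all; with a non-blocking socket this is an error we can
        # handle instead of a read that hangs.
        return None


if __name__ == "__main__":
    left, right = socket.socketpair()
    left.setblocking(False)
    try:
        print(recv_when_ready(left, timeout=0.1))  # no data sent yet
        right.sendall(b"ping")
        print(recv_when_ready(left))
    finally:
        left.close()
        right.close()
```

The two pieces stay independent: select() decides when to try, and the non-blocking flag guarantees that a wrong "ready" report costs only a retry.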



Re: [Python-Dev] draft 3.1 release schedule

2009-03-06 Thread Stephen J. Turnbull
"Martin v. Löwis" writes:

 > From time to time, people ask what they can do to push a change into Python
 > that they really think is important. I once offered that people who
 > want a patch in Python really badly should review 10 other patches in
 > return, up to the point where they make a recommendation about the fate
 > of the patches. I was then talked into accepting just 5 such patches.
 > I have since withdrawn this offer, because

I'm really sad to hear that.  I considered that one of the really nice
features of Python as a project (even though it was of course your
individual initiative).

 > a) I was the only one making that offer in public, and

IIRC others did, but you were the only one to do so repeatedly and as
a timely response to reports that the patch queue was going untended.

 > b) I was sometimes not really able to respond in a timely manner
 >when the offer was invoked, because of overload.

Well, that happens.  An alternative to withdrawing entirely would be
increasing the price (e.g., to ten patches as you originally suggested),
or specifying windows in your calendar when the offer is open.  E.g.,
avoid doubling up on release times when you need to make time to build
installers etc. ... but of course just before release is when people
will get antsy about their "lost" patches.

I hope that somebody will pick up the slack here, because review is
really important to the workflow, and getting more people involved in
reviewing at some level is more important (because it's less
glamorous in itself) than attracting coders.


Re: [Python-Dev] asyncore fixes in Python 2.6 broke Zope's version of medusa

2009-03-06 Thread Greg Ewing

Hrvoje Niksic wrote:

>    Under Linux, select() may report a socket file descriptor
>    as "ready for reading",  while  nevertheless
>    a subsequent read blocks.

Blarg. Linux is broken, then. This should not happen.

>    This could for example
>    happen when data has arrived but upon
>    examination has wrong checksum and is discarded.

That's no excuse -- the kernel should check all its
checksums *before* waking up selecting processes!

> Even if you don't agree that using O_NONBLOCK with select/poll is the
> best approach to non-blocking, I think there is enough existing practice
> of doing this to warrant separate consideration of non-blocking sockets
> (in the OS sense) and select/poll.


I'm not saying there isn't merit in having support for
non-blocking file descriptors, only that it's not in
any sense a prerequisite or first step towards a
select/poll wrapper. They're orthogonal issues, even
if you might sometimes want to use them together.
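
To illustrate the select-only style, here is a minimal sketch, not
anyone's actual implementation. It assumes a local socket pair as a
stand-in for a real connection; the names a, b, ready_before and
ready_after are made up for the example. Both sockets stay in blocking
mode, and the non-blockingness comes only from asking select() first.

```python
import select
import socket

# Hypothetical demo setup: a connected pair of local sockets.
# Neither end is put into non-blocking mode.
a, b = socket.socketpair()

# Nothing has been written yet, so select() times out and reports no
# readable sockets. We never call recv(), so we never block in it.
ready_before, _, _ = select.select([a], [], [], 0.1)

b.sendall(b"ping")

# Now select() reports `a` as readable, and only then do we read.
ready_after, _, _ = select.select([a], [], [], 1.0)
data = a.recv(4) if ready_after else b""

a.close()
b.close()
```

This relies on select() never reporting readiness spuriously, which is
exactly the assumption the quoted man page text calls into question.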

--
Greg


Re: [Python-Dev] asyncore fixes in Python 2.6 broke Zope's version of medusa

2009-03-06 Thread Hrvoje Niksic

Greg Ewing wrote:
> Antoine Pitrou wrote:
>
>> For starters, since py3k is supposed to support non-blocking IO, why
>> not write a portable API to make a raw file or socket IO object
>> non-blocking?
>
> I think we need to be clearer what we mean when we talk
> about non-blocking in this context. Normally when you're
> using select/poll you *don't* make the underlying file
> descriptor non-blocking in the OS sense. The non-blockingness
> comes from the fact that you're using select/poll to make
> sure the fd is ready before you access it.
>
> So I don't think it makes sense to talk about having a
> non-blocking API as a separate thing from a select/poll
> wrapper. The select/poll wrapper *is* the non-blocking
> API.


This is not necessarily the case.  In fact, modern sources often
recommend using poll along with non-blocking sockets for (among
other things) performance reasons.  For example, when a non-blocking
socket becomes readable, you don't read from it only once and go back to
the event loop; you read from it in a loop until you get EAGAIN.  This
allows for processing of fast-incoming data with fewer system calls.
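
A rough sketch of that read-until-EAGAIN pattern follows. It is only an
illustration under assumptions: the drain() helper, the buffer sizes and
the local socket pair are invented for the example, and select.select()
stands in for whatever event loop is actually in use.

```python
import select
import socket


def drain(sock, bufsize=1024):
    """Read from a non-blocking socket until the kernel reports EAGAIN."""
    chunks = []
    while True:
        try:
            data = sock.recv(bufsize)
        except BlockingIOError:
            # EAGAIN/EWOULDBLOCK: nothing more to read right now.
            break
        if not data:
            # Empty read: the peer closed the connection.
            break
        chunks.append(data)
    return b"".join(chunks)


# Hypothetical demo setup: a connected pair of local sockets.
reader, writer = socket.socketpair()
reader.setblocking(False)
writer.sendall(b"x" * 4096)

# One wakeup from the event loop...
readable, _, _ = select.select([reader], [], [], 1.0)
# ...followed by several recv() calls until the socket is empty.
received = drain(reader) if readable else b""

# After draining, the socket is no longer reported as readable.
still_readable, _, _ = select.select([reader], [], [], 0)

reader.close()
writer.close()
```

With a blocking socket the final recv() in drain() would hang instead of
raising, so the loop only makes sense together with O_NONBLOCK.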


Linux's select(2) man page includes similar advice with a different
motivation:


   Under Linux, select() may report a socket file descriptor
   as "ready for reading",  while  nevertheless
   a subsequent read blocks.  This could for example
   happen when data has arrived but upon
   examination has wrong checksum and is discarded.  There may
   be other circumstances  in  which  a
   file  descriptor  is  spuriously  reported  as ready.
   Thus it may be safer to use O_NONBLOCK on
   sockets that should not block.
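
For reference, setting O_NONBLOCK from Python can be sketched as below.
This is a POSIX-only illustration, not a proposal for the portable API
discussed in this thread: set_nonblocking() is a made-up helper, and a
pipe stands in for the socket so the example needs no network.

```python
import fcntl
import os


def set_nonblocking(fd):
    """Set O_NONBLOCK on a file descriptor, keeping its other flags."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)


# Hypothetical demo setup: an empty pipe.
read_fd, write_fd = os.pipe()
set_nonblocking(read_fd)

try:
    # The pipe is empty. A blocking descriptor would hang here.
    os.read(read_fd, 1024)
    would_block = False
except BlockingIOError:
    # EAGAIN is reported as BlockingIOError in Python 3.
    would_block = True

flag_is_set = bool(fcntl.fcntl(read_fd, fcntl.F_GETFL) & os.O_NONBLOCK)

os.close(read_fd)
os.close(write_fd)
```

For socket objects, sock.setblocking(False) has the same effect without
touching fcntl directly.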

Even if you don't agree that using O_NONBLOCK with select/poll is the 
best approach to non-blocking, I think there is enough existing practice 
of doing this to warrant separate consideration of non-blocking sockets 
(in the OS sense) and select/poll.
