from:"M.-A. Lemburg"


On 2008-05-12 04:34, Brett Cannon wrote:

For the sake of argument, let's consider the Queue module. It is now
named queue. For 2.6 I plan on having both Queue and queue listed in
the index, with Queue deprecated with instructions to use the new
name.

But what to do about all the references? Should we leave them pointing
at Queue to lessen confusion for people who read about some module on
some other site that isn't using the new name, or update everything in
2.6 to use the new name?


How hard would it be to add redirects from the old pages to the
new ones ?

mod_rewrite does wonders - well, provided you find the right patterns...
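
The redirect idea amounts to a pattern-based mapping from old page URLs
to new ones. A minimal sketch of that mapping, written in Python rather
than mod_rewrite syntax. The URL layouts (/lib/module-NAME.html for old
pages, /library/name.html for new ones) and the function name are
assumptions made for illustration, not the actual server configuration:

```python
import re

# Old module name -> new module name (the two renames mentioned in this thread).
RENAMES = {"Queue": "queue", "ConfigParser": "configparser"}

# Assumed layout of an old documentation page URL.
OLD_PAGE = re.compile(r"^/lib/module-(?P<name>\w+)\.html$")


def redirect_target(path):
    """Return the new page path for an old one, or None if no rule matches."""
    match = OLD_PAGE.match(path)
    if match is None:
        return None
    name = match.group("name")
    # Renamed modules map to their new name; all others keep their name.
    return "/library/%s.html" % RENAMES.get(name, name)
```

Keeping the renames in one table means a single pattern covers every
module, which is the "right pattern" problem in miniature.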

--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, May 16 2008)
 Python/Zope Consulting and Support ...http://www.egenix.com/
 mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/
 mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/


 Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! 


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
   Registered at Amtsgericht Duesseldorf: HRB 46611
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Distutils configparser rename


On 2008-05-15 22:33, A.M. Kuchling wrote:

Python 2.6 renames the ConfigParser module to be configparser.

Distutils imports ConfigParser in various places.  I just made a
commit updating the import in one place, and then noticed that part
of commit r63248, which made the same change, was reverted in order to
preserve backward-compatibility.  Instead, the default path will
include lib-old again to keep the old module name available.

I suggest dropping that goal, though.  We've preserved compatibility
but I'm not aware that anyone uses the Python 2.x Distutils with
earlier versions of Python.  In particular:

* There's no standalone distutils package on PyPI, nor can I find
  such a package with a general web search.  Am I missing it?

* I do not see users advising other users to use a later version of 
  Distutils to fix their problems.


Is anyone actually benefiting from the effort of maintaining backward
compatibility?


Yes: all the folks who want to create distutils packages for more than
just the current Python version.

I've argued for this a couple of times in the past. Some background:

In order to build a Python package for a previous Python version,
you have to run distutils using that older Python version.

Now, as distutils evolves, new features are added, bugs are fixed,
etc. so as packager you always want to use the latest distutils
version available - even with older Python releases. In some cases,
e.g. PyPI registration, this may even be necessary, since the
new versions of those commands need to be kept in sync with the
PyPI repository.

Another aspect is keeping package setup.py files working.

If you need to support multiple Python versions, then your
setup.py will have to work with multiple different versions
of distutils.

Since performance doesn't really matter for distutils, it is quite
possible, and easy, to keep compatibility with a few releases back.

This has worked great in the past and I don't see why we should
break this, as recent distutils checkins have done.
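
One way a setup.py (or distutils itself) can cope with the rename while
still running on older Pythons is a guarded import. A minimal sketch,
assuming only the two module names mentioned in this thread; the section
and option names are made up for the example:

```python
# Prefer the new lowercase name, fall back to the pre-rename one.
try:
    import configparser
except ImportError:  # older Python that only ships the old module name
    import ConfigParser as configparser

parser = configparser.RawConfigParser()
parser.add_section("metadata")           # hypothetical section
parser.set("metadata", "name", "example")  # hypothetical option
```

The rest of the code then only ever refers to configparser, whichever
module was actually imported.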

Note that Python doesn't exactly make it easy to ship Python
packages. You have several different dimensions to take into
consideration:

 * Python version
 * UCS2/UCS4
 * Platform and processor type
 * 32/64-bit

So there already is a lot of porting effort needed to support
a reasonable number of targets.
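
The four dimensions above can all be read off the running interpreter.
A small sketch, assuming Python 2.6 or later for sys.maxsize; the
function name and the shape of the returned tuple are illustrative only:

```python
import platform
import sys


def build_target():
    """Describe the running interpreter along the four dimensions above."""
    python_version = "%d.%d" % sys.version_info[:2]
    # Narrow (UCS2) builds report 0xFFFF; wide (UCS4) builds report 0x10FFFF.
    unicode_width = "ucs2" if sys.maxunicode == 0xFFFF else "ucs4"
    plat = "%s-%s" % (platform.system(), platform.machine())
    # sys.maxsize exceeds 2**32 only on 64-bit builds.
    bits = 64 if sys.maxsize > 2 ** 32 else 32
    return (python_version, unicode_width, plat, bits)


print(build_target())
```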

I don't think it takes a lot of effort to keep distutils
running with Python 2.3 and 2.4.

In the past I've usually rewritten parts of distutils that
were modified in incompatible ways. I haven't been able to
do that for the recent checkins that broke distutils even on
Python 2.4.

Thanks,
--
Marc-Andre Lemburg
eGenix.com


Re: [Python-Dev] Symbolic errno values in error messages


On 2008-05-16 16:15, Nick Coghlan wrote:

Alexander Belopolsky wrote:

Yannick Gingras ygingras at ygingras.net writes:

2) Where can I find the symbolic name in C?


Use standard C library char* strerror(int errnum) function.   You can see
an example usage in Modules/posixmodule.c (posix_strerror).


I don't believe that would provide adequate Windows support.


Well, there's still the idea of a winerror module:

http://bugs.python.org/issue1505257

Perhaps someone can pick it up and turn it into a (generated) C
module ?!

--
Marc-Andre Lemburg
eGenix.com


Re: [Python-Dev] Symbolic errno values in error messages


On 2008-05-16 17:02, Alexander Belopolsky wrote:

On Fri, May 16, 2008 at 10:52 AM, Yannick Gingras [EMAIL PROTECTED] wrote:


print e

[Errno 21] Is a directory

So now I am not sure what OP is proposing.  Do you want to replace 21
with EISDIR in the above?

Yes, that's what I had in mind.



In this case, I have a more drastic proposal.  Let's change the
EnvironmentError errno attribute (myerrno in C) to a string.


-1

You never want to change an integer field to a string.


'EXYZ' strings can be interned, which will make them more efficient than
integers for lookups and comparisons (to literals).  A half-way and
backward compatible solution would be to stick 'EXYZ' code at the end
of the args tuple and add an errnosym attribute.


Actually, you don't have to put it into any tuple. Just add it
to the error object as an attribute.
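
A minimal Python-level sketch of that idea: the integer errno stays as it
is and the symbolic name rides along as an extra attribute. The attribute
name errnosym comes from the quoted proposal; the helper function is
hypothetical, and the real change would live in the C code that builds
the exception:

```python
import errno
import os


def add_errno_symbol(exc):
    """Attach the symbolic errno name to exc as 'errnosym'."""
    # errno.errorcode maps numeric codes to names such as 'EISDIR'.
    exc.errnosym = errno.errorcode.get(exc.errno)
    return exc


err = add_errno_symbol(OSError(errno.EISDIR, os.strerror(errno.EISDIR)))
print("[Errno %s, %s] %s" % (err.errno, err.errnosym, err.strerror))
```

Existing code that compares err.errno against integers keeps working,
which is the point of not changing the field's type.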

--
Marc-Andre Lemburg
eGenix.com


Re: [Python-Dev] Addition of pyprocessing module to standard lib.

2008-05-14 Thread M.-A. Lemburg


On 2008-05-14 14:15, Jesse Noller wrote:

On Wed, May 14, 2008 at 5:45 AM, Christian Heimes [EMAIL PROTECTED] wrote:

Martin v. Löwis schrieb:


I'm worried whether it's stable, what user base it has, whether users
(other than the authors) are lobbying for inclusion. Statistically,
it seems to be not ready yet: it is not even a year old, and has not
reached version 1.0 yet.

 I'm on Martin's side here. Although I like to see some sort of multi
 processing mechanism in Python 'cause I need it for lots of projects I'm
 against the inclusion of pyprocessing in 2.6 and 3.0. The project isn't
 old and mature enough and it has some competitors like pp (parallel
 processing).

 On the one hand the inclusion of a package gives it an unfair advantage
 over similar packages. On the other hand it slows down future
 development because a new feature release must be synced with Python
 releases about every 1.5 years.

 -0.5 from me

 Christian



I said this in reply to Martin - but the competitors (in my mind) are
not as compelling due to the alternative paradigm for application
construction they propose. The processing module is an easy win for
us if included.

Personally - I don't see how inclusion in the stdlib would slow down
development - yes, you have to stick with the same release cycle as
python-core, but if the module is feature complete and provides a
stable API as it stands I don't see following python-core timelines as
overly onerous.

The module itself doesn't change that frequently - the last release in
April was a bugfix release and API consistency change (the API would
have to be locked for inclusion obviously - targeting a 2.7/3.1
release may be advantageous to achieve this).


Why don't you start a parallel-sig and then hash this out with other
distributed computing users ?

You could then reach a decision by the time 2.7 is scheduled for release
and then add the chosen module to the stdlib.

The API of the processing module does look simple and nice, but
parallel processing is a minefield - esp. when it comes to handling
error situations (e.g. a worker failing, network going down, fail-over,
etc.).

What I'm missing with the processing module is a way to spawn processes
on clusters (rather than just on a single machine).

In the scientific world, MPI is the standard API of choice for doing
parallel processing, so if we're after standards, supporting MPI
would seem to be more attractive than the processing module.

http://pypi.python.org/pypi/mpi4py

In the enterprise world, you often find CORBA based solutions.

http://omniorb.sourceforge.net/

And then, of course, you have a gazillion specialized solutions
such as PyRO:

http://pyro.sourceforge.net/

OTOH, perhaps the stdlib should just include entry-level support
for some form of parallel processing, in which case processing
does look attractive.

--
Marc-Andre Lemburg
eGenix.com


Re: [Python-Dev] Tool for converting %-formatting to .format()ing ?

2008-05-10 Thread M.-A. Lemburg


On 2008-05-10 01:18, Martin v. Löwis wrote:

Is there a tool available that can convert 2.x code automagically
to the .format() method syntax ?

Just did a quick grep of our code base and it has some 2000 lines of code
that would need to be changed.


Why do you think this code needs to change?

I'd leave all the code as-is, and might not start using .format before
Python 3.2, unless some coding convention says I have to.


True, just wanted to know whether there is such a tool.

I personally like the %-notation a lot, mainly because it's more
or less the same as in C.

%i, %s and %r are by far the most used format characters in our code base.
Determining the position index and writing {0!s} or {0!r} instead
(which requires quite a finger dance on a German keyboard) doesn't
make .format() really attractive, IMHO.
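
For comparison, here is the same line written both ways, using only the
three format characters mentioned above. The variable names are made up
for the example:

```python
name, count = "queue", 3

# %-notation: the position is implied by the order of the arguments.
old_style = "%s has %i items (%r)" % (name, count, name)

# .format(): each field carries an explicit index and conversion.
new_style = "{0!s} has {1:d} items ({2!r})".format(name, count, name)

print(old_style)
print(new_style)
```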

Perhaps you're right and it's better to wait a few rounds of
refinements of .format() before jumping on that train :-)

--
Marc-Andre Lemburg
eGenix.com


Re: [Python-Dev] [Python-3000] [Python-checkins] r62848 - python/trunk/Objects/setobject.c

2008-05-09 Thread M.-A. Lemburg


On 2008-05-08 13:59, Barry Warsaw wrote:

On May 8, 2008, at 7:54 AM, Benjamin Peterson wrote:


On Thu, May 8, 2008 at 6:32 AM, Barry Warsaw [EMAIL PROTECTED] wrote:

Since the trunk buildbots appear to be mostly happy (well those that are
connected anyway), and because I couldn't get the releases out last night,
I'll let this one slide.  I'd like to find a way to more forcefully enforce
commit freezes for the betas though.



I wonder if you couldn't alter the server side commit hook to reject
everything with the message "Sorry, we're in a freeze." (You'd have to
make an exception for yourself.)


This is exactly what I'm thinking about!


+1, that's easy to do with Subversion and doesn't hurt anyone.
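
A rough sketch of such a hook, assuming the usual Subversion pre-commit
convention: the script receives the repository path and transaction name
as arguments, and a non-zero exit status rejects the commit with stderr
passed back to the client. The freeze switch and the exempt account name
are hypothetical:

```python
import subprocess
import sys

FREEZE_ACTIVE = True                    # hypothetical switch, flipped by hand
EXEMPT_AUTHORS = {"release.manager"}    # hypothetical account name


def commit_allowed(author, freeze_active=FREEZE_ACTIVE, exempt=EXEMPT_AUTHORS):
    """Allow everything outside a freeze; during one, only exempt authors."""
    return (not freeze_active) or author in exempt


def main(argv):
    repos, txn = argv[1], argv[2]
    # Ask Subversion who is committing this transaction.
    author = subprocess.check_output(
        ["svnlook", "author", "-t", txn, repos]).decode().strip()
    if commit_allowed(author):
        return 0
    sys.stderr.write("Sorry, we're in a freeze.\n")
    return 1

# In the installed hooks/pre-commit script: sys.exit(main(sys.argv))
```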

Please also use a term like "freeze" or "frozen" in the subject line
of the announcement - perhaps even in capital letters.

Thanks,
--
Marc-Andre Lemburg
eGenix.com


[Python-Dev] Tool for converting %-formatting to .format()ing ?

2008-05-09 Thread M.-A. Lemburg


Is there a tool available that can convert 2.x code automagically
to the .format() method syntax ?

Just did a quick grep of our code base and it has some 2000 lines of code
that would need to be changed.

Thanks,
--
Marc-Andre Lemburg
eGenix.com


Re: [Python-Dev] Tool for converting %-formatting to .format()ing ?

2008-05-09 Thread M.-A. Lemburg


On 2008-05-09 15:29, [EMAIL PROTECTED] wrote:

mal Is there a tool available that can convert 2.x code automagically
mal to the .format() method syntax ?

mal Just did a quick grep of our code base and it has some 2000 lines
mal of code that would need to be changed.

I suggested a 2to3 fixer for this but was shot down.


Well, ideally such a tool should address 2to2 :-)
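
To give an idea of what the core of such a tool might look like, here is
a deliberately small sketch that rewrites a template string only. It
handles just the plain %s, %r, %d and %i specifiers plus the %% escape;
width, precision, flags and %(name)s mappings are not covered, and finding
the format strings in source code is a separate problem. The function
name is made up:

```python
import itertools
import re

# Only the plain specifiers discussed in the thread, plus the %% escape.
_SPEC = re.compile(r"%([%srdi])")
_CONVERSIONS = {"s": "!s", "r": "!r", "d": ":d", "i": ":d"}


def percent_to_format(template):
    """Rewrite a simple %-style template into str.format() syntax."""
    index = itertools.count()
    # Literal braces must be doubled for str.format().
    escaped = template.replace("{", "{{").replace("}", "}}")

    def replace(match):
        kind = match.group(1)
        if kind == "%":
            return "%"  # '%%' was an escaped percent sign
        return "{%d%s}" % (next(index), _CONVERSIONS[kind])

    return _SPEC.sub(replace, escaped)
```

One caveat even for this small subset: %i accepts a float and truncates
it, while {0:d} raises an error for floats, so a mechanical rewrite is
not always behaviour-preserving.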

--
Marc-Andre Lemburg
eGenix.com


Re: [Python-Dev] [Python-3000] Reminder: last alphas next Wednesday 07-May-2008

2008-05-04 Thread M.-A. Lemburg


On 2008-05-04 18:14, Christian Heimes wrote:

First, Skip, I *only* care about the default behavior.  There's already
a way to do it differently: PYTHONPATH.  So, Fred, I think what you're
arguing for is to drop this feature entirely.  Or is there some other
use for a new way to allow users to explicitly add something to
sys.path, aside from PYTHONPATH?  It seems that it would add more
complexity and I can't see what the value would be.


PYTHONPATH is lacking one feature which is important for lots of
packages and setuptools. The directories in PYTHONPATH are just added to
sys.path. But setuptools requires a site package directory. Maybe a new
env var PYTHONSITEPATH could solve the problem.


We don't need another setup variable for this. Just place a
well-known module into the site-packages/ directory and then
query its __file__ attribute, e.g.

site-packages/site_packages.py

The module could even include a few helpers to query various
settings which apply to the site packages directory, e.g.

site_packages.get_dir()
site_packages.list_packages()
site_packages.list_modules()
etc.
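
A sketch of what such helpers might contain. The proposed
site_packages.py does not exist, so everything here is hypothetical; to
keep the example runnable anywhere, the helpers take the marker module
(or the directory) as an argument instead of relying on their own
__file__:

```python
import os
import pkgutil


def get_dir(marker_module):
    """Directory that holds the given marker module."""
    return os.path.dirname(os.path.abspath(marker_module.__file__))


def list_packages(directory):
    """Names of top-level packages found in directory."""
    return sorted(name
                  for _, name, is_pkg in pkgutil.iter_modules([directory])
                  if is_pkg)


def list_modules(directory):
    """Names of top-level plain modules found in directory."""
    return sorted(name
                  for _, name, is_pkg in pkgutil.iter_modules([directory])
                  if not is_pkg)
```

Inside a real site-packages/site_packages.py, get_dir() would simply use
the module's own __file__.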


As I've said a dozen times in this thread already, the feature I'd like
to get from a per-user installation location is that 'setup.py install',
or at least some completely canonical distutils incantation, should
work, by default, for non-root users; ideally non-administrators on
windows as well as non-root users on unixish platforms.


The implementation of my PEP provides a new option for install:

$ python setup.py install --user

Is it sufficient for you?


Just in case you don't know...

python setup.py install --home=~

will install to ~/lib/python

The problem is not getting the packages installed in a non-admin
location. It's about Python looking in a non-admin location by
default (as well as in the site-packages location).

--
Marc-Andre Lemburg
eGenix.com


Re: [Python-Dev] [Python-3000] Reminder: last alphas next Wednesday 07-May-2008

2008-05-04 Thread M.-A. Lemburg


On 2008-05-04 21:57, Christian Heimes wrote:

M.-A. Lemburg schrieb:

PYTHONPATH is lacking one feature which is important for lots of
packages and setuptools. The directories in PYTHONPATH are just added to
sys.path. But setuptools requires a site package directory. Maybe a new
env var PYTHONSITEPATH could solve the problem.

We don't need another setup variable for this. Just place a
well-known module into the site-packages/ directory and then
query its __file__ attribute, e.g.

site-packages/site_packages.py

The module could even include a few helpers to query various
settings which apply to the site packages directory, e.g.

site_packages.get_dir()
site_packages.list_packages()
site_packages.list_modules()
etc.


I don't see how it is going to solve the use case "Add another site
package directory when I don't have write access to the global site
package directory and I don't want to modify my apps".


No, but it's going to solve the issue of which of the sys.path directories
is to be considered the site packages directory. I was under the
impression that this is what you were after.


Just in case you don't know...

python setup.py install --home=~

will install to ~/lib/python

The problem is not getting the packages installed in a non-admin
location. It's about Python looking in a non-admin location by
default (as well as in the site-packages location).


I know the --home option. For one, the --home option is Unix-only and not
supported on Windows. Also, the --user option takes all options of my PEP
370 user site directory into account, including the PYTHONUSERBASE env var.


Ok. Just wanted to mention that there is a precedent in distutils
for doing user home directory installations.
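
For reference, a short sketch of how the per-user location can be
inspected from Python. It assumes the site module API as it exists in
later releases (2.7 and 3.2 onwards), which postdates this thread:

```python
import site
import sys

user_base = site.getuserbase()           # honours PYTHONUSERBASE if set
user_site = site.getusersitepackages()   # per-user site-packages directory

print("user base:       %s" % user_base)
print("user site:       %s" % user_site)
print("enabled:         %s" % site.ENABLE_USER_SITE)
print("on sys.path now: %s" % (user_site in sys.path))
```

The last line shows the difference from --home: the user site directory
is searched by default, without setting PYTHONPATH.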

--
Marc-Andre Lemburg
eGenix.com


Re: [Python-Dev] Encoding detection in the standard library?

2008-04-23 Thread M.-A. Lemburg


On 2008-04-23 07:26, Terry Reedy wrote:
Martin v. Löwis [EMAIL PROTECTED] wrote in message 
news:[EMAIL PROTECTED]

| I certainly agree that if the target set of documents is small enough it
|
| Ok. What advantage would you (or somebody working on a similar project)
| gain if chardet was part of the standard library? What if it was not
| chardet, but some other algorithm?

It seems to me that since there is not a 'correct' algorithm but only 
competing heuristics, encoding detection modules should be made available 
via PyPI and only be considered for stdlib after a best of breed emerges 
with community support. 


+1

Though in practice, determining the best of breed often becomes a
problem (see e.g. the JSON implementation discussion).

--
Marc-Andre Lemburg
eGenix.com


Re: [Python-Dev] Encoding detection in the standard library?

2008-04-22 Thread M.-A. Lemburg


On 2008-04-21 23:31, Martin v. Löwis wrote:
This is useful when you get a hunk of data which _should_ be some  
sort of intelligible text from the Big Scary Internet (say, a posted  
web form or email message), and you want to do something useful with  
it (say, search the content).


I don't think that should be part of the standard library. People
will mistake what it tells them for certain.


+1

I also think that it's better to educate people to add (correct)
encoding information to their text data, rather than give them a
guess mechanism...

http://chardet.feedparser.org/docs/faq.html#faq.yippie

chardet is based on the Mozilla algorithm and at least in
my experience that algorithm doesn't work too well.

The Mozilla algorithm may work for Asian encodings due to the fact
that those encodings are usually also bound to a specific language
(and you can then use character and word frequency analysis), but
for encodings which can encode far more than just a single language
(e.g. UTF-8 or Latin-1), the correct detection rate is rather low.

The problem becomes even more difficult when leaving
the normal text domain or when mixing languages in the same
text, e.g. when trying to detect source code with comments using
a non-ASCII encoding.

The trick to just pass the text through a codec and see whether
it roundtrips also doesn't necessarily help: Latin-1, for example,
will always round-trip, since Latin-1 is a subset of Unicode.
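
A short demonstration of why the round-trip test says nothing for
Latin-1 (written for Python 3; the sample text is made up):

```python
data = "Grüße".encode("utf-8")     # the bytes are really UTF-8

wrong = data.decode("latin-1")     # decodes without any error
roundtrips = wrong.encode("latin-1") == data

# Every possible byte value survives the same round trip.
all_bytes = bytes(range(256))
all_roundtrip = all_bytes.decode("latin-1").encode("latin-1") == all_bytes

print(roundtrips, all_roundtrip, wrong == "Grüße")
```

The round trip succeeds even though the decoded text is not the original
string, so success cannot be taken as evidence for Latin-1.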

IMHO, more research has to be done into this area before a
standard module can be added to Python's stdlib... and
who knows, perhaps we're lucky and by that time everyone is
using UTF-8 anyway :-)

--
Marc-Andre Lemburg
eGenix.com


Re: [Python-Dev] Encoding detection in the standard library?

2008-04-22 Thread M.-A. Lemburg


[CCing python-dev again]

On 2008-04-22 12:38, Greg Wilson wrote:

I don't think that should be part of the standard library. People
will mistake what it tells them for certain.
[etc]


These are all good arguments, but the fact remains that we can't control 
our inputs (e.g., we're archiving mail messages sent to lists managed by 
DrProject), and some of those inputs *don't* tell us how they're encoded.

Under those circumstances, what would you recommend?


I haven't done much research into this, but in general, I think it's
better to:

 * first try to look at other characteristics of a text
   message, e.g. language, origin, topic, etc.,

 * then narrow down the number of encodings which could apply,

 * rank them to try to avoid ambiguities and

 * then try to see what percentage of the text you can decode using
   each of the encodings in reverse ranking order (ie. more specialized
   encodings should be tested first, latin-1 last).
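
The last two steps could be sketched roughly as follows. This is only an
illustration of the ordering idea, not a tested detection method: the
function name, the threshold parameter and the use of the U+FFFD
replacement character to measure "percentage decoded" are all
assumptions. The candidate list is expected to be already narrowed down
and ordered, with the most specialized encodings first and latin-1 last:

```python
def guess_encoding(data, candidates, threshold=1.0):
    """Return the first candidate that decodes at least `threshold` of data."""
    for encoding in candidates:
        decoded = data.decode(encoding, errors="replace")
        if not decoded:
            return encoding  # nothing to judge; accept the first candidate
        # U+FFFD marks bytes the codec could not decode. Text that really
        # contains U+FFFD would be undercounted by this measure.
        bad = decoded.count("\ufffd")
        if 1.0 - float(bad) / len(decoded) >= threshold:
            return encoding
    return None


candidates = ["ascii", "utf-8", "latin-1"]
print(guess_encoding("Grüße".encode("utf-8"), candidates))
print(guess_encoding("Grüße".encode("latin-1"), candidates))
```

Since latin-1 accepts any byte sequence, it only works as a last resort
at the end of the list.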

--
Marc-Andre Lemburg
eGenix.com


Re: [Python-Dev] Encoding detection in the standard library?

2008-04-22 Thread M.-A. Lemburg


On 2008-04-22 18:33, Bill Janssen wrote:

The 2002 paper "A language and character set determination method
based on N-gram statistics" by Izumi Suzuki and Yoshiki Mikami and
Ario Ohsato and Yoshihide Chubachi seems to me a pretty good way to go
about this.


Thanks for the reference.

Looks like the existing research on this just hasn't made it into the
mainstream yet.

Here's their current project: http://www.language-observatory.org/
Looks like they are focusing more on language detection.

Another interesting paper using n-grams:
"Language Identification in Web Pages" by Bruno Martins and Mário J. Silva
http://xldb.fc.ul.pt/data/Publications_attach/ngram-article.pdf

And one using compression:
"Text Categorization Using Compression Models" by
Eibe Frank, Chang Chui, Ian H. Witten
http://portal.acm.org/citation.cfm?id=789742


They're looking at LSEs, language-script-encoding
triples; a script is a way of using a particular character set to
write in a particular language.

Their system has these requirements:

R1. the response must be either "correct answer" or "unable to detect",
where "unable to detect" includes "other than registered" [the
registered set of LSEs];

R2. Applicable to multi-LSE texts;

R3. never accept a wrong answer, even when the program does not have
enough data on an LSE; and

R4. applicable to any LSE text.

So, no wrong answers.

The biggest disadvantage would seem to be that the registration data
for a particular LSE is kind of bulky; on the order of 10,000
shift-codons, each of three bytes, about 30K uncompressed.

http://portal.acm.org/ft_gateway.cfm?id=772759&type=pdf


For a server based application that doesn't sound too large.

Unless you're using a very broad scope, I don't think that
you'd need more than a few hundred LSEs for a typical
application - nothing you'd want to put in the Python stdlib,
though.
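
As a back-of-the-envelope check of those sizes: the per-LSE figures are
the ones quoted above, while the count of 300 is just one assumed reading
of "a few hundred":

```python
CODONS_PER_LSE = 10000   # from the quoted estimate
BYTES_PER_CODON = 3      # from the quoted estimate
LSE_COUNT = 300          # assumption: one reading of "a few hundred"

bytes_per_lse = CODONS_PER_LSE * BYTES_PER_CODON
total_bytes = bytes_per_lse * LSE_COUNT

print("per LSE: %d bytes" % bytes_per_lse)
print("total:   %.1f MB" % (total_bytes / 1e6))
```

That is about 9 MB of uncompressed registration data under these
assumptions.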


Bill


IMHO, more research has to be done into this area before a
standard module can be added to Python's stdlib... and
who knows, perhaps we're lucky and by that time everyone is
using UTF-8 anyway :-)

I walked over to our computational linguistics group and asked.  This
is often combined with language guessing (which uses a similar
approach, but using characters instead of bytes), and apparently can
usually be done with high confidence.  Of course, they're usually
looking at clean texts, not random stuff.  I'll see if I can get
some references and report back -- most of the research on this was
done in the 90's.

Bill


--
Marc-Andre Lemburg
eGenix.com


Re: [Python-Dev] Python 32- and 64-bit living together

2008-04-11 Thread M.-A. Lemburg

On 2008-04-11 19:10, Sérgio Durigan Júnior wrote:
 Hi all,
 
 My question is simple: is there any problem when installing/using both
 32- and 64-bit Python's on the same machine? I'm more concerned about
 header files (those installed under /usr/include/python-2.x), because as
 far as I could see there's nothing similar to a #ifdef USE_64BIT or
 something on them.

The include files are all static and can be used on both 32-bit and
64-bit platforms or installations.

Only the /usr/lib/python2.x files differ between 32-bit and 64-bit
(the configuration files are in /usr/lib/python2.x/config).
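
To check where a given installation keeps its platform-independent and platform-dependent files, something like the following can be used. This is only a sketch for current Python versions: the sysconfig module used here did not exist in the 2.x releases discussed in this thread, and the helper name install_layout is made up for illustration.

```python
import struct
import sysconfig


def install_layout():
    """Collect the pointer size and the header/library directories
    reported by the running interpreter."""
    paths = sysconfig.get_paths()
    return {
        # 32 or 64, depending on how this interpreter was built
        "pointer_bits": struct.calcsize("P") * 8,
        # platform-independent and platform-dependent header directories
        "include": paths["include"],
        "platinclude": paths["platinclude"],
        # platform-independent and platform-dependent library directories
        "stdlib": paths["stdlib"],
        "platstdlib": paths["platstdlib"],
    }


for key, value in install_layout().items():
    print(key, "=", value)
```

Running this with both the 32-bit and the 64-bit interpreter shows which directories the two installations would share.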

-- 
Marc-Andre Lemburg
eGenix.com


Re: [Python-Dev] Python 32- and 64-bit living together

2008-04-11 Thread M.-A. Lemburg

On 2008-04-11 20:21, Sérgio Durigan Júnior wrote:
 Hi Lemburg,
 
 On Fri, 2008-04-11 at 19:38 +0200, M.-A. Lemburg wrote:
 On 2008-04-11 19:10, Sérgio Durigan Júnior wrote:
 Hi all,

 My question is simple: is there any problem when installing/using both
 32- and 64-bit Python's on the same machine? I'm more concerned about
 header files (those installed under /usr/include/python-2.x), because as
 far as I could see there's nothing similar to a #ifdef USE_64BIT or
 something on them.
 The include files are all static and can be used on both 32-bit and
 64-bit platforms or installations.
 
 Thanks :-).
 
 Only the /usr/lib/python2.x files differ between 32-bit and 64-bit
 (the configuration files are in /usr/lib/python2.x/config).
 
 Hmm, right. I tried to modify the installation path (using --libdir
 in ./configure) to /usr/lib64, but some *.pyo objects still are
 installed under /usr/lib. AFAIK, these objects are bitness-dependent
 (i.e., if they were generated by a 32-bit Python, they can only be
 executed by a 32-bit Python - and vice-versa), right?

Right.

 Is there any way to separate these arch-dependent files in /usr/lib
 and /usr/lib64 depending on their bitness?

There's no need for that. Only the config/ dir which is included
in the Python lib dir is dependent on the Python configuration.

 Thanks,
 
 P.S.: I think this misbehaviour of --libdir is a bug. IMHO, it should
 put every arch-dependent file in the path that the user provided.

You should probably have a look at how RedHat or openSUSE solve
these problems. Some of them have patched Python to fit their
needs. You may have to do that as well.

-- 
Marc-Andre Lemburg
eGenix.com


Re: [Python-Dev] Python 32- and 64-bit living together

2008-04-11 Thread M.-A. Lemburg

On 2008-04-11 22:25, Sérgio Durigan Júnior wrote:
 On Fri, 2008-04-11 at 22:06 +0200, M.-A. Lemburg wrote:
  
 Hmm, right. I tried to modify the installation path (using --libdir
 in ./configure) to /usr/lib64, but some *.pyo objects still are
 installed under /usr/lib. AFAIK, these objects are bitness-dependent
 (i.e., if they were generated by a 32-bit Python, they can only be
 executed by a 32-bit Python - and vice-versa), right?
 Right.

Sorry, I misread your question. PYO and PYC files are *not* dependent
on 32/64 bit sizes.
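
One way to see this on a current Python 3 is to look at the header of a freshly compiled file: it starts with a magic number that identifies the byte-code version of the interpreter, and that number carries no word-size component. This is a rough sketch, not something from the original thread; the helper name compiled_magic is made up, and importlib.util.MAGIC_NUMBER is a Python 3 API.

```python
import importlib.util
import os
import py_compile
import struct
import tempfile


def compiled_magic(source_text):
    """Byte-compile source_text and return the first 4 bytes of the .pyc."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "demo.py")
        with open(src, "w") as handle:
            handle.write(source_text)
        # py_compile.compile() returns the path of the byte-compiled file
        pyc = py_compile.compile(
            src, cfile=os.path.join(tmp, "demo.pyc"), doraise=True
        )
        with open(pyc, "rb") as handle:
            return handle.read(4)


magic = compiled_magic("x = 1\n")
print("pointer size (bits):", struct.calcsize("P") * 8)
print("header matches interpreter magic:", magic == importlib.util.MAGIC_NUMBER)
```

A 32-bit and a 64-bit build of the same Python version should report different pointer sizes but the same magic.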

 Is there any way to separate these arch-dependent files in /usr/lib
 and /usr/lib64 depending on their bitness?
 There's no need for that. Only the config/ dir which is included
 in the Python lib dir is dependent on the Python configuration.
 
 
 I'm afraid I still don't understand your point. I mean, if the *.pyo
 file *is* dependent on the bitness of the Python interpreter (as you
 confirmed in my first question), then when I decide to have both
 32- and 64-bit Python on my system I *must* have two versions of
 every .pyo file: one for 32- and another for 64-bit Python. What have
 I missed?

Sorry for the confusion.

 You should probably have a look at how RedHat or openSUSE solve
 these problems. Some of them have patched Python to fit their
 needs. You may have to do that as well.
 
 I'll sure take a look at them. Thanks!

-- 
Marc-Andre Lemburg
eGenix.com


Re: [Python-Dev] fixing broken build

2008-03-27 Thread M.-A. Lemburg

On 2008-03-27 09:20, Christian Heimes wrote:
 Neal Norwitz wrote:
 Christian,

 Please fix the build on the various buildbots that are failing or
 revert your changes for unicode literals.  The build failures started
 to occur at r61953.  There were several more (~5) follow up checkins.

 You can find all the failures here:  http://www.python.org/dev/buildbot/all/

 There seem to be at least two variations for how setup.py is failing.
 See below.
 
 I've already fixed the problem in r61956. I didn't notice the issue
 with an uninitialized variable until I compiled Python without pydebug. In
 order to fix the problem on the build bots one has to remove all pyc and
 pyo files.

I'm not sure why that's necessary, but whenever you change something
in the compiler, please remember to update the PYC magic.

I'd also suggest that you run a non-debug build of Python to test
any checkins before committing them. Debug builds change the way
the code is built in various ways.

Thanks,
-- 
Marc-Andre Lemburg
eGenix.com


Re: [Python-Dev] Decimal(unicode)

2008-03-26 Thread M.-A. Lemburg

On 2008-03-26 07:11, Martin v. Löwis wrote:
 For binary representations, we already have the struct module to handle 
 the parsing, but for byte sequences with embedded ASCII digits it's 
 reasonably common practice to use strings along with the respective type 
 constructors.
 
 Sure, but why can't you write
 
  foo = int(bar[start:stop].decode("ascii"))
 
 then? Explicit is better than implicit.

Agreed.

The whole purpose of Unicode is to store text. Data from a file
isn't text per-se. You have to tell Python that a particular set of
bytes is to be interpreted as text and that only works by explicitly
converting the bytes to text.

Numbers or digits aren't any different in this context.
b"1234" is just a sequence of bytes and could well represent
the binary encoding of an integer, the start of a base64 encoded
image, an SSH key or an audio file.

Don't get fooled by the looks of b"1234". It's really just a
shorter way of writing 0x31 0x32 0x33 0x34.
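
To make the point concrete, here is a small sketch in Python 3 syntax showing the same four bytes read once as ASCII digits and once as a big-endian binary integer. The variable names are only illustrative.

```python
import struct

data = b"1234"

# The raw byte values: 0x31 0x32 0x33 0x34
print([hex(byte) for byte in data])

# Interpretation 1: the bytes are ASCII text holding decimal digits.
# The decoding step is spelled out explicitly.
as_text_number = int(data.decode("ascii"))
print(as_text_number)

# Interpretation 2: the same bytes are a 32-bit big-endian integer.
as_binary_number = struct.unpack(">I", data)[0]
print(as_binary_number)
```

Both results are valid readings of the same data; only the explicit decode or unpack step says which one is meant.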

-- 
Marc-Andre Lemburg
eGenix.com


Re: [Python-Dev] Proposal: from future import unicode_string_literals

2008-03-25 Thread M.-A. Lemburg

On 2008-03-24 09:22, Lennart Regebro wrote:
 I think 2to3 is a procedure that will work well for library type
 projects with a reasonably small set of developers that make regular
 releases. There you can release both a python 2 and a python 3 version
 of the module, for example.
 ...
 So, in short: Large projects with interconnected modules where the
 developers and users of module are the same people will have big
 difficulties with the 2to3 approach and would be the people who are
 most likely to not be able to in practice go forward to Python 3
 unless they have some sort of smooth path forward.

I don't think there's a lot to worry about:

Companies using Python for applications typically have a completely
different life-cycle of releases and applications compared to the
Python release schedule, i.e. they often still run Python 2.3 or
2.4 and wait for major releases to settle before deciding to
port to them.

Every now and then, they make the decision to port to the next
release (for the next version of their software) and this change is
then managed accordingly - sometimes skipping a complete major release
of Python.

In such projects, 2to3 will get applied to the sources once and then
all development continues on the Python 3.0 version of the code.


In reality, I don't think that 2to3 will get used for continuous
porting between a 2.x code base and a 3.0 one all that much.

The transition from 2.x to 3.0 will happen during a longer period of
time (probably a few years) and depend a lot on the release cycle of
the applications using Python, whether or not the 3.0 version provides
better features, more performance,  etc. and whether the 2.x branches
of Python and the used 3rd party modules are still supported or not.

New applications will likely choose 3.0 right away - provided that
the needed 3rd party modules are available and stable enough.


In summary: 2to3 is a very useful tool to have. Whether or not
it is used for continuous porting between the two worlds is
really secondary.

-- 
Marc-Andre Lemburg
eGenix.com


Re: [Python-Dev] How we can get rid of eggs for 2.6 and beyond

2008-03-21 Thread M.-A. Lemburg

On 2008-03-21 14:47, Phillip J. Eby wrote:
 So, to accomplish this, we (for some value of we) need to:
 
 1. Hash out consensus around what changes or enhancements are needed 
 to PEP 262, to resolve the previously-listed open issues, those that 
 have come up since (namespace packages, dependency specifications, 
 canonical name/version forms), and anything else that comes up.
 
 2. Update or replace the implementation as appropriate, and modify 
 the distutils to support it in Python 2.6 and beyond.  And "support 
 it" means, ensure that 'install' and *all* bdist commands update the 
 database.  The bdist_rpm, bdist_wininst, and bdist_msi commands, 
 even bdist_dumb.  (This should probably also include the add/remove 
 programs stuff in the Windows case.)

The bdist commands don't need to touch that database in any way,
since they don't install anything, nor do they upload things
anywhere. They simply package code and put the result into
the dist/ subdir. That's all.

What you probably mean is that the installers, pre/post-scripts,
etc. run when installing one of those packages should update
the database of installed packages.

Note that there are several package formats which do not execute
any code when installing them - the user simply unzips them in
some directory. These packages won't be able to register themselves
with a database.

I guess the only way to support all of these variants is
to use a filesystem based approach, e.g. by placing a file
with a special extension into some dir on sys.path.
The database logic could then scan sys.path for these
files, read the data and provide an interface to it.

All bdist formats would then have to include these files.
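
A minimal sketch of such a scan, assuming the metadata files are plain directory entries with a known suffix. The function name find_metadata_files and the default suffix are assumptions for illustration, not an existing distutils API.

```python
import os
import sys
import tempfile


def find_metadata_files(search_path=None, suffix=".egg-info"):
    """Return all entries ending in `suffix` found in the directories
    listed in search_path (defaults to sys.path)."""
    if search_path is None:
        search_path = sys.path
    found = []
    for entry in search_path:
        directory = entry or "."  # '' on sys.path means the current dir
        if not os.path.isdir(directory):
            continue  # skip ZIP files and non-existing directories
        for name in sorted(os.listdir(directory)):
            if name.endswith(suffix):
                found.append(os.path.join(directory, name))
    return found


# Small demonstration with a temporary directory standing in for sys.path
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "mypkg-1.0.egg-info"), "w").close()
    open(os.path.join(tmp, "unrelated.txt"), "w").close()
    print(find_metadata_files([tmp]))
```

Because the lookup only reads the file system, it also works for packages that were installed by simply unzipping an archive.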

distutils already writes .egg-info files when running
python setup.py install, so perhaps that's a start (though
I'd prefer a three letter extension such as .pkg).

.egg-info files currently only include the package meta-data
(the PKG-INFO section from PEP 262).

We'd have to add a list of files making up the package (FILES
section in PEP 262) and also some extra information about any
extra files the package creates that can safely be removed in
the uninstall process (e.g. .pyo and .pyc files, temporary files,
database files, configuration data, registry entries, etc.) -
this is currently not covered in PEP 262.

I don't think the REQUIRES and PROVIDES sections from the
PEP 262 are needed. That info can easily go into the PKG-INFO
section.

A separate FILES section also doesn't seem to be necessary -
we could just add one or more entries of the format:

CreatesDir abc/
CreatesFile abc/xyz1.py
CreatesDir abc/def/
CreatesFile abc/def/xyz2.py
CreatesFile abc/def/xyz3.py
CreatesFile abc/def/xyz4.ini

(BTW: wininst writes such a file for the uninstall process)

So to keep things simple, the rfc822 approach defined in
PEP 241 would easily cover everything needed and we could
trim down the PEP 262 format to a simple rfc822 header
list.

In other words: the .egg-info files already provide the basis
and only need to be extended with a list of created files,
directories (and possibly other resources) as well as a list
of resources which may be removed even if not installed
explicitly such as byte-code files, etc.
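
As a rough illustration of how little code reading such a header list takes, the following sketch parses a made-up metadata file with the stdlib email parser. It assumes the entries are written with a colon after the field name, as rfc822 headers require; the sample package data is invented for the example.

```python
from email.parser import HeaderParser

# Hypothetical metadata file: PKG-INFO style fields plus the
# CreatesDir / CreatesFile entries suggested above.
SAMPLE = """\
Metadata-Version: 1.0
Name: mypkg
Version: 1.0
CreatesDir: abc/
CreatesFile: abc/xyz1.py
CreatesDir: abc/def/
CreatesFile: abc/def/xyz2.py
CreatesFile: abc/def/xyz3.py
CreatesFile: abc/def/xyz4.ini
"""

message = HeaderParser().parsestr(SAMPLE)

name = message["Name"]
version = message["Version"]
# get_all() returns every occurrence of a repeated header, in file order
created_dirs = message.get_all("CreatesDir", [])
created_files = message.get_all("CreatesFile", [])

print(name, version)
print(created_dirs)
print(created_files)
```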

 3. Create a document for system packagers referencing the PEP and 
 introducing them to what/why/how of the standard, in case they 
 weren't one of the original participants in creating this.

This should probably be a new PEP defining all the bits and
pieces making up the installation database.

 It will probably take some non-trivial work to do all this for Python 
 2.6, but it's probably possible, if we start now.  I don't think it's 
 critical to have an uninstall tool distributed with 2.6, as long as 
 there's a reasonable way to bootstrap its installation later.

BTW: There's a simple uninstall command in mxSetup.py that we
could contribute to distutils. It works much in the same
way as the install command... except that it removes all the
files it would have installed.

Using pre-built packages, this works without having to rebuild
the package just to be able to determine the list of things
that need to be removed.
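
For comparison, an uninstall driven by a recorded list of files and directories could look roughly like the sketch below. This is not the mxSetup.py implementation (which derives the list from the install command instead); the function name and the handling of byte-code files are assumptions made for illustration.

```python
import os
import tempfile


def uninstall(prefix, files, dirs):
    """Remove the listed files (plus byte-code next to .py files), then
    remove the listed directories that are empty afterwards."""
    removed = []
    for rel in files:
        path = os.path.join(prefix, rel)
        candidates = [path]
        if path.endswith(".py"):
            candidates += [path + "c", path + "o"]  # .pyc / .pyo
        for candidate in candidates:
            if os.path.isfile(candidate):
                os.remove(candidate)
                removed.append(candidate)
    # Longest paths first, so nested directories go before their parents
    for rel in sorted(dirs, key=len, reverse=True):
        path = os.path.join(prefix, rel)
        if os.path.isdir(path) and not os.listdir(path):
            os.rmdir(path)
            removed.append(path)
    return removed


# Demonstration in a throw-away directory
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "abc", "def"))
    for rel in ("abc/xyz1.py", "abc/xyz1.pyc", "abc/def/xyz2.py"):
        open(os.path.join(tmp, rel), "w").close()
    uninstall(tmp, ["abc/xyz1.py", "abc/def/xyz2.py"], ["abc/", "abc/def/"])
    print(os.listdir(tmp))
```

Directories that still contain files not owned by the package are left alone, which keeps other installed packages intact.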

-- 
Marc-Andre Lemburg
eGenix.com


Re: [Python-Dev] Proposal: from future import unicode_string_literals

2008-03-21 Thread M.-A. Lemburg

On 2008-03-21 22:32, Martin v. Löwis wrote:
 It's not implementable because the work has to occur in ast.c (see 
 Py_UnicodeFlag).  It can't occur later, because you need to skip the 
 encoding being done in parsestr().  But the __future__ import can only 
 be interpreted after the AST is built, at which time the encoding has 
 already been applied.  
 
 I think it would be possible to check for future statements on the
 basis of nodes already. Take a look at how Python 2.3 implemented
 future statements (why was that rewritten to use the AST, anyway?).
 
 As for it not making sense, this is really in the realm of 2to3.  I'm 
 beginning to really believe this statement in PEP 3000:
 
 There is still the original use case of people who don't want to run
 2to3 (for whatever reasons - mostly probably subjective ones), and
 who would rather run a single code base unmodified. They don't care
 that documentation tells them this is impossible, when they feel they
 are so close to making it possible.

Could we point them to a special byte-code compiler such as Andrew
Dalke's python4ply?

http://dalkescientific.com/Python/python4ply.html

That approach appears to be a lot easier to implement than trying
to tweak the C implementation of the Python parser.
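
To illustrate the idea of spotting future statements on the node level, here is a small sketch using the ast module of current Python 3. It says nothing about how ast.c or parsestr() work internally; the helper name future_features is made up for the example.

```python
import ast


def future_features(source):
    """Return the feature names imported via top-level
    'from __future__ import ...' statements in source."""
    tree = ast.parse(source)
    names = []
    for node in tree.body:
        if (
            isinstance(node, ast.ImportFrom)
            and node.module == "__future__"
            and node.level == 0  # ignore relative imports
        ):
            names.extend(alias.name for alias in node.names)
    return names


print(future_features("from __future__ import unicode_literals\nx = 'a'\n"))
```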

-- 
Marc-Andre Lemburg
eGenix.com


Re: [Python-Dev] How we can get rid of eggs for 2.6 and beyond

2008-03-21 Thread M.-A. Lemburg

On 2008-03-21 22:21, Phillip J. Eby wrote:
 At 08:06 PM 3/21/2008 +0100, M.-A. Lemburg wrote:
 I guess the only way to support all of these variants is
 to use a filesystem based approach, e.g. by placing a file
 with a special extension into some dir on sys.path.
 The database logic could then scan sys.path for these
 files, read the data and provide an interface to it.

 All bdist formats would then have to include these files.
 
 That's the idea behind the current version of PEP 262, yes, and I think 
 it should be kept.
 
 A separate FILES section also doesn't seem to be necessary -
 we could just add one or more entries of the format:

 CreatesDir abc/
 CreatesFile abc/xyz1.py
 CreatesDir abc/def/
 CreatesFile abc/def/xyz2.py
 CreatesFile abc/def/xyz3.py
 CreatesFile abc/def/xyz4.ini
 
 I actually think the size and hash information is good, in order to be 
 able to tell if you're looking at an original file.  I'm not sure how 
 useful the permissions and uid/gid info is.  I'm hoping we'll hear from 
 anybody who has a use case for that.

You're heading off in the wrong direction: we should not be trying
to rewrite RPM or InnoSetup in Python.

Anything more complicated should be left to tools which are
specifically written to manage complex software setups.

I honestly believe that most people would be happy if we just
provide these two things (and no more):

  * install a package from a local archive, a URL or PyPI

  * uninstall a package in a way that doesn't break other
    installed packages

and whatever the mechanism, avoid making any undercover
changes to the Python installation such as adding
.pth files, overriding site.py, etc. - these are
not needed if the tool keeps to the simple task of
installing and uninstalling Python packages.

Examples:

python pypi.py install mypkg-1.0.tgz
python pypi.py install http://www.example.com/mypkg-1.0.tgz
python pypi.py install mypkg-1.0

python pypi.py uninstall mypkg

If there's a dependency problem, the tool should print the
list of other packages it needs. It should not try to install
things automagically.

If a package needs other modules as well, the package docs
can point the user to use e.g.

python pypi.py install mydep1-1.3 mydep2-2.3 mydep4-0.3 mypkg-1.0

instead.
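
The "report, don't install" behaviour could be sketched like this. pypi.py is the hypothetical tool from the examples above, and both function names are invented for illustration.

```python
def missing_requirements(required, installed):
    """Return the required packages that are not installed yet,
    in the order in which they were requested."""
    installed_set = set(installed)
    return [name for name in required if name not in installed_set]


def install_hint(missing, package):
    """Build the command line the user could run; nothing is installed."""
    return "python pypi.py install " + " ".join(missing + [package])


required = ["mydep1-1.3", "mydep2-2.3", "mydep4-0.3"]
installed = ["mydep2-2.3"]

missing = missing_requirements(required, installed)
if missing:
    print("mypkg-1.0 needs these packages first:", ", ".join(missing))
    print("Try:", install_hint(missing, "mypkg-1.0"))
```

The tool only prints what is missing and leaves the decision to the user.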

Anything more complicated should be left to specialized
tools such as RPM, apt, MSI or the other such tools out
there - after all the tool should be about Python *package*
installation, not application installation.

We *don't* need the tool to:

  * support multiple versions of a package (that's just bound
    to cause problems with pickle, isinstance() etc.)

  * provide namespace hacking (is a completely separate issue
    and can be handled by the packages rather than the install
    tool)

  * support all kinds of funky version numbers (if a package
    wants to participate in the system, the author better
    make sure that the version string fits the standard format)

  * provide some form of intra-package bus interface (ie.
    entry points as you call them)

  * provide support for keeping whole packages in ZIP files
    (doesn't play well with C extensions, clutters up the
    sys.path, is read-only, needs special importers, etc.)

  * try automatic version matching for required packages

  * download things from SourceForge or other sites with special
    download mechanisms

  * scan websites for links

  * make coffee, clean the house, send the kids to school :-)

 And of course, there are still some issues to be resolved regarding 
 requirements, package name/version stuff, etc.  But we can hash those 
 out once we reach a quorum on the Distutils-SIG.

-- 
Marc-Andre Lemburg
eGenix.com


Re: [Python-Dev] Consistent platform name for 64bit windows (was: distutils.util.get_platform() for Windows)

On 2008-03-18 18:05, [EMAIL PROTECTED] wrote:
 I'm reviving a very old thread based on discussions with Martin at pycon.
 
 Sent: Monday, 23 July 2007 5:12 PM
 Subject: Re: [Distutils] distutils.util.get_platform() for Windows
 
 Rather than forcing everyone to read the context, allow me to summarize:
 On 64bit Windows versions, we need a string that identifies the
 platform, and this string should ideally be used consistently.  This
 original thread related to the files created by distutils (eg,
 pywin32-210.win???64??-py2.6.exe) but it seems obvious that we should be
 consistent wherever Python wants to display the platform (eg, in the
 startup banner, in platform.py, etc).
 
 In the old thread, there was a semi-consensus that 'x86_64' be used by
 distutils (and indeed, Lib/distutils/util.py in get_platform() has been
 changed, by me, to use this string), but the Python 'banner', for example,
 reports AMD64.  Platform.py doesn't report much at all in this area, at
 least when pywin32 isn't installed, but it arguably should.
 
 Both Martin and I prefer AMD64 as the string, for various reasons. 
 Firstly, it is less ugly than 'x86_64', and doesn't include an '_'/'-'
 which might tend to confuse parsing by humans or computers.  Martin also
 made the point that AMD invented the architecture and AMD64 is their
 preferred name, so we should respect that.
 
 So, at the risk of painting a bike-shed, I'd like to propose that we adopt
 'AMD64' in distutils (needs a change), platform.py (needs a change to use
 sys.getwindowsversion() in preference to pywin32, if possible, anyway),
 and the Python banner (which already uses AMD64).
 
 Any objections?  Any strong feelings that using 'AMD' will confuse people
 with Intel processors?  Strong feelings about the parsability of the name
 (PJE? <wink>)?  Strong feelings about the color <wink>?

Not really an objection, but Microsoft itself uses the term "x64" for
the 64-bit variants of their OS, e.g.

http://www.microsoft.com/windowsxp/64bit/default.mspx

Since the platform name is targeting Windows, I think we should
avoid confusing Windows users more than Intel users ;-)

About the platform.py changes: if someone could provide the return
values of sys.getwindowsversion() for 64bit versions of Windows
XP and Vista, I could add support for it (don't have a 64bit version
of Windows available to check myself).

Thanks,
-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Mar 20 2008)

Re: [Python-Dev] Consistent platform name for 64bit windows (was: distutils.util.get_platform() for Windows)

On 2008-03-20 13:42, Thomas Heller wrote:
 M.-A. Lemburg wrote:
 About the platform.py changes: if someone could provide the return
 values of sys.getwindowsversion() for 64bit versions of Windows
 XP and Vista, I could add support for it (don't have a 64bit version
 of Windows available to check myself).
 
 This is the output of a 32-bit Python running on Windows XP Professional
 x64 Edition, Version 2003, Service Pack 2:
 
 C:\Python24>ver
 
 Microsoft Windows [Version 5.2.3790]
 
 C:\Python24>python
 Python 2.4.4 (#71, Oct 18 2006, 08:34:43) [MSC v.1310 32 bit (Intel)] on win32
 Type "help", "copyright", "credits" or "license" for more information.
 >>> import sys
 >>> sys.getwindowsversion()
 (5, 2, 3790, 2, 'Service Pack 2')

Thank you !

Anyone with a 64bit Vista ?

Or even better: a page documenting what to expect from the system call
behind the sys.getwindowsversion() API ?

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Mar 20 2008)

Re: [Python-Dev] Consistent platform name for 64bit windows (was: distutils.util.get_platform() for Windows)

On 2008-03-20 13:55, M.-A. Lemburg wrote:
 On 2008-03-20 13:42, Thomas Heller wrote:
 M.-A. Lemburg wrote:
 About the platform.py changes: if someone could provide the return
 values of sys.getwindowsversion() for 64bit versions of Windows
 XP and Vista, I could add support for it (don't have a 64bit version
 of Windows available to check myself).
 This is the output of a 32-bit Python running on Windows XP Professional
 x64 Edition, Version 2003, Service Pack 2:

 C:\Python24>ver

 Microsoft Windows [Version 5.2.3790]

 C:\Python24>python
 Python 2.4.4 (#71, Oct 18 2006, 08:34:43) [MSC v.1310 32 bit (Intel)] on win32
 Type "help", "copyright", "credits" or "license" for more information.
 >>> import sys
 >>> sys.getwindowsversion()
 (5, 2, 3790, 2, 'Service Pack 2')
 
 Thank you !
 
 Anyone with a 64bit Vista ?
 
 Or even better: a page documenting what to expect from the system call
 behind the sys.getwindowsversion() API ?

FYI: I added winreg and sys.getwindowsversion() support in r61674.

platform.machine() and .processor() will now use the environment
variables PROCESSOR_ARCHITECTURE and PROCESSOR_IDENTIFIER where
available (should work on Windows XP and later).

According to http://support.microsoft.com/kb/888731, platform.machine()
will return 'AMD64', so I guess the 'x64' string is just a marketing
name for 64-bit platforms on Windows, while the underlying system uses
'AMD64' as the machine type name.

For x86 processors, you'll now get 'x86' on Windows XP and later.

For Itanium processors, you should get 'IA64' according to this
WOW64 page:

http://msdn2.microsoft.com/en-us/library/aa384274(VS.85).aspx
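
The environment lookup described above can be sketched as follows. This is not the code checked in as r61674, just an illustration of the idea; the helper name windows_machine is made up and it takes the environment as a parameter so that it can be tried on any platform.

```python
import os
import platform


def windows_machine(environ):
    """Return the machine type from a Windows-style environment mapping,
    or an empty string if the variable is not set."""
    return environ.get("PROCESSOR_ARCHITECTURE", "")


# Simulated environments using the values mentioned above
for value in ("AMD64", "x86", "IA64"):
    print(windows_machine({"PROCESSOR_ARCHITECTURE": value}))

# The real environment and what platform.py itself reports
print(repr(windows_machine(os.environ)), repr(platform.machine()))
```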

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Mar 20 2008)

Re: [Python-Dev] PEP 365 (Adding the pkg_resources module)

On 2008-03-20 21:34, Paul Moore wrote:
  Also, setuptools-based packages *can* build bdist_wininst
  installers.  (In fact, if memory serves, I added that feature at your 
 request.)
 
 I know. python setup.py bdist_wininst. And thank you for adding it.
 But again you miss my point. People are starting to omit distributing
 bdist_wininst installers in favour of eggs only. And you cannot (to my
 knowledge) convert an egg into a bdist_wininst installer, and if you
 can't compile from source (a C extension with complex dependencies,
 for example) you're stuck (in the sense that you're forced to use eggs
 without add/remove programs support).

You might want to look at the eGenix pre-built packages as an
alternative: they include all the information necessary to let
standard distutils continue its work *after* the build step.

It's basically a distribution of the package as it looks after
the build step has run, but before the package is wrapped up
using a packager like bdist_wininst or bdist_msi or installed
into the system.

You can download the pre-built package and create e.g. an
MSI installer or a wininst EXE without needing a compiler -
in addition to providing all the options of the standard distutils
install command (which makes repackaging them as part of
larger applications easy as well).

All the logic for this is included in mxSetup.py which ships with
the pre-built packages.

http://www.egenix.com/products/python/mxBase/#Download
http://www.egenix.com/products/python/mxBase/#Installation

The current version we have is not yet perfect. The next
iteration will also play nice with distutils extensions.

-- 
Marc-Andre Lemburg
eGenix.com

Re: [Python-Dev] C-API status of Python 3?

2008-03-02 Thread M.-A. Lemburg

On 2008-03-02 14:47, Christian Heimes wrote:
 Alex Martelli wrote:
 Yep, but please do keep the PyUnicode for str and PyString for bytes
 (as macros/synonyms of PyStr and PyBytes if you want!-) to help the
 task of porting existing extensions... the bytearray functions should
 no doubt be PyBytearray, though.
 
 Yeah, we've already planned to keep PyUnicode as prefix for str type
 functions. It makes perfect sense, not only from the historical point
 of view.
 
 But for PyString I planned to rename the prefix to PyBytes. In my opinion
 we are going to regret it, when we keep too many legacy names from 2.x.
 In order to make the migration process easier I can add a header file
 that provides PyString_* functions as aliases for PyBytes_*
 
 Comments?

+1

Why not also make unicode() the default type constructor and only
keep str() as alias to simplify porting (perhaps with a warning) ?

The term string is just too overloaded with all kinds of
misinterpretations. The term string just refers to a string of
bytes - a variable length array so to speak. However, depending
on the application space, string is used as synonym for
text string just as well as data string.

Removing the term string altogether would make it easier for
people to understand that Py3k only has unicode (for text data)
and bytes (for binary data).
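
To make the text/binary split concrete, here is a small sketch written for Python 3 as it was eventually released, where the text type kept the name str. The sample word is arbitrary:

```python
text = "Grüße"                 # text data: a sequence of Unicode code points
data = text.encode("utf-8")    # binary data: a sequence of bytes

print(type(text).__name__, len(text))   # str 5
print(type(data).__name__, len(data))   # bytes 7

# The two types do not mix implicitly; an explicit decode is required.
try:
    text + data
except TypeError as exc:
    print("mixing refused:", exc)

print(data.decode("utf-8") == text)     # True
```

The length difference (5 code points, 7 bytes) is the kind of detail that gets lost when one word covers both kinds of data.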

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Mar 02 2008)
 Python/Zope Consulting and Support ...http://www.egenix.com/
 mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/
 mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/


 Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! 


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
   Registered at Amtsgericht Duesseldorf: HRB 46611
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] C-API status of Python 3?

2008-03-02 Thread M.-A. Lemburg

On 2008-03-02 20:39, Bill Janssen wrote:
 Why not also make unicode() the default type constructor and only
 keep str() as alias to simplify porting (perhaps with a warning) ?

 The term string is just too overloaded with all kinds of
 misinterpretations. The term string just refers to a string of
 bytes - a variable length array so to speak. However, depending
 on the application space, string is used as synonym for
 text string just as well as data string.

 Removing the term string altogether would make it easier for
 people to understand that Py3k only has unicode (for text data)
 and bytes (for binary data).
 
 I agree that string is very overloaded, but calling it unicode is
 sort of like calling integers int32 -- that is, you're talking about
 the implementation rather than the type. 

Hmm in that case, we'd have to call it ucs2 or ucs4 depending
on how Python was compiled ;-)

 In most programming
 languages that aren't at the machine level (like C is), string
 really is a sequence of text characters, not a string of bytes, and
 that's probably the term that should be used for Python going forward,
 despite the legacy issues it involves.

I'm not bound to unicode at all, just don't think using string
for text data will really make people think twice often enough
and then you end up having binary data in a string again -
with the only difference that it's now using the Unicode type
internally.

My personal favorite is text for text data.

 Personally, I feel that string (for text) and bytes (for binary
 data represented as a sequence of bytes) are appropriate terms for
 Python.  Keep unicode for a release or two as an alias for string.
 But isn't all this in a PEP somewhere already?

-- 
Marc-Andre Lemburg
eGenix.com

Re: [Python-Dev] C-API status of Python 3?

2008-03-02 Thread M.-A. Lemburg

On 2008-03-02 23:11, Greg Ewing wrote:
 M.-A. Lemburg wrote:
 Why not also make unicode() the default type constructor and only
 keep str() as alias to simplify porting (perhaps with a warning) ?
 
 -1 on making us type 7 characters instead of
 3 all over the place.

Oh well... how about text() ?

 The term string is just too overloaded with all kinds of
 misinterpretations. The term string just refers to a string of
 bytes - a variable length array so to speak.
 
 I disagree -- string has come to mean string of
 characters unless otherwise qualified. Using one
 to hold non-characters is just an aberration that
 was necessary in Python 2 because there wasn't much
 alternative.

Buffer objects have been around for years and for exactly
this purpose.

-- 
Marc-Andre Lemburg
eGenix.com

Re: [Python-Dev] Unicode -- UTF-8 in CPython extension modules

2008-02-22 Thread M.-A. Lemburg

On 2008-02-23 00:46, Colin Walters wrote:
 On Fri, Feb 22, 2008 at 4:23 PM, John Dennis [EMAIL PROTECTED] wrote:
 
  Python programs which use Unicode string objects for their i18n and
  which link to C libraries expecting UTF-8 but which have a CPython
  binding which only uses 's' or 's#' formats programs seem to often
  fail with encoding errors.
 
 One thing to be aware of is that PyGTK+ actually sets the Python
 Unicode object encoding to UTF-8.
 
 http://bugzilla.gnome.org/show_bug.cgi?id=132040
 
 I mention this because PyGTK is a very popular library related to
 Python and Linux.  So currently if you import gtk, then libraries
 which are using UTF-8 (as you say, the vast majority) will work with
 Python unicode objects unmodified.

Are you suggesting that John should rely on a bug in some 3rd party
extension instead of fixing the Python extension to use es# where
needed ?

There's a good reason why we don't allow setting the default
encoding outside site.py.

Trying to play tricks to change the default encoding later on
will only cause problems, e.g. the cached default encoded versions
of Unicode objects will then use different encodings - the one set
in site.py and later the ones with the new encoding. As a result,
all kinds of weird things can happen.

Using the Python Unicode C API really isn't all that hard and it's
well documented too, so please use it instead of trying to design
software based on workarounds.
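
The same principle can be sketched at the Python level: encode explicitly at the boundary instead of relying on a process-wide default. This is only an analogue of what the "es#" parser format does in C, where the encoding name is passed explicitly. The names to_utf8 and fake_c_call are hypothetical; fake_c_call merely stands in for a wrapped C function that expects UTF-8 bytes:

```python
def to_utf8(value):
    """Return UTF-8 encoded bytes for text; pass binary data through."""
    if isinstance(value, str):
        return value.encode("utf-8")
    if isinstance(value, (bytes, bytearray)):
        return bytes(value)
    raise TypeError("expected str or bytes, got %s" % type(value).__name__)

def fake_c_call(raw):
    """Hypothetical stand-in for a C function that expects UTF-8 bytes."""
    if not isinstance(raw, bytes):
        raise TypeError("the C layer needs bytes")
    return len(raw)

print(fake_c_call(to_utf8("Straße")))   # 7
```

The encoding is named at the call site, so the result does not depend on any global setting.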

Thanks,
-- 
Marc-Andre Lemburg
eGenix.com

Re: [Python-Dev] int/float freelists vs pymalloc

2008-02-13 Thread M.-A. Lemburg

On 2008-02-13 08:02, Andrew MacIntyre wrote:
 Christian Heimes wrote:
 Andrew MacIntyre wrote:
 I tried a LIFO stack implementation (though I won't claim to have done it
 well), and found it slightly slower than no freelist at all. The
 advantage of such an approach is that the known size of the stack makes
 deallocating excess objects easy (and thus no need for
 sys.compact_free_list() ).
 I've tried a single linked free list myself. I used the ob_type field to
 daisy chain the int and float objects. Although the code was fairly
 short it was slightly slower than an attempt without a free list at all.
 pymalloc is fast. It's very hard to beat it though.
 
 I'm speculating that CPU cache effects can make these differences.  The
 performance of the current trunk float freelist is depressing, given that
 the same strategy works so well for ints.
 
 I seem to recall Tim Peters paying a lot of attention to cache effects
 when he went over the PyMalloc code before the 2.3 release, which would
 contribute to its performance.
 
 A fixed size LIFO array like PyFloatObject
 *free_list[PyFloat_MAXFREELIST] increased the speed slightly. IMHO a
 value of about 80-200 floats and ints is realistic for most apps. More
 objects in the free lists could keep too many pymalloced areas occupied.
 
 I tested the updated patch you added to issue 2039.  With the int
 freelist set to 500 and the float freelist set to 100, it's about the same
 as the no-freelist version for my tests, but PyBench shows the simple
 float arithmetic to be about 10% better.
 
 I'm inclined to set the int LIFO a bit larger than you suggest, simply as
 ints are so commonly used - hence the value of 500 I used.  Floats are
 much less common by comparison.  Even an int LIFO of 500 is only going to
 tie up ~8kB on a 32bit box (~16kB on 64bit), which is insignificant
 enough that I can't see a need for a compaction routine.
 
 A 200 entry float LIFO would only account for ~4kB on 32bit (~8kB on
 64bit).

It is difficult to tell what good limits for free lists should
be. This depends a lot on the application focus, e.g. a financial
application is going to need lots of floats, while a word
processor or parser will need more integers.

I think the main difference between the current free list
implementation and Christian's patches is that the current
implementation bypasses pymalloc altogether and allocates
the objects directly using the system malloc().

The objects in the free list then cannot artificially keep
pymalloc pools alive.

Furthermore, the current free list implementation works
by allocating 1k chunks of memory for more than just one
object whenever it finds that the free list is empty.

Christian's patches and your free list removal patch cause
all allocations to be done via pymalloc. Christian's free
list can also cause nearly empty pymalloc pools to stay
alive due to the use of a linked list rather than an
array of objects.

Finally (and I don't know if you've missed that), the integer
implementation uses sharing for small integers. In the current
implementation all integers from -5 to 256 (inclusive) are only ever
allocated once and then reused whenever an integer in this
range is needed. The shared integers are not subject to any
of the extra free list handling or pymalloc overhead.

This results in a significant boost, since integers in this
range are *very* common and also causes the comparison between
integers and floats to become biased - floats don't have
this optimization.
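
The sharing can be observed from Python. The sketch below assumes a current CPython 3 interpreter, where the same small-integer cache exists; it is an implementation detail, not a language guarantee. The helper name is_shared is hypothetical:

```python
def is_shared(value):
    """Return True if two independently created ints are one object."""
    a = int(str(value))   # created at run time, so no constant folding
    b = int(str(value))
    return a is b

for n in (-5, 0, 256):
    print(n, is_shared(n))
```

Values outside the cached range usually give separate objects, which is exactly the case where the free list or the allocator comes into play.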

I still think that dropping the free lists can be worthwhile,
but pymalloc would need to get further optimizations to give
better performance for often requested size classes (the 16 byte
class on 32bit architectures, the 24 byte class on 64bit
architectures).

-- 
Marc-Andre Lemburg
eGenix.com

Re: [Python-Dev] int/float freelists vs pymalloc

2008-02-13 Thread M.-A. Lemburg

On 2008-02-13 12:56, Andrew MacIntyre wrote:
 I'm not that interested in debating the detail of exactly how big the
 prospective LIFO freelists are - I just want to see the situation
 resolved with maximum utilisation of memory for minimum performance 
 penalty.  To that end, +1 from me for accepting your revised patch 
 against issue 2039.  In addition, unless there are other reasons to
 retain it, I would be suggesting that the freelist compaction
 infrastructure you introduced in r60567 be removed for lack of practical 
 utility (assuming acceptance of your patch).

If we're down to voting, here's my vote:

+1 on removing the freelists from ints and floats, but not the
   small int sharing optimization

+1 on focusing on improving pymalloc to handle int and float
   object allocations even better

-1 on changing the freelist implementations to use pymalloc for
   allocation of the freelist members instead of malloc, since
   this would potentially lead to pools (and arenas) being held alive
   by just a few objects - in the worst case a whole arena (256kB)
   for just one int object (12 bytes on 32bit platforms).
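
The worst case in the last vote can be put into numbers. A rough sketch, using the 256kB arena size quoted above, the 4096 byte pool size and the 16 byte size class for 32bit builds mentioned in other messages of this thread; the function name is hypothetical:

```python
ARENA_SIZE = 256 * 1024   # bytes per arena, as quoted in the vote above
POOL_SIZE = 4 * 1024      # bytes per pool, as quoted elsewhere in the thread

def worst_case_overhead(block_size):
    """Bytes of arena held per byte in use when only a single block
    of the given size is still alive in that arena."""
    return ARENA_SIZE / block_size

print(ARENA_SIZE // POOL_SIZE)     # 64
print(worst_case_overhead(16))     # 16384.0
```

So a single surviving small object can pin 64 pools, which is why the allocator used for free list members matters.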

Eventually, all freelists should be removed, unless there's a
significant performance loss.

-- 
Marc-Andre Lemburg
eGenix.com

Re: [Python-Dev] int/float freelists vs pymalloc

2008-02-08 Thread M.-A. Lemburg

On 2008-02-08 08:21, Martin v. Löwis wrote:
 One of the hopes of having a custom allocator for Python was to be
 able to get rid of all free lists. For some reason that never happened.
 Not sure why. People were probably too busy with adding new
 features to the language at the time ;-)
 
 Probably not. It's more that the free lists still outperformed pymalloc.
 
 Something you could try to make PyMalloc perform better for the builtin
 types is to check the actual size of the allocated PyObjects and then
 make sure that PyMalloc uses arenas large enough to hold a good quantity
 of them, e.g. it's possible that the float types fall into the same
 arena as some other type and thus don't have enough room to use
 as free list.
 
 I don't think any improvements can be gained here. PyMalloc carves
 out pools of 4096 bytes from an arena when it runs out of blocks
 for a certain size class, and then keeps a linked list of pools of
 the same size class. So when many float objects get allocated,
 you'll have a lot of pools of the float type's size class.
 IOW, PyMalloc has always enough room.

Well, yes, it doesn't run out of memory, but if pymalloc needs
to allocate lots of objects of the same size, then performance
degrades due to the management overhead involved for checking
the free pools as well as creating new arenas as needed.

To reduce this overhead, it may be a good idea to preallocate
pools for common sizes and make sure they don't drop under a
certain threshold.
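
For orientation, a sketch of how a request size maps to a pymalloc size class. It assumes the allocator parameters of that era (8 byte alignment, requests above 256 bytes passed on to the system malloc) and ignores the pool header, so the per-pool counts are upper bounds only:

```python
ALIGNMENT = 8                    # assumption: sizes are rounded up to 8 bytes
SMALL_REQUEST_THRESHOLD = 256    # assumption: larger requests bypass pymalloc
POOL_SIZE = 4096                 # pool size quoted above

def size_class(nbytes):
    """Return the block size used for a request, or None if too large."""
    if nbytes > SMALL_REQUEST_THRESHOLD:
        return None
    return (nbytes + ALIGNMENT - 1) // ALIGNMENT * ALIGNMENT

def max_blocks_per_pool(nbytes):
    """Upper bound on blocks per pool (the pool header is ignored)."""
    block = size_class(nbytes)
    return None if block is None else POOL_SIZE // block

for n in (12, 24, 41, 848):
    print(n, size_class(n), max_blocks_per_pool(n))
```

Preallocating pools for the most common classes would amount to keeping a few such pools ready per size class.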

Here's a list of a few object sizes in bytes for Python 2.5
on an AMD64 machine:

>>> import mx.Tools
>>> mx.Tools.sizeof(int(0))
24
>>> mx.Tools.sizeof(float(0))
24

8-bit strings are var objects:

>>> mx.Tools.sizeof(str(''))
40
>>> mx.Tools.sizeof(str('a'))
41

Unicode objects use an external buffer:

>>> mx.Tools.sizeof(unicode(''))
48
>>> mx.Tools.sizeof(unicode('a'))
48

Lists do as well:

>>> mx.Tools.sizeof(list())
40
>>> mx.Tools.sizeof(list([1,2,3]))
40

Tuples are var objects:

>>> mx.Tools.sizeof(tuple())
24
>>> mx.Tools.sizeof(tuple([1,2,3]))
48

Old style classes:

>>> class C: pass
...
>>> mx.Tools.sizeof(C)
64

New style classes are a lot heavier:

>>> class D(object): pass
...
>>> mx.Tools.sizeof(D)
848

>>> mx.Tools.sizeof(type(2))
848


As you can see, integers and floats fall into the same pymalloc size
class. What's strange in Andrew's result is that both integers
and floats use the same free list technique and fall into the same
pymalloc size class, yet the results are different.

The only difference that's apparent is that small integers are
shared, so depending on the data set used for the test, fewer
calls to pymalloc or the free list are made.
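
Readers without mx.Tools can get comparable figures with sys.getsizeof(), which was added in Python 2.6. A sketch for a current Python 3; the numbers will differ from the ones above, and sys.getsizeof() also counts things such as a list's item array, which the figures above apparently do not include. The helper name report is hypothetical:

```python
import sys

def report(samples):
    """Return (repr, size in bytes) pairs for the given objects."""
    return [(repr(obj), sys.getsizeof(obj)) for obj in samples]

for name, size in report([0, 0.0, "", "a", [], [1, 2, 3], (), (1, 2, 3)]):
    print("%-12s %d" % (name, size))
```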

-- 
Marc-Andre Lemburg
eGenix.com

Re: [Python-Dev] int/float freelists vs pymalloc

2008-02-08 Thread M.-A. Lemburg

On 2008-02-08 19:28, Christian Heimes wrote:
 In addition to the pure performance aspect, there is the issue of memory
 utilisation.  The current trunk code running the int test case in my
 original post peaks at 151MB according to top on my FreeBSD box, dropping
 back to about 62MB after the dict is destroyed (without a compaction).
 The same script running on the no-freelist build of the interpreter peaks
 at 119MB, with a minimum of around 57MB.
 
 I wonder why the free list has such a huge impact in memory usage. Int
 objects are small (4 byte pointer to type, 4 byte Py_ssize_t and 4 byte
 value). A thousand int objects should consume less than 20kB including
 overhead and padding.

The free lists keep parts of the pymalloc pools alive.
Since these are only returned to the OS if the whole pool is
unused, a single object could keep 4k of memory associated
with the process.

I suppose that the remaining few MBs shown by the OS are not
really used by the process, but simply kept associated with
the process by the OS in case it quickly needs more memory.

In order to be sure about the true memory usage, you'd have
to force the OS to grab all available memory, e.g. by running
a huge process right next to the one you're testing.
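
A different way to separate what the interpreter has allocated from what the OS reports is to ask the allocator itself. The sketch below uses the tracemalloc module from current Python 3, which did not exist when this was written; it counts memory requested through Python's allocators and says nothing about what top shows. The function name measure is hypothetical:

```python
import tracemalloc

def measure(n=100000):
    """Return traced bytes with the dict alive, after freeing it, and peak."""
    tracemalloc.start()
    data = dict((i, float(i)) for i in range(n))
    with_data, _ = tracemalloc.get_traced_memory()
    del data
    after_free, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return with_data, after_free, peak

with_data, after_free, peak = measure()
print("with dict: %d  after free: %d  peak: %d" % (with_data, after_free, peak))
```

Comparing these figures with the process size reported by the OS gives an idea of how much memory is merely kept associated with the process.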

-- 
Marc-Andre Lemburg
eGenix.com

Re: [Python-Dev] int/float freelists vs pymalloc

2008-02-07 Thread M.-A. Lemburg

On 2008-02-07 14:09, Andrew MacIntyre wrote:
 Probably in response to the same stimulus as Christian it occurred to me
 that the freelist approach had been adopted long before PyMalloc was
 enabled as standard (in 2.3), and that much of the performance gains
 between 2.2 and 2.3 were in fact due to PyMalloc.

One of the hopes of having a custom allocator for Python was to be
able to get rid of all free lists. For some reason that never happened.
Not sure why. People were probably too busy with adding new
features to the language at the time ;-)

Something you could try to make PyMalloc perform better for the builtin
types is to check the actual size of the allocated PyObjects and then
make sure that PyMalloc uses arenas large enough to hold a good quantity
of them, e.g. it's possible that the float types fall into the same
arena as some other type and thus don't have enough room to use
as free list.

-- 
Marc-Andre Lemburg
eGenix.com

[Python-Dev] Limit free list of method and builtin function objects (was: [Python-checkins] r60614 - in python/trunk: Misc/NEWS Objects/classobject.c Objects/methodobject.c)

2008-02-06 Thread M.-A. Lemburg

Hi Christian,

could you explain how you came up with the 256 entry limit ?
It appears to be rather low and somewhat arbitrary.

I understand that some limit is required, but since these
objects get created a lot (e.g. for bound methods), setting the
limit too low will significantly slow down the interpreter.

BTW: What does pybench have to say about this patch ?

To get an idea of how many objects are typically part of the
free list, I'd suggest running an application such as Zope for
a while and then check the maximum numfree value.
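
To illustrate the kind of measurement meant here, a Python model of a bounded LIFO free list that records its high-water mark. This is not the C code from the checkin; the class BoundedFreeList and its attribute names are hypothetical, with numfree and the limit named after the variables in the patch:

```python
class BoundedFreeList:
    """Bounded LIFO free list that remembers the largest numfree seen."""

    def __init__(self, factory, maxfree=256):
        self._factory = factory
        self._free = []
        self.maxfree = maxfree
        self.max_numfree = 0      # high-water mark of cached objects

    @property
    def numfree(self):
        return len(self._free)

    def acquire(self):
        # Reuse a cached object if possible, else create a new one.
        if self._free:
            return self._free.pop()
        return self._factory()

    def release(self, obj):
        # Cache the object unless the list is full; otherwise drop it.
        if len(self._free) < self.maxfree:
            self._free.append(obj)
            self.max_numfree = max(self.max_numfree, len(self._free))

pool = BoundedFreeList(list, maxfree=4)
objs = [pool.acquire() for _ in range(10)]
for obj in objs:
    pool.release(obj)
print(pool.numfree, pool.max_numfree)   # 4 4
```

Running a real workload with a very high limit and reading max_numfree afterwards would show whether 256 is generous or tight.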

Thanks,
-- 
Marc-Andre Lemburg
eGenix.com

On 2008-02-06 13:44, christian.heimes wrote:
 Author: christian.heimes
 Date: Wed Feb  6 13:44:34 2008
 New Revision: 60614
 
 Modified:
python/trunk/Misc/NEWS
python/trunk/Objects/classobject.c
python/trunk/Objects/methodobject.c
 Log:
 Limit free list of method and builtin function objects to 256 entries each.
 
 Modified: python/trunk/Misc/NEWS
 ==
 --- python/trunk/Misc/NEWS(original)
 +++ python/trunk/Misc/NEWSWed Feb  6 13:44:34 2008
 @@ -12,6 +12,9 @@
  Core and builtins
  -
  
 +- Limit free list of method and builtin function objects to 256 entries
 +  each.
 +
  - Patch #1953: Added ``sys._compact_freelists()`` and the C API functions
``PyInt_CompactFreeList`` and ``PyFloat_CompactFreeList``
to compact the internal free lists of pre-allocted ints and floats.
 
 Modified: python/trunk/Objects/classobject.c
 ==
 --- python/trunk/Objects/classobject.c(original)
 +++ python/trunk/Objects/classobject.cWed Feb  6 13:44:34 2008
 @@ -4,10 +4,16 @@
  #include "Python.h"
  #include "structmember.h"
  
 +/* Free list for method objects to safe malloc/free overhead
 + * The im_self element is used to chain the elements.
 + */
 +static PyMethodObject *free_list;
 +static int numfree = 0;
 +#define MAXFREELIST 256
 +
  #define TP_DESCR_GET(t) \
  (PyType_HasFeature(t, Py_TPFLAGS_HAVE_CLASS) ? (t)->tp_descr_get : NULL)
  
 -
  /* Forward */
  static PyObject *class_lookup(PyClassObject *, PyObject *,
 PyClassObject **);
 @@ -2193,8 +2199,6 @@
 In case (b), im_self is NULL
  */
  
 -static PyMethodObject *free_list;
 -
  PyObject *
  PyMethod_New(PyObject *func, PyObject *self, PyObject *klass)
  {
 @@ -2207,6 +2211,7 @@
   if (im != NULL) {
   free_list = (PyMethodObject *)(im->im_self);
   PyObject_INIT(im, &PyMethod_Type);
 + numfree--;
   }
   else {
   im = PyObject_GC_New(PyMethodObject, &PyMethod_Type);
 @@ -2332,8 +2337,14 @@
   Py_DECREF(im->im_func);
   Py_XDECREF(im->im_self);
   Py_XDECREF(im->im_class);
 - im->im_self = (PyObject *)free_list;
 - free_list = im;
 + if (numfree < MAXFREELIST) {
 + im->im_self = (PyObject *)free_list;
 + free_list = im;
 + numfree++;
 + }
 + else {
 + PyObject_GC_Del(im);
 + }
  }
  
  static int
 @@ -2620,5 +2631,7 @@
   PyMethodObject *im = free_list;
   free_list = (PyMethodObject *)(im->im_self);
   PyObject_GC_Del(im);
 + numfree--;
   }
 + assert(numfree == 0);
  }
 
 Modified: python/trunk/Objects/methodobject.c
 ==
 --- python/trunk/Objects/methodobject.c   (original)
 +++ python/trunk/Objects/methodobject.c   Wed Feb  6 13:44:34 2008
 @@ -4,7 +4,12 @@
  #include "Python.h"
  #include "structmember.h"
  
 +/* Free list for method objects to safe malloc/free overhead
 + * The m_self element is used to chain the objects.
 + */
  static PyCFunctionObject *free_list = NULL;
 +static int numfree = 0;
 +#define MAXFREELIST 256
  
  PyObject *
  PyCFunction_NewEx(PyMethodDef *ml, PyObject *self, PyObject *module)
 @@ -14,6 +19,7 @@
   if (op != NULL) {
   free_list = (PyCFunctionObject *)(op->m_self);
   PyObject_INIT(op, &PyCFunction_Type);
 + numfree--;
   }
   else {
   op = PyObject_GC_New(PyCFunctionObject, &PyCFunction_Type);
 @@ -125,8 +131,14 @@
   _PyObject_GC_UNTRACK(m);
   Py_XDECREF(m->m_self);

Re: [Python-Dev] trunc()

2008-01-28 Thread M.-A. Lemburg

On 2008-01-27 08:14, Raymond Hettinger wrote:
 . You may disagree, but that doesn't make it nuts.
 
 Too many thoughts compressed into one adjective ;-)
 
 Deprecating int(float)-->int may not be nuts, but it is disruptive.
 
 Having both trunc() and int() in Py2.6 may not be nuts, but it is duplicative 
 and confusing.
 
 The original impetus for facilitating a new Real type being able to trunc() 
 into a new Integral type may not be nuts, but the use 
 case seems far fetched (we've never had a feature request for it -- the 
 notion was born entirely out of numeric tower 
 considerations).
 
 The idea that programmers are confused by int(3.7)-->3 may not be nuts, but 
 it doesn't match any experience I've had with any 
 programmer, ever.
 
 The idea that trunc() is beneficial may not be nuts, but it is certainly 
 questionable.
 
 In short, the idea may not be nuts, but I think it is legitimate to suggest 
 that it is unnecessary and that it will do more harm 
 than good.

All this reminds me a lot of discussions we've had when we
needed a new way to spell out string.join().

In the end, we ended up adding a method to strings (thanks to
Tim Peters, IIRC) instead of adding a builtin join().

Since all of the suggested builtins are only meant to work on
floats, why not simply add methods for them to the float object ?!

E.g.

x = 3.141
print x.trunc(), x.floor(), x.ceil()

etc.

This approach also makes it possible to write types or classes
that expose the same API without having to resort to new special
methods (we have too many of those already).

Please consider that type constructors have a different scope
than helper functions. Helper functions should only be made builtins
if they are really really useful and often needed. If they don't
meet these criteria, they are better off in a separate module.
I don't see any of the suggested helper functions meeting these
criteria and we already have math.floor() and math.ceil().
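
For comparison, this is what the existing spellings give in a current Python 3. The method form x.trunc() suggested above is not available on float in released Python; math.trunc() is the function that was added in 2.6:

```python
import math

x = 3.141
print(int(x), math.trunc(x), math.floor(x), math.ceil(x))   # 3 3 3 4

y = -3.7
print(int(y), math.trunc(y), math.floor(y), math.ceil(y))   # -3 -3 -4 -3
```

The negative case shows the only place where the three operations differ: truncation goes toward zero, floor toward minus infinity.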

-- 
Marc-Andre Lemburg
eGenix.com

Re: [Python-Dev] trunc()

2008-01-25 Thread M.-A. Lemburg

On 2008-01-25 21:26, Steve Holden wrote:
 Antoine Pitrou wrote:
 Raymond Hettinger python at rcn.com writes:
 Go ask a dozen people if they are surprised that int(3.7) returns 3.
 No one will be surprised (even folks who just use Excel or VB). It
 is foolhardy to be a purist and rage against the existing art:

 Well, for what it's worth, here are MySQL's own two cents:

 mysql> create table t (a int);
 Query OK, 0 rows affected (0.00 sec)

 mysql> insert t (a) values (1.4), (1.6), (-1.6), (-1.4);
 Query OK, 4 rows affected (0.00 sec)
 Records: 4  Duplicates: 0  Warnings: 0

 mysql> select * from t;
 +------+
 | a    |
 +------+
 |    1 |
 |    2 |
 |   -2 |
 |   -1 |
 +------+
 4 rows in set (0.00 sec)

 Two points. Firstly, regarding MySQL as authoritative from a standards 
 point of view is bound to lead to trouble, since they have always played 
 fast and loose with the standard for reasons (I suspect) of 
 implementation convenience.
 
 Second, that example isn't making use of the INT() function. I was going 
 to show you the result of taking the INT() of a float column containing your 
 test values. That was when I found out that MySQL (5.0.41, anyway) 
 doesn't implement the INT() function. What was I saying about standards?

FWIW, here's what IBM has to say about this:

http://publib.boulder.ibm.com/infocenter/db2luw/v9/index.jsp?topic=/com.ibm.db2.udb.admin.doc/doc/r814.htm

If the argument is a numeric-expression, the result is the same number
that would occur if the argument were assigned to a large integer column
or variable. If the whole part of the argument is not within the range of
integers, an error occurs. The decimal part of the argument is truncated
if present.

AFAIK, the INTEGER() function is not part of the SQL standard, at
least not of SQL92:

http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt

The way to convert a value to an integer is by casting it to
one, e.g. CAST (X AS INTEGER). The INT() function is basically
a short-cut for this.
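
To make the difference between truncation and rounding concrete, here is a small sketch in Python. It assumes Python 3 semantics (where round() and math.floor() return integers) and reuses the four test values from the MySQL session quoted above; it is only an illustration, not a statement about any particular database.

```python
import math

values = [1.4, 1.6, -1.6, -1.4]

# int() drops the fractional part, i.e. it truncates toward zero.
truncated = [int(v) for v in values]
# round() picks the nearest integer, which matches the MySQL output above.
rounded = [round(v) for v in values]
# floor() and ceil() always move in one direction.
floored = [math.floor(v) for v in values]
ceiled = [math.ceil(v) for v in values]

print(truncated)  # [1, 1, -1, -1]
print(rounded)    # [1, 2, -2, -1]
print(floored)    # [1, 1, -2, -2]
print(ceiled)     # [2, 2, -1, -1]
```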

Regards,
-- 
Marc-Andre Lemburg
eGenix.com

Re: [Python-Dev] PEP: per user site-packages directory

2008-01-22 Thread M.-A. Lemburg

I don't really understand what all this has to do with per user
site-packages.

Note that the motivation for having per user site-packages
was to:

 * address a common request by Python extension package users,

 * get rid of the hackery done by setuptools in order
   to provide this.

As such the PEP can also be seen as an effort to enable code
cleanup *before* adding e.g. pkg_resources to the stdlib.

Cheers,
-- 
Marc-Andre Lemburg
eGenix.com

On 2008-01-21 16:06, Nick Coghlan wrote:
 Steve Holden wrote:
 Christian Heimes wrote:
 Steve Holden wrote:
 Maybe once we get easy_install as a part of the core (so there's no need
 to find and run ez_setup.py to start with) things will start to improve.
 This is an issue the whole developer community needs to take seriously
 if we are interested in increasing take-up.
 setuptools and easy_install won't be included in Python 2.6 and 3.0:
 http://www.python.org/dev/peps/pep-0365/

 Yes, and yet another release (two releases) will go out without easy 
 access to the functionality in Pypi. PEP 365 is a good start, but Pypi 
 loses much of its point until new Python users get access to it out of 
 the box. I also appreciate that resource limitations are standing in 
 the way of setuptools' inclusion (is there something I can do about 
 that?) Just to hammer the point home, however ...
 
 Have another look at the rationale given in PEP 365 - it isn't the 
 resourcing to do the work that's a problem, but the relatively slow 
 release cycle of the core.
 
 By including pkg_resources in the core (with the addition of access to 
 pure Python modules and packages on PyPI), we would get a simple, stable 
 base for Python packaging to work from, and put users a single standard 
 command away from the more advanced (but also more volatile) features of 
 easy_install and friends.
 
 Cheers,
 Nick.
 

Re: [Python-Dev] #! magic

2008-01-22 Thread M.-A. Lemburg

On 2008-01-20 19:30, Christian Heimes wrote:
 Yet another python executable could solve the issue, named pythons as
 python secure.
 
 /*
  gcc -DNDEBUG -g -O2 -Wall -Wstrict-prototypes -IInclude -I. -pthread
  -Xlinker -lpthread -ldl -lutil -lm -export-dynamic -o pythons2.6
  pythons.c libpython2.6.a
  */
 
 #include <Python.h>
 #include <limits.h>   /* INT_MAX */
 
 int main(int argc, char **argv) {
     /* disable some possibly harmful features */
     Py_IgnoreEnvironmentFlag++;
     Py_NoUserSiteDirectory++;
     /* push these far below zero so a later increment cannot make them positive */
     Py_InteractiveFlag -= INT_MAX;
     Py_InspectFlag -= INT_MAX;
 
     return Py_Main(argc, argv);
 }
 
 $ ./pythons2.6
 Python 2.6a0 (:59956M, Jan 14 2008, 22:09:17)
 [GCC 4.2.1 (Ubuntu 4.2.1-5ubuntu4)] on linux2
 Type "help", "copyright", "credits" or "license" for more information.
 >>> import sys
 >>> sys.flags
 sys.flags(debug=0, py3k_warning=0, division_warning=0, division_new=0,
 inspect=-2147483647, interactive=-2147483647, optimize=0,
 dont_write_bytecode=0, no_user_site=1, no_site=0, ingnore_environment=1,

Is this (ingnore_environment, at the end of the line above) a
copy-and-paste error or a typo in the code ?

 tabcheck=0, verbose=0, unicode=0)

To make this even more secure, you'd have to package this up
together with a copy of the stdlib, much like mxCGIPython does
(or did... I have to revive that project at some point :-):

http://www.egenix.com/www2002/python/mxCGIPython.html
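
As a rough illustration of how a script could verify that it runs under such a locked-down interpreter, the sketch below inspects sys.flags from Python code. The helper name is_locked_down is hypothetical, and the attribute names are those of Python 3's sys.flags; on a stock interpreter the -E and -s command-line options set the first two flags.

```python
import sys


def is_locked_down(flags=sys.flags):
    """Return True if PYTHON* variables and the user site directory are
    ignored and neither inspect nor interactive mode is switched on.

    The helper name and the choice of flags are assumptions made for
    this sketch.
    """
    return (
        flags.ignore_environment > 0
        and flags.no_user_site > 0
        and flags.inspect <= 0
        and flags.interactive <= 0
    )


if __name__ == "__main__":
    # Compare e.g. "python script.py" with "python -E -s script.py".
    print(is_locked_down())
```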

-- 
Marc-Andre Lemburg
eGenix.com

Re: [Python-Dev] PEP: per user site-packages directory

2008-01-14 Thread M.-A. Lemburg

On 2008-01-14 22:23, Christian Heimes wrote:
 The PEP is now available at http://www.python.org/dev/peps/pep-0370/.
 The reference implementation is in svn, too:
 svn+ssh://[EMAIL PROTECTED]/sandbox/trunk/pep370

Thanks for the effort, Christian. Much appreciated.

Regarding the recent ~/bin vs. ~/.local/bin discussion:

I usually maintain my ~/bin directories by hand and wouldn't want
any application to install things in there automatically (and so far
I haven't been using any application that does), so I'd be
in favor of the ~/.local/bin dir.

Note that users typically don't know which scripts are made
available by a Python application and it's not always clear
what functionality they provide, whether they can be trusted,
include bugs, need to be run with extra care, etc, so IMHO
making it a little harder to run them by accident is well
warranted.

Thanks again,
-- 
Marc-Andre Lemburg
eGenix.com

Re: [Python-Dev] Backporting PEP 3101 to 2.6

2008-01-10 Thread M.-A. Lemburg

On 2008-01-10 14:31, Eric Smith wrote:
 (I'm posting to python-dev, because this isn't strictly 3.0 related.
 Hopefully most people read it in addition to python-3000).
 
 I'm working on backporting the changes I made for PEP 3101 (Advanced
 String Formatting) to the trunk, in order to meet the pre-PyCon release
 date for 2.6a1.
 
 I have a few questions about how I should handle str/unicode.  3.0 was
 pretty easy, because everything was unicode.

Since this is a new feature, why bother with strings at all
(even in 2.6) ?

Use Unicode throughout and be done with it.

 1: How should the builtin format() work?  It takes 2 parameters, an
 object o and a string s, and returns o.__format__(s).  If s is None, it
 returns o.__format__(empty_string).  In 3.0, the empty string is of
 course unicode.  For 2.6, should I use u'' or ''?
 
 
 2: In 3.0, object.__format__() is essentially this:
 
 class object:
     def __format__(self, format_spec):
         return format(str(self), format_spec)
 
 In 2.6, I assume it should be the equivalent of:
 
 class object:
     def __format__(self, format_spec):
         if isinstance(format_spec, str):
             return format(str(self), format_spec)
         elif isinstance(format_spec, unicode):
             return format(unicode(self), format_spec)
         else:
             raise TypeError("format_spec must be str or unicode")
 
  Does that seem right?
 
 
 3: Every overridden __format__() method is going to have to check for
 string or unicode, just like object.__format__() does, and return either a
 string or unicode object, appropriately.  I don't see any way around
 this, but I'd like to hear any thoughts.  I guess there aren't all that
 many __format__ methods that will be implemented, so this might not be a
 big burden.  I'll of course implement the built in ones.
 
 Thanks in advance for any insights.
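
For readers who have not used the protocol under discussion, here is a minimal sketch of how format() and __format__() interact. It assumes Python 3, where every string is Unicode and the str/unicode question above does not arise; the Point class is purely illustrative.

```python
class Point:
    """Illustrative class; not part of any library."""

    def __init__(self, x, y):
        self.x, self.y = x, y

    def __str__(self):
        return "(%s, %s)" % (self.x, self.y)

    def __format__(self, format_spec):
        # Delegate to the string formatting rules, as in the
        # object.__format__() sketch quoted above.
        return format(str(self), format_spec)


p = Point(1, 2)
print(format(p))              # empty format spec: same as str(p)
print(format(p, ">10"))       # right-aligned in a field of width 10
print("{0:<10}|".format(p))   # str.format() calls __format__() as well
```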

-- 
Marc-Andre Lemburg
eGenix.com

Re: [Python-Dev] pkgutil, pkg_resource and Python 3.0 name space packages

2008-01-07 Thread M.-A. Lemburg

On 2008-01-07 14:57, Fred Drake wrote:
 On Jan 7, 2008, at 7:48 AM, M.-A. Lemburg wrote:
 Next, we add a per-user site-packages directory to the standard
 sys.path, and then we could get rid of most of the setuptools
 import and sys.path hackery, making it a lot cleaner.
 
 
 PYTHONPATH already provides this functionality.  I see no need to
 duplicate that.

Agreed, but one of the main arguments for all the .pth file hackery in
setuptools is that having to change PYTHONPATH in order to enable
user installations of packages is too hard for the typical user.

We could easily resolve that issue, if we add a per-user site-packages
dir to sys.path in site.py (this is already done for Macs).
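
A minimal sketch of what such a step could look like, assuming a per-user base directory of ~/.python (a placeholder chosen for this example) and using the existing site.addsitedir() call, which appends the directory to sys.path and processes any .pth files found in it:

```python
import os
import site
import sys


def add_user_site(base="~/.python"):
    """Add <base>/site-packages to sys.path if the directory exists.

    Returns the directory that was added, or None if it does not exist.
    The default base directory is an assumption made for this sketch.
    """
    path = os.path.join(os.path.expanduser(base), "site-packages")
    if not os.path.isdir(path):
        return None
    site.addsitedir(path)  # also handles .pth files in that directory
    return path


if __name__ == "__main__":
    added = add_user_site()
    print(added, added is not None and os.path.abspath(added) in sys.path)
```

With something like this in site.py, a user would only have to create the directory; no PYTHONPATH change would be needed.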

-- 
Marc-Andre Lemburg
eGenix.com

Re: [Python-Dev] pkgutil, pkg_resource and Python 3.0 name space packages

2008-01-07 Thread M.-A. Lemburg

On 2008-01-07 17:24, Barry Warsaw wrote:
 On Jan 7, 2008, at 10:12 AM, Guido van Rossum wrote:
 
 On Jan 7, 2008 6:32 AM, Barry Warsaw [EMAIL PROTECTED] wrote:
 On Jan 7, 2008, at 9:01 AM, M.-A. Lemburg wrote:
 We could easily resolve that issue, if we add a per-user site-packages
 dir to sys.path in site.py (this is already done for Macs).

 +1.  I've advocated that for years.
 
 I'm not sure what this buys given that you can do this using
 PYTHONPATH anyway, but because of that I also can't be against it. +0
 from me. Patches for 2.6 gratefully accepted.
 
 I think it's PEP-worthy too, just so that the semantics get nailed
 down.  Here's a strawman proto-quasi-pre-PEP.
 
 Python automatically adds ~/.python/site-packages to sys.path; this is
 added /before/ the system site-packages file.  An open question is
 whether it needs to go at the front of the list.  It should definitely
 be searched before the system site-packages.
 
 Python treats ~/.python/site-packages the same as the system
 site-packages, w.r.t. .pth files, etc.
 
 Open question: should we add yet another environment variable to control
 this?  It's pretty typical for apps to expose such a thing so that the
 base directory (e.g. ~/.python) can be moved.

I'd suggest making the ~/.python part configurable via an
env var, e.g. PYTHONRESOURCES.

Perhaps we could use that directory for other Python-related
resources as well, e.g. an optional sys.path lookup cache (a pickled
dictionary of known package/module file locations to reduce Python
startup time).
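
To illustrate the idea, here is a rough sketch. The PYTHONRESOURCES variable and the ~/.python default come from the discussion above; the cache file name syspath.cache and all function names are made up for this example. Since unpickling executes code paths chosen by the file's contents, such a cache should only ever be read from a directory the user controls.

```python
import os
import pickle

CACHE_NAME = "syspath.cache"  # hypothetical file name


def resource_dir(environ=os.environ):
    """Return the per-user resource directory; the env var overrides the default."""
    return environ.get("PYTHONRESOURCES") or os.path.expanduser("~/.python")


def save_path_cache(cache, directory):
    """Pickle a {module name: file location} dictionary into the directory."""
    os.makedirs(directory, exist_ok=True)
    with open(os.path.join(directory, CACHE_NAME), "wb") as f:
        pickle.dump(cache, f)


def load_path_cache(directory):
    """Load the cache; fall back to an empty one if it is missing or damaged."""
    try:
        with open(os.path.join(directory, CACHE_NAME), "rb") as f:
            return pickle.load(f)
    except (OSError, EOFError, pickle.UnpicklingError):
        return {}
```

An empty result simply means that the interpreter would have to do the normal sys.path scan.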

 I think that's all that's needed.  It would make playing with
 easy_install/setuptools nicer to have this.

-- 
Marc-Andre Lemburg
eGenix.com

Re: [Python-Dev] Memory benchmarking?

2007-11-29 Thread M.-A. Lemburg

On 2007-11-29 11:52, Titus Brown wrote:
 Hi all,

 is there a good, or standard memory benchmarking system for Python?
 pybench doesn't return significantly different results when Python 2.6
 is compiled with pymalloc and without pymalloc. Thinking on it, I'm not
 too surprised -- pybench probably benchmarks a lot of stuff -- but some
 guidance on how/whether to benchmark different memory allocation schemes
 would be welcome.

pybench focuses on runtime performance, not memory usage. Its
way of creating and deleting objects is also highly non-standard
when compared to typical use of Python in real life applications.

It's also rather difficult to benchmark memory allocation, since
most implementations work with some sort of pre-allocation,
buffer pools or free lists.

If you want to use an approach similar to pybench's, i.e. benchmark
small parts of the interpreter instead of generating some grand
total, then you'd probably have to do this by spawning a separate
process per test.
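
A rough sketch of the one-process-per-test idea follows. It uses the tracemalloc module from current Python 3 releases as a stand-in measurement, which only sees allocations made through Python's own allocator hooks, so it is not a substitute for measuring the process as a whole. The helper name peak_bytes is hypothetical.

```python
import subprocess
import sys

# Code run in the child: trace allocations, execute the snippet passed
# as the first argument, then report the peak traced size in bytes.
_CHILD = """
import sys, tracemalloc
tracemalloc.start()
exec(sys.argv[1])
print(tracemalloc.get_traced_memory()[1])
"""


def peak_bytes(snippet):
    """Run the snippet in a fresh interpreter and return its peak traced memory."""
    result = subprocess.run(
        [sys.executable, "-c", _CHILD, snippet],
        capture_output=True, text=True, check=True,
    )
    # The peak is the last line printed by the child.
    return int(result.stdout.splitlines()[-1])


if __name__ == "__main__":
    for test in ("pass", "x = list(range(100000))", "x = 'a' * 1000000"):
        print("%-30s %10d bytes" % (test, peak_bytes(test)))
```

Because each snippet starts from a fresh interpreter, free lists and pools filled by one test cannot distort the numbers of the next one.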

refs:

http://code.google.com/p/google-highly-open-participation-psf/issues/detail?id=105&colspec=ID%20Status%20Summary

http://evanjones.ca/memoryallocator/

http://www.advogato.org/person/wingo/diary/225.html

--
Marc-Andre Lemburg
eGenix.com

Re: [Python-Dev] Build Notes for building trunk with Visual Studio 2008 Express Edition

2007-11-24 Thread M.-A. Lemburg

On 2007-11-23 23:12, Paul Moore wrote:
 On 23/11/2007, Christian Heimes [EMAIL PROTECTED] wrote:
 bsddb is automatically built by a build step. But you have to convert
 the project files in build_win32 to VS 2008 first. Simply open the
 solution file and let VS convert the projects.
 
 VS 2008 Express doesn't have a devenv command, so the pre-link step
 doesn't work. You need to open the bsddb project file, and build
 db_static by hand. For a debug Python, you need the Debug
 configuration, for a release Python you need the Release
 configuration. Beware - the default config is Debug_ASCII which is not
 checked by the pre-link step.
 
 So, from a checkout of Python, plus the various svn externals:
 
 - download nasm, install it somewhere on your PATH, and copy nasm.exe
 to nasmw.exe (Why did you use nasmw.exe rather than nasm.exe? Is there
 a difference in the version you have?)

The OpenSSL build process still uses the old nasmw.exe name
(the build instructions there are for the old NASM version,
but it also works with the latest NASM release).

The NASM project has recently changed the name of the executable
to nasm.exe.
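
The copy step can be scripted if you set up build machines repeatedly. The following is only a convenience sketch: the helper name ensure_nasmw is made up, and it assumes that a copy next to the original executable is acceptable.

```python
import os
import shutil


def ensure_nasmw(nasm_path=None):
    """Make sure a nasmw.exe exists next to nasm.exe; return its path.

    If no path is given, the executable is looked up on PATH.
    """
    nasm_path = nasm_path or shutil.which("nasm")
    if nasm_path is None:
        raise FileNotFoundError("nasm not found on PATH")
    target = os.path.join(os.path.dirname(nasm_path), "nasmw.exe")
    if not os.path.exists(target):
        shutil.copy2(nasm_path, target)  # keep the original in place
    return target
```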

 - Open the bsddb solution file, and build debug and release versions
 of db_static
 - Open the Python pcbuild solution file, and build the solution.
 
 You'll get a total of 2 failures and 18 successes. Of the failures,
 one (_sqlite3) is not actually fatal (the pre-link step fails, and
 that only the first time), and the module is actually built correctly.
 The other is _tkinter, which isn't sorted out yet.
 
 You can then run the tests with rt.bat. If you have an openssl.exe on
 your path, test_socket_ssl may hang. Otherwise, everything should
 pass, apart from test_tcl. (Actually, there's a failure in
 test_doctest right now, seems to have come in with r59137, but I don't
 have time to diagnose right now).
 
 This is the case for both trunk and py3k (ignoring genuine test failures).

-- 
Marc-Andre Lemburg
eGenix.com

Re: [Python-Dev] Build Notes for building trunk with Visual Studio 2008 Express Edition

2007-11-23 Thread M.-A. Lemburg

On 2007-11-23 16:59, Christian Heimes wrote:
 Paul Moore wrote:
 _ssl
 

 Christian has been making changes to allow this to build without Perl,
 so I gave it a try. I used openssl 0.9.8g, which I extracted to the
 build directory (I noticed afterwards that this is the same version as
 in Python svn, so I could have used the svn external!)

 I needed to download nasm (nasm.sf.net) version 2.00rc1, and rename
 nasm.exe to nasmw.exe and put it on my PATH.

 Build succeeded, no issues.
 
 You still need Perl if you are using an official download of openssl.
 I've added the pre-build assembly and makefiles in the svn external at
 svn.python.org

Why not include the prebuilt libraries of all external libs in SVN
as well ?

BTW: Are you including the patented algorithms in the standard
OpenSSL build or excluding them ?

The patented ones are RC5, IDEA and MDC2:

http://svn.python.org/view/external/openssl-0.9.8g/README

Here's a previous discussion:

http://mail.python.org/pipermail/python-dev/2006-August/068055.html

Here's what MediaCrypt has to say about requiring a license
for IDEA:

http://www.mediacrypt.com/_contents/20_support/204010_faq_bus.asp

Note that in the case of IDEA, any commercial use will require
getting a license to the patented algorithm first (costs start
at EUR 15 for a single use license).

I'd opt for not including these algorithms, as it's just
too easy for the user to overlook this license requirement.

-- 
Marc-Andre Lemburg
eGenix.com

Re: [Python-Dev] XML codec?

2007-11-12 Thread M.-A. Lemburg

On 2007-11-11 23:22, Martin v. Löwis wrote:
 First, XML-RPC is not the only mechanism using XML over a network
 connection. Second, you don't want to do this if you're dealing
 with several 100 MB of data just because you want to figure
 out the encoding.
 That's my original claim/question: what SPECIFIC application do
 you have in mind that transfers XML over a network and where you
 would want to have such a stream codec?
 XML-based web services used for business integration, e.g. based
 on ebXML.

 A common use case from our everyday consulting business is e.g.
 passing market and trading data to portfolio pricing web services.
 
 I still don't see the need for this feature from this example.
 First, in ebXML messaging, the messages are typically *not* large
 (i.e. much smaller than 100 MB). Furthermore, the typical processing
 of such a message would be to pass it directly to the XML parser,
 no need for the functionality under discussion.

I don't see the point in continuing this discussion. If you think
you know better, that's fine. Just please don't generalize this
to everyone else working with Python and XML.

 Right. However, I will remain opposed to adding this to the
 standard library until I see why one would absolutely need to
 have that. Not every piece of code that is useful in some
 application should be added to the standard library.
 Agreed, but the application space of web services is large
 enough to warrant this.
 
 If that was the case, wouldn't the existing Python web service
 libraries already include such a functionality?

No.

To finalize this:

We have a -1 from Martin and a +1 from Walter, Guido and myself.
Pretty clear vote if you ask me. I'd say we end the discussion here
and move on.

-- 
Marc-Andre Lemburg
eGenix.com

Re: [Python-Dev] XML codec?

2007-11-11 Thread M.-A. Lemburg

On 2007-11-11 14:51, Martin v. Löwis wrote:
 A non-seekable stream is not all that uncommon in network processing.
 Right. But what is the relationship to XML encoding autodetection?
 It pops up whenever you need to detect the encoding of the
 incoming XML data on the network connection, e.g. in XML RPC
 or data upload mechanisms.
 
 No, it doesn't. For XML-RPC, you pass the XML payload of the
 HTTP request to the XML parser, and it deals with the encoding.

First, XML-RPC is not the only mechanism using XML over a network
connection. Second, you don't want to do this if you're dealing
with several 100 MB of data just because you want to figure
out the encoding.

 It is also not always feasible to load all data into memory, so
 some form of buffering must be used.
 
 Again, I don't see the use case. For XML-RPC, it's very feasible
 and standard procedure to have the entire document in memory
 (in a processed form).

You may not see the use case, but that doesn't really mean
anything if the use cases exist in real life applications,
right ?!

 This approach is also needed if you want to stack stream codecs
 (not sure whether this is still possible in Py3, but that's how
 I designed them for Py2).
 
 The design of the Py2 codecs is fairly flawed, unfortunately.

Fortunately, this sounds like a fairly flawed argument to me ;-)

-- 
Marc-Andre Lemburg
eGenix.com

Re: [Python-Dev] XML codec?

2007-11-11 Thread M.-A. Lemburg

On 2007-11-11 18:56, Martin v. Löwis wrote:
 First, XML-RPC is not the only mechanism using XML over a network
 connection. Second, you don't want to do this if you're dealing
 with several 100 MB of data just because you want to figure
 out the encoding.
 
 That's my original claim/question: what SPECIFIC application do
 you have in mind that transfers XML over a network and where you
 would want to have such a stream codec?

XML-based web services used for business integration, e.g. based
on ebXML.

A common use case from our everyday consulting business is e.g.
passing market and trading data to portfolio pricing web services.

 If I have 100MB of XML in a file, using the detection API, I do
 
   f = open(filename)
   s = f.read(100)
   while True:
       coding = xml.utils.detect_encoding(s)
       if coding is not undetermined:
           break
       s += f.read(100)
   f.close()
 
 Having the loop here is paranoia: in my application, I might be
 able to know that 100 bytes are sufficient to determine the encoding
 always.

Doing the detection with files is easy, but that was never
questioned.

 Again, I don't see the use case. For XML-RPC, it's very feasible
 and standard procedure to have the entire document in memory
 (in a processed form).
 You may not see the use case, but that doesn't really mean
 anything if the use cases exist in real life applications,
 right ?!
 
 Right. However, I will remain opposed to adding this to the
 standard library until I see why one would absolutely need to
 have that. Not every piece of code that is useful in some
 application should be added to the standard library.

Agreed, but the application space of web services is large
enough to warrant this.

-- 
Marc-Andre Lemburg
eGenix.com

Re: [Python-Dev] XML codec?

2007-11-09 Thread M.-A. Lemburg

On 2007-11-09 14:10, Walter Dörwald wrote:
 Martin v. Löwis wrote:
 Yes, an XML parser should be able to use UTF-8, UTF-16, UTF-32, etc
 codecs to do the encoding.  There's no need to create a magical
 mystery codec to pick out which though.
 So the code is good, if it is inside an XML parser, and it's bad if it
 is inside a codec?
 Exactly so. This functionality just *isn't* a codec - there is no
 encoding. Instead, it is an algorithm for *detecting* an encoding.
 
 And what do you do once you've detected the encoding? You decode the
 input, so why not combine both into an XML decoder?

FWIW: I'm +1 on adding such a codec.

It makes working with XML data a lot easier: you simply don't have to
bother with the encoding of the XML data anymore and can just let the
codec figure out the details. The XML parser can then work directly
on the Unicode data.
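
To give an idea of what "figuring out the details" involves, here is a simplified sketch written in Python 3. It is not the code under discussion: it only looks at byte order marks and at an ASCII-compatible XML declaration, and does not handle UTF-16/32 without a BOM or EBCDIC. The function names are made up for this example.

```python
import codecs
import re

# Longest BOMs first, so that UTF-32 LE is not mistaken for UTF-16 LE.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]
_DECL = re.compile(
    rb'<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']')


def detect_xml_encoding(data):
    """Return an encoding name, or None if more bytes are needed."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    if not data.startswith(b"<?xml"):
        # Either too short to tell, or there is no declaration at all,
        # in which case XML defaults to UTF-8.
        return None if b"<?xml".startswith(data) else "utf-8"
    if b"?>" not in data:
        return None  # the declaration is not complete yet
    match = _DECL.search(data, 0, data.index(b"?>"))
    return match.group(1).decode("ascii").lower() if match else "utf-8"


def decode_xml(data):
    """Decode a complete XML document to text."""
    return data.decode(detect_xml_encoding(data) or "utf-8")
```

A codec would wrap this logic so that callers only see the decoded text; the None result is what allows a stream reader to keep buffering until the declaration is complete.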

Whether it needs to be in C or not is another question (I would have
done this in Python since performance is not really an issue), but since
the code is already written, why not use it ?

-- 
Marc-Andre Lemburg
eGenix.com

Re: [Python-Dev] XML codec?

2007-11-09 Thread M.-A. Lemburg

Martin v. Löwis wrote:
 It makes working with XML data a lot easier: you simply don't have to
 bother with the encoding of the XML data anymore and can just let the
 codec figure out the details. The XML parser can then work directly
 on the Unicode data.
 
 Having the functionality indeed makes things easier. However, I don't
 find
 
   s.decode(xml.detect_encoding(s))
 
 particularly more difficult than
 
   s.decode("xml-auto-detection")

Not really, but the codec has more control over what happens to
the stream, i.e. it's easier to implement look-ahead in the codec
than to do the detection and then try to push the bytes back onto
the stream (which may or may not be possible depending on the
nature of the stream).
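
The look-ahead problem can be sketched with today's io module (Python 3, so not available when this was written): BufferedReader.peek() returns buffered bytes without consuming them, which means nothing has to be pushed back even on a stream that cannot seek. The names open_xml_text and OneWayStream are invented for this example, and the detection is reduced to a regular expression on the XML declaration.

```python
import io
import re

_DECL = re.compile(
    rb'<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']')


def open_xml_text(raw, lookahead=1024, default="utf-8"):
    """Wrap a (possibly non-seekable) raw binary stream as decoded text."""
    buffered = io.BufferedReader(
        raw, buffer_size=max(lookahead, io.DEFAULT_BUFFER_SIZE))
    # peek() may return fewer or more bytes than requested.
    head = buffered.peek(lookahead)[:lookahead]
    match = _DECL.search(head)
    encoding = match.group(1).decode("ascii") if match else default
    # The peeked bytes are still in the buffer, so the decoder sees them.
    return io.TextIOWrapper(buffered, encoding=encoding)


class OneWayStream(io.RawIOBase):
    """Stand-in for a network connection: readable, but not seekable."""

    def __init__(self, data):
        self._source = io.BytesIO(data)

    def readable(self):
        return True

    def readinto(self, b):
        chunk = self._source.read(len(b))
        b[:len(chunk)] = chunk
        return len(chunk)


if __name__ == "__main__":
    payload = '<?xml version="1.0" encoding="iso-8859-1"?><a>\u00e9</a>'
    stream = OneWayStream(payload.encode("iso-8859-1"))
    print(open_xml_text(stream).read())
```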

 Whether it needs to be in C or not is another question (I would have
 done this in Python since performance is not really an issue), but since
 the code is already written, why not use it ?
 
 It's a maintenance issue.

I'm sure Walter will do a great job in maintaining the code :-)

Regards,
-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Nov 09 2007)

Re: [Python-Dev] XML codec?

2007-11-09 Thread M.-A. Lemburg

Martin v. Löwis wrote:
 Not really, but the codec has more control over what happens to
 the stream, ie. it's easier to implement look-ahead in the codec
 than to do the detection and then try to push the bytes back onto
 the stream (which may or may not be possible depending on the
 nature of the stream).
 
 YAGNI.

A non-seekable stream is not all that uncommon in network processing.
I usually end up either reading the complete data into memory
or doing the needed buffering by hand.
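
The "buffering by hand" mentioned above could look roughly like the
following sketch. PeekableReader is a hypothetical helper name; it only
assumes that the wrapped object has a read() method and never calls
seek() on it.

```python
import io


class PeekableReader:
    """Wrap a read-only, non-seekable stream and allow look-ahead via peek()."""

    def __init__(self, raw):
        self._raw = raw
        self._buffer = b""

    def peek(self, size):
        """Return up to `size` bytes without consuming them."""
        while len(self._buffer) < size:
            chunk = self._raw.read(size - len(self._buffer))
            if not chunk:
                break               # end of stream reached
            self._buffer += chunk
        return self._buffer[:size]

    def read(self, size=-1):
        """Return buffered bytes first, then continue with the raw stream."""
        if size is None or size < 0:
            data, self._buffer = self._buffer + self._raw.read(), b""
            return data
        data, self._buffer = self._buffer[:size], self._buffer[size:]
        if len(data) < size:
            data += self._raw.read(size - len(data))
        return data


# BytesIO stands in for a socket-like stream here; seek() is never used.
stream = PeekableReader(io.BytesIO(b"<?xml version='1.0'?><doc/>"))
head = stream.peek(5)       # look at the start of the document
start = stream.read(5)      # the same bytes are still available for reading
rest = stream.read()
```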

Regards,
-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Nov 10 2007)

Re: [Python-Dev] Does Python need a file locking module (slightly higher level)?

2007-10-26 Thread M.-A. Lemburg

On 2007-10-26 05:41, Barry Warsaw wrote:
 On Oct 22, 2007, at 11:30 PM, [EMAIL PROTECTED] wrote:
 
 It's not clear that any of these implementations is going to be perfect.
 Maybe none ever will be.
 
 I would agree with this.  You write a program and know you need to  
 implement some kind of resource locking, so you start looking for  
 some OTS solution.  But then you realize that your application needs  
 somewhat different semantics or needs to work in platforms or  
 environments that the OTS code doesn't handle.  Just a few days ago,  
 I was looking at some locking code that needed to work across  
 multiple invocations of a script on multiple machines, and the only  
 thing they shared was a PostgreSQL connection, so we ended up wanting  
 to use its advisory locks.
 
 In his reply Jean-Paul made this comment:
 
 It might be nice to have something like that in the standard library,
 but it's very simple once you know what to do.
 
 I'm not so sure about the very simple part, especially if you aren't
 familiar with all the ins and outs of the different platforms.
 
 I'd totally agree with this.  Locking seems simple, but it's got some  
 really tricky aspects that need to be coded just right or you'll be  
 in a world of hurt.  Mailman's LockFile.py (which you're right is  
 *nix only) is stable now, but has had some really subtle bugs in the  
 past.

You might want to take a look at the FileLock.py module that's
part of the eGenix mx Base distribution (mx.Misc.FileLock).

It works reliably on Unix and Windows, doesn't rely on fcntl and
has been in use for years.

The only downside is that it's application specific,
ie. only applications using the module for locking will
detect the locks - but then again: this is exactly the problem
you typically want to solve.
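
For readers unfamiliar with this style of locking, here is a minimal
sketch of an application-level lock that does not use fcntl. It is not
the mx.Misc.FileLock implementation; the SimpleFileLock class and the
".lock" suffix are assumptions made for the example, and stale locks
left behind by crashed processes are not handled.

```python
import os
import tempfile
import time


class SimpleFileLock:
    """Cooperative lock: only programs using this class will notice it."""

    def __init__(self, path):
        self.lockfile = path + ".lock"
        self.locked = False

    def acquire(self, timeout=5.0, poll=0.05):
        """Try to create the lock file exclusively until the timeout expires."""
        deadline = time.monotonic() + timeout
        while True:
            try:
                # O_EXCL makes the creation fail if the lock file already exists.
                fd = os.open(self.lockfile,
                             os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            except FileExistsError:
                if time.monotonic() >= deadline:
                    return False
                time.sleep(poll)
                continue
            with os.fdopen(fd, "w") as handle:
                handle.write(str(os.getpid()))   # record the owner for debugging
            self.locked = True
            return True

    def release(self):
        if self.locked:
            os.remove(self.lockfile)
            self.locked = False


lock_dir = tempfile.mkdtemp()
target = os.path.join(lock_dir, "data.txt")
first = SimpleFileLock(target)
second = SimpleFileLock(target)
got_first = first.acquire()
got_second = second.acquire(timeout=0.2)    # times out while `first` holds the lock
first.release()
got_retry = second.acquire(timeout=0.2)
second.release()
```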

 The fact that the first three bits of code I was referred to were
 implemented by three significant Python tools/platforms and that all
 are different in some significant ways suggests that there is both an
 underlying need for a file locking mechanism and a lack of consensus
 about the best way to implement the mother-of-all-file-locking schemes
 for Python.  Maybe the best place for this is in the distribution.  PEP?
 
 I don't think any one solution will work for everybody.  I'm not even  
 sure we can define a common API a la the DBAPI, but if something were  
 to make it into the standard distribution, that's the direction I'd  
 go in.  Then we can provide various implementations that support the  
 LockingAPI under various environments, constraints, and platforms.   
 If we wanted to distribute them in the stdlib, we could put them all  
 in a package and let the user decide which features they need.
 
 I'm still planning on de-Mailman-ifying LockFile.py sometime soon.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Oct 26 2007)

Re: [Python-Dev] Unicode database

2007-08-09 Thread M.-A. Lemburg

Nick Maclaren wrote:
 Ah, the makefile. I don't think you use it to create the Unicode database.

 It's only good for generating the codecs (Lib/encodings)
 
 Yes, but it DOES attempt to download the mappings, and is the ONLY
 script which attempts to do so.

Of course it does. The Tools/unicode/Makefile is meant to simplify
recreating the codecs from the (possibly updated) mapping on the Unicode
site.

It may well not work for you, since I wrote the Makefile and the
other related stuff in that directory to help me with updating the
codecs from the mappings. It's only checked in for convenience.
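
As a rough illustration of what "recreating the codecs from the
mapping" involves, the sketch below parses lines in the layout used by
the mapping files on the Unicode site (byte value, code point, comment)
into a decoding table. It is not the code in gencodec.py; the
parse_mapping() name and the sample text are made up for the example.

```python
def parse_mapping(text):
    """Parse 'byte<TAB>code point<TAB># comment' lines into a decoding table."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()    # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2:
            continue                            # byte is undefined in this charset
        table[int(fields[0], 16)] = chr(int(fields[1], 16))
    return table


# Hypothetical excerpt in the style of an ISO 8859 mapping file.
SAMPLE = """\
# Name: sample excerpt
0x41\t0x0041\t# LATIN CAPITAL LETTER A
0xA4\t0x20AC\t# EURO SIGN
"""

decoding_table = parse_mapping(SAMPLE)
```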

 beelzebub$find Python-2.5.1 -type f | wc
    3458    3460  135981
 beelzebub$find Python-2.5.1 -type f | xargs grep ftp.unicode.org
 Python-2.5.1/Doc/lib/libunicodedata.tex:4.1.0 which is publicly available from \url{ftp://ftp.unicode.org/}.
 grep: Python-2.5.1/Mac/Icons/Disk: No such file or directory
 grep: Image.icns: No such file or directory
 grep: Python-2.5.1/Mac/Icons/Python: No such file or directory
 grep: Folder.icns: No such file or directory
 Python-2.5.1/Misc/NEWS:  at ftp.unicode.org and contain a few updates (e.g. the Mac OS
 Python-2.5.1/Tools/unicode/Makefile:# files available at ftp://ftp.unicode.org/
 Python-2.5.1/Tools/unicode/Makefile:ncftpget -R ftp.unicode.org . Public/MAPPINGS
 Python-2.5.1/Tools/unicode/gencodec.py:site (ftp://ftp.unicode.org/Public/MAPPINGS/) and creates Python codec
 Python-2.5.1/Tools/unicode/python-mappings/TIS-620.TXT:#   ftp://ftp.unicode.org/Public/MAPPINGS/ISO8859/8859-11.TXT the
 Python-2.5.1/Tools/unicode/python-mappings/TIS-620.TXT:#   ftp://ftp.unicode.org/Public/MAPPINGS/ISO8859/8859-11.TXT
 Python-2.5.1/Tools/unicode/python-mappings/KOI8-U.TXT:#   ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/MISC/KOI8-R.TXT
 Python-2.5.1/Tools/unicode/python-mappings/CP1140.TXT:#   ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/EBCDIC/CP037.TXT
 Python-2.5.1/Modules/unicodedata.c:4.1.0 which is publically available from ftp://ftp.unicode.org/.\n
 
 AFAICT, the mappings are still where they always were: at the
 location given in the Makefile. (e.g.
 ftp://ftp.unicode.org/Public/MAPPINGS/ISO8859/8859-15.TXT
 )
 
 Then you DEFINITELY are using a non-standard set of files.  That
 above was from the source of Python 2.5.1 that I have just downloaded.

No idea where you get that impression from, but then I'm not really
sure what you're after anyway ;-)

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Aug 09 2007)

Re: [Python-Dev] [Python-3000] Python 3000 Status Update (Long!)

2007-06-19 Thread M.-A. Lemburg

On 2007-06-19 14:40, Walter Dörwald wrote:
 Georg Brandl wrote:
 A minuscule nit: the rot13 codec has no library equivalent, so it won't be
 supported anymore :)
 Given that there are valid use cases for bytes-to-bytes translations, 
 and a common API for them would be nice, does it make sense to have an 
 additional category of codec that is invoked via specific recoding 
 methods on bytes objects? For example:

encoded = data.encode_bytes('bz2')
decoded = encoded.decode_bytes('bz2')
assert data == decoded
 This is exactly what I proposed a while before under the name
 bytes.transform().

 IMO it would make a common use pattern much more convenient and
 should be given thought.

 If a PEP is called for, I'd be happy to at least co-author it.
 
 Codecs are a major exception to Guido's law: Never have a parameter
 whose value switches between completely unrelated algorithms.

I don't see much of a problem with that. Parameters are
per se intended to change the behavior of a function or
method.

Note that you are referring to the .encode() and .decode()
methods - these are just easy to use interfaces to the codecs
registered in the system.

The codec design allows for different input and output
types as it doesn't impose restrictions on these. Codecs
are more general in that respect: they don't just deal
with Unicode encodings, it's a more general approach
that also works with other kinds of data types.

The access methods, OTOH, can impose restrictions and probably
should, to restrict the return types to a predictable set.
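
The split between the general codec machinery and the more restricted
access methods can be illustrated with a short sketch. It shows the
behaviour of current Python 3 releases, which postdates this thread;
the variable names are arbitrary.

```python
import codecs

data = b"spam and eggs " * 20

# The codec registry itself accepts other type combinations:
packed = codecs.encode(data, "bz2_codec")       # bytes in, bytes out
unpacked = codecs.decode(packed, "bz2_codec")
scrambled = codecs.encode("abc", "rot_13")      # text in, text out

# The str.encode() access method only accepts text encodings.
rejected = False
try:
    "abc".encode("rot_13")
except LookupError:
    rejected = True

utf8_bytes = "abc".encode("utf-8")              # text in, bytes out
```

The codecs are the same in both cases; only the convenience methods
limit which input and output types are allowed.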

 Why don't we put all string transformation functions into a common
 module (the string module might be a good place):
 
 import string
 string.rot13('abc')

I think the string module will have to go away. It doesn't
really separate between text and bytes data.

Adding more confusion will not really help with making
this distinction clear, either, I'm afraid.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Jun 19 2007)

Re: [Python-Dev] Adventures with x64, VS7 and VS8 on Windows

2007-05-22 Thread M.-A. Lemburg

Hi Mark,

 +1 from me.

 I think this is simply a bug introduced with the UCS4 patches in
 Python 2.2.

 unicodeobject.h already has this code:

 #ifndef PY_UNICODE_TYPE

 /* Windows has a usable wchar_t type (unless we're using UCS-4) */
 # if defined(MS_WIN32) && Py_UNICODE_SIZE == 2
 #  define HAVE_USABLE_WCHAR_T
 #  define PY_UNICODE_TYPE wchar_t
 # endif

 # if defined(Py_UNICODE_WIDE)
 #  define PY_UNICODE_TYPE Py_UCS4
 # endif

 #endif

 But for some reason, pyconfig.h defines:

 /* Define as the integral type used for Unicode representation. */
 #define PY_UNICODE_TYPE unsigned short

 /* Define as the size of the unicode type. */
 #define Py_UNICODE_SIZE SIZEOF_SHORT

 /* Define if you have a useable wchar_t type defined in wchar.h; useable
    means wchar_t must be 16-bit unsigned type. (see
    Include/unicodeobject.h). */
 #if Py_UNICODE_SIZE == 2
 #define HAVE_USABLE_WCHAR_T
 #endif

 disabling the default settings in the unicodeobject.h.
 
 Yes, that does appear strange.  The following patch works for me, keeps
 Python building and appears to solve my problem.  Any objections?

Looks fine to me.

 Mark
 
 
 Index: pyconfig.h
 ===
 --- pyconfig.h  (revision 55487)
 +++ pyconfig.h  (working copy)
 @@ -491,22 +491,13 @@
  /* Define if you want to have a Unicode type. */
  #define Py_USING_UNICODE
 
 -/* Define as the integral type used for Unicode representation. */
 -#define PY_UNICODE_TYPE unsigned short
 -
  /* Define as the size of the unicode type. */
 -#define Py_UNICODE_SIZE SIZEOF_SHORT
 +/* This is enough for unicodeobject.h to do the right thing on Windows. */
 +#define Py_UNICODE_SIZE 2
 
 -/* Define if you have a useable wchar_t type defined in wchar.h; useable
 -   means wchar_t must be 16-bit unsigned type. (see
 -   Include/unicodeobject.h). */
 -#if Py_UNICODE_SIZE == 2
 -#define HAVE_USABLE_WCHAR_T
 -
  /* Define to indicate that the Python Unicode representation can be passed
 as-is to Win32 Wide API.  */
  #define Py_WIN_WIDE_FILENAMES
 -#endif
 
  /* Use Python's own small-block memory-allocator. */
  #define WITH_PYMALLOC 1
 

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, May 22 2007)

Re: [Python-Dev] Adventures with x64, VS7 and VS8 on Windows

On 2007-05-21 12:30, Kristján Valur Jónsson wrote:

 [Py_UNICODE being #defined as unsigned short on Windows]

 I'd rather make it a platform-specific definition (for platform=Windows
 API). Correct me if I'm wrong, but isn't wchar_t also available in VS
 2003 (and even in VC6?). And doesn't it have the right definition in
 all these compilers?

 So +1 for setting Py_UNICODE to wchar_t on Windows.
 
 Yes.  Btw, in previous visual studio versions, wchar_t was not treated
 as a builtin type by default, but rather as synonymous with unsigned short.
 Now the default is that it is, and this causes some semantic differences
 and incompatibilities of the type seen.

+1 from me.

I think this is simply a bug introduced with the UCS4 patches in
Python 2.2.

unicodeobject.h already has this code:

#ifndef PY_UNICODE_TYPE

/* Windows has a usable wchar_t type (unless we're using UCS-4) */
# if defined(MS_WIN32) && Py_UNICODE_SIZE == 2
#  define HAVE_USABLE_WCHAR_T
#  define PY_UNICODE_TYPE wchar_t
# endif

# if defined(Py_UNICODE_WIDE)
#  define PY_UNICODE_TYPE Py_UCS4
# endif

#endif

But for some reason, pyconfig.h defines:

/* Define as the integral type used for Unicode representation. */
#define PY_UNICODE_TYPE unsigned short

/* Define as the size of the unicode type. */
#define Py_UNICODE_SIZE SIZEOF_SHORT

/* Define if you have a useable wchar_t type defined in wchar.h; useable
   means wchar_t must be 16-bit unsigned type. (see
   Include/unicodeobject.h). */
#if Py_UNICODE_SIZE == 2
#define HAVE_USABLE_WCHAR_T
#endif

disabling the default settings in the unicodeobject.h.
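
The interaction of the two headers can be modelled in a few lines of
Python. This is only a toy model of the preprocessor logic quoted
above; resolve_unicode_type() and its parameters are invented for the
illustration.

```python
def resolve_unicode_type(ms_win32, unicode_size, unicode_wide, preset=None):
    """Mirror the '#ifndef PY_UNICODE_TYPE' block from unicodeobject.h.

    `preset` models a PY_UNICODE_TYPE that pyconfig.h has already defined.
    """
    if preset is not None:
        # The whole #ifndef block is skipped, so no default is applied.
        return preset
    chosen = None
    if ms_win32 and unicode_size == 2:
        chosen = "wchar_t"
    if unicode_wide:
        chosen = "Py_UCS4"
    return chosen


with_pyconfig = resolve_unicode_type(True, 2, False, preset="unsigned short")
without_pyconfig = resolve_unicode_type(True, 2, False)
```

With the pyconfig.h definition in place the result is always
"unsigned short"; without it, a 16-bit Windows build gets wchar_t.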

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, May 21 2007)

Re: [Python-Dev] PEP 0365: Adding the pkg_resources module

On 2007-05-21 00:07, Talin wrote:
 Phillip J. Eby wrote:
 I wanted to get this in before the Py3K PEP deadline, since this is a 
 Python 2.6 PEP that would presumably impact 3.x as well.  Feedback welcome.


 PEP: 365
 Title: Adding the pkg_resources module
 
 I'm really surprised that there hasn't been more comment on this.

True both ways, I guess: I'm still waiting for a reply to my
comments.

I'd also like to see more discussion about adding e.g.:

 * support for user packages

   (ie. having site.py add a well-defined user home directory
   based Python path entry to sys.path, e.g.
   ~/.python/user-packages, much like what MacPython already does
   now)

 * support for having the import mechanism play nice
   with namespace packages

   (ie. packages that may live in different places on the disk,
   but appear to be in the same Python package as seen by the
   import mechanism)

I think those two features would go a long way in reducing the
number of hacks setuptools currently applies to get this
functionality working with code in .pth files, monkey-patching
site.py, etc.
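
A minimal sketch of the first item, assuming the directory name used
in the example above. The add_user_packages() function is
hypothetical; it only shows what site.py could do at startup, using
the existing site.addsitedir() call.

```python
import os
import site


def add_user_packages(directory="~/.python/user-packages"):
    """Add a per-user package directory to sys.path if it exists."""
    path = os.path.expanduser(directory)
    if os.path.isdir(path):
        site.addsitedir(path)   # appends to sys.path and processes .pth files
        return path
    return None                 # nothing to add for this user


added = add_user_packages()
```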

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, May 21 2007)

Re: [Python-Dev] PEP 0365: Adding the pkg_resources module

On 2007-05-21 16:05, Phillip J. Eby wrote:
 At 01:43 PM 5/21/2007 +0200, M.-A. Lemburg wrote:
 On 2007-05-21 00:07, Talin wrote:
 Phillip J. Eby wrote:
 I wanted to get this in before the Py3K PEP deadline, since this is a
 Python 2.6 PEP that would presumably impact 3.x as 
 well.  Feedback welcome.

 PEP: 365
 Title: Adding the pkg_resources module
 I'm really surprised that there hasn't been more comment on this.
 True both ways, I guess: I'm still waiting for a reply to my
 comments.
 
 What comments are you talking about?  I must've missed them.

I've attached the email. Please see below.

 I'd also like to see more discussion about adding e.g.:

  * support for user packages

(ie. having site.py add a well-defined user home directory
based Python path entry to sys.path, e.g.
~/.python/user-packages, much like what MacPython already does
now)

  * support for having the import mechanism play nice
with namespace packages

(ie. packages that may live in different places on the disk,
but appear to be in the same Python package as seen by the
import mechanism)

 I think those two features would go a long way in reducing the
 number of hacks setuptools currently applies to get this
 functionality working with code in .pth files, monkey-patching
 site.py, etc.
 
 These items aren't directly related to the PEP, 
 however. 

Right. I wasn't referring to this PEP. I think we should have
two more PEPs covering the above points, since they offer
benefits for all users, not just setuptools users.

 pkg_resources doesn't monkeypatch anything or touch any 
 .pth files.  It only changes sys.path at runtime if you explicitly 
 ask it to locate and activate packages for you.

 As for namespace packages, pkg_resources provides a more PEP 
 302-compatible alternative to pkgutil.extend_path().  pkgutil doesn't 
 support anything but existing filesystem directories, but the 
 pkg_resources version supports zipfiles and has hooks to allow 
 namespace package support to be registered for any PEP 302 importer.  See:
 
 http://peak.telecommunity.com/DevCenter/PkgResources#supporting-custom-importers
 
 (specifically, the register_namespace_handler() function.)

Looking at the code it appears as if you've already formalized
an implementation for this.

However, since this is not egg-specific it should probably be
moved to pkgutil and get a separate PEP with detailed documentation
(the link you provided doesn't really explain the concepts, reading
the code helped a bit).

What I don't understand about your approach is why importers
would have to register with the namespace implementation.

This doesn't seem necessary, since the package __path__ attribute
already provides all functionality needed for redirecting
lookups to different paths.
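
The point about __path__ can be demonstrated with a small
self-contained sketch. The package name demo_split_pkg and the
directory layout are made up; the example builds a package in a
temporary directory and then extends its __path__ with a second,
unrelated directory.

```python
import importlib
import os
import sys
import tempfile


def build_split_package(root):
    """Create a package whose modules live in two separate directories."""
    first = os.path.join(root, "site_a", "demo_split_pkg")
    second = os.path.join(root, "site_b", "more_modules")
    os.makedirs(first)
    os.makedirs(second)
    with open(os.path.join(first, "__init__.py"), "w") as handle:
        handle.write("")
    with open(os.path.join(first, "alpha.py"), "w") as handle:
        handle.write("NAME = 'alpha'\n")
    with open(os.path.join(second, "beta.py"), "w") as handle:
        handle.write("NAME = 'beta'\n")
    return os.path.join(root, "site_a"), second


root = tempfile.mkdtemp()
site_dir, extra_dir = build_split_package(root)
sys.path.insert(0, site_dir)
importlib.invalidate_caches()

package = importlib.import_module("demo_split_pkg")
package.__path__.append(extra_dir)      # redirect lookups to a second location

alpha = importlib.import_module("demo_split_pkg.alpha")
beta = importlib.import_module("demo_split_pkg.beta")   # found via extra_dir
```

Both submodules appear under the same package name although only one
of them lives in the package directory.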

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, May 21 2007)
---BeginMessage---
On 2007-05-01 02:29, Phillip J. Eby wrote:
 I wanted to get this in before the Py3K PEP deadline, since this is a 
 Python 2.6 PEP that would presumably impact 3.x as well.  Feedback welcome.

Could you add a section that explains the side effects of
importing pkg_resources ?

The documentation of the module doesn't mention any, but the
code suggests that you are installing (some form of) import
hooks.

Some other comments:

* Wouldn't it be better to factor out all the meta-data access
  code that's not related to eggs into pkgutil ?!

* How about then renaming the remaining module to egglib ?!

* The module needs some reorganization: imports, globals and constants
  at the top, maybe a few comments delimiting the various sections.

* The get_*_platform() should probably use the platform module
  which is a lot more flexible than distutils' get_platform()
  (which should probably use the platform module as well in the
  long run)

Thanks,
-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, May 04 2007)

Re: [Python-Dev] PEP 0365: Adding the pkg_resources module

On 2007-05-21 20:01, Phillip J. Eby wrote:
 At 06:28 PM 5/21/2007 +0200, M.-A. Lemburg wrote:
 However, since this is not egg-specific it should probably be
 moved to pkgutil and get a separate PEP with detailed documentation
 (the link you provided doesn't really explain the concepts, reading
 the code helped a bit).
 
 That doesn't really make sense in the context of the current PEP,
 though, which isn't to provide a general-purpose namespace package API;
 it's specifically about adding an existing piece of code to the stdlib,
 with its API intact.

You seem to indicate that you're not up to discussing the concepts
implemented by the module and *integrating* them with the Python
stdlib.

Please correct me if I'm wrong, but if the whole point of the PEP
is a take it or leave it decision, then I don't see the point of
discussing it. I'm -1 on adding the module in its current state;
I'd be +1 on integrating the concepts with the Python stdlib.

Hope I'm wrong,
-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, May 21 2007)

Re: [Python-Dev] PEP 0365: Adding the pkg_resources module