Re: [Python-Dev] PyString - PyBytes C API renaming (Stabilizing the C API of 2.6 and 3.0)
On 2008-05-28 14:02, Christian Heimes wrote: M.-A. Lemburg schrieb: I have a feeling that we should be looking for better merge tools, rather than implement code changes that cause more trouble than do good, just because our existing tools aren't smart enough. We don't have better tools at our hands. I don't think we'll get any tools in time or chance the VCS right before a major release. Wouldn't it be possible to have a 2to3.py converter take the 2.x code (including the C code), convert it and then apply any changes to the 3.x branch ? Such a converter would be nice for 3rd party code but it's not an option for the core. In the past few months I've merged a lot of code from trunk to py3k. A 2to3 C converter doesn't help with merge conflicts. Naming differences make any merge more painful I was suggesting to not use SVN to merge changes directly, but to instead use an intermediate step in the process: Init: 1. grab the latest trunk 2. apply a 2to3 converter to the Python code and the C code, applying any renaming that may be necessary 3. save this converted version in a separate branch merge-branch Update: 1. checkout the merge-branch, . grab the latest trunk and 3.x branch 2. apply a 2to3 converter to the Python code and the C code, applying any renaming that may be necessary 3. copy the files over your working copy of the merge-branch 4. create a diff on the merge-branch 5. apply the diffs to 3.x branch, resolving any conflicts as necessary This doesn't require new tools (except for some C renaming support in the 2to3 tool). It only changes the procedure. We'd basically follow our own suggestions w/r to porting to 3.x, which is to make changes in the 2.x code, apply 2to3 and then apply remaining fixes there. I'm suggesting this, since 3.x is likely to introduce more Python stdlib and C API changes. The process would likely also makes a lot of other changes more easily manageable and reduce the overall merge conflicts. I find the approach less confusing than your suggestion and my initial idea. I disagree on that. Renaming old APIs to use the new names by adding a header file with #define oldname newname is standard practice. Renaming the old APIs in the source code and undoing the renaming with a header file is not. I wasn't talking about standard practice here. I talked about less confusion for core developers. My approach doesn't split our internal API in two. No, but it does apply a well hidden renaming which will cause confusion when using a debugger to trace calls in C code. If you use PyBytes APIs, you expect to find PyBytes functions in the libs and also set breakpoints on these. With the renaming we don't have two sets of APIs (old and new) exposed in the lib, like what we normally do when applying changes to API names. And by the way it *is* a standard approach fore Python. Guido told me that the same approach was used during the 1.x to 2.0 migration. There was no API change between 1.6 and 2.0. You are probably talking about the great renaming between 1.4 and 1.5. That was different, since it changes almost all C APIs in Python. And it used the standard practice... from rename2.h in Python 1.5: /* This file contains a bunch of #defines that make it possible to use old style names (e.g. object) with the new style Python source distribution. */ #define True Py_True #define False Py_False #define None Py_None ie. #define oldname newname And all this, just because Subversion can't handle merging of symbol renaming. As I said earlier we don't have better tools at our disposal. We have to make some compromises. Sometimes practicality beat purity. See above. Please discuss any changes of the 2.x code base on python-dev. Such major changes do need more discussion and possibly a PEP as well. In the last few months I started at least three topics about the C API renaming. It's in the thread 2.6 and 3.0 tasks http://permalink.gmane.org/gmane.comp.python.devel/93016 Thanks. I stopped reading that thread after Guido's reply in http://comments.gmane.org/gmane.comp.python.devel/92541 It would really help if subject lines were more specific. This thread also uses a much to general subject line (which is why I changed it). -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 28 2008) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ 2008-07-07: EuroPython 2008, Vilnius, Lithuania39 days to go Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht
Re: [Python-Dev] Importing bsddb 4.6.21; with or without AES encryption?
On 2008-05-23 01:15, Bill Janssen wrote: That's all fine, but then I'm missing the OpenSSL license and attribution notice somewhere in the installer, the README of the installation or elsewhere. Good point. We need this for both the ssl module and the hashlib module. FYI: I've opened ticket #2949 to track this. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 23 2008) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] buffer interface for C extensions
On 2008-05-19 00:59, Dan Lenski wrote: Hi all, I've written a small C extension to submit commands to SCSI devices via Linux's sg_io driver (for a camera hacking project). The extension is just a wrapper around a couple ioctl()'s with Pythonic exception handling thrown in. One of my extension methods is called like this from python: sg.write(fd, command[, data, timeout) Both command and data are binary strings. I would like to be able to use either a regular Python string or an array('B', ...) for these read-only arguments. So I tried to use the t# argument specifier to indicate that these arguments could be either strings or objects that implement the read- only buffer interface: if (!PyArg_ParseTuple(args, it#|t#i:write, sg_fd, cmd, cmdLen, buf, bufLen, timeout)) return NULL; Now, this works fine with strings, but when I call it with an array I get a TypeError: TypeError: write() argument 2 must be string or read-only character buffer, not array.array So, I then tried changing t# to w# to indicate that the arguments must implement the /read-write/ buffer interface. Now the array objects work, but when I try a string argument, I naturally get this error: TypeError: Cannot use string as modifiable buffer So here's what I don't understand. Why doesn't the t# argument specifier support read-write buffers as well as read-only buffers? Aren't read-write buffers a *superset* of read-only buffers?? Is there something I'm doing wrong or a quick fix to get this to work appropriately? You should probably ask such questions on the capi-sig list. To answer your question: t# requires support for the read-only 8-bit character buffer interface s# can use the read buffer interface w# requires support for the write buffer interface Those are two different buffer interface slots, so whether a particular object works with t# or w# depends on whether it implements the slots in question. You should probably try s#, as this will also work with objects that implement the getreadbuffer slot. The details can be found in Python/getargs.c -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 22 2008) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Importing bsddb 4.6.21; with or without AES encryption?
On 2008-05-20 00:46, Jesus Cea wrote: Trent Nelson wrote: | I downloaded the source that includes AES encryption, for no reason | other than it was first on the list. I'm now wondering if we should | only be importing the 'NC' source that doesn't contain any | encryption? Jesus, does pybsddb use any of the Berkeley DB | encryption facilities? Would anything break if we built the | bsddb module without encryption? Yes, pybsddb3 4.6.4 supports cryptography if the underlying Berkeley DB library is crypto enabled. In principle, you can compile BDB without crypto, and pybsddb3 should work, but you would lose ability to open any DB formerly created using page encryption or page checksum. Export laws aside, we better compile with crypto :). I hope you're only talking about the Windows build... In any case, if you do include crypto code in the Windows installer, please make sure that the PSF is informed, so that the proper reporting procedure can be put in place (whatever it is nowadays in the US). The installer already includes the ssl module, so it's not problem to include crypto code in general. BTW: AFAIK the _ssl module is built against OpenSSL. Since I couldn't find any OpenSSL DLLs in my Python install dir and due to the size of the _ssl.pyd, I assume that it is statically linked against OpenSSL. That's all fine, but then I'm missing the OpenSSL license and attribution notice somewhere in the installer, the README of the installation or elsewhere. Thanks, -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 22 2008) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: new environment variable PYTHONSTDOUTENCODING
On 2008-05-20 10:22, Martin v. Löwis wrote: I'd like to propose a new environment variable PYTHONSTDOUTENCODING. This is meant to solve various problems that people had with Python not detecting their terminal encoding correctly; it would override any detection that Python would use for determining the encoding of stdout (and stdin - but that's less relevant in 2.x). How is this relevant for 2.x ? In 2.x, stdin and stdout are just files without any io wrappers around them. Writing Unicode to stdout will still use the default encoding ASCII to convert it to an 8-bit string. All other 8-bit strings will be passed to stdout as-is. For 3.x, I'd like to see a PYTHONSTDINENCODING, because the current way of relying on the terminal encoding does work well... it then falls back to ASCII, which prevents entering e.g. German Umlauts. In particular, setting this environment variable would also disable the detection of whether stdout is a terminal. This is desirable for cases as the pydev eclipse plugin, where Python currently fails to detect that the output is a terminal (and technically, what Eclipse provides is not a terminal, but just a pipe, as you can't do pseudoterms in Java). This would have the additional effect that the encoding also gets in effect when redirecting stdout to a file. Whether or not this is a good thing might be debatable; giving the user the control over it (to set or clear that variable) is a good thing, IMO. Naming contest: it probably would be the longest of the PYTHON* variables. I would not want to call it PYTHONENCODING, or PYTHONSTDENCODING, though, because people might infer that it affects sys.getdefaultencoding(), which it shouldn't. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 20 2008) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: new environment variable PYTHONSTDOUTENCODING
On 2008-05-20 12:16, Thomas Wouters wrote: On Tue, May 20, 2008 at 10:41 AM, M.-A. Lemburg [EMAIL PROTECTED] wrote: On 2008-05-20 10:22, Martin v. Löwis wrote: I'd like to propose a new environment variable PYTHONSTDOUTENCODING. This is meant to solve various problems that people had with Python not detecting their terminal encoding correctly; it would override any detection that Python would use for determining the encoding of stdout (and stdin - but that's less relevant in 2.x). How is this relevant for 2.x ? In 2.x, stdin and stdout are just files without any io wrappers around them. Writing Unicode to stdout will still use the default encoding ASCII to convert it to an 8-bit string. All other 8-bit strings will be passed to stdout as-is. You're forgetting about print; in Python 2.x, when stdout is connected to a terminal, the locale settings (typically the LANG, LC_ALL and LC_CTYPE environment variables) are taken into account when 'print' writes to sys.stdout. Thanks for reminding me. I had forgotten about that special case. So sys.stdout.write(unicode) will always use the default encoding, whereas print unicode uses the sys.stdout.encoding, correct ? Hmm, wouldn't it be better to always use .encoding and also make it adjustable from Python (it is adjustable from C) ?! PYTHONSTDOUTENCODING could then provide the default to sys.stdout.encoding. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 20 2008) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: new environment variable PYTHONSTDOUTENCODING
On 2008-05-20 20:23, Martin v. Löwis wrote: Writing Unicode to stdout will still use the default encoding ASCII to convert it to an 8-bit string. That's not true. Are you sure ? setenv LC_ALL de_DE.utf8 python2.5 Python 2.5 (r25:51908, May 9 2007, 00:53:06) u = u'äöü' sys.stdout.write(u) Traceback (most recent call last): File stdin, line 1, in module UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-2: ordinal not in range(128) print u äöü Only print will set the Py_PRINT_RAW flag to trigger the conversion from Unicode to 8-bit strings using .encoding in PyFile_WriteObject(). If not set, the default encoding is used. I'm not exactly sure why, since using .encoding would be useful in all cases. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 20 2008) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Module renaming and pickle mechanisms
On 2008-05-18 22:24, Brett Cannon wrote: On Sun, May 18, 2008 at 6:14 AM, Nick Coghlan [EMAIL PROTECTED] wrote: M.-A. Lemburg wrote: Perhaps I have a misunderstanding of the reasoning behind doing the renaming in the 2.x branch, but it appears that the only reason is to get used to the new names. That's a rather low priority argument in comparison to the breakage the renaming will cause in the 2.x branch. I think this is the key point here. The possibility of breaking pickling compatibility never came up during the PEP 3108 discussions, so wasn't taken into account in deciding whether or not backporting the name changes was a good idea. I think it's pretty clear that the code needs to be moved back into the modules with the old names for 2.6. The only question is whether or not we put any effort into making the new stdlib organisation usable in 2.x, or just rely on 2to3 to fix it (note that the increasing the common subset argument doesn't really apply, since you can catch the import errors in order to try both names). Problem with this is it makes forward-porting revisions to 3.0 a PITA. By keeping the module names consistent between the versions merging a revision is just a matter of ``svnmerge merge`` with the usual 3.0-specific changes. Reverting the modules back to the old name will make forward-porting much more difficult as I don't think svn keeps rename information around (and thus map the old name to the new name in terms of diffs). svnmerge is written in Python, so wouldn't it be possible to add support for maintaining such renaming to that tool ? I don't think that an administrative problem such as forward- porting patches to 3.x warrants breakage in the 2.x branch. After all, the renaming was approached for Python 3.0 and not 2.6 *because* it introduces major breakage. AFAIR, the discussion on the stdlib-sig also didn't include the plan to backport such changes to 2.6. Otherwise, we would have hashed them out there. Alexandre's idea of teaching pickle the mapping of old names to new might be the best solution. We could have a flag to pickle that deactivates the renaming. Otherwise we could bump the pickle version number so that the new number doesn't do the mapping while the old versions to the implicit module mapping. And as Greg and Glpyh have pointed out, this is a problem that might need to be addressed in the future with some changes to our serialization method (I have no clue how since I don't deal with pickle very much). It is possible to make pickle aware of the module renames, but that doesn't solve problems with other forms of serialization or use of the .__module__ attribute in general. Why can't we just provide a from __future__ import renamed_modules which then provides all the new name to old name mappings in some form (e.g. module proxies or whatever) and leave the existing modules in 2.x untouched ? -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 19 2008) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Module renaming and pickle mechanisms
On 2008-05-17 16:59, Alexandre Vassalotti wrote: On Sat, May 17, 2008 at 5:05 AM, M.-A. Lemburg [EMAIL PROTECTED] wrote: I'd like to bring a potential problem to attention that is caused by the recent module renaming approach: Object serialization protocols like e.g. pickle usually store the complete module path to the object class together with the object. Thanks for bringing this up. I was aware of the problem myself, but I hadn't yet worked out a good solution to it. It can also happen in storage setups where Python objects are stored using e.g. pickle, ZODB being a prominent example. As soon as a Python 2.6 application starts writing to such storages, Python 2.5 and lower versions will no longer be able to read back all the data. The opposite problem exists for Python 3.0, too. Pickle streams written by Python 2.x applications will not be readable by Python 3.0. And, one solution to this is to use Python 2.6 to regenerate pickle stream. Another solution would be to write a 2to3 pickle converter using the pickletools module. It is surely not the most elegant or robust solution, but I could work. I'm not really worried much about going from 2.x to 3.x. Breakage is allowed for that transition. However, the case is different for going from 2.5 to 2.6. Breakage should be avoided if at all possible. Now, I think there's a way to solve this puzzle: Instead of renaming the modules (e.g. Queue - queue), we leave the code in the existing modules and packages and instead add the new module names and package structure with pointers and redirects to the existing 2.5 modules. This would certainly work for simple modules, but what about packages? For packages, you can't use the ``sys.modules[__name__] = Queue`` to preserve module identity. Therefore, pickle will use the new package name when writing its streams. So, we are back to the same problem again. A possible solution could be writing a compatibility layer for the Pickler class, which would map new module names to their old at runtime. Again, this is neither an elegant, nor robust, solution, but it should work in most cases. While it's possible to fix pickle (at least the Python version), this would not help with other serialization formats that rely on the .__module__ attribute mapping to an existing module. It's better to address the problem at the module level. Perhaps I have a misunderstanding of the reasoning behind doing the renaming in the 2.x branch, but it appears that the only reason is to get used to the new names. That's a rather low priority argument in comparison to the breakage the renaming will cause in the 2.x branch. I think it's much better to have 2to3.py do the renaming and only add warnings to the renamed modules in 2.x (without actually applying any renaming). It would also be possible to seed sys.modules with module proxy objects (see e.g. mx.Misc.LazyModule from egenix-mx-base) which only turn into real module object if the module is referenced. This would allow adding a from __future__ import new_module_names which then results in loading proxies for all renamed modules (without actually loading the modules until they are used under their new names). -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 18 2008) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] Module renaming and pickle mechanisms
I'd like to bring a potential problem to attention that is caused by the recent module renaming approach: Object serialization protocols like e.g. pickle usually store the complete module path to the object class together with the object. They access this module path by looking at the __module__ attribute of the object classes. With the renaming, all objects which use classes from the renamed modules will now refer to the renamed modules in their serialized form, e.g. queue.Queue instead of Queue.Queue (just to name one example). While this is nice for forward compatibility, it causes rather serious problems for making object serialization backwards compatible, since the older Python versions can no longer unserialize objects due to missing modules. This can happen in client-server setups where e.g. the server uses Python 2.6 and the clients some other Python version (e.g. Python 2.5). It can also happen in storage setups where Python objects are stored using e.g. pickle, ZODB being a prominent example. As soon as a Python 2.6 application starts writing to such storages, Python 2.5 and lower versions will no longer be able to read back all the data. Now, I think there's a way to solve this puzzle: Instead of renaming the modules (e.g. Queue - queue), we leave the code in the existing modules and packages and instead add the new module names and package structure with pointers and redirects to the existing 2.5 modules. Code can (and probably should) still be changed to try to import the new module name. In cases where backwards compatibility is needed, this can also be done using try: import newname except ImportError: import oldname Later on, when porting applications to 3.0, the 2to3 script can then apply the final renaming in the source code. Example: queue.py: - import sys, Queue sys.modules[__name__] = Queue -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 17 2008) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] How best to handle the docs for a renamed module?
On 2008-05-12 04:34, Brett Cannon wrote: For the sake of argument, let's consider the Queue module. It is now named queue. For 2.6 I plan on having both Queue and queue listed in the index, with Queue deprecated with instructions to use the new name. But what to do about all the references. Should we leave them pointing at Queue to lessen confusion for people who read about some module on some other site that isn't using the new name, or update everything in 2.6 to use the new name? How hard would it be to add a redirects from the old pages to the new ones ? mod_rewrite does wonders - well, provided you find the right patterns... -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 16 2008) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Distutils configparser rename
On 2008-05-15 22:33, A.M. Kuchling wrote: Python 2.6 renames the ConfigParser module to be configparser. Distutils imports ConfigParser in various places. I just made a commit updating the import in one places, and then noticed that part of commit r63248, which made the same change, was reverted in order to preserve backward-compatibility. Instead, the default path will include lib-old again to keep the old module name available. I suggest dropping that goal, though. We've preserved compatibility but I'm not aware that anyone uses the Python 2.x Distutils with earlier versions of Python. In particular: * There's no standalone distutils package on PyPI, nor can I find such a package with a general web search. Am I missing it? * I do not see users advising other users to use a later version of Distutils to fix their problems. Is anyone actually benefiting from the effort of maintaining backward compatibility? Yes: all the folks who want to create distutils packages for more than just the current Python version. I've argued for this a couple of times in the past. Some background: In order to build a Python package for a previous Python version, you have to run distutils using that older Python version. Now, as distutils evolves, new features are added, bugs are fixed, etc. so as packager you always want to use the latest distutils version available - even with older Python releases. In some cases, e.g. PyPI registration, this may even be necessary, since the new versions of those commands need to be kept in sync with the PyPI repository. Another aspect is keeping package setup.py files working. If you need to support multiple Python versions, then your setup.py will have to work with multiple different versions of distutils. Since performance doesn't really matter for distutils, it is well possible and easy to keep compatibility with a few releases back. This has worked great in the past and I don't see why we should break this, as recent distutils checkins have done. Note that Python doesn't exactly make it easy to ship Python packages. You have several different dimensions to take into consideration: * Python version * UCS2/UCS4 * Platform and processor type * 32/64-bit So there already is a lot of porting effort needed to support a reasonable number of targets. I don't think it takes a lot of effort to keep distutils running with Python 2.3 and 2.4. In the past I've usually rewritten parts of distutils that were modified in incompatible ways. I haven't been able to that for the recent checkins that broke distutils even on Python 2.4. Thanks, -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 16 2008) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Symbolic errno values in error messages
On 2008-05-16 16:15, Nick Coghlan wrote: Alexander Belopolsky wrote: Yannick Gingras ygingras at ygingras.net writes: 2) Where can I find the symbolic name in C? Use standard C library char* strerror(int errnum) function. You can see an example usage in Modules/posixmodule.c (posix_strerror). I don't believe that would provide adequate Windows support. Well, there's still the idea of a winerror module: http://bugs.python.org/issue1505257 Perhaps someone can pick it up and turn it into a (generated) C module ?! -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 16 2008) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Symbolic errno values in error messages
On 2008-05-16 17:02, Alexander Belopolsky wrote: On Fri, May 16, 2008 at 10:52 AM, Yannick Gingras [EMAIL PROTECTED] wrote: print e [Errno 21] Is a directory So now I am not sure what OP is proposing. Do you want to replace 21 with EISDIR in the above? Yes, that's what I had in mind. In this case, I have a more drastic proposal. Lets change EnvironmentError errno attribute (myerrno in C) to string. -1 You never want to change an integer field to a string. 'EXYZ' strings can be interned, which will make them more efficient than integers for lookups and comparisons (to literals). A half-way and backward compatible solution would be to stick 'EXYZ' code at the end of the args tuple and add an errnosym attribute. Actually, you don't have to put it into any tuple. Just add it to the error object as attribute. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 16 2008) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Addition of pyprocessing module to standard lib.
On 2008-05-14 14:15, Jesse Noller wrote: On Wed, May 14, 2008 at 5:45 AM, Christian Heimes [EMAIL PROTECTED] wrote: Martin v. Löwis schrieb: I'm worried whether it's stable, what user base it has, whether users (other than the authors) are lobbying for inclusion. Statistically, it seems to be not ready yet: it is not even a year old, and has not reached version 1.0 yet. I'm on Martin's side here. Although I like to see some sort of multi processing mechanism in Python 'cause I need it for lots of projects I'm against the inclusion of pyprocessing in 2.6 and 3.0. The project isn't old and mature enough and it has some competitors like pp (parallel processing). On the one hand the inclusion of a package gives it an unfair advantage over similar packages. On the other hand it slows down future development because a new feature release must be synced with Python releases about every 1.5 years. -0.5 from me Christian I said this in reply to Martin - but the competitors (in my mind) are not as compelling due to the alternative paradigm for application construction they propose. The processing module is an easy win for us if included. Personally - I don't see how inclusion in the stdlib would slow down development - yes, you have to stick with the same release cycle as python-core, but if the module is feature complete and provides a stable API as it stands I don't see following python-core timelines as overly onerous. The module itself doesn't change that frequently - the last release in April was a bugfix release and API consistency change (the API would have to be locked for inclusion obviously - targeting a 2.7/3.1 release may be advantageous to achieve this). Why don't you start a parallel-sig and then hash this out with other distributed computing users ? You could then reach a decision by the time 2.7 is scheduled for release and then add the chosen module to the stdlib. The API of the processing module does look simple and nice, but parallel processing is a minefield - esp. when it comes to handling error situations (e.g. a worker failing, network going down, fail-over, etc.). What I'm missing with the processing module is a way to spawn processes on clusters (rather than just on a single machine). In the scientific world, MPI is the standard API of choice for doing parallel processing, so if we're after standards, supporting MPI would seem to be more attractive than the processing module. http://pypi.python.org/pypi/mpi4py In the enterprise world, you often find CORBA based solutions. http://omniorb.sourceforge.net/ And then, of course, you have a gazillion specialized solutions such as PyRO: http://pyro.sourceforge.net/ OTOH, perhaps the stdlib should just include entry-level support for some form of parallel processing, in which case processing does look attractive. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 14 2008) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Tool for converting %-formatting to .format()ing ?
On 2008-05-10 01:18, Martin v. Löwis wrote: Is there a tool available that can convert 2.x code automagically to the .format() method syntax ? Just did a quick grep of our code base and it has some 2000 lines of code that would need to be changed. Why do you think this code needs to change? I'd leave all the code as-is, and might not start using .format before Python 3.2, unless some coding convention says I have to. True, just wanted to know whether there is such a tool. I personally like the %-notation a lot, mainly because it's more or less the same as in C. %i, %s and %r are by far the most used format characters in our code base. Determining the position index and writing {0!s} or {0!r} instead (which requires quite a finger dance on a German keyboard) doesn't make .format() really attractive, IMHO. Perhaps you're right and it's better to wait a few rounds of refinements of .format() before jumping on that train :-) -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 10 2008) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] [Python-3000] [Python-checkins] r62848 - python/trunk/Objects/setobject.c
On 2008-05-08 13:59, Barry Warsaw wrote: On May 8, 2008, at 7:54 AM, Benjamin Peterson wrote: On Thu, May 8, 2008 at 6:32 AM, Barry Warsaw [EMAIL PROTECTED] wrote: Since the trunk buildbots appear to be mostly happy (well those that are connected anyway), and because I couldn't get the releases out last night, I'll let this one slide. I'd like to find a way to more forcefully enforce commit freezes for the betas though. I wonder if you couldn't alter the server side commit hook to reject everything with the message Sorry, we're in a freeze. (You'd have to make an exception for yourself.) This is exactly what I'm thinking about! +1, that's easy to do with Subversion and doesn't hurt anyone. Please also use a term like freeze or frozen in the subject line of the announcement - perhaps even in capital letters. Thanks, -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 09 2008) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] Tool for converting %-formatting to .format()ing ?
Is there a tool available that can convert 2.x code automagically to the .format() method syntax ? Just did a quick grep of our code base and it has some 2000 lines of code that would need to be changed. Thanks, -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 09 2008) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Tool for converting %-formatting to .format()ing ?
On 2008-05-09 15:29, [EMAIL PROTECTED] wrote: mal Is there a tool available that can convert 2.x code automagically mal to the .format() method syntax ? mal Just did a quick grep of our code base and it has some 2000 lines mal of code that would need to be changed. I suggested a 2to3 fixer for this but was shot down. Well, ideally such a tool should address 2to2 :-) -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 09 2008) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] [Python-3000] Reminder: last alphas next Wednesday 07-May-2008
On 2008-05-04 18:14, Christian Heimes wrote: First, Skip, I *only* care about the default behavior. There's already a way to do it differently: PYTHONPATH. So, Fred, I think what you're arguing for is to drop this feature entirely. Or is there some other use for a new way to allow users to explicitly add something to sys.path, aside from PYTHONPATH? It seems that it would add more complexity and I can't see what the value would be. PYTHONPATH is lacking one feature which is important for lots of packages and setuptools. The directories in PYTHONPATH are just added to sys.path. But setuptools require a site package directory. Maybe a new env var PYTHONSITEPATH could solve the problem. We don't need another setup variable for this. Just place a well-known module into the site-packages/ directory and then query it's __file__ attribute, e.g. site-packages/site_packages.py The module could even include a few helpers to query various settings which apply to the site packages directory, e.g. site_packages.get_dir() site_packages.list_packages() site_packages.list_modules() etc. As I've said a dozen times in this thread already, the feature I'd like to get from a per-user installation location is that 'setup.py install', or at least some completely canonical distutils incantation, should work, by default, for non-root users; ideally non-administrators on windows as well as non-root users on unixish platforms. The implementation of my PEP provides a new option for install: $ python setup.py install --user Is it sufficient for you? Just in case you don't know... python setup.py install --home=~ will install to ~/lib/python The problem is not getting the packages installed in a non-admin location. It's about Python looking in a non-admin location per default (as well as in the site-packages location). -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 04 2008) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] [Python-3000] Reminder: last alphas next Wednesday 07-May-2008
On 2008-05-04 21:57, Christian Heimes wrote: M.-A. Lemburg schrieb: PYTHONPATH is lacking one feature which is important for lots of packages and setuptools. The directories in PYTHONPATH are just added to sys.path. But setuptools require a site package directory. Maybe a new env var PYTHONSITEPATH could solve the problem. We don't need another setup variable for this. Just place a well-known module into the site-packages/ directory and then query it's __file__ attribute, e.g. site-packages/site_packages.py The module could even include a few helpers to query various settings which apply to the site packages directory, e.g. site_packages.get_dir() site_packages.list_packages() site_packages.list_modules() etc. I don't see how it is going to solve the use case Add another site package directory when I don't have write access to the global site package directory and I don't want to modify my apps. No, but it's going to solve the issue which of the sys.path directories is to be considered the site packages directory. I was under the impression that this is what you were after. Just in case you don't know... python setup.py install --home=~ will install to ~/lib/python The problem is not getting the packages installed in a non-admin location. It's about Python looking in a non-admin location per default (as well as in the site-packages location). I know the --home option. For one the --home option is Unix only and not supported on Windows Also the --user option takes all options of my PEP 370 user site directory into account, includinge the PYTHONUSERBASE env var. Ok. Just wanted to mention that there is a precedent in distutils for doing user home directory installations. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 04 2008) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Encoding detection in the standard library?
On 2008-04-23 07:26, Terry Reedy wrote: Martin v. Löwis [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] | I certainly agree that if the target set of documents is small enough it | | Ok. What advantage would you (or somebody working on a similar project) | gain if chardet was part of the standard library? What if it was not | chardet, but some other algorithm? It seems to me that since there is not a 'correct' algorithm but only competing heuristics, encoding detection modules should be made available via PyPI and only be considered for stdlib after a best of breed emerges with community support. +1 Though in practice, determining the best of breed often becomes a problem (see e.g. the JSON implementation discussion). -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Apr 23 2008) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Encoding detection in the standard library?
On 2008-04-21 23:31, Martin v. Löwis wrote: This is useful when you get a hunk of data which _should_ be some sort of intelligible text from the Big Scary Internet (say, a posted web form or email message), and you want to do something useful with it (say, search the content). I don't think that should be part of the standard library. People will mistake what it tells them for certain. +1 I also think that it's better to educate people to add (correct) encoding information to their text data, rather than give them a guess mechanism... http://chardet.feedparser.org/docs/faq.html#faq.yippie chardet is based on the Mozilla algorithm and at least in my experience that algorithm doesn't work too well. The Mozilla algorithm may work for Asian encodings due to the fact that those encodings are usually also bound to a specific language (and you can then use character and word frequency analysis), but for encodings which can encode far more than just a single language (e.g. UTF-8 or Latin-1), the correct detection rate is rather low. The problem becomes completely even more difficult when leaving the normal text domain or when mixing languages in the same text, e.g. when trying to detect source code with comments using a non-ASCII encoding. The trick to just pass the text through a codec and see whether it roundtrips also doesn't necessarily help: Latin-1, for example, will always round-trip, since Latin-1 is a subset of Unicode. IMHO, more research has to be done into this area before a standard module can be added to the Python's stdlib... and who knows, perhaps we're lucky and by the time everyone is using UTF-8 anyway :-) -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Apr 22 2008) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Encoding detection in the standard library?
[CCing python-dev again] On 2008-04-22 12:38, Greg Wilson wrote: I don't think that should be part of the standard library. People will mistake what it tells them for certain. [etc] These are all good arguments, but the fact remains that we can't control our inputs (e.g., we're archiving mail messages sent to lists managed by DrProject), and some of those inputs *don't* tell us how they're encoded. Under those circumstances, what would you recommend? I haven't done much research into this, but in general, I think it's better to: * first try to look at other characteristics of a text message, e.g. language, origin, topic, etc., * then narrow down the number of encodings which could apply, * rank them to try to avoid ambiguities and * then try to see what percentage of the text you can decode using each of the encodings in reverse ranking order (ie. more specialized encodings should be tested first, latin-1 last). -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Apr 22 2008) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Encoding detection in the standard library?
On 2008-04-22 18:33, Bill Janssen wrote: The 2002 paper A language and character set determination method based on N-gram statistics by Izumi Suzuki and Yoshiki Mikami and Ario Ohsato and Yoshihide Chubachi seems to me a pretty good way to go about this. Thanks for the reference. Looks like the existing research on this just hasn't made it into the mainstream yet. Here's their current project: http://www.language-observatory.org/ Looks like they are focusing more on language detection. Another interesting paper using n-grams: Language Identification in Web Pages by Bruno Martins and Mário J. Silva http://xldb.fc.ul.pt/data/Publications_attach/ngram-article.pdf And one using compression: Text Categorization Using Compression Models by Eibe Frank, Chang Chui, Ian H. Witten http://portal.acm.org/citation.cfm?id=789742 They're looking at LSEs, language-script-encoding triples; a script is a way of using a particular character set to write in a particular language. Their system has these requirements: R1. the response must be either correct answer or unable to detect where unable to detect includes other than registered [the registered set of LSEs]; R2. Applicable to multi-LSE texts; R3. never accept a wrong answer, even when the program does not have enough data on an LSE; and R4. applicable to any LSE text. So, no wrong answers. The biggest disadvantage would seem to be that the registration data for a particular LSE is kind of bulky; on the order of 10,000 shift-codons, each of three bytes, about 30K uncompressed. http://portal.acm.org/ft_gateway.cfm?id=772759type=pdf For a server based application that doesn't sound too large. Unless you're using a very broad scope, I don't think that you'd need more than a few hundred LSEs for a typical application - nothing you'd want to put in the Python stdlib, though. Bill IMHO, more research has to be done into this area before a standard module can be added to the Python's stdlib... and who knows, perhaps we're lucky and by the time everyone is using UTF-8 anyway :-) I walked over to our computational linguistics group and asked. This is often combined with language guessing (which uses a similar approach, but using characters instead of bytes), and apparently can usually be done with high confidence. Of course, they're usually looking at clean texts, not random stuff. I'll see if I can get some references and report back -- most of the research on this was done in the 90's. Bill -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Apr 22 2008) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Python 32- and 64-bit living together
On 2008-04-11 19:10, Sérgio Durigan Júnior wrote: Hi all, My question is simple: is there any problem when installing/using both 32- and 64-bit Python's on the same machine? I'm more concerned about header files (those installed under /usr/include/python-2.x), because as far as I could see there's nothing similar to a #ifdef USE_64BIT or something on them. The include files are all static and can be used on both 32-bit and 64-bit platforms or installations. Only the /usr/lib/python2.x files differ between 32-bit and 64-bit (the configuration files are in /usr/lib/python2.x/config). -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Apr 11 2008) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Python 32- and 64-bit living together
On 2008-04-11 20:21, Sérgio Durigan Júnior wrote: Hi Lemburg, On Fri, 2008-04-11 at 19:38 +0200, M.-A. Lemburg wrote: On 2008-04-11 19:10, Sérgio Durigan Júnior wrote: Hi all, My question is simple: is there any problem when installing/using both 32- and 64-bit Python's on the same machine? I'm more concerned about header files (those installed under /usr/include/python-2.x), because as far as I could see there's nothing similar to a #ifdef USE_64BIT or something on them. The include files are all static and can be used on both 32-bit and 64-bit platforms or installations. Thanks :-). Only the /usr/lib/python2.x files differ between 32-bit and 64-bit (the configuration files are in /usr/lib/python2.x/config). Hmm, right. I tried to modify the installation path (using --libdir in ./configure) to /usr/lib64, but some *.pyo objects still are installed under /usr/lib. AFAIK, these objects are bitness-dependent (i.e., if they were generated by a 32-bit Python, they can only be execute by a 32-bit Python - and vice-versa), right? Right. Is there any way to separate these arch-dependent files in /usr/lib and /usr/lib64 depending on their bitness? There's no need for that. Only the config/ dir which is included in the Python lib dir is dependent on the Python configuration. Thanks, P.S.: I think this misbehaviour of --libdir is a bug. IMHO, it should put every arch-dependent file in the path that the user provided. You should probably have a look at how RedHat or openSUSE solve these problems. Some of them have patched Python to fit their needs. You may have to do that as well. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Apr 11 2008) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Python 32- and 64-bit living together
On 2008-04-11 22:25, Sérgio Durigan Júnior wrote: On Fri, 2008-04-11 at 22:06 +0200, M.-A. Lemburg wrote: Hmm, right. I tried to modify the installation path (using --libdir in ./configure) to /usr/lib64, but some *.pyo objects still are installed under /usr/lib. AFAIK, these objects are bitness-dependent (i.e., if they were generated by a 32-bit Python, they can only be execute by a 32-bit Python - and vice-versa), right? Right. Sorry, I misread you question. PYO and PYC files are *not* dependent on 32/64 bit sizes. Is there any way to separate these arch-dependent files in /usr/lib and /usr/lib64 depending on their bitness? There's no need for that. Only the config/ dir which is included in the Python lib dir is dependent on the Python configuration. I'm afraid I still don't understand your point. I mean, if the *.pyo file *is* dependent on the bitness of the Python interpreter (as you confirmed in my first question), therefore when I decide to have both 32- and 64-bit Python on my system I *must* have two versions of every .pyo file: one for 32- and another for 64-bit Python. What I've missed? Sorry for the confusion. You should probably have a look at how RedHat or openSUSE solve these problems. Some of them have patched Python to fit their needs. You may have to do that as well. I'll sure take a look at them. Thanks! -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Apr 11 2008) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] fixing broken build
On 2008-03-27 09:20, Christian Heimes wrote: Neal Norwitz schrieb: Christian, Please fix the build on the various buildbots that are failing or revert your changes for unicode literals. The build failures started to occur at r61953. There were several more (~5) follow up checkins. You can find all the failures here: http://www.python.org/dev/buildbot/all/ There seem to be at least two variations for how setup.py is failing. See below. I've already fixed the problem in r61956. I didn't noticed the issue with a non initialized var until I compiled Python without pydebug. In order to fix the problem on the build bots one has to remove all pyc and pyo files. I'm not sure why that's necessary, but whenever you change something in the compiler, please remember to update the PYC magic. I'd also suggest that you run a non-debug build of Python to test any checkins before committing them. The debug builds change various ways the code is built. Thanks, -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Mar 27 2008) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Decimal(unicode)
On 2008-03-26 07:11, Martin v. Löwis wrote: For binary representations, we already have the struct module to handle the parsing, but for byte sequences with embedded ASCII digits it's reasonably common practice to use strings along with the respective type constructors. Sure, but why can't you write foo = int(bar[start:stop].decode(ascii)) then? Explicit is better than implicit. Agreed. The whole purpose of Unicode is to store text. Data from a file isn't text per-se. You have to tell Python that a particular set of bytes is to be interpreted as text and that only works by explicitly converting the bytes to text. Numbers or digits aren't any different in this context. b1234 is just a sequence of bytes and could well represent the binary encoding of an integer, the start of a base64 encoded image, an SSH key or an audio file. Don't get fooled by the looks of b1234. It's really just a shorter way of writing 0x31 0x32 0x33 0x34. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Mar 26 2008) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: from __future__ import unicode_string_literals
On 2008-03-24 09:22, Lennart Regebro wrote: I think 2to3 is a procedure that will work well for library type projects with a reasonably small set of developers that make regular releases. There you can release both a python 2 and a python 3 version of the module, for example. ... So, in short: Large projects with interconnected modules where the developers and users of module are the same people will have big difficulties with the 2to3 approach and would be the people who are most likely to not be able to in practice go forward to Python 3 unless they have some sort of smooth path forward. I don't think there's a lot to worry about: Companies using Python for applications typically have a completely different life-cycle of releases and applications compared to the Python release schedule, i.e. they often still run Python 2.3 or 2.4 and wait for major releases to settle before deciding to port to them. Every now and then, they make the decision to port to the next release (for the next version of their software) and this change is then managed accordingly - sometimes skipping a complete major release of Python. In such projects, 2to3 will get applied to the sources once and then all development continues on the Python 3.0 version of the code. In reality, I don't think that 2to3 will get used for continuous porting between a 2.x code base and a 3.0 one all that much. The transition from 2.x to 3.0 will happen during a longer period of time (probably a few years) and depend a lot on the release cycle of the applications using Python, whether or not the 3.0 version provides better features, more performance, etc. and whether the 2.x branches of Python and the used 3rd party modules are still supported or not. New applications will likely choose 3.0 right away - provided that the needed 3rd party modules are available and stable enough. In summary: 2to3 is a very useful tool to have. Whether or not it is used for continuous porting between the two worlds is really secondary. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Mar 25 2008) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] How we can get rid of eggs for 2.6 and beyond
On 2008-03-21 14:47, Phillip J. Eby wrote: So, to accomplish this, we (for some value of we) need to: 1. Hash out consensus around what changes or enhancements are needed to PEP 262, to resolve the previously-listed open issues, those that have come up since (namespace packages, dependency specifications, canonical name/version forms), and anything else that comes up. 2. Update or replace the implementation as appropriate, and modify the distutils to support it in Python 2.6 and beyond. And support it means, ensure that 'install' and *all* bdist commands update the database. The bdist_rpm, bdist_wininst, and bdist_msi commands, even bdist_dumb. (This should probably also include the add/remove programs stuff in the Windows case.) The bdist commands don't need to touch that database in any way, since they don't install anything, nor do they upload things anywhere. They simply package code and put the result into the dist/ subdir. That's all. What you probably mean is that the installers, pre/post-scripts, etc. run when installing one of those packages should update the database of installed packages. Note that there are several package formats which do not execute any code when installing them - the user simply unzips them in some directory. These packages won't be able to register themselves with a database. I guess the only way to support all of these variants is to use a filesystem based approach, e.g. by placing a file with a special extension into some dir on sys.path. The database logic could then scan sys.path for these files, read the data and provide an interface to it. All bdist formats would then have to include these files. distutils already writes .egg-info files when running python setup.py install, so perhaps that's a start (though I'd prefer a three letter extension such as .pkg). .egg-info files currently only include the package meta-data (the PKG-INFO section from PEP 262). We'd have to add a list of files making up the package (FILES section in PEP 262) and also some extra information about any extra files the package creates that can safely be removed in the uninstall process (e.g. .pyo and .pyc files, temporary files, database files, configuration data, registry entries, etc.) - this is currently not covered in PEP 262. I don't think the REQUIRES and PROVIDES sections from the PEP 262 are needed. That info can easily go into the PKG-INFO section. A separate FILES section also doesn't seem to be necessary - we could just add one or more entries or the format: CreatesDir abc/ CreatesFile abc/xyz1.py CreatesDir abc/def/ CreatesFile abc/def/xyz2.py CreatesFile abc/def/xyz3.py CreatesFile abc/def/xyz4.ini (BTW: wininst writes such a file for the uninstall process) So to keep things simple, the rfc822 approach defined in PEP 241 would easily cover everything needed and we could trim down the PEP 262 format to a simple rfc822 header list. In other words: the .egg-info files already provide the basis and only need to be extended with a list of created files, directories (and possibly other resources) as well as a list of resources which may be removed even if not installed explicitly such as byte-code files, etc. 3. Create a document for system packagers referencing the PEP and introducing them to what/why/how of the standard, in case they weren't one of the original participants in creating this. This should probably be a new PEP defining all the bits and pieces making up the installation database. It will probably take some non-trivial work to do all this for Python 2.6, but it's probably possible, if we start now. I don't think it's critical to have an uninstall tool distributed with 2.6, as long as there's a reasonable way to bootstrap its installation later. BTW: There's a simple uninstall command in mxSetup.py that we could contribute to distutils. It works much in the same way as the install command... except that it removes all the files it would have installed. Using pre-built packages, this works without having to rebuild the package just to be able to determine the list of things that need to be removed. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Mar 21 2008) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe:
Re: [Python-Dev] Proposal: from __future__ import unicode_string_literals
On 2008-03-21 22:32, Martin v. Löwis wrote: It's not implementable because the work has to occur in ast.c (see Py_UnicodeFlag). It can't occur later, because you need to skip the encoding being done in parsestr(). But the __future__ import can only be interpreted after the AST is built, at which time the encoding has already been applied. I think it would be possible to check for future statements on the basis of nodes already. Take a look at how Python 2.3 implemented future statements (why was that rewritten to use the AST, anyway?). As for it not making sense, this is really in the realm of 2to3. I'm beginning to really believe this statement in PEP 3000: There is still the original use case of people who don't want to run 2to3 (for whatever reasons - mostly probably subjective ones), and who would rather run a single code base unmodified. They don't care that documentation tells them this is impossible, when they feel they are so close to making it possible. Could we point them to a special byte-code compiler such as Andrew Dalke's python4ply: http://dalkescientific.com/Python/python4ply.html That approach appears to be a lot easier to implement than trying to tweak the C implementation of the Python parser. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Mar 21 2008) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] How we can get rid of eggs for 2.6 and beyond
On 2008-03-21 22:21, Phillip J. Eby wrote: At 08:06 PM 3/21/2008 +0100, M.-A. Lemburg wrote: I guess the only way to support all of these variants is to use a filesystem based approach, e.g. by placing a file with a special extension into some dir on sys.path. The database logic could then scan sys.path for these files, read the data and provide an interface to it. All bdist formats would then have to include these files. That's the idea behind the current version of PEP 262, yes, and I think it should be kept. A separate FILES section also doesn't seem to be necessary - we could just add one or more entries or the format: CreatesDir abc/ CreatesFile abc/xyz1.py CreatesDir abc/def/ CreatesFile abc/def/xyz2.py CreatesFile abc/def/xyz3.py CreatesFile abc/def/xyz4.ini I actually think the size and hash information is good, in order to be able to tell if you're looking at an original file. I'm not sure how useful the permissions and uid/gid info is. I'm hoping we'll hear from anybody who has a use case for that. You're heading off in the wrong direction: we should not be trying to rewrite RPM or InnoSetup in Python. Anything more complicated should be left to tools which are specifically written to manage complex software setups. I honestly believe that most people would be happy if we just provide these two things (and no more): * install a package from a local archive, a URL or PyPI * uninstall a package in way that doesn't break other installed packages and whatever the mechanism, avoid making any undercover changes to the Python installation such as adding .pth files, overriding site.py, etc. - these are not needed if the tool keeps to the simple task of installing and uninstalling Python packages. Examples: python pypi.py install mypkg-1.0.tgz python pypi.py install http://www.example.com/mypkg-1.0.tgz python pypi.py install mypkg-1.0 python pypi.py uninstall mypkg If there's a dependency problem, the tool should print the list of other packages it needs. It should not try to install things automagically. If a package needs other modules as well, the package docs can point the user to use e.g. python pypi.py install mydep1-1.3 mydep2-2.3 mydep4-0.3 mypkg-1.0 instead. Anything more complicated should be left to specialized tools such as RPM, apt, MSI or the other such tools out there - after all the tool should be about Python *package* installation, not application installation. We *don't* need the tool to: * support multiple versions of a package (that's just bound to cause problems with pickle, isinstance() etc.) * provide namespace hacking (is a completely separate issue and can be handled by the packages rather than the install tool) * support all kinds of funky version numbers (if a package wants to participate in the system, the author better make sure that the version string fits the standard format) * provide some form of intra-package bus interface (ie. entry points as you call them) * provide support for keeping whole packages in ZIP files (doesn't play well with C extensions, clutters up the sys.path, is read-only, needs special importers, etc. etc. ) * try automatic version matching for required packages * download things from SourceForge or other sites with special download mechanisms * scan websites for links * make coffee, clean the house, send the kids to school :-) And of course, there are still some issues to be resolved regarding requirements, package name/version stuff, etc. But we can hash those out once we reach a quorum on the Distutils-SIG. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Mar 21 2008) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Consistent platform name for 64bit windows (was: distutils.util.get_platform() for Windows)
On 2008-03-18 18:05, [EMAIL PROTECTED] wrote: I'm reviving a very old thread based on discussions with Martin at pycon. Sent: Monday, 23 July 2007 5:12 PM Subject: Re: [Distutils] distutils.util.get_platform() for Windows Rather than forcing everyone to read the context, allow me to summarize: On 64bit Windows versions, we need a string that identifies the platform, and this string should ideally be used consistently. This original thread related to the files created by distutils (eg, pywin32-210.win???64??-py2.6.exe) but it seems obvious that we should be consistent wherever Python wants to display the platform (eg, in the startup banner, in platform.py, etc). In the old thread, there was a semi-consensus that 'x86_64' be used by distutils (and indeed, Lib/distutils/util.py in get_platform() has been changed, by me, to use this string), but the Python 'banner', for example, reports AMD64. Platform.py doesn't report much at all in this area, at least when pywin32 isn't installed, but it arguably should. Both Martin and I prefer AMD64 as the string, for various reasons. Firstly, it is less ugly than 'x86_64', and doesn't include an '_'/'-' which might tend to confuse parsing by humans or computers. Martin also made the point that AMD invented the architecture and AMD64 is their preferred name, so we should respect that. So, at the risk of painting a bike-shed, I'd like to propose that we adopt 'AMD64' in distutils (needs a change), platform.py (needs a change to use sys.getwindowsversion() in preference to pywin32, if possible, anyway), and the Python banner (which already uses AMD64). Any objections? Any strong feelings that using 'AMD' will confuse people with Intel processors? Strong feelings about the parsability of the name (PJE? wink)? Strong feelings about the color wink? Not really an object, but Microsoft itself uses the term x64 for the 64-bit variants of their OS, e.g. http://www.microsoft.com/windowsxp/64bit/default.mspx Since the platform name is targeting Windows, I think we should avoid confusing Windows users more than Intel users ;-) About the platform.py changes: if someone could provide the return values of sys.getwindowsversion() for 64bit versions of Windows XP and Vista, I could add support for it (don't have a 64bit version of Windows available to check myself). Thanks, -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Mar 20 2008) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Consistent platform name for 64bit windows (was: distutils.util.get_platform() for Windows)
On 2008-03-20 13:42, Thomas Heller wrote: M.-A. Lemburg schrieb: About the platform.py changes: if someone could provide the return values of sys.getwindowsversion() for 64bit versions of Windows XP and Vista, I could add support for it (don't have a 64bit version of Windows available to check myself). This is the output of a 32-bit Python running on Windows XP Professional x64 Edition, Version 2003, Service Pack 2: C:\Python24ver Microsoft Windows [Version 5.2.3790] C:\Python24python Python 2.4.4 (#71, Oct 18 2006, 08:34:43) [MSC v.1310 32 bit (Intel)] on win32 Type help, copyright, credits or license for more information. import sys sys.getwindowsversion() (5, 2, 3790, 2, 'Service Pack 2') Thank you ! Anyone with a 64bit Vista ? Or even better: a page documenting what to expect from the system call behind the sys.getwindowsversion() API ? -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Mar 20 2008) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Consistent platform name for 64bit windows (was: distutils.util.get_platform() for Windows)
On 2008-03-20 13:55, M.-A. Lemburg wrote: On 2008-03-20 13:42, Thomas Heller wrote: M.-A. Lemburg schrieb: About the platform.py changes: if someone could provide the return values of sys.getwindowsversion() for 64bit versions of Windows XP and Vista, I could add support for it (don't have a 64bit version of Windows available to check myself). This is the output of a 32-bit Python running on Windows XP Professional x64 Edition, Version 2003, Service Pack 2: C:\Python24ver Microsoft Windows [Version 5.2.3790] C:\Python24python Python 2.4.4 (#71, Oct 18 2006, 08:34:43) [MSC v.1310 32 bit (Intel)] on win32 Type help, copyright, credits or license for more information. import sys sys.getwindowsversion() (5, 2, 3790, 2, 'Service Pack 2') Thank you ! Anyone with a 64bit Vista ? Or even better: a page documenting what to expect from the system call behind the sys.getwindowsversion() API ? FYI: I added winreg and sys.getwindowsversion() support in r61674. platform.machine() and .processor() will now use the environment variables PROCESSOR_ARCHITECTURE and PROCESSOR_IDENTIFIER where available (should work on Windows XP and later). According to http://support.microsoft.com/kb/888731 platform.machine() will return AMD64, so I guess the x64 string is just a marketing name for 64-bit platforms on Windows and the underlying system uses AMD64 as machine type name. For x86 processors, you'll now get x86 on Windows XP and later. For Itanium processors, you should get IA64 according to this WOW64 page: http://msdn2.microsoft.com/en-us/library/aa384274(VS.85).aspx -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Mar 20 2008) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 365 (Adding the pkg_resources module)
On 2008-03-20 21:34, Paul Moore wrote: Also, setuptools-based packages *can* build bdist_wininst installers. (In fact, if memory serves, I added that feature at your request.) I know. python setup.py bdist_wininst. And thank you for adding it. But again you miss my point. People are starting to omit distributing bdist_wininst installers in favour of eggs only. And you cannot (to my knowledge) convert an egg into a bdist_wininst installer, and if you can't compile from source (a C extension with complex dependencies, for example) you're stuck (in the sense that you're forced to use eggs without add/remove programs support). You might want to look at the eGenix pre-built packages as an alternative: they include all the information necessary to let standard distutils continue its works *after* the build step. It's basically a distribution of the package as it looks after the build step has run, but before the package is wrapped up using a packager like bdist_wininst or bdist_msi or installed into the system. You can download the pre-built package and create e.g. an MSI installer or a wininst EXE without needing a compiler - in addition to providing all the options of the standard distutils install command (which makes repackaging them as part of larger applications easy as well). All the logic for this is included in mxSetup.py which ships with the pre-built packages. http://www.egenix.com/products/python/mxBase/#Download http://www.egenix.com/products/python/mxBase/#Installation The current version we have is not yet perfect. The next iteration will also play nice with distutils extensions. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Mar 20 2008) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] C-API status of Python 3?
On 2008-03-02 14:47, Christian Heimes wrote: Alex Martelli wrote: Yep, but please do keep the PyUnicode for str and PyString for bytes (as macros/synonnyms of PyStr and PyBytes if you want!-) to help the task of porting existing extensions... the bytearray functions should no doubt be PyBytearray, though. Yeah, we've already planed to keep PyUnicode as prefix for str type functions. It makes perfectly sense, not only from the historical point of view. But for PyString I planed to rename the prefix to PyBytes. In my opinion we are going to regret it, when we keep too many legacy names from 2.x. In order to make the migration process easier I can add a header file that provides PyString_* functions as aliases for PyBytes_* Comments? +1 Why not also make unicode() the default type constructor and only keep str() as alias to simplify porting (perhaps with a warning) ? The term string is just too overloaded with all kinds of misinterpretations. The term string just refers to a string of bytes - a variable length array so to speak. However, depending on the application space, string is used as synonym for text string just as well as data string. Removing the term string altogether would make it easier for people to understand that Py3k only has unicode (for text data) and bytes (for binary data). -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Mar 02 2008) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] C-API status of Python 3?
On 2008-03-02 20:39, Bill Janssen wrote: Why not also make unicode() the default type constructor and only keep str() as alias to simplify porting (perhaps with a warning) ? The term string is just too overloaded with all kinds of misinterpretations. The term string just refers to a string of bytes - a variable length array so to speak. However, depending on the application space, string is used as synonym for text string just as well as data string. Removing the term string altogether would make it easier for people to understand that Py3k only has unicode (for text data) and bytes (for binary data). I agree that string is very overloaded, but calling it unicode is sort of like calling integers int32 -- that is, you're talking about the implementation rather than the type. Hmm in that case, we'd have to call it ucs2 or ucs4 depending on how Python was compiled ;-) In most programming languages that aren't at the machine level (like C is), string really is a sequence of text characters, not a string of bytes, and that's probably the term that should be used for Python going forward, despite the legacy issues it involves. I'm not bound to unicode at all, just don't think using string for text data will really make people think twice often enough and then you end up having binary data in a string again - with the only difference that it's now using the Unicode type internally. My personal favorite is text for text data. Personally, I feel that string (for text) and bytes (for binary data represented as a sequence of bytes) are appropriate terms for Python. Keep unicode for a release or two as an alias for string. But isn't all this in a PEP somewhere already? -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Mar 03 2008) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] C-API status of Python 3?
On 2008-03-02 23:11, Greg Ewing wrote: M.-A. Lemburg wrote: Why not also make unicode() the default type constructor and only keep str() as alias to simplify porting (perhaps with a warning) ? -1 on making us type 7 characters instead of 3 all over the place. Oh well... how about text() ? The term string is just too overloaded with all kinds of misinterpretations. The term string just refers to a string of bytes - a variable length array so to speak. I disagree -- string has come to mean string of characters unless otherwise qualified. Using one to hold non-characters is just an aberration that was necessary in Python 2 because there wasn't much alternative. Buffer objects have been around for years and for exactly this purpose. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Mar 03 2008) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Unicode -- UTF-8 in CPython extension modules
On 2008-02-23 00:46, Colin Walters wrote: On Fri, Feb 22, 2008 at 4:23 PM, John Dennis [EMAIL PROTECTED] wrote: Python programs which use Unicode string objects for their i18n and which link to C libraries expecting UTF-8 but which have a CPython binding which only uses 's' or 's#' formats programs seem to often fail with encoding errors. One thing to be aware of is that PyGTK+ actually sets the Python Unicode object encoding to UTF-8. http://bugzilla.gnome.org/show_bug.cgi?id=132040 I mention this because PyGTK is a very popular library related to Python and Linux. So currently if you import gtk, then libraries which are using UTF-8 (as you say, the vast majority) will work with Python unicode objects unmodified. Are you suggesting that John should rely on a bug in some 3rd party extension instead of fixing the Python extension to use es# where needed ? There's a good reason why we don't allow setting the default encoding outside site.py. Trying to play tricks to change the default encoding later on will only cause problems, e.g. the cached default encoded versions of Unicode objects will then use different encodings - the one set in site.py and later the ones with the new encoding. As a result, all kind of weird things can happen. Using the Python Unicode C API really isn't all that hard and it's well documented too, so please use it instead of trying to design software based on workarounds. Thanks, -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Feb 23 2008) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] int/float freelists vs pymalloc
On 2008-02-13 08:02, Andrew MacIntyre wrote: Christian Heimes wrote: Andrew MacIntyre wrote: I tried a LIFO stack implementation (though I won't claim to have done it well), and found it slightly slower than no freelist at all. The advantage of such an approach is that the known size of the stack makes deallocating excess objects easy (and thus no need for sys.compact_free_list() ). I've tried a single linked free list myself. I used the ob_type field to daisy chain the int and float objects. Although the code was fairly short it was slightly slower than an attempt without a free list at all. pymalloc is fast. It's very hard to beat it though. I'm speculating that CPU cache effects can make these differences. The performance of the current trunk float freelist is depressing, given that the same strategy works so well for ints. I seem to recall Tim Peters paying a lot of attention to cache effects when he went over the PyMalloc code before the 2.3 release, which would contribute to its performance. A fixed size LIFO array like PyFloatObject *free_list[PyFloat_MAXFREELIST] increased the speed slightly. IMHO a value of about 80-200 floats and ints is realistic for most apps. More objects in the free lists could keep too many pymalloced areas occupied. I tested the updated patch you added to issue 2039. With the int freelist set to 500 and the float freelist set to 100, its about the same as the no-freelist version for my tests, but PyBench shows the simple float arithmetic to be about 10% better. I'm inclined to set the int LIFO a bit larger than you suggest, simply as ints are so commonly used - hence the value of 500 I used. Floats are much less common by comparison. Even an int LIFO of 500 is only going to tie up ~8kB on a 32bit box (~16kB on 64bit), which is insignificant enough that I can't see a need for a compaction routine. A 200 entry float LIFO would only account for ~4kB on 32bit (~8kB on 64bit). It is difficult to tell what good limits for free lists should be. This depends a lot on the application focus, e.g. a financial application is going to need lots of floats, while a word processor or parser will need more integers. I think the main difference between the current free list implementation and Christian's patches is that the current implementation bypasses pymalloc altogether and allocates the objects directly using the system malloc(). The objects in the free list then cannot keep artificially keep pymalloc pools alive. Furthermore, the current free list implementation works by allocating 1k chunks of memory for more than just one object whenever it finds that the free list is empty. Christian's patches and your free list removal patch, cause all allocations to be done via pymalloc. Christian's free list can also result in nearly empty pymalloc pools to stay alive due to the use of a linked list rather than an array of objects. Finally (and I don't know if you've missed that), the integer implementation uses sharing for small integers. In the current implementation all integers between -5 and 257 are only ever allocated once and then reused whenever an integer in this range is needed. The shared integers are not subject to any of the extra free list handling or pymalloc overhead. This results in a significant boost, since integers in this range are *very* common and also causes the comparison between integers and floats to become biased - floats don't have this optimization. I still think that dropping the free lists can be worthwhile, but pymalloc would need to get further optimizations to give better performance for often requested size classes (the 16 byte class on 32bit architectures, the 24 byte class on 64bit architectures). -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Feb 13 2008) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] int/float freelists vs pymalloc
On 2008-02-13 12:56, Andrew MacIntyre wrote: I'm not that interested in debating the detail of exactly how big the prospective LIFO freelists are - I just want to see the situation resolved with maximum utilisation of memory for minimum performance penalty. To that end, +1 from me for accepting your revised patch against issue 2039. In addition, unless there are other reasons to retain it, I would be suggesting that the freelist compaction infrastructure you introduced in r60567 be removed for lack of practical utility (assuming acceptance of your patch). If we're down to voting, here's my vote: +1 on removing the freelists from ints and floats, but not the small int sharing optimization +1 on focusing on improving pymalloc to handle int and float object allocations even better -1 on changing the freelist implementations to use pymalloc for allocation of the freelist members instead of malloc, since this would potentially lead to pools (and arenas) being held alive by just a few objects - in the worst case a whole arena (256kB) for just one int object (14 bytes on 32bit platforms). Eventually, all freelists should be removed, unless there's a significant performance loss. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Feb 13 2008) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] int/float freelists vs pymalloc
On 2008-02-08 08:21, Martin v. Löwis wrote: One of the hopes of having a custom allocator for Python was to be able to get rid off all free lists. For some reason that never happened. Not sure why. People were probably too busy with adding new features to the language at the time ;-) Probably not. It's more that the free lists still outperformed pymalloc. Something you could try to make PyMalloc perform better for the builtin types is to check the actual size of the allocated PyObjects and then make sure that PyMalloc uses arenas large enough to hold a good quantity of them, e.g. it's possible that the float types fall into the same arena as some other type and thus don't have enough room to use as free list. I don't think any improvements can be gained here. PyMalloc carves out pools of 4096 bytes from an arena when it runs out of blocks for a certain size class, and then keeps a linked list of pools of the same size class. So when many float objects get allocated, you'll have a lot of pools of the float type's size class. IOW, PyMalloc has always enough room. Well, yes, it doesn't run out of memory, but if pymalloc needs to allocate lots of objects of the same size, then performance degrades due to the management overhead involved for checking the free pools as well as creating new arenas as needed. To reduce this overhead, it may be a good idea to preallocate pools for common sizes and make sure they don't drop under a certain threshold. Here's a list of a few object sizes in bytes for Python 2.5 on an AMD64 machine: import mx.Tools mx.Tools.sizeof(int(0)) 24 mx.Tools.sizeof(float(0)) 24 8-bit strings are var objects: mx.Tools.sizeof(str('')) 40 mx.Tools.sizeof(str('a')) 41 Unicode objects use an external buffer: mx.Tools.sizeof(unicode('')) 48 mx.Tools.sizeof(unicode('a')) 48 Lists do as well: mx.Tools.sizeof(list()) 40 mx.Tools.sizeof(list([1,2,3])) 40 Tuples are var objects: mx.Tools.sizeof(tuple()) 24 mx.Tools.sizeof(tuple([1,2,3])) 48 Old style classes: class C: pass ... mx.Tools.sizeof(C) 64 New style classes are a lot heavier: class D(object): pass ... mx.Tools.sizeof(D) 848 mx.Tools.sizeof(type(2)) 848 As you can see, Integers and floats fall into the same pymalloc size class. What's strange in Andrew's result is that both integers and floats use the same free list technique and fall into the same pymalloc size class, yet the results are different. The only difference that's apparent is that small integers are shared, so depending on the data set used for the test, fewer calls to pymalloc or the free list are made. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Feb 08 2008) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] int/float freelists vs pymalloc
On 2008-02-08 19:28, Christian Heimes wrote: In addition to the pure performance aspect, there is the issue of memory utilisation. The current trunk code running the int test case in my original post peaks at 151MB according to top on my FreeBSD box, dropping back to about 62MB after the dict is destroyed (without a compaction). The same script running on the no-freelist build of the interpreter peaks at 119MB, with a minima of around 57MB. I wonder why the free list has such a huge impact in memory usage. Int objects are small (4 byte pointer to type, 4 byte Py_ssize_t and 4 byte value). A thousand int object should consume less than 20kB including overhead and padding. The free lists keep parts of the pymalloc pools alive. Since these are only returned to the OS if the whole pool is unused, a single object could keep 4k of memory associated with the process. I suppose that the remaining few MBs shown by the OS are not really used by the process, but simply kept associated with the process by the OS in case it quickly needs more memory. In order to be sure about the true memory usage, you'd have to force the OS to grab all available memory, e.g. by running a huge process right next to the one you're testing. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Feb 08 2008) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] int/float freelists vs pymalloc
On 2008-02-07 14:09, Andrew MacIntyre wrote: Probably in response to the same stimulus as Christian it occurred to me that the freelist approach had been adopted long before PyMalloc was enabled as standard (in 2.3), and that much of the performance gains between 2.2 and 2.3 were in fact due to PyMalloc. One of the hopes of having a custom allocator for Python was to be able to get rid off all free lists. For some reason that never happened. Not sure why. People were probably too busy with adding new features to the language at the time ;-) Something you could try to make PyMalloc perform better for the builtin types is to check the actual size of the allocated PyObjects and then make sure that PyMalloc uses arenas large enough to hold a good quantity of them, e.g. it's possible that the float types fall into the same arena as some other type and thus don't have enough room to use as free list. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Feb 08 2008) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] Limit free list of method and builtin function objects (was: [Python-checkins] r60614 - in python/trunk: Misc/NEWS Objects/classobject.c Objects/methodobject.c)
Hi Christian, could you explain how you came up with the 256 entry limit ? It appears to be rather low and somehow arbitrary. I understand that some limit is required, but since these objects get created a lot (e.g. for bound methods), setting the limit too low will significantly slow down the interpreter. BTW: What does pybench have to say to this patch ? To get an idea of how many objects are typically part of the free list, I'd suggest running an application such as Zope for a while and then check the maximum numfree value. Thanks, -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Feb 06 2008) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 On 2008-02-06 13:44, christian.heimes wrote: Author: christian.heimes Date: Wed Feb 6 13:44:34 2008 New Revision: 60614 Modified: python/trunk/Misc/NEWS python/trunk/Objects/classobject.c python/trunk/Objects/methodobject.c Log: Limit free list of method and builtin function objects to 256 entries each. Modified: python/trunk/Misc/NEWS == --- python/trunk/Misc/NEWS(original) +++ python/trunk/Misc/NEWSWed Feb 6 13:44:34 2008 @@ -12,6 +12,9 @@ Core and builtins - +- Limit free list of method and builtin function objects to 256 entries + each. + - Patch #1953: Added ``sys._compact_freelists()`` and the C API functions ``PyInt_CompactFreeList`` and ``PyFloat_CompactFreeList`` to compact the internal free lists of pre-allocted ints and floats. Modified: python/trunk/Objects/classobject.c == --- python/trunk/Objects/classobject.c(original) +++ python/trunk/Objects/classobject.cWed Feb 6 13:44:34 2008 @@ -4,10 +4,16 @@ #include Python.h #include structmember.h +/* Free list for method objects to safe malloc/free overhead + * The im_self element is used to chain the elements. + */ +static PyMethodObject *free_list; +static int numfree = 0; +#define MAXFREELIST 256 + #define TP_DESCR_GET(t) \ (PyType_HasFeature(t, Py_TPFLAGS_HAVE_CLASS) ? (t)-tp_descr_get : NULL) - /* Forward */ static PyObject *class_lookup(PyClassObject *, PyObject *, PyClassObject **); @@ -2193,8 +2199,6 @@ In case (b), im_self is NULL */ -static PyMethodObject *free_list; - PyObject * PyMethod_New(PyObject *func, PyObject *self, PyObject *klass) { @@ -2207,6 +2211,7 @@ if (im != NULL) { free_list = (PyMethodObject *)(im-im_self); PyObject_INIT(im, PyMethod_Type); + numfree--; } else { im = PyObject_GC_New(PyMethodObject, PyMethod_Type); @@ -2332,8 +2337,14 @@ Py_DECREF(im-im_func); Py_XDECREF(im-im_self); Py_XDECREF(im-im_class); - im-im_self = (PyObject *)free_list; - free_list = im; + if (numfree MAXFREELIST) { + im-im_self = (PyObject *)free_list; + free_list = im; + numfree++; + } + else { + PyObject_GC_Del(im); + } } static int @@ -2620,5 +2631,7 @@ PyMethodObject *im = free_list; free_list = (PyMethodObject *)(im-im_self); PyObject_GC_Del(im); + numfree--; } + assert(numfree == 0); } Modified: python/trunk/Objects/methodobject.c == --- python/trunk/Objects/methodobject.c (original) +++ python/trunk/Objects/methodobject.c Wed Feb 6 13:44:34 2008 @@ -4,7 +4,12 @@ #include Python.h #include structmember.h +/* Free list for method objects to safe malloc/free overhead + * The m_self element is used to chain the objects. + */ static PyCFunctionObject *free_list = NULL; +static int numfree = 0; +#define MAXFREELIST 256 PyObject * PyCFunction_NewEx(PyMethodDef *ml, PyObject *self, PyObject *module) @@ -14,6 +19,7 @@ if (op != NULL) { free_list = (PyCFunctionObject *)(op-m_self); PyObject_INIT(op, PyCFunction_Type); + numfree--; } else { op = PyObject_GC_New(PyCFunctionObject, PyCFunction_Type); @@ -125,8 +131,14 @@ _PyObject_GC_UNTRACK(m); Py_XDECREF(m-m_self);
Re: [Python-Dev] trunc()
On 2008-01-27 08:14, Raymond Hettinger wrote: . You may disagree, but that doesn't make it nuts. Too many thoughts compressed into one adjective ;-) Deprecating int(float)--int may not be nuts, but it is disruptive. Having both trunc() and int() in Py2.6 may not be nuts, but it is duplicative and confusing. The original impetus for facilitating a new Real type being able to trunc() into a new Integral type may not be nuts, but the use case seems far fetched (we're never had a feature request for it -- the notion was born entirely out of numeric tower considerations). The idea that programmers are confused by int(3.7)--3 may not be nuts, but it doesn't match any experience I've had with any programmer, ever. The idea that trunc() is beneficial may not be nuts, but it is certainly questionable. In short, the idea may not be nuts, but I think it is legitimate to suggest that it is unnecessary and that it will do more harm than good. All this reminds me a lot of discussions we've had when we needed a new way to spell out string.join(). In the end, we ended up adding a method to strings (thanks to Tim Peters, IIRC) instead of adding a builtin join(). Since all of the suggested builtins are only meant to work on floats, why not simply add methods for them to the float object ?! E.g. x = 3.141 print x.trunc(), x.floor(), x.ceil() etc. This approach also makes it possible to write types or classes that expose the same API without having to resort to new special methods (we have too many of those already). Please consider that type constructors have a different scope than helper functions. Helper functions should only be made builtins if they are really really useful and often needed. If they don't meet this criteria, they are better off in a separate module. I don't see any of the suggested helper functions meeting this criteria and we already have math.floor() and math.ceil(). -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jan 28 2008) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] trunc()
On 2008-01-25 21:26, Steve Holden wrote: Antoine Pitrou wrote: Raymond Hettinger python at rcn.com writes: Go ask a dozen people if they are surprised that int(3.7) returns 3. No one will be surprised (even folks who just use Excel or VB). It is foolhardy to be a purist and rage against the existing art: Well, for what it's worth, here are MySQL's own two cents: mysql create table t (a int); Query OK, 0 rows affected (0.00 sec) mysql insert t (a) values (1.4), (1.6), (-1.6), (-1.4); Query OK, 4 rows affected (0.00 sec) Records: 4 Duplicates: 0 Warnings: 0 mysql select * from t; +--+ | a| +--+ |1 | |2 | | -2 | | -1 | +--+ 4 rows in set (0.00 sec) Two points. Firstly, regarding MySQL as authoritative from a standards point of view is bound to lead to trouble, since they have always played fast and loose with the standard for reasons (I suspect) of implementation convenience. Second, that example isn't making use of the INT() function. I was going to show you result of taking the INT() of a float column containing your test values. That was when I found out that MySQL (5.0.41, anyway) doesn't implement the INT() function. What was I saying about standards? FWIW, here's what IBM has to say to this: http://publib.boulder.ibm.com/infocenter/db2luw/v9/index.jsp?topic=/com.ibm.db2.udb.admin.doc/doc/r814.htm If the argument is a numeric-expression, the result is the same number that would occur if the argument were assigned to a large integer column or variable. If the whole part of the argument is not within the range of integers, an error occurs. The decimal part of the argument is truncated if present. AFAIK, the INTEGER() function is not part of the SQL standard, at least not of SQL92: http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt The way to convert a value to an integer is by casting it to one, e.g. CAST (X AS INTEGER). The INT() function is basically a short-cut for this. Regards, -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jan 25 2008) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP: per user site-packages directory
I don't really understand what all this has to do with per user site-packages. Note that the motivation for having per user site-packages was to: * address a common request by Python extension package users, * get rid off the hackery done by setuptools in order to provide this. As such the PEP can also be seen as an effort to enable code cleanup *before* adding e.g. pkg_resources to the stdlib. Cheers, -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jan 22 2008) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 On 2008-01-21 16:06, Nick Coghlan wrote: Steve Holden wrote: Christian Heimes wrote: Steve Holden wrote: Maybe once we get easy_install as a part of the core (so there's no need to find and run ez_setup.py to start with) things will start to improve. This is an issue the whole developer community needs to take seriously if we are interested in increasing take-up. setuptools and easy_install won't be included in Python 2.6 and 3.0: http://www.python.org/dev/peps/pep-0365/ Yes, and yet another release (two releases) will go out without easy access to the functionality in Pypi. PEP 365 is a good start, but Pypi loses much of its point until new Python users get access to it out of the box. I also appreciate that resource limitations are standing in the way of setuptools' inclusion (is there something I can do about that?) Just to hammer the point home, however ... Have another look at the rationale given in PEP 365 - it isn't the resourcing to do the work that's a problem, but the relatively slow release cycle of the core. By including pkg_resources in the core (with the addition of access to pure Python modules and packages on PyPI), we would get a simple, stable base for Python packaging to work from, and put users a single standard command away from the more advanced (but also more volatile) features of easy_install and friends. Cheers, Nick. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] #! magic
On 2008-01-20 19:30, Christian Heimes wrote: Yet another python executable could solve the issue, named pythons as python secure. /* gcc -DNDEBUG -g -O2 -Wall -Wstrict-prototypes -IInclude -I. -pthread -Xlinker -lpthread -ldl -lutil -lm -export-dynamic -o pythons2.6 pythons.c libpython2.6.a */ #include Python.h int main(int argc, char **argv) { /* disable some possible harmful features */ Py_IgnoreEnvironmentFlag++; Py_NoUserSiteDirectory++; Py_InteractiveFlag -= INT_MAX; Py_InspectFlag -= INT_MAX; return Py_Main(argc, argv); } $ ./pythons2.6 Python 2.6a0 (:59956M, Jan 14 2008, 22:09:17) [GCC 4.2.1 (Ubuntu 4.2.1-5ubuntu4)] on linux2 Type help, copyright, credits or license for more information. import sys sys.flags sys.flags(debug=0, py3k_warning=0, division_warning=0, division_new=0, inspect=-2147483647, interactive=-2147483647, optimize=0, dont_write_bytecode=0, no_user_site=1, no_site=0, ingnore_environment=1, Is this a copypaste error or a typo in the code ^ ? tabcheck=0, verbose=0, unicode=0) To make this even more secure, you'd have to package this up together with a copy of the stdlib, but like mxCGIPython does (or did... I have to revive that project at some point :-): http://www.egenix.com/www2002/python/mxCGIPython.html -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jan 22 2008) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP: per user site-packages directory
On 2008-01-14 22:23, Christian Heimes wrote: The PEP is now available at http://www.python.org/dev/peps/pep-0370/. The reference implementation is in svn, too: svn+ssh://[EMAIL PROTECTED]/sandbox/trunk/pep370 Thanks for the effort, Christian. Much appreciated. Regarding the recent ~/bin vs. ~/.local/bin discussion: I usually maintain my ~/bin directories by hand and wouldn't want any application to install things in there automatically (and so far I haven't been using any application that does), so I'd be in favor of the ~/.local/bin dir. Note that users typically don't know which scripts are made available by a Python application and it's not always clear what functionality they provide, whether they can be trusted, include bugs, need to be run with extra care, etc, so IMHO making it a little harder to run them by accident is well warranted. Thanks again, -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jan 14 2008) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Backporting PEP 3101 to 2.6
On 2008-01-10 14:31, Eric Smith wrote: (I'm posting to python-dev, because this isn't strictly 3.0 related. Hopefully most people read it in addition to python-3000). I'm working on backporting the changes I made for PEP 3101 (Advanced String Formatting) to the trunk, in order to meet the pre-PyCon release date for 2.6a1. I have a few questions about how I should handle str/unicode. 3.0 was pretty easy, because everything was unicode. Since this is a new feature, why bother with strings at all (even in 2.6) ? Use Unicode throughout and be done with it. 1: How should the builtin format() work? It takes 2 parameters, an object o and a string s, and returns o.__format__(s). If s is None, it returns o.__format__(empty_string). In 3.0, the empty string is of course unicode. For 2.6, should I use u'' or ''? 2: In 3.0, object.__format__() is essentially this: class object: def __format__(self, format_spec): return format(str(self), format_spec) In 2.6, I assume it should be the equivalent of: class object: def __format__(self, format_spec): if isinstance(format_spec, str): return format(str(self), format_spec) elif isinstance(format_spec, unicode): return format(unicode(self), format_spec) else: error Does that seem right? 3: Every overridden __format__() method is going to have to check for string or unicode, just like object.__format() does, and return either a string or unicode object, appropriately. I don't see any way around this, but I'd like to hear any thoughts. I guess there aren't all that many __format__ methods that will be implemented, so this might not be a big burden. I'll of course implement the built in ones. Thanks in advance for any insights. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jan 10 2008) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] pkgutil, pkg_resource and Python 3.0 name space packages
On 2008-01-07 14:57, Fred Drake wrote: On Jan 7, 2008, at 7:48 AM, M.-A. Lemburg wrote: Next, we add a per-user site-packages directory to the standard sys.path, and then we could get rid of most of the setuptools import and sys.path hackery, making it a lot cleaner. PYTHONPATH already provides this functionality. I see no need to duplicate that. Agreed, but one of the main arguments for all the .pth file hackery in setuptools is that having to change PYTHONPATH in order to enable user installations of packages is too hard for the typical user. We could easily resolve that issue, if we add a per-user site-packages dir to sys.path in site.py (this is already done for Macs). -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jan 07 2008) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] pkgutil, pkg_resource and Python 3.0 name space packages
On 2008-01-07 17:24, Barry Warsaw wrote: On Jan 7, 2008, at 10:12 AM, Guido van Rossum wrote: On Jan 7, 2008 6:32 AM, Barry Warsaw [EMAIL PROTECTED] wrote: On Jan 7, 2008, at 9:01 AM, M.-A. Lemburg wrote: We could easily resolve that issue, if we add a per-user site-packages dir to sys.path in site.py (this is already done for Macs). +1. I've advocated that for years. I'm not sure what this buys given that you can do this using PYTHONPATH anyway, but because of that I also can't be against it. +0 from me. Patches for 2.6 gratefully accepted. I think it's PEP-worthy too, just so that the semantics get nailed down. Here's a strawman proto-quasi-pre-PEP. Python automatically adds ~/.python/site-packages to sys.path; this is added /before/ the system site-packages file. An open question is whether it needs to go at the front of the list. It should definitely be searched before the system site-packages. Python treats ~/.python/site-packages the same as the system site-packages, w.r.t. .pth files, etc. Open question: should we add yet another environment variable to control this? It's pretty typical for apps to expose such a thing so that the base directory (e.g. ~/.python) can be moved. I'd suggest to make the ~/.python part configurable by an env var, e.g. PYTHONRESOURCES. Perhaps we could use that directory for other Python-related resources as well, e.g. an optional sys.path lookup cache (pickled dictionary of known package/module file locations to reduces Python startup time). I think that's all that's needed. It would make playing with easy_install/setuptools nicer to have this. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jan 07 2008) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Memory benchmarking?
On 2007-11-29 11:52, Titus Brown wrote: Hi all, is there a good, or standard memory benchmarking system for Python? pybench doesn't return significantly different results when Python 2.6 is compiled with pymalloc and without pymalloc. Thinking on it, I'm not too surprised -- pybench probably benchmarks a lot of stuff -- but some guidance on how/whether to benchmark different memory allocation schemes would be welcome. pybench focuses on runtime performance, not memory usage. It's way of creating and deleting objects is also highly non-standard when compared to typical use of Python in real life applications. It's also rather difficult to benchmark memory allocation, since most implementations work with some sort of pre-allocation, buffer pools or free lists. If you want to use a similar approach as pybench does, ie. benchmark small parts of the interpreter instead of generating some grand total, then you'd probably have to do this by spawning a separate process per test. refs: http://code.google.com/p/google-highly-open-participation-psf/issues/detail?id=105colspec=ID%20Status%20Summary http://evanjones.ca/memoryallocator/ http://www.advogato.org/person/wingo/diary/225.html -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Nov 29 2007) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Build Notes for building trunk with Visual Studio 2008 Express Edition
On 2007-11-23 23:12, Paul Moore wrote: On 23/11/2007, Christian Heimes [EMAIL PROTECTED] wrote: bsddb is automatically build by a build step. But you have to convert the project files in build_win32 to VS 2008 first. Simply open the solution file and let VS convert the projects. VS 2008 Express doesn't have a devenv command, so the pre-link step doesn't work. You need to open the bsddb project file, and build db_static by hand. For a debug Python, you need the Debug configuration, for a release Python you need the Release configuration. Beware - the default config is Debug_ASCII which is not checked by the pre-link step. So, from a checkout of Python, plus the various svn externals: - dowload nasm, install it somewhere on your PATH, and copy nasm.exe to nasmw.exe (Why did you use nasmw.exe rather than nasm.exe? Is there a difference in the version you have?) The OpenSSL build process still uses the old nasmw.exe name (the build instructions there are for the old NASM version, but it also works with the latest NASM release). The NASM project has recently changed the name of the executable to nasm.exe. - Open the bsddb solution file, and build debug and release versions of db_static - Open the Python pcbuild solution file, and build the solution. You'll get a total of 2 failures and 18 successes. Of the failures, one (_sqlite3) is not actually fatal (the pre-link step fails, and that only the first time), and the module is actually built correctly. The other is _tkinter, which isn't sorted out yet. You can then run the tests with rt.bat. If you have an openssl.exe on your path, test_socket_ssl may hang. Otherwise, everything should pass, apart from test_tcl. (Actually, there's a failure in test_doctest right now, seems to have come in with r59137, but I don't have time to diagnose right now). This is the case for both trunk and py3k (ignoring genuine test failures). -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Nov 24 2007) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Build Notes for building trunk with Visual Studio 2008 Express Edition
On 2007-11-23 16:59, Christian Heimes wrote: Paul Moore wrote: _ssl Christian has been making changes to allow this to build without Perl, so I gave it a try. I used openssl 0.9.8g, which I extracted to the build directory (I noticed afterwards that this is the same version as in Python svn, so I could have used the svn external!) I needed to download nasm (nasm.sf.net) version 2.00rc1, and rename nasm.exe to nasmw.exe and put it on my PATH. Build succeeded, no issues. You still need Perl if you are using an official download of openssl. I've added the pre-build assembly and makefiles in the svn external at svn.python.org Why not include the prebuilt libraries of all external libs in SVN as well ? BTW: Are you including the patented algorithms in the standard OpenSSL build or excluding them ? The patented ones are RC5, IDEA and MDC2: http://svn.python.org/view/external/openssl-0.9.8g/README Here's a previous discussion: http://mail.python.org/pipermail/python-dev/2006-August/068055.html Here's what MediaCrypt has to say about requiring a license for IDEA: http://www.mediacrypt.com/_contents/20_support/204010_faq_bus.asp Note that in the case of IDEA, any commercial use will require getting a license to the patented algorithm first (costs start at EUR 15 for a single use license). I'd opt for not including these algorithms, as it's just too easy for the user to overlook this license requirement. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Nov 23 2007) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] XML codec?
On 2007-11-11 23:22, Martin v. Löwis wrote: First, XML-RPC is not the only mechanism using XML over a network connection. Second, you don't want to do this if you're dealing with several 100 MB of data just because you want to figure out the encoding. That's my original claim/question: what SPECIFIC application do you have in mind that transfers XML over a network and where you would want to have such a stream codec? XML-based web services used for business integration, e.g. based on ebXML. A common use case from our everyday consulting business is e.g. passing market and trading data to portfolio pricing web services. I still don't see the need for this feature from this example. First, in ebXML messaging, the message are typically *not* large (i.e. much smaller than 100 MB). Furthermore, the typical processing of such a message would be to pass it directly to the XML parser, no need for the functionality under discussion. I don't see the point in continuing this discussion. If you think you know better, that's fine. Just please don't generalize this to everyone else working with Python and XML. Right. However, I' will remain opposed to adding this to the standard library until I see why one would absolutely need to have that. Not every piece of code that is useful in some application should be added to the standard library. Agreed, but the application space of web services is large enough to warrant this. If that was the case, wouldn't the existing Python web service libraries already include such a functionality? No. To finalize this: We have a -1 from Martin and a +1 from Walter, Guido and myself. Pretty clear vote if you ask me. I'd say we end the discussion here and move on. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Nov 12 2007) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] XML codec?
On 2007-11-11 14:51, Martin v. Löwis wrote: A non-seekable stream is not all that uncommon in network processing. Right. But what is the relationship to XML encoding autodetection? It pops up whenever you need to detect the encoding of the incoming XML data on the network connection, e.g. in XML RPC or data upload mechanisms. No, it doesn't. For XML-RPC, you pass the XML payload of the HTTP request to the XML parser, and it deals with the encoding. First, XML-RPC is not the only mechanism using XML over a network connection. Second, you don't want to do this if you're dealing with several 100 MB of data just because you want to figure out the encoding. It is also not always feasible to load all data into memory, so some form of buffering must be used. Again, I don't see the use case. For XML-RPC, it's very feasible and standard procedure to have the entire document in memory (in a processed form). You may not see the use case, but that doesn't really mean anything if the use cases exist in real life applications, right ?! This approach is also needed if you want to stack stream codecs (not sure whether this is still possible in Py3, but that's how I designed them for Py2). The design of the Py2 codecs is fairly flawed, unfortunately. Fortunately, this sounds like a fairly flawed argument to me ;-) -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Nov 11 2007) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] XML codec?
On 2007-11-11 18:56, Martin v. Löwis wrote: First, XML-RPC is not the only mechanism using XML over a network connection. Second, you don't want to do this if you're dealing with several 100 MB of data just because you want to figure out the encoding. That's my original claim/question: what SPECIFIC application do you have in mind that transfers XML over a network and where you would want to have such a stream codec? XML-based web services used for business integration, e.g. based on ebXML. A common use case from our everyday consulting business is e.g. passing market and trading data to portfolio pricing web services. If I have 100MB of XML in a file, using the detection API, I do f = open(filename) s = f.read(100) while True: coding = xml.utils.detect_encoding(s) if coding is not undetermined: break s += f.read(100) f.close() Having the loop here is paranoia: in my application, I might be able to know that 100 bytes are sufficient to determine the encoding always. Doing the detection with files is easy, but that was never questioned. Again, I don't see the use case. For XML-RPC, it's very feasible and standard procedure to have the entire document in memory (in a processed form). You may not see the use case, but that doesn't really mean anything if the use cases exist in real life applications, right ?! Right. However, I' will remain opposed to adding this to the standard library until I see why one would absolutely need to have that. Not every piece of code that is useful in some application should be added to the standard library. Agreed, but the application space of web services is large enough to warrant this. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Nov 11 2007) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] XML codec?
On 2007-11-09 14:10, Walter Dörwald wrote: Martin v. Löwis wrote: Yes, an XML parser should be able to use UTF-8, UTF-16, UTF-32, etc codecs to do the encoding. There's no need to create a magical mystery codec to pick out which though. So the code is good, if it is inside an XML parser, and it's bad if it is inside a codec? Exactly so. This functionality just *isn't* a codec - there is no encoding. Instead, it is an algorithm for *detecting* an encoding. And what do you do once you've detected the encoding? You decode the input, so why not combine both into an XML decoder? FWIW: I'm +1 on adding such a codec. It makes working with XML data a lot easier: you simply don't have to bother with the encoding of the XML data anymore and can just let the codec figure out the details. The XML parser can then work directly on the Unicode data. Whether it needs to be in C or not is another question (I would have done this in Python since performance is not really an issue), but since the code is already written, why not use it ? -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Nov 09 2007) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] XML codec?
Martin v. Löwis wrote: It makes working with XML data a lot easier: you simply don't have to bother with the encoding of the XML data anymore and can just let the codec figure out the details. The XML parser can then work directly on the Unicode data. Having the functionality indeed makes things easier. However, I don't find s.decode(xml.detect_encoding(s)) particularly more difficult than s.decode(xml-auto-detection) Not really, but the codec has more control over what happens to the stream, ie. it's easier to implement look-ahead in the codec than to do the detection and then try to push the bytes back onto the stream (which may or may not be possible depending on the nature of the stream). Whether it needs to be in C or not is another question (I would have done this in Python since performance is not really an issue), but since the code is already written, why not use it ? It's a maintenance issue. I'm sure Walter will do a great job in maintaining the code :-) Regards, -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Nov 09 2007) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] XML codec?
Martin v. Löwis wrote: Not really, but the codec has more control over what happens to the stream, ie. it's easier to implement look-ahead in the codec than to do the detection and then try to push the bytes back onto the stream (which may or may not be possible depending on the nature of the stream). YAGNI. A non-seekable stream is not all that uncommon in network processing. I usually end up either reading the complete data into memory or doing the needed buffering by hand. Regards, -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Nov 10 2007) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Does Python need a file locking module (slightly higher level)?
On 2007-10-26 05:41, Barry Warsaw wrote: On Oct 22, 2007, at 11:30 PM, [EMAIL PROTECTED] wrote: It's not clear that any of these implementations is going to be perfect. Maybe none ever will be. I would agree with this. You write a program and know you need to implement some kind of resource locking, so you start looking for some OTS solution. But then you realize that your application needs somewhat different semantics or needs to work in platforms or environments that the OTS code doesn't handle. Just a few days ago, I was looking at some locking code that needed to work across multiple invocations of a script on multiple machines, and the only thing they shared was a PostgreSQL connection, so we ended up wanting to use its advisory locks. In his reply Jean-Paul made this comment: It might be nice to have something like that in the standard library, but it's very simple once you know what to do. I'm not so sure about the very simple part, especially if you aren't familiar with all the ins and outs of the different platforms. I'd totally agree with this. Locking seems simple, but it's got some really tricky aspects that need to be coded just right or you'll be in a world of hurt. Mailman's LockFile.py (which you're right is *nix only) is stable now, but has had some really subtle bugs in the past. You might want to take a look at the FileLock.py module that's part of the eGenix mx Base distribution (mx.Misc.FileLock). It works reliably on Unix and Windows, doesn't rely on fcntl and has been in use for years. The only downside is that it's application specific, ie. only applications using the module for locking will detect the locks - but then again: this is exactly the problem you typically want to solve. The fact that the first three bits of code I was referred to were implemented by three significant Python tools/platforms and that all are different in some significant ways suggests that there is some both an underlying need for a file locking mechanism but with a lack of consensus about the best way to implement the mother-of-all-file-locking schemes for Python. Maybe the best place for this is in the distribution. PEP? I don't think any one solution will work for everybody. I'm not even sure we can define a common API a la the DBAPI, but if something were to make it into the standard distribution, that's the direction I'd go in. Then we can provide various implementations that support the LockingAPI under various environments, constraints, and platforms. If we wanted to distribute them in the stdlib, we could put them all in a package and let the user decide which features they need. I'm still planning on de-Mailman-ifying LockFile.py sometime soon. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 26 2007) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Unicode database
Nick Maclaren wrote: Ah, the makefile. I don't think you use it create the Unicode database. It's only good for generating the codecs (Lib/encodings) Yes, but it DOES attempt to download the mappings, and is the ONLY script which attempts to do so. Of course it does. The Tools/unicode/Makefile is meant to simplify recreating the codecs from the (possibly updated) mapping on the Unicode site. If it doesn't work for you, that may well be possible, since I wrote the Makefile and the other related stuff in that directory to help me with updating the codecs from the mappings. It's only checked in for convenience. beelzebub$find Python-2.5.1 -type f | wc 34583460 135981 beelzebub$find Python-2.5.1 -type f | xargs grep ftp.unicode.org Python-2.5.1/Doc/lib/libunicodedata.tex:4.1.0 which is publicly available from \url{ftp://ftp.unicode.org/}. grep: Python-2.5.1/Mac/Icons/Disk: No such file or directory grep: Image.icns: No such file or directory grep: Python-2.5.1/Mac/Icons/Python: No such file or directory grep: Folder.icns: No such file or directory Python-2.5.1/Misc/NEWS: at ftp.unicode.org and contain a few updates (e.g. the Mac OS Python-2.5.1/Tools/unicode/Makefile:# files available at ftp://ftp.unicode.org/ Python-2.5.1/Tools/unicode/Makefile:ncftpget -R ftp.unicode.org . Public/MAPPINGS Python-2.5.1/Tools/unicode/gencodec.py:site (ftp://ftp.unicode.org/Public/MAPPINGS/) and creates Python codec Python-2.5.1/Tools/unicode/python-mappings/TIS-620.TXT:# ftp://ftp.unicode.org/Public/MAPPINGS/ISO8859/8859-11.TXT the Python-2.5.1/Tools/unicode/python-mappings/TIS-620.TXT:# ftp://ftp.unicode.org/Public/MAPPINGS/ISO8859/8859-11.TXT Python-2.5.1/Tools/unicode/python-mappings/KOI8-U.TXT:# ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/MISC/KOI8-R.TXT Python-2.5.1/Tools/unicode/python-mappings/CP1140.TXT:# ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/EBCDIC/CP037.TXT Python-2.5.1/Modules/unicodedata.c:4.1.0 which is publically available from ftp://ftp.unicode.org/.\n AFAICT, the mappings are still where they always were: at the location given in the Makefile. (e.g. ftp://ftp.unicode.org/Public/MAPPINGS/ISO8859/8859-15.TXT ) Then you DEFINITELY are using a non-standard set of files. That above was from the source of Python 2.5.1 that I have just downloaded. No idea where you get that impression from, but then I'm not really sure what you're after anyway ;-) -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Aug 09 2007) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] [Python-3000] Python 3000 Status Update (Long!)
On 2007-06-19 14:40, Walter Dörwald wrote: Georg Brandl wrote: A minuscule nit: the rot13 codec has no library equivalent, so it won't be supported anymore :) Given that there are valid use cases for bytes-to-bytes translations, and a common API for them would be nice, does it make sense to have an additional category of codec that is invoked via specific recoding methods on bytes objects? For example: encoded = data.encode_bytes('bz2') decoded = encoded.decode_bytes('bz2') assert data == decoded This is exactly what I proposed a while before under the name bytes.transform(). IMO it would make a common use pattern much more convenient and should be given thought. If a PEP is called for, I'd be happy to at least co-author it. Codecs are a major exception to Guido's law: Never have a parameter whose value switches between completely unrelated algorithms. I don't see much of a problem with that. Parameters are per-se intended to change the behavior of a function or method. Note that you are referring to the .encode() and .decode() methods - these are just easy to use interfaces to the codecs registered in the system. The codec design allows for different input and output types as it doesn't impose restrictions on these. Codecs are more general in that respect: they don't just deal with Unicode encodings, it's a more general approach that also works with other kinds of data types. The access methods, OTOH, can impose restrictions and probably should to restrict the return types to a predicable set. Why don't we put all string transformation functions into a common module (the string module might be a good place): import string string.rot13('abc') I think the string module will have to go away. It doesn't really separate between text and bytes data. Adding more confusion will not really help with making this distinction clear, either, I'm afraid. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jun 19 2007) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ 2007-07-09: EuroPython 2007, Vilnius, Lithuania19 days to go Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Adventures with x64, VS7 and VS8 on Windows
Hi Mark, +1 from me. I think this is simply a bug introduced with the UCS4 patches in Python 2.2. unicodeobject.h already has this code: #ifndef PY_UNICODE_TYPE /* Windows has a usable wchar_t type (unless we're using UCS-4) */ # if defined(MS_WIN32) Py_UNICODE_SIZE == 2 # define HAVE_USABLE_WCHAR_T # define PY_UNICODE_TYPE wchar_t # endif # if defined(Py_UNICODE_WIDE) # define PY_UNICODE_TYPE Py_UCS4 # endif #endif But for some reason, pyconfig.h defines: /* Define as the integral type used for Unicode representation. */ #define PY_UNICODE_TYPE unsigned short /* Define as the size of the unicode type. */ #define Py_UNICODE_SIZE SIZEOF_SHORT /* Define if you have a useable wchar_t type defined in wchar.h; useable means wchar_t must be 16-bit unsigned type. (see Include/unicodeobject.h). */ #if Py_UNICODE_SIZE == 2 #define HAVE_USABLE_WCHAR_T #endif disabling the default settings in the unicodeobject.h. Yes, that does appear strange. The following patch works for me, keeps Python building and appears to solve my problem. Any objections? Looks fine to me. Mark Index: pyconfig.h === --- pyconfig.h (revision 55487) +++ pyconfig.h (working copy) @@ -491,22 +491,13 @@ /* Define if you want to have a Unicode type. */ #define Py_USING_UNICODE -/* Define as the integral type used for Unicode representation. */ -#define PY_UNICODE_TYPE unsigned short - /* Define as the size of the unicode type. */ -#define Py_UNICODE_SIZE SIZEOF_SHORT +/* This is enough for unicodeobject.h to do the right thing on Windows. */ +#define Py_UNICODE_SIZE 2 -/* Define if you have a useable wchar_t type defined in wchar.h; useable - means wchar_t must be 16-bit unsigned type. (see - Include/unicodeobject.h). */ -#if Py_UNICODE_SIZE == 2 -#define HAVE_USABLE_WCHAR_T - /* Define to indicate that the Python Unicode representation can be passed as-is to Win32 Wide API. */ #define Py_WIN_WIDE_FILENAMES -#endif /* Use Python's own small-block memory-allocator. */ #define WITH_PYMALLOC 1 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/mal%40egenix.com -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 22 2007) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Adventures with x64, VS7 and VS8 on Windows
On 2007-05-21 12:30, Kristján Valur Jónsson wrote: [Py_UNICODE being #defined as unsigned short on Windows] I'd rather make it a platform-specific definition (for platform=Windows API). Correct me if I'm wrong, but isn't wchar_t also available in VS 2003 (and even in VC6?). And doesn't it have the right definition in all these compilers? So +1 for setting Py_UNICODE to wchar_t on Windows. Yes. Btw, in previous visual studio versions, wchar_t was not treated as a builtin type by default, but rather as synonymous with unsighed short. Now the default is that it is, and this causes some semantic differences and incompatibilities of the type seen. +1 from me. If think this is simply a bug introduced with the UCS4 patches in Python 2.2. unicodeobject.h already has this code: #ifndef PY_UNICODE_TYPE /* Windows has a usable wchar_t type (unless we're using UCS-4) */ # if defined(MS_WIN32) Py_UNICODE_SIZE == 2 # define HAVE_USABLE_WCHAR_T # define PY_UNICODE_TYPE wchar_t # endif # if defined(Py_UNICODE_WIDE) # define PY_UNICODE_TYPE Py_UCS4 # endif #endif But for some reason, pyconfig.h defines: /* Define as the integral type used for Unicode representation. */ #define PY_UNICODE_TYPE unsigned short /* Define as the size of the unicode type. */ #define Py_UNICODE_SIZE SIZEOF_SHORT /* Define if you have a useable wchar_t type defined in wchar.h; useable means wchar_t must be 16-bit unsigned type. (see Include/unicodeobject.h). */ #if Py_UNICODE_SIZE == 2 #define HAVE_USABLE_WCHAR_T #endif disabling the default settings in the unicodeobject.h. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 21 2007) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 0365: Adding the pkg_resources module
On 2007-05-21 00:07, Talin wrote: Phillip J. Eby wrote: I wanted to get this in before the Py3K PEP deadline, since this is a Python 2.6 PEP that would presumably impact 3.x as well. Feedback welcome. PEP: 365 Title: Adding the pkg_resources module I'm really surprised that there hasn't been more comment on this. True both ways, I guess: I'm still waiting for a reply to my comments. I'd also like to see more discussion about adding e.g.: * support for user packages (ie. having site.py add a well-defined user home directory based Python path entry to sys.path, e.g. ~/.python/user-packages, much like what MacPython already does now) * support for having the import mechanism play nice with namespace packages (ie. packages that may live in different places on the disk, but appear to be in the same Python package as seen by the import mechanism) I think those two features would go a long way in reducing the number of hacks setuptools currently applies to get this functionality working with code in .pth files, monkey-patching site.py, etc. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 21 2007) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 0365: Adding the pkg_resources module
On 2007-05-21 16:05, Phillip J. Eby wrote: At 01:43 PM 5/21/2007 +0200, M.-A. Lemburg wrote: On 2007-05-21 00:07, Talin wrote: Phillip J. Eby wrote: I wanted to get this in before the Py3K PEP deadline, since this is a Python 2.6 PEP that would presumably impact 3.x as well. Feedback welcome. PEP: 365 Title: Adding the pkg_resources module I'm really surprised that there hasn't been more comment on this. True both ways, I guess: I'm still waiting for a reply to my comments. What comments are you talking about? I must've missed them. I've attached the email. Please see below. I'd also like to see more discussion about adding e.g.: * support for user packages (ie. having site.py add a well-defined user home directory based Python path entry to sys.path, e.g. ~/.python/user-packages, much like what MacPython already does now) * support for having the import mechanism play nice with namespace packages (ie. packages that may live in different places on the disk, but appear to be in the same Python package as seen by the import mechanism) I think those two features would go a long way in reducing the number of hacks setuptools currently applies to get this functionality working with code in .pth files, monkey-patching site.py, etc. These items aren't directly related to the PEP, however. Right. I wasn't referring to this PEP. I think we should have two more PEPs covering the above points, since they offer benefits for all users, not just setuptools users. pkg_resources doesn't monkeypatch anything or touch any .pth files. It only changes sys.path at runtime if you explicitly ask it to locate and activate packages for you. As for namespace packages, pkg_resources provides a more PEP 302-compatible alternative to pkgutil.extend_path(). pkgutil doesn't support anything but existing filesystem directories, but the pkg_resources version supports zipfiles and has hooks to allow namespace package support to be registered for any PEP 302 importer. See: http://peak.telecommunity.com/DevCenter/PkgResources#supporting-custom-importers (specifically, the register_namespace_handler() function.) Looking at the code it appears as if you've already formalized an implementation for this. However, since this is not egg-specific it should probably be moved to pkgutil and get a separate PEP with detailed documentation (the link you provided doesn't really explain the concepts, reading the code helped a bit). What I don't understand about your approach is why importers would have to register with the namespace implementation. This doesn't seem necessary, since the package __path__ attribute already provides all functionality needed for redirecting lookups to different paths. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 21 2007) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ---BeginMessage--- On 2007-05-01 02:29, Phillip J. Eby wrote: I wanted to get this in before the Py3K PEP deadline, since this is a Python 2.6 PEP that would presumably impact 3.x as well. Feedback welcome. Could you add a section that explains the side effects of importing pkg_resources ? The documentation of the module doesn't mention any, but the code suggests that you are installing (some form of) import hooks. Some other comments: * Wouldn't it be better to factor out all the meta-data access code that's not related to eggs into pkgutil ?! * How about then renaming the remaining module to egglib ?! * The module needs some reorganization: imports, globals and constants at the top, maybe a few comments delimiting the various sections, * The get_*_platform() should probably use the platform module which is a lot more flexible than distutils' get_platform() (which should probably use the platform module as well in the long run) Thanks, -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 04 2007) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764
Re: [Python-Dev] PEP 0365: Adding the pkg_resources module
On 2007-05-21 20:01, Phillip J. Eby wrote: At 06:28 PM 5/21/2007 +0200, M.-A. Lemburg wrote: However, since this is not egg-specific it should probably be moved to pkgutil and get a separate PEP with detailed documentation (the link you provided doesn't really explain the concepts, reading the code helped a bit). That doesn't really make sense in the context of the current PEP, though, which isn't to provide a general-purpose namespace package API; it's specifically about adding an existing piece of code to the stdlib, with its API intact. You seem to indicate that you're not up to discussing the concepts implemented by the module and *integrating* them with the Python stdlib. Please correct me if I'm wrong, but if the whole point of the PEP is a take it or leave it decision, then I don't see the point of discussing it. I'm -1 on adding the module in its current state; I'd be +1 on integrating the concepts with the Python stdlib. Hope I'm wrong, -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 21 2007) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 0365: Adding the pkg_resources module
On 2007-05-21 22:48, Phillip J. Eby wrote: At 08:56 PM 5/21/2007 +0200, M.-A. Lemburg wrote: On 2007-05-21 20:01, Phillip J. Eby wrote: At 06:28 PM 5/21/2007 +0200, M.-A. Lemburg wrote: However, since this is not egg-specific it should probably be moved to pkgutil and get a separate PEP with detailed documentation (the link you provided doesn't really explain the concepts, reading the code helped a bit). That doesn't really make sense in the context of the current PEP, though, which isn't to provide a general-purpose namespace package API; it's specifically about adding an existing piece of code to the stdlib, with its API intact. You seem to indicate that you're not up to discussing the concepts implemented by the module and *integrating* them with the Python stdlib. No, I'm saying something else. I'm saying it: 1. has nothing to do with the PEP, 2. isn't something I'm volunteering to do, and 3. would only make sense to do as part of Python 3 stdlib reorganization, if it were done at all. I don't understand that last part: how can adding a new module or set of modules require waiting for reorganization of the stdlib ? All I'm suggesting is to reorganize the code in pkg_resources.py a bit and move the relevant bits into pkgutil.py and into a new eggutil.py. Now, the code is certainly under an open license, and the concepts are entirely free for anyone to use. If somebody wishes to do what you're describing, they're certainly welcome to take on that thankless task. But I personally don't see the point, since by definition that new API would have *no current users*. And the purpose of the PEP is to serve the (rather large) audience that would like to take advantage of existing software that uses the API. Thus, any proposal to alter that API faces a high entry barrier to show how the proposed changes would provide a signficant practical benefit to users. Why is that ? You can easily provide a pkg_resource.py module with your old API that interfaces to the new reorganized code in the stdlib. That's not even remotely similar to take it or leave it. It might *seem* that way, of course, simply because in any proposal to change the API, there's an implicit question of why nobody proposed the change via the Distutils-SIG, sometime during the last 2+ years of discussions around that API. This doesn't have anything to do with distutils. It's entirely about the egg distribution format. I remain open-minded and curious as to the possibility that someone *could* propose a meaningful change, but am also rationally skeptical that someone actually *will* come up with something that would outweigh the user benefit of keeping the already published, already discussed, already field-tested, already in-use API. For that matter, I remain open-minded and curious as to the possibility of whether someone could propose a reasonable justification for *not* including the module in the stdlib. After all, last year Fredrik Lundh surprised me with a convincing rationale for *not* including setuptools in the stdlib, which is why I backed off on doing so in 2.5, and am now proffering a much-reduced-in-scope proposal for 2.6. So, I'm perfectly willing and able to change my mind, given convincing reasons to do so. So far, though, your change suggestions haven't even explained why *you* want them, let alone why anybody else should agree. We can hardly discuss what you haven't yet said. I'm not sure what you want to hear from me. You asked for comments, I wrote back and gave you comments. I also made it clear why I think that breaking up the addition into different PEPs makes a lot of sense and why separating the code into different modules for the same reason makes a lot of sense as well. I also tried to stir up some discussion to make life easier for setuptools by suggesting a user-package directory on sys.path and adding support for namespace packages as general Python feature instead of hiding it away in pkg_resources.py. You should see this as chance to introduce new concepts to Python. Instead you seem to feel offended every time someone suggests a change in your design. That's also the reason why I stopped discussing things with you on the distutils list. There was simply no way of getting through to you. Perhaps we should just meet up for a beer in London sometime and sort things out ;-) Cheers, -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 21 2007) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld
Re: [Python-Dev] \u and \U escapes in raw unicode string literals
On 2007-05-12 02:42, Andrew McNabb wrote: On Sat, May 12, 2007 at 01:30:52AM +0200, M.-A. Lemburg wrote: I wonder how we managed to survive all these years with the existing consistent and concise definition of the raw-unicode-escape codec ;-) There are two options: * no one really uses Unicode raw strings nowadays * none of the existing users has ever stumbled across the problem case that triggered all this Both ways, we're discussing a non-issue. Sure, it's a non-issue for Python 2.x. However, when Python 3 comes along, and all strings are Unicode, there will likely be a lot more users stumbling into the problem case. In the first case, changing the codec won't affect much code when ported to Py3k. In the second case, a change to the codec is not necessary. Please also consider the following: * without the Unicode escapes, the only way to put non-ASCII code points into a raw Unicode string is via a source code encoding of say UTF-8 or UTF-16, pretty much defeating the original requirement of writing ASCII code only * non-ASCII code points in text are not uncommon, they occur in most European scripts, all Asian scripts, many scientific texts and in also texts meant for the web (just have a look at the HTML entities, or think of Word exports using quotes) * adding Unicode escapes to the re module will break code already using ...\u... in the regular expressions for other purposes; writing conversion tools that detect this usage is going to be hard * OTOH, writing conversion tools that simply work on string literals in general is easy Thanks, -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 13 2007) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] \u and \U escapes in raw unicode string literals
On 2007-05-13 18:04, Martin v. Löwis wrote: * without the Unicode escapes, the only way to put non-ASCII code points into a raw Unicode string is via a source code encoding of say UTF-8 or UTF-16, pretty much defeating the original requirement of writing ASCII code only That's no problem, though - just don't put the Unicode character into a raw string. Use plain strings if you have a need to include Unicode characters, and are not willing to leave ASCII. For Python 3, the default source encoding is UTF-8, so it is much easier to use non-ASCII characters in the source code. The original requirement may not be as strong anymore as it used to be. You can do that today: Just put the # coding: utf-8 marker at the top of the file. However, in some cases, your editor may not be capable of displaying or letting you enter the Unicode text you have in mind. In other cases, there may be a corporate coding standard in place that prohibits using non-ASCII text in source code, or fixes the encoding to e.g. Latin-1. In all those cases, it's necessary to be able to enter the Unicode code points which do cannot be used in the source code using other means and the easiest way to do this is by using Unicode escapes. * non-ASCII code points in text are not uncommon, they occur in most European scripts, all Asian scripts, many scientific texts and in also texts meant for the web (just have a look at the HTML entities, or think of Word exports using quotes) And you are seriously telling me that people who commonly use non-ASCII code points in their source code are willing to refer to them by Unicode ordinal number (which, of course, they all know by heart, from 1 to 65536)? No, I'm not. I'm saying that non-ASCII code points are in common use and (together with the above bullet) that there are situations where you can't put the relevant code point directly into your source code. Using Unicode escapes for these will always be a cludge, but it's still better than not being able to enter the code points at all. * adding Unicode escapes to the re module will break code already using ...\u... in the regular expressions for other purposes; writing conversion tools that detect this usage is going to be hard It's unlikely to occur in code today - \u just means the same as u (so \u1234 matches u1234); if you want a backslash followed by u in your regular expression, you should write \\u. It would be possible to future-warn about \u in 2.6, catching these cases. Authors then would either have to remove the backslash, or duplicate it, depending on what they want to express. Good idea. The re module would then have to implement the same escaping scheme as the raw-unicode-escape code (only an odd number of backslashes causes the escaping code to trigger). -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 13 2007) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] \u and \U escapes in raw unicode string literals
On 2007-05-11 07:52, Martin v. Löwis wrote: This is what prompted my question, actually: in Py3k, in the str/unicode unification branch, r\u1234 changes meaning: before the unification, this was an 8-bit string, where the \u was not special, but now it is a unicode string, where \u *is* special. That is true for non-raw strings also: the meaning of \u1234 also changes. However, traditionally, there was *no* escaping mechanism in raw strings in Python, and I feel that this is a good principle, because it is easy to learn (if you leave out the detail that \ can't be the last character in a raw string - which should get fixed also, IMO). So I think in Py3k, \u1234 should continue to be a string with 6 characters. Otherwise, people will complain that os.stat(rc:\windows\system32\user32.dll) fails. Telling them to write os.stat(rc:\windows\system32\u005Cuser32.dll) will just cause puzzled faces. Using double backslashes won't cause that reaction: os.stat(c:\\windows\\system32\\user32.dll) Also note that Windows is smart enough nowadays to parse the good old Unix forward slash: os.stat(c:/windows/system32/user32.dll) Windows path names are one of the two primary applications of raw strings (the other being regexes). IMHO the primary use case are regexps and for those you'd definitely want to be able to put Unicode characters into your expressions. BTW, if you use ur... for your expressions today (which you should if you parse text), then nothing will change when removing the 'u' prefix in Py3k. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 11 2007) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] \u and \U escapes in raw unicode string literals
On 2007-05-11 13:05, Thomas Heller wrote: M.-A. Lemburg schrieb: On 2007-05-11 07:52, Martin v. Löwis wrote: This is what prompted my question, actually: in Py3k, in the str/unicode unification branch, r\u1234 changes meaning: before the unification, this was an 8-bit string, where the \u was not special, but now it is a unicode string, where \u *is* special. That is true for non-raw strings also: the meaning of \u1234 also changes. However, traditionally, there was *no* escaping mechanism in raw strings in Python, and I feel that this is a good principle, because it is easy to learn (if you leave out the detail that \ can't be the last character in a raw string - which should get fixed also, IMO). So I think in Py3k, \u1234 should continue to be a string with 6 characters. Otherwise, people will complain that os.stat(rc:\windows\system32\user32.dll) fails. Telling them to write os.stat(rc:\windows\system32\u005Cuser32.dll) will just cause puzzled faces. Using double backslashes won't cause that reaction: os.stat(c:\\windows\\system32\\user32.dll) Sure. But I want to use raw strings for Windows path names; it's much easier to type. But think of the price to pay if we disable use of Unicode escapes in raw strings. And all of this just because of the one special case: having a file name that starts with a U and needs to be referenced literally in a Python application together with a path leading up to it. BTW, there's an easy work-around for this special case: os.stat(os.path.join(rc:\windows\system32, user32.dll)) Also note that Windows is smart enough nowadays to parse the good old Unix forward slash: os.stat(c:/windows/system32/user32.dll) In my opinion this is a windows bug and not a features. Especially because there are Windows api functions (the shell functions, IIRC) that do NOT accept forward slashes. Would you say that *nix is dumb because it doesn't parse \\usr\\include? Sorry, I wasn't trying to imply that Windows is/was a dumb system. I think it's nice that you can use forward slashes on Windows - makes writing code that works in both worlds (Unix and Windows) a lot easier. Windows path names are one of the two primary applications of raw strings (the other being regexes). -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 11 2007) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] \u and \U escapes in raw unicode string literals
On 2007-05-10 20:53, Paul Moore wrote: On 10/05/07, Guido van Rossum [EMAIL PROTECTED] wrote: I just discovered that, in all versions of Python as far back as I have access to (2.0), \u escapes are interpreted inside raw unicode strings. Thus: [...] Does anyone remember why it is done this way? The reference manual describes this behavior, but doesn't give an explanation: My memory is so dim as to be more speculation than anything else, but I suspect it's simply because there's no other way of including characters outside the ASCII range in a raw string. This is per design (see PEP 100) and was done for the reason given by Paul. The motivation for the chosen approach was to make Python's raw Unicode strings compatible to Java's raw Unicode strings: http://java.sun.com/docs/books/jls/second_edition/html/lexical.doc.html -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 10 2007) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] \u and \U escapes in raw unicode string literals
On 2007-05-11 00:11, Guido van Rossum wrote: On 5/10/07, M.-A. Lemburg [EMAIL PROTECTED] wrote: On 2007-05-10 20:53, Paul Moore wrote: On 10/05/07, Guido van Rossum [EMAIL PROTECTED] wrote: I just discovered that, in all versions of Python as far back as I have access to (2.0), \u escapes are interpreted inside raw unicode strings. Thus: [...] Does anyone remember why it is done this way? The reference manual describes this behavior, but doesn't give an explanation: My memory is so dim as to be more speculation than anything else, but I suspect it's simply because there's no other way of including characters outside the ASCII range in a raw string. This is per design (see PEP 100) and was done for the reason given by Paul. The motivation for the chosen approach was to make Python's raw Unicode strings compatible to Java's raw Unicode strings: http://java.sun.com/docs/books/jls/second_edition/html/lexical.doc.html I'm not sure what Java compatibility buys us. It is also far from perfect -- IIUC, in Java if you write \u0022 (that's the character) it counts as an opening or closing quote, and if you write \u005c (a backslash) it can be used to escape the following character. OTOH, in Python, you can write urC:\Program Files\u005c and voila, a raw string terminating in a backslash. (In Java this would escape the instead.) http://mail.python.org/pipermail/python-dev/1999-November/001346.html http://mail.python.org/pipermail/python-dev/1999-November/001392.html and all the other postings in that month related to this. However, I understand the other reason (inclusion of non-ASCII characters in raw strings) and I reluctantly agree with it. Reluctantly, because it means I can't create a raw string containing a \ followed by u or U -- I needed one of those today. print ur\u005cu \u -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 11 2007) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Changing string constants to byte arrays in Py3k
On 2007-05-04 19:51, Guido van Rossum wrote: [-python-dev] On 5/4/07, Fred L. Drake, Jr. [EMAIL PROTECTED] wrote: On Friday 04 May 2007, M.-A. Lemburg wrote: I also suggest making all bytes literals immutable to avoid running into any issues like the above. +1 from me. Rather than adding immutability to bytes objects (which has big implementation and type checking implications), consider using buffer(b123) as an immutable bytes literal. You can freely concatenate and compare buffer objects with bytes objects. I like Georg's idea of having an immutable bytes subclass. babc could then be a shortcut constructor for this subclass. In general, I don't think it's a good idea to have literals turn into mutable objects, since literals are normally perceived as being constant. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 05 2007) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Changing string constants to byte arrays in Py3k
On 2007-05-05 18:11, Steven Bethard wrote: On 5/5/07, M.-A. Lemburg [EMAIL PROTECTED] wrote: On 2007-05-04 19:51, Guido van Rossum wrote: [-python-dev] On 5/4/07, Fred L. Drake, Jr. [EMAIL PROTECTED] wrote: On Friday 04 May 2007, M.-A. Lemburg wrote: I also suggest making all bytes literals immutable to avoid running into any issues like the above. +1 from me. Rather than adding immutability to bytes objects (which has big implementation and type checking implications), consider using buffer(b123) as an immutable bytes literal. You can freely concatenate and compare buffer objects with bytes objects. I like Georg's idea of having an immutable bytes subclass. babc could then be a shortcut constructor for this subclass. In general, I don't think it's a good idea to have literals turn into mutable objects, since literals are normally perceived as being constant. Does that mean you want list literals to be immutable too? lst = ['a', 'b', 'c'] lst.append('d') # raises an error? Sorry, I was referring to Python literals: http://docs.python.org/ref/literals.html ie. strings and numeric constant values defined in a Python program. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 05 2007) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Changing string constants to byte arrays ([Python-checkins] r55119 - in python/branches/py3k-struni/Lib: codecs.py test/test_codecs.py)
Hi Walter, if the bytes type does turn out to be a mutable type as suggested in PEP 358, then please make sure that no code (C code in particular), relies on the constantness of these byte objects. This is especially important when it comes to codecs, since the error callback logic would allow the callback to manipulate the byte object contents and length without the codec taking note of this change. I expect there to be other places in the interpreter which would break as well. Otherwise, you end up opening the door for segfaults and easy DOS attacks on Python3. Regards, -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 04 2007) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 On 2007-05-04 15:05, walter.doerwald wrote: Author: walter.doerwald Date: Fri May 4 15:05:09 2007 New Revision: 55119 Modified: python/branches/py3k-struni/Lib/codecs.py python/branches/py3k-struni/Lib/test/test_codecs.py Log: Make the BOM constants in codecs.py bytes. Make the buffered input for decoders a bytes object. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 0365: Adding the pkg_resources module
On 2007-05-01 02:29, Phillip J. Eby wrote: I wanted to get this in before the Py3K PEP deadline, since this is a Python 2.6 PEP that would presumably impact 3.x as well. Feedback welcome. Could you add a section that explains the side effects of importing pkg_resources ? The documentation of the module doesn't mention any, but the code suggests that you are installing (some form of) import hooks. Some other comments: * Wouldn't it be better to factor out all the meta-data access code that's not related to eggs into pkgutil ?! * How about then renaming the remaining module to egglib ?! * The module needs some reorganization: imports, globals and constants at the top, maybe a few comments delimiting the various sections, * The get_*_platform() should probably use the platform module which is a lot more flexible than distutils' get_platform() (which should probably use the platform module as well in the long run) Thanks, -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 04 2007) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 PEP: 365 Title: Adding the pkg_resources module Version: $Revision: 55032 $ Last-Modified: $Date: 2007-04-30 20:24:48 -0400 (Mon, 30 Apr 2007) $ Author: Phillip J. Eby [EMAIL PROTECTED] Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 30-Apr-2007 Post-History: 30-Apr-2007 Abstract This PEP proposes adding an enhanced version of the ``pkg_resources`` module to the standard library. ``pkg_resources`` is a module used to find and manage Python package/version dependencies and access bundled files and resources, including those inside of zipped ``.egg`` files. Currently, ``pkg_resources`` is only available through installing the entire ``setuptools`` distribution, but it does not depend on any other part of setuptools; in effect, it comprises the entire runtime support library for Python Eggs, and is independently useful. In addition, with one feature addition, this module could support easy bootstrap installation of several Python package management tools, including ``setuptools``, ``workingenv``, and ``zc.buildout``. Proposal Rather than proposing to include ``setuptools`` in the standard library, this PEP proposes only that ``pkg_resources`` be added to the standard library for Python 2.6 and 3.0. ``pkg_resources`` is considerably more stable than the rest of setuptools, with virtually no new features being added in the last 12 months. However, this PEP also proposes that a new feature be added to ``pkg_resources``, before being added to the stdlib. Specifically, it should be possible to do something like:: python -m pkg_resources SomePackage==1.2 to request downloading and installation of ``SomePackage`` from PyPI. This feature would *not* be a replacement for ``easy_install``; instead, it would rely on ``SomePackage`` having pure-Python ``.egg`` files listed for download via the PyPI XML-RPC API, and the eggs would be placed in the ``$PYTHONEGGS`` cache, where they would **not** be importable by default. (And no scripts would be installed) However, if the download egg contains installation bootstrap code, it will be given a chance to run. These restrictions would allow the code to be extremely simple, yet still powerful enough to support users downloading package management tools such as ``setuptools``, ``workingenv`` and ``zc.buildout``, simply by supplying the tool's name on the command line. Rationale = Many users have requested that ``setuptools`` be included in the standard library, to save users needing to go through the awkward process of bootstrapping it. However, most of the bootstrapping complexity comes from the fact that setuptools-installed code cannot use the ``pkg_resources`` runtime module unless setuptools is already installed. Thus, installing setuptools requires (in a sense) that setuptools already be installed. Other Python package management tools, such as ``workingenv`` and ``zc.buildout``, have similar bootstrapping issues, since they both make use of setuptools, but also want to provide users with something approaching a one-step install. The complexity of creating bootstrap utilities for these and any other such tools that arise in future, is greatly reduced if ``pkg_resources`` is already present, and is also able to download pre-packaged
Re: [Python-Dev] Changing string constants to byte arrays ([Python-checkins] r55119 - in python/branches/py3k-struni/Lib: codecs.py test/test_codecs.py)
On 2007-05-04 18:53, Georg Brandl wrote: M.-A. Lemburg schrieb: Hi Walter, if the bytes type does turn out to be a mutable type as suggested in PEP 358, then please make sure that no code (C code in particular), relies on the constantness of these byte objects. This is especially important when it comes to codecs, since the error callback logic would allow the callback to manipulate the byte object contents and length without the codec taking note of this change. I expect there to be other places in the interpreter which would break as well. Otherwise, you end up opening the door for segfaults and easy DOS attacks on Python3. If the user does not need to change these bytes objects and this is needed in more places, adding an immutable flag for internal bytes objects only settable from C, or even an immutable byte base class might be an idea. +1 I also suggest making all bytes literals immutable to avoid running into any issues like the above. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 04 2007) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Hindsight on Py_UNICODE_WIDE?
On 2007-03-23 19:18, Jason Orendorff wrote: Scheme is adding Unicode support in an upcoming standard: (DRAFT) http://www.r6rs.org/document/lib-html/r6rs-lib-Z-H-3.html I have two questions for the python-dev team about Python's Unicode experiences. If it's convenient, please take a moment to reply. Thanks in advance. 1. In hindsight, what do you think about PEP 261, the Py_UNICODE_WIDE build option? On balance, has this been good, bad, or indifferent? What's good/bad about it? Having narrow and wide builds introduces a level of complexity that seems unnecessary. Few people ever use non-BMP code points and the ones who do can easily get away with UTF-16 surrogates. Most Unixes have chosen to go with UCS4 as storage format, so you have little choice if you want to take advantage of mapping directly to wchar on Unix. Windows has chosen UTF-16 as internal storage format and wchar is 16-bit on that platform. You may also want to consider looking at PEP 263: http://www.python.org/dev/peps/pep-0263 Source code encoding is a great thing ! You can now write native Unicode in Python source code. The only downside is the extra complexity added by the fact that the tokenizer in Py2 works on 8-bit characters. For this reason we had to decode the source code to Unicode, then encode it to UTF-8, pass it to the tokenizer and then decode the UTF-8 literal strings for Unicode back into Unicode again. Ideally, the tokenizer in Py3k should be rewritten to work directly on Unicode. 2. The idea of multiple string representations has come up (that is, where all strings are Unicode, but in memory some are 8-bit, some 16-bit, and some 32-bit--each string uses the narrowest possible representation). This has been discussed here for Python 3000. My question is: Is this for real? How far along is it? How likely is it? My suggestion for Scheme is not to go down that route. It adds complexity for little added value and also makes the implementation slower (due to the frequent conversion from one internal format to another). Can't comment on Py3k - I'm out of that loop. If you want to know more about how Unicode was added to Python 2.x and how it can be used, I suggest you read the following: Unicode integration (one of the first PEPs ever written :-): http://www.python.org/dev/peps/pep-0100 Unicode in Python: http://www.egenix.com/files/python/EuroPython2002-Python-and-Unicode.pdf Designing Unicode-aware Applications in Python: http://www.egenix.com/files/python/EPC2006-Developing-Unicode-aware-applications-in-Python.pdf Hope that helps, -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Mar 23 2007) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal to revert r54204 (splitext change)
On 2007-03-15 07:45, Martin v. Löwis wrote: Phillip J. Eby schrieb: And yet, that incorrect behavior was clearly intended by the author(s) of the code, test, and docstrings. As it happens, Guido wrote that code (16 years ago) and the docstring (9 years ago), in the case of the posixpath module at least. I don't find it that clear that it was the intention, AFAICT, it could have been an accident also. Guido added the doc strings as a contribution from Charles G. Waldman; he may just have documented the implemented behavior. In r4493, Sjoerd Mullender changed splitext (in an incompatible way) so that it would split off only the last extension, before, foo.tar.gz would be split into 'foo', '.tar.gz'. So it's clear that the intention was always to split off the extension, whether or not the behavior on dotfiles was considered I cannot tell. As for Doc/lib, in r6524 Guido changed it to document the actual behavior, from the last component of \var{root} contains no periods, and \var{ext} is empty or begins with a period. to and \var{ext} is empty or begins with a period and contains at most one period. So it seems the original (Guido's) intention was that it splits of all extensions; Sjoerd then changed it to split off only the last extension. Whatever the intention was or has been: the term extension itself is not well-defined, so there's no obvious right way to implement an API that splits off an extension. E.g. in some cases, .tar.gz is considered an extension, in others, the .gz part is just a transfer encoding and .tar the extension. Then you have .tgz which is a bit of both. It also depends on the platform, e.g. on Windows, only the very last part of a filename is used as extension by the OS to determine the (MIME) type of a file. As always, it's best to just right your own application-specific code to get defined behavior. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Mar 15 2007) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] These csv test cases seem incorrect to me...
Hi Skip, On 2007-03-12 03:01, [EMAIL PROTECTED] wrote: I decided it would be worthwhile to have a csv module written in Python (no C underpinnings) for a number of reasons: * It will probably be easier to add Unicode support to a Python version * More people will be able to read/grok/modify/fix bugs in a Python implementation than in the current mixed Python/C implementation. * With alternative implementations of Python available (PyPy, IronPython, Jython) it makes sense to have a Python version they can use. Lots of good reasons :-) I've written a Python-only Unicode aware CSV module for a client (mostly because CSV data tends to be quirky and I needed a quick way of dealing with corner cases). Perhaps I can get them to donate it to the PSF... I'm far from having anything which will pass the current test suite, but in diagnosing some of my current failures I noticed a couple test cases which seem wrong. In the TestDialectExcel class I see these two questionable tests: def test_quotes_and_more(self): self.readerAssertEqual('ab', [['ab']]) def test_quote_and_quote(self): self.readerAssertEqual('a b', [['a b']]) It seems to me that if a field starts with a quote it *has* to be a quoted field. Any quotes appearing within a quoted field have to be escaped and the field has to end with a quote. Both of these test cases fail on or the other assumption. If they are indeed both correct and I'm just looking at things crosseyed I think they at least deserve comments explaining why they are correct. Both test cases date from the first checkin. I performed the checkin because of the group developing the module I believe I was the only one with checkin privileges at the time, not because I wrote the test cases. Any ideas about why these test cases are in there? I can't imagine Excel generating either one. My recommendation: Let the module do whatever Excel does with such data. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Mar 14 2007) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] New syntax for 'dynamic' attribute access
On 2007-02-12 16:19, Georg Brandl wrote: Tim Delaney asked in particular: Have you checked if [the existing uses of getattr, where getattr in that scope is a function argument with default value the built-in getattr] are intended to bring the getattr name into local scope for fast lookup, or to force a binding to the builtin gettattr at compile time (two common (ab)uses of default arguments)? If they are, they would be better served by the new syntax. They're all in Lib/codecs.py, and are of the form: class StreamRecoder: def __getattr__(self, name, getattr=getattr): Inherit all other methods from the underlying stream. return getattr(self.stream, name) Without digging deeper into that code I'm afraid I can't say precisely what is going on. Since that is a special method and ought to have the signature __getattr__(self, name), I think it's safe to assume that that's meant as an optimization. I can confirm that: it's a case of fast-local-lookup optimization. You can add a -1 from me to the list as well: I don't think that dynamic lookups are common enough to warrant new syntax. Even if you do add a new syntax for this, using parenthesis is a poor choice IMHO as the resulting code looks too much like a function call (e.g. callable.(variable)). Other choices would be square brackets [], but these have the same problem as they are in use for indexing. The only brackets that are not yet overloaded in the context of applying them to an object are curly brackets, so callable.{variable} would cause enough raising eyebrows to not think of a typo. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Feb 12 2007) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Problem between deallocation of modules and func_globals
On 2007-01-20 00:01, Brett Cannon wrote: On 1/19/07, M.-A. Lemburg [EMAIL PROTECTED] wrote: On 2007-01-19 22:33, Brett Cannon wrote: That's a typical error situation you get in __del__ methods at the time the interpreter is shut down. Yeah, but in this case this is at the end of Py_Initialize() for the stuff I am doing to the interpreter. =) Is that in some error branch of Py_Initialize() ? Otherwise I don't see how the modules could get garbage-collected. Nope, it's code I am adding to clean out sys.modules of stuff the user didn't import themselves; it's for security reasons. I'm not sure whether that's really going to increase security: unloading of modules usually isn't safe and you cannot be sure that it's possible to reinitialize a C module once it has been loaded in the process. For Python modules this is often possible, but there still may be side-effects of the import that you cannot easily undo. Perhaps you should just move those modules out to a different dictionary and keep track of it in the import mechanism, so that while you can't access the module directly via sys.modules, the import mechanism still knows that it has been loaded and reinserts it into sys.modules if it gets imported again. I think that you get more security by explicitly limiting which modules and packages you allow to be imported in the first place and restricting what can be done with sys.path and sys.modules. I'm not exactly sure which global state you are referring to. The aliase map, the cache used by the search function ? encodings._cache . Note that the search function registry is a global managed in the thread state (it's not stored in any module). Right, but that is not the issue. If you have deleted the reference to the encodings module from sys.modules it then sets encodings._cache to None. After the deletion, if you try to encode/decode a unicode string you can an AttributeError about how encodings._cache does not have a 'get' method since it is now None instead of a dict. The function is fine and still runs, it's just that the global state it depends on is no longer the way it assume it should be. While I could add some tricks to have the cache dictionary stay alive even after the globals were set to None, I doubt that this will really fix the problem. The encoding package relies on the import mechanism, the codecs module and the _codecs builtin module. Any of these could fail to work depending on the order in which the modules get GCed. There's a reason why things in Py_Finalize() are as carefully ordered :-) Perhaps we need to apply some reordering to the steps in Py_Initialize() ?! Nah, I just need to not delete the modules. =) -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jan 20 2007) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Problem between deallocation of modules and func_globals
On 2007-01-18 20:53, Brett Cannon wrote: I have discovered an issue relating to func_globals for functions and the deallocation of the module it is contained within. Let's say you store a reference to the function encodings.search_function from the 'encodings' module (this came up in C code, but I don't see why it couldn't happen in Python code). Then you delete the one reference to the module that is stored in sys.modules, leading to its deallocation. That triggers the setting of None to every value in encodings.__dict__. Oops, now the global namespace for that module has everything valued at None. The dict doesn't get deallocated since a reference is held by encodings.search_function.func_globals and there is still a reference to that (technically held in the interpreter's codec_search_path field). So the function can still execute, but throws exceptions like AttributeError because a module variable that once held a dict now has None and thus doesn't have the 'get' method. That's a typical error situation you get in __del__ methods at the time the interpreter is shut down. The main reason for setting everything to None first is to break circular references and make sure that at least some of the object destructors can run. My question is whether this is at all worth trying to rectify. Since Google didn't turn anything up I am going to guess this is not exactly a common thing. =) That would lead me to believe some (probably most) of you will say, just leave it alone and work around it. If you can come up with a better way, sure :-) The other option I can think of is to store a reference to the module instead of just to its __dict__ in the function. The problem with that is we end up with a circular dependency of the functions in modules having a reference to the module but then the module having a reference to the functions. I tried not having the values in the module's __dict__ set to None if the reference count was above 1 and that solved this issue, but that leads to dangling references on anything in that dict that does not have a reference stored away somewhere else like encodings.search_function. Anybody have any ideas on how to deal with this short of rewriting some codecs stuff so that they don't depend on global state in the module or just telling me to just live with it? I'm not exactly sure which global state you are referring to. The aliase map, the cache used by the search function ? Note that the search function registry is a global managed in the thread state (it's not stored in any module). -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jan 19 2007) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] pep-3108.txt
On 2007-01-03 01:42, Brett Cannon wrote: On 1/2/07, M.-A. Lemburg [EMAIL PROTECTED] wrote: +Open Issues +=== + +Consolidate dependent modules together into a single module or package? ... +Consolidate certain modules with similar themes together in a package? +-- ... If you do follow this route, please take the chance to place the whole Python stdlib under a single package. That way we'll avoid name clashes with existing packages and modules now and in the future. That has been suggested before (including by me) and Guido has always shot it down. That's why I left it out of this proposal. Even if it is shot down again, it still deserves to be documented together with the reasons for being shot down. This is a one-in-a-lifetime chance, so it would be sad if it were not taken into account. The extra effort would be minimal - the renaming would have to be done using a script anyway and adding an extra 'from py import ' prefix to the modules wouldn't really make the renaming more complicated ;-) I was about to start writing an open issue on this since the biggest objection from Guido I could find on this topic is http://mail.python.org/pipermail/python-dev/2002-July/026409.html , but then it started to feel like a separate PEP to me. So I think I am going to pass on taking on this topic and let someone else tackle it in a PEP. Sorry, MAL, but I need to worry about my sanity on this one. =) Oh well, it seemed like a perfect fit for the scope of PEP 3108. Guido's reply seems to suggest that he's in favor of introducing a multi-package stdlib structure: I'm rejecting the proposal of a single top-level package named python. You've written that before, but you still haven't given any explanation of why a single package would be worse than a multi-level hierarchy of modules (e.g. grouped by application space). Because a single package doesn't have any other benefits besides getting out of the way from 3rd party developers. At least a proper hierarchy would have the other benefits of grouping. (But better make it a shallow hierarchy! remember Flat is better than nested.) AFAICT, he was only objecting having a single package without any extra restructuring. Then again, the post is from 2002 - so things may have changed. There have been a couple of attempts to reorg the stdlib into packages, but AFAIR, I see, all of them were withdrawn due to the problem of finding a suitable grouping (often enough, a module would be suitable for more than just one functional package, e.g. urllib would fit io as well as net) or lack of support from the developers. Now that we're discussing moving the include files into a subdirectory (for much the same reasons), I think it's time to reboot the discussion of a Python package with or without possible subpackages. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jan 04 2007) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] 2.5.1 plans
On 2007-01-04 07:59, Neal Norwitz wrote: The current schedule looks like it's shaping up to be: Wed, Jan 24 for 2.5.1c1 Wed Jan 31 for 2.5.1 It would be great if you could comment on some of the bug reports below. I think several already have patches/suggested fixes. It's not clear to me if this should be fixed, but it's got a high priority:: http://python.org/sf/1467929 %-formatting and dicts +1 The patch is ready to be applied. The only reason it got delayed was the 2.5 release timing. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jan 04 2007) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] pep-3108.txt
On 2007-01-03 00:35, Barry Warsaw wrote: On Jan 2, 2007, at 5:41 PM, M.-A. Lemburg wrote: Note that as side-effect of this it becomes a lot harder to manipulate PYTHONPATH to trick Python into loading a standard module from a non-standard location, improving security and robustness of the Python installations. Sometimes though you want to do this, as when you want your application to ensure it gets a particular version of a standard library module, regardless of the version of Python being used. And now we're back to application-specific site-packages ;). Well, I guess that's a rather particular use case and can probably only be safely implemented by the maintainer of the module or package in question ;-) In such (rare) cases, it should be possible to use one of the harder ways to achieve this: * monkey patching the package * using package.__path__ to redirect the in-package search * creating a private copy of the whole package which then has the modified modules and packages in place Regarding application specific package setups: In my experience it's better to have an application specific sys.path setup function that manages this, rather than trying to manipulate PYTHONPATH or trying to tweak Python's stdlib site.py into using some particular way of setting up application specific paths which then makes interop harder for all applications using Python, rather than just the few that require such setups. The application can then call this path setup function early on in the startup phase to make sure that the rest of the startup and the application's main code then imports the right modules and packages. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jan 03 2007) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] pep-3108.txt
On 2007-01-02 01:02, brett.cannon wrote: Author: brett.cannon Date: Tue Jan 2 01:02:41 2007 New Revision: 53204 Added: peps/trunk/pep-3108.txt (contents, props changed) Modified: peps/trunk/pep-.txt Log: Add PEP 3108: Standard Library Reorganization. ... +Open Issues +=== + +Consolidate dependent modules together into a single module or package? ... +Consolidate certain modules with similar themes together in a package? +-- ... If you do follow this route, please take the chance to place the whole Python stdlib under a single package. That way we'll avoid name clashes with existing packages and modules now and in the future. Together with absolute imports this also improves the readability of modules since it becomes immediately clear where the imported code is coming from. Note that as side-effect of this it becomes a lot harder to manipulate PYTHONPATH to trick Python into loading a standard module from a non-standard location, improving security and robustness of the Python installations. +Packages are often used to group together modules that have a similar +theme but do not have any direct relationship or dependency upon each +other. For Python 3.0 obvious groupings could be done since renaming +of various modules is already occurring. + +* collections ++ heapq ++ Queue ++ sets ++ UserDist ++ UserList ++ What to do with UserString? +- Have a package for Python implementations of built-in types + instead of putting the User* modules into 'collections'? +* mac ++ Various Mac-specific modules. ++ Same can be done for other platform-specific code. +* Profiling ++ cProfile ++ profile ++ hotshot ++ pstats +* email ++ mailbox ++ mhlib +* Databases ++ anydbm ++ dbhash ++ dbm ++ bsddb ++ dumbdbm ++ gdbm ++ whichdb +* Audio ++ aifc ++ audioop ++ chunk ++ ossaudiodev ++ sunau ++ wave ++ winsound +* Servers ++ BaseHTTPServer ++ CGIHTTPServer ++ DocXMLRPCServer ++ SimpleHTTPServer ++ SimpleXMLRPCServer ++ SocketServer The package names should probably be converted to lower-case to follow PEP 8. Thanks and Happy New Year, -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jan 02 2007) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] pep-3108.txt
On 2007-01-02 23:54, Brett Cannon wrote: On 1/2/07, M.-A. Lemburg [EMAIL PROTECTED] wrote: On 2007-01-02 01:02, brett.cannon wrote: Author: brett.cannon Date: Tue Jan 2 01:02:41 2007 New Revision: 53204 Added: peps/trunk/pep-3108.txt (contents, props changed) Modified: peps/trunk/pep-.txt Log: Add PEP 3108: Standard Library Reorganization. ... +Open Issues +=== + +Consolidate dependent modules together into a single module or package? ... +Consolidate certain modules with similar themes together in a package? +-- ... If you do follow this route, please take the chance to place the whole Python stdlib under a single package. That way we'll avoid name clashes with existing packages and modules now and in the future. That has been suggested before (including by me) and Guido has always shot it down. That's why I left it out of this proposal. Even if it is shot down again, it still deserves to be documented together with the reasons for being shot down. This is a one-in-a-lifetime chance, so it would be sad if it were not taken into account. The extra effort would be minimal - the renaming would have to be done using a script anyway and adding an extra 'from py import ' prefix to the modules wouldn't really make the renaming more complicated ;-) Together with absolute imports this also improves the readability of modules since it becomes immediately clear where the imported code is coming from. Note that as side-effect of this it becomes a lot harder to manipulate PYTHONPATH to trick Python into loading a standard module from a non-standard location, improving security and robustness of the Python installations. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jan 02 2007) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] __str__ and unicode
On 2006-12-06 10:26, Fredrik Lundh wrote: over at my work copy of the python language reference, Adrian Holovaty asked about the exact semantics of the __str__ hook: http://effbot.org/pyref/__str__ The return value must be a string object. Does this mean it can be a *Unicode* string object? This distinction is ambiguous to me because unicode objects and string objects are both subclasses of basestring. May a __str__() return a Unicode object? I seem to remember earlier discussions on this topic, but don't recall when and what. From what I can tell, __str__ may return a Unicode object, but only if can be converted to an 8-bit string using the default encoding. Is this on purpose or by accident? Do we have a plan for improving the situation in future 2.X releases ? This was added to make the transition to all Unicode in 3k easier: .__str__() may return a string or Unicode object. .__unicode__() must return a Unicode object. There is no restriction on the content of the Unicode string for .__str__(). -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Dec 06 2006) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] __str__ and unicode
On 2006-12-06 10:46, M.-A. Lemburg wrote: On 2006-12-06 10:26, Fredrik Lundh wrote: over at my work copy of the python language reference, Adrian Holovaty asked about the exact semantics of the __str__ hook: http://effbot.org/pyref/__str__ The return value must be a string object. Does this mean it can be a *Unicode* string object? This distinction is ambiguous to me because unicode objects and string objects are both subclasses of basestring. May a __str__() return a Unicode object? I seem to remember earlier discussions on this topic, but don't recall when and what. From what I can tell, __str__ may return a Unicode object, but only if can be converted to an 8-bit string using the default encoding. Is this on purpose or by accident? Do we have a plan for improving the situation in future 2.X releases ? This was added to make the transition to all Unicode in 3k easier: .__str__() may return a string or Unicode object. .__unicode__() must return a Unicode object. There is no restriction on the content of the Unicode string for .__str__(). One more thing, since these two hooks are commonly used with str() and unicode(): * unicode(obj) will first try .__unicode() and then revert to .__str__() (possibly converting the string return value to Unicode) * str(obj) will try .__str__() only (possibly converting the Unicode return value to a string using the default encoding) -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Dec 06 2006) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] __str__ and unicode
On 2006-12-06 10:56, Fredrik Lundh wrote: M.-A. Lemburg wrote: This was added to make the transition to all Unicode in 3k easier: thanks for the clarification. do you recall when this was added? 2.5? Not really, only that it was definitely before 2.5. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Dec 06 2006) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP: Adding data-type objects to Python
Travis E. Oliphant wrote: PEP: unassigned Title: Adding data-type objects to the standard library Attributes kind -- returns the basic kind of the data-type. The basic kinds are: 't' - bit, 'b' - bool, 'i' - signed integer, 'u' - unsigned integer, 'f' - floating point, 'c' - complex floating point, 'S' - string (fixed-length sequence of char), 'U' - fixed length sequence of UCS4, Shouldn't this read fixed length sequence of Unicode ?! The underlying code unit format (UCS2 and UCS4) depends on the Python version. 'O' - pointer to PyObject, 'V' - Void (anything else). -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 28 2006) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com