Re: [Python-Dev] [Python-checkins] r86930 - in python/branches/py3k: Doc/library/os.rst Lib/os.py Lib/test/test_os.py Misc/ACKS Misc/NEWS
On Thu, Dec 2, 2010 at 5:05 PM, terry.reedy python-check...@python.org wrote:
> + If
> + the target directory with the same mode as we specified already exists,
> + raises an :exc:`OSError` exception if *exist_ok* is False, otherwise no
> + exception is raised. If the directory cannot be created in other cases,
> + raises an :exc:`OSError` exception.

I would suggest being explicit here that the case where the directory exists, but has a mode other than the one requested, always triggers an exception. Perhaps something like the following:

Raises an :exc:`OSError` exception if the target directory already exists, unless *exist_ok* is True and the existing directory has the same mode as is specified in the current call. Also raises an :exc:`OSError` exception if the directory cannot be created for any other reason.

Cheers,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
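The *exist_ok* part of the wording above can be tried out with a short sketch. It assumes a Python version where os.makedirs() accepts *exist_ok* (3.2 or later) and only exercises the "directory already exists" case; the mode comparison is deliberately left out. The helper name makedirs_demo and the path a/b are made up for the illustration.

```python
import os
import tempfile

def makedirs_demo():
    """Return (second_call_ok, third_call_raised) for an existing directory."""
    with tempfile.TemporaryDirectory() as base:
        target = os.path.join(base, "a", "b")   # hypothetical nested path

        os.makedirs(target)                     # creates a/ and a/b/

        # Same call again with exist_ok=True: the existing directory is accepted.
        os.makedirs(target, exist_ok=True)
        second_call_ok = os.path.isdir(target)

        # Without exist_ok (it defaults to False) an OSError is raised.
        try:
            os.makedirs(target)
        except OSError:
            third_call_raised = True
        else:
            third_call_raised = False

        return second_call_ok, third_call_raised

print(makedirs_demo())
```

Catching OSError rather than a more specific subclass keeps the sketch in line with the documentation text being discussed.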
Re: [Python-Dev] AIX 5.3 - Enabling Shared Library Support Vs Extensions
Hi Anurag,

On 25/11/2010 10:24, Anurag Chourasia wrote:
> All,
> When I configure python to enable shared libraries, none of the extensions are getting built during the make step due to this error.

You may want to take a look at the following issue: http://bugs.python.org/issue941346

Python compiled with shared libraries was broken on AIX until recently. There are some patches there to get it to work, or you may want to test the latest 2.7 or 3.x releases.

Regards
--
Sébastien Sablé
Re: [Python-Dev] [Python-checkins] r86924 - python/branches/py3k/Doc/library/random.rst
On Thu, Dec 2, 2010 at 12:41 PM, raymond.hettinger python-check...@python.org wrote:
> +A more general approach is to arrange the weights in a cumulative probability
> +distribution with :func:`itertools.accumulate`, and then locate the random value
> +with :func:`bisect.bisect`::
> +
> +    choices, weights = zip(*weighted_choices)
> +    cumdist = list(itertools.accumulate(weights))
> +    x = random.random() * cumdist[-1]
> +    choices[bisect.bisect(cumdist, x)]
> +'Blue'

“pydoc bisect.bisect” is empty (“Alias for bisect_right()”); in the code, bisect.bisect is noted as a compatibility alias. Wouldn’t it be more helpful to use the newer name?

Regards
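For reference, here is a self-contained sketch of the same cumulative-distribution approach written with bisect.bisect_right, the name suggested above. The function name weighted_choice, the optional x parameter (added only so the lookup can be checked deterministically) and the sample pairs are assumptions made for this illustration, not part of the patch.

```python
import bisect
import itertools
import random

def weighted_choice(weighted_choices, x=None):
    """Pick one value from (value, weight) pairs, proportionally to weight."""
    choices, weights = zip(*weighted_choices)
    cumdist = list(itertools.accumulate(weights))   # running totals
    if x is None:
        x = random.random() * cumdist[-1]           # 0 <= x < total weight
    # bisect_right returns the index of the first running total greater than x.
    return choices[bisect.bisect_right(cumdist, x)]

pairs = [("Red", 3), ("Blue", 2), ("Yellow", 1), ("Green", 4)]  # illustrative data
print(weighted_choice(pairs))
```

With these weights the running totals are [3, 5, 6, 10], so any x below 3 selects "Red" and an x of exactly 3 already falls into the "Blue" bucket.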
Re: [Python-Dev] Python and the Unicode Character Database
Stephen J. Turnbull:
> Here's why: '''print "%d" % some_integer''' doesn't now, and never will (unless Kristan gets his Python 2.8 <wink>), produce Arabic or Han numerals. Not in any language I know of, not in Microsoft Excel, and definitely not in Python 2.

While I don't have Excel to test with, OpenOffice.org Calc will display in Arabic or Han numerals using the NatNum format codes. http://www.scintilla.org/ArabicNumbers.png

> Ditto Arabic, I would imagine; ISO 8859/6 (aka Latin/Arabic) does not contain the Arabic digits that have been presented here earlier AFAICT. Note that there's plenty of space for them in that code table (eg, 0xB0-0xB9 is empty). Apparently nobody *ever* thought it was useful to have them!

DOS code page 864 does use 0xB0-0xB9 for ٠ .. ٩. http://www.ascii.ca/cp864.htm

Neil
Re: [Python-Dev] Python and the Unicode Character Database
On 01.12.2010 23:39, Martin v. Löwis wrote:
>> As of today, What’s New In Python 3.2 [1] does not even mention the unicodedata upgrade to 6.0.0.
> One reason was that I was instructed not to change What's New a few years ago.

Maybe all past, present and future whatsnew maintainers can agree on these rules, which I copied directly from whatsnew/3.2.rst?

Rules for maintenance:

* Anyone can add text to this document. Do not spend very much time on the wording of your changes, because your text will probably get rewritten to some degree.

* The maintainer will go through Misc/NEWS periodically and add changes; it's therefore more important to add your changes to Misc/NEWS than to this file.

* This is not a complete list of every single change; completeness is the purpose of Misc/NEWS. Some changes I consider too small or esoteric to include. If such a change is added to the text, I'll just remove it. (This is another reason you shouldn't spend too much time on writing your addition.)

* If you want to draw your new text to the attention of the maintainer, add 'XXX' to the beginning of the paragraph or section.

* It's OK to just add a fragmentary note about a change. For example:

  XXX Describe the transmogrify() function added to the socket module.

  The maintainer will research the change and write the necessary text.

* You can comment out your additions if you like, but it's not necessary (especially when a final release is some months away).

* Credit the author of a patch or bugfix. Just the name is sufficient; the e-mail address isn't necessary. It's helpful to add the issue number:

  XXX Describe the transmogrify() function added to the socket module. (Contributed by P.Y. Developer; :issue:`12345`.)

  This saves the maintainer the effort of going through the SVN log when researching a change.
Georg
[Python-Dev] Change to the Distutils / Distutils2 workflow
Hey,

Eric and I discussed the debugging workflow and agreed that our life would be easier if every bug fix landed first in Distutils2 when it makes sense, and then got backported to Distutils1.

For other core-devs, that would mean that your patches should be done against hg.python.org/distutils2, which uses unittest2. Then Eric and I would take care of the backporting.

I am planning to set up a wiki page with the workflow as soon as I get a chance.

Thanks
Tarek

--
Tarek Ziadé | http://ziade.org
Re: [Python-Dev] Python and the Unicode Character Database
2010/12/2 Stephen J. Turnbull step...@xemacs.org:
> Because that works, but print(T1234) doesn't (it prints ASCII). You can't round-trip, but users will want/expect that.

You should be able to round-trip, absolutely. I don't think you should expect print() to do that. str(56) possibly. :) That's an argument for it to be in a module, as you would then need to pass in a parameter saying which decimal characters you want.

> T1000 = float('一.◯◯◯')

That was already discussed here, and it's clear that Unicode does not consider these characters to be something you can use in a decimal number, and hence it's not broken.
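A small sketch of the point made above, assuming Python 3: float() rejects the string from the quoted example because its characters carry no decimal-digit value in the Unicode database. The helper try_float is a made-up name for this illustration.

```python
import unicodedata

def try_float(text):
    """Return float(text), or None if float() rejects the string."""
    try:
        return float(text)
    except ValueError:
        return None

sample = "一.◯◯◯"                     # the string from the quoted example

print(try_float("1.000"))             # plain ASCII digits are accepted
print(try_float(sample))              # rejected: prints None

# None of the non-ASCII characters has a decimal value in the database,
# so unicodedata.decimal() falls back to the supplied default.
print([unicodedata.decimal(ch, None) for ch in sample if ch != "."])
```

Only characters with a decimal-digit value are accepted by int() and float(), which is consistent with the conclusion that the current behaviour is not broken.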
Re: [Python-Dev] Python and the Unicode Character Database
On Wed, 1 Dec 2010 22:28:49 -0500 Alexander Belopolsky alexander.belopol...@gmail.com wrote:
>> Both my personal observations when travelling from Turkey to India and Wikipedia say yes.
>> When representing a number in Arabic, the lowest-valued position is placed on the right, so the order of positions is the same as in left-to-right scripts.
>> https://secure.wikimedia.org/wikipedia/en/wiki/Arabic_language#Numerals
> This matches my limited research on this topic as well. However, I am not sure that when these codes are embedded in Arabic text, their logical order always matches their display order.

That shouldn't matter, since Unicode text follows logical order. The display order is up to the graphical representation library.

Regards

Antoine.
Re: [Python-Dev] Porting Ideas
On Wed, Dec 1, 2010 at 20:17, Antoine Pitrou solip...@pitrou.net wrote:
> And I'm not sure what this package called Python is (“a high-level object-oriented programming language”? like Java?), but I'm pretty sure I've heard there's a Python 3 compatible version.

Uhm... http://pypi.python.org/pypi/Python

Anybody wanna remove that, or update it or something? :-)
Re: [Python-Dev] AIX 5.3 - Enabling Shared Library Support Vs Extensions
Hi Sebastian,

Thanks for your response. I looked at http://bugs.python.org/issue941346 earlier. Stefan Krah referred me to it through another bug that I created for this issue at http://bugs.python.org/issue10555.

I confirm that my problem is solved with the Python 2.7.1 release, which contains your changes.

Great work by you and the other folks in enabling the shared library build on AIX. Hats off!

Regards,
Anurag

2010/12/2 Sébastien Sablé sa...@users.sourceforge.net
> Hi Anurag,
>
> On 25/11/2010 10:24, Anurag Chourasia wrote:
>> All,
>> When I configure python to enable shared libraries, none of the extensions are getting built during the make step due to this error.
>
> you may want to take a look at the following issue: http://bugs.python.org/issue941346
>
> Python compiled with shared libraries was broken on AIX until recently. There are some patches there to get it to work, or you may want to test the latest 2.7 or 3.x releases.
>
> regards
> --
> Sébastien Sablé
Re: [Python-Dev] ICU
On Wed, Dec 1, 2010 at 8:45 PM, Alexander Belopolsky alexander.belopol...@gmail.com wrote:
> On Tue, Nov 30, 2010 at 3:13 PM, Antoine Pitrou solip...@pitrou.net wrote:
> Oh, about ICU:
> Actually, I remember you saying that locale should ideally be replaced with a wrapper around the ICU library.
> By that, I stand - however, I have given up the hope that this will happen anytime soon.
> Perhaps this could be made a GSOC topic.
> Incidentally, this may also address another Python's Achilles' heel: the timezone support.
> http://icu-project.org/download/icutzu.html

I work with people who speak highly of ICU, so I want to encourage work in this area. At the same time, I'm skeptical -- IIRC, ICU is a large amount of C++ code. I don't know how easy it will be to integrate this into our build processes for various platforms, nor how Pythonic the resulting APIs will look to the experienced Python user. Still, those are not roadblocks, the benefits are potentially great, so it's definitely worth investigating!

--
--Guido van Rossum (python.org/~guido)
Re: [Python-Dev] [Python-checkins] r86817 - python/branches/py3k-stat-on-windows/Lib/test/test_shutil.py
On 2010/11/27 5:31, Brian Curtin wrote:
> On Fri, Nov 26, 2010 at 14:18, Hirokazu Yamamoto ocean-c...@m2.ccsnet.ne.jp wrote:
>> On 2010/11/27 5:02, Brian Curtin wrote:
>>> We briefly chatted about this on the os.link feature issue, but I never found a way around it.
>> How about implementing os.path.samefile in Modules/posixmodule.c like this?
>> http://bugs.python.org/file19262/py3k_fix_kill_python_for_short_path.patch
>> # I hope this works.
> That's almost identical to what the current os.path.sameopenfile is. Lib/ntpath.py opens both files, then compares them via _getfileinformation. That function is implemented to take in a file descriptor, call GetFileInformationByHandle with it, then returns a tuple of dwVolumeSerialNumber, nFileIndexHigh, and nFileIndexLow.

Yes. The difference is that a file object cannot represent a directory, and FILE_FLAG_BACKUP_SEMANTICS probably makes it faster to open the file.
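To make the two functions under discussion concrete, here is a usage sketch of os.path.samefile() and os.path.sameopenfile() as they exist in current Python 3. It says nothing about the Windows implementation details debated above; the file names and the helper same_file_demo are invented for the example.

```python
import os
import tempfile

def same_file_demo():
    """Compare paths and open descriptors that may refer to the same file."""
    with tempfile.TemporaryDirectory() as d:
        first = os.path.join(d, "a.txt")
        other = os.path.join(d, "b.txt")
        for path in (first, other):
            with open(path, "w") as f:
                f.write("x")

        alias = os.path.join(d, ".", "a.txt")    # different spelling, same file

        # samefile() takes paths, so it can also compare directories.
        by_path = os.path.samefile(first, alias)
        different = os.path.samefile(first, other)
        directories = os.path.samefile(d, os.path.join(d, "."))

        # sameopenfile() takes file descriptors of already opened files.
        with open(first) as f1, open(alias) as f2:
            by_descriptor = os.path.sameopenfile(f1.fileno(), f2.fileno())

        return by_path, different, directories, by_descriptor

print(same_file_demo())
```

The directory comparison is the case the reply points at: it works through the path-based function, whereas a regular file object cannot be opened on a directory.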
Re: [Python-Dev] ICU
On Dec 1, 2010, at 11:45 PM, Alexander Belopolsky wrote:
> On Tue, Nov 30, 2010 at 3:13 PM, Antoine Pitrou solip...@pitrou.net wrote:
> Oh, about ICU:
> Actually, I remember you saying that locale should ideally be replaced with a wrapper around the ICU library.
> By that, I stand - however, I have given up the hope that this will happen anytime soon.
> Perhaps this could be made a GSOC topic.
> Incidentally, this may also address another Python's Achilles' heel: the timezone support.
> http://icu-project.org/download/icutzu.html

Does ICU do anything regarding timezones that datetime + pytz doesn't already do? Wouldn't it make more sense to integrate the already-existing-and-pythonic pytz into Python than to make a new wrapper based on ICU?

James
Re: [Python-Dev] Python and the Unicode Character Database
On Thu, Dec 2, 2010 at 8:36 AM, Antoine Pitrou solip...@pitrou.net wrote:
> On Wed, 1 Dec 2010 22:28:49 -0500 Alexander Belopolsky alexander.belopol...@gmail.com wrote:
..
>> This matches my limited research on this topic as well. However, I am not sure that when these codes are embedded in Arabic text, their logical order always matches their display order.
> That shouldn't matter, since unicode text follows logical order. The display order is up to the graphical representation library.

I am not so sure. On my Mac, U+200F (RIGHT-TO-LEFT MARK) affects 0-9 and Arabic-Indic decimals differently:

>>> print('\u200F123')
123
>>> print('\u200F\u0661\u0662\u0663')
231

(I replaced Arabic-Indic decimals with 0-9 in the output to demonstrate the point. Cut-n-paste does not work well in the presence of RTL directives.)

And U+202E (RIGHT-TO-LEFT OVERRIDE) reverses the display order for both:

>>> print('\u202E123')
321
>>> print('\u202E\u0661\u0662\u0663')
321

(Again, the output display is simulated, not copied.)

I don't know if explicit RTL directives are ever used in Arabic texts, but it is quite possible that texts converted from older formats would use them for efficiency.

Note that my point is not to find the correct answer here, but to demonstrate that we as a group don't have the expertise to get parsing of Arabic text right. If we've got it right for Arabic, it is by chance and not by design. This still leaves us with 41 other types of digits for at least 30 different languages.

Nobody will ever assume that python builtins are suitable for use with all these variants. This feature is only good for nefarious purposes such as hiding extra digits in innocent-looking files or smuggling binary data through naive interfaces.

PS: BTW, shouldn't int('\u0661\u0662\u06DD') be valid? or is it int('\u06DD\u0661\u0662')?
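The difference between stored (logical) order and display order can be inspected without a terminal. The sketch below assumes Python 3 and uses only the unicodedata module; it does not reproduce any rendering, it only shows what int() sees and which bidirectional classes the characters carry.

```python
import unicodedata

arabic_indic = "\u0661\u0662\u0663"   # ARABIC-INDIC DIGIT ONE, TWO, THREE

# int() works on the logical order of the code points, whatever the
# display order chosen by a rendering engine.
print(int(arabic_indic))
print([unicodedata.digit(ch) for ch in arabic_indic])

# ASCII digits and Arabic-Indic digits belong to different bidirectional
# classes (European Number vs Arabic Number), which is one reason a
# renderer may lay them out differently after a RIGHT-TO-LEFT MARK.
print(unicodedata.bidirectional("1"))
print(unicodedata.bidirectional("\u0661"))
print(unicodedata.bidirectional("\u200f"), unicodedata.bidirectional("\u202e"))
```

The parsed value is the same whichever way a terminal chooses to draw the digits.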
Re: [Python-Dev] Python and the Unicode Character Database
On Thursday, 2 December 2010 at 11:41 -0500, Alexander Belopolsky wrote:
> Note that my point is not to find the correct answer here, but to demonstrate that we as a group don't have the expertise to get parsing of Arabic text right.

I don't understand why you think Arabic or Hebrew text is any different from Western text. Surely right-to-left isn't more conceptually complicated than left-to-right, is it?

The fact that mixed rtl + ltr can render bizarrely or is awkward to cut and paste is quite off-topic for our discussion.

> If we've got it right for Arabic, it is by chance and not by design. This still leaves us with 41 other types of digits for at least 30 different languages.

So why do you trust the Unicode standard on other things and not on this one?

Regards

Antoine.
Re: [Python-Dev] ICU
2010/12/2 Guido van Rossum gu...@python.org:
> On Wed, Dec 1, 2010 at 8:45 PM, Alexander Belopolsky alexander.belopol...@gmail.com wrote:
> On Tue, Nov 30, 2010 at 3:13 PM, Antoine Pitrou solip...@pitrou.net wrote:
> Oh, about ICU:
> Actually, I remember you saying that locale should ideally be replaced with a wrapper around the ICU library.
> By that, I stand - however, I have given up the hope that this will happen anytime soon.
> Perhaps this could be made a GSOC topic.
> Incidentally, this may also address another Python's Achilles' heel: the timezone support.
> http://icu-project.org/download/icutzu.html
>
> I work with people who speak highly of ICU, so I want to encourage work in this area. At the same time, I'm skeptical -- IIRC, ICU is a large amount of C++ code. I don't know how easy it will be to integrate this into our build processes for various platforms, nor how Pythonic the resulting APIs will look to the experienced Python user.

There's a nice C-API.

--
Regards,
Benjamin
Re: [Python-Dev] [Python-checkins] r86930 - in python/branches/py3k: Doc/library/os.rst Lib/os.py Lib/test/test_os.py Misc/ACKS Misc/NEWS
On 12/2/2010 4:32 AM, Nick Coghlan wrote:
> On Thu, Dec 2, 2010 at 5:05 PM, terry.reedy python-check...@python.org wrote:

(except I did not write most of the patch)

>> + If
>> + the target directory with the same mode as we specified already exists,
>> + raises an :exc:`OSError` exception if *exist_ok* is False, otherwise no
>> + exception is raised. If the directory cannot be created in other cases,
>> + raises an :exc:`OSError` exception.
>
> I would suggest being explicit here that the case where the directory exists, but has a mode other than the one requested, always triggers an exception. Perhaps something like the following:
>
> Raises an :exc:`OSError` exception if the target directory already exists, unless *exist_ok* is True and the existing directory has the same mode as is specified in the current call. Also raises an :exc:`OSError` exception if the directory cannot be created for any other reason.

Georg has already patched that paragraph. I will let him decide if any further change is needed.

Terry
Re: [Python-Dev] ICU
At 07:47 AM 12/2/2010 -0800, Guido van Rossum wrote:
> On Wed, Dec 1, 2010 at 8:45 PM, Alexander Belopolsky alexander.belopol...@gmail.com wrote:
> On Tue, Nov 30, 2010 at 3:13 PM, Antoine Pitrou solip...@pitrou.net wrote:
> Oh, about ICU:
> Actually, I remember you saying that locale should ideally be replaced with a wrapper around the ICU library.
> By that, I stand - however, I have given up the hope that this will happen anytime soon.
> Perhaps this could be made a GSOC topic.
> Incidentally, this may also address another Python's Achilles' heel: the timezone support.
> http://icu-project.org/download/icutzu.html
>
> I work with people who speak highly of ICU, so I want to encourage work in this area. At the same time, I'm skeptical -- IIRC, ICU is a large amount of C++ code. I don't know how easy it will be to integrate this into our build processes for various platforms, nor how Pythonic the resulting APIs will look to the experienced Python user. Still, those are not roadblocks, the benefits are potentially great, so it's definitely worth investigating!

FWIW, OSAF did a wrapping for Chandler, though I personally haven't used it: http://pyicu.osafoundation.org/

The README explains the mapping from the ICU APIs to Python ones, including iteration, string conversion, and timezone mapping for use with the datetime type.

> --
> --Guido van Rossum (python.org/~guido)
Re: [Python-Dev] Porting Ideas
On 12/2/2010 8:36 AM, Lennart Regebro wrote:
> On Wed, Dec 1, 2010 at 20:17, Antoine Pitrou solip...@pitrou.net wrote:
>> And I'm not sure what this package called Python is (“a high-level object-oriented programming language”? like Java?), but I'm pretty sure I've heard there's a Python 3 compatible version.
> Uhm... http://pypi.python.org/pypi/Python
> Anybody wanna remove that, or update it or something? :-)

Entry is for Python 2.5.
# Package Index Owner: guido, anthony, barry

--
Terry Jan Reedy
Re: [Python-Dev] Python and the Unicode Character Database
On Thu, Dec 2, 2010 at 11:56 AM, Antoine Pitrou solip...@pitrou.net wrote:
> On Thursday, 2 December 2010 at 11:41 -0500, Alexander Belopolsky wrote:
>> Note that my point is not to find the correct answer here, but to demonstrate that we as a group don't have the expertise to get parsing of Arabic text right.
> I don't understand why you think Arabic or Hebrew text is any different from Western text. Surely right-to-left isn't more conceptually complicated than left-to-right, is it?

No, but a mix of LTR and RTL is certainly more difficult than either of the two. I invite you to digest Unicode Standard Annex #9 before we continue this discussion. See http://unicode.org/reports/tr9/.

> The fact that mixed rtl + ltr can render bizarrely or is awkward to cut and paste is quite off-topic for our discussion.

No, it is not. One of the invented use cases in this thread was naive users' desire to enter numbers using their preferred local decimals. The same users may want to be able to cut and paste their decimals as well. More importantly, however, legacy formats may not have support for mixed-direction text and may require that "John is 41" be stored as "41 si nhoJ", and a Unicode converter would turn it into "[RTL]John is 14", which will still display as "41 si nhoJ", but int(s[-2:]) will return 14, not 41.

>> If we've got it right for Arabic, it is by chance and not by design. This still leaves us with 41 other types of digits for at least 30 different languages.
> So why do you trust the Unicode standard on other things and not on this one?

What other things? As far as I understand, the only str method that was designed to comply with Unicode recommendations was str.isidentifier(). And we have some really bizarre results:

>>> '\u2164'.isidentifier()
True
>>> '\u2164'.isalpha()
False

And can you describe the difference between str.isdigit() and str.isdecimal()? According to the reference manual:

str.isdecimal()
    Return true if all characters in the string are decimal characters and there is at least one character, false otherwise. Decimal characters include digit characters, and all characters that can be used to form decimal-radix numbers, e.g. U+0660, ARABIC-INDIC DIGIT ZERO.

str.isdigit()
    Return true if all characters in the string are digits and there is at least one character, false otherwise.

http://docs.python.org/dev/library/stdtypes.html#str.isdecimal

Since U+0660 is mentioned in the first definition and not in the second, I may conclude that it is not a digit, but

>>> '\u0660'.isdigit()
True

If you know the correct answer, please contribute it here: http://bugs.python.org/issue10587.
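The three predicates can be compared side by side with a short sketch, assuming Python 3. The sample characters are chosen for this illustration (SUPERSCRIPT TWO is not mentioned in the thread), and the helper name classify is made up.

```python
def classify(ch):
    """Return the results of the three numeric predicates for one character."""
    return (ch.isdecimal(), ch.isdigit(), ch.isnumeric())

samples = {
    "ARABIC-INDIC DIGIT ZERO": "\u0660",   # a decimal digit
    "SUPERSCRIPT TWO": "\u00b2",           # a digit, but not a decimal digit
    "ROMAN NUMERAL FIVE": "\u2164",        # numeric only
}

for name, ch in samples.items():
    print(name, classify(ch))

# The identifier/alpha mismatch quoted in the message:
print("\u2164".isidentifier(), "\u2164".isalpha())
```

In these samples each predicate accepts everything the previous one accepts: decimal characters are digits, and digits are numeric.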
Re: [Python-Dev] Python and the Unicode Character Database
On Thursday, 2 December 2010 at 13:14 -0500, Alexander Belopolsky wrote:
>> I don't understand why you think Arabic or Hebrew text is any different from Western text. Surely right-to-left isn't more conceptually complicated than left-to-right, is it?
> No, but a mix of LTR and RTL is certainly more difficult than either of the two. I invite you to digest Unicode Standard Annex #9 before we continue this discussion. See http://unicode.org/reports/tr9/.

“This annex describes specifications for the *positioning* of characters flowing from right to left” (emphasis mine)

Looks like something for implementors of rendering engines, which python-dev is not AFAICT.

> Same users may want to be able to cut and paste their decimals as well. More importantly, however, legacy formats may not have support for mixed-direction text and may require that "John is 41" be stored as "41 si nhoJ", and a Unicode converter would turn it into "[RTL]John is 14", which will still display as "41 si nhoJ", but int(s[-2:]) will return 14, not 41.

The legacy format argument looks like a red herring to me. When converting from one format to another, it is the programmer's job to do his/her job right.

>>> If we've got it right for Arabic, it is by chance and not by design. This still leaves us with 41 other types of digits for at least 30 different languages.
>> So why do you trust the Unicode standard on other things and not on this one?
> What other things?

Everything which the Unicode database stores and that we already rely on.

> As far as I understand the only str method that was designed to comply with Unicode recommendations was str.isidentifier().

I don't think so. str.split() and str.splitlines() are also defined in conformance to the SPEC, AFAIK. They certainly try to. And, outside of str itself, the re module tries to follow Unicode categories as well (for example, \d should match non-ASCII digits).

Regards

Antoine.
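The behaviours mentioned in the last paragraph can be checked with a few lines, assuming Python 3 where str patterns in the re module are Unicode-aware by default. The sample strings are invented for the illustration.

```python
import re

arabic_indic = "\u0661\u0662\u0663"

# \d matches any Unicode decimal digit unless re.ASCII is requested.
unicode_match = re.fullmatch(r"\d+", arabic_indic) is not None
ascii_match = re.fullmatch(r"\d+", arabic_indic, re.ASCII) is not None
print(unicode_match, ascii_match)

# str.split() without arguments splits on Unicode whitespace (here EM SPACE),
# and str.splitlines() honours U+2028 LINE SEPARATOR.
print("a\u2003b".split())
print("a\u2028b".splitlines())
```

The re.ASCII flag is the opt-out for code that only wants to accept 0-9.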
Re: [Python-Dev] Porting Ideas
On Dec 02, 2010, at 12:59 PM, Terry Reedy wrote:
> On 12/2/2010 8:36 AM, Lennart Regebro wrote:
>> On Wed, Dec 1, 2010 at 20:17, Antoine Pitrou solip...@pitrou.net wrote:
>>> And I'm not sure what this package called Python is (“a high-level object-oriented programming language”? like Java?), but I'm pretty sure I've heard there's a Python 3 compatible version.
>> Uhm... http://pypi.python.org/pypi/Python
>> Anybody wanna remove that, or update it or something? :-)
> Entry is for Python 2.5.
> # Package Index Owner: guido, anthony, barry

Well, I definitely can't remember ever seeing that before. Of course, that doesn't mean I haven't. ;)

-Barry

Aside: how does one log into the Cheeseshop with your Launchpad OpenID? When I try to do it I end up on a Manual user registration page. I fill out the username with what I think my PyPI user name is, and add my python.org email address, but then it tells me 'barry' is already taken. Do I need some kind of back door linking of my lp openid and my pypi user id?
Re: [Python-Dev] Porting Ideas
On 2010-12-01, at 11:02 AM, Brian Curtin wrote:
> http://onpython3yet.com/ might be helpful to you. It orders the projects on PyPI with the most dependencies which are not yet ported to 3.x.
> Note that there are a number of false positives, e.g., the first result -- NumPy, since people don't seem to keep their classifiers up-to-date.

Also note that the dependency information is incomplete. For instance, onpython3yet.com shows just 14 packages depending on Twisted, http://onpython3yet.com/packages/show/Twisted while, in reality, there are 68 of them, http://code.activestate.com/pypm/twisted/#requiredby (see the right sidebar)

-srid
Re: [Python-Dev] Python and the Unicode Character Database
On 02.12.2010 03:01, Ben Finney wrote:
> Stephen J. Turnbull step...@xemacs.org writes:
>> Furthermore, he provided good *objective* reason (excessive cost, to which I can also testify, in several different input methods for Japanese) why numbers simply would not be input that way. What's left is copy/paste via the mouse.
> For direct entry by an interactive user, yes. Why are some people in this discussion thinking only of direct entry by an interactive user?

Ultimately, somebody will have entered the data.

> Input from an existing text file, as I said earlier.

Which *specific* existing text file? Have you actually *seen* such a text file?

> Direct entry at the console is a red herring.

And we don't need powerhouses because power comes out of the socket.

Regards,
Martin
Re: [Python-Dev] Python and the Unicode Character Database
> Maybe all past, present and future whatsnew maintainers can agree on these rules, which I copied directly from whatsnew/3.2.rst?

I don't think all past maintainers can (I'm pretty certain that AMK would disagree), but if that's the current policy, I can certainly try following it (I didn't know it existed because I never look at the file).

Regards,
Martin
Re: [Python-Dev] Porting Ideas
> Aside: how does one log into the Cheeseshop with your Launchpad OpenID? When I try to do it I end up on a Manual user registration page. I fill out the username with what I think my PyPI user name is, and add my python.org email address, but then it tells me 'barry' is already taken. Do I need some kind of back door linking of my lp openid and my pypi user id?

Since the barry account already exists, you first need to log into that (likely using a password). You can then claim the LP OpenID as being associated with that account, and use LP in the future.

Regards,
Martin
Re: [Python-Dev] Porting Ideas
On Dec 02, 2010, at 08:44 PM, Martin v. Löwis wrote:
> Since the barry account already exists, you first need to log into that (likely using a password). You can then claim the LP OpenID as being associated with that account, and use LP in the future.

Thanks Martin.

-Barry
[Python-Dev] PEP 384 accepted
Hi,

Since discussion has trailed off without any blocking objections, I'm accepting PEP 384. Martin, you may mark the PEP accepted and proceed with merging the implementation for the beta on Saturday.

--
Regards,
Benjamin
Re: [Python-Dev] Python and the Unicode Character Database
Martin v. Löwis wrote: Now, one may wonder what precisely a possibly signed floating point number is, but most likely, this refers to floatnumber ::= pointfloat | exponentfloat pointfloat::= [intpart] fraction | intpart . exponentfloat ::= (intpart | pointfloat) exponent intpart ::= digit+ fraction ::= . digit+ exponent ::= (e | E) [+ | -] digit+ digit ::= 0...9 I don't see why the language spec should limit the wealth of number formats supported by float(). If it doesn't, there should be some other specification of what is correct and what is not. It must not be unspecified. True. It is not uncommon for Asians and other non-Latin script users to use their own native script symbols for numbers. Just because these digits may look strange to someone doesn't mean that they are meaningless or should be discarded. Then these users should speak up and indicate their need, or somebody should speak up and confirm that there are users who actually want '١٢٣٤.٥٦' to denote 1234.56. To my knowledge, there is no writing system in which '١٢٣٤.٥٦e4' means 12345600.0. I'm not sure what you're after here. Please also remember that Python3 now allows Unicode names for identifiers for much the same reasons. No no no. Addition of Unicode identifiers has a well-designed, deliberate specification, with a PEP and all. The support for non-ASCII digits in float appears to be ad-hoc, and not founded on actual needs of actual users. Please note that we didn't have PEPs and the PEP process at the time. The Unicode proposal predates and in some respects inspired the PEP process. The decision to add this support was deliberate based on the desire to support as much of the nice features of Unicode in Python as we could. At least that was what was driving me at the time. Regarding actual needs of actual users: I don't buy that as an argument when it comes to supporting a standard that is meant to attract users with non-ASCII origins. 
Some references you may want to read up on: http://en.wikipedia.org/wiki/Numbers_in_Chinese_culture http://en.wikipedia.org/wiki/Vietnamese_numerals http://en.wikipedia.org/wiki/Korean_numerals http://en.wikipedia.org/wiki/Japanese_numerals Even MS Office supports them: http://languages.siuc.edu/Chinese/Language_Settings.html Note that the support in float() (and the other numeric constructors) to work with Unicode code points was explicitly added when Unicode support was added to Python and has been available since Python 1.6. That doesn't necessarily make it useful. Alexander's complaint is that it makes Python unstable (i.e. changing as the UCD changes). If that were true, then all Unicode database (UCD) changes would make Python unstable. However, most changes to existing code points in the UCS are bug fixes, so they actually have a stabilizing quality more than a destabilizing one. It is not a bug by any definition of bug Most certainly it is: the documentation is either underspecified, or deviates from the implementation (when taking the most plausible interpretation). This is the very definition of bug. The implementation is not a bug and neither was this a bug in the 2.x series of the Python documentation. The Python 3.x docs apparently introduced a reference to the language spec which is clearly not capturing the wealth of possible inputs. So, yes, we're talking about a documentation bug, but not an implementation bug. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Nov 29 2010) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ ::: Try our new mxODBC.Connect Python Database Interface for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. 
Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/
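To make the disagreement above concrete, here is a minimal sketch that transcribes the quoted floatnumber grammar into a regular expression and compares it with what float() accepts. The names FLOATNUMBER and matches_literal_grammar are made up for this illustration, the optional leading sign is my reading of "possibly signed", and the float() results reflect current CPython 3 behaviour rather than anything guaranteed by the thread.

```python
import re

# Transcription of the grammar quoted above, with ASCII-only digits.
DIGIT = r"[0-9]"
INTPART = rf"{DIGIT}+"
FRACTION = rf"\.{DIGIT}+"
EXPONENT = rf"[eE][+-]?{DIGIT}+"
POINTFLOAT = rf"(?:(?:{INTPART})?{FRACTION}|{INTPART}\.)"
EXPONENTFLOAT = rf"(?:{INTPART}|{POINTFLOAT}){EXPONENT}"
# "Possibly signed" is modelled here as an optional leading + or -.
FLOATNUMBER = re.compile(rf"[+-]?(?:{EXPONENTFLOAT}|{POINTFLOAT})")


def matches_literal_grammar(text):
    """Return True if the whole string matches the quoted grammar."""
    return FLOATNUMBER.fullmatch(text) is not None


# Arabic-Indic digits 1234, ASCII full stop, Arabic-Indic digits 56.
ARABIC_INDIC = "\u0661\u0662\u0663\u0664.\u0665\u0666"

print(matches_literal_grammar("1234.56"))     # ASCII input matches the grammar
print(matches_literal_grammar(ARABIC_INDIC))  # non-ASCII digits do not
print(float(ARABIC_INDIC))                    # yet float() still parses them
```

The point of the sketch is only that the grammar and the constructor accept different sets of strings, which is the documentation-versus-implementation gap both posters describe.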
Re: [Python-Dev] Python and the Unicode Character Database
Am 02.12.2010 20:40, schrieb Martin v. Löwis: Maybe all past, present and future whatsnew maintainers can agree on these rules, which I copied directly from whatsnew/3.2.rst? I don't think all past maintainers can Yes, and the same goes for the future ones, since they may not even know yet that they will be whatsnew maintainers. Or maybe they aren't born yet (let's hope for a long life of Python 3...). (I'm pretty certain that AMK would disagree), but if that's the current policy, I can certainly try following it (I didn't know it exists because I never look at the file). The large chunk of rules appeared in 2.6, where AMK still was maintainer. But even in the whatsnew for 2.4, there is this: .. Don't write extensive text for new sections; I'll do that. .. Feel free to add commented-out reminders of things that need .. to be covered. --amk But in any case, they are certainly valid for the current whatsnew -- even if Raymond likes to grumble about too expansive commits :) Georg ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Porting Ideas
Am 02.12.2010 20:06, schrieb Barry Warsaw: On Dec 02, 2010, at 12:59 PM, Terry Reedy wrote: On 12/2/2010 8:36 AM, Lennart Regebro wrote: On Wed, Dec 1, 2010 at 20:17, Antoine Pitrousolip...@pitrou.net wrote: And I'm not sure what this package called Python is (“a high-level object-oriented programming language”? like Java?), but I'm pretty sure I've heard there's a Python 3 compatible version. Uhm... http://pypi.python.org/pypi/Python Anybody wanna remove that, or update it or something? :-) Entry is for Python 2.5. # Package Index Owner: guido, anthony, barry Well, I definitely can't remember ever seeing that before. Of course, that doesn't mean I haven't. ;) No idea what that entry is about. * Development Status :: 3 - Alpha * Development Status :: 6 - Mature Aha. Let's just delete it. Aside: how does one log into the Cheeseshop with your Launchpad OpenID? When I try to do it I end up on a Manual user registration page. I fill out the username with what I think my PyPI user name is, and add my python.org email address, but then it tells me 'barry' is already taken. Do I need some kind of back door linking of my lp openid and my pypi user id? In addition to what Martin said, the Claim OpenID form is on the Your Details page. Georg ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Python and the Unicode Character Database
Then these users should speak up and indicate their need, or somebody should speak up and confirm that there are users who actually want '١٢٣٤.٥٦' to denote 1234.56. To my knowledge, there is no writing system in which '١٢٣٤.٥٦e4' means 12345600.0. I'm not sure what you're after here. That the current float() constructor accepts tons of bogus character strings and accepts them as numbers, and that it should stop doing so. The decision to add this support was deliberate based on the desire to support as much of the nice features of Unicode in Python as we could. At least that was what was driving me at the time. At the time, this may have been the right thing to do. With the experience gained, we should now conclude to revert this particular aspect. Some references you may want to read up on: http://en.wikipedia.org/wiki/Numbers_in_Chinese_culture http://en.wikipedia.org/wiki/Vietnamese_numerals http://en.wikipedia.org/wiki/Korean_numerals http://en.wikipedia.org/wiki/Japanese_numerals I don't question that people use non-ASCII characters to denote numbers. I claim that the specific support in Python for that has no connection to reality. I further claim that the use of non-ASCII numbers is a local convention, and that if you provide a library to parse numbers, users (of that library) will somehow have to specify which notational convention(s) is reasonable for the input they have. Even MS Office supports them: http://languages.siuc.edu/Chinese/Language_Settings.html That's printing, though, not parsing. Notice that Python does *not* currently support printing numbers in other scripts - even though this may actually be more useful than parsing. Note that the support in float() (and the other numeric constructors) to work with Unicode code points was explicitly added when Unicode support was added to Python and has been available since Python 1.6. That doesn't necessarily make it useful. Alexander's complaint is that it makes Python unstable (i.e. 
changing as the UCD changes). If that were true, then all Unicode database (UCD) changes would make Python unstable. That's indeed the case - they do (see the recent bug report on white space processing). However, any change makes Python unstable (in the sense that it can potentially break existing applications), and, in many cases, the risk of breaking something is well worth it. In the case of number parsing, I think Python would be better if float() rejected non-ASCII strings, and any support for such parsing should be redone correctly in a different place (preferably along with printing of numbers). Most certainly it is: the documentation is either underspecified, or deviates from the implementation (when taking the most plausible interpretation). This is the very definition of bug. The implementation is not a bug and neither was this a bug in the 2.x series of the Python documentation. Of course the 2.x documentation is wrong, in that it is severely underspecified, and the most straight-forward interpretation of the specific wording gives an incorrect impression of the implementation. The Python 3.x docs apparently introduced a reference to the language spec which is clearly not capturing the wealth of possible inputs. Right - but only because the 2.x documentation *already* suggested that the supported syntax matches the literal syntax - as that's the most natural thing to assume. Regards, Martin ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 384 accepted
Since discussion has trailed off without any blocking objections, I'm accepting PEP 384. Martin, you may mark the PEP accepted and proceed with merging the implementation for the beta on Saturday. Thanks! will do (I'll also take into consideration the proposed changes). Regards, Martin
Re: [Python-Dev] Porting Ideas
On Thu, Dec 2, 2010 at 20:24, Sridhar Ratnakumar sridh...@activestate.com wrote: Also note that the dependency information is incomplete. Also, a python3 version of chardet is available (from the website only, looks like). Cheers, Dirkjan
Re: [Python-Dev] PEP 384 accepted
On Thu, Dec 2, 2010 at 9:24 PM, Martin v. Löwis mar...@v.loewis.de wrote: Since discussion has trailed off without any blocking objections, I'm accepting PEP 384. Martin, you may mark the PEP accepted and proceed with merging the implementation for the beta on Saturday. Thanks! will do (I'll also take into consideration the proposed changes). I did not get an answer to my last mail about distutils / distutils2. Regards, Martin -- Tarek Ziadé | http://ziade.org
Re: [Python-Dev] PEP 384 accepted
On 02.12.2010 21:48, Tarek Ziadé wrote: On Thu, Dec 2, 2010 at 9:24 PM, Martin v. Löwis mar...@v.loewis.de wrote: Since discussion has trailed off without any blocking objections, I'm accepting PEP 384. Martin, you may mark the PEP accepted and proceed with merging the implementation for the beta on Saturday. Thanks! will do (I'll also take into consideration the proposed changes). I did not get an answer to my last mail about distutils / distutils2 What was the question again, and whom did you want an answer from? Regards, Martin
Re: [Python-Dev] PEP 384 accepted
2010/12/2 Martin v. Löwis mar...@v.loewis.de: On 02.12.2010 21:48, Tarek Ziadé wrote: On Thu, Dec 2, 2010 at 9:24 PM, Martin v. Löwis mar...@v.loewis.de wrote: Since discussion has trailed off without any blocking objections, I'm accepting PEP 384. Martin, you may mark the PEP accepted and proceed with merging the implementation for the beta on Saturday. Thanks! will do (I'll also take into consideration the proposed changes). I did not get an answer to my last mail about distutils / distutils2 What was the question again, and whom did you want an answer from? You can read it in the archives here: http://mail.python.org/pipermail/python-dev/2010-November/106138.html tl;dr: The question was "Why not implement this in Distutils2?" Your answer was "No, PEP 3149 was accepted, I will do this in Distutils1". My answer was "Having an accepted PEP does not imply your code lands in the stdlib (like PEP 376 and 345)". So the question still stands: why not implement this in Distutils2? Regards Tarek Regards, Martin -- Tarek Ziadé | http://ziade.org
Re: [Python-Dev] Python and the Unicode Character Database
Martin v. Löwis wrote: [...] For direct entry by an interactive user, yes. Why are some people in this discussion thinking only of direct entry by an interactive user? Ultimately, somebody will have entered the data. I don't think you really believe that all data processed by a computer was eventually manually entered by a someone :-) I already gave you a couple of examples of how such data can end up being input for Python number constructors. If you are still curious, please see the Wikipedia pages I linked to, or have a look at these keyboards: http://en.wikipedia.org/wiki/File:KB_Arabic_MAC.svg http://en.wikipedia.org/wiki/File:Keyboard_Layout_Sanskrit.png http://en.wikipedia.org/wiki/File:800px-KB_Thai_Kedmanee.png http://en.wikipedia.org/wiki/File:Tibetan_Keyboard.png http://en.wikipedia.org/wiki/File:KBD-DZ-noshift-2009.png (all referenced on http://en.wikipedia.org/wiki/Keyboard_layout) and then compare these to: http://www.unicode.org/Public/5.2.0/ucd/extracted/DerivedNumericType.txt Arabic numerals are being used a lot nowadays in Asian countries, but that doesn't mean that the native script versions are not being used anymore. Furthermore, data can well originate from texts that were written hundreds or even thousands of years ago, so there is plenty of material available for processing. Even if not entered directly, there are plenty of ways to convert Arabic numerals (or other numeral systems) to the above forms, e.g. in MS Office for Thai: http://office.microsoft.com/en-us/excel-help/convert-arabic-numbers-to-thai-text-format-HP003074364.aspx Anyway, as mentioned before: all this is really besides the point: If we want to support Unicode in Python, we have to also support conversion of numerals declared in Unicode into a form that can be processed by Python. Regardless of where such data originates. 
If we were not to follow this approach, we could just as well decide not to support reading Egyptian Hieroglyphs based on the argument that there's no keyboard to enter them... http://www.unicode.org/charts/PDF/U13000.pdf :-) (from http://www.unicode.org/charts/) Input from an existing text file, as I said earlier. Which *specific* existing text file? Have you actually *seen* such a text file? Have you tried Google? http://www.google.com/search?q=١٢٣ http://www.google.com/search?q=٣+site%3Agov.lb Some examples: http://www.bdl.gov.lb/circ/intpdf/int123.pdf http://www.cdr.gov.lb/study/sdatl/Arabic/Chapter3.PDF http://www.batroun.gov.lb/PDF/Waredat2006.pdf (these all use http://en.wikipedia.org/wiki/Eastern_Arabic_numerals) Direct entry at the console is a red herring. And we don't need power stations because power comes out of the socket. Martin, the argument simply doesn't fit well with the discussion about Python and Unicode. We introduced Unicode in Python not because there was a need for each and every code point in Unicode, but because we wanted to adopt a standard which doesn't prefer any one way of writing things over another. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Dec 02 2010) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ ::: Try our new mxODBC.Connect Python Database Interface for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/
Re: [Python-Dev] PEP 384 accepted
So the question still stands: Why not implementing this in Distutils2 ? Because it then wouldn't be available in Python 3.2, which is the target release of the PEP. If that really causes too much pain, I'll refrain from making any changes to distutils; PEP 384 doesn't specify any changes, anyway. Regards, Martin
Re: [Python-Dev] Python and the Unicode Character Database
Arabic numerals are being used a lot nowadays in Asian countries, but that doesn't mean that the native script versions are not being used anymore. I never claimed that people are not using their local scripts to enter numbers. However, none of your examples is about Chinese numerals using an ASCII full stop as a decimal point. The only thing I claimed about usage (actually only repeating haiyang kang's earlier claim) is that nobody would enter Chinese numerals with a keyboard and then use full stop as the decimal separator. So all your counter-examples just don't apply - I don't deny them. Regards, Martin
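As a side note on the Chinese numerals mentioned here, the following sketch shows how current CPython 3 treats three ways of writing "five". The helper name describe is made up for this illustration. As far as I can tell, the numeric constructors only honour characters in the Unicode category Nd (decimal digit), so a CJK ideograph carries a numeric value in the database but is still rejected by int().

```python
import unicodedata


def describe(ch):
    """Return (general category, numeric value, int() result or None)."""
    try:
        parsed = int(ch)
    except ValueError:
        parsed = None  # the constructor rejected this character
    return unicodedata.category(ch), unicodedata.numeric(ch, None), parsed


samples = {
    "ASCII digit five": "5",
    "Arabic-Indic digit five": "\u0665",
    "CJK ideograph five": "\u4e94",
}

for label, ch in samples.items():
    print(label, describe(ch))
```

So the question of a full stop after Chinese numerals does not arise for float() itself; the ideographs are not accepted as digits in the first place.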
Re: [Python-Dev] Python and the Unicode Character Database
Martin v. Löwis wrote: Then these users should speak up and indicate their need, or somebody should speak up and confirm that there are users who actually want '١٢٣٤.٥٦' to denote 1234.56. To my knowledge, there is no writing system in which '١٢٣٤.٥٦e4' means 12345600.0. I'm not sure what you're after here. That the current float() constructor accepts tons of bogus character strings and accepts them as numbers, and that it should stop doing so. What bogus characters do the float() and int() constructors accept? As far as I can see, they only accepts numerals. [...] Notice that Python does *not* currently support printing numbers in other scripts - even though this may actually be more useful than parsing. Lack of one function, even if more useful, does not imply that an existing function should be removed. [...] In the case of number parsing, I think Python would be better if float() rejected non-ASCII strings, and any support for such parsing should be redone correctly in a different place (preferably along with printing of numbers). So your problems with the current behaviour are: (1) in some unspecified way, it's not done correctly; (2) it belongs somewhere other than float() and int(). That second is awfully close to bike-shedding. Since you accept that Python *should* have the current behaviour, and Python *already* has the current behaviour, it seems strange that you are kicking up such a fuss merely to *move* the implementation of that behaviour out of the numeric constructors into some unspecified different place. I think it would be constructive to explain: - how the current behaviour is incorrect; - your suggestions for correcting it; and - a concrete suggestion for where you would like to see the behaviour moved to, and why that would be better than where it currently is. 
-- Steven
Re: [Python-Dev] PEP 384 accepted
2010/12/2 Martin v. Löwis mar...@v.loewis.de: So the question still stands: Why not implementing this in Distutils2 ? Because it then wouldn't be available in Python 3.2, which is the target release of the PEP. The exact feature I am mentioning is the ability to compile extensions with new options, so I am not sure which PEP is involved since distutils changes refer to PEP 384 in the other PEP. I was told not to touch to Distutils code to avoid any regression since it's patched to the bones in third party products. So we decided to freeze distutils and add all new features in Distutils2, which is at alpha stage now. So this move seems contradictory to me. Grouping all new features in the new version and keep Distutils1 in maintenance mode seems to make more sense to me, if we want to make Distutils die and push forward Distutils2 for its new features etc. Or we might get back into backward hell again :) So, I am +1 on a patch on distutils2 and -1 on de-freezing Distutils for any new feature. If that really causes too much pain, I'll refrain from making any changes to distutils; PEP 384 doesn't specify any changes, anyway. That would be awesome, and we can work on a patch for distutils2 to provide that abi option. Regards, Martin -- Tarek Ziadé | http://ziade.org ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Python and the Unicode Character Database
On Thu, Dec 2, 2010 at 1:55 PM, Antoine Pitrou solip...@pitrou.net wrote: .. I don't think so. str.split() and str.splitlines() are also defined in conformance to the SPEC, AFAIK. They certainly try to. You are joking, right? Where exactly does Unicode specify something like this: ''.join('̀́̂'.split('\udf00\ud800')) '́̂' ? OK, splitting on a given separator has very little to do with Unicode or UCD, but str.splitlines() makes absolutely no attempt to conform to Unicode Standard Annex #14 (Unicode line breaking algorithm). Wait, UAX #14 is actually relevant to textwrap module which saw very little change since 2.x days. So, what exactly does str.splitlines() do? And which part of the Unicode standard defines how it is different from str.split(.., '\n')? Reference manual does not help me here either: str.splitlines([keepends]) Return a list of the lines in the string, breaking at line boundaries. Line breaks are not included in the resulting list unless keepends is given and true. http://docs.python.org/dev/library/stdtypes.html#str.splitlines ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
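For readers wondering the same thing about str.splitlines(), here is a small sketch of how it differs from splitting on '\n'. It reflects the behaviour of current CPython 3, where the documentation now lists the recognised line boundaries; I am not claiming this matches the interpreter or the documentation at the time of this thread. The sample string is made up for the illustration.

```python
# Line boundaries used below: "\n", "\r\n", U+2028 (LINE SEPARATOR)
# and U+0085 (NEXT LINE). Only the first contains a bare "\n" split point
# that str.split("\n") handles the same way.
text = "one\ntwo\r\nthree\u2028four\x85five"

by_splitlines = text.splitlines()
by_newline = text.split("\n")

print(by_splitlines)  # five separate items
print(by_newline)     # three items, with "\r" and the other breaks left inside
```

In other words, splitlines() works from a fixed set of line-boundary characters and does not implement the UAX #14 line breaking algorithm, which is about where a line may be wrapped rather than where it must end.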
Re: [Python-Dev] Python and the Unicode Character Database
On Thursday, 02 December 2010 at 16:34 -0500, Alexander Belopolsky wrote: On Thu, Dec 2, 2010 at 1:55 PM, Antoine Pitrou solip...@pitrou.net wrote: .. I don't think so. str.split() and str.splitlines() are also defined in conformance to the SPEC, AFAIK. They certainly try to. You are joking, right? Perhaps you could look at the implementation.
Re: [Python-Dev] PEP 384 accepted
I was told not to touch to Distutils code to avoid any regression since it's patched to the bones in third party products. So we decided to freeze distutils and add all new features in Distutils2, which is at alpha stage now. So this move seems contradictory to me. I think it was a bad decision to freeze distutils, and we certainly didn't make that (not any "we" that includes me, that is). This freeze made the situation worse. IIRC, it was really the incompatible changes that made people ask you to stop changing distutils. Regards, Martin
Re: [Python-Dev] Python and the Unicode Character Database
Am 02.12.2010 22:30, schrieb Steven D'Aprano: Martin v. Löwis wrote: Then these users should speak up and indicate their need, or somebody should speak up and confirm that there are users who actually want '١٢٣٤.٥٦' to denote 1234.56. To my knowledge, there is no writing system in which '١٢٣٤.٥٦e4' means 12345600.0. I'm not sure what you're after here. That the current float() constructor accepts tons of bogus character strings and accepts them as numbers, and that it should stop doing so. What bogus characters do the float() and int() constructors accept? As far as I can see, they only accepts numerals. Not bogus characters, but bogus character strings. E.g. strings that mix digits from different scripts, and mix them with the Python decimal separator. Notice that Python does *not* currently support printing numbers in other scripts - even though this may actually be more useful than parsing. Lack of one function, even if more useful, does not imply that an existing function should be removed. No. But if the specific function(ality) is not useful and underspecified, it should be removed. So your problems with the current behaviour are: (1) in some unspecified way, it's not done correctly; No. My main concern is that it is not properly specified. If it was specified, I could then tell you what precisely is wrong about it. Right now, I can only give examples for input that it should not accept, and examples of input that it should, but does not accept. (2) it belongs somewhere other than float() and int(). That's only because it also needs a parameter to specify what syntax to follow, somehow. That parameter could be explicit or implicit, and it could be to float or to some other function. But it must be available, and is not. That second is awfully close to bike-shedding. Since you accept that Python *should* have the current behaviour No, I don't. I think it behaves incorrectly, accepting garbage input and guessing some meaning out of it. 
- how the current behaviour is incorrect; See above: it accepts strings that do not denote real numbers in any writing system, and, despite the claim that the feature is there to support other writing systems, actually does not truly support other writing systems. - your suggestions for correcting it; and Make the current implementation exactly match the current documentation. I think the documentation is correct; the implementation is wrong. - a concrete suggestion for where you would like to see the behaviour moved to, and why that would be better than where it currently is. The current behavior should go nowhere; it is not useful. Something very similar to the current behavior (but done correctly) should go into the locale module. Regards, Martin ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
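The "bogus character strings" complaint can be illustrated with a short sketch. The string below mixes digits from three scripts around an ASCII full stop, and current CPython 3 parses it without complaint. The helper strict_float is purely hypothetical: it is one possible reading of "float() should reject non-ASCII strings", not a proposal from the thread, and it still accepts everything else float() accepts (surrounding whitespace, "inf", "nan", and so on).

```python
# ASCII 1, Arabic-Indic 2, Devanagari 3, ASCII full stop, ASCII 5.
MIXED = "1\u0662\u0969.5"


def strict_float(text):
    """Hypothetical helper: like float(), but refuses non-ASCII input."""
    if not text.isascii():
        raise ValueError(f"non-ASCII characters in float string: {text!r}")
    return float(text)


print(float(MIXED))  # the built-in constructor accepts the mixed-script string

try:
    strict_float(MIXED)
except ValueError as exc:
    print("rejected:", exc)
```

Whether the stricter behaviour belongs in float() itself or in a separate, locale-aware function is exactly what the posters disagree on.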
Re: [Python-Dev] Python and the Unicode Character Database
On Thu, Dec 2, 2010 at 4:14 PM, M.-A. Lemburg m...@egenix.com wrote: .. Have you tried Google ? I tried Google and I could not find any plain text or HTML file that would use Arabic-Indic numerals. What was interesting, though, was that a search for quran unicode (without quotes) brought me to http://www.sacred-texts.com which says that they've been using unicode since 2002 in their archives. Interestingly enough, their version of Qur'an uses ordinary digits for ayah numbers. See, for example http://www.sacred-texts.com/isl/uq/050.htm. I will change my mind on this issue when you present a machine-readable file with Arabic-Indic numerals and a program capable of reading it and show that this program uses the same number parsing algorithm as Python's int() or float().
Re: [Python-Dev] PEP 384 accepted
2010/12/2 Martin v. Löwis mar...@v.loewis.de: I was told not to touch to Distutils code to avoid any regression since it's patched to the bones in third party products. So we decided to freeze distutils and add all new features in Distutils2, which is at alpha stage now. So this move seems contradictory to me. I think it was a bad decision to freeze distutils, and we certainly didn't make that (not any "we" that includes me, that is). "We" is the people at the last language summit. Sorry if I used such a vague word. This freeze made the situation worse. Can you expand on this and explain why it makes it worse? If we (you included) don't agree it's the best solution, I would not want to be pushed back to square one at the next summit. I happily reverted all my changes last year when asked, and started to work on Distutils2. But I'll run out of steam if the direction changes again, with you stating that it makes the situation worse. IIRC, it was really the incompatible changes that made people ask you to stop changing distutils. Who is "people"? Are you suggesting that we could have added all the new features in Distutils in the stdlib? The decision was because we had a mix of: - incompatible changes in private parts -- and some packages were patching distutils internals - changes on public API behavior, with a behavior that was not clearly documented and subject to interpretation - some mistakes I made as well But that's what you would expect for a project that needs to evolve a lot. Thus the freezing. So how would you make the situation better, if not by doing the work in distutils2? Regards, Martin -- Tarek Ziadé | http://ziade.org
Re: [Python-Dev] PEP 384 accepted
On 02/12/2010 21:39, Martin v. Löwis wrote: I was told not to touch to Distutils code to avoid any regression since it's patched to the bones in third party products. So we decided to freeze distutils and add all new features in Distutils2, which is at alpha stage now. So this move seems contradictory to me. I think it was a bad decision to freeze distutils, and we certainly didn't make that (not any we that includes me, that is). This freeze made the situation worse. What situation worse? We certainly did ask Tarek to become bdfl of distutils and fix/improve it (at a language summit 2 years ago). We then asked him to revert distutils and do the work in a new package instead of inside distutils (at the language summit this year). I would perhaps argue for a case by case exception on PEPs that *required* distutils support that are being accepted and implemented prior to distutils2 moving into the standard library. It doesn't sound like your changes are *required* by the PEP though. As I recall Tarek thought it was a bad idea to freeze distutils as well, but we insisted. :-) IIRC, it was really the incompatible changes that made people ask you to stop changing distutils. Which included virtually any change to even private APIs. Given the issues freezing the distutils APIs except for essential bugfixes is a reasonable response. I don't know of any situation it has made worse. Things are getting very much better, but happening in distutils2 not distutils. All the best, Michael Foord Regards, Martin ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk -- http://www.voidspace.org.uk/ READ CAREFULLY. 
By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies (”BOGUS AGREEMENTS”) that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Python and the Unicode Character Database
On Thu, Dec 2, 2010 at 8:23 PM, Martin v. Löwis mar...@v.loewis.de wrote: In the case of number parsing, I think Python would be better if float() rejected non-ASCII strings, and any support for such parsing should be redone correctly in a different place (preferably along with printing of numbers). +1. The set of strings currently accepted by the float constructor just seems too ad hoc to be at all useful. Apart from the decimal separator issue, and the question of exactly which decimal digits are accepted and which aren't, there are issues like this one:

>>> x = '\uff11\uff25\uff0b\uff11\uff10'
>>> x
'１Ｅ＋１０'
>>> float(x)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'decimal' codec can't encode character '\uff25' in position 1: invalid decimal Unicode string
>>> y = '\uff11E+\uff11\uff10'
>>> y
'１E+１０'
>>> float(y)
10000000000.0

That is, fullwidth *digits* are allowed, but none of the other characters can be fullwidth variants. Unfortunately, a float string doesn't consist solely of digits, and it seems to me to make little sense to allow variation in the digits without allowing corresponding variations in the other characters that might appear ('.', 'e', 'E', '+', '-'). A couple of slightly trickier decisions: (1) the float constructor currently does accept leading and trailing whitespace; should it allow any Unicode whitespace characters here? I'd say yes. (2) For int() rather than float(), there's a bit more value in allowing the variant digits, since it provides an easy way to interpret those digits. The decimal module currently makes use of this, for example (the decimal spec requires that non-European digits be accepted). I'd be happier if this functionality were moved elsewhere, though. The int constructor is, if anything, currently worse off than float, thanks to its attempts to support non-decimal bases. There's value in having an easy-to-specify, easy-to-maintain API for these basic builtin functions.
For one thing, it helps non-CPython implementations. [MAL] The Python 3.x docs apparently introduced a reference to the language spec which is clearly not capturing the wealth of possible inputs. That documentation update was my fault; I was motivated to make the update by issues unrelated to this one (mostly to do with Python 3's more consistent handling of inf and nan, as a result of all the new float-string conversion code). If I'd been thinking harder, I would have remembered that float accepted the non-European digits and added a note to that effect. This (unintentional) omission does underline the point that it's difficult right now to document and understand exactly what the float constructor does or doesn't accept. Mark
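The asymmetry Mark describes can be checked with the standard unicodedata module. The following is a minimal sketch and not part of the original message; the helper names describe() and try_float() are made up for illustration. It assumes current CPython 3 behaviour, where the rejection surfaces as a plain ValueError rather than the UnicodeEncodeError in the 2010 traceback (catching ValueError covers both, since UnicodeEncodeError is a subclass of it).

```python
import unicodedata


def describe(text):
    """Return (character, general category) pairs for each character."""
    return [(ch, unicodedata.category(ch)) for ch in text]


def try_float(text):
    """Return float(text), or None if the constructor rejects the string."""
    try:
        return float(text)
    except ValueError:  # UnicodeEncodeError is a ValueError subclass
        return None


x = '\uff11\uff25\uff0b\uff11\uff10'  # all five characters are fullwidth
y = '\uff11E+\uff11\uff10'            # fullwidth digits, ASCII 'E' and '+'

# Only the digits are decimal digits (Nd); the fullwidth 'E' is an
# uppercase letter (Lu) and the fullwidth '+' is a math symbol (Sm).
categories_x = [category for _, category in describe(x)]

result_x = try_float(x)  # rejected because of the fullwidth 'E' and '+'
result_y = try_float(y)  # accepted: only the digits are non-ASCII

# NFKC compatibility normalization folds every fullwidth form to ASCII,
# which is one way a caller could opt in to accepting such input.
normalized_x = unicodedata.normalize('NFKC', x)
result_normalized = float(normalized_x)
```

Normalizing first keeps the policy decision (which variants to accept) in the caller's hands instead of inside the float constructor.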
Re: [Python-Dev] Python and the Unicode Character Database
On 12/2/2010 4:48 PM, Martin v. Löwis wrote: Am 02.12.2010 22:30, schrieb Steven D'Aprano: Martin v. Löwis wrote: Then these users should speak up and indicate their need, or somebody should speak up and confirm that there are users who actually want '١٢٣٤.٥٦' to denote 1234.56. To my knowledge, there is no writing system in which '١٢٣٤.٥٦e4' means 12345600.0. I'm not sure what you're after here. That the current float() constructor accepts tons of bogus character strings and accepts them as numbers, and that it should stop doing so. What bogus characters do the float() and int() constructors accept? As far as I can see, they only accepts numerals. Not bogus characters, but bogus character strings. E.g. strings that mix digits from different scripts, and mix them with the Python decimal separator. Notice that Python does *not* currently support printing numbers in other scripts - even though this may actually be more useful than parsing. Lack of one function, even if more useful, does not imply that an existing function should be removed. No. But if the specific function(ality) is not useful and underspecified, it should be removed. So your problems with the current behaviour are: (1) in some unspecified way, it's not done correctly; No. My main concern is that it is not properly specified. If it was specified, I could then tell you what precisely is wrong about it. Right now, I can only give examples for input that it should not accept, and examples of input that it should, but does not accept. (2) it belongs somewhere other than float() and int(). That's only because it also needs a parameter to specify what syntax to follow, somehow. That parameter could be explicit or implicit, and it could be to float or to some other function. But it must be available, and is not. That second is awfully close to bike-shedding. Since you accept that Python *should* have the current behaviour No, I don't. 
I think it behaves incorrectly, accepting garbage input and guessing some meaning out of it. - how the current behaviour is incorrect; See above: it accepts strings that do not denote real numbers in any writing system, and, despite the claim that the feature is there to support other writing systems, actually does not truly support other writing systems. - your suggestions for correcting it; and Make the current implementation exactly match the current documentation. I think the documentation is correct; the implementation is wrong. - a concrete suggestion for where you would like to see the behaviour moved to, and why that would be better than where it currently is. The current behavior should go nowhere; it is not useful. Something very similar to the current behavior (but done correctly) should go into the locale module. I agree with everything Martin says here. I think the basic premise is: you won't find strings in the wild that use non-ASCII digits but do use the ASCII dot as a decimal point. And that's what float() is looking for. (And that doesn't even begin to address what it expects for an exponent 'e'.) Eric. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
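To make the disagreement concrete, here is a small sketch that is not taken from the thread. It first shows the kind of mixed-script input that float() accepts in current CPython, then a hypothetical strict_ascii_float() helper illustrating the stricter behaviour Martin argues for. The helper name and its regular expression are assumptions made for this example; they cover only plain decimal literals, not inf or nan.

```python
import re

# Plain decimal literals only; re.ASCII restricts \d to the digits 0-9.
_ASCII_FLOAT = re.compile(
    r'[+-]?(\d+(\.\d*)?|\.\d+)([eE][+-]?\d+)?', re.ASCII)


def strict_ascii_float(text):
    """Hypothetical helper: like float(), but accepts ASCII digits only."""
    stripped = text.strip()
    if not _ASCII_FLOAT.fullmatch(stripped):
        raise ValueError('not an ASCII float literal: %r' % text)
    return float(stripped)


def accepts(func, text):
    """Return True if func(text) succeeds, False if it raises ValueError."""
    try:
        func(text)
    except ValueError:
        return False
    return True


# Arabic-Indic digits one to six around an ASCII dot.
arabic_indic = '\u0661\u0662\u0663\u0664.\u0665\u0666'
# ASCII, Arabic-Indic and fullwidth digits mixed in one string.
mixed = '1\u0662\uff13.\u0664'

# The builtin accepts both strings, including the ASCII dot between
# non-ASCII digits that the thread describes as a bogus combination.
builtin_arabic = float(arabic_indic)
builtin_mixed = float(mixed)
```

With these definitions, accepts(strict_ascii_float, mixed) is False while accepts(float, mixed) is True, which is exactly the gap between the documented grammar and the implementation that the thread is arguing about.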
Re: [Python-Dev] PEP 384 accepted
This freeze made the situation worse. Can you expand on this and explain why it makes it worse? Before the freeze, distutils was unmaintained (i.e. before you started maintaining it), but people who wanted to improve it gradually at least could. Now gradual improvements are also banned, so it's not only unmaintained, but I can't even provide support for the PEP in Python that was just accepted. IIRC, it was really the incompatible changes that made people ask you to stop changing distutils. Who is "people"? Are you suggesting that we could have added all the new features in Distutils in the stdlib? No, only the ones that didn't cause backwards incompatibilities or break existing packages. But that's what you would expect for a project that needs to evolve a lot. Thus the freezing. Instead of evolving a lot, and instead of freezing, I would have preferred to evolve a little. So how would you make the situation better, if not by doing the work in distutils2? Lift the freeze. I'm all for replacing distutils with distutils2, but I'm not sure whether you will declare distutils2 ready tomorrow, next year, or ten years from now. Regards, Martin
Re: [Python-Dev] PEP 384 accepted
Am 02.12.2010 22:54, schrieb Michael Foord: On 02/12/2010 21:39, Martin v. Löwis wrote: I was told not to touch to Distutils code to avoid any regression since it's patched to the bones in third party products. So we decided to freeze distutils and add all new features in Distutils2, which is at alpha stage now. So this move seems contradictory to me. I think it was a bad decision to freeze distutils, and we certainly didn't make that (not any we that includes me, that is). This freeze made the situation worse. What situation worse? The distutils is unmaintained situation. It's not only unmaintained now, but proposed improvements are rejected without consideration, on the grounds that they are changes. I would perhaps argue for a case by case exception on PEPs that *required* distutils support that are being accepted and implemented prior to distutils2 moving into the standard library. It doesn't sound like your changes are *required* by the PEP though. Well, the PEP 384 text in PEP 3149 specifies a change. It's not clear whether this change was accepted when PEP 3149 was accepted, or whether it was accepted when PEP 384 was accepted, or whether it was not accepted at all, or whether it was just proposed. In any case, without the change, you won't naturally get extension modules that use the abi3 tag proposed in 3149. Regards, Martin ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 384 accepted
2010/12/2 Martin v. Löwis mar...@v.loewis.de: Am 02.12.2010 22:54, schrieb Michael Foord: On 02/12/2010 21:39, Martin v. Löwis wrote: I was told not to touch to Distutils code to avoid any regression since it's patched to the bones in third party products. So we decided to freeze distutils and add all new features in Distutils2, which is at alpha stage now. So this move seems contradictory to me. I think it was a bad decision to freeze distutils, and we certainly didn't make that (not any we that includes me, that is). This freeze made the situation worse. What situation worse? The distutils is unmaintained situation. It's not only unmaintained now, but proposed improvements are rejected without consideration, on the grounds that they are changes. I welcome those changes in Distutils2. That's the whole point. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 384 accepted
The distutils is unmaintained situation. It's not only unmaintained now, but proposed improvements are rejected without consideration, on the grounds that they are changes. I welcome those changes in Distutils2. That's the whole point. That would be useful if there was a clear vision of when distutils2 will be released. Please understand that I'm not blaming you for not releasing it (it *is* too much for a single person), but please understand that it's also not helpful to submit changes to a codebase that is not going to be released in a foreseeable future. Regards, Martin ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Python and the Unicode Character Database
Eric Smith wrote: The current behavior should go nowhere; it is not useful. Something very similar to the current behavior (but done correctly) should go into the locale module. I agree with everything Martin says here. I think the basic premise is: you won't find strings in the wild that use non-ASCII digits but do use the ASCII dot as a decimal point. And that's what float() is looking for. (And that doesn't even begin to address what it expects for an exponent 'e'.) http://en.wikipedia.org/wiki/Decimal_mark In China, comma and space are used to mark digit groups because dot is used as decimal mark. Note that float() can also parse integers, it just returns them as floats :-) -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Dec 02 2010) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ ::: Try our new mxODBC.Connect Python Database Interface for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 384 accepted
2010/12/2 Martin v. Löwis mar...@v.loewis.de: This freeze made the situation worse. Can you expand on this and explain why it makes it worse? Before the freeze, distutils was unmaintained (i.e. before you started maintaining it), but people who wanted to improve it gradually at least could. Now gradual improvements are also banned, so it's not only unmaintained, but I can't even provide support for the PEP in Python that was just accepted. IIRC, it was really the incompatible changes that made people ask you to stop changing distutils. Who is "people"? Are you suggesting that we could have added all the new features in Distutils in the stdlib? No, only the ones that didn't cause backwards incompatibilities or break existing packages. This is impossible. I can point you to some third party project that can break if you touch some distutils internals, like setuptools. Setuptools also uses some private global variables in some other modules in the stdlib FYI. The right answer back then was maybe: make setuptools and other projects evolve with distutils. But it did not happen. So we left the status quo and moved forward in distutils2. Because we knew distutils needed deeper changes anyway, and we knew setuptools was used everywhere and unfortunately not evolving at the same pace. (note: I am not blaming PJE or anyone when I say this; the way distutils worked and was poorly maintained was the main reason) But that's what you would expect for a project that needs to evolve a lot. Thus the freezing. Instead of evolving a lot, and instead of freezing, I would have preferred to evolve a little. So how would you make the situation better, if not by doing the work in distutils2? Lift the freeze. I'm all for replacing distutils with distutils2, but I'm not sure whether you will declare distutils2 ready tomorrow, next year, or ten years from now. Depends on what "ready" means.
If by ready you mean it can be used to replace Distutils1 in a project, I declare Distutils2 ready for usage NOW. It's in alpha stage. I want a solid beta before Pycon. I would even remove Distutils from 3.x altogether at some point since setuptools is not Python 3 compatible, and just put distutils2. 3.3 sounds like a good target. Regards Tarek -- Tarek Ziadé | http://ziade.org ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 384 accepted
On Thu, 02 Dec 2010 23:21:25 +0100 Martin v. Löwis mar...@v.loewis.de wrote: Am 02.12.2010 22:54, schrieb Michael Foord: On 02/12/2010 21:39, Martin v. Löwis wrote: I was told not to touch to Distutils code to avoid any regression since it's patched to the bones in third party products. So we decided to freeze distutils and add all new features in Distutils2, which is at alpha stage now. So this move seems contradictory to me. I think it was a bad decision to freeze distutils, and we certainly didn't make that (not any we that includes me, that is). This freeze made the situation worse. What situation worse? The distutils is unmaintained situation. It's not only unmaintained now, but proposed improvements are rejected without consideration, on the grounds that they are changes. I think distutils is simply a bugfix branch for distutils2. Similarly as how we don't commit improvements in e.g. 2.7 or 3.1, neither do we commit improvements to distutils. (and I think that's how Guido wanted it anyway) Regards Antoine. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 384 accepted
2010/12/2 Martin v. Löwis mar...@v.loewis.de: The distutils is unmaintained situation. It's not only unmaintained now, but proposed improvements are rejected without consideration, on the grounds that they are changes. I welcome those changes in Distutils2. That's the whole point. That would be useful if there was a clear vision of when distutils2 will be released. Please understand that I'm not blaming you for not releasing it (it *is* too much for a single person), but please understand that it's also not helpful to submit changes to a codebase that is not going to be released in a foreseeable future. I know you're not blaming me. Distutils 2 alpha3 is currently released and available at PyPI. I use it in some of my professional projects FWIW. alpha4 was postponed but should be out this month. It contains major features, people from the GSOC worked on. The initial roadmap was to have a final by the time 3.2 final is out, but that'll be too short. So the target is to have a beta release for Pycon, and to sync the final release with 3.3, with lots of feedback in the meantime hopefully, and people using it from 2.4 onward. Regards, Martin -- Tarek Ziadé | http://ziade.org ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 384 accepted
On Dec 02, 2010, at 11:21 PM, Martin v. Löwis wrote: Well, the PEP 384 text in PEP 3149 specifies a change. It's not clear whether this change was accepted when PEP 3149 was accepted, or whether it was accepted when PEP 384 was accepted, or whether it was not accepted at all, or whether it was just proposed. From my point of view, the PEP 3149 text is just a proposal. It leaves the final decision to PEP 384, but tries to address some of the issues raised during the PEP 3149 discussion. I think it is within PEP 384's scope to make the final decisions about it. In any case, without the change, you won't naturally get extension modules that use the abi3 tag proposed in 3149. I would favor changing distutils, if it can be done in a way that reasonably preserves backward compatibility. I suppose it's impossible to know all the ways 3rd party code has reached into distutils, but I think you can make fairly good judgements about whether a change is backward compatible or not. -Barry signature.asc Description: PGP signature ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 384 accepted
No, only the ones that didn't cause backwards incompatibilities or break existing packages. This is impossible. I can point you to some third party project that can break if you touch some distutils internals, like setuptools. Setuptools also uses some private global variables in some other modules in the stdlib FYI. So what would break if Extension accepted an abi= keyword parameter? Lift the freeze. I'm all for replacing distutils with distutils2, but I'm not sure whether you will declare distutils2 ready tomorrow, next year, or ten years from now. Depends on what "ready" means. Included in Python, so that changes become possible again. If by "ready" you mean it can be used to replace Distutils1 in a project, I declare Distutils2 ready for usage NOW. It's in alpha stage. I want a solid beta before Pycon. I would even remove Distutils from 3.x altogether at some point since setuptools is not Python 3 compatible, and just put distutils2. 3.3 sounds like a good target. So will distutils2 be released before that? If so, when? Regards, Martin
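One way to picture the risk behind Martin's question is with stand-in classes. The sketch below does not use the real distutils or setuptools code: Extension and PatchedExtension are simplified placeholders written for this example, and the abi keyword is the parameter proposed in the thread, not an existing API. It only illustrates the general failure mode, namely that a third-party replacement class written against the old constructor signature stops working once callers start passing the new keyword.

```python
class Extension:
    """Stand-in for the stdlib class, extended with the proposed keyword."""

    def __init__(self, name, sources, abi=None):
        self.name = name
        self.sources = list(sources)
        self.abi = abi  # hypothetical: e.g. 'abi3' for the stable ABI


class PatchedExtension(Extension):
    """Stand-in for a third-party replacement with the old signature."""

    def __init__(self, name, sources):
        # Written before the abi keyword existed, so it neither accepts
        # nor forwards it.
        Extension.__init__(self, name, sources)


def try_build(extension_class):
    """Instantiate the class the way a new-style setup script might."""
    try:
        ext = extension_class('spam', ['spam.c'], abi='abi3')
    except TypeError as exc:
        return 'broken: %s' % type(exc).__name__
    return 'ok: abi=%s' % ext.abi


stdlib_result = try_build(Extension)
patched_result = try_build(PatchedExtension)
```

Whether any real project would break this way depends on how it replaces the class; the sketch makes no claim about the actual setuptools or numpy.distutils code.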
Re: [Python-Dev] Python and the Unicode Character Database
Alexander Belopolsky wrote: On Thu, Dec 2, 2010 at 4:14 PM, M.-A. Lemburg m...@egenix.com wrote: .. Have you tried Google? I tried Google and could not find any plain text or HTML file that would use Arabic-Indic numerals. What was interesting, though, was that a search for quran unicode (without quotes) brought me to http://www.sacred-texts.com which says that they've been using Unicode since 2002 in their archives. Interestingly enough, their version of the Qur'an uses ordinary digits for ayah numbers. See, for example, http://www.sacred-texts.com/isl/uq/050.htm. I will change my mind on this issue when you present a machine-readable file with Arabic-Indic numerals and a program capable of reading it and show that this program uses the same number parsing algorithm as Python's int() or float(). Have you had a look at the examples I posted? They include texts and tables with numbers written using East Asian Arabic numerals. Here's an example of a famous Chinese text using Chinese numerals: http://ctext.org/nine-chapters Unfortunately, the Chinese numerals are not listed in the category Nd, so Python won't be able to parse them. This has various reasons, it seems, one of them being that the numeral code points were not defined as a range of code points. I'm sure you can find other books on mathematics in Sanskrit or Arabic scripts as well. But this whole branch of the discussion is not going to go anywhere. The point is that we support all of Unicode in Python, not just a fragment, and therefore the numeric constructors support all of Unicode. Using them, it's very easy to support numbers in all kinds of variants, whether bound to a locale or not. Adding more locale-aware numeric parsers and formatters to the locale module, based on these APIs, is certainly a good idea, but orthogonal to the ongoing discussion, IMO.
-- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Dec 02 2010) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ ::: Try our new mxODBC.Connect Python Database Interface for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Python and the Unicode Character Database
Terry Reedy wrote: On 11/29/2010 10:19 AM, M.-A. Lemburg wrote: Nick Coghlan wrote: On Mon, Nov 29, 2010 at 9:02 PM, M.-A. Lemburgm...@egenix.com wrote: If we would go down that road, we would also have to disable other Unicode features based on locale, e.g. whether to apply non-ASCII case mappings, what to consider whitespace, etc. We don't do that for a good reason: Unicode is supposed to be universal and not limited to a single locale. Because parsing numbers is about more than just the characters used for the individual digits. There are additional semantics associated with digit ordering (for any number) and decimal separators and exponential notation (for floating point numbers) and those vary by locale. We deliberately chose to make the builtin numeric parsers unaware of all of those things, and assuming that we can simply parse other digits as if they were their ASCII equivalents and otherwise assume a C locale seems questionable. Sure, and those additional semantics are locale dependent, even between ASCII-only locales. However, that does not apply to the basic building blocks, the decimal digits themselves. If the existing semantics can be adequately defined, documented and defended, then retaining them would be fine. However, the language reference needs to define the behaviour properly so that other implementations know what they need to support and what can be chalked up as being just an implementation accident of CPython. (As a point in the plus column, both decimal.Decimal and fractions.Fraction were able to handle the '١٢٣٤.٥٦' example in a manner consistent with the int and float handling) The support is built into the C API, so there's not really much surprise there. Regarding documentation, we'd just have to add that numbers may be made up of an Unicode code point in the category Nd. 
See http://www.unicode.org/versions/Unicode5.2.0/ch04.pdf, section 4.6 for details: "Decimal digits form a large subcategory of numbers consisting of those digits that can be used to form decimal-radix numbers. They include script-specific digits, but exclude characters such as Roman numerals and Greek acrophonic numerals. (Note that <1, 5> = 15 = fifteen, but <I, V> = IV = four.) Decimal digits also exclude the compatibility subscript or superscript digits to prevent simplistic parsers from misinterpreting their values in context." int(), float() and long() (in Python 2) are such simplistic parsers. Since you are the knowledgeable advocate of the current behavior, perhaps you could open an issue and propose a doc patch, even if not .rst formatted. Good suggestion. I tried to collect as much context as possible: http://bugs.python.org/issue10610 I'll leave the rst-magic to someone else, but will certainly help if you have more questions about the details. -- Marc-Andre Lemburg eGenix.com
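The distinction drawn in the quoted passage (decimal digits in category Nd versus other numeric characters) can be inspected with the unicodedata module. This is a minimal sketch added for illustration; the sample characters and the helper name int_accepts() are choices made for this example, and the int() behaviour shown is that of current CPython.

```python
import unicodedata

# One sample character per kind of "number-like" code point.
samples = {
    'ASCII digit': '5',
    'Arabic-Indic digit': '\u0665',
    'fullwidth digit': '\uff15',
    'superscript digit': '\u00b2',   # SUPERSCRIPT TWO
    'Roman numeral': '\u2163',       # ROMAN NUMERAL FOUR
    'CJK ideograph': '\u4e94',       # the Chinese numeral five
}


def int_accepts(ch):
    """Return True if int() parses the single character ch."""
    try:
        int(ch)
    except ValueError:
        return False
    return True


# Map each label to (general category, numeric value, accepted by int()).
report = {
    label: (unicodedata.category(ch), unicodedata.numeric(ch), int_accepts(ch))
    for label, ch in samples.items()
}
```

All six characters carry a numeric value, but only the three in category Nd are accepted by int(); the superscript (No), the Roman numeral (Nl) and the ideograph (Lo) are rejected.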
Re: [Python-Dev] PEP 384 accepted
I think distutils is simply a bugfix branch for distutils2. Similarly as how we don't commit improvements in e.g. 2.7 or 3.1, neither do we commit improvements to distutils. It's different, though, in the sense that Python has a release schedule and multiple committers working on it, and that it normally gets released even if some changes don't get included in a specific release yet. All this seems not to be true for distutils2. So my motivation to contribute changes to it is *much* lower than my desire to contribute to distutils, and it is also provably lower than my motivation to contribute to distribute (say). I'm just getting tired having to talk to five projects just to make a single change to the build infrastructure available to the Python community. Regards, Martin ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Python and the Unicode Character Database
On Thu, Dec 2, 2010 at 5:58 PM, M.-A. Lemburg m...@egenix.com wrote: .. I will change my mind on this issue when you present a machine-readable file with Arabic-Indic numerals and a program capable of reading it and show that this program uses the same number parsing algorithm as Python's int() or float(). Have you had a look at the examples I posted ? They include texts and tables with numbers written using east asian arabic numerals. Yes, but this was all about output. I am pretty sure TeX was able to typeset Qur'an in all its glory long before Unicode was invented. Yet, in machine readable form it would be something like {\quran 1} (invented directive). I have asked for a file that is intended for machine processing, not for human enjoyment in print or on a display. I claim that if such file exists, the program that reads it does not use the same rules as Python and converting non-ascii digits would be a tiny portion of what that program does. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Python and the Unicode Character Database
On 02.12.2010 23:43, M.-A. Lemburg wrote: Eric Smith wrote: The current behavior should go nowhere; it is not useful. Something very similar to the current behavior (but done correctly) should go into the locale module. I agree with everything Martin says here. I think the basic premise is: you won't find strings in the wild that use non-ASCII digits but do use the ASCII dot as a decimal point. And that's what float() is looking for. (And that doesn't even begin to address what it expects for an exponent 'e'.) http://en.wikipedia.org/wiki/Decimal_mark In China, comma and space are used to mark digit groups because dot is used as decimal mark. I may be misinterpreting that, but I think that refers to the case of writing numbers using Arabic digits. Chinese digits are, e.g., used in the Suzhou numerals http://en.wikipedia.org/wiki/Suzhou_numerals This doesn't have a decimal point at all. Instead, the second line (below or to the left of the actual digits) describes the power of ten and the unit of measurement (i.e. similar to scientific notation, but with ideographs for the powers of ten). In another writing system, they use 点 (U+70B9) as the decimal separator, see http://en.wikipedia.org/wiki/Chinese_numerals#Fractional_values In the same system, the integral part uses multipliers, i.e. 12345 is [1][10000][2][1000][3][100][4][10][5]; the fractional part uses regular digits. Regards, Martin
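The multiplier scheme Martin describes is different enough from positional notation that a digit-by-digit substitution cannot parse it. The sketch below is a simplified illustration written for this purpose, not code from the thread: the character tables and the function names parse_integer() and parse_number() are assumptions of the example, and it only handles values below 10^8 (no multiplier above 万).

```python
from decimal import Decimal

DIGITS = {'零': 0, '一': 1, '二': 2, '三': 3, '四': 4,
          '五': 5, '六': 6, '七': 7, '八': 8, '九': 9}
SMALL_UNITS = {'十': 10, '百': 100, '千': 1000}
MYRIAD = '万'   # multiplier for 10000
POINT = '点'    # U+70B9, the decimal separator mentioned above


def parse_integer(text):
    """Parse an integral part written as digit/multiplier pairs."""
    total = section = digit = 0
    for ch in text:
        if ch in DIGITS:
            digit = DIGITS[ch]
        elif ch in SMALL_UNITS:
            # A bare multiplier (as at the start of 十五, 15) counts once.
            section += (digit or 1) * SMALL_UNITS[ch]
            digit = 0
        elif ch == MYRIAD:
            total += (section + digit) * 10000
            section = digit = 0
        else:
            raise ValueError('unexpected character: %r' % ch)
    return total + section + digit


def parse_number(text):
    """Parse an integral part, optionally followed by 点 and plain digits."""
    integer_part, separator, fraction = text.partition(POINT)
    value = Decimal(parse_integer(integer_part))
    if separator:
        if not fraction or any(ch not in DIGITS for ch in fraction):
            raise ValueError('bad fractional part: %r' % fraction)
        # The fractional part is positional: one plain digit per place.
        value += Decimal('0.' + ''.join(str(DIGITS[ch]) for ch in fraction))
    return value
```

For example, parse_integer('一万二千三百四十五') walks through the pairs [1][10000][2][1000][3][100][4][10][5] and yields 12345, while parse_number('三点一四') yields Decimal('3.14').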
Re: [Python-Dev] Python and the Unicode Character Database
On 12/2/2010 5:43 PM, M.-A. Lemburg wrote: Eric Smith wrote: The current behavior should go nowhere; it is not useful. Something very similar to the current behavior (but done correctly) should go into the locale module. I agree with everything Martin says here. I think the basic premise is: you won't find strings in the wild that use non-ASCII digits but do use the ASCII dot as a decimal point. And that's what float() is looking for. (And that doesn't even begin to address what it expects for an exponent 'e'.) http://en.wikipedia.org/wiki/Decimal_mark In China, comma and space are used to mark digit groups because dot is used as decimal mark. Is that an ASCII dot? That page doesn't say. Note that float() can also parse integers, it just returns them as floats :-) :) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 384 accepted
On 02/12/2010 23:01, Martin v. Löwis wrote: [snip...] I'm just getting tired having to talk to five projects just to make a single change to the build infrastructure available to the Python community. The very best hope of resolving that particular problem is distutils2. :-) distutils2 is *already* available to the Python community, and whether or not there is a fixed release date it will have betas and then a 1.0 release in the foreseeable future. The team working on it has made an enormous amount of progress. We're much better off as a development community putting our support and energy into distutils2 rather than pining for evolution of distutils. Michael Regards, Martin ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk -- http://www.voidspace.org.uk/ READ CAREFULLY. By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies (”BOGUS AGREEMENTS”) that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 384 accepted
2010/12/2 Martin v. Löwis mar...@v.loewis.de:
No, only the ones that didn't cause backwards incompatibilities, and broke existing packages.
This is impossible. I can point you to some third party project that can break if you touch some distutils internals, like setuptools. Setuptools also uses some private global variables in some other modules in the stdlib FYI.
So what would break if Extension accepted an abi= keyword parameter?
I suppose you have code behind this, that will be in build_ext and in the compilers. So you will need to try out ALL projects out there that customize build_ext, like numpy or setuptools, etc. But you won't be able to try out all projects because they are not listed somewhere. For starters, the Extension class is replaced by another one in setuptools, that patches the constructor if Pyrex is installed, which is unlikely I guess, so no big deal. But you will also get a replaced version of the Distribution class that uses a private method from distutils, and another version of build_ext with custom compiling flags. Now depending on how you do your thing it could work if you are careful at doing things on the top of setuptools. And then, if numpy.distutils is installed, it relies on the distutils build_ext and tries to rely on the setuptools one too, so it gets in the mix of the patched classes, and you get a horrible mix and possible bad interactions. So I am not saying it's impossible to add the feature, but it is impossible to be sure nothing gets broken in third party code. So the freeze seems wise indeed.
Lift the freeze. I'm all for replacing distutils with distutils2, but I'm not sure whether you will declare distutils2 ready tomorrow, next year, or ten years from now.
Depends on what ready means.
Included in Python, so that changes become possible again.
If by ready you mean it can be used to replace Distutils1 in a project, I declare Distutils2 ready for usage NOW. It's in alpha stage. I want a solid beta before Pycon.
I would even remove Distutils from 3.x altogether at some point since setuptools is not Python 3 compatible, and just put distutils2. 3.3 sounds like a good target.
So will distutils2 be released before that? If so, when?
An alpha is already released. A beta will be released for Pycon (I need it for my talk :) ) Then hopefully the final before 3.2
Regards, Martin
-- Tarek Ziadé | http://ziade.org
Re: [Python-Dev] Python and the Unicode Character Database
The point is that we support all of Unicode in Python, not just a fragment, and therefore the numeric constructors support all of Unicode. That conclusion is as false today as it was in Python 1.6, but only now people start caring about that. a) we don't support all of Unicode in numeric constructors. There are lots of things that you can write down that readers would recognize as a real/rational/integral number that float() won't parse. b) if float() would restrict itself to the scientific notation of real numbers (as it should), Python could well continue to claim all of Unicode. Adding more locale aware numeric parsers and formatters to the locale module, based on these APIs is certainly a good idea, but orthogonal to the ongoing discussion, IMO. Not at all. The concept of Unicode numbers is flawed: Unicode does *not* prescribe any specific way to denote numbers. Unicode is about characters, and Python supports the Unicode characters for digits as well as it supports all the other Unicode characters. Instead, support for non-scientific notation of real numbers should be based on user needs, which probably can be approximated by looking at actual scripts. This, in turn, is inherently locale-dependent. Regards, Martin ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Python and the Unicode Character Database
Eric Smith wrote: On 12/2/2010 5:43 PM, M.-A. Lemburg wrote: Eric Smith wrote: The current behavior should go nowhere; it is not useful. Something very similar to the current behavior (but done correctly) should go into the locale module. I agree with everything Martin says here. I think the basic premise is: you won't find strings in the wild that use non-ASCII digits but do use the ASCII dot as a decimal point. And that's what float() is looking for. (And that doesn't even begin to address what it expects for an exponent 'e'.) http://en.wikipedia.org/wiki/Decimal_mark In China, comma and space are used to mark digit groups because dot is used as decimal mark. Is that an ASCII dot? That page doesn't say. Yes, but to be fair: I think that the page actually refers to the use of the Arabic numeral format in China, rather than with their own script symbols. Note that float() can also parse integers, it just returns them as floats :-) :) -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Dec 02 2010) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ ::: Try our new mxODBC.Connect Python Database Interface for free ! eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 384 accepted
From my point of view, the PEP 3149 text is just a proposal. It leaves the final decision to PEP 384, but tries to address some of the issues raised during the PEP 3149 discussion. I think it is within PEP 384's scope to make the final decisions about it. Ok, then it looks like there just won't be any support for module tagging of ABI-conforming modules. It might be possible to support something like this in the import code, but I would consider this pointless without accompanying distutils support. Then, by default, the modules just use the ABI tag that distutils assigns to them by default. It's interesting to note that #9807 got into distutils despite it being frozen (but this is not about ABI tags, right - so does distutils in 3.2 actually assign any ABI tag at all?) I would favor changing distutils, if it can be done in a way that reasonably preserves backward compatibility. It seems this is right out for policy reasons. Regards, Martin ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 384 accepted
An alpha is already released. A beta will be released for Pycon (I need it for my talk :) ) Then hopefully the final before 3.2 Ok, that's promising. Regards, Martin ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 384 accepted
On Fri, Dec 3, 2010 at 12:01 AM, Martin v. Löwis mar...@v.loewis.de wrote: I think distutils is simply a bugfix branch for distutils2. Similarly as how we don't commit improvements in e.g. 2.7 or 3.1, neither do we commit improvements to distutils. It's different, though, in the sense that Python has a release schedule and multiple committers working on it, and that it normally gets released even if some changes don't get included in a specific release yet. All this seems not to be true for distutils2. We have 3 or 4 regular contributors. That's not a lot for sure. So my motivation to contribute changes to it is *much* lower than my desire to contribute to distutils, and it is also provably lower than my motivation to contribute to distribute (say). I'm just getting tired having to talk to five projects just to make a single change to the build infrastructure available to the Python community. I am not trying to motivate you to contribute to Distutils2. I am trying to make sure we are all on the same page for what's good for Python. So if we work in Distutils2 and you work in Distutils saying publicly that you don't want to contribute to Distutils2, that's a total nonsense. We took some decisions, and you want to go against them. So I want to have a consensus here for the packaging eco-system and make sure we are still on track. I am sorry if you get tired of it, but I don't want to be told at the next summit: sorry Tarek, now we need to do changes little by little in distutils1 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 384 accepted
Hi, 2010/12/3 Michael Foord fuzzy...@voidspace.org.uk On 02/12/2010 23:01, Martin v. Löwis wrote: [snip...] I'm just getting tired having to talk to five projects just to make a single change to the build infrastructure available to the Python community. The very best hope of resolving that particular problem is distutils2. :-) distutils2 is *already* available to the Python community, and whether or not there is a fixed release date it will have betas and then a 1.0 release in the foreseeable future. The team working on it has made an enormous amount of progress. We're much better off as a development community putting our support and energy into distutils2 rather than pining for evolution of distutils. Sure. But today (before 3.2b1) we want to merge PEP3149 and PEP384; they change the paths and filenames used by python. Either we modify distutils to comply with the new names, or defer these PEPs until distutils2 is ready. -- Amaury Forgeot d'Arc ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Python and the Unicode Character Database
On Thu, Dec 2, 2010 at 4:14 PM, M.-A. Lemburg m...@egenix.com wrote:
.. Some examples: http://www.bdl.gov.lb/circ/intpdf/int123.pdf
I looked at this one more closely. While I cannot understand what it says, it appears that Arabic numerals are used in dates. It looks like Python won't be able to deal with those:
    >>> datetime.strptime('١٩٩٩/١٠/٢٩', '%Y/%m/%d')
    ..
    ValueError: time data '١٩٩٩/١٠/٢٩' does not match format '%Y/%m/%d'
Interestingly,
    >>> datetime.strptime('١٩٩٩', '%Y')
    datetime.datetime(1999, 1, 1, 0, 0)
which further suggests that support of such numerals is accidental. As I think more about it, though, I am becoming less averse to accepting these numerals for base 10 integers. Integers can be easily extracted from text using a simple regex, and '\d' accepts all category Nd characters. I would require though that all digits be from the same block, which is not hard because Unicode now promises to only have them in contiguous blocks of 10. This rule seems to address some of the security issues because it is unlikely that a system that can display some of the local digits would not be able to display all of them properly. I still don't think it makes any sense to accept them in float().
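The remark that integers can be extracted with a simple '\d' regex can be sketched as follows. This is only an illustration and not part of the original message: the helper name parse_ymd is hypothetical, and it assumes the three fields appear in year, month, day order. It relies on two documented behaviours: in a str pattern '\d' matches any Unicode decimal digit, and int() accepts such digits.

```python
import re
from datetime import date


def parse_ymd(text):
    """Build a date from the three digit runs found in text.

    Hypothetical helper: assumes year, month, day order.
    """
    fields = re.findall(r"\d+", text)  # \d matches any category Nd character
    if len(fields) != 3:
        raise ValueError("expected exactly three digit fields: %r" % (text,))
    year, month, day = (int(field) for field in fields)
    return date(year, month, day)


# The date from the message, written with escapes to avoid bidi display issues.
sample = "\u0661\u0669\u0669\u0669/\u0661\u0660/\u0662\u0669"
print(parse_ymd(sample))
```

The sketch sidesteps strptime entirely, so it says nothing about whether strptime itself should accept such input.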
Re: [Python-Dev] Python and the Unicode Character Database
Stephen J. Turnbull wrote:
Steven D'Aprano writes: With full respect to haiyang kang, hear-say from one person can hardly be described as strong evidence
That's *disrespectful* nonsense. What Haiyang reported was not hearsay, it's direct observation of what he sees around him and personal experience, plus extrapolation. Look up hearsay, please.
Fair enough. I chose my words poorly and apologise. A better description would be anecdotal evidence.
-- Steven
Re: [Python-Dev] PEP 384 accepted
On 02/12/2010 23:51, Amaury Forgeot d'Arc wrote: Hi, 2010/12/3 Michael Foord fuzzy...@voidspace.org.uk mailto:fuzzy...@voidspace.org.uk On 02/12/2010 23:01, Martin v. Löwis wrote: [snip...] I'm just getting tired having to talk to five projects just to make a single change to the build infrastructure available to the Python community. The very best hope of resolving that particular problem is distutils2. :-) distutils2 is *already* available to the Python community, and whether or not there is a fixed release date it will have betas and then a 1.0 release in the foreseeable future. The team working on it has made an enormous amount of progress. We're much better off as a development community putting our support and energy into distutils2 rather than pining for evolution of distutils. Sure. But today (before 3.2b1) we want to merge PEP3149 and PEP384; they change the paths and filenames used by python. Either we modify distutils to comply with the new names, or defer these PEPs until distutils2 is ready. Or put support for them into distutils2 now? Michael -- Amaury Forgeot d'Arc ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk -- http://www.voidspace.org.uk/ READ CAREFULLY. By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies (BOGUS AGREEMENTS) that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer. 
Re: [Python-Dev] PEP 384 accepted
On Dec 03, 2010, at 12:51 AM, Amaury Forgeot d'Arc wrote: Sure. But today (before 3.2b1) we want to merge PEP3149 and PEP384; they change the paths and filenames used by python. Either we modify distutils to comply with the new names, or defer these PEPs until distutils2 is ready. I do not think it would be a good idea to revert PEP 3149. -Barry
Re: [Python-Dev] PEP 384 accepted
On 03.12.2010 00:25, Tarek Ziadé wrote:
2010/12/2 Martin v. Löwis mar...@v.loewis.de: No, only the ones that didn't cause backwards incompatibilities, and broke existing packages.
This is impossible. I can point you to some third party project that can break if you touch some distutils internals, like setuptools. Setuptools also uses some private global variables in some other modules in the stdlib FYI.
So what would break if Extension accepted an abi= keyword parameter?
I suppose you have code behind this, that will be in build_ext and in the compilers. So you will need to try out ALL projects out there that customize build_ext, like numpy or setuptools, etc. But you won't be able to try out all projects because they are not listed somewhere.
Is this necessary? Are all these projects known to work with 3.2, without having changes compared to 3.1 *without* this PEP? Hardly ... How many extensions will use this restricted API at all? Is it a legitimate solution to fall back to building an extension in the default mode? Even without having any changes in distutils it would make sense to know if an extension can be built with the restricted ABI, so maybe it is better to defer any changes to the extension soname, and provide a check for an extension if it conforms to the restricted ABI, even if the extension still uses the python version specific soname. I did not mean to block this PEP by choosing any installation names.
Matthias
Re: [Python-Dev] Python and the Unicode Character Database
On 12/2/2010 6:54 PM, Alexander Belopolsky wrote:
On Thu, Dec 2, 2010 at 4:14 PM, M.-A. Lemburg m...@egenix.com wrote: .. Some examples: http://www.bdl.gov.lb/circ/intpdf/int123.pdf
I looked at this one more closely. While I cannot understand what it says, it appears that Arabic numerals are used in dates. It looks like Python won't be able to deal with those:
When I travelled in S. Asia around 25 years ago, Arabic and Indic numerals were in obvious use in stores, road signs, and banks (as with money exchange receipts). I learned the digits partly for self-protection ;-). I have no real idea of what is done *now* in computerized business, but I assume the native digits are used. It may well be that there is no Python software yet that operates with native digits. The lack of direct output capability would hinder that. Of course, someone could run both input and output through language-specific str.translate digit translators.
    >>> datetime.strptime('١٩٩٩/١٠/٢٩', '%Y/%m/%d')
Googling ١٩٩٩ gets about 83,000 hits.
    ..
    ValueError: time data '١٩٩٩/١٠/٢٩' does not match format '%Y/%m/%d'
Interestingly,
    >>> datetime.strptime('١٩٩٩', '%Y')
    datetime.datetime(1999, 1, 1, 0, 0)
which further suggests that support of such numerals is accidental. As I think more about it, though, I am becoming less averse to accepting these numerals for base 10 integers.
Both input and output are needed for educational programming, though translation tables might be enough.
Integers can be easily extracted from text using a simple regex, and '\d' accepts all category Nd characters. I would require though that all digits be from the same block, which is not hard because Unicode now promises to only have them in contiguous blocks of 10.
That seems sensible.
This rule seems to address some of the security issues because it is unlikely that a system that can display some of the local digits would not be able to display all of them properly. I still don't think it makes any sense to accept them in float().
For the present, I would pretty well agree with that, at least until we know more. You have raised an important issue. It is a bit of a chicken and egg problem though. We will not really know what is needed until Python is used more in non-English/non-Euro contexts, while such usage may await better support.
-- Terry Jan Reedy
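The suggestion to run input and output through language-specific str.translate digit translators could look roughly like this. It is a sketch under assumptions, not code from the thread: the table and function names are hypothetical, and only the Arabic-Indic block (U+0660 to U+0669) is covered.

```python
from datetime import datetime

ASCII_DIGITS = "0123456789"
# Arabic-Indic digits occupy the ten contiguous code points U+0660..U+0669.
ARABIC_INDIC_DIGITS = "".join(chr(0x0660 + i) for i in range(10))

# Hypothetical translation tables, one per direction.
TO_ASCII = str.maketrans(ARABIC_INDIC_DIGITS, ASCII_DIGITS)
TO_ARABIC_INDIC = str.maketrans(ASCII_DIGITS, ARABIC_INDIC_DIGITS)


def to_ascii_digits(text):
    """Replace Arabic-Indic digits with ASCII digits; other characters pass through."""
    return text.translate(TO_ASCII)


def to_arabic_indic_digits(text):
    """Replace ASCII digits with Arabic-Indic digits; other characters pass through."""
    return text.translate(TO_ARABIC_INDIC)


# Input side: translate first, then use the ordinary ASCII-only parser.
native = "\u0661\u0669\u0669\u0669/\u0661\u0660/\u0662\u0669"
parsed = datetime.strptime(to_ascii_digits(native), "%Y/%m/%d")

# Output side: format as usual, then translate back.
shown = to_arabic_indic_digits(parsed.strftime("%Y/%m/%d"))
```

With translation done at the boundaries, the parsing and formatting code in between never has to know which digit block the user prefers.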
Re: [Python-Dev] PEP 384 accepted
even without having any changes in distutils it would make sense to know if an extension can be built with the restricted ABI, so maybe it is better to defer any changes to the extension soname, and provide a check for an extension if it conforms to the restricted ABI, even if the extension still uses the python version specific soname. Python’s setup.py has an example in Martin’s branch: ext = Extension('xxlimited', ['xxlimited.c'], define_macros=[('Py_LIMITED_API', 1)]) http://codereview.appspot.com/3262043/patch/1/68 This is possible with today’s distutils. I don’t know if it’s enough to build stable-ABI-conformant extension modules. Regards ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Porting Ideas
Martin v. Löwis writes: Aside: how does one log into the Cheeseshop with your Launchpad OpenID? When I try to do it I end up on a Manual user registration page. I fill out the username with what I think my PyPI user name is, and add my python.org email address, but then it tells me 'barry' is already taken. Do I need some kind of back door linking of my lp openid and my pypi user id? Since the barry account already exists, you first need to log into that (likely using a password). You can then claim the LP OpenID as being associated with that account, and use LP in the future. It would be nice if the UI told users that, and offered an opportunity to log in. Better yet would be a option for an OpenID to claim a user name by giving the password for it (ie, automatically on a successful login from that page). ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Python and the Unicode Character Database
Lennart Regebro writes:
2010/12/2 Stephen J. Turnbull step...@xemacs.org: T1000 = float('一.◯◯◯')
That was already discussed here, and it's clear that unicode does not consider these characters to be something you can use in a decimal number, and hence it's not broken.
Huh? IOW, use Unicode features just because they're there, what the users want and use doesn't matter? The only evidence I've seen so far that this feature is anything but a toy for a small faction of developers is Neil Hodgson's information that OOo will generate these kinds of digits (note that it *will* do Han! so the evidence is as good for users demanding Han numerals as for any other kind, Unicode.org definitions notwithstanding), and that DOS CP 864 contains the Indo/Arabic versions. Of course, it's quite possible that those were toys for the developers of those software packages too.
Re: [Python-Dev] Porting Ideas
Hi Prashant, Python 3 support in distutils2 is not entirely finished, it’s an interesting and challenging task. Another idea: convert the python.org internal scripts to use Python 3, for example starting with patches for http://code.python.org/hg/peps/ . This would not have any impact on the community, but it’s easy work that’d help the Python developers to eat their own dogfood. Regards ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Python and the Unicode Character Database
Furthermore, data can well originate from texts that were written hundreds or even thousands of years ago, so there is plenty of material available for processing.
Hmm, for this I think we need a specially tuned language processing system, with one subsystem per language :)... (sometimes a single word is not enough, we also need context). Take pi for example: in modern math it is written as 3.1415...; in old China it is sometimes written as 三一四一五 or 三点一四一五 or 叁点壹肆壹伍. And if these texts are extracted through a scanner (OCR or other image processing tech), in my POV it is the job of this image processing subsystem (or some other subsystem between the image processing and the database) to do the mapping between the number and the raw text data. Example table in DB:
text   | raw data   | raw image data
-------|------------|---------------
3.1415 | 三一四一五 | image...
br, khy
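The mapping step described above, from raw text to a number, might be sketched like this for the positional forms quoted in the message. Everything here is an assumption made for illustration: the table, the function name, and the choice to treat 点 as the only decimal mark. Multiplier notation (十, 百, and so on) is not handled, and a string without a decimal mark is read as a plain integer, which is exactly the case where the message says context is needed.

```python
# Hypothetical lookup table: everyday and financial forms of each Han digit.
HAN_DIGIT_FORMS = ["〇零", "一壹", "二贰貳", "三叁", "四肆", "五伍",
                   "六陆陸", "七柒", "八捌", "九玖"]
HAN_DIGITS = {ch: str(value)
              for value, forms in enumerate(HAN_DIGIT_FORMS)
              for ch in forms}
DECIMAL_MARKS = {"点", "點"}


def han_positional_to_float(text):
    """Convert a purely positional Han digit string to a float.

    Raises ValueError for unknown characters or a repeated decimal mark.
    """
    out = []
    seen_mark = False
    for ch in text:
        if ch in HAN_DIGITS:
            out.append(HAN_DIGITS[ch])
        elif ch in DECIMAL_MARKS and not seen_mark:
            out.append(".")
            seen_mark = True
        else:
            raise ValueError("cannot interpret %r in %r" % (ch, text))
    return float("".join(out))


print(han_positional_to_float("三点一四一五"))
```

A real system would store the original text next to the derived number, as in the table above, so that a wrong guess can be corrected later.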
Re: [Python-Dev] Python and the Unicode Character Database
Neil Hodgson writes: While I don't have Excel to test with, OpenOffice.org Calc will display in Arabic or Han numerals using the NatNum format codes. Display is different from input, but at least this is concrete evidence. Will it accept Arabic on input? (Han might be too much to ask for since Unicode considers Han digits to be impure.) Ditto Arabic, I would imagine; ISO 8859/6 (aka Latin/Arabic) does not contain the Arabic digits that have been presented here earlier AFAICT. DOS code page 864 does use 0xB0-0xB9 OK, Microsoft thought it would be useful. I'd still like to know whether people actually use them for input (or output, for that matter -- anybody have a corpus of Arabic Form 10-Ks to grep through?), but that's more concrete evidence than we've seen before. Thank you! ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Python and the Unicode Character Database
Antoine Pitrou writes:
The legacy format argument looks like a red herring to me. When converting from a format to another it is the programmer's job to do his/her job right.
Uhmm, the argument *for* this feature proposed by several people is that Python's numeric constructors do it (right) so that the programmer doesn't have to. If Python *doesn't* do it right, why should Python do it at all?
Re: [Python-Dev] Python and the Unicode Character Database
On Thu, Dec 2, 2010 at 4:57 PM, Mark Dickinson dicki...@gmail.com wrote:
.. (the decimal spec requires that non-European digits be accepted).
Mark, I think *requires* is too strong a word to describe what the spec says. The decimal module documentation refers to two authorities:
1. IBM’s General Decimal Arithmetic Specification
2. IEEE standard 854-1987
The IEEE standard predates Unicode and unsurprisingly does not have anything related to the issue. The IBM spec says the following in the Conversions section:
"It is recommended that implementations also provide additional number formatting routines (including some which are locale-dependent), and if available should accept non-European decimal digits in strings." http://speleotrove.com/decimal/daconvs.html
This cannot possibly be interpreted as normative text. The emphasis is clearly on formatting routines, with non-European decimal digits added as an afterthought. This recommendation can reasonably be interpreted as a requirement that conversion routines should accept what formatting routines can produce. In Python there are no formatting routines to produce non-European numerals, so there is no requirement to accept them in conversions. I don't think the decimal module should support non-European decimal digits. The only place where it can make some sense is in int(), because here we have a fighting chance of producing a reasonable definition. The motivating use case is conversion of numerical data extracted from text using simple '\d+' regex matches. Here is how I would do it:
1. String x of non-European decimal digits is only accepted in int(x), but not by int(x, 0) or int(x, 10).
2. If x contains one or more non-European digits, then
(a) all digits must be from the same block:
    def basepoint(c): return ord(c) - unicodedata.digit(c)
    all(basepoint(c) == basepoint(x[0]) for c in x) -> True
(b) and a '+' or '-' sign is not allowed.
3. A character c is a digit if it matches the '\d' regex. I think this means unicodedata.category(c) -> 'Nd'.
Condition 2(b) is important because there is no clear way to define what is acceptable as '+' or '-' using Unicode character properties, and not all number systems even support a local form of negation. (It is also YAGNI.)
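The three rules above can be put together in a small runnable sketch. The function name parse_local_int is hypothetical and this is not how int() is implemented; it only illustrates the proposed acceptance rule. Pure ASCII input is passed to the ordinary int(), so signs keep working there.

```python
import unicodedata


def basepoint(c):
    """Code point of the zero digit in the block that c belongs to."""
    return ord(c) - unicodedata.digit(c)


def parse_local_int(x):
    """Parse x under the proposed rule (hypothetical helper).

    ASCII strings go to the built-in int().  Anything else must consist
    solely of category Nd characters taken from a single block of ten;
    signs are rejected, as in condition 2(b).
    """
    if x.isascii():
        return int(x)
    if any(unicodedata.category(c) != "Nd" for c in x):
        raise ValueError("only decimal digits are accepted: %r" % (x,))
    base = basepoint(x[0])
    if any(basepoint(c) != base for c in x):
        raise ValueError("digits come from more than one block: %r" % (x,))
    value = 0
    for c in x:
        value = value * 10 + unicodedata.digit(c)
    return value


print(parse_local_int("\u0661\u0669\u0669\u0669"))  # Arabic-Indic digits
```

Checking the category first matters because unicodedata.digit() also returns a value for some characters outside category Nd, such as superscript digits.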
Re: [Python-Dev] PEP 384 accepted
On 02/12/2010 23:17, Martin v. Löwis wrote:
Before the freeze, distutils was unmaintained (i.e. before you started maintaining it), but people who want to improve it gradually at least could. Now gradual improvements are also banned, so it's not only unmaintained, but I can't even provide support for the PEP in Python that was just accepted.
I wonder what your definition of “unmaintained” is. Tarek has been fixing bugs for two years, and recently I have been made a committer to assist him. It’s true that I’ve not been as active as I would have liked*, but I did fix some bugs, as I think you know, given that you’ve helped me in some reports. Sure, distutils is not as well-maintained as other modules, but a dozen bugs have been fixed by five or six of us since the revert. I do feel responsible for all 116 remaining bugs, and intend to address all of them.
* This is partly normal, since I had warned before I was accepted as a committer that my time would be scarce for a year, partly due to the fact that I also do bug triage, doc work and patch reviews, and partly due to some personal problems with focusing.
On the matter of freeze exceptions, there have been two:
- reading the makefile with the surrogateescape error handler so that python can build with an ASCII locale in a non-ASCII path (haypo, #6011)
- handle soabiflags (barry, #9807).
I took part in the discussion before those changes and did not object to them: they are very small changes that enable a new feature of Python 3.2. Maybe I should have requested Tarek’s approval for those changes; he knows better than me how third parties may break because of changes that don’t seem to break anything.
Regards
Re: [Python-Dev] Change to the Distutils / Distutils2 workflow
Hi everyone, I have sketched a workflow guide on http://wiki.python.org/moin/Distutils/FixingBugs Cheers
Re: [Python-Dev] PEP 384 accepted
Python’s setup.py has an example in Martin’s branch: ext = Extension('xxlimited', ['xxlimited.c'], define_macros=[('Py_LIMITED_API', 1)]) http://codereview.appspot.com/3262043/patch/1/68 This is possible with today’s distutils. I don’t know if it’s enough to build stable-ABI-conformant extension modules. It is. However, there is also the proposal that they use an ABI tag in the SO name; having that generated automatically would require a distutils change. Regards, Martin
Re: [Python-Dev] PEP 384 accepted
I wonder what your definition of “unmaintained” is. In this specific case: doesn't get feature requests acted upon. I'm well aware that you are fixing bugs, and that is appreciated. Sure, distutils is not as well-maintained as other modules, but a dozen bugs have been fixed by five or six of us since the revert. I do feel responsible for all 116 remaining bugs, and intend to address all of them. But if the resolution of the bug would require a new feature, your answer will be "this is going to be fixed in distutils2 (if at all), it's out of scope for distutils". Before, if the submitter contributed a patch, the patch was just unreviewed for a long time, unless one of the committers picked it up. Now, the patch will be rejected, which I consider worse - because the patch is not being rejected on its own merits, but just because of a policy decision to not improve distutils anymore. For example, I keep running into the issue that distutils doesn't currently support parallel builds. I have been pondering supporting -j for building extensions, using both unbounded -j and the GNU make style -jN build server. However, I know that the patch will be rejected, so I don't even start working on it. On the matter of freeze exceptions, there have been two: - reading the makefile with the surrogateescape error handler so that python can build with an ASCII locale in a non-ASCII path (haypo, #6011) - handle soabiflags (barry, #9807). I took part in the discussion before those changes and did not object to them: they are very small changes that enable a new feature of Python 3.2. Maybe I should have requested Tarek’s approval for those changes; he knows better than me how third parties may break because of changes that don’t seem to break anything. I see. Now, I'd claim that the reasoning as to why an abi= parameter on Extension may break things also applies to the soabiflags: to support soabiflags, the INSTALL_SCHEMES syntax was modified. 
If the install command is subclassed, that could lead to funny interactions, e.g. where the subclass fails to put abiflags into config_vars. IIUC, subst_vars will then eventually raise a ValueError. I'm not saying that this is a likely scenario - only that the reasoning "if a change can possibly affect existing code, it should not be made" applies to essentially any change. So if you want to avoid breaking things with certainty, not even bug fixes would be acceptable. Regards, Martin
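The failure mode described above can be sketched without distutils itself. The code below is a simplified stand-in, not the actual distutils implementation: subst_vars_sketch is a hypothetical name, the scheme string is only an illustrative path template, and the real function's fallback to environment variables is deliberately left out.

```python
import re


def subst_vars_sketch(template, config_vars):
    """Expand $name references in template using config_vars.

    Simplified stand-in for illustration; raises ValueError when a
    referenced variable is missing from config_vars.
    """
    def replace(match):
        name = match.group(1)
        if name not in config_vars:
            raise ValueError("invalid variable '$%s'" % name)
        return str(config_vars[name])

    return re.sub(r'\$([a-zA-Z_][a-zA-Z_0-9]*)', replace, template)


# Illustrative install-scheme entry that references $abiflags.
SCHEME = '$base/include/python$py_version_short$abiflags/$dist_name'

# Assumed values, chosen only for the example.
complete = {
    'base': '/usr',
    'py_version_short': '3.2',
    'abiflags': 'm',
    'dist_name': 'demo',
}
print(subst_vars_sketch(SCHEME, complete))

# A subclass that builds its own config_vars and omits 'abiflags'
# would hit the error path instead.
incomplete = {k: v for k, v in complete.items() if k != 'abiflags'}
try:
    subst_vars_sketch(SCHEME, incomplete)
except ValueError as exc:
    print('substitution failed:', exc)
```

The point of the sketch is only that a template gaining a new variable turns any caller with a stale variable dictionary into an error case, which is the kind of breakage the argument above is about.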
Re: [Python-Dev] Porting Ideas
It would be nice if the UI told users that, and offered an opportunity to log in. Better yet would be an option for an OpenID to claim a user name by giving the password for it (i.e., automatically on a successful login from that page). So many projects, so little time. Contributions are welcome. IOW, it's easier for me to educate users. Regards, Martin