Re: [Python-Dev] [Python-checkins] r86930 - in python/branches/py3k: Doc/library/os.rst Lib/os.py Lib/test/test_os.py Misc/ACKS Misc/NEWS

2010-12-02 Thread Nick Coghlan
On Thu, Dec 2, 2010 at 5:05 PM, terry.reedy python-check...@python.org wrote:
 +   If
 +   the target directory with the same mode as we specified already exists,
 +   raises an :exc:`OSError` exception if *exist_ok* is False, otherwise no
 +   exception is raised.  If the directory cannot be created in other cases,
 +   raises an :exc:`OSError` exception.

I would suggest being explicit here that a directory that exists, but has a
mode other than the one requested, always triggers an exception.
Perhaps something like the following:

Raises an :exc:`OSError` exception if the target directory already
exists, unless *exist_ok* is True and the existing directory has the
same mode as is specified in the current call.  Also raises an
:exc:`OSError` exception if the directory cannot be created for any
other reason.
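
To make the proposed wording concrete, here is a minimal sketch of the two cases that do not involve the mode check. The helper name `try_makedirs` is made up for illustration, and the sketch deliberately leaves out the mismatched-mode case, since whether that raises depends on the Python version in use.

```python
import os
import tempfile


def try_makedirs(path, exist_ok):
    """Return None on success, or the OSError raised by os.makedirs."""
    try:
        os.makedirs(path, exist_ok=exist_ok)
    except OSError as exc:
        return exc
    return None


with tempfile.TemporaryDirectory() as base:
    target = os.path.join(base, "a", "b")
    first = try_makedirs(target, exist_ok=False)   # directory is created
    second = try_makedirs(target, exist_ok=False)  # already exists: OSError
    third = try_makedirs(target, exist_ok=True)    # already exists: tolerated
```

Only the second call reports an error; the third succeeds because the existing directory is accepted.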

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] AIX 5.3 - Enabling Shared Library Support Vs Extensions

2010-12-02 Thread Sébastien Sablé


Hi Anurag,

On 25/11/2010 10:24, Anurag Chourasia wrote:
 All,

 When I configure python to enable shared libraries, none of the 
extensions are getting built during the make step due to this error.



you may want to take a look at the following issue:

http://bugs.python.org/issue941346

Python compiled with shared libraries was broken on AIX until recently. 
There are some patches there to get it to work, or you may want to test 
the latest 2.7 or 3.x releases.
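
As a quick check after rebuilding, something like the following can report whether the running interpreter was configured with shared library support. This is only a sketch: `Py_ENABLE_SHARED` and `LDLIBRARY` are build-time configuration variables, and it is an assumption that they are recorded on your platform (they may come back as `None`).

```python
import sysconfig

# Typically 1 for --enable-shared builds and 0 otherwise; may be None where
# the variable is not recorded (assumption: check this on your own platform).
shared_flag = sysconfig.get_config_var("Py_ENABLE_SHARED")
built_shared = bool(shared_flag)

print("shared build:", built_shared)
print("LDLIBRARY:", sysconfig.get_config_var("LDLIBRARY"))
```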


regards

--
Sébastien Sablé


Re: [Python-Dev] [Python-checkins] r86924 - python/branches/py3k/Doc/library/random.rst

2010-12-02 Thread Éric Araujo
 On Thu, Dec 2, 2010 at 12:41 PM, raymond.hettinger
 python-check...@python.org wrote:
 +A more general approach is to arrange the weights in a cumulative probability
 +distribution with :func:`itertools.accumulate`, and then locate the random value
 +with :func:`bisect.bisect`::
 +
 +    >>> choices, weights = zip(*weighted_choices)
 +    >>> cumdist = list(itertools.accumulate(weights))
 +    >>> x = random.random() * cumdist[-1]
 +    >>> choices[bisect.bisect(cumdist, x)]
 +    'Blue'

“pydoc bisect.bisect” is empty (“Alias for bisect_right()”); in the
code, bisect.bisect is noted as compatibility alias.  Wouldn’t it be
more helpful to use the newer name?
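
For reference, a sketch of the same recipe written with `bisect.bisect_right`. The `weighted_choices` data and the `weighted_pick` helper are illustrative, not taken from the checkin; the optional `rng` parameter is only there so the random source can be swapped out.

```python
import bisect
import itertools
import random

# Illustrative data: (choice, weight) pairs.
weighted_choices = [("Red", 3), ("Blue", 2), ("Yellow", 1), ("Green", 4)]


def weighted_pick(pairs, rng=random):
    """Pick one choice with probability proportional to its weight."""
    choices, weights = zip(*pairs)
    cumdist = list(itertools.accumulate(weights))   # e.g. [3, 5, 6, 10]
    x = rng.random() * cumdist[-1]                  # 0 <= x < total weight
    return choices[bisect.bisect_right(cumdist, x)]


print(weighted_pick(weighted_choices))
```

Because `x` is always strictly below the total weight, `bisect_right` never returns an index past the end of `choices`.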

Regards



Re: [Python-Dev] Python and the Unicode Character Database

2010-12-02 Thread Neil Hodgson
Stephen J. Turnbull:

 Here's why: '''print "%d" %
 some_integer''' doesn't now, and never will (unless Kristan gets his
 Python 2.8 <wink>), produce Arabic or Han numerals.  Not in any
 language I know of, not in Microsoft Excel, and definitely not in
 Python 2.

   While I don't have Excel to test with, OpenOffice.org Calc will
display in Arabic or Han numerals using the NatNum format codes.
http://www.scintilla.org/ArabicNumbers.png

 Ditto Arabic, I
 would imagine; ISO 8859/6 (aka Latin/Arabic) does not contain the
 Arabic digits that have been presented here earlier AFAICT.  Note that
 there's plenty of space for them in that code table (eg, 0xB0-0xB9 is
 empty).  Apparently nobody *ever* thought it was useful to have them!

   DOS code page 864 does use 0xB0-0xB9 for ٠ .. ٩.
http://www.ascii.ca/cp864.htm
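
The code page claim can be checked from Python itself, assuming the interpreter ships the `cp864` codec (CPython does). A small sketch:

```python
import unicodedata

raw = bytes(range(0xB0, 0xBA))      # the ten bytes 0xB0..0xB9
digits = raw.decode("cp864")        # decode with DOS code page 864
names = [unicodedata.name(ch) for ch in digits]

for byte, name in zip(raw, names):
    print(hex(byte), name)
```

The decoded characters are decimal digits as far as `int()` is concerned, so `int(digits)` parses them like their ASCII counterparts.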

   Neil


Re: [Python-Dev] Python and the Unicode Character Database

2010-12-02 Thread Georg Brandl
On 01.12.2010 23:39, Martin v. Löwis wrote:
 As of today, What’s New In Python 3.2 [1] does not even mention the
 unicodedata upgrade to 6.0.0.
 
 One reason was that I was instructed not to change What's New a few
 years ago.

Maybe all past, present and future whatsnew maintainers can agree on these
rules, which I copied directly from whatsnew/3.2.rst?

   Rules for maintenance:

   * Anyone can add text to this document.  Do not spend very much time
   on the wording of your changes, because your text will probably
   get rewritten to some degree.

   * The maintainer will go through Misc/NEWS periodically and add
   changes; it's therefore more important to add your changes to
   Misc/NEWS than to this file.

   * This is not a complete list of every single change; completeness
   is the purpose of Misc/NEWS.  Some changes I consider too small
   or esoteric to include.  If such a change is added to the text,
   I'll just remove it.  (This is another reason you shouldn't spend
   too much time on writing your addition.)

   * If you want to draw your new text to the attention of the
   maintainer, add 'XXX' to the beginning of the paragraph or
   section.

   * It's OK to just add a fragmentary note about a change.  For
   example: XXX Describe the transmogrify() function added to the
   socket module.  The maintainer will research the change and
   write the necessary text.

   * You can comment out your additions if you like, but it's not
   necessary (especially when a final release is some months away).

   * Credit the author of a patch or bugfix.   Just the name is
   sufficient; the e-mail address isn't necessary.  It's helpful to
   add the issue number:

 XXX Describe the transmogrify() function added to the socket
 module.

 (Contributed by P.Y. Developer; :issue:`12345`.)

   This saves the maintainer the effort of going through the SVN log
   when researching a change.

Georg



[Python-Dev] Change to the Distutils / Distutils2 workflow

2010-12-02 Thread Tarek Ziadé
Hey

Eric and I discussed the debugging workflow, and we agreed that
our lives would be easier if every bug fix landed first in
Distutils2 when it makes sense, and then got backported to Distutils1.

For other core-devs that would mean that your patches should be done
against hg.python.org/distutils2, which uses unittest2. Then Eric and
I would take care of the backporting.

I am planning to set up a wiki page with the workflow as soon as I get a chance.

Thanks
Tarek

-- 
Tarek Ziadé | http://ziade.org


Re: [Python-Dev] Python and the Unicode Character Database

2010-12-02 Thread Lennart Regebro
2010/12/2 Stephen J. Turnbull step...@xemacs.org:
 Because that works, but

 print(T1234)

 doesn't (it prints ASCII).  You can't round-trip, but users will
 want/expect that.

You should be able to round-trip, absolutely. I don't think you should
expect print() to do that. str(56) possibly. :)
That's an argument for it to be in a module, as you then would need to
send in a parameter on which decimal characters you want.
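
A rough sketch of what such a module-level helper could look like. The name `format_with_digits` and its `zero` parameter are hypothetical, and the approach assumes the ten target digits are contiguous code points starting at `zero`.

```python
def format_with_digits(number, zero="\u0660"):
    """Render an int using the ten decimal digits that start at `zero`.

    Assumes the target digits are contiguous code points, as they are for
    ARABIC-INDIC DIGIT ZERO..NINE (the default here).
    """
    table = {ord("0") + i: ord(zero) + i for i in range(10)}
    return str(number).translate(table)


arabic = format_with_digits(56)
round_tripped = int(arabic)   # int() accepts any Unicode decimal digits
print(arabic, round_tripped)
```

The caller chooses the digit set explicitly on output, while parsing with `int()` needs no parameter.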

 T1000 = float('一.◯◯◯')

That was already discussed here, and it's clear that unicode does not
consider these characters to be something you can use in a decimal
number, and hence it's not broken.
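
The distinction can be inspected with `unicodedata`. In this sketch the helper `parses_as_int` is made up for illustration; it contrasts a character that has a numeric value but no decimal digit value with one that has both.

```python
import unicodedata

ideographic_one = "\u4e00"    # CJK ideograph for "one"
arabic_indic_one = "\u0661"   # ARABIC-INDIC DIGIT ONE


def parses_as_int(text):
    """Return True if int() accepts the text."""
    try:
        int(text)
    except ValueError:
        return False
    return True


for ch in (ideographic_one, arabic_indic_one):
    print(
        unicodedata.name(ch),
        "| decimal:", unicodedata.decimal(ch, None),
        "| numeric:", unicodedata.numeric(ch, None),
        "| int() ok:", parses_as_int(ch),
    )
```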


Re: [Python-Dev] Python and the Unicode Character Database

2010-12-02 Thread Antoine Pitrou
On Wed, 1 Dec 2010 22:28:49 -0500
Alexander Belopolsky alexander.belopol...@gmail.com wrote:
 
  Both my personal observations when travelling from Turkey to India and
  Wikipedia say yes. When representing a number in Arabic, the lowest-valued
  position is placed on the right, so the order of positions is the same as in
  left-to-right scripts.
  https://secure.wikimedia.org/wikipedia/en/wiki/Arabic_language#Numerals
 
 This matches my limited research on this topic as well.  However, I am
 not sure that when these codes are embedded in Arabic text, their
 logical order always matches their display order.

That shouldn't matter, since unicode text follows logical order. The
display order is up to the graphical representation library.

Regards

Antoine.




Re: [Python-Dev] Porting Ideas

2010-12-02 Thread Lennart Regebro
On Wed, Dec 1, 2010 at 20:17, Antoine Pitrou solip...@pitrou.net wrote:
 And I'm not sure what this package called Python is (“a high-level
 object-oriented programming language”? like Java?), but I'm pretty sure
 I've heard there's a Python 3 compatible version.

Uhm... http://pypi.python.org/pypi/Python

Anybody wanna remove that, or update it or something? :-)


Re: [Python-Dev] AIX 5.3 - Enabling Shared Library Support Vs Extensions

2010-12-02 Thread Anurag Chourasia
Hi Sebastian,

Thanks for your response.

I looked at http://bugs.python.org/issue941346 earlier. I was referred to
this link by Stefan Krah through another bug that I created at
http://bugs.python.org/issue10555 for this issue.

I confirm that my problem is solved with the Python 2.7.1 release which
contains the changes done by you.

Great work done by you and other folks for enabling the Shared Library build
on AIX. Hats Off !!!

Regards,
Anurag

2010/12/2 Sébastien Sablé sa...@users.sourceforge.net


 Hi Anurag,

 On 25/11/2010 10:24, Anurag Chourasia wrote:

  All,
 
  When I configure python to enable shared libraries, none of the
 extensions are getting built during the make step due to this error.
 

 you may want to take a look at the following issue:

 http://bugs.python.org/issue941346

 Python compiled with shared libraries was broken on AIX until recently.
 There are some patches there to get it to work, or you may want to test the
 latest 2.7 or 3.x releases.

 regards

 --
 Sébastien Sablé



Re: [Python-Dev] ICU

2010-12-02 Thread Guido van Rossum
On Wed, Dec 1, 2010 at 8:45 PM, Alexander Belopolsky
alexander.belopol...@gmail.com wrote:
 On Tue, Nov 30, 2010 at 3:13 PM, Antoine Pitrou solip...@pitrou.net wrote:

 Oh, about ICU:

  Actually, I remember you saying that locale should ideally be replaced
  with a wrapper around the ICU library.

 By that, I stand - however, I have given up the hope that this will
 happen anytime soon.

 Perhaps this could be made a GSOC topic.


 Incidentally, this may also address another Python's Achilles' heel:
 the timezone support.

 http://icu-project.org/download/icutzu.html

I work with people who speak highly of ICU, so I want to encourage
work in this area.

At the same time, I'm skeptical -- IIRC, ICU is a large amount of C++
code. I don't know how easy it will be to integrate this into our
build processes for various platforms, nor how Pythonic the
resulting APIs will look to the experienced Python user.

Still, those are not roadblocks, the benefits are potentially great,
so it's definitely worth investigating!

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] [Python-checkins] r86817 - python/branches/py3k-stat-on-windows/Lib/test/test_shutil.py

2010-12-02 Thread Hirokazu Yamamoto

On 2010/11/27 5:31, Brian Curtin wrote:

On Fri, Nov 26, 2010 at 14:18, Hirokazu Yamamoto ocean-c...@m2.ccsnet.ne.jp wrote:



On 2010/11/27 5:02, Brian Curtin wrote:


We briefly chatted about this on the os.link
feature issue, but I never found a way around it.



How about implementing os.path.samefile in
Modules/posixmodule.c like this?

http://bugs.python.org/file19262/py3k_fix_kill_python_for_short_path.patch

# I hope this works.



That's almost identical to what the current os.path.sameopenfile is.

Lib/ntpath.py opens both files, then compares them via _getfileinformation.
That function is implemented to take in a file descriptor, call
GetFileInformationByHandle with it, then returns a tuple
of dwVolumeSerialNumber, nFileIndexHigh, and nFileIndexLow.



Yes. The difference is that a file object cannot represent a directory,
and FILE_FLAG_BACKUP_SEMANTICS probably makes it faster to open the file.
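
For comparison, the identity check can be sketched in portable Python using `os.stat`. This is not the Windows implementation discussed above: the helpers `file_identity` and `same_file` are hypothetical, and the `(st_dev, st_ino)` pair merely plays the role of the volume serial number and file index tuple. Since `os.stat` takes a path, it also works for directories.

```python
import os
import tempfile


def file_identity(path):
    """Return a (device, inode) pair identifying the object behind a path."""
    st = os.stat(path)
    return (st.st_dev, st.st_ino)


def same_file(path_a, path_b):
    """True if both paths refer to the same file or directory."""
    return file_identity(path_a) == file_identity(path_b)


with tempfile.TemporaryDirectory() as base:
    first = os.path.join(base, "first.txt")
    second = os.path.join(base, "second.txt")
    for name in (first, second):
        with open(name, "w") as handle:
            handle.write("x")

    alias = os.path.join(base, ".", "first.txt")   # same file, other spelling
    alias_matches = same_file(first, alias)
    other_matches = same_file(first, second)
    dir_matches = same_file(base, os.path.join(base, "."))
    stdlib_agrees = os.path.samefile(first, alias)
```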


Re: [Python-Dev] ICU

2010-12-02 Thread James Y Knight

On Dec 1, 2010, at 11:45 PM, Alexander Belopolsky wrote:

 On Tue, Nov 30, 2010 at 3:13 PM, Antoine Pitrou solip...@pitrou.net wrote:
 
 Oh, about ICU:
 
 Actually, I remember you saying that locale should ideally be replaced
 with a wrapper around the ICU library.
 
 By that, I stand - however, I have given up the hope that this will
 happen anytime soon.
 
 Perhaps this could be made a GSOC topic.
 
 
 Incidentally, this may also address another Python's Achilles' heel:
 the timezone support.
 
 http://icu-project.org/download/icutzu.html

Does ICU do anything regarding timezones that datetime + pytz doesn't already 
do? Wouldn't it make more sense to integrate the already-existing-and-pythonic 
pytz into Python than to make a new wrapper based on ICU?
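
For context, a sketch of what plain `datetime` offers on its own, assuming an interpreter that provides `datetime.timezone`. It covers fixed UTC offsets only; the named zones and daylight-saving rules that pytz or ICU supply are exactly what is missing here. The `EST` label and the sample date are illustrative.

```python
from datetime import datetime, timedelta, timezone

# A fixed-offset zone: no daylight-saving rules, just UTC-05:00.
eastern_standard = timezone(timedelta(hours=-5), "EST")

local = datetime(2010, 12, 2, 11, 41, tzinfo=eastern_standard)
as_utc = local.astimezone(timezone.utc)

print(local.isoformat(), "->", as_utc.isoformat())
```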

James



Re: [Python-Dev] Python and the Unicode Character Database

2010-12-02 Thread Alexander Belopolsky
On Thu, Dec 2, 2010 at 8:36 AM, Antoine Pitrou solip...@pitrou.net wrote:
 On Wed, 1 Dec 2010 22:28:49 -0500
 Alexander Belopolsky alexander.belopol...@gmail.com wrote:
..
 This matches my limited research on this topic as well.  However, I am
 not sure that when these codes are embedded in Arabic text, their
 logical order always matches their display order.

 That shouldn't matter, since unicode text follows logical order. The
 display order is up to the graphical representation library.


I am not so sure.  On my Mac, U+200F (RIGHT-TO-LEFT MARK) affects 0-9
and Arabic-Indic decimals differently:

>>> print('\u200F123')
‏123
>>> print('\u200F\u0661\u0662\u0663')
231

I replaced Arabic-Indic decimals with 0-9 in the output to demonstrate
the point.  Cut-n-paste does not work well in the presence of RTL
directives.

and U+202E (RIGHT-TO-LEFT OVERRIDE) reverts the display order for both:

>>> print('\u202E123')
321
>>> print('\u202E\u0661\u0662\u0663')
321

(again, the output display is simulated not copied.)  I don't know if
explicit RTL directives are ever used in Arabic texts, but it is quite
possible that texts converted from older formats would use them for
efficiency.
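
One way to see why the two kinds of digits can be laid out differently is to look at their bidirectional classes in the Unicode database. This sketch only inspects character properties; it does not reproduce any terminal's rendering, and it does not settle the question of how such text is actually used.

```python
import unicodedata

samples = {
    "ASCII digit 1": "1",
    "Arabic-Indic digit 1": "\u0661",
    "RIGHT-TO-LEFT MARK": "\u200f",
    "RIGHT-TO-LEFT OVERRIDE": "\u202e",
}

# Bidirectional class of each sample character (EN, AN, R, RLO, ...).
bidi_classes = {label: unicodedata.bidirectional(ch) for label, ch in samples.items()}

for label, bidi in bidi_classes.items():
    print(label, "->", bidi)

# In logical order the most significant digit comes first for both scripts.
logical_value = int("\u0661\u0662\u0663")
```

ASCII digits are classed as European numbers and Arabic-Indic digits as Arabic numbers, which the bidirectional algorithm treats differently.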

Note that my point is not to find the correct answer here, but to
demonstrate that we as a group don't have the expertise to get parsing
of Arabic text right.  If we've got it right for Arabic, it is by
chance and not by design.  This still leaves us with 41 other types of
digits for at least 30 different languages.  Nobody will ever assume
that python builtins are suitable for use with all these variants.
This feature is only good for nefarious purposes such as hiding
extra digits in innocent-looking files or smuggling binary data
through naive interfaces.

PS: BTW, shouldn't int('\u0661\u0662\u06DD') be valid? or is it
int('\u06DD\u0661\u0662')?


Re: [Python-Dev] Python and the Unicode Character Database

2010-12-02 Thread Antoine Pitrou
On Thursday, 2 December 2010 at 11:41 -0500, Alexander Belopolsky wrote:
 
 Note that my point is not to find the correct answer here, but to
 demonstrate that we as a group don't have the expertise to get parsing
 of Arabic text right.

I don't understand why you think Arabic or Hebrew text is any different
from Western text. Surely right-to-left isn't more conceptually
complicated than left-to-right, is it?

The fact that mixed rtl + ltr can render bizarrely or is awkward to cut
and paste is quite off-topic for our discussion.

 If we've got it right for Arabic, it is by
 chance and not by design.  This still leaves us with 41 other types of
 digits for at least 30 different languages.

So why do you trust the Unicode standard on other things and not on this
one?

Regards

Antoine.




Re: [Python-Dev] ICU

2010-12-02 Thread Benjamin Peterson
2010/12/2 Guido van Rossum gu...@python.org:
 On Wed, Dec 1, 2010 at 8:45 PM, Alexander Belopolsky
 alexander.belopol...@gmail.com wrote:
 On Tue, Nov 30, 2010 at 3:13 PM, Antoine Pitrou solip...@pitrou.net wrote:

 Oh, about ICU:

  Actually, I remember you saying that locale should ideally be replaced
  with a wrapper around the ICU library.

 By that, I stand - however, I have given up the hope that this will
 happen anytime soon.

 Perhaps this could be made a GSOC topic.


 Incidentally, this may also address another Python's Achilles' heel:
 the timezone support.

 http://icu-project.org/download/icutzu.html

 I work with people who speak highly of ICU, so I want to encourage
 work in this area.

 At the same time, I'm skeptical -- IIRC, ICU is a large amount of C++
 code. I don't know how easy it will be to integrate this into our
 build processes for various platforms, nor how Pythonic the
 resulting APIs will look to the experienced Python user.

There's a nice C-API.



-- 
Regards,
Benjamin


Re: [Python-Dev] [Python-checkins] r86930 - in python/branches/py3k: Doc/library/os.rst Lib/os.py Lib/test/test_os.py Misc/ACKS Misc/NEWS

2010-12-02 Thread Terry Reedy



On 12/2/2010 4:32 AM, Nick Coghlan wrote:

On Thu, Dec 2, 2010 at 5:05 PM, terry.reedy python-check...@python.org wrote:

(except I did not write most of the patch)


+   If
+   the target directory with the same mode as we specified already exists,
+   raises an :exc:`OSError` exception if *exist_ok* is False, otherwise no
+   exception is raised.  If the directory cannot be created in other cases,
+   raises an :exc:`OSError` exception.


I would suggest being explicit here that a directory that exists, but has a
mode other than the one requested, always triggers an exception.
Perhaps something like the following:

Raises an :exc:`OSError` exception if the target directory already
exists, unless *exist_ok* is True and the existing directory has the
same mode as is specified in the current call.  Also raises an
:exc:`OSError` exception if the directory cannot be created for any
other reason.


Georg has already patched that paragraph. I will let him decide if any 
further change is needed.


Terry


Re: [Python-Dev] ICU

2010-12-02 Thread P.J. Eby

At 07:47 AM 12/2/2010 -0800, Guido van Rossum wrote:

On Wed, Dec 1, 2010 at 8:45 PM, Alexander Belopolsky
alexander.belopol...@gmail.com wrote:
 On Tue, Nov 30, 2010 at 3:13 PM, Antoine Pitrou 
solip...@pitrou.net wrote:


 Oh, about ICU:

  Actually, I remember you saying that locale should ideally be replaced
  with a wrapper around the ICU library.

 By that, I stand - however, I have given up the hope that this will
 happen anytime soon.

 Perhaps this could be made a GSOC topic.


 Incidentally, this may also address another Python's Achilles' heel:
 the timezone support.

 http://icu-project.org/download/icutzu.html

I work with people who speak highly of ICU, so I want to encourage
work in this area.

At the same time, I'm skeptical -- IIRC, ICU is a large amount of C++
code. I don't know how easy it will be to integrate this into our
build processes for various platforms, nor how Pythonic the
resulting APIs will look to the experienced Python user.

Still, those are not roadblocks, the benefits are potentially great,
so it's definitely worth investigating!


FWIW, OSAF did a wrapping for Chandler, though I personally haven't used it:

   http://pyicu.osafoundation.org/

The README explains the mapping from the ICU APIs to Python ones, 
including iteration, string conversion, and timezone mapping for use 
with the datetime type.






Re: [Python-Dev] Porting Ideas

2010-12-02 Thread Terry Reedy

On 12/2/2010 8:36 AM, Lennart Regebro wrote:

On Wed, Dec 1, 2010 at 20:17, Antoine Pitrou solip...@pitrou.net wrote:

And I'm not sure what this package called Python is (“a high-level
object-oriented programming language”? like Java?), but I'm pretty sure
I've heard there's a Python 3 compatible version.


Uhm... http://pypi.python.org/pypi/Python

Anybody wanna remove that, or update it or something? :-)


Entry is for Python 2.5.

# Package Index Owner: guido, anthony, barry

--
Terry Jan Reedy




Re: [Python-Dev] Python and the Unicode Character Database

2010-12-02 Thread Alexander Belopolsky
On Thu, Dec 2, 2010 at 11:56 AM, Antoine Pitrou solip...@pitrou.net wrote:
 On Thursday, 2 December 2010 at 11:41 -0500, Alexander Belopolsky wrote:

 Note that my point is not to find the correct answer here, but to
 demonstrate that we as a group don't have the expertise to get parsing
 of Arabic text right.

 I don't understand why you think Arabic or Hebrew text is any different
 from Western text. Surely right-to-left isn't more conceptually
 complicated than left-to-right, is it?


No, but a mix of LTR and RTL is certainly more difficult than either
of the two.  I invite you to digest Unicode Standard Annex #9 before
we continue this discussion.

See http://unicode.org/reports/tr9/.


 The fact that mixed rtl + ltr can render bizarrely or is awkward to cut
 and paste is quite off-topic for our discussion.


No, it is not.  One of the invented use cases in this thread was naive
users' desire to enter numbers using their preferred local decimals.
Same users may want to be able to cut and paste their decimals as
well.  More importantly, however, legacy formats may not have support
for mixed-direction text and may require that "John is 41" be stored
as "41 si nhoJ", and a Unicode converter would turn it into "[RTL]John is
14", which will still display as "41 si nhoJ", but int(s[-2:]) will
return 14, not 41.
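
The scenario can be sketched as follows, under the assumption (mine, for illustration) that the converter simply reverses the stored visual-order string and prefixes a RIGHT-TO-LEFT OVERRIDE character:

```python
RLO = "\u202e"                  # RIGHT-TO-LEFT OVERRIDE

visual = "41 si nhoJ"           # how the legacy format stores "John is 41"
converted = RLO + visual[::-1]  # hypothetical converter output: RLO + "John is 14"

# Slicing works on logical (storage) order, not on what the display shows.
naive_age = int(converted[-2:])
print(repr(converted), naive_age)
```

The parsed value follows the stored character order, so it is 14 even though a bidi-aware display would show the digits as 41.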

 If we've got it right for Arabic, it is by
 chance and not by design.  This still leaves us with 41 other types of
 digits for at least 30 different languages.

 So why do you trust the Unicode standard on other things and not on this
 one?

What other things? As far as I understand the only str method that was
designed to comply with Unicode recommendations was str.isidentifier().
 And we have some really bizarre results:


>>> '\u2164'.isidentifier()
True
>>> '\u2164'.isalpha()
False

and can you describe the difference between str.isdigit() and
str.isdecimal()?  According to the reference manual,


str.isdecimal()
Return true if all characters in the string are decimal characters and
there is at least one character, false otherwise. Decimal characters
include digit characters, and all characters that can be used to
form decimal-radix numbers, e.g. U+0660, ARABIC-INDIC DIGIT ZERO.

str.isdigit()
Return true if all characters in the string are digits and there is at
least one character, false otherwise.
 http://docs.python.org/dev/library/stdtypes.html#str.isdecimal

Since U+0660 is mentioned in the first definition and not in the
second, I may conclude that it is not a digit, but

>>> '\u0660'.isdigit()
True

If you know the correct answer, please contribute it here:
http://bugs.python.org/issue10587.
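
To compare the predicates side by side, here is a small sketch over three sample characters (the selection is mine, chosen so that each predicate disagrees with another at least once):

```python
import unicodedata

samples = ["\u0660", "\u00b2", "\u2164"]

# Map each character to (isdecimal, isdigit, isnumeric).
results = {ch: (ch.isdecimal(), ch.isdigit(), ch.isnumeric()) for ch in samples}

for ch, flags in results.items():
    print("U+%04X" % ord(ch), unicodedata.name(ch), flags)
```

ARABIC-INDIC DIGIT ZERO passes all three tests, SUPERSCRIPT TWO is a digit but not a decimal, and ROMAN NUMERAL FIVE is only numeric.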


Re: [Python-Dev] Python and the Unicode Character Database

2010-12-02 Thread Antoine Pitrou
On Thursday, 2 December 2010 at 13:14 -0500, Alexander Belopolsky wrote:
  I don't understand why you think Arabic or Hebrew text is any different
  from Western text. Surely right-to-left isn't more conceptually
  complicated than left-to-right, is it?
 
 
 No, but a mix of LTR and RTL is certainly more difficult than either
 of the two.  I invite you to digest Unicode Standard Annex #9 before
 we continue this discussion.
 
 See http://unicode.org/reports/tr9/.

“This annex describes specifications for the *positioning* of characters
flowing from right to left” (emphasis mine)

Looks like something for implementors of rendering engines, which
python-dev is not AFAICT.

 Same users may want to be able to cut and paste their decimals as
 well.  More importantly, however, legacy formats may not have support
 for mixed-direction text and may require that "John is 41" be stored
 as "41 si nhoJ", and a Unicode converter would turn it into "[RTL]John is
 14", which will still display as "41 si nhoJ", but int(s[-2:]) will
 return 14, not 41.

The legacy format argument looks like a red herring to me. When
converting from one format to another, it is the programmer's job to
do his/her job right.

  If we've got it right for Arabic, it is by
  chance and not by design.  This still leaves us with 41 other types of
  digits for at least 30 different languages.
 
  So why do you trust the Unicode standard on other things and not on this
  one?
 
 What other things?

Everything which the Unicode database stores and that we already rely
on.

 As far as I understand the only str method that was
 designed to comply with Unicode recommendations was str.isidentifier().

I don't think so.  str.split() and str.splitlines() are also defined in
conformance to the SPEC, AFAIK.  They certainly try to.
And, outside of str itself, the re module tries to follow Unicode
categories as well (for example, \d should match non-ASCII digits).
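
A short sketch of both behaviours, using made-up sample strings: `\d` matching non-ASCII decimal digits in `str` patterns (unless `re.ASCII` is given), and `str.split()` treating a non-ASCII space as whitespace.

```python
import re

text = "room \u0661\u0662, floor 34"

unicode_digits = re.findall(r"\d+", text)                # Unicode-aware by default
ascii_digits = re.findall(r"\d+", text, flags=re.ASCII)  # restrict \d to [0-9]

# U+2003 EM SPACE is whitespace as far as str.split() is concerned.
fields = "a\u2003b".split()

print(unicode_digits, ascii_digits, fields)
```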

Regards

Antoine.




Re: [Python-Dev] Porting Ideas

2010-12-02 Thread Barry Warsaw
On Dec 02, 2010, at 12:59 PM, Terry Reedy wrote:

On 12/2/2010 8:36 AM, Lennart Regebro wrote:
 On Wed, Dec 1, 2010 at 20:17, Antoine Pitrou solip...@pitrou.net wrote:
 And I'm not sure what this package called Python is (“a high-level
 object-oriented programming language”? like Java?), but I'm pretty sure
 I've heard there's a Python 3 compatible version.

 Uhm... http://pypi.python.org/pypi/Python

 Anybody wanna remove that, or update it or something? :-)

Entry is for Python 2.5.

# Package Index Owner: guido, anthony, barry

Well, I definitely can't remember ever seeing that before.  Of course, that
doesn't mean I haven't. ;)

-Barry

Aside: how does one log into the Cheeseshop with your Launchpad OpenID?  When
I try to do it I end up on a Manual user registration page.  I fill out the
username with what I think my PyPI user name is, and add my python.org email
address, but then it tells me 'barry' is already taken.  Do I need some kind
of back door linking of my lp openid and my pypi user id?




Re: [Python-Dev] Porting Ideas

2010-12-02 Thread Sridhar Ratnakumar

On 2010-12-01, at 11:02 AM, Brian Curtin wrote:

 http://onpython3yet.com/ might be helpful to you. It orders the projects on 
 PyPI with the most dependencies which are not yet ported to 3.x.
 
 Note that there are a number of false positives, e.g., the first result -- 
 NumPy, since people don't seem to keep their classifiers up-to-date.

Also note that the dependency information is incomplete. For instance, 
onpython3yet.com shows just 14 packages depending on Twisted,

  http://onpython3yet.com/packages/show/Twisted

while, in reality, there are 68 of them,

  http://code.activestate.com/pypm/twisted/#requiredby
  (see the right sidebar)

-srid



Re: [Python-Dev] Python and the Unicode Character Database

2010-12-02 Thread Martin v. Löwis
On 02.12.2010 03:01, Ben Finney wrote:
 Stephen J. Turnbull step...@xemacs.org writes:
 
 Furthermore, he provided good *objective* reason (excessive cost, to
 which I can also testify, in several different input methods for
 Japanese) why numbers simply would not be input that way.

 What's left is copy/paste via the mouse.
 
 For direct entry by an interactive user, yes. Why are some people in
 this discussion thinking only of direct entry by an interactive user?

Ultimately, somebody will have entered the data.

 Input from an existing text file, as I said earlier.

Which *specific* existing text file? Have you actually *seen* such a
text file?

 Direct entry at the console is a red herring.

And we don't need powerhouses because power comes out of the socket.

Regards,
Martin


Re: [Python-Dev] Python and the Unicode Character Database

2010-12-02 Thread Martin v. Löwis
 Maybe all past, present and future whatsnew maintainers can agree on these
 rules, which I copied directly from whatsnew/3.2.rst?

I don't think all past maintainers can (I'm pretty certain that AMK
would disagree), but if that's the current policy, I can certainly try
following it (I didn't know it exists because I never look at the file).

Regards,
Martin


Re: [Python-Dev] Porting Ideas

2010-12-02 Thread Martin v. Löwis
 Aside: how does one log into the Cheeseshop with your Launchpad OpenID?  When
 I try to do it I end up on a Manual user registration page.  I fill out the
 username with what I think my PyPI user name is, and add my python.org email
 address, but then it tells me 'barry' is already taken.  Do I need some kind
 of back door linking of my lp openid and my pypi user id?

Since the barry account already exists, you first need to log into
that (likely using a password). You can then claim the LP OpenID as
being associated with that account, and use LP in the future.

Regards,
Martin


Re: [Python-Dev] Porting Ideas

2010-12-02 Thread Barry Warsaw
On Dec 02, 2010, at 08:44 PM, Martin v. Löwis wrote:

Since the barry account already exists, you first need to log into
that (likely using a password). You can then claim the LP OpenID as
being associated with that account, and use LP in the future.

Thanks Martin.
-Barry




[Python-Dev] PEP 384 accepted

2010-12-02 Thread Benjamin Peterson
Hi,
Since discussion has trailed off without any blocking objections, I'm
accepting PEP 384. Martin, you may mark the PEP accepted and proceed
with merging the implementation for the beta on Saturday.

-- 
Regards,
Benjamin


Re: [Python-Dev] Python and the Unicode Character Database

2010-12-02 Thread M.-A. Lemburg
Martin v. Löwis wrote:
 Now, one may wonder what precisely a possibly signed floating point
 number is, but most likely, this refers to

 floatnumber   ::=  pointfloat | exponentfloat
 pointfloat    ::=  [intpart] fraction | intpart "."
 exponentfloat ::=  (intpart | pointfloat) exponent
 intpart       ::=  digit+
 fraction      ::=  "." digit+
 exponent      ::=  ("e" | "E") ["+" | "-"] digit+
 digit         ::=  "0"..."9"

 I don't see why the language spec should limit the wealth of number
 formats supported by float().
 
 If it doesn't, there should be some other specification of what
 is correct and what is not. It must not be unspecified.

True.

 It is not uncommon for Asians and other non-Latin script users to
 use their own native script symbols for numbers. Just because these
 digits may look strange to someone doesn't mean that they are
 meaningless or should be discarded.
 
 Then these users should speak up and indicate their need, or somebody
 should speak up and confirm that there are users who actually want
 '١٢٣٤.٥٦' to denote 1234.56. To my knowledge, there is no writing
 system in which '١٢٣٤.٥٦e4' means 12345600.0.

I'm not sure what you're after here.

 Please also remember that Python3 now allows Unicode names for
 identifiers for much the same reasons.
 
 No no no. Addition of Unicode identifiers has a well-designed,
 deliberate specification, with a PEP and all. The support for
 non-ASCII digits in float appears to be ad-hoc, and not founded
 on actual needs of actual users.

Please note that we didn't have PEPs and the PEP process at the
time. The Unicode proposal predates and in some respects inspired
the PEP process.

The decision to add this support was deliberate based on the desire
to support as much of the nice features of Unicode in Python as
we could. At least that was what was driving me at the time.

Regarding actual needs of actual users: I don't buy that as an
argument when it comes to supporting a standard that is meant
to attract users with non-ASCII origins.

Some references you may want to read up on:

http://en.wikipedia.org/wiki/Numbers_in_Chinese_culture
http://en.wikipedia.org/wiki/Vietnamese_numerals
http://en.wikipedia.org/wiki/Korean_numerals
http://en.wikipedia.org/wiki/Japanese_numerals

Even MS Office supports them:

http://languages.siuc.edu/Chinese/Language_Settings.html

 Note that the support in float() (and the other numeric constructors)
 to work with Unicode code points was explicitly added when Unicode
 support was added to Python and has been available since Python 1.6.
 
 That doesn't necessarily make it useful. Alexander's complaint is that
 it makes Python unstable (i.e. changing as the UCD changes).

If that were true, then all Unicode database (UCD) changes would make
Python unstable. However, most changes to existing code points in the UCS
are bug fixes, so they actually have a stabilizing quality more than
a destabilizing one.

 It is not a bug by any definition of bug
 
 Most certainly it is: the documentation is either underspecified,
 or deviates from the implementation (when taking the most plausible
 interpretation). This is the very definition of bug.

The implementation is not a bug and neither was this a bug in the
2.x series of the Python documentation. The Python 3.x docs apparently
introduced a reference to the language spec which is clearly not
capturing the wealth of possible inputs.

So, yes, we're talking about a documentation bug, but not an
implementation bug.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Nov 29 2010)
 Python/Zope Consulting and Support ...http://www.egenix.com/
 mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/
 mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/


::: Try our new mxODBC.Connect Python Database Interface for free ! 


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
   Registered at Amtsgericht Duesseldorf: HRB 46611
   http://www.egenix.com/company/contact/


Re: [Python-Dev] Python and the Unicode Character Database

2010-12-02 Thread Georg Brandl
Am 02.12.2010 20:40, schrieb Martin v. Löwis:
 Maybe all past, present and future whatsnew maintainers can agree on these
 rules, which I copied directly from whatsnew/3.2.rst?
 
 I don't think all past maintainers can

Yes, and the same goes for the future ones, since they may not even know yet
that they will be whatsnew maintainers.  Or maybe they aren't born yet (let's
hope for a long life of Python 3...).

 (I'm pretty certain that AMK
 would disagree), but if that's the current policy, I can certainly try
 following it (I didn't know it exists because I never look at the file).

The large chunk of rules appeared in 2.6, where AMK still was maintainer.
But even in the whatsnew for 2.4, there is this:

.. Don't write extensive text for new sections; I'll do that.
.. Feel free to add commented-out reminders of things that need
.. to be covered.  --amk

But in any case, they are certainly valid for the current whatsnew -- even
if Raymond likes to grumble about too expansive commits :)

Georg



Re: [Python-Dev] Porting Ideas

2010-12-02 Thread Georg Brandl
Am 02.12.2010 20:06, schrieb Barry Warsaw:
 On Dec 02, 2010, at 12:59 PM, Terry Reedy wrote:
 
On 12/2/2010 8:36 AM, Lennart Regebro wrote:
 On Wed, Dec 1, 2010 at 20:17, Antoine Pitrou solip...@pitrou.net wrote:
 And I'm not sure what this package called Python is (“a high-level
 object-oriented programming language”? like Java?), but I'm pretty sure
 I've heard there's a Python 3 compatible version.

 Uhm... http://pypi.python.org/pypi/Python

 Anybody wanna remove that, or update it or something? :-)

Entry is for Python 2.5.

# Package Index Owner: guido, anthony, barry
 
 Well, I definitely can't remember ever seeing that before.  Of course, that
 doesn't mean I haven't. ;)

No idea what that entry is about.

* Development Status :: 3 - Alpha
* Development Status :: 6 - Mature

Aha.  Let's just delete it.

 Aside: how does one log into the Cheeseshop with your Launchpad OpenID?  When
 I try to do it I end up on a Manual user registration page.  I fill out the
 username with what I think my PyPI user name is, and add my python.org email
 address, but then it tells me 'barry' is already taken.  Do I need some kind
 of back door linking of my lp openid and my pypi user id?

In addition to what Martin said, the Claim OpenID form is on the Your
Details page.

Georg



Re: [Python-Dev] Python and the Unicode Character Database

2010-12-02 Thread Martin v. Löwis
 Then these users should speak up and indicate their need, or somebody
 should speak up and confirm that there are users who actually want
 '١٢٣٤.٥٦' to denote 1234.56. To my knowledge, there is no writing
 system in which '١٢٣٤.٥٦e4' means 12345600.0.
 
 I'm not sure what you're after here.

That the current float() constructor accepts tons of bogus character
strings and accepts them as numbers, and that it should stop doing so.

 The decision to add this support was deliberate based on the desire
 to support as much of the nice features of Unicode in Python as
 we could. At least that was what was driving me at the time.

At the time, this may have been the right thing to do. With the
experience gained, we should now conclude to revert this particular aspect.

 Some references you may want to read up on:
 
 http://en.wikipedia.org/wiki/Numbers_in_Chinese_culture
 http://en.wikipedia.org/wiki/Vietnamese_numerals
 http://en.wikipedia.org/wiki/Korean_numerals
 http://en.wikipedia.org/wiki/Japanese_numerals

I don't question that people use non-ASCII characters to
denote numbers. I claim that the specific support in Python for that
has no connection to reality. I further claim that the use of non-ASCII
numbers is a local convention, and that if you provide a library to
parse numbers, users (of that library) will somehow have to specify
which notational convention(s) is reasonable for the input they have.

 Even MS Office supports them:
 
 http://languages.siuc.edu/Chinese/Language_Settings.html

That's printing, though, not parsing.

Notice that Python does *not* currently support printing numbers in
other scripts - even though this may actually be more useful than
parsing.

 Note that the support in float() (and the other numeric constructors)
 to work with Unicode code points was explicitly added when Unicode
 support was added to Python and has been available since Python 1.6.

 That doesn't necessarily make it useful. Alexander's complaint is that
 it makes Python unstable (i.e. changing as the UCD changes).
 
 If that were true, then all Unicode database (UCD) changes would make
 Python unstable.

That's indeed the case - they do (see the recent bug report on white
space processing). However, any change makes Python unstable (in the
sense that it can potentially break existing applications), and, in
many cases, the risk of breaking something is well worth it.
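
To make the dependency on the UCD concrete, here is a minimal sketch (not taken from the thread) that reports which UCD version an interpreter was built against and shows whitespace splitting following Unicode properties rather than ASCII alone. The choice of U+2003 EM SPACE and the variable names are only illustrative assumptions.

```python
import unicodedata

# Version of the Unicode Character Database compiled into this interpreter.
ucd_version = unicodedata.unidata_version
print("UCD version:", ucd_version)

# U+2003 EM SPACE is not ASCII, but it counts as whitespace for
# str.isspace() and for str.split() called without a separator.
em_space = "\u2003"
is_space = em_space.isspace()
fields = ("left" + em_space + "right").split()
print(is_space, fields)
```

If a later UCD release reclassified a character, the same source could in principle split differently, which is the kind of instability being discussed here.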

In the case of number parsing, I think Python would be better if
float() rejected non-ASCII strings, and any support for such parsing
should be redone correctly in a different place (preferably along with
printing of numbers).

 Most certainly it is: the documentation is either underspecified,
 or deviates from the implementation (when taking the most plausible
 interpretation). This is the very definition of bug.
 
 The implementation is not a bug and neither was this a bug in the
 2.x series of the Python documentation.

Of course the 2.x documentation is wrong, in that it is severely
underspecified, and the most straight-forward interpretation of the
specific wording gives an incorrect impression of the implementation.

 The Python 3.x docs apparently
 introduced a reference to the language spec which is clearly not
 capturing the wealth of possible inputs.

Right - but only because the 2.x documentation *already* suggested that
the supported syntax matches the literal syntax - as that's the most
natural thing to assume.

Regards,
Martin


Re: [Python-Dev] PEP 384 accepted

2010-12-02 Thread Martin v. Löwis
 Since discussion has trailed off without any blocking objections, I'm
 accepting PEP 384. Martin, you may mark the PEP accepted and proceed
 with merging the implementation for the beta on Saturday.

Thanks! will do (I'll also take into consideration the proposed changes).

Regards,
Martin


Re: [Python-Dev] Porting Ideas

2010-12-02 Thread Dirkjan Ochtman
On Thu, Dec 2, 2010 at 20:24, Sridhar Ratnakumar
sridh...@activestate.com wrote:
 Also note that the dependency information is incomplete.

Also, a python3 version of chardet is available (from the website
only, looks like).

Cheers,

Dirkjan


Re: [Python-Dev] PEP 384 accepted

2010-12-02 Thread Tarek Ziadé
On Thu, Dec 2, 2010 at 9:24 PM, Martin v. Löwis mar...@v.loewis.de wrote:
 Since discussion has trailed off without any blocking objections, I'm
 accepting PEP 384. Martin, you may mark the PEP accepted and proceed
 with merging the implementation for the beta on Saturday.

 Thanks! will do (I'll also take into consideration the proposed changes).

I did not get an answer to my last mail about distutils / distutils2


 Regards,
 Martin




-- 
Tarek Ziadé | http://ziade.org


Re: [Python-Dev] PEP 384 accepted

2010-12-02 Thread Martin v. Löwis
Am 02.12.2010 21:48, schrieb Tarek Ziadé:
 On Thu, Dec 2, 2010 at 9:24 PM, Martin v. Löwis mar...@v.loewis.de wrote:
 Since discussion has trailed off without any blocking objections, I'm
 accepting PEP 384. Martin, you may mark the PEP accepted and proceed
 with merging the implementation for the beta on Saturday.

 Thanks! will do (I'll also take into consideration the proposed changes).
 
 I did not get an answer to my last mail about distutils / distutils2

What was the question again, and whom did you want an answer from?

Regards,
Martin


Re: [Python-Dev] PEP 384 accepted

2010-12-02 Thread Tarek Ziadé
2010/12/2 Martin v. Löwis mar...@v.loewis.de:
 Am 02.12.2010 21:48, schrieb Tarek Ziadé:
 On Thu, Dec 2, 2010 at 9:24 PM, Martin v. Löwis mar...@v.loewis.de wrote:
 Since discussion has trailed off without any blocking objections, I'm
 accepting PEP 384. Martin, you may mark the PEP accepted and proceed
 with merging the implementation for the beta on Saturday.

 Thanks! will do (I'll also take into consideration the proposed changes).

 I did not get an answer to my last mail about distutils / distutils2

 What was the question again, and whom did you want an answer from?

You can read it in the archives here:
http://mail.python.org/pipermail/python-dev/2010-November/106138.html

tldr:

The question was: "Why not implement this in Distutils2?"
Your answer was: "No, PEP 3149 was accepted, I will do this in Distutils1."
My answer was: "Having an accepted PEP does not imply your code lands
in the stdlib (like PEP 376 and 345)."

So the question still stands: why not implement this in Distutils2?

Regards
Tarek


 Regards,
 Martin




-- 
Tarek Ziadé | http://ziade.org


Re: [Python-Dev] Python and the Unicode Character Database

2010-12-02 Thread M.-A. Lemburg
Martin v. Löwis wrote:
 [...]
 For direct entry by an interactive user, yes. Why are some people in
 this discussion thinking only of direct entry by an interactive user?
 
 Ultimately, somebody will have entered the data.

I don't think you really believe that all data processed by a
computer was eventually manually entered by someone :-)

I already gave you a couple of examples of how such data can
end up being input for Python number constructors. If you are
still curious, please see the Wikipedia pages I linked to,
or have a look at these keyboards:

http://en.wikipedia.org/wiki/File:KB_Arabic_MAC.svg
http://en.wikipedia.org/wiki/File:Keyboard_Layout_Sanskrit.png
http://en.wikipedia.org/wiki/File:800px-KB_Thai_Kedmanee.png
http://en.wikipedia.org/wiki/File:Tibetan_Keyboard.png
http://en.wikipedia.org/wiki/File:KBD-DZ-noshift-2009.png

(all referenced on http://en.wikipedia.org/wiki/Keyboard_layout)

and then compare these to:

http://www.unicode.org/Public/5.2.0/ucd/extracted/DerivedNumericType.txt

Arabic numerals are being used a lot nowadays in Asian countries,
but that doesn't mean that the native script versions are not
being used anymore.

Furthermore, data can well originate from texts that were written
hundreds or even thousands of years ago, so there is plenty of
material available for processing.

Even if not entered directly, there are plenty of ways to convert
Arabic numerals (or other numeral systems) to the above forms,
e.g. in MS Office for Thai:

http://office.microsoft.com/en-us/excel-help/convert-arabic-numbers-to-thai-text-format-HP003074364.aspx

Anyway, as mentioned before, all this is really beside the point:

If we want to support Unicode in Python, we have to also support
conversion of numerals declared in Unicode into a form that can
be processed by Python. Regardless of where such data originates.

If we were not to follow this approach, we could just as well
decide not to support reading Egyptian Hieroglyphs based
on the argument that there's no keyboard to enter them...

http://www.unicode.org/charts/PDF/U13000.pdf  :-)

(from http://www.unicode.org/charts/)

 Input from an existing text file, as I said earlier.
 
 Which *specific* existing text file? Have you actually *seen* such a
 text file?

Have you tried Google ?

http://www.google.com/search?q=١٢٣
http://www.google.com/search?q=٣+site%3Agov.lb

Some examples:

http://www.bdl.gov.lb/circ/intpdf/int123.pdf
http://www.cdr.gov.lb/study/sdatl/Arabic/Chapter3.PDF
http://www.batroun.gov.lb/PDF/Waredat2006.pdf

(these all use http://en.wikipedia.org/wiki/Eastern_Arabic_numerals)

 Direct entry at the console is a red herring.
 
 And we don't need powerhouses because power comes out of the socket.

Martin, the argument simply doesn't fit well with the discussion
about Python and Unicode.

We introduced Unicode in Python not because there was a need
for each and every code point in Unicode, but because we wanted
to adopt a standard which doesn't prefer any one way of writing
things over another.

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] PEP 384 accepted

2010-12-02 Thread Martin v. Löwis

 So the question still stands: why not implement this in Distutils2?

Because it then wouldn't be available in Python 3.2, which is the target
release of the PEP.

If that really causes too much pain, I'll refrain from making any
changes to distutils; PEP 384 doesn't specify any changes, anyway.

Regards,
Martin


Re: [Python-Dev] Python and the Unicode Character Database

2010-12-02 Thread Martin v. Löwis
 Arabic numerals are being used a lot nowadays in Asian countries,
 but that doesn't mean that the native script versions are not
 being used anymore.

I never claimed that people are not using their local scripts to enter
numbers. However, none of your examples is about Chinese numerals using
an ASCII full stop as a decimal point. The only thing I claimed about
usage (actually only repeating haiyang kang's earlier claim) is that
nobody would enter Chinese numerals with a keyboard and then use full
stop as the decimal separator.

So all your counter-examples just don't apply - I don't deny them.
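
The distinction between kinds of numerals can be checked against the UCD numeric types that the `unicodedata` module exposes. The following is a minimal sketch, not part of the original message; the helper `accepted_by_int` and the three sample characters are illustrative assumptions. It suggests that `int()` follows the "decimal digit" property, so Arabic-Indic digits are accepted while CJK numerals and superscript digits are not.

```python
import unicodedata

ARABIC_INDIC_THREE = "\u0663"  # Numeric_Type=Decimal
CJK_THREE = "\u4e09"           # ideograph for three, Numeric_Type=Numeric
SUPERSCRIPT_TWO = "\u00b2"     # Numeric_Type=Digit


def accepted_by_int(text):
    """Return True if int() parses text, False if it raises ValueError."""
    try:
        int(text)
    except ValueError:
        return False
    return True


for ch in (ARABIC_INDIC_THREE, CJK_THREE, SUPERSCRIPT_TWO):
    print(
        unicodedata.name(ch),
        ch.isdecimal(), ch.isdigit(), ch.isnumeric(),
        unicodedata.numeric(ch),
        accepted_by_int(ch),
    )
```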

Regards,
Martin


Re: [Python-Dev] Python and the Unicode Character Database

2010-12-02 Thread Steven D'Aprano

Martin v. Löwis wrote:

Then these users should speak up and indicate their need, or somebody
should speak up and confirm that there are users who actually want
'١٢٣٤.٥٦' to denote 1234.56. To my knowledge, there is no writing
system in which '١٢٣٤.٥٦e4' means 12345600.0.

I'm not sure what you're after here.


That the current float() constructor accepts tons of bogus character
strings and accepts them as numbers, and that it should stop doing so.


What bogus characters do the float() and int() constructors accept? As 
far as I can see, they only accept numerals.



[...]

Notice that Python does *not* currently support printing numbers in
other scripts - even though this may actually be more useful than
parsing.


Lack of one function, even if more useful, does not imply that an 
existing function should be removed.


[...]

In the case of number parsing, I think Python would be better if
float() rejected non-ASCII strings, and any support for such parsing
should be redone correctly in a different place (preferably along with
printing of numbers).


So your problems with the current behaviour are:

(1) in some unspecified way, it's not done correctly;

(2) it belongs somewhere other than float() and int().

That second is awfully close to bike-shedding. Since you accept that 
Python *should* have the current behaviour, and Python *already* has the 
current behaviour, it seems strange that you are kicking up such a fuss 
merely to *move* the implementation of that behaviour out of the numeric 
constructors into some unspecified different place.


I think it would be constructive to explain:

- how the current behaviour is incorrect;
- your suggestions for correcting it; and
- a concrete suggestion for where you would like to see the behaviour 
moved to, and why that would be better than where it currently is.




--
Steven



Re: [Python-Dev] PEP 384 accepted

2010-12-02 Thread Tarek Ziadé
2010/12/2 Martin v. Löwis mar...@v.loewis.de:

 So the question still stands: why not implement this in Distutils2?

 Because it then wouldn't be available in Python 3.2, which is the target
 release of the PEP.

The exact feature I am mentioning is the ability to compile extensions
with new options, so I am not sure which PEP is involved since
distutils changes refer to PEP 384 in the other PEP.

I was told not to touch the Distutils code, to avoid any regression,
since it's patched to the bones in third party products. So we decided
to freeze distutils and add all new features in Distutils2, which is
at alpha stage now.  So this move seems contradictory to me.

Grouping all new features in the new version and keeping Distutils1 in
maintenance mode seems to make more sense to me, if we want to make
Distutils die and push forward Distutils2 for its new features etc. Or
we might get back into backward hell again :)

So, I am +1 on a patch on distutils2 and -1 on de-freezing Distutils
for any new feature.

 If that really causes too much pain, I'll refrain from making any
 changes to distutils; PEP 384 doesn't specify any changes, anyway.

That would be awesome, and we can work on a patch for distutils2 to
provide that abi option.



 Regards,
 Martin




-- 
Tarek Ziadé | http://ziade.org


Re: [Python-Dev] Python and the Unicode Character Database

2010-12-02 Thread Alexander Belopolsky
On Thu, Dec 2, 2010 at 1:55 PM, Antoine Pitrou solip...@pitrou.net wrote:
..
 I don't think so.  str.split() and str.splitlines() are also defined in
 conformance to the SPEC, AFAIK.  They certainly try to.

You are joking, right?  Where exactly does Unicode specify something like this:

 ''.join('̀́̂'.split('\udf00\ud800'))
'́̂'
?

OK, splitting on a given separator has very little to do with Unicode
or UCD, but str.splitlines()  makes absolutely no attempt to conform
to Unicode Standard Annex #14 (Unicode line breaking algorithm).
Wait, UAX #14 is actually relevant to textwrap module which saw very
little change since 2.x days.  So, what exactly does str.splitlines()
do?   And which part of the Unicode standard defines how it is
different from str.split(.., '\n')?  Reference manual does not help me
here either:


str.splitlines([keepends])

Return a list of the lines in the string, breaking at line boundaries.
Line breaks are not included in the resulting list unless keepends is
given and true.
 http://docs.python.org/dev/library/stdtypes.html#str.splitlines
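
A minimal sketch of the observable difference between the two calls, assuming a current CPython 3 interpreter (the sample string is made up for illustration). It only demonstrates behaviour; it does not say which part of the Unicode standard, if any, that behaviour is meant to follow.

```python
# Sample text with LF, CRLF, U+2028 LINE SEPARATOR and a trailing LF.
text = "one\ntwo\r\nthree\u2028four\n"

by_splitlines = text.splitlines()   # breaks at every recognised line boundary
by_split = text.split("\n")         # breaks at LF only
with_ends = text.splitlines(True)   # keepends=True preserves the terminators

print(by_splitlines)
print(by_split)
print(with_ends)
```

Here `splitlines()` treats CRLF as a single boundary and produces no trailing empty string, while `split("\n")` leaves the `\r` attached, ignores U+2028 and yields a final empty element.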


Re: [Python-Dev] Python and the Unicode Character Database

2010-12-02 Thread Antoine Pitrou
Le jeudi 02 décembre 2010 à 16:34 -0500, Alexander Belopolsky a écrit :
 On Thu, Dec 2, 2010 at 1:55 PM, Antoine Pitrou solip...@pitrou.net wrote:
 ..
  I don't think so.  str.split() and str.splitlines() are also defined in
  conformance to the SPEC, AFAIK.  They certainly try to.
 
 You are joking, right?

Perhaps you could look at the implementation.





Re: [Python-Dev] PEP 384 accepted

2010-12-02 Thread Martin v. Löwis
 I was told not to touch the Distutils code, to avoid any regression,
 since it's patched to the bones in third party products. So we decided
 to freeze distutils and add all new features in Distutils2, which is
 at alpha stage now.  So this move seems contradictory to me.

I think it was a bad decision to freeze distutils, and "we" certainly
didn't make that (not any "we" that includes me, that is). This freeze
made the situation worse.

IIRC, it was really the incompatible changes that made people ask you to
stop changing distutils.

Regards,
Martin


Re: [Python-Dev] Python and the Unicode Character Database

2010-12-02 Thread Martin v. Löwis
Am 02.12.2010 22:30, schrieb Steven D'Aprano:
 Martin v. Löwis wrote:
 Then these users should speak up and indicate their need, or somebody
 should speak up and confirm that there are users who actually want
 '١٢٣٤.٥٦' to denote 1234.56. To my knowledge, there is no writing
 system in which '١٢٣٤.٥٦e4' means 12345600.0.
 I'm not sure what you're after here.

 That the current float() constructor accepts tons of bogus character
 strings and accepts them as numbers, and that it should stop doing so.
 
 What bogus characters do the float() and int() constructors accept? As
 far as I can see, they only accept numerals.

Not bogus characters, but bogus character strings. E.g. strings that mix
digits from different scripts, and mix them with the Python decimal
separator.
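
A minimal sketch of the kind of input being described, assuming a current CPython 3 interpreter. The sample strings are written with escapes so that the code points are unambiguous, and the variable names are illustrative only.

```python
# Arabic-Indic digits 1 2 3 4, an ASCII full stop, then Arabic-Indic 5 6.
arabic_indic = "\u0661\u0662\u0663\u0664.\u0665\u0666"

# ASCII 1, Arabic-Indic 2 (U+0662) and Devanagari 3 (U+0969) in one string.
mixed_scripts = "1\u0662\u0969"

plain = float(arabic_indic)                 # parsed like "1234.56"
with_exponent = float(arabic_indic + "e4")  # ASCII exponent marker is accepted too
mixed = float(mixed_scripts)                # digits from three scripts at once
as_int = int(mixed_scripts)

print(plain, with_exponent, mixed, as_int)
```

All four conversions succeed: each decimal digit is mapped to its ASCII counterpart before parsing, whichever script it comes from.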

 Notice that Python does *not* currently support printing numbers in
 other scripts - even though this may actually be more useful than
 parsing.
 
 Lack of one function, even if more useful, does not imply that an
 existing function should be removed.

No. But if the specific function(ality) is not useful and
underspecified, it should be removed.

 So your problems with the current behaviour are:
 
 (1) in some unspecified way, it's not done correctly;

No. My main concern is that it is not properly specified. If it was
specified, I could then tell you what precisely is wrong about it.
Right now, I can only give examples for input that it should not accept,
and examples of input that it should, but does not accept.

 (2) it belongs somewhere other than float() and int().

That's only because it also needs a parameter to specify what syntax to
follow, somehow. That parameter could be explicit or implicit, and it
could be to float or to some other function. But it must be available,
and is not.
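
One way such an explicit parameter could look is sketched below. This is a hypothetical helper, not an existing or proposed API: the name `parse_number` and its `zero` and `decimal_sep` parameters are assumptions made purely for illustration. It accepts digits from exactly one script, chosen by the caller, and rejects everything else; signs and exponents are deliberately left out.

```python
import unicodedata


def parse_number(text, zero="0", decimal_sep="."):
    """Parse text whose digits all come from the script of `zero`.

    Hypothetical helper: relies on the decimal digits 0-9 of a script
    occupying ten consecutive code points, starting at `zero`.
    """
    if unicodedata.decimal(zero, None) != 0:
        raise ValueError("zero must be a decimal digit with value 0")
    base = ord(zero)
    ascii_chars = []
    for ch in text:
        offset = ord(ch) - base
        if ch == decimal_sep:
            ascii_chars.append(".")
        elif 0 <= offset <= 9:
            ascii_chars.append(chr(ord("0") + offset))
        else:
            raise ValueError(f"unexpected character {ch!r}")
    # float() now only ever sees ASCII digits and at most simple separators.
    return float("".join(ascii_chars))


# Arabic-Indic digits with U+066B ARABIC DECIMAL SEPARATOR.
value = parse_number(
    "\u0661\u0662\u0663\u0664\u066b\u0665\u0666",
    zero="\u0660",
    decimal_sep="\u066b",
)
print(value)
```

With the defaults, a mixed-script string such as `"1\u0662\u0969"` raises `ValueError` instead of being silently read as 123.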

 That second is awfully close to bike-shedding. Since you accept that
 Python *should* have the current behaviour

No, I don't. I think it behaves incorrectly, accepting garbage input and
guessing some meaning out of it.

 - how the current behaviour is incorrect;

See above: it accepts strings that do not denote real numbers in any
writing system, and, despite the claim that the feature is there to
support other writing systems, actually does not truly support other
writing systems.

 - your suggestions for correcting it; and

Make the current implementation exactly match the current documentation.
I think the documentation is correct; the implementation is wrong.

 - a concrete suggestion for where you would like to see the behaviour
 moved to, and why that would be better than where it currently is.

The current behavior should go nowhere; it is not useful. Something very
similar to the current behavior (but done correctly) should go into the
locale module.

Regards,
Martin


Re: [Python-Dev] Python and the Unicode Character Database

2010-12-02 Thread Alexander Belopolsky
On Thu, Dec 2, 2010 at 4:14 PM, M.-A. Lemburg m...@egenix.com wrote:
..
 Have you tried Google ?


I tried Google and could not find any plain text or HTML file that
uses Arabic-Indic numerals.  What was interesting, though, was that a
search for quran unicode (without quotes) brought me to
http://www.sacred-texts.com which says that they've been using unicode
since 2002 in their archives.  Interestingly enough, their version of
Qur'an uses ordinary digits for ayah numbers.  See, for example
http://www.sacred-texts.com/isl/uq/050.htm.

I will change my mind on this issue when you present a
machine-readable file with Arabic-Indic numerals and a program capable
of reading it and show that this program uses the same number parsing
algorithm as Python's int() or float().


Re: [Python-Dev] PEP 384 accepted

2010-12-02 Thread Tarek Ziadé
2010/12/2 Martin v. Löwis mar...@v.loewis.de:
 I was told not to touch to Distutils code to avoid any regression
 since it's patched to the bones in third party products. So we decided
 to freeze distutils and add all new features in Distutils2, which is
 at alpha stage now.  So this move seems contradictory to me.

 I think it was a bad decision to freeze distutils, and we certainly
 didn't make that (not any we that includes me, that is).

"We" is the people at the last language summit. Sorry if I used such a
vague word.

 This freeze made the situation worse.

Can you expand on this and explain why it makes it worse?

If we (as you included) don't agree it's the best solution, I would
not want to be pushed back to square one at the next summit..

I happily reverted all my changes last year when asked, and started to
work on Distutils2. But I'll run out of steam if the direction changes
again, with you stating that it makes the situation worse.


 IIRC, it was really the incompatible changes that made people ask you to
 stop changing distutils.

Who is "people"? Are you suggesting that we could have added all the
new features in Distutils in the stdlib?

The decision was because we had a mix of:

- incompatible changes in private parts, while some packages were
  patching distutils internals
- changes to the behavior of public APIs, where that behavior was not
  clearly documented and was subject to interpretation
- some mistakes I made as well

But that's what you would expect for a project that needs to evolve a
lot. Thus the freezing.

So how would you make the situation better, if not by doing the work
in distutils2 ?

 Regards,
 Martin




-- 
Tarek Ziadé | http://ziade.org


Re: [Python-Dev] PEP 384 accepted

2010-12-02 Thread Michael Foord

On 02/12/2010 21:39, Martin v. Löwis wrote:

I was told not to touch to Distutils code to avoid any regression
since it's patched to the bones in third party products. So we decided
to freeze distutils and add all new features in Distutils2, which is
at alpha stage now.  So this move seems contradictory to me.

I think it was a bad decision to freeze distutils, and we certainly
didn't make that (not any we that includes me, that is). This freeze
made the situation worse.


What situation worse?

We certainly did ask Tarek to become BDFL of distutils and fix/improve
it (at a language summit 2 years ago). We then asked him to revert
distutils and do the work in a new package instead of inside distutils
(at the language summit this year).


I would perhaps argue for case-by-case exceptions for PEPs that
*require* distutils support and that are accepted and implemented
before distutils2 moves into the standard library. It doesn't sound
like your changes are *required* by the PEP though.


As I recall Tarek thought it was a bad idea to freeze distutils as well, 
but we insisted. :-)

IIRC, it was really the incompatible changes that made people ask you to
stop changing distutils.

Which included virtually any change to even private APIs. Given the
issues, freezing the distutils APIs except for essential bugfixes is a
reasonable response. I don't know of any situation it has made worse.
Things are getting very much better, but that is happening in
distutils2, not distutils.


All the best,

Michael Foord


Regards,
Martin





Re: [Python-Dev] Python and the Unicode Character Database

2010-12-02 Thread Mark Dickinson
On Thu, Dec 2, 2010 at 8:23 PM, Martin v. Löwis mar...@v.loewis.de wrote:
 In the case of number parsing, I think Python would be better if
 float() rejected non-ASCII strings, and any support for such parsing
 should be redone correctly in a different place (preferably along with
 printing of numbers).

+1.  The set of strings currently accepted by the float constructor
just seems too ad hoc to be at all useful.  Apart from the decimal
separator issue, and the question of exactly which decimal digits are
accepted and which aren't, there are issues like this one:

>>> x = '\uff11\uff25\uff0b\uff11\uff10'
>>> x
'１Ｅ＋１０'
>>> float(x)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'decimal' codec can't encode character '\uff25' in
position 1: invalid decimal Unicode string
>>> y = '\uff11E+\uff11\uff10'
>>> y
'１E+１０'
>>> float(y)
10000000000.0

That is, fullwidth *digits* are allowed, but none of the other
characters can be fullwidth variants.  Unfortunately, a float string
doesn't consist solely of digits, and it seems to me to make little
sense to allow variation in the digits without allowing corresponding
variations in the other characters that might appear ('.', 'e', 'E',
'+', '-').
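As a rough illustration of why the two strings are treated differently, the sketch below looks at the Unicode general category of each character in the first string. It also shows one possible workaround, NFKC normalization before parsing, which is my own suggestion and not something float() does or that anyone in this thread proposed.

```python
import unicodedata

x = "\uff11\uff25\uff0b\uff11\uff10"  # fullwidth 1, E, +, 1, 0

# Only the digits are in category Nd; the fullwidth E is an uppercase
# letter (Lu) and the fullwidth plus sign is a math symbol (Sm).
categories = [unicodedata.category(ch) for ch in x]

# NFKC folds every fullwidth form to its ASCII counterpart, so the
# normalized string parses with the plain ASCII float syntax.
normalized = unicodedata.normalize("NFKC", x)
value = float(normalized)
```

Normalizing first treats all the characters of the literal uniformly, at the cost of also folding many other compatibility characters that a caller may not want accepted.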

A couple of slightly trickier decisions: (1) the float constructor
currently does accept leading and trailing whitespace;  should it
allow any Unicode whitespace characters here? I'd say yes. (2) For
int() rather than float(), there's a bit more value in allowing the
variant digits, since it provides an easy way to interpret those
digits.  The decimal module currently makes use of this, for example
(the decimal spec requires that non-European digits be accepted).  I'd
be happier if this functionality were moved elsewhere, though.  The
int constructor is, if anything, currently worse off than float,
thanks to its attempts to support non-decimal bases.

There's value in having an easy-to-specify, easy-to-maintain API for
these basic builtin functions.  For one thing, it helps non-CPython
implementations.

[MAL]
 The Python 3.x docs apparently
 introduced a reference to the language spec which is clearly not
 capturing the wealth of possible inputs.

That documentation update was my fault;  I was motivated to make the
update by issues unrelated to this one (mostly to do with Python 3's
more consistent handling of inf and nan, as a result of all the new
float-string conversion code).  If I'd been thinking harder, I would
have remembered that float accepted the non-European digits and added
a note to that effect.  This (unintentional) omission does underline
the point that it's difficult right now to document and understand
exactly what the float constructor does or doesn't accept.

Mark


Re: [Python-Dev] Python and the Unicode Character Database

2010-12-02 Thread Eric Smith

On 12/2/2010 4:48 PM, Martin v. Löwis wrote:

Am 02.12.2010 22:30, schrieb Steven D'Aprano:

Martin v. Löwis wrote:

Then these users should speak up and indicate their need, or somebody
should speak up and confirm that there are users who actually want
'١٢٣٤.٥٦' to denote 1234.56. To my knowledge, there is no writing
system in which '١٢٣٤.٥٦e4' means 12345600.0.

I'm not sure what you're after here.


That the current float() constructor accepts tons of bogus character
strings and accepts them as numbers, and that it should stop doing so.


What bogus characters do the float() and int() constructors accept? As
far as I can see, they only accepts numerals.


Not bogus characters, but bogus character strings. E.g. strings that mix
digits from different scripts, and mix them with the Python decimal
separator.


Notice that Python does *not* currently support printing numbers in
other scripts - even though this may actually be more useful than
parsing.


Lack of one function, even if more useful, does not imply that an
existing function should be removed.


No. But if the specific function(ality) is not useful and
underspecified, it should be removed.


So your problems with the current behaviour are:

(1) in some unspecified way, it's not done correctly;


No. My main concern is that it is not properly specified. If it was
specified, I could then tell you what precisely is wrong about it.
Right now, I can only give examples for input that it should not accept,
and examples of input that it should, but does not accept.


(2) it belongs somewhere other than float() and int().


That's only because it also needs a parameter to specify what syntax to
follow, somehow. That parameter could be explicit or implicit, and it
could be to float or to some other function. But it must be available,
and is not.


That second is awfully close to bike-shedding. Since you accept that
Python *should* have the current behaviour


No, I don't. I think it behaves incorrectly, accepting garbage input and
guessing some meaning out of it.


- how the current behaviour is incorrect;


See above: it accepts strings that do not denote real numbers in any
writing system, and, despite the claim that the feature is there to
support other writing systems, actually does not truly support other
writing systems.


- your suggestions for correcting it; and


Make the current implementation exactly match the current documentation.
I think the documentation is correct; the implementation is wrong.


- a concrete suggestion for where you would like to see the behaviour
moved to, and why that would be better than where it currently is.


The current behavior should go nowhere; it is not useful. Something very
similar to the current behavior (but done correctly) should go into the
locale module.


I agree with everything Martin says here. I think the basic premise is: 
you won't find strings in the wild that use non-ASCII digits but do 
use the ASCII dot as a decimal point. And that's what float() is looking 
for. (And that doesn't even begin to address what it expects for an 
exponent 'e'.)
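A small sketch of that premise, assuming the float() behaviour discussed in this thread (which, as far as I know, recent CPython 3 releases still have): Arabic-Indic digits combined with an ASCII dot are accepted, while the same digits with the script's own decimal separator are rejected.

```python
whole = "\u0661\u0662\u0663\u0664"  # ARABIC-INDIC digits 1 2 3 4
frac = "\u0665\u0666"               # ARABIC-INDIC digits 5 6

mixed = whole + "." + frac          # ASCII dot as the decimal point
native = whole + "\u066b" + frac    # U+066B ARABIC DECIMAL SEPARATOR

accepted = float(mixed)             # parsed as if the digits were ASCII

try:
    float(native)
    native_ok = True
except ValueError:
    native_ok = False               # the native separator is not understood

with_exponent = float(mixed + "e4") # ASCII 'e' exponent is accepted too
```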


Eric.


Re: [Python-Dev] PEP 384 accepted

2010-12-02 Thread Martin v. Löwis
 This freeze made the situation worse.
 
 Can you extend on this and explains why it makes it worse ?

Before the freeze, distutils was unmaintained (i.e. before you started
maintaining it), but people who wanted to improve it gradually at least
could. Now gradual improvements are also banned, so it's not only
unmaintained, but I can't even provide support for the PEP in Python
that was just accepted.

 IIRC, it was really the incompatible changes that made people ask you to
 stop changing distutils.
 
 Who is people ? Are you suggesting that we could have added all the
 new features in Distutils in the stdlib ?

No, only the ones that didn't cause backwards incompatibilities
or break existing packages.

 But that's what you would expect for a project that needs to evolve a
 lot. Thus the freezing.

Instead of "evolving a lot", and instead of freezing, I would have
preferred "evolve a little".

 So how would you make the situation better, if not by doing the work
 in distutils2 ?

Lift the freeze. I'm all for replacing distutils with distutils2, but
I'm not sure whether you will declare distutils2 ready tomorrow, next
year, or ten years from now.

Regards,
Martin


Re: [Python-Dev] PEP 384 accepted

2010-12-02 Thread Martin v. Löwis
Am 02.12.2010 22:54, schrieb Michael Foord:
 On 02/12/2010 21:39, Martin v. Löwis wrote:
 I was told not to touch to Distutils code to avoid any regression
 since it's patched to the bones in third party products. So we decided
 to freeze distutils and add all new features in Distutils2, which is
 at alpha stage now.  So this move seems contradictory to me.
 I think it was a bad decision to freeze distutils, and we certainly
 didn't make that (not any we that includes me, that is). This freeze
 made the situation worse.
 
 What situation worse?

The "distutils is unmaintained" situation. It's not only unmaintained
now, but proposed improvements are rejected without consideration, on
the grounds that they are changes.

 I would perhaps argue for a case by case exception on PEPs that
 *required* distutils support that are being accepted and implemented
 prior to distutils2 moving into the standard library. It doesn't sound
 like your changes are *required* by the PEP though.

Well, the PEP 384 text in PEP 3149 specifies a change. It's not clear
whether this change was accepted when PEP 3149 was accepted, or whether
it was accepted when PEP 384 was accepted, or whether it was not
accepted at all, or whether it was just proposed.

In any case, without the change, you won't naturally get extension
modules that use the abi3 tag proposed in 3149.

Regards,
Martin


Re: [Python-Dev] PEP 384 accepted

2010-12-02 Thread Tarek Ziadé
2010/12/2 Martin v. Löwis mar...@v.loewis.de:
 Am 02.12.2010 22:54, schrieb Michael Foord:
 On 02/12/2010 21:39, Martin v. Löwis wrote:
 I was told not to touch to Distutils code to avoid any regression
 since it's patched to the bones in third party products. So we decided
 to freeze distutils and add all new features in Distutils2, which is
 at alpha stage now.  So this move seems contradictory to me.
 I think it was a bad decision to freeze distutils, and we certainly
 didn't make that (not any we that includes me, that is). This freeze
 made the situation worse.

 What situation worse?

 The distutils is unmaintained situation. It's not only unmaintained
 now, but proposed improvements are rejected without consideration, on
 the grounds that they are changes.

I welcome those changes in Distutils2. That's the whole point.


Re: [Python-Dev] PEP 384 accepted

2010-12-02 Thread Martin v. Löwis
 The distutils is unmaintained situation. It's not only unmaintained
 now, but proposed improvements are rejected without consideration, on
 the grounds that they are changes.
 
 I welcome those changes in Distutils2. That's the whole point.

That would be useful if there was a clear vision of when distutils2
will be released. Please understand that I'm not blaming you for not
releasing it (it *is* too much for a single person), but please
understand that it's also not helpful to submit changes to a codebase
that is not going to be released in a foreseeable future.

Regards,
Martin


Re: [Python-Dev] Python and the Unicode Character Database

2010-12-02 Thread M.-A. Lemburg
Eric Smith wrote:
 The current behavior should go nowhere; it is not useful. Something very
 similar to the current behavior (but done correctly) should go into the
 locale module.
 
 I agree with everything Martin says here. I think the basic premise is:
 you won't find strings in the wild that use non-ASCII digits but do
 use the ASCII dot as a decimal point. And that's what float() is looking
 for. (And that doesn't even begin to address what it expects for an
 exponent 'e'.)

http://en.wikipedia.org/wiki/Decimal_mark

In China, comma and space are used to mark digit groups because dot is used as 
decimal mark.

Note that float() can also parse integers, it just returns them as
floats :-)

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] PEP 384 accepted

2010-12-02 Thread Tarek Ziadé
2010/12/2 Martin v. Löwis mar...@v.loewis.de:
 This freeze made the situation worse.

 Can you extend on this and explains why it makes it worse ?

 Before the freeze, distutils was unmaintained (i.e. before you started
 maintaining it), but people who want to improve it gradually atleast
 could. Now gradual improvements are also banned, so it's not only
 unmaintained, but I can't even provide support for the PEP in Python
 that was just accepted.

 IIRC, it was really the incompatible changes that made people ask you to
 stop changing distutils.

 Who is people ? Are you suggesting that we could have added all the
 new features in Distutils in the stdlib ?

 No, only the ones that didn't cause backwards incompatibilities,
 and broke existing packages.

This is impossible. I can point you to some third-party projects that
can break if you touch some distutils internals, like setuptools.
Setuptools also uses some private global variables in some other
modules in the stdlib, FYI.

The right answer was maybe back then: make setuptools and other
projects evolve with distutils.

But it did not happen. So we left the status quo and moved forward in
distutils2. Because we knew distutils needed deeper changes anyways,
and we knew setuptools was used everywhere and unfortunately not
evolving at the same pace. (note: I am not blaming PJE or anyone when
I say this -- the way distutils worked and was poorly maintained was
the main reason)


 But that's what you would expect for a project that needs to evolve a
 lot. Thus the freezing.

 Instead of evolving a lot, and instead of freezing, I would have
 preferred evolve a little.

 So how would you make the situation better, if not by doing the work
 in distutils2 ?

 Lift the freeze. I'm all for replacing distutils with distutils2, but
 I'm not sure whether you will declare distutils2 ready tomorrow, next
 year, or ten years from now.

Depends on what "ready" means.

If by "ready" you mean it can be used to replace Distutils1 in a
project, I declare Distutils2 ready for usage NOW.  It's in alpha
stage. I want a solid beta before PyCon.

I would even remove Distutils from 3.x altogether at some point, since
setuptools is not Python 3 compatible, and just put distutils2 in.

3.3 sounds like a good target.

Regards
Tarek

-- 
Tarek Ziadé | http://ziade.org


Re: [Python-Dev] PEP 384 accepted

2010-12-02 Thread Antoine Pitrou
On Thu, 02 Dec 2010 23:21:25 +0100
Martin v. Löwis mar...@v.loewis.de wrote:
 Am 02.12.2010 22:54, schrieb Michael Foord:
  On 02/12/2010 21:39, Martin v. Löwis wrote:
  I was told not to touch to Distutils code to avoid any regression
  since it's patched to the bones in third party products. So we decided
  to freeze distutils and add all new features in Distutils2, which is
  at alpha stage now.  So this move seems contradictory to me.
  I think it was a bad decision to freeze distutils, and we certainly
  didn't make that (not any we that includes me, that is). This freeze
  made the situation worse.
  
  What situation worse?
 
 The distutils is unmaintained situation. It's not only unmaintained
 now, but proposed improvements are rejected without consideration, on
 the grounds that they are changes.

I think distutils is simply a bugfix branch for distutils2. Just as we
don't commit improvements to e.g. 2.7 or 3.1, we don't commit
improvements to distutils.

(and I think that's how Guido wanted it anyway)

Regards

Antoine.




Re: [Python-Dev] PEP 384 accepted

2010-12-02 Thread Tarek Ziadé
2010/12/2 Martin v. Löwis mar...@v.loewis.de:
 The distutils is unmaintained situation. It's not only unmaintained
 now, but proposed improvements are rejected without consideration, on
 the grounds that they are changes.

 I welcome those changes in Distutils2. That's the whole point.

 That would be useful if there was a clear vision of when distutils2
 will be released.   Please understand that I'm not blaming you for not
 releasing it (it *is* too much for a single person), but please
 understand that it's also not helpful to submit changes to a codebase
 that is not going to be released in a foreseeable future.

I know you're not blaming me.

Distutils 2 alpha3 is currently released and available at PyPI. I use
it in some of my professional projects FWIW.

alpha4 was postponed but should be out this month. It contains major
features that people from GSoC worked on.

The initial roadmap was to have a final by the time 3.2 final is out,
but that'll be too short. So the target is to have a beta release for
Pycon, and to sync the final release with 3.3, with lots of feedback
in the meantime hopefully, and people using it from 2.4 onward.



 Regards,
 Martin




-- 
Tarek Ziadé | http://ziade.org


Re: [Python-Dev] PEP 384 accepted

2010-12-02 Thread Barry Warsaw
On Dec 02, 2010, at 11:21 PM, Martin v. Löwis wrote:

Well, the PEP 384 text in PEP 3149 specifies a change. It's not clear
whether this change was accepted when PEP 3149 was accepted, or whether
it was accepted when PEP 384 was accepted, or whether it was not
accepted at all, or whether it was just proposed.

From my point of view, the PEP 3149 text is just a proposal.  It leaves the
final decision to PEP 384, but tries to address some of the issues raised
during the PEP 3149 discussion.  I think it is within PEP 384's scope to make
the final decisions about it.

In any case, without the change, you won't naturally get extension
modules that use the abi3 tag proposed in 3149.

I would favor changing distutils, if it can be done in a way that reasonably
preserves backward compatibility.  I suppose it's impossible to know all the
ways 3rd party code has reached into distutils, but I think you can make
fairly good judgements about whether a change is backward compatible or not.

-Barry




Re: [Python-Dev] PEP 384 accepted

2010-12-02 Thread Martin v. Löwis
 No, only the ones that didn't cause backwards incompatibilities,
 and broke existing packages.
 
 This is impossible. I can point you to some third party project that
 can break if you touch some distutils internals, like setuptools.
 Setuptools also uses some privates global variables in some other
 modules in the stdlib FYI.

So what would break if Extension accepted an abi= keyword parameter?
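For illustration only, here is roughly what an additive, opt-in keyword could look like. The `Extension` class below is a simplified stand-in written for this sketch, not the real distutils class, and both the `abi` parameter and the `ext_filename` helper are hypothetical. The point is that callers who never pass the new keyword get exactly the old result.

```python
class Extension:
    """Simplified stand-in for distutils' Extension (not the real class)."""

    def __init__(self, name, sources, abi=None):
        self.name = name
        self.sources = list(sources)
        # Hypothetical new keyword: None keeps the pre-existing behaviour.
        self.abi = abi


def ext_filename(ext, suffix=".so"):
    """Build the output file name, adding an ABI tag only when requested."""
    if ext.abi is None:
        return ext.name + suffix
    return f"{ext.name}.{ext.abi}{suffix}"


old_style = ext_filename(Extension("spam", ["spam.c"]))
tagged = ext_filename(Extension("spam", ["spam.c"], abi="abi3"))
```

Whether third-party code that subclasses or monkeypatches the real class would cope with such a change is exactly the question being argued here; the sketch cannot answer that.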

 Lift the freeze. I'm all for replacing distutils with distutils2, but
 I'm not sure whether you will declare distutils2 ready tomorrow, next
 year, or ten years from now.
 
 Depends on what ready means.

Included in Python, so that changes become possible again.

 If by ready you mean it can be used to replace Distutils1 in a
 project, I declare Distutils2 ready for usage NOW.  It's in alpha
 stage. I want a solid beta before Pycon.
 
 I would even remove Distutils from 3.x altogether at some point since
 setuptools is not Python 3 compatible, and just put distutils2.
 
 3.3 sounds like a good target.

So will distutils2 be released before that? If so, when?

Regards,
Martin


Re: [Python-Dev] Python and the Unicode Character Database

2010-12-02 Thread M.-A. Lemburg
Alexander Belopolsky wrote:
 On Thu, Dec 2, 2010 at 4:14 PM, M.-A. Lemburg m...@egenix.com wrote:
 ..
 Have you tried Google ?

 
 I tried Google and could not find any plain text or HTML file that
 uses Arabic-Indic numerals.  What was interesting, though, was that a
 search for quran unicode (without quotes) brought me to
 http://www.sacred-texts.com which says that they've been using unicode
 since 2002 in their archives.  Interestingly enough, their version of
 Qur'an uses ordinary digits for ayah numbers.  See, for example
 http://www.sacred-texts.com/isl/uq/050.htm.
 
 I will change my mind on this issue when you present a
 machine-readable file with Arabic-Indic numerals and a program capable
 of reading it and show that this program uses the same number parsing
 algorithm as Python's int() or float().

Have you had a look at the examples I posted ? They include texts
and tables with numbers written using east asian arabic numerals.

Here's an example of a famous Chinese text using Chinese numerals:

http://ctext.org/nine-chapters

Unfortunately, the Chinese numerals are not listed in the category Nd,
so Python won't be able to parse them. This has various reasons, it
seems, one of them being that the numeral code points were not defined
as a range of code points.
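The difference can be checked with the `unicodedata` module. This is a small sketch with a helper name (`describe`) of my own choosing: it reports a character's general category, its numeric value if the database records one, and whether int() accepts it.

```python
import unicodedata


def describe(ch):
    """Return (general category, numeric value or None, accepted by int()?)."""
    try:
        int(ch)
        accepted = True
    except ValueError:
        accepted = False
    return unicodedata.category(ch), unicodedata.numeric(ch, None), accepted


arabic_indic_five = describe("\u0665")  # ARABIC-INDIC DIGIT FIVE
chinese_five = describe("\u4e94")       # CJK ideograph meaning five
```

The Arabic-Indic digit is in category Nd and is accepted by int(); the CJK ideograph is classified as a letter (Lo), so int() rejects it regardless of any numeric value recorded for it.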

I'm sure you can find other books on mathematics in Sanskrit or
Arabic scripts as well.

But this whole branch of the discussion is not going to go anywhere.

The point is that we support all of Unicode in Python, not just a fragment,
and therefore the numeric constructors support all of Unicode.

Using them, it's very easy to support numbers in all kinds of variants,
whether bound to a locale or not.

Adding more locale aware numeric parsers and formatters to the
locale module, based on these APIs is certainly a good idea,
but orthogonal to the ongoing discussion, IMO.

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Python and the Unicode Character Database

2010-12-02 Thread M.-A. Lemburg
Terry Reedy wrote:
 On 11/29/2010 10:19 AM, M.-A. Lemburg wrote:
 Nick Coghlan wrote:
 On Mon, Nov 29, 2010 at 9:02 PM, M.-A. Lemburg m...@egenix.com wrote:
 If we would go down that road, we would also have to disable other
 Unicode features based on locale, e.g. whether to apply non-ASCII
 case mappings, what to consider whitespace, etc.

 We don't do that for a good reason: Unicode is supposed to be
 universal and not limited to a single locale.

 Because parsing numbers is about more than just the characters used
 for the individual digits. There are additional semantics associated
 with digit ordering (for any number) and decimal separators and
 exponential notation (for floating point numbers) and those vary by
 locale. We deliberately chose to make the builtin numeric parsers
 unaware of all of those things, and assuming that we can simply parse
 other digits as if they were their ASCII equivalents and otherwise
 assume a C locale seems questionable.

 Sure, and those additional semantics are locale dependent, even
 between ASCII-only locales. However, that does not apply to the
 basic building blocks, the decimal digits themselves.

 If the existing semantics can be adequately defined, documented and
 defended, then retaining them would be fine. However, the language
 reference needs to define the behaviour properly so that other
 implementations know what they need to support and what can be chalked
 up as being just an implementation accident of CPython. (As a point in
 the plus column, both decimal.Decimal and fractions.Fraction were able
 to handle the '١٢٣٤.٥٦' example in a manner consistent with the int
 and float handling)

 The support is built into the C API, so there's not really much
 surprise there.

 Regarding documentation, we'd just have to add that numbers may
 be made up of an Unicode code point in the category Nd.

 See http://www.unicode.org/versions/Unicode5.2.0/ch04.pdf, section
 4.6 for details

 
 Decimal digits form a large subcategory of numbers consisting of those
 digits that can be used to form decimal-radix numbers. They include
 script-specific digits, but exclude characters such as Roman numerals
 and Greek acrophonic numerals. (Note that <1, 5> = 15 = fifteen, but
 <I, V> = IV = four.) Decimal digits also exclude the compatibility
 subscript or superscript digits to prevent simplistic parsers from
 misinterpreting their values in context.
 

 int(), float() and long() (in Python2) are such simplistic
 parsers.
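
To make the rule quoted above concrete, here is a minimal sketch of what
"category Nd" means for int() and float() on CPython 3. It is only an
illustration, not part of any proposed doc patch; the sample strings and
the helper name is_rejected are my own assumptions.

```python
import unicodedata

# Arabic-Indic digits one to six (U+0661..U+0666), written as escapes
# to avoid bidirectional-text surprises in editors.
integer_text = "\u0661\u0662\u0663\u0664"
float_text = "\u0661\u0662\u0663\u0664.\u0665\u0666"

# Every character in integer_text is in general category Nd.
assert all(unicodedata.category(ch) == "Nd" for ch in integer_text)

parsed_int = int(integer_text)
parsed_float = float(float_text)


def is_rejected(text):
    """Return True if int() refuses to parse the text."""
    try:
        int(text)
    except ValueError:
        return True
    return False


# SUPERSCRIPT TWO is category No and ROMAN NUMERAL FOUR is category Nl,
# so neither is treated as a decimal digit.
superscript_rejected = is_rejected("\u00b2")
roman_rejected = is_rejected("\u2163")
```

Note that the float example still needs the ASCII dot as the decimal
mark; only the digits themselves may come from another script.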
 
 Since you are the knowledgeable advocate of the current behavior, perhaps
 you could open an issue and propose a doc patch, even if not .rst
 formatted.

Good suggestion. I tried to collect as much context as possible:

http://bugs.python.org/issue10610

I'll leave the rst-magic to someone else, but will certainly help
if you have more questions about the details.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Dec 02 2010)
 Python/Zope Consulting and Support ...http://www.egenix.com/
 mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/
 mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/


::: Try our new mxODBC.Connect Python Database Interface for free ! 


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
   Registered at Amtsgericht Duesseldorf: HRB 46611
   http://www.egenix.com/company/contact/


Re: [Python-Dev] PEP 384 accepted

2010-12-02 Thread Martin v. Löwis
 I think distutils is simply a bugfix branch for distutils2. Similarly
 as how we don't commit improvements in e.g. 2.7 or 3.1, neither do we
 commit improvements to distutils.

It's different, though, in the sense that Python has a release schedule
and multiple committers working on it, and that it normally gets
released even if some changes don't get included in a specific release
yet.

All this seems not to be true for distutils2. So my motivation to
contribute changes to it is *much* lower than my desire to contribute
to distutils, and it is also provably lower than my motivation to
contribute to distribute (say). I'm just getting tired of having to talk to
five projects just to make a single change to the build infrastructure
available to the Python community.

Regards,
Martin


Re: [Python-Dev] Python and the Unicode Character Database

2010-12-02 Thread Alexander Belopolsky
On Thu, Dec 2, 2010 at 5:58 PM, M.-A. Lemburg m...@egenix.com wrote:
..
 I will change my mind on this issue when you present a
 machine-readable file with Arabic-Indic numerals and a program capable
 of reading it and show that this program uses the same number parsing
 algorithm as Python's int() or float().

 Have you had a look at the examples I posted ? They include texts
 and tables with numbers written using east asian arabic numerals.

Yes, but this was all about output.  I am pretty sure TeX was able to
typeset the Qur'an in all its glory long before Unicode was invented.
Yet, in machine-readable form it would be something like {\quran 1}
(invented directive).  I have asked for a file that is intended for
machine processing, not for human enjoyment in print or on a display.
I claim that if such a file exists, the program that reads it does not
use the same rules as Python, and converting non-ASCII digits would be
a tiny portion of what that program does.


Re: [Python-Dev] Python and the Unicode Character Database

2010-12-02 Thread Martin v. Löwis
Am 02.12.2010 23:43, schrieb M.-A. Lemburg:
 Eric Smith wrote:
 The current behavior should go nowhere; it is not useful. Something very
 similar to the current behavior (but done correctly) should go into the
 locale module.

 I agree with everything Martin says here. I think the basic premise is:
 you won't find strings in the wild that use non-ASCII digits but do
 use the ASCII dot as a decimal point. And that's what float() is looking
 for. (And that doesn't even begin to address what it expects for an
 exponent 'e'.)
 
 http://en.wikipedia.org/wiki/Decimal_mark
 
 In China, comma and space are used to mark digit groups because dot is used 
 as decimal mark.

I may be misinterpreting that, but I think that refers to the case of
writing numbers using Arabic digits.

Chinese digits are, e.g., used in the Suzhou numerals

http://en.wikipedia.org/wiki/Suzhou_numerals

This doesn't have a decimal point at all. Instead, the second line
(below or to the left of the actual digits) describes the power of ten
and the unit of measurement (i.e. similar to scientific notation,
but with ideographs for the powers of ten).

In another writing system, they use 点 (U+70B9) as the decimal
separator, see

http://en.wikipedia.org/wiki/Chinese_numerals#Fractional_values

In the same system, the integral part uses multipliers, i.e.
12345 is [1][10000][2][1000][3][100][4][10][5]; the fractional
part uses regular digits.
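
As a rough illustration of the multiplier scheme for the integral part,
here is a small sketch. The function name and the two lookup tables are
my own assumptions, and it is deliberately strict: it only accepts a
single digit in front of each multiplier, which is the pattern described
above, and ignores everyday shortcuts and larger units.

```python
# Ordinary Chinese digits and the multipliers up to ten thousand.
DIGITS = {"一": 1, "二": 2, "三": 3, "四": 4, "五": 5,
          "六": 6, "七": 7, "八": 8, "九": 9}
MULTIPLIERS = {"十": 10, "百": 100, "千": 1000, "万": 10000}


def parse_multiplier_integer(text):
    """Parse digit/multiplier pairs such as [1][10000][2][1000]..."""
    if not text:
        raise ValueError("empty numeral")
    total = 0
    pending = None  # digit waiting for its multiplier
    for ch in text:
        if ch in DIGITS:
            if pending is not None:
                raise ValueError("two digits in a row: %r" % text)
            pending = DIGITS[ch]
        elif ch in MULTIPLIERS:
            if pending is None:
                raise ValueError("multiplier without digit: %r" % text)
            total += pending * MULTIPLIERS[ch]
            pending = None
        else:
            raise ValueError("unexpected character %r" % ch)
    # A trailing digit with no multiplier is the units place.
    return total + (pending or 0)


value = parse_multiplier_integer("一万二千三百四十五")
```

Nothing like this is what int() does today, which is the point: the
parsing rules depend on the writing system, not just on the digits.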

Regards,
Martin



Re: [Python-Dev] Python and the Unicode Character Database

2010-12-02 Thread Eric Smith

On 12/2/2010 5:43 PM, M.-A. Lemburg wrote:

Eric Smith wrote:

The current behavior should go nowhere; it is not useful. Something very
similar to the current behavior (but done correctly) should go into the
locale module.


I agree with everything Martin says here. I think the basic premise is:
you won't find strings in the wild that use non-ASCII digits but do
use the ASCII dot as a decimal point. And that's what float() is looking
for. (And that doesn't even begin to address what it expects for an
exponent 'e'.)


http://en.wikipedia.org/wiki/Decimal_mark

In China, comma and space are used to mark digit groups because dot is used as 
decimal mark.


Is that an ASCII dot? That page doesn't say.


Note that float() can also parse integers, it just returns them as
floats :-)


:)




Re: [Python-Dev] PEP 384 accepted

2010-12-02 Thread Michael Foord

On 02/12/2010 23:01, Martin v. Löwis wrote:

[snip...]

I'm just getting tired having to talk to
five projects just to make a single change to the build infrastructure
available to the Python community.



The very best hope of resolving that particular problem is distutils2. :-)

distutils2 is *already* available to the Python community, and whether 
or not there is a fixed release date it will have betas and then a 1.0 
release in the foreseeable future. The team working on it has made an 
enormous amount of progress. We're much better off as a development 
community putting our support and energy into distutils2 rather than 
pining for evolution of distutils.


Michael


Regards,
Martin



--

http://www.voidspace.org.uk/

READ CAREFULLY. By accepting and reading this email you agree,
on behalf of your employer, to release me from all obligations
and waivers arising from any and all NON-NEGOTIATED agreements,
licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap,
confidentiality, non-disclosure, non-compete and acceptable use
policies (”BOGUS AGREEMENTS”) that I have entered into with your
employer, its partners, licensors, agents and assigns, in
perpetuity, without prejudice to my ongoing rights and privileges.
You further represent that you have the authority to release me
from any BOGUS AGREEMENTS on behalf of your employer.



Re: [Python-Dev] PEP 384 accepted

2010-12-02 Thread Tarek Ziadé
2010/12/2 Martin v. Löwis mar...@v.loewis.de:
 No, only the ones that didn't cause backwards incompatibilities,
 and broke existing packages.

 This is impossible. I can point you to some third party project that
 can break if you touch some distutils internals, like setuptools.
 Setuptools also uses some private global variables in some other
 modules in the stdlib FYI.

 So what would break if Extension accepted an abi= keyword parameter?

I suppose you have code behind this, that will be in build_ext and in
the compilers. So you will need to try out ALL projects out there that
customize build_ext, like numpy or setuptools, etc. But you won't be
able to try out all projects because they are not listed anywhere.

For starters, the Extension class is replaced by another one in
setuptools, that patches the constructor if Pyrex is installed,
which is unlikely I guess, so no big deal. But you will also get a
replaced version of the Distribution class that uses a private method
from distutils, and another version of build_ext with custom compiling
flags.

Now depending on how you do your thing it could work if you are
careful about doing things on top of setuptools.

And then, if numpy.distutils is installed, it relies on the distutils
build_ext and tries to rely on the setuptools one too, so it gets in the
mix of the patched classes, and you get a horrible mix and possible
bad interactions.

So I am not saying it's impossible to add the feature, but it is
impossible to be sure nothing gets broken in third-party projects.

So the freeze seems wise indeed.

 Lift the freeze. I'm all for replacing distutils with distutils2, but
 I'm not sure whether you will declare distutils2 ready tomorrow, next
 year, or ten years from now.

 Depends on what ready means.

 Included in Python, so that changes become possible again.

 If by ready you mean it can be used to replace Distutils1 in a
 project, I declare Distutils2 ready for usage NOW.  It's in alpha
 stage. I want a solid beta before Pycon.

 I would even remove Distutils from 3.x altogether at some point since
 setuptools is not Python 3 compatible, and just put distutils2.

 3.3 sounds like a good target.

 So will distutils2 be released before that? If so, when?

An alpha is already released. A beta will be released for Pycon (I
need it for my talk :) )   Then hopefully the final before 3.2


 Regards,
 Martin




-- 
Tarek Ziadé | http://ziade.org


Re: [Python-Dev] Python and the Unicode Character Database

2010-12-02 Thread Martin v. Löwis
 The point is that we support all of Unicode in Python, not just a fragment,
 and therefore the numeric constructors support all of Unicode.

That conclusion is as false today as it was in Python 1.6, but only now
are people starting to care about it.

a) we don't support all of Unicode in numeric constructors. There are
   lots of things that you can write down that readers would recognize
   as a real/rational/integral number that float() won't parse.
b) if float() restricted itself to the scientific notation of
   real numbers (as it should), Python could well continue to claim
   support for all of Unicode.

 Adding more locale aware numeric parsers and formatters to the
 locale module, based on these APIs is certainly a good idea,
 but orthogonal to the ongoing discussion, IMO.

Not at all. The concept of Unicode numbers is flawed: Unicode does
*not* prescribe any specific way to denote numbers. Unicode is about
characters, and Python supports the Unicode characters for digits as
well as it supports all the other Unicode characters.

Instead, support for non-scientific notation of real numbers should
be based on user needs, which probably can be approximated by looking
at actual scripts. This, in turn, is inherently locale-dependent.

Regards,
Martin



Re: [Python-Dev] Python and the Unicode Character Database

2010-12-02 Thread M.-A. Lemburg
Eric Smith wrote:
 On 12/2/2010 5:43 PM, M.-A. Lemburg wrote:
 Eric Smith wrote:
 The current behavior should go nowhere; it is not useful. Something
 very
 similar to the current behavior (but done correctly) should go into the
 locale module.

 I agree with everything Martin says here. I think the basic premise is:
 you won't find strings in the wild that use non-ASCII digits but do
 use the ASCII dot as a decimal point. And that's what float() is looking
 for. (And that doesn't even begin to address what it expects for an
 exponent 'e'.)

 http://en.wikipedia.org/wiki/Decimal_mark

 In China, comma and space are used to mark digit groups because dot
 is used as decimal mark.
 
 Is that an ASCII dot? That page doesn't say.

Yes, but to be fair: I think that the page actually refers to the
use of the Arabic numeral format in China, rather than with their
own script symbols.

 Note that float() can also parse integers, it just returns them as
 floats :-)
 
 :)

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] PEP 384 accepted

2010-12-02 Thread Martin v. Löwis
 From my point of view, the PEP 3149 text is just a proposal.  It leaves the
 final decision to PEP 384, but tries to address some of the issues raised
 during the PEP 3149 discussion.  I think it is within PEP 384's scope to make
 the final decisions about it.

Ok, then it looks like there just won't be any support for module
tagging of ABI-conforming modules. It might be possible to support
something like this in the import code, but I would consider this
pointless without accompanying distutils support.

Then, by default, the modules just use the ABI tag that distutils
assigns to them by default. It's interesting to note that #9807
got into distutils despite it being frozen (but this is not about
ABI tags, right - so does distutils in 3.2 actually assign any
ABI tag at all?)

 I would favor changing distutils, if it can be done in a way that reasonably
 preserves backward compatibility.

It seems this is right out for policy reasons.

Regards,
Martin


Re: [Python-Dev] PEP 384 accepted

2010-12-02 Thread Martin v. Löwis
 An alpha is already released. A beta will be released for Pycon (I
 need it for my talk :) )   Then hopefully the final before 3.2

Ok, that's promising.

Regards,
Martin


Re: [Python-Dev] PEP 384 accepted

2010-12-02 Thread Tarek Ziadé
On Fri, Dec 3, 2010 at 12:01 AM, Martin v. Löwis mar...@v.loewis.de wrote:
 I think distutils is simply a bugfix branch for distutils2. Similarly
 as how we don't commit improvements in e.g. 2.7 or 3.1, neither do we
 commit improvements to distutils.

 It's different, though, in the sense that Python has a release schedule
 and multiple committers working on it, and that it normally gets
 released even if some changes don't get included in a specific release
 yet.

 All this seems not to be true for distutils2.

We have 3 or 4 regular contributors. That's not a lot for sure.

 So my motivation to
 contribute changes to it is *much* lower than my desire to contribute
 to distutils, and it is also provably lower than my motivation to
 contribute to distribute (say). I'm just getting tired having to talk to
 five projects just to make a single change to the build infrastructure
 available to the Python community.

I am not trying to motivate you to contribute to Distutils2. I am
trying to make sure we are all on the same page for what's good for
Python.

So if we work in Distutils2 and you work in Distutils while saying publicly
that you don't want to contribute to Distutils2, that's total
nonsense.

We took some decisions, and you want to go against them.  So I want to
have a consensus here for the packaging eco-system and make sure we
are still on track.

I am sorry if you get tired of it, but I don't want to be told at the
next summit: "sorry Tarek, now we need to make changes little by little
in distutils1".


Re: [Python-Dev] PEP 384 accepted

2010-12-02 Thread Amaury Forgeot d'Arc
Hi,

2010/12/3 Michael Foord fuzzy...@voidspace.org.uk

 On 02/12/2010 23:01, Martin v. Löwis wrote:

 [snip...]

 I'm just getting tired having to talk to
 five projects just to make a single change to the build infrastructure
 available to the Python community.


 The very best hope of resolving that particular problem is distutils2. :-)

 distutils2 is *already* available to the Python community, and whether or
 not there is a fixed release date it will have betas and then a 1.0 release
 in the foreseeable future. The team working on it has made an enormous
 amount of progress. We're much better off as a development community putting
 our support and energy into distutils2 rather than pining for evolution of
 distutils.


Sure. But today (before 3.2b1) we want to merge PEP3149 and PEP384;
they change the paths and filenames used by python.
Either we modify distutils to comply with the new names,
or defer these PEPs until distutils2 is ready.

-- 
Amaury Forgeot d'Arc


Re: [Python-Dev] Python and the Unicode Character Database

2010-12-02 Thread Alexander Belopolsky
On Thu, Dec 2, 2010 at 4:14 PM, M.-A. Lemburg m...@egenix.com wrote:
..
 Some examples:

 http://www.bdl.gov.lb/circ/intpdf/int123.pdf

I looked at this one more closely.  While I cannot understand what it
says, it appears that Arabic numerals are used in dates.  It looks
like Python won't be able to deal with those:

>>> datetime.strptime('١٩٩٩/١٠/٢٩', '%Y/%m/%d')
...
ValueError: time data '١٩٩٩/١٠/٢٩' does not match format '%Y/%m/%d'

Interestingly,

>>> datetime.strptime('١٩٩٩', '%Y')
datetime.datetime(1999, 1, 1, 0, 0)

which further suggests that support of such numerals is accidental.

As I think more about it, though, I am becoming less averse to accepting
these numerals for base 10 integers.  Integers can be easily extracted
from text using a simple regex, and '\d' accepts all category Nd
characters.  I would require, though, that all digits be from the same
block, which is not hard because Unicode now promises to only have
them in contiguous blocks of 10.  This rule seems to address some of
the security issues because it is unlikely that a system that can
display some of the local digits would not be able to display all of
them properly.

I still don't think it makes any sense to accept them in float().
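
For what it's worth, the "same block" rule is easy to sketch. The
following is only an illustration under my own assumptions (the function
name is made up, and signs and surrounding whitespace are deliberately
not handled). It relies on the contiguous-blocks-of-10 property: for any
Nd character, its code point minus its digit value is the code point of
that script's zero.

```python
import unicodedata


def parse_same_script_int(text):
    """Parse a base-10 integer whose digits all share one block of ten."""
    zeros = set()
    for ch in text:
        value = unicodedata.decimal(ch, None)
        if value is None:
            raise ValueError("not a decimal digit: %r" % ch)
        # Code point of the zero digit of this character's block.
        zeros.add(ord(ch) - value)
    if len(zeros) != 1:
        # Empty input, or digits drawn from more than one block.
        raise ValueError("digits must come from a single block: %r" % text)
    return int(text)


arabic_indic = parse_same_script_int("\u0661\u0662\u0663")
ascii_only = parse_same_script_int("123")
```

A string such as "1\u0662" + "3", which mixes ASCII and Arabic-Indic
digits, is accepted by plain int() but rejected here.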
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Python and the Unicode Character Database

2010-12-02 Thread Steven D'Aprano

Stephen J. Turnbull wrote:

Steven D'Aprano writes:

  With full respect to haiyang kang, hear-say from one person can hardly 
  be described as strong evidence


That's *disrespectful* nonsense.  What Haiyang reported was not
hearsay, it's direct observation of what he sees around him and
personal experience, plus extrapolation.  Look up hearsay, please.


Fair enough. I chose my words poorly and apologise. A better 
description would be anecdotal evidence.



--
Steven


Re: [Python-Dev] PEP 384 accepted

2010-12-02 Thread Michael Foord

On 02/12/2010 23:51, Amaury Forgeot d'Arc wrote:

Hi,

2010/12/3 Michael Foord fuzzy...@voidspace.org.uk


On 02/12/2010 23:01, Martin v. Löwis wrote:

[snip...]

I'm just getting tired having to talk to
five projects just to make a single change to the build
infrastructure
available to the Python community.


The very best hope of resolving that particular problem is
distutils2. :-)

distutils2 is *already* available to the Python community, and
whether or not there is a fixed release date it will have betas
and then a 1.0 release in the foreseeable future. The team working
on it has made an enormous amount of progress. We're much better
off as a development community putting our support and energy into
distutils2 rather than pining for evolution of distutils.


Sure. But today (before 3.2b1) we want to merge PEP3149 and PEP384;
they change the paths and filenames used by python.
Either we modify distutils to comply with the new names,
or defer these PEPs until distutils2 is ready.


Or put support for them into distutils2 now?

Michael



--
Amaury Forgeot d'Arc




Re: [Python-Dev] PEP 384 accepted

2010-12-02 Thread Barry Warsaw
On Dec 03, 2010, at 12:51 AM, Amaury Forgeot d'Arc wrote:

Sure. But today (before 3.2b1) we want to merge PEP3149 and PEP384;
they change the paths and filenames used by python.
Either we modify distutils to comply with the new names,
or defer these PEPs until distutils2 is ready.

I do not think it would be a good idea to revert PEP 3149.

-Barry




Re: [Python-Dev] PEP 384 accepted

2010-12-02 Thread Matthias Klose

On 03.12.2010 00:25, Tarek Ziadé wrote:

2010/12/2 Martin v. Löwis mar...@v.loewis.de:

No, only the ones that didn't cause backwards incompatibilities,
and broke existing packages.


This is impossible. I can point you to some third party project that
can break if you touch some distutils internals, like setuptools.
Setuptools also uses some private global variables in some other
modules in the stdlib FYI.


So what would break if Extension accepted an abi= keyword parameter?


I suppose you have code behind this, that will be in build_ext and in
the compilers. So you will need to try out ALL projects out there that
customize build_ext, like numpy or setuptools, etc, But you won't be
able to try out all projects because they are not listed somewhere.


is this necessary?  are all these projects known to work with 3.2, without 
having changes compared to 3.1 *without* this pep?  hardly ...


how many extensions will use this restricted api at all?  Is it a legitimate 
solution to fall back to building an extension in the default mode?


even without having any changes in distutils it would make sense to know if an 
extension can be built with the restricted ABI, so maybe it is better to defer 
any changes to the extension soname, and provide a check for an extension if it 
conforms to the restricted ABI, even if the extension still uses the python 
version specific soname.


I did not mean to block this pep by choosing any installation names.

  Matthias


Re: [Python-Dev] Python and the Unicode Character Database

2010-12-02 Thread Terry Reedy

On 12/2/2010 6:54 PM, Alexander Belopolsky wrote:

On Thu, Dec 2, 2010 at 4:14 PM, M.-A. Lemburg m...@egenix.com wrote:
..

Some examples:

http://www.bdl.gov.lb/circ/intpdf/int123.pdf


I looked at this one more closely.  While I cannot understand what it
says, it appears that Arabic numerals are used in dates.  It looks
like Python won't be able to deal with those:


When I travelled in S. Asia around 25 years ago, Arabic and Indic 
numerals were in obvious use in stores, road signs, and banks (as with 
money exchange receipts). I learned the digits partly for 
self-protection ;-). I have no real idea of what is done *now* in 
computerized business, but I assume the native digits are used.


It may well be that there is no Python software yet that operates with 
native digits. The lack of direct output capability would hinder that. 
Of course, someone could run both input and output through 
language-specific str.translate digit translators.
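
A minimal sketch of such a translator pair, assuming Arabic-Indic digits
(U+0660..U+0669) as the native script; the table names are made up for
illustration. Input is mapped to ASCII before parsing, and output is
mapped back after formatting.

```python
from datetime import datetime

ASCII_DIGITS = "0123456789"
# Arabic-Indic digits zero to nine, built from their code points.
ARABIC_INDIC_DIGITS = "".join(chr(0x0660 + i) for i in range(10))

# One table per direction: native -> ASCII for input, ASCII -> native
# for output.
TO_ASCII = str.maketrans(ARABIC_INDIC_DIGITS, ASCII_DIGITS)
TO_ARABIC_INDIC = str.maketrans(ASCII_DIGITS, ARABIC_INDIC_DIGITS)

# The date 1999/10/29 written with Arabic-Indic digits.
native_date = "\u0661\u0669\u0669\u0669/\u0661\u0660/\u0662\u0669"

ascii_date = native_date.translate(TO_ASCII)
parsed = datetime.strptime(ascii_date, "%Y/%m/%d")

# Format with the ordinary machinery, then translate the digits back.
round_trip = parsed.strftime("%Y/%m/%d").translate(TO_ARABIC_INDIC)
```

This keeps the standard parsers ASCII-only and puts the script-specific
knowledge in one small table per language.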



datetime.strptime('١٩٩٩/١٠/٢٩', '%Y/%m/%d')


Googling ١٩٩٩ gets about 83,000 hits.

..
ValueError: time data '١٩٩٩/١٠/٢٩' does not match format '%Y/%m/%d'

Interestingly,


datetime.strptime('١٩٩٩', '%Y')

datetime.datetime(1999, 1, 1, 0, 0)

which further suggests that support of such numerals is accidental.

As I think more about it, though, I am becoming less averse to accepting
these numerals for base 10 integers.


Both input and output are needed for educational programming, though 
translation tables might be enough.


  Integers can be easily extracted

from text using simple regex and '\d' accepts all category Nd
characters.  I would require though that all digits be from the same
block, which is not hard because Unicode now promises to only have
them in contiguous blocks of 10.


That seems sensible.

 This rule seems to address some of

security issues because it is unlikely that a system that can display
some of the local digits would not be able to display all of them
properly.

I still don't think it makes any sense to accept them in float().


For the present, I would pretty well agree with that, at least until we 
know more.


You have raised an important issue. It is a bit of a chicken-and-egg 
problem, though. We will not really know what is needed until Python is 
used more in non-English/non-European contexts, while such usage may await 
better support.


--
Terry Jan Reedy




Re: [Python-Dev] PEP 384 accepted

2010-12-02 Thread Éric Araujo
 even without having any changes in distutils it would make sense to know
 if an extension can be built with the restricted ABI, so maybe it is
 better to defer any changes to the extension soname, and provide a check
 for an extension if it conforms to the restricted ABI, even if the
 extension still uses the python version specific soname.

Python’s setup.py has an example in Martin’s branch:

  ext = Extension('xxlimited', ['xxlimited.c'],
                  define_macros=[('Py_LIMITED_API', 1)])

http://codereview.appspot.com/3262043/patch/1/68

This is possible with today’s distutils.  I don’t know if it’s enough to
build stable-ABI-conformant extension modules.

Regards



Re: [Python-Dev] Porting Ideas

2010-12-02 Thread Stephen J. Turnbull
Martin v. Löwis writes:
   Aside: how does one log into the Cheeseshop with your Launchpad OpenID?
   When I try to do it I end up on a Manual user registration page.  I fill
   out the username with what I think my PyPI user name is, and add my
   python.org email address, but then it tells me 'barry' is already taken.
   Do I need some kind of back door linking of my lp openid and my pypi
   user id?
  
  Since the barry account already exists, you first need to log into
  that (likely using a password). You can then claim the LP OpenID as
  being associated with that account, and use LP in the future.

It would be nice if the UI told users that, and offered an opportunity
to log in.

Better yet would be an option for an OpenID to claim a user name by
giving the password for it (i.e., automatically on a successful login
from that page).




Re: [Python-Dev] Python and the Unicode Character Database

2010-12-02 Thread Stephen J. Turnbull
Lennart Regebro writes:
  2010/12/2 Stephen J. Turnbull step...@xemacs.org:

   T1000 = float('一.◯◯◯')
  
  That was already discussed here, and it's clear that unicode does not
  consider these characters to be something you can use in a decimal
  number, and hence it's not broken.

Huh?  IOW, use Unicode features just because they're there, what the
users want and use doesn't matter?

The only evidence I've seen so far that this feature is anything but
a toy for a small faction of developers is Neil Hodgson's information
that OOo will generate these kinds of digits (note that it *will* do
Han! so the evidence is as good for users demanding Han numerals as
for any other kind, Unicode.org definitions notwithstanding), and that
DOS CP 864 contains the Indo/Arabic versions.

Of course, it's quite possible that those were toys for the developers
of those software packages too.




Re: [Python-Dev] Porting Ideas

2010-12-02 Thread Éric Araujo
Hi Prashant,

Python 3 support in distutils2 is not entirely finished; it’s an
interesting and challenging task.

Another idea: convert the python.org internal scripts to use Python 3,
for example starting with patches for http://code.python.org/hg/peps/ .
This would not have any impact on the community, but it’s easy work
that’d help the Python developers to eat their own dogfood.

Regards



Re: [Python-Dev] Python and the Unicode Character Database

2010-12-02 Thread haiyang kang
 Furthermore, data can well originate from texts that were written
 hundreds or even thousands of years ago, so there is plenty of
 material available for processing.

Hmm, for this I think we need a specially tuned language processing
system, with one subsystem per language :)
(sometimes a single word is not enough; we also need context)

Take pi for example. In modern math it is written as 3.1415...;
 in old China it was sometimes written as 三一四一五 or
 三点一四一五 or 叁点壹肆壹伍.

And if these texts are extracted with a scanner
 (OCR or other image processing tech), in my POV
it is the job of this image processing subsystem
 (or some other subsystem between the image processing and the database)
to do the mapping between the number and the raw text data. Example table in DB:
text   | raw data   | raw image data
-------|------------|---------------
3.1415 | 三一四一五 | image...
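
The mapping step described above could be sketched roughly as follows. This is only an illustration: the table `HAN_TO_ASCII` and the function `normalize` are hypothetical names, the table covers just the characters needed for the pi example, and a real subsystem would need context as well.

```python
# Hypothetical lookup table; deliberately small and not a complete mapping.
HAN_TO_ASCII = str.maketrans({
    "〇": "0", "一": "1", "二": "2", "三": "3", "四": "4",
    "五": "5", "六": "6", "七": "7", "八": "8", "九": "9",
    # Financial ("accounting") forms, as in the third spelling of pi.
    "壹": "1", "叁": "3", "肆": "4", "伍": "5",
    # The character used as a decimal point in the examples.
    "点": ".",
})

def normalize(raw):
    """Map a digit-by-digit Han numeral string to ASCII digits."""
    return raw.translate(HAN_TO_ASCII)
```

Note that `normalize("三一四一五")` gives `"31415"`, not `"3.1415"`: the spelling without an explicit point cannot be resolved by character mapping alone, which is where the context mentioned above comes in.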

br,
khy


Re: [Python-Dev] Python and the Unicode Character Database

2010-12-02 Thread Stephen J. Turnbull
Neil Hodgson writes:

 While I don't have Excel to test with, OpenOffice.org Calc will
  display in Arabic or Han numerals using the NatNum format codes.

Display is different from input, but at least this is concrete
evidence.

Will it accept Arabic on input?  (Han might be too much to ask for
since Unicode considers Han digits to be impure.)

   Ditto Arabic, I would imagine; ISO 8859/6 (aka Latin/Arabic) does
   not contain the Arabic digits that have been presented here
   earlier AFAICT.
  
 DOS code page 864 does use 0xB0-0xB9

OK, Microsoft thought it would be useful.

I'd still like to know whether people actually use them for input (or
output, for that matter -- anybody have a corpus of Arabic Form 10-Ks
to grep through?), but that's more concrete evidence than we've seen
before.  Thank you!
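
The code page claim is easy to check with Python's bundled `cp864` codec. A small sketch, assuming that codec follows the usual CP 864 mapping:

```python
import unicodedata

# Bytes 0xB0..0xB9 of DOS code page 864.
raw = bytes(range(0xB0, 0xBA))
digits = raw.decode("cp864")

# Unicode names of the decoded characters.
names = [unicodedata.name(ch) for ch in digits]

# int() currently accepts these digits.
value = int(digits)
```

Here `names[0]` is `ARABIC-INDIC DIGIT ZERO`, and `value` is the integer spelled by the ten digits in order.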



Re: [Python-Dev] Python and the Unicode Character Database

2010-12-02 Thread Stephen J. Turnbull
Antoine Pitrou writes:

  The legacy format argument looks like a red herring to me. When
  converting from a format to another it is the programmer's job to do
  his/her job right.

Uhmm, the argument *for* this feature proposed by several people
is that Python's numeric constructors do it (right) so that the
programmer doesn't have to.

If Python *doesn't* do it right, why should Python do it at all?



Re: [Python-Dev] Python and the Unicode Character Database

2010-12-02 Thread Alexander Belopolsky
On Thu, Dec 2, 2010 at 4:57 PM, Mark Dickinson dicki...@gmail.com wrote:
..
 (the decimal spec requires that non-European digits be accepted).

Mark,

I think *requires* is too strong of a word to describe what the spec
says.   The decimal module documentation refers to two authorities:

1. IBM’s General Decimal Arithmetic Specification
2. IEEE standard 854-1987

The IEEE standard predates Unicode and unsurprisingly does not have
anything related to the issue.  The IBM spec says the following in
the Conversions section:


It is recommended that implementations also provide additional number
formatting routines (including some which are locale-dependent), and
if available should accept non-European decimal digits in strings.
 http://speleotrove.com/decimal/daconvs.html

This cannot possibly be interpreted as normative text.  The emphasis
is clearly on formatting routines with non-European decimal digits
added as an afterthought.  This recommendation can reasonably be
interpreted as a requirement that conversion routines should accept
what formatting routines can produce.  In Python there are no
formatting routines to produce non-European numerals, so there is no
requirement to accept them in conversions.

I don't think the decimal module should support non-European decimal
digits.  The only place where it can make some sense is in int()
because here we have a fighting chance of producing a reasonable
definition.   The motivating use case is conversion of numerical data
extracted from text using simple '\d+'  regex matches.

Here is how I would do it:

1.  String x of non-European decimal digits is only accepted in
int(x), but not by int(x, 0) or int(x, 10).
2.  If x contains one or more non-European digits, then

(a)  all digits must be from the same block:

  import unicodedata

  def basepoint(c):
      # Code point of digit zero in the block that c belongs to.
      return ord(c) - unicodedata.digit(c)
  all(basepoint(c) == basepoint(x[0]) for c in x)  # -> True

 (b) and a '+' or '-' sign is not allowed.

3. A character c is a digit if it matches '\d' regex.  I think this
means unicodedata.category(c) -> 'Nd'.

Condition 2(b) is important because there is no clear way to define
what is acceptable as '+' or '-' using Unicode character properties
and not all number systems even support a local form of negation.  (It
is also YAGNI.)
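
Put together, rules 1-3 could be sketched as a standalone function. This is an illustration of the proposal, not what the builtin int() does; `parse_int` and `accepted` are hypothetical names, and pure-ASCII input is simply handed to the builtin.

```python
import unicodedata

def basepoint(c):
    """Code point of digit zero in the block that digit c belongs to."""
    return ord(c) - unicodedata.digit(c)

def parse_int(x):
    """Hypothetical one-argument int(x) following rules 1-3 above."""
    if all(ord(c) < 128 for c in x):
        # Pure ASCII input: defer to the builtin, signs included.
        return int(x)
    # Rule 3, and as a side effect 2(b): only category Nd is allowed,
    # which excludes '+', '-' and whitespace.
    if any(unicodedata.category(c) != "Nd" for c in x):
        raise ValueError("non-digit character in %r" % x)
    # Rule 2(a): all digits must come from the same block.
    if any(basepoint(c) != basepoint(x[0]) for c in x):
        raise ValueError("digits from more than one block in %r" % x)
    value = 0
    for c in x:
        value = value * 10 + unicodedata.digit(c)
    return value

def accepted(x):
    """True if parse_int(x) succeeds, False if it raises ValueError."""
    try:
        parse_int(x)
    except ValueError:
        return False
    return True
```

With this sketch, a string of Arabic-Indic or fullwidth digits parses, while a mixed string such as `"1\u0663"` or a signed one such as `"-\u0663"` is rejected.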


Re: [Python-Dev] PEP 384 accepted

2010-12-02 Thread Éric Araujo
Le 02/12/2010 23:17, Martin v. Löwis a écrit :
 Before the freeze, distutils was unmaintained (i.e. before you started
 maintaining it), but people who want to improve it gradually at least
 could. Now gradual improvements are also banned, so it's not only
 unmaintained, but I can't even provide support for the PEP in Python
 that was just accepted.

I wonder what your definition of “unmaintained” is.  Tarek has been
fixing bugs for two years, and recently I have been made a committer to
assist him.  It’s true that I’ve not been as active as I would have
liked*, but I did fix some bugs, as I think you know, given that you’ve
helped me in some reports.

Sure, distutils is not as well-maintained as other modules, but a dozen
bugs have been fixed by five or six of us since the revert.  I do feel
responsible for all 116 remaining bugs, and intend to address all of them.

 * This is partly normal, since I had warned before I was accepted as a
   committer that my time would be scarce for a year, partly due to
   the fact that I also do bug triage, doc work and patch reviews, and
   partly due to some personal problems with focusing.


On the matter of freeze exceptions, there have been two:
- reading the makefile with surrogateescape error handler so that python
can build with an ASCII locale in a non-ASCII path (haypo, #6011)
- handle soabiflags (barry, #9807).
I took part in the discussion before those changes and did not object to
them: they are very small changes that enable a new feature of Python
3.2.  Maybe I should have requested Tarek’s approval for those changes;
he knows better than me how third parties may break because of changes
that don’t seem to break anything.


Regards



Re: [Python-Dev] Change to the Distutils / Distutils2 workflow

2010-12-02 Thread Éric Araujo
Hi everyone,

I have sketched a workflow guide on
http://wiki.python.org/moin/Distutils/FixingBugs

Cheers



Re: [Python-Dev] PEP 384 accepted

2010-12-02 Thread Martin v. Löwis
 Python’s setup.py has an example in Martin’s branch:
 
   ext = Extension('xxlimited', ['xxlimited.c'],
   define_macros=[('Py_LIMITED_API', 1)])
 
 http://codereview.appspot.com/3262043/patch/1/68
 
 This is possible with today’s distutils.  I don’t know if it’s enough to
 build stable-ABI-conformant extension modules.

It is. However, there is also the proposal that they use an ABI tag in
the SO name; having that generated automatically would require a
distutils change.

Regards,
Martin


Re: [Python-Dev] PEP 384 accepted

2010-12-02 Thread Martin v. Löwis
 I wonder what your definition of “unmaintained” is.

In this specific case: doesn't get feature requests acted upon.
I'm well aware that you are fixing bugs, and that is appreciated.

 Sure, distutils is not as well-maintained as other modules, but a dozen
 bugs have been fixed by five or six of us since the revert.  I do feel
 responsible for all 116 remaining bugs, and intend to address all of them.

But if the resolution of the bug would require a new feature, your
answer will be "this is going to be fixed in distutils2 (if at all),
it's out of scope for distutils". Before, if the submitter contributed
a patch, the patch was just unreviewed for a long time, unless one
of the committers picked it up. Now, the patch will be rejected, which
I consider worse - because the patch is not being rejected on its own
merits, but just because of a policy decision to not improve distutils
anymore.

For example, I keep running into the issue that distutils doesn't
currently support parallel builds. I have been pondering supporting
-j for building extensions, using both unbounded -j and the GNU make
style -jN build server. However, I know that the patch will be rejected,
so I don't even start working on it.
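
To make the idea concrete, a bounded or unbounded parallel build loop might look roughly like this. It is a sketch only: `build_all` and `build_one` are hypothetical names, a thread pool stands in for a real GNU make style job server, and nothing here is existing distutils API.

```python
from concurrent.futures import ThreadPoolExecutor

def build_all(extensions, build_one, jobs=None):
    """Call build_one(ext) for every extension, in parallel.

    jobs=None mimics an unbounded -j (one worker per extension);
    jobs=N mimics -jN (at most N builds at a time).
    Results come back in the order of `extensions`.
    """
    extensions = list(extensions)
    if not extensions:
        return []
    workers = jobs or len(extensions)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map re-raises the first exception raised by build_one.
        return list(pool.map(build_one, extensions))
```

Threads are a reasonable fit under the assumption that each `build_one` call mostly waits on an external compiler process.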

 On the matter of freeze exceptions, there have been two:
 - reading the makefile with surrogateescape error handler so that python
 can build with an ASCII locale in a non-ASCII path (haypo, #6011)
 - handle soabiflags (barry, #9807).
 I took part in the discussion before those changes and did not object to
 them: they are very small changes that enable a new feature of Python
 3.2.  Maybe I should have requested Tarek’s approval for those changes;
 he knows better than me how third parties may break because of changes
 that don’t seem to break anything.

I see. Now, I'd claim that the reasoning as to why an abi= parameter
on Extension may break things also applies to the soabiflags:
to support soabiflags, the INSTALL_SCHEMES syntax was modified.
If the install command is subclassed, that could lead to funny
interactions, e.g. where the subclass fails to put abiflags into
config_vars. IIUC, subst_vars will then eventually raise a ValueError.
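
The failure mode can be illustrated with a simplified stand-in for the substitution step. This is not the distutils code itself: `substitute` is a hypothetical reimplementation, and the scheme string is only written in the style of an INSTALL_SCHEMES entry.

```python
import re

# Matches $name references such as $base or $abiflags.
_VAR = re.compile(r"\$([A-Za-z_][A-Za-z_0-9]*)")

def substitute(template, config_vars):
    """Expand $name references; raise ValueError for an unknown name."""
    def repl(match):
        name = match.group(1)
        try:
            return str(config_vars[name])
        except KeyError:
            raise ValueError("invalid variable '$%s'" % name) from None
    return _VAR.sub(repl, template)

# Hypothetical scheme entry that refers to $abiflags.
scheme = "$base/include/python$py_version_short$abiflags/$dist_name"

# A subclass that builds its own config_vars and forgets 'abiflags'.
old_style_vars = {"base": "/usr", "py_version_short": "3.2", "dist_name": "demo"}
new_style_vars = dict(old_style_vars, abiflags="m")
```

`substitute(scheme, new_style_vars)` expands cleanly, while `substitute(scheme, old_style_vars)` raises ValueError, which is the interaction described above.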

I'm not saying that this is a likely scenario - only that the
reasoning "if a change can possibly affect existing code, it
should not be made" applies to essentially any change. So if you
want to avoid breaking things with certainty, not even bug
fixes would be acceptable.

Regards,
Martin


Re: [Python-Dev] Porting Ideas

2010-12-02 Thread Martin v. Löwis
 It would be nice if the UI told users that, and offered an opportunity
 to log in.
 
 Better yet would be an option for an OpenID to claim a user name by
 giving the password for it (ie, automatically on a successful login
 from that page).

So many projects, so little time. Contributions are welcome.
IOW, it's easier for me to educate users.

Regards,
Martin