[Python-Dev] transform() and untransform() methods, and the codec registry

2010-12-03 Thread Victor Stinner
On Thursday 02 December 2010 19:06:51 georg.brandl wrote:
 Author: georg.brandl
 Date: Thu Dec  2 19:06:51 2010
 New Revision: 86934
 
 Log:
 #7475: add (un)transform method to bytes/bytearray and str, add back codecs
 that can be used with them from Python 2.

Oh no, someone did it. Was it really needed to reintroduce rot13 and friends?

I'm not strongly opposed to .transform()/.untransform() if they can be completely 
separated from text encodings (ascii, latin9, utf-8 & co.). But str.encode() and 
bytes.decode() do accept transform codec names and raise strange error 
messages. Quote of Martin von Löwis (#7475):

If the codecs are restored, one half of them becomes available to
.encode/.decode methods, since the codec registry cannot tell which
ones implement real character encodings, and which ones are other
conversion methods. So adding them would be really confusing.

>>> 'abc'.transform('hex')
TypeError: 'str' does not support the buffer interface
>>> b'abc'.transform('rot13')
TypeError: expected an object with the buffer interface

>>> b'abcd'.decode('hex')
TypeError: decoder did not return a str object (type=bytes)
>>> 'abc'.encode('rot13')
TypeError: encoder did not return a bytes object (type=str)

I don't like transform() and untransform() because I think that we should not 
add too many operations to the base types (bytes and str), and because they do 
an implicit module import. I prefer an explicit module import (e.g. import binascii; 
binascii.hexlify(b'to hex')). It reminds me of PHP and its ugly namespace with 
5000+ functions. I prefer Python because it uses smaller and more numerous namespaces 
which are more specific and well defined. If we add email and compression 
functions to bytes, why not add a web browser to str?
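
A minimal sketch of the explicit-import style described above, assuming a current Python 3 interpreter. None of this code is from the original message; the variable names are illustrative only.

```python
import binascii
import codecs

# Bytes-to-bytes conversion through an explicitly imported module.
hex_bytes = binascii.hexlify(b'to hex')
round_trip = binascii.unhexlify(hex_bytes)

# Str-to-str conversion through the codecs module rather than a str method.
rot = codecs.encode('abc', 'rot13')

print(hex_bytes, round_trip, rot)
```

Both spellings keep the conversion out of the str/bytes method namespace, which is the point being argued here.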

Victor
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Python and the Unicode Character Database

2010-12-03 Thread Neil Hodgson
Stephen J. Turnbull:

 Will it accept Arabic on input?  (Han might be too much to ask for
 since Unicode considers Han digits to be impure.)

   I couldn't find a direct way to input Arabic digits into OO Calc,
the normal use of Alt+number didn't work in Calc although it did in
WordPad where Alt+1632 is ٠ and so on.

   OO Calc does have settings in the Complex Text Layout section for
choosing different numerals but I don't understand the interaction of
choices here.

   Neil


Re: [Python-Dev] Python and the Unicode Character Database

2010-12-03 Thread M.-A. Lemburg
Alexander Belopolsky wrote:
 On Thu, Dec 2, 2010 at 5:58 PM, M.-A. Lemburg m...@egenix.com wrote:
 ..
 I will change my mind on this issue when you present a
 machine-readable file with Arabic-Indic numerals and a program capable
 of reading it and show that this program uses the same number parsing
 algorithm as Python's int() or float().

 Have you had a look at the examples I posted ? They include texts
 and tables with numbers written using east asian arabic numerals.
 
 Yes, but this was all about output.  I am pretty sure TeX was able to
 typeset Qur'an in all its glory long before Unicode was invented.
 Yet, in machine readable form it would be something like {\quran 1}
 (invented directive).   I have asked for a file that is intended for
 machine processing, not for human enjoyment in print or on a display.
  I claim that if such file exists, the program that reads it does not
 use the same rules as Python and converting non-ascii digits would be
 a tiny portion of what that program does.

Well, programs that take input from the keyboards I posted in this
thread will have to deal with the digits. Since Python's input()
accepts keyboard input, you have your use case :-)

Seriously, I find the distinction between input and output forms
of numerals somewhat misguided. Any output can also serve as input.
For books and other printed material, images, etc. you have scanners
and OCR. For screen output you have screen readers. For spreadsheets
and data, you have CSV, TSV, XML, etc. etc. etc.

Just for the fun of it, I created a CSV file with Thai and Dzongkha
numerals (in addition to Arabic ones) using OpenOffice. Here's the
cut and paste version:


Numbers in various scripts  

Arabic  Thai    Dzongkha
1   ๑   ༡
2   ๒   ༢
3   ๓   ༣
4   ๔   ༤
5   ๕   ༥
6   ๖   ༦
7   ๗   ༧
8   ๘   ༨
9   ๙   ༩
10  ๑๐  ༡༠
11  ๑๑  ༡༡
12  ๑๒  ༡༢
13  ๑๓  ༡༣
14  ๑๔  ༡༤
15  ๑๕  ༡༥
16  ๑๖  ༡༦
17  ๑๗  ༡༧
18  ๑๘  ༡༨
19  ๑๙  ༡༩
20  ๒๐  ༢༠


And here's the script that goes with it:

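import csv
c = csv.reader(open('Numbers-in-various-scripts.csv'))
# Skip the three header rows (title, blank row, column names).
headers = [c.next() for i in range(3)]
# Iterate over the remaining rows; int() accepts any Unicode decimal digits.
for row in c:
    print [int(unicode(x, 'utf-8')) for x in row]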

and the output using Python 2.7:

[1, 1, 1]
[2, 2, 2]
[3, 3, 3]
[4, 4, 4]
[5, 5, 5]
[6, 6, 6]
[7, 7, 7]
[8, 8, 8]
[9, 9, 9]
[10, 10, 10]
[11, 11, 11]
[12, 12, 12]
[13, 13, 13]
[14, 14, 14]
[15, 15, 15]
[16, 16, 16]
[17, 17, 17]
[18, 18, 18]
[19, 19, 19]
[20, 20, 20]
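
For readers on Python 3, the same check can be sketched without the file on disk. This is an illustration, not part of the original message: the SAMPLE string is a shortened stand-in for the CSV file and parse_rows is a hypothetical helper name. It relies only on int() accepting Unicode decimal digits.

```python
import csv
import io

# Shortened stand-in for Numbers-in-various-scripts.csv (assumption: one
# header row instead of three, and only three data rows).
SAMPLE = "Arabic,Thai,Dzongkha\n1,๑,༡\n10,๑๐,༡༠\n20,๒๐,༢༠\n"

def parse_rows(text):
    """Parse every cell of every data row with the built-in int()."""
    reader = csv.reader(io.StringIO(text))
    next(reader)  # skip the header row
    return [[int(cell) for cell in row] for row in reader]

rows = parse_rows(SAMPLE)
print(rows)
```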

If you need more such files, I can generate as many as you like ;-)
I can send the OOo file as well, if you like to play around with it.

I'd say: case closed :-)

-- 
Marc-Andre Lemburg
eGenix.com

Numbers in various scripts,,
,,
Arabic,Thai,Dzongkha
1,๑,༡
2,๒,༢
3,๓,༣
4,๔,༤
5,๕,༥
6,๖,༦
7,๗,༧
8,๘,༨
9,๙,༩
10,๑๐,༡༠
11,๑๑,༡༡
12,๑๒,༡༢
13,๑๓,༡༣
14,๑๔,༡༤
15,๑๕,༡༥
16,๑๖,༡༦
17,๑๗,༡༧
18,๑๘,༡༨
19,๑๙,༡༩
20,๒๐,༢༠


Re: [Python-Dev] [Python-checkins] r86965 - python/branches/py3k/Lib/test/__main__.py

2010-12-03 Thread Nick Coghlan
On Fri, Dec 3, 2010 at 8:42 PM, michael.foord
python-check...@python.org wrote:
 +# When tests are run from the Python build directory, it is best practice
 +# to keep the test files in a subfolder.  It eases the cleanup of leftover
 +# files using command make distclean.
 +if sysconfig.is_python_build():
 +    TEMPDIR = os.path.join(sysconfig.get_config_var('srcdir'), 'build')
 +    TEMPDIR = os.path.abspath(TEMPDIR)
 +    if not os.path.exists(TEMPDIR):
 +        os.mkdir(TEMPDIR)
 +    regrtest.TEMPDIR = TEMPDIR

If sysconfig.is_python_build() returns False...

 +# Define a writable temp dir that will be used as cwd while running
 +# the tests. The name of the dir includes the pid to allow parallel
 +# testing (see the -j option).
 +TESTCWD = 'test_python_{}'.format(os.getpid())
 +
 +TESTCWD = os.path.join(TEMPDIR, TESTCWD)

... this line is going to blow up with a NameError.

I would suggest putting this common setup code into a _make_test_dir()
helper function in regrtest, then have both regrtest and test.__main__
call it.
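
One way the suggested helper could look, as a rough sketch only. The fallback to tempfile.gettempdir() when not running from a build directory is an assumption made here so that the name is always bound; the body and return value of _make_test_dir() are likewise hypothetical and not taken from the checkin.

```python
import os
import sysconfig
import tempfile

def _make_test_dir():
    """Return (tempdir, testcwd) with tempdir defined on every code path."""
    if sysconfig.is_python_build():
        # Running from the build directory: keep test files in a subfolder.
        tempdir = os.path.join(sysconfig.get_config_var('srcdir'), 'build')
        tempdir = os.path.abspath(tempdir)
        if not os.path.exists(tempdir):
            os.mkdir(tempdir)
    else:
        # Assumed fallback for installed Pythons; avoids the NameError.
        tempdir = tempfile.gettempdir()
    # The pid in the name allows parallel test runs (see the -j option).
    testcwd = os.path.join(tempdir, 'test_python_{}'.format(os.getpid()))
    return tempdir, testcwd

tempdir, testcwd = _make_test_dir()
print(testcwd)
```

Both regrtest and test.__main__ could then call the helper instead of duplicating the setup.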

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] [Python-checkins] r86965 - python/branches/py3k/Lib/test/__main__.py

2010-12-03 Thread Michael Foord

On 03/12/2010 10:53, Nick Coghlan wrote:

On Fri, Dec 3, 2010 at 8:42 PM, michael.foord
python-check...@python.org  wrote:

+# When tests are run from the Python build directory, it is best practice
+# to keep the test files in a subfolder.  It eases the cleanup of leftover
+# files using command make distclean.
+if sysconfig.is_python_build():
+    TEMPDIR = os.path.join(sysconfig.get_config_var('srcdir'), 'build')
+    TEMPDIR = os.path.abspath(TEMPDIR)
+    if not os.path.exists(TEMPDIR):
+        os.mkdir(TEMPDIR)
+    regrtest.TEMPDIR = TEMPDIR

If sysconfig.is_python_build() returns False...


+# Define a writable temp dir that will be used as cwd while running
+# the tests. The name of the dir includes the pid to allow parallel
+# testing (see the -j option).
+TESTCWD = 'test_python_{}'.format(os.getpid())
+
+TESTCWD = os.path.join(TEMPDIR, TESTCWD)

... this line is going to blow up with a NameError.

I would suggest putting this common setup code into a _make_test_dir()
helper function in regrtest, then have both regrtest and test.__main__
call it.



Ok, good suggestion. Thanks

Michael


Cheers,
Nick.




--

http://www.voidspace.org.uk/



Re: [Python-Dev] PEP 384 accepted

2010-12-03 Thread Éric Araujo
Le 03/12/2010 08:31, Martin v. Löwis a écrit :
 I wonder what your definition of “unmaintained” is. 
 In this specific case: doesn't get feature requests acted upon.
Thanks for clarifying.  I think that’s a stretch, but I see your meaning
now.

 Sure, distutils is not as well-maintained as other modules, but a dozen
 bugs have been fixed by five or six of us since the revert.  I do feel
 responsible for all 116 remaining bugs, and intend to address all of them.
 
 But if the resolution of the bug would require a new feature, your
 answer will be "this is going to be fixed in distutils2 (if at all),
 it's out of scope for distutils". Before, if the submitter contributed
 a patch, the patch was just unreviewed for a long time, unless one
 of the committers picked it up. Now, the patch will be rejected, which
 I consider worse - because the patch is not being rejected on its own
 merits, but just because of a policy decision to not improve distutils
 anymore.
The patch would not be rejected, but assigned to a different version.
It’s no different from replying to an old bug with a patch for Python
2.5 and requesting that it be updated for py3k.  It’s also not uncommon
to have another contributor or a core dev updating the patch if the
original poster does not reply.

 For example, I keep running into the issue that distutils doesn't
 currently support parallel builds. I have been pondering supporting
 -j for building extensions, using both unbounded -j and the GNU make
 style -jN build server. However, I know that the patch will be rejected,
 so I don't even start working on it.
This would be a very useful feature for distutils2.

 On the matter of freeze exceptions, there have been two: [snip]
 I see. Now, I'd claim that the reasoning as to why an abi= parameter
 on Extension may break things also applies to the soabiflags:
 to support soabiflags, the INSTALL_SCHEMES syntax was modified.
 If the install command is subclassed, that could lead to funny
 interactions, e.g. where the subclass fails to put abiflags into
 config_vars. IIUC, subst_vars will then eventually raise a ValueError.
This is a concern.

 I'm not saying that this is a likely scenario - only that the
 reasoning "if a change can possibly affect existing code, it
 should not be made" applies to essentially any change. So if you
 want to avoid breaking things with certainty, not even bug
 fixes would be acceptable.
If we wanted 100% certainty, yes.

Regards



Re: [Python-Dev] Python and the Unicode Character Database

2010-12-03 Thread Antoine Pitrou
Le vendredi 03 décembre 2010 à 13:58 +0900, Stephen J. Turnbull a
écrit :
 Antoine Pitrou writes:
 
   The legacy format argument looks like a red herring to me. When
   converting from one format to another it is the programmer's job to
   do his/her job right.
 
 Uhmm, the argument *for* this feature proposed by several people
 is that Python's numeric constructors do it (right) so that the
 programmer doesn't have to.

As far as I understand, Alexander was talking about a legacy pre-unicode
text format. We don't have to support this.

Regards

Antoine.




Re: [Python-Dev] r86962 - in python/branches/py3k: Doc/library/pydoc.rst Doc/whatsnew/3.2.rst Lib/pydoc.py Lib/test/test_pydoc.py Misc/ACKS Misc/NEWS

2010-12-03 Thread Antoine Pitrou
On Fri,  3 Dec 2010 10:29:12 +0100 (CET)
nick.coghlan python-check...@python.org wrote:
 Author: nick.coghlan
 Date: Fri Dec  3 10:29:11 2010
 New Revision: 86962
 
 Log:
 Improve Pydoc interactive browsing (#2001).  Patch by Ron Adam.

Tests seem to fail under Windows:

======================================================================
FAIL: test_url_requests (test.test_pydoc.PyDocUrlHandlerTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "D:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\test\test_pydoc.py", line 456, in test_url_requests
    self.assertEqual(result, title)
AssertionError: 'Python: Read Error' != 'Python: getfile /D:\\cygwin\\home\\db3l\\buildarea\\3.x.bolen-windows\\build\\l [truncated]...
- Python: Read Error
+ Python: getfile /D:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\string.py


(http://www.python.org/dev/buildbot/all/builders/x86%20XP-4%203.x/builds/3708/steps/test/logs/stdio)





Re: [Python-Dev] transform() and untransform() methods, and the codec registry

2010-12-03 Thread R. David Murray
On Fri, 03 Dec 2010 10:16:04 +0100, Victor Stinner 
victor.stin...@haypocalc.com wrote:
 On Thursday 02 December 2010 19:06:51 georg.brandl wrote:
  Author: georg.brandl
  Date: Thu Dec  2 19:06:51 2010
  New Revision: 86934
 
  Log:
  #7475: add (un)transform method to bytes/bytearray and str, add back codecs
  that can be used with them from Python 2.
 
 Oh no, someone did it. Was it really needed to reintroduce rot13 and friends?
 
 I'm not strongly opposed to .transform()/.untransform() if they can be completely
 separated from text encodings (ascii, latin9, utf-8 & co.). But str.encode() and
 bytes.decode() do accept transform codec names and raise strange error
 messages. Quote of Martin von Löwis (#7475):
 
 If the codecs are restored, one half of them becomes available to
 .encode/.decode methods, since the codec registry cannot tell which
 ones implement real character encodings, and which ones are other
 conversion methods. So adding them would be really confusing.
 
  'abc'.transform('hex')
 TypeError: 'str' does not support the buffer interface
  b'abc'.transform('rot13')
 TypeError: expected an object with the buffer interface

I find these 'buffer interface' error messages to be the most confusing
error message I get out of Python3 no matter what context they show up
in.  I have no idea what they are telling me.  That issue is more
general than transform/untransform, but perhaps it could be fixed
for transform/untransform in particular.

  b'abcd'.decode('hex')
 TypeError: decoder did not return a str object (type=bytes)
  'abc'.encode('rot13')
 TypeError: encoder did not return a bytes object (type=str)

These error messages make perfect sense to me.  I think it
is called duck typing :)
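
To see how a given interpreter reacts to a non-text codec name, a small probe like the following can help. It is a sketch, not part of the thread: try_decode is a hypothetical helper, and because the exception raised depends on the Python version it catches both LookupError and TypeError. On Python 3.4 and later the call is rejected with a LookupError that points to codecs.decode().

```python
import codecs

def try_decode(data, name):
    """Return (ok, value); value is the exception type name on failure."""
    try:
        return True, data.decode(name)
    except (LookupError, TypeError) as exc:
        return False, type(exc).__name__

# A real text encoding decodes to str.
print(try_decode(b'abcd', 'utf-8'))
# A bytes-to-bytes codec is refused by bytes.decode().
print(try_decode(b'abcd', 'hex'))
# The codecs module functions accept arbitrary codecs.
print(codecs.decode(b'61626364', 'hex'))
```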

 I don't like transform() and untransform() because I think that we should not
 add too many operations to the base types (bytes and str), and because they do
 an implicit module import. I prefer an explicit module import (e.g. import binascii;
 binascii.hexlify(b'to hex')). It reminds me of PHP and its ugly namespace with
 5000+ functions. I prefer Python because it uses smaller and more numerous namespaces
 which are more specific and well defined. If we add email and compression
 functions to bytes, why not add a web browser to str?

As MAL says, the codec machinery is a general purpose tool.  I think
it, and the transform methods, are a useful level of abstraction over
a general class of problems.

Please also recall that transform/untransform was discussed before
the release of Python 3.0 and was approved at the time, but it just
did not get implemented before the 3.0 release.

--
R. David Murray  www.bitdance.com


Re: [Python-Dev] transform() and untransform() methods, and the codec registry

2010-12-03 Thread Alexander Belopolsky
On Fri, Dec 3, 2010 at 10:11 AM, R. David Murray rdmur...@bitdance.com wrote:
..
 Please also recall that transform/untransform was discussed before
 the release of Python 3.0 and was approved at the time, but it just
 did not get implemented before the 3.0 release.


Can you provide a link?  My search for transform on python-dev came out with

http://mail.python.org/pipermail/python-dev/2010-June/100564.html

where you seem to oppose these methods.  Also, new methods to builtins
fall under the language moratorium (but can be approved on a
case-by-case basis):

http://www.python.org/dev/peps/pep-3003/#case-by-case-exemptions

Is there an effort to document these exceptions?  I expected such
approvals to be added to PEP 3003, but apparently this was not the
case.


[Python-Dev] buffer interface messages

2010-12-03 Thread Antoine Pitrou
On Fri, 03 Dec 2010 10:11:29 -0500
R. David Murray rdmur...@bitdance.com wrote:
  
   'abc'.transform('hex')
  TypeError: 'str' does not support the buffer interface
   b'abc'.transform('rot13')
  TypeError: expected an object with the buffer interface
 
 I find these 'buffer interface' error messages to be the most confusing
 error message I get out of Python3 no matter what context they show up
 in.  I have no idea what they are telling me.  That issue is more
 general than transform/untransform, but perhaps it could be fixed
 for transform/untransform in particular.

I agree. "buffer interface" is a technicality that the Python user
doesn't know about (unless (s)he also writes C extensions). How about
"expected a bytes-compatible object"?

Regards

Antoine.




Re: [Python-Dev] r86962 - in python/branches/py3k: Doc/library/pydoc.rst Doc/whatsnew/3.2.rst Lib/pydoc.py Lib/test/test_pydoc.py Misc/ACKS Misc/NEWS

2010-12-03 Thread Nick Coghlan
On Sat, Dec 4, 2010 at 12:00 AM, Antoine Pitrou solip...@pitrou.net wrote:
 On Fri,  3 Dec 2010 10:29:12 +0100 (CET)
 nick.coghlan python-check...@python.org wrote:
 Author: nick.coghlan
 Date: Fri Dec  3 10:29:11 2010
 New Revision: 86962

 Log:
 Improve Pydoc interactive browsing (#2001).  Patch by Ron Adam.

 Tests seem to fail under Windows:

 ======================================================================
 FAIL: test_url_requests (test.test_pydoc.PyDocUrlHandlerTest)
 ----------------------------------------------------------------------
 Traceback (most recent call last):
   File "D:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\test\test_pydoc.py", line 456, in test_url_requests
     self.assertEqual(result, title)
 AssertionError: 'Python: Read Error' != 'Python: getfile /D:\\cygwin\\home\\db3l\\buildarea\\3.x.bolen-windows\\build\\l [truncated]...
 - Python: Read Error
 + Python: getfile /D:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\string.py


 (http://www.python.org/dev/buildbot/all/builders/x86%20XP-4%203.x/builds/3708/steps/test/logs/stdio)

Yeah, Georg pointed that one out. My latest checkin *should* have
fixed it (but we'll see what the buildbots have to say).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] transform() and untransform() methods, and the codec registry

2010-12-03 Thread Victor Stinner
On Friday 03 December 2010 16:11:29 R. David Murray wrote:
   'abc'.transform('hex')
  TypeError: 'str' does not support the buffer interface
  
   b'abc'.transform('rot13')
  TypeError: expected an object with the buffer interface
 
 I find these 'buffer interface' error messages to be the most confusing
 error message I get out of Python3 no matter what context they show up
 in.  I have no idea what they are telling me.  That issue is more
 general than transform/untransform, but perhaps it could be fixed
 for transform/untransform in particular.

If it's more general, let's open an issue for that:
http://bugs.python.org/issue10616

   b'abcd'.decode('hex')
  TypeError: decoder did not return a str object (type=bytes)
  
   'abc'.encode('rot13')
  TypeError: encoder did not return a bytes object (type=str)
 
 These error messages make perfect sense to me.  I think it
 is called duck typing :)

 (...) 

 Please also recall that transform/untransform was discussed before
 the release of Python 3.0 and was approved at the time, but it just
 did not get implemented before the 3.0 release.

Ok.

Victor


Re: [Python-Dev] buffer interface messages

2010-12-03 Thread Nick Coghlan
On Sat, Dec 4, 2010 at 2:28 AM, Antoine Pitrou solip...@pitrou.net wrote:
 On Fri, 03 Dec 2010 10:11:29 -0500
 R. David Murray rdmur...@bitdance.com wrote:
 
   'abc'.transform('hex')
  TypeError: 'str' does not support the buffer interface
   b'abc'.transform('rot13')
  TypeError: expected an object with the buffer interface

 I find these 'buffer interface' error messages to be the most confusing
 error message I get out of Python3 no matter what context they show up
 in.  I have no idea what they are telling me.  That issue is more
 general than transform/untransform, but perhaps it could be fixed
 for transform/untransform in particular.

 I agree. buffer interface is a technicality that the Python user
 doesn't do about (unless (s)he also writes C extensions). How about
 expected a bytes-compatible object?

Why not "binary data interface"? That's what they're actually looking for.

It seems odd for 'rot13' to be throwing that error though.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] buffer interface messages

2010-12-03 Thread Antoine Pitrou
On Sat, 4 Dec 2010 02:45:42 +1000
Nick Coghlan ncogh...@gmail.com wrote:
 On Sat, Dec 4, 2010 at 2:28 AM, Antoine Pitrou solip...@pitrou.net wrote:
  On Fri, 03 Dec 2010 10:11:29 -0500
  R. David Murray rdmur...@bitdance.com wrote:
  
'abc'.transform('hex')
   TypeError: 'str' does not support the buffer interface
b'abc'.transform('rot13')
   TypeError: expected an object with the buffer interface
 
  I find these 'buffer interface' error messages to be the most confusing
  error message I get out of Python3 no matter what context they show up
  in.  I have no idea what they are telling me.  That issue is more
  general than transform/untransform, but perhaps it could be fixed
  for transform/untransform in particular.
 
  I agree. buffer interface is a technicality that the Python user
  doesn't do about (unless (s)he also writes C extensions). How about
  expected a bytes-compatible object?
 
 Why not binary data interface? That's what they're actually looking for.

I don't think it's more understandable, since it's not a well-known
Python concept.
(and in this specific case, the transform() method is on str and bytes,
not on arbitrary bytes-like objects)

 It seems odd for 'rot13' to be throwing that error though.

Agreed.

Regards

Antoine.


Re: [Python-Dev] buffer interface messages

2010-12-03 Thread Nick Coghlan
On Sat, Dec 4, 2010 at 2:57 AM, Antoine Pitrou solip...@pitrou.net wrote:
 On Sat, 4 Dec 2010 02:45:42 +1000
 Nick Coghlan ncogh...@gmail.com wrote:
 On Sat, Dec 4, 2010 at 2:28 AM, Antoine Pitrou solip...@pitrou.net wrote:
  On Fri, 03 Dec 2010 10:11:29 -0500
  R. David Murray rdmur...@bitdance.com wrote:
  
'abc'.transform('hex')
   TypeError: 'str' does not support the buffer interface
b'abc'.transform('rot13')
   TypeError: expected an object with the buffer interface
 
  I find these 'buffer interface' error messages to be the most confusing
  error message I get out of Python3 no matter what context they show up
  in.  I have no idea what they are telling me.  That issue is more
  general than transform/untransform, but perhaps it could be fixed
  for transform/untransform in particular.
 
  I agree. buffer interface is a technicality that the Python user
  doesn't do about (unless (s)he also writes C extensions). How about
  expected a bytes-compatible object?

 Why not binary data interface? That's what they're actually looking for.

 I don't think it's more understandable, since it's not a well-known
 Python concept.
 (and in this specific case, the transform() method is on str and bytes,
 not on arbitrary bytes-like objects)

Indeed, bytes-compatible is likely to be the most meaningful phrase
(and then we can have the bytes docs explain further as to what
bytes-compatible means, likely by reference to memoryview).
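
As a rough illustration of what "bytes-compatible" could mean by reference to memoryview, the sketch below treats an object as bytes-compatible when memoryview() accepts it. The helper name is_bytes_compatible is made up for this example and is not an existing API.

```python
def is_bytes_compatible(obj):
    """True if obj exposes the buffer interface, i.e. memoryview() accepts it."""
    try:
        memoryview(obj)
    except TypeError:
        return False
    return True

for candidate in (b'abc', bytearray(b'abc'), 'abc', 123):
    print(type(candidate).__name__, is_bytes_compatible(candidate))
```

Here bytes and bytearray pass the check, while str and int do not, which matches the distinction the error message is trying to convey.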

Although, as Victor's mod to the complex constructor error message
shows, the default error message is still going to be overly specific
in many cases that also accept non-binary data.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


[Python-Dev] Summary of Python tracker Issues

2010-12-03 Thread Python tracker

ACTIVITY SUMMARY (2010-11-26 - 2010-12-03)
Python tracker at http://bugs.python.org/

To view or respond to any of the issues listed below, click on the issue.
Do NOT respond to this message.

Issues counts and deltas:
  open    2537 ( +4)
  closed 19861 (+69)
  total  22398 (+73)

Open issues with patches: 1080 


Issues opened (47)
==================

#2690: Precompute range length and enhance range subscript support
http://bugs.python.org/issue2690  reopened by ncoghlan

#9333: Expose a way to enable os.symlink on Windows
http://bugs.python.org/issue9333  reopened by brian.curtin

#10544: yield expression inside generator expression does nothing
http://bugs.python.org/issue10544  opened by Inyeol.Lee

#10545: remove or rewrite Using Backslash to Continue Statements ant
http://bugs.python.org/issue10545  opened by rurpy2

#10546: UTF-16-LE and UTF-16-BE support non-BMP characters
http://bugs.python.org/issue10546  opened by haypo

#10548: Error in setUp not reported as expectedFailure (unittest)
http://bugs.python.org/issue10548  opened by michael.foord

#10550: Windows: leak in test_concurrent_futures
http://bugs.python.org/issue10550  opened by skrah

#10551: mimetypes read from the registry should not overwrite standard
http://bugs.python.org/issue10551  opened by kovid

#10552: Tools/unicode/gencodec.py error
http://bugs.python.org/issue10552  opened by belopolsky

#10553: Add optimize argument to builtin compile() and byte-compilatio
http://bugs.python.org/issue10553  opened by georg.brandl

#10556: test_zipimport_support mucks up with modules
http://bugs.python.org/issue10556  opened by pitrou

#10557: Malformed error message from float()
http://bugs.python.org/issue10557  opened by belopolsky

#10558: non-standard processing of several configure options ignores 
http://bugs.python.org/issue10558  opened by ned.deily

#10559: NameError in tutorial/interpreter
http://bugs.python.org/issue10559  opened by eric.araujo

#10560: Fixes for Windows sources
http://bugs.python.org/issue10560  opened by Carlo_Bramini

#10563: Spurious newline in time.ctime
http://bugs.python.org/issue10563  opened by Gerrit.Holl

#10566: gdb debugging support additions (Tools/gdb/libpython.py)
http://bugs.python.org/issue10566  opened by eggy

#10570: curses.tigetstr() returns bytes, but curses.tparm() expects a 
http://bugs.python.org/issue10570  opened by jwilk

#10571: setup.py upload --sign broken: TypeError: 'str' does not sup
http://bugs.python.org/issue10571  opened by jwilk

#10572: Move unittest test package to Lib/test
http://bugs.python.org/issue10572  opened by michael.foord

#10573: Consistency in unittest assert methods: order of actual, expec
http://bugs.python.org/issue10573  opened by michael.foord

#10574: email.header.decode_header fails if the string contains multip
http://bugs.python.org/issue10574  opened by starsareblueandfaraway

#10576: Add a progress callback to gcmodule
http://bugs.python.org/issue10576  opened by krisvale

#10577: (Fancy) URL opener stuck when trying to open redirected url
http://bugs.python.org/issue10577  opened by xhresko

#10580: Installer sentence in bold
http://bugs.python.org/issue10580  opened by Retro

#10581: Review and document string format accepted in numeric data typ
http://bugs.python.org/issue10581  opened by belopolsky

#10582: PyErr_PrintEx exits silently when passed SystemExit exception
http://bugs.python.org/issue10582  opened by Marc.Horowitz

#10583: Encoding issue with chm help in 2.7.1
http://bugs.python.org/issue10583  opened by flashk

#10587: Document the meaning of str methods
http://bugs.python.org/issue10587  opened by belopolsky

#10588: imp.find_module raises unexpected SyntaxError
http://bugs.python.org/issue10588  opened by emile.anclin

#10589: I/O ABC docs should specify which methods have implementations
http://bugs.python.org/issue10589  opened by stutzbach

#10590: Parameter type error for xml.sax.parseString(string, ...)
http://bugs.python.org/issue10590  opened by Thomas.Ryan

#10592: pprint module doesn't work well with OrderedDicts
http://bugs.python.org/issue10592  opened by mikez302

#10595: Adding a syslog.conf reader in syslog
http://bugs.python.org/issue10595  opened by tarek

#10596: modulo operator bug
http://bugs.python.org/issue10596  opened by Sergio.Ĥlutĉin

#10598: curses fails to import on Solaris
http://bugs.python.org/issue10598  opened by rtarnell

#10599: sgmllib.parse_endtag() is not respecting quoted text
http://bugs.python.org/issue10599  opened by Michael.Brooks

#10601: sys.displayhook: use backslashreplace error handler if repr(va
http://bugs.python.org/issue10601  opened by haypo

#10605: ElementTree documentation
http://bugs.python.org/issue10605  opened by adrian_nye

#10608: Add a section to Windows FAQ explaining os.symlink
http://bugs.python.org/issue10608  opened by brian.curtin

#10609: dbm documentation example doesn't work (iteritems())
http://bugs.python.org/issue10609  opened by sandro.tosi

#10610: 

Re: [Python-Dev] buffer interface messages

2010-12-03 Thread Georg Brandl
On 03.12.2010 17:57, Antoine Pitrou wrote:
 On Sat, 4 Dec 2010 02:45:42 +1000
 Nick Coghlan ncogh...@gmail.com wrote:
 On Sat, Dec 4, 2010 at 2:28 AM, Antoine Pitrou solip...@pitrou.net wrote:
  On Fri, 03 Dec 2010 10:11:29 -0500
  R. David Murray rdmur...@bitdance.com wrote:
  
'abc'.transform('hex')
   TypeError: 'str' does not support the buffer interface
b'abc'.transform('rot13')
   TypeError: expected an object with the buffer interface
 
  I find these 'buffer interface' error messages to be the most confusing
  error message I get out of Python3 no matter what context they show up
  in.  I have no idea what they are telling me.  That issue is more
  general than transform/untransform, but perhaps it could be fixed
  for transform/untransform in particular.
 
  I agree. "Buffer interface" is a technicality that the Python user
  doesn't know about (unless (s)he also writes C extensions). How about
  "expected a bytes-compatible object"?
 
 Why not "binary data interface"? That's what they're actually looking for.
 
 I don't think it's more understandable, since it's not a well-known
 Python concept.
 (and in this specific case, the transform() method is on str and bytes,
 not on arbitrary bytes-like objects)

object that can be handled as bytes?
object that provides bytes?

 It seems odd for 'rot13' to be throwing that error though.
 
 Agreed.

rot13 is a str-str codec.

Georg
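
For readers trying these cases today: as far as I can tell, the transform()/untransform() methods discussed in this thread are not available on str or bytes in current Python 3 releases. A minimal sketch, assuming Python 3.4 or later, that reaches the same bytes-to-bytes and str-to-str codecs through the codecs module functions instead:

```python
import codecs

# 'hex' is a bytes-to-bytes codec: bytes in, bytes out.
hex_bytes = codecs.encode(b'abc', 'hex')
round_trip = codecs.decode(hex_bytes, 'hex')

# 'rot13' is a str-to-str codec: text in, text out.
rotated = codecs.encode('abc', 'rot13')

# Feeding the wrong input type to a codec still raises TypeError,
# which is the confusion this thread is about.
try:
    codecs.encode('abc', 'hex')
    hex_on_str_failed = False
except TypeError:
    hex_on_str_failed = True

print(hex_bytes, round_trip, rotated, hex_on_str_failed)
```

The exact wording of the TypeError differs between Python versions, so the sketch only checks the exception type.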

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] buffer interface messages

2010-12-03 Thread Antoine Pitrou
On Fri, 03 Dec 2010 18:09:39 +0100
Georg Brandl g.bra...@gmx.net wrote:
 On 03.12.2010 17:57, Antoine Pitrou wrote:
  On Sat, 4 Dec 2010 02:45:42 +1000
  Nick Coghlan ncogh...@gmail.com wrote:
  On Sat, Dec 4, 2010 at 2:28 AM, Antoine Pitrou solip...@pitrou.net wrote:
   On Fri, 03 Dec 2010 10:11:29 -0500
   R. David Murray rdmur...@bitdance.com wrote:
   
 'abc'.transform('hex')
TypeError: 'str' does not support the buffer interface
 b'abc'.transform('rot13')
TypeError: expected an object with the buffer interface
  
   I find these 'buffer interface' error messages to be the most confusing
   error message I get out of Python3 no matter what context they show up
   in.  I have no idea what they are telling me.  That issue is more
   general than transform/untransform, but perhaps it could be fixed
   for transform/untransform in particular.
  
   I agree. "Buffer interface" is a technicality that the Python user
   doesn't know about (unless (s)he also writes C extensions). How about
   "expected a bytes-compatible object"?
  
  Why not "binary data interface"? That's what they're actually looking for.
  
  I don't think it's more understandable, since it's not a well-known
  Python concept.
  (and in this specific case, the transform() method is on str and bytes,
  not on arbitrary bytes-like objects)
 
 object that can be handled as bytes?
 object that provides bytes?
 
  It seems odd for 'rot13' to be throwing that error though.
  
  Agreed.
 
 rot13 is a str-str codec.

Then why does it claim to expect an object with the buffer interface?
bytes has the buffer interface, while str doesn't ;)

I'll also mention that getting a TypeError when you call a method with
the correct type of argument is a bit surprising.

cheers

Antoine.




Re: [Python-Dev] buffer interface messages

2010-12-03 Thread Georg Brandl
On 03.12.2010 18:29, Antoine Pitrou wrote:
 On Fri, 03 Dec 2010 18:09:39 +0100
 Georg Brandl g.bra...@gmx.net wrote:
 On 03.12.2010 17:57, Antoine Pitrou wrote:
  On Sat, 4 Dec 2010 02:45:42 +1000
  Nick Coghlan ncogh...@gmail.com wrote:
  On Sat, Dec 4, 2010 at 2:28 AM, Antoine Pitrou solip...@pitrou.net 
  wrote:
   On Fri, 03 Dec 2010 10:11:29 -0500
   R. David Murray rdmur...@bitdance.com wrote:
   
 'abc'.transform('hex')
TypeError: 'str' does not support the buffer interface
 b'abc'.transform('rot13')
TypeError: expected an object with the buffer interface
  
   I find these 'buffer interface' error messages to be the most confusing
   error message I get out of Python3 no matter what context they show up
   in.  I have no idea what they are telling me.  That issue is more
   general than transform/untransform, but perhaps it could be fixed
   for transform/untransform in particular.
  
   I agree. "Buffer interface" is a technicality that the Python user
   doesn't know about (unless (s)he also writes C extensions). How about
   "expected a bytes-compatible object"?
  
  Why not "binary data interface"? That's what they're actually looking for.
  
  I don't think it's more understandable, since it's not a well-known
  Python concept.
  (and in this specific case, the transform() method is on str and bytes,
  not on arbitrary bytes-like objects)
 
 object that can be handled as bytes?
 object that provides bytes?
 
  It seems odd for 'rot13' to be throwing that error though.
  
  Agreed.
 
 rot13 is a str-str codec.
 
 Then why does it claim to expect an object with the buffer interface?
 bytes has the buffer interface, while str doesn't ;)

I'll leave it to you to figure it out.  (Hint: look at the traceback.)

 I'll also mention that getting a TypeError when you call a method with
 the correct type of argument is a bit surprising.

Why? The codec has the wrong type :)  But seriously, MAL already proposed
a bit more information on codecs, so that they can let it be known which
types they convert between.

Georg



Re: [Python-Dev] transform() and untransform() methods, and the codec registry

2010-12-03 Thread R. David Murray
On Fri, 03 Dec 2010 11:14:56 -0500, Alexander Belopolsky 
alexander.belopol...@gmail.com wrote:
 On Fri, Dec 3, 2010 at 10:11 AM, R. David Murray rdmur...@bitdance.com 
 wrote:
 ..
  Please also recall that transform/untransform was discussed before
  the release of Python 3.0 and was approved at the time, but it just
  did not get implemented before the 3.0 release.
 
 
 Can you provide a link?  My search for transform on python-dev came out with

It was linked from the issue, if I recall correctly.  I do remember
reading the thread from the python-3000 list, linked by someone
somewhere :)

 http://mail.python.org/pipermail/python-dev/2010-June/100564.html
 
 where you seem to oppose these methods.  Also, new methods to builtins

It looks to me like I was agreeing that transform/untransform should
do only bytes->bytes or str->str regardless of what codec name you
passed them.

 fall under the language moratorium (but can be approved on a
 case-by-case basis):
 
 http://www.python.org/dev/peps/pep-3003/#case-by-case-exemptions
 
 Is there an effort to document these exceptions?  I expected such
 approvals to be added to PEP 3003, but apparently this was not the
 case.

I believe MAL's thought was that the addition of these methods had
been approved pre-moratorium, but I don't know if that is a
sufficient argument or not.

--
R. David Murray  www.bitdance.com


Re: [Python-Dev] transform() and untransform() methods, and the codec registry

2010-12-03 Thread Guido van Rossum
On Fri, Dec 3, 2010 at 9:58 AM, R. David Murray rdmur...@bitdance.com wrote:
 On Fri, 03 Dec 2010 11:14:56 -0500, Alexander Belopolsky 
 alexander.belopol...@gmail.com wrote:
 On Fri, Dec 3, 2010 at 10:11 AM, R. David Murray rdmur...@bitdance.com 
 wrote:
 ..
  Please also recall that transform/untransform was discussed before
  the release of Python 3.0 and was approved at the time, but it just
  did not get implemented before the 3.0 release.
 

 Can you provide a link?  My search for transform on python-dev came out with

 It was linked from the issue, if I recall correctly.  I do remember
 reading the thread from the python-3000 list, linked by someone
 somewhere :)

 http://mail.python.org/pipermail/python-dev/2010-June/100564.html

 where you seem to oppose these methods.  Also, new methods to builtins

 It looks to me like I was agreeing that transform/untransform should
 do only bytes->bytes or str->str regardless of what codec name you
 passed them.

 fall under the language moratorium (but can be approved on a
 case-by-case basis):

 http://www.python.org/dev/peps/pep-3003/#case-by-case-exemptions

 Is there an effort to document these exceptions?  I expected such
 approvals to be added to PEP 3003, but apparently this was not the
 case.

 I believe MAL's thought was that the addition of these methods had
 been approved pre-moratorium, but I don't know if that is a
 sufficient argument or not.

It is not.

The moratorium is intended to freeze the state of the language as
implemented, not whatever was discussed and approved but didn't get
implemented (that'd be a hole big enough to drive a truck through, as
the saying goes :-).

Regardless of what I or others may have said before, I am not
currently a fan of adding transform() to either str or bytes.

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] Porting Ideas

2010-12-03 Thread Prashant Kumar
On 12/3/10, Éric Araujo mer...@netwok.org wrote:
 Hi Prashant,

 Python 3 support in distutils2 is not entirely finished, it’s an
 interesting and challenging task.

 Another idea: convert the python.org internal scripts to use Python 3,
 for example starting with patches for http://code.python.org/hg/peps/ .
 This would not have any impact on the community, but it’s easy work
 that’d help the Python developers to eat their own dogfood.

 Regards


 Thanks for the suggestion.


Re: [Python-Dev] transform() and untransform() methods, and the codec registry

2010-12-03 Thread Alexander Belopolsky
On Fri, Dec 3, 2010 at 4:16 AM, Victor Stinner
victor.stin...@haypocalc.com wrote:
..
 I don't like transform() and untransform() because I think that we should not
 add too much operations to the base types (bytes and str), and they do
 implicit module import. I prefer explicit module import (eg. import binascii;
 binascii.hexlify(b'to hex')).

+1

Implicit imports are currently subtly broken with no solution in sight.  See

http://bugs.python.org/issue8098
http://bugs.python.org/issue7980

In fact, once the language moratorium is over, I will argue that
str.encode() and bytes.decode() should deprecate the encoding argument and
just do UTF-8 encoding/decoding.  Hopefully by that time most people
will forget that other encodings exist.  (I can dream, right?)  Python
can still provide the codec machinery but not tie it to str/bytes
builtins.  After all, neither encode/decode nor transform/untransform
fully utilize the power of codec design.
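
The explicit-import style quoted above can be sketched as follows. This is only an illustration using binascii, the module named in the quoted message; the sample strings are made up. It also shows that str.encode() and bytes.decode() already default to UTF-8 in Python 3 when no encoding argument is given:

```python
import binascii

payload = b'to hex'

# Explicit module call instead of a codec name passed to a method.
hexed = binascii.hexlify(payload)
restored = binascii.unhexlify(hexed)

# With no argument, str.encode() and bytes.decode() use UTF-8.
text = 'L\u00f6wis'
encoded = text.encode()
decoded = encoded.decode()

print(hexed, restored, encoded, decoded)
```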


Re: [Python-Dev] [Python-checkins] r86945 - python/branches/py3k/Lib/test/__main__.py

2010-12-03 Thread Brett Cannon
I just want to say thanks for doing this, Michael. __main__.py is IMO
woefully underused and it's great to see Python dogfooding the feature
along with making it easier to explain how to run our unit tests.

On Thu, Dec 2, 2010 at 17:34, michael.foord python-check...@python.org wrote:
 Author: michael.foord
 Date: Fri Dec  3 02:34:01 2010
 New Revision: 86945

 Log:
 Initial implementation of Lib/test/__main__.py so we can run tests with 
 'python -m test'

 Added:
   python/branches/py3k/Lib/test/__main__.py   (contents, props changed)

 Added: python/branches/py3k/Lib/test/__main__.py
 ==
 --- (empty file)
 +++ python/branches/py3k/Lib/test/__main__.py   Fri Dec  3 02:34:01 2010
 @@ -0,0 +1,38 @@
 +import os
 +import sys
 +import sysconfig
 +
 +from test import support
 +from test.regrtest import main
 +
 +# findtestdir() gets the dirname out of __file__, so we have to make it
 +# absolute before changing the working directory.
 +# For example __file__ may be relative when running trace or profile.
 +# See issue #9323.
 +__file__ = os.path.abspath(__file__)
 +
 +# sanity check
 +assert __file__ == os.path.abspath(sys.argv[0])
 +
 +# When tests are run from the Python build directory, it is best practice
 +# to keep the test files in a subfolder.  It eases the cleanup of leftover
 +# files using command make distclean.
 +if sysconfig.is_python_build():
 +    TEMPDIR = os.path.join(sysconfig.get_config_var('srcdir'), 'build')
 +    TEMPDIR = os.path.abspath(TEMPDIR)
 +    if not os.path.exists(TEMPDIR):
 +        os.mkdir(TEMPDIR)
 +
 +# Define a writable temp dir that will be used as cwd while running
 +# the tests. The name of the dir includes the pid to allow parallel
 +# testing (see the -j option).
 +TESTCWD = 'test_python_{}'.format(os.getpid())
 +
 +TESTCWD = os.path.join(TEMPDIR, TESTCWD)
 +
 +# Run the tests in a context manager that temporary changes the CWD to a
 +# temporary and writable directory. If it's not possible to create or
 +# change the CWD, the original CWD will be used. The original CWD is
 +# available from support.SAVEDCWD.
 +with support.temp_cwd(TESTCWD, quiet=True):
 +    main()
 ___
 Python-checkins mailing list
 python-check...@python.org
 http://mail.python.org/mailman/listinfo/python-checkins
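
For readers unfamiliar with the feature being praised here: a package that contains a __main__.py can be run with "python -m package". A minimal sketch, assuming a throwaway package named demo_pkg (a hypothetical name, not part of the commit) and using runpy to do roughly what the -m switch does:

```python
import os
import runpy
import sys
import tempfile

with tempfile.TemporaryDirectory() as root:
    pkg_dir = os.path.join(root, 'demo_pkg')   # hypothetical package name
    os.mkdir(pkg_dir)
    with open(os.path.join(pkg_dir, '__init__.py'), 'w') as f:
        f.write('')
    with open(os.path.join(pkg_dir, '__main__.py'), 'w') as f:
        f.write("RESULT = 'ran as ' + __name__\n")

    sys.path.insert(0, root)
    try:
        # Roughly what "python -m demo_pkg" does: execute demo_pkg/__main__.py.
        namespace = runpy.run_module('demo_pkg', run_name='__main__')
    finally:
        sys.path.remove(root)

print(namespace['RESULT'])
```

The real Lib/test/__main__.py in the commit does more (it sets up a temporary working directory before calling main()); the sketch only shows the entry-point mechanism.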



Re: [Python-Dev] PEP 384 accepted

2010-12-03 Thread Martin v. Löwis
 For example, I keep running into the issue that distutils doesn't
 currently support parallel builds. I have been pondering supporting
 -j for building extensions, using both unbounded -j and the GNU make
 style -jN build server. However, I know that the patch will be rejected,
 so I don't even start working on it.
 This would be a very useful feature for distutils2.

But I'm not interested at all in having it in distutils2. I want the
Python build itself to use it, and alas, I can't because of the freeze.

Regards,
Martin


Re: [Python-Dev] PEP 384 accepted

2010-12-03 Thread Éric Araujo
 But I'm not interested at all in having it in distutils2. I want the
 Python build itself to use it, and alas, I can't because of the freeze.
You can’t in 3.2, true.  Neither can you in 3.1, or any previous
version.  If you implement it in distutils2, you have very good chances
to get it for 3.3.  Isn’t that a win?

(BTW: congratulations on having PEP 384 accepted and merged.)

Regards



Re: [Python-Dev] PEP 384 accepted

2010-12-03 Thread Martin v. Löwis
On 03.12.2010 23:48, Éric Araujo wrote:
 But I'm not interested at all in having it in distutils2. I want the
 Python build itself to use it, and alas, I can't because of the freeze.
 You can’t in 3.2, true.  Neither can you in 3.1, or any previous
 version.  If you implement it in distutils2, you have very good chances
 to get it for 3.3.  Isn’t that a win?

It is, unfortunately, a very weak promise. Until distutils2 is
integrated in Python, I probably won't spend any time on it.

Regards,
Martin


[Python-Dev] gc ideas -- sparse memory

2010-12-03 Thread Dima Tisnek
How hard or reasonable would it be to free memory pages on OS level?

[pcmiiw] Garbage collection within a generation involves moving live
objects to compact the generation storage. This separates the memory
region into 2 parts, "live" and "cleared"; the pointer to the beginning
of the cleared part is where the next allocation is going to happen.

When this is done, does Python gc move the objects preserving order, or
does it try to populate a garbage slot with some live object,
disregarding order? The former case is more applicable, but the latter
case is a target for the idea below too.

If we were to look at memory regions from the OS point of view, they
are allocated as pages (or possibly as hugetlb pages). So if we are to
compact something like this [LL__][_L__][____][L_LL][LFFF]  where []
is a page, L is a live object, _ is garbage and F is free memory,
would it not make more sense to tell the OS that [____] is not needed
anymore, and not move some of the consecutive [L_LL][LFFF] at all, or
at least not move those objects as far down the memory region?

This would of course have a certain overhead of tracking which pages
are given back to the OS and mapping them back when needed, but at the
same time be beneficial because fewer objects are moved, and it could
also improve cpu cache performance because objects won't be moved
so far out.

p.s. if someone has an authoritative link to modern python gc design,
please let me know.


Re: [Python-Dev] gc ideas -- sparse memory

2010-12-03 Thread Terry Reedy

On 12/3/2010 5:55 PM, Dima Tisnek wrote:

How hard or reasonable would it be to free memory pages on OS level?

[pcmiiw] Garbage collection within a generation involves moving live
objects to compact the generation storage. This separates the memory
region into 2 parts, "live" and "cleared"; the pointer to the beginning
of the cleared part is where the next allocation is going to happen.

When this is done, does Python gc move the objects preserving order, or
does it try to populate a garbage slot with some live object,
disregarding order? The former case is more applicable, but the latter
case is a target for the idea below too.

If we were to look at memory regions from the OS point of view, they
are allocated as pages (or possibly as hugetlb pages). So if we are to
compact something like this [LL__][_L__][____][L_LL][LFFF]  where []
is a page, L is a live object, _ is garbage and F is free memory,
would it not make more sense to tell the OS that [____] is not needed
anymore, and not move some of the consecutive [L_LL][LFFF] at all, or
at least not move those objects as far down the memory region?

This would of course have a certain overhead of tracking which pages
are given back to the OS and mapping them back when needed, but at the
same time be beneficial because fewer objects are moved, and it could
also improve cpu cache performance because objects won't be moved
so far out.

p.s. if someone has an authoritative link to modern python gc design,
please let me know.


gc is implementation specific. CPython uses ref counting + cycle gc. A 
constraint on all implementations is that objects have a fixed, unique 
id during their lifetime. CPython uses the address as the id, so it 
cannot move objects. Other implementations do differently. Compacting gc 
requires an id to current address table or something.


--
Terry Jan Reedy
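
The two points in the reply above (a live object keeps its id, and CPython pairs reference counting with a cycle collector) can be observed from Python itself. A minimal sketch; the Node class is a made-up placeholder, and the comment about addresses applies to CPython only:

```python
import gc
import weakref


class Node:
    """Plain object that can take part in a reference cycle."""


node = Node()
first_id = id(node)

# Allocate other objects; the id of a live object must not change.
filler = [Node() for _ in range(1000)]
same_id = id(node) == first_id

# CPython detail: id() is the object's memory address there.
# Build a cycle that reference counting alone cannot free.
node.partner = Node()
node.partner.partner = node
probe = weakref.ref(node)
del node

gc.collect()              # the cycle collector reclaims the pair
collected = probe() is None

print(same_id, collected)
```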



Re: [Python-Dev] [Python-checkins] r86976 - in python/branches/py3k: Doc/library/configparser.rst Doc/library/fileformats.rst Lib/configparser.py Lib/test/test_cfgparser.py Misc/NEWS

2010-12-03 Thread Łukasz Langa
Message written by Éric Araujo on 2010-12-03, at 19:35:

 Hello,
 
 Author: lukasz.langa
 New Revision: 86976
 Log: Issue 10499: Modular interpolation in configparser
 
 Modified: python/branches/py3k/Doc/library/configparser.rst
 Is the module still backward compatible with the 3.1 version, modulo
 fixed bugs?  I haven’t been able to follow closely all the great
 improvements you’ve been making, and there has been a lot of movement in
 the code, so I’m not sure.

There have been minor incompatibilities, all of which have been thoroughly
discussed with Fred and other developers. No changes should cause silent
breakage, though.

 Thanks for taking up configparser.  Maybe it will become so useful and
 extensible that you’ll have to release it on PyPI for older Pythons :)

Sure. Why not? :)

 Modified: python/branches/py3k/Doc/library/fileformats.rst
 This looks like unrelated changes that slipped in the commit. 

As we discussed this on IRC, this is an unfortunate slip caused by me trying to 
wrap up documentation updates in one go. Will remember to do that separately 
now. Thanks!

 Modified: python/branches/py3k/Lib/configparser.py
 
 Raise DuplicateSectionError if a section by the specified name
 already exists. Raise ValueError if name is DEFAULT.
 
 -        if section == self._default_section:
 +        if section == self.default_section:
              raise ValueError('Invalid section name: %s' % section)
 I think it’s the only error message using %s instead of %r.  The quotes
 added by %r typically help spot names with spaces (embedded or trailing)
 and the like.

Corrected in rev 86999, thanks.

 +        options = list(d.keys())
 +        if raw:
 +            return [(option, d[option])
 +                    for option in options]
 +        else:
 +            return [(option, self._interpolation.before_get(self, section,
 +                                                            option, d[option],
 +                                                            d))
 +                    for option in options]
 The list call seems unneeded.  Minor style point: I avoid such dangling
 arguments that don’t read great, I prefer to go to a newline after the
 opening paren:
   return [(option, self._interpolation.before_get(
   self, section, option, d[option], d)) for
   option in options]

You're right that my formatting looks ugly but yours is not that much better 
either ;) How about:

value_getter = lambda option: self._interpolation.before_get(self,
    section, option, d[option], d)
if raw:
    value_getter = lambda option: d[option]
return [(option, value_getter(option))
        for option in d.keys()]

-- 
Best regards,
Łukasz Langa
tel. +48 791 080 144
WWW http://lukasz.langa.pl/
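
The %s versus %r point raised in the review above is easy to see with a name that has a trailing space. A small sketch; the section name is a made-up example:

```python
section = 'DEFAULT '   # hypothetical name with a trailing space

with_s = 'Invalid section name: %s' % section
with_r = 'Invalid section name: %r' % section

print(with_s)   # the trailing space is invisible here
print(with_r)   # the quotes added by %r make it visible
```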



Re: [Python-Dev] PEP 384 accepted

2010-12-03 Thread James Y Knight
On Dec 3, 2010, at 5:52 PM, Martin v. Löwis wrote:

 On 03.12.2010 23:48, Éric Araujo wrote:
 But I'm not interested at all in having it in distutils2. I want the
 Python build itself to use it, and alas, I can't because of the freeze.
 You can’t in 3.2, true.  Neither can you in 3.1, or any previous
 version.  If you implement it in distutils2, you have very good chances
 to get it for 3.3.  Isn’t that a win?
 
 It is, unfortunately, a very weak promise. Until distutils2 is
 integrated in Python, I probably won't spend any time on it.

It seems like it'd be a good idea to start integrating distutils2 into python 
trunk immediately after the 3.2 branch is cut, then.

James



[Python-Dev] gc ideas -- dynamic profiling

2010-12-03 Thread Dima Tisnek
Python organizes objects into 3 generations, ephemeral, short- and long-lived.

When an object is created it is placed in ephemeral; if it lives long
enough, it is moved to short-lived, and so on.

q1: are generations placed in separate memory regions, or are all
generations in one memory region with a pointer that
signifies the boundary between generations?

I propose to track hot spots in python, that is, contexts where most
allocations occur, and instrument these with counters that essentially
tell how often an object generated here ends up killed in an ephemeral,
short- or long-lived garbage collector run, or is in fact still alive. If
a particular allocation context creates objects that are likely to be
long-lived, the allocator could skip the first 2 generations altogether
(if generations are separate regions) or preload the object with a high
survival count (if q1 is single region).

On the other hand, if we know where most allocations occur, we can
presume that most of these allocations are ephemeral, otherwise we would
run out of memory anyway; if this is indeed so, it makes my point moot.

Implications are extra code to define context, an extra pointer back to
the context from every allocation (or alternatively a weakref from
the allocation point to every object it generated) and real-time
accounting as to what happens to these objects.

It should be possible to approach this problem statistically, that is,
instrument only every 100th object or so.

Context could be simple, e.g. bytecode operation 3 on line 45 in
module junk, or more complex, e.g. call stack
str-genera...@238382-function foo-function boo-...; or even
patterns like ''' str-any depth-function big '''; clearly figuring
out what hotspots are is already non-trivial, and the more complex the
definition of context, the more it borders on downright impossible.

p.s. can anyone share modern cpython profiling results to shed some
light on how important gc optimization really is?


Re: [Python-Dev] gc ideas -- sparse memory

2010-12-03 Thread James Y Knight
On Dec 3, 2010, at 6:04 PM, Terry Reedy wrote:
 gc is implementation specific. CPython uses ref counting + cycle gc. A 
 constraint on all implementations is that objects have a fixed, unique id 
 during their lifetime. CPython uses the address as the id, so it cannot move 
 objects. Other implementations do differently. Compacting gc requires an id 
 to current address table or something.

It's somewhat unfortunate that Python has this constraint, instead of the
looser "objects have a fixed id during their lifetime", which is much easier
to implement, and practically as useful.

James



Re: [Python-Dev] gc ideas -- sparse memory

2010-12-03 Thread Martin v. Löwis
On 03.12.2010 23:55, Dima Tisnek wrote:
 How hard or reasonable would it be to free memory pages on OS level?

Very easy. Python already does that.

 [pcmiiw] Garbage collection within a generation involves moving live
 objects to compact the generation storage. This separates the memory
 region into 2 parts, "live" and "cleared"; the pointer to the beginning
 of the cleared part is where the next allocation is going to happen.

I think you are talking about copying collectors here. This is not how
Python's garbage collection works.

 When this is done, does Python gc move the objects preserving order, or
 does it try to populate a garbage slot with some live object,
 disregarding order? The former case is more applicable, but the latter
 case is a target for the idea below too.

(C)Python's garbage collector is not moving objects at all.

 If we were to look at memory regions from the OS point of view, they
 are allocated as pages (or possibly as hugetlb pages). So if we are to
 compact something like this [LL__][_L__][____][L_LL][LFFF]  where []
 is a page, L is a live object, _ is garbage and F is free memory,
 would it not make more sense to tell the OS that [____] is not needed
 anymore, and not move some of the consecutive [L_LL][LFFF] at all, or
 at least not move those objects as far down the memory region?

See above. Python does no moving of objects whatsoever.

Regards,
Martin


Re: [Python-Dev] gc ideas -- sparse memory

2010-12-03 Thread Dima Tisnek
Oh my bad, I must've confused python with some research paper.
Unique id is not so hard to make without an address.

While on this topic, what is the real need for unique ids?
Also I reckon not all objects need a unique id like this, e.g.
interned strings, simple data types and hashable and comparable
objects could perhaps survive without unique id?

On 3 December 2010 16:04, Terry Reedy tjre...@udel.edu wrote:
 On 12/3/2010 5:55 PM, Dima Tisnek wrote:

 How hard or reasonable would it be to free memory pages on OS level?

 [pcmiiw] Gabage collection within a generation involves moving live
 objects to compact the generation storage. This separates the memory
 region into 2 parts live and cleared, the pointer to the beginning
 of the cleared part is where next allocation is going to happen.

 When this is done, does Python gc move the objects preserving order or
 does it try to populate garbaged slot with some live object
 disregarding order? Earlier case is more applicable, but latter case
 is a target for below too.

 If we were to look at memory regions from the OS point of view, they
 are allocated as pages (or possibly as hugetlb pages). So if we are to
 compact something like this [LL__][_L__][][L_LL][LFFF]  where []
 is a page, L is live object and _ is garbage and F is free memory,
 would it not make more sense to tell OS that [] is not needed
 anymore, and not move some of the consequtive [L_LL][LFFF] at all, or
 at least not move those objects as far down the memory region?

 This would of course have a certain overhead of tracking which pages
 are given back to OS and mapping them back when needed, but at the
 same time, be beneficial because fewer objets are moved and also
 possibly improve cpu cache performance because objects won't be moved
 so far out.

 p.s. if someone has an athoritative link to modern python gc design,
 please let me know.

 gc is implementation specific. CPython uses ref counting + cycle gc. A
 constraint on all implementations is that objects have a fixed, unique id
 during their lifetime. CPython uses the address as the id, so it cannot move
 objects. Other implementations do differently. Compacting gc requires an id
 to current address table or something.

 --
 Terry Jan Reedy




Re: [Python-Dev] gc ideas -- dynamic profiling

2010-12-03 Thread Martin v. Löwis
 q1 are generations placed in separate memory regions, or are all
 generations in one memory regions and there is a pointer that
 signifies the boundary between generations?

You should really start reading the source code. See Modules/gcmodule.c.

To answer your question: neither. All objects are in one region,
and there is no pointer separating the generations. Instead, all objects
belonging to one generation form a doubly linked list.
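
For readers who would rather look from the Python side than start in Modules/gcmodule.c, the standard gc module exposes the per-generation counters and thresholds. This is only a minimal sketch for inspection; the numbers depend on the interpreter version and on what the process has allocated, so none are shown here.

```python
import gc

# One threshold per generation of the cyclic collector.
thresholds = gc.get_threshold()
# Current collection counters, again one per generation.
counts = gc.get_count()

for generation, (count, threshold) in enumerate(zip(counts, thresholds)):
    print(f"generation {generation}: count={count}, threshold={threshold}")
```

Nothing in this output refers to a memory region or a boundary pointer, which matches the description above: generation membership is bookkeeping on the objects, not a location.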

 I propose to track hot spots in python, that is contexts where most of
 allocations occur, and instrument these with counters that essentially
 tell how often an object generated here ends up killed in ephemeral,
 short-, long-lived garbage collector run or is in fact still alive. If
 a particular allocation context creates objects that are likely to be
 long-lived, the allocator could skip the first 2 generations altogether
 (generations are separate regions) or preload the object with high
 survival count (if q1 is single region).

We would consider such a proposal only if you had *actual evidence*
that it improves things, rather than just having a reasoning that it
might.

Regards,
Martin


Re: [Python-Dev] PEP 384 accepted

2010-12-03 Thread Terry Reedy

On 12/3/2010 5:52 PM, Martin v. Löwis wrote:

Am 03.12.2010 23:48, schrieb Éric Araujo:

But I'm not interested at all in having it in distutils2. I want the
Python build itself to use it, and alas, I can't because of the freeze.

You can’t in 3.2, true.  Neither can you in 3.1, or any previous
version.  If you implement it in distutils2, you have very good chances
to get it for 3.3.  Isn’t that a win?


It is, unfortunately, a very weak promise. Until distutils2 is
integrated in Python, I probably won't spend any time on it.


Éric, I have the impression from Tarek and you together that D2 is still 
in alpha only because it is not feature frozen and that it is as capable 
and stable as D1. I do not know what Martin means by 'integrate' (other 
than that he be able to use it to build Python), but if my first 
sentence is true, I cannot help but wonder whether a snapshot of D2 
could be included with 3.2, perhaps as '_distribute2' (note leading 
underscore) at least for Python's use. The doc, if any, could just say
'Development snapshot of D2.a4 (or whatever) for building Python. Other
uses should get the latest public release from PyPI.'


--
Terry Jan Reedy




Re: [Python-Dev] gc ideas -- sparse memory

2010-12-03 Thread Martin v. Löwis
 Oh my bad, I must've confused python with some research paper.
 Unique id is not so hard to make without an address.
 
 While on this topic, what is the real need for unique ids?

They are absolutely needed for mutable objects. For immutable ones,
it would be ok to claim that they are identical if they are equal
(assuming they support equality - which is tricky for things like NaN).
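
The NaN caveat can be seen directly. A minimal sketch, using only the built-in float type:

```python
nan = float("nan")

# Equality is not reflexive for NaN, so "identical if equal" cannot be
# applied blindly to every immutable type.
print(nan == nan)    # False
print(nan is nan)    # True

# A second, separately constructed NaN does not compare equal either.
other = float("nan")
print(nan == other)  # False
```

Whether `nan is other` holds is left out on purpose, since that depends on how an implementation defines identity for floats.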

Of course, the C API has lots of assumptions that identity and address
are really the same thing.

Regards,
Martin


Re: [Python-Dev] PEP 384 accepted

2010-12-03 Thread Martin v. Löwis
Am 04.12.2010 00:35, schrieb Terry Reedy:
 On 12/3/2010 5:52 PM, Martin v. Löwis wrote:
 Am 03.12.2010 23:48, schrieb Éric Araujo:
 But I'm not interested at all in having it in distutils2. I want the
 Python build itself to use it, and alas, I can't because of the freeze.
 You can’t in 3.2, true.  Neither can you in 3.1, or any previous
 version.  If you implement it in distutils2, you have very good chances
 to get it for 3.3.  Isn’t that a win?

 It is, unfortunately, a very weak promise. Until distutils2 is
 integrated in Python, I probably won't spend any time on it.
 
 Éric, I have the impression from Tarek and you together that D2 is still
 in alpha only because it is not feature frozen and that it is as capable
 and stable as D1. I do not know what Martin means by 'integrate' (other
 than that he be able to use it to build Python)

That the master copy of the source code is in the Python source repository.

Regards,
Martin


Re: [Python-Dev] PEP 384 accepted

2010-12-03 Thread Terry Reedy

On 12/3/2010 6:46 PM, Martin v. Löwis wrote:


and stable as D1. I do not know what Martin means by 'integrate' (other
than that he be able to use it to build Python)


That the master copy of the source code is in the Python source repository.


Is a separate branch acceptable, as long as you can commit changes?
Does the meaning of 'repository' change any with the hg switch?

--
Terry Jan Reedy




Re: [Python-Dev] gc ideas -- sparse memory

2010-12-03 Thread Terry Reedy

On 12/3/2010 6:15 PM, James Y Knight wrote:

On Dec 3, 2010, at 6:04 PM, Terry Reedy wrote:

gc is implementation specific. CPython uses ref counting + cycle
gc. A constraint on all implementations is that objects have a
fixed, unique id during their lifetime. CPython uses the address as
the id, so it cannot move objects. Other implementations do
differently. Compacting gc requires an id-to-current-address table
or something.


I left out that the id must be an int.


It's somewhat unfortunate that python has this constraint, instead of
the looser: objects have a fixed id during their lifetime, which is
much easier to implement, and practically as useful.


Given that the only difference between 'fixed and unique' and 'fixed' is
the uniqueness part, I do not understand 'practically as useful'.
Duplicate ids (in the extreme, the same for all objects) hardly seem
useful at all.


--
Terry Jan Reedy



Re: [Python-Dev] PEP 384 accepted

2010-12-03 Thread Martin v. Löwis
Am 04.12.2010 01:00, schrieb Terry Reedy:
 On 12/3/2010 6:46 PM, Martin v. Löwis wrote:
 
 and stable as D1. I do not know what Martin means by 'integrate' (other
 than that he be able to use it to build Python)

 That the master copy of the source code is in the Python source
 repository.
 
 Is a separate branch acceptable, as long as you can commit changes?

No. I want the buildbots to be able to use the code, in the Python trunk.

 Does the meaning of 'repository' change any with the hg switch?

No. The code would need to be pushed to the master repository, so that
the buildbots can fetch it from there.

In short, I'm not very interested in contributing to a tool that has no
users (and is the fifth of its kind).

Regards,
Martin



Re: [Python-Dev] gc ideas -- sparse memory

2010-12-03 Thread James Y Knight
On Dec 3, 2010, at 7:05 PM, Terry Reedy wrote:
 I left out that the id must be an int.
 
 It's somewhat unfortunate that python has this constraint, instead of
 the looser: objects have a fixed id during their lifetime, which is
 much easier to implement, and practically as useful.
 
 Given that the only difference between 'fixed and unique' and 'fixed' is the
 uniqueness part, I do not understand 'practically as useful'. Duplicate ids
 (in the extreme, the same for all objects) hardly seem useful at all.

Sure they are. This is what Java provides you, for example. If you have fixed, 
but potentially non-unique ids (in Java you get this using 
identityHashCode()), you can still make an identity hashtable. You simply 
need to *also* check using 'is' that the two objects really are the same one 
after finding the hash bin using id.

It'd be a quality of implementation issue whether an implementation gives you 
the same value for every object. It should not, of course, if it wants programs 
to have reasonable performance. Same sort of thing as how __hash__ should not 
return 0 for everything.
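
The identity-hashtable argument can be sketched in Python. Everything named here (`IdentityMap`, `identity_hash`, `n_buckets`) is hypothetical and exists only for illustration; `identity_hash` stands in for a function that is fixed per object but not guaranteed unique, in the spirit of Java's identityHashCode().

```python
class IdentityMap:
    """Identity-keyed map built on a fixed but possibly non-unique hash."""

    def __init__(self, identity_hash, n_buckets=8):
        # identity_hash: hypothetical callable, fixed per object, not unique.
        self._hash = identity_hash
        self._buckets = [[] for _ in range(n_buckets)]

    def _bucket(self, key):
        return self._buckets[self._hash(key) % len(self._buckets)]

    def __setitem__(self, key, value):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k is key:              # the extra identity check
                bucket[i] = (key, value)
                return
        bucket.append((key, value))

    def __getitem__(self, key):
        for k, v in self._bucket(key):
            if k is key:
                return v
        raise KeyError(key)


# Worst case: every object gets the same hash value. Lookups still give
# the right answer; they just degrade to a linear scan of one bucket.
m = IdentityMap(identity_hash=lambda obj: 0)
a, b = [], []          # equal contents, but two distinct objects
m[a] = "first"
m[b] = "second"
print(m[a], m[b])      # first second
```

Correctness comes from the `is` comparison; the hash only decides how many candidates have to be compared, which is the quality-of-implementation point made above.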

Besides identity-hashtables, the main other thing id gets used for is to have 
some identifier string in a printer (e.g. <class Foo at 0x1234567890>). 
There, it's generally good enough to use an id which is not guaranteed to be, 
but often is, unique. It works for Java...

James


Re: [Python-Dev] gc ideas -- sparse memory

2010-12-03 Thread Terry Reedy

On 12/3/2010 7:46 PM, James Y Knight wrote:


Sure they are. This is what Java provides you, for example. If you
have fixed, but potentially non-unique ids (in Java you get this
using identityHashCode()), you can still make an identity


I do not see the point of calling a (non-unique) hash value the identity


hashtable. You simply need to *also* check using 'is' that the two


In Python, that unique isness is the identity.

(a is b) == (id(a) == id(b)) by definition.


objects really are the same one after finding the hash bin using id.


by using the hash value, which is how Python dicts operate.

--
Terry Jan Reedy



Re: [Python-Dev] gc ideas -- sparse memory

2010-12-03 Thread Cameron Simpson
On 03Dec2010 18:15, James Y Knight f...@fuhm.net wrote:
| On Dec 3, 2010, at 6:04 PM, Terry Reedy wrote:
|  gc is implementation specific. CPython uses ref counting + cycle
|  gc. A constraint on all implementations is that objects have a fixed,
|  unique id during their lifetime. CPython uses the address as the id, so
|  it cannot move objects. Other implementations do differently. Compacting
|  gc requires an id-to-current-address table or something.
| 
| It's somewhat unfortunate that python has this constraint, instead of
| the looser: objects have a fixed id during their lifetime, which is
| much easier to implement, and practically as useful.

Python doesn't have the constraint you imagine; it _does_ have "objects
have a fixed id during their lifetime".

_CPython_ has this constraint because it uses the address as the id,
which is free and needs no mapping or extra storage. Of course, it
removes certain freedoms from the GC etc as a side effect.
-- 
Cameron Simpson c...@zip.com.au DoD#743
http://www.cskk.ezoshosting.com/cs/

The number of cylinders for this disk is set to 364737.
There is nothing wrong with that, but this is larger than 1024, [...]
- fdisk on our new RAID 07oct2007 :-)


Re: [Python-Dev] gc ideas -- sparse memory

2010-12-03 Thread James Y Knight
On Dec 3, 2010, at 10:50 PM, Terry Reedy wrote:
 On 12/3/2010 7:46 PM, James Y Knight wrote:
 
 Sure they are. This is what Java provides you, for example. If you
 have fixed, but potentially non-unique ids (in Java you get this
 using identityHashCode()), you can still make an identity
 
 I do not see the point of calling a (non-unique) hash value the identity

My point was simply that a) it's an unfortunate constraint on potential GC 
implementations that objects need to have a fixed and unique id in Python, and 
b) that it's not actually necessary to have such a constraint (in the abstract 
sense of required; obviously it's a requirement upon Python *today*, due to 
existing code which depends upon that promise). 

Would you be happier if I had said it's unfortunate that Python has an id 
function instead of an identityHashValue function? I suppose that's what I 
really meant. Python the language would not have been harmed had it had from 
the start an identityHashValue() function instead of an id() function. In the 
CPython implementation, it may even have had the exact same behavior, but 
would've allowed other implementations more flexibility.

James



Re: [Python-Dev] gc ideas -- sparse memory

2010-12-03 Thread Terry Reedy

On 12/3/2010 11:06 PM, Cameron Simpson wrote:

On 03Dec2010 18:15, James Y Knight f...@fuhm.net wrote:
| On Dec 3, 2010, at 6:04 PM, Terry Reedy wrote:
|  gc is implementation specific. CPython uses ref counting + cycle
|  gc. A constraint on all implementations is that objects have a fixed,
|  unique id during their lifetime. CPython uses the address as the id, so
|  it cannot move objects. Other implementations do differently. Compacting
|  gc requires an id-to-current-address table or something.
|
| It's somewhat unfortunate that python has this constraint, instead of
| the looser: objects have a fixed id during their lifetime, which is
| much easier to implement, and practically as useful.

Python doesn't have the constraint you imagine; it _does_ have "objects
have a fixed id during their lifetime".


id(object)
Return the “identity” of an object. This is an integer which is 
guaranteed to be unique and constant for this object during its lifetime.
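
The two documented properties that any implementation has to preserve, an integer result and constancy over the object's lifetime, can be checked in a few lines. A minimal sketch:

```python
items = []
before = id(items)

items.append(1)      # mutate the object in place
after = id(items)

print(isinstance(before, int))  # True
print(before == after)          # True
```

The actual integer is deliberately not printed: in CPython it happens to be derived from the address, but nothing in the definition quoted above says so.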


Of course, other implementations are free to change builtins, but code 
that depends on CPython's definitions will not run.


--
Terry Jan Reedy




Re: [Python-Dev] gc ideas -- sparse memory

2010-12-03 Thread Dima Tisnek
On 3 December 2010 16:45, Martin v. Löwis mar...@v.loewis.de wrote:
 Oh my bad, I must've confused python with some research paper.
 Unique id is not so hard to make without an address.

 While on this topic, what is the real need for unique ids?

 They are absolutely needed for mutable objects. For immutable ones,
 it would be ok to claim that they are identical if they are equal
 (assuming they support equality - which is tricky for things like NaN).

Indeed, but do ids really need to be unique and fixed at the same time?

a is b # (if atomic) needs unique ids, but doesn't really need fixed ids
a[b]   # needs fixed hash, but not strictly a globally unique id
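
A small sketch of the `a[b]` half of that distinction. `Node` is a hypothetical class that keeps the default `__hash__` and `__eq__`, so dictionary lookup relies on a hash that stays fixed while the object is mutated, with the final match decided by identity:

```python
class Node:
    """Plain class: inherits the default identity-based __hash__ and __eq__."""

    def __init__(self, label):
        self.label = label


n = Node("a")
table = {n: "payload"}
h = hash(n)

n.label = "b"            # mutate the object; the default hash must not change
print(hash(n) == h)      # True
print(table[n])          # payload

twin = Node("b")         # same contents, different object
print(twin in table)     # False
```

Two distinct objects are allowed to share a hash value here; the dictionary would still tell them apart through the equality check, which for `Node` is identity.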

I can imagine an implementation of pickle for example that uses unique
and fixed as a given to detect cycles, etc; but that would be
implementation detail.

It seems to me unique and fixed id implies that it is stored somewhere
(with incref beforehand and decref afterwards), however a proper
reference to an object could be used just as well.

Am I still missing something?


 Of course, the C API has lots of assumptions that identity and address
 are really the same thing.

 Regards,
 Martin



Re: [Python-Dev] gc ideas -- sparse memory

2010-12-03 Thread Maciej Fijalkowski
On Sat, Dec 4, 2010 at 6:34 AM, James Y Knight f...@fuhm.net wrote:
 On Dec 3, 2010, at 10:50 PM, Terry Reedy wrote:
 On 12/3/2010 7:46 PM, James Y Knight wrote:

 Sure they are. This is what Java provides you, for example. If you
 have fixed, but potentially non-unique ids (in Java you get this
 using identityHashCode()), you can still make an identity

 I do not see the point of calling a (non-unique) hash value the identity

 My point was simply that a) it's an unfortunate constraint on potential GC 
 implementations that objects need to have a fixed and unique id in Python, 
 and b) that it's not actually necessary to have such a constraint (in the 
 abstract sense of required; obviously it's a requirement upon Python *today*, 
 due to existing code which depends upon that promise).

 Would you be happier if I had said it's unfortunate that Python has an id 
 function instead of an identityHashValue function? I suppose that's what I 
 really meant. Python the language would not have been harmed had it had from 
 the start an identityHashValue() function instead of an id() function. In the 
 CPython implementation, it may even have had the exact same behavior, but 
 would've allowed other implementations more flexibility.

 James


I don't see how this is related to moving vs non-moving GC. PyPy (and I
believe IronPython and Java) all have fixed unique ids that are not
necessarily their addresses. The only problem is that id() computed
that way is more costly performance-wise, but it works.
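
To illustrate where such a cost could come from, here is a toy sketch of ids that are handed out lazily from a counter instead of being read off an address. It is not a description of how PyPy or any other implementation actually does it; `Tracked`, `uid` and `_next_uid` are hypothetical names.

```python
import itertools

_next_uid = itertools.count(1)


class Tracked:
    """Hypothetical object whose id is a lazily assigned counter value."""

    __slots__ = ("_uid",)    # extra per-object storage for the id

    def uid(self):
        try:
            return self._uid             # already assigned: stays fixed
        except AttributeError:
            self._uid = next(_next_uid)  # first request: take the next number
            return self._uid


a, b = Tracked(), Tracked()
print(b.uid(), a.uid(), b.uid())  # 1 2 1
```

The id is fixed and unique regardless of where the object lives in memory, at the price of one extra slot and a check on every call, whereas an address-based id needs neither.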


Re: [Python-Dev] gc ideas -- sparse memory

2010-12-03 Thread Steven D'Aprano

James Y Knight wrote:

On Dec 3, 2010, at 10:50 PM, Terry Reedy wrote:

On 12/3/2010 7:46 PM, James Y Knight wrote:


Sure they are. This is what Java provides you, for example. If you
have fixed, but potentially non-unique ids (in Java you get this
using identityHashCode()), you can still make an identity

I do not see the point of calling a (non-unique) hash value the identity


My point was simply that a) it's an unfortunate constraint on potential GC implementations that objects need to have a fixed and unique id in Python, and b) that it's not actually necessary to have such a constraint (in the abstract sense of required; obviously it's a requirement upon Python *today*, due to existing code which depends upon that promise). 


I'm afraid I don't follow you. Unless you're suggesting some sort of 
esoteric object system whereby objects *don't* have identity (e.g. where 
objects are emergent properties of some sort of distributed, 
non-localised information), any object naturally has an identity -- 
itself.


It seems to me that an identity function must obey one constraint:

* Two objects which exist simultaneously have the same identity if
  and only if they are the same object i.e. id(x) == id(y) if and
  only if x is y.

Other than that, an implementation is free to make id() behave any way 
they like. CPython uses the memory location of the object, but Jython 
and IronPython use an incrementing counter which is never re-used for 
the lifetime of the Python process. CPython's implementation implies 
that objects may not be moved in memory, but that's not a language 
constraint, that's an implementation issue.
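
The constraint only speaks about objects that exist at the same time, and that part can be checked without assuming anything about addresses or counters. A minimal sketch:

```python
x = object()
y = x            # second name for the same object
z = object()     # a different object, alive at the same time

# id(x) == id(y) if and only if x is y, for simultaneously existing objects.
print((x is y) == (id(x) == id(y)))   # True
print((x is z) == (id(x) == id(z)))   # True

# Many live objects at once: no two of them share an id.
objs = [object() for _ in range(1000)]
print(len({id(o) for o in objs}) == len(objs))  # True
```

What happens to an id after its object has died is left open by the constraint, which is why an address that later gets reused and a counter that never repeats can both satisfy it.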


It seems counter-productive to me to bother with an identity function 
which doesn't meet that constraint. If id(x) == id(y) implies nothing 
about x and y (they may, or may not, be the same object) then what's the 
point? Why would you bother using that function when you could just use 
x == y instead?



Would you be happier if I had said it's unfortunate that Python has an id function 
instead of an identityHashValue function? I suppose that's what I really meant. Python the 
language would not have been harmed had it had from the start an identityHashValue() function instead of 
an id() function. In the CPython implementation, it may even have had the exact same behavior, but 
would've allowed other implementations more flexibility.


No, because I don't see what the point of identityHashValue is, or why you
would ever bother to use it.



--
Steven