Re: [Python-Dev] A new dictionary implementation
On 15/02/12 21:09, Yury Selivanov wrote:
> Hello Mark,
>
> First, I've back-ported your patch on python 3.2.2 (which was relatively easy). Almost all tests pass, and those that don't are always failing on my machine if I remember. The patch can be found here: http://goo.gl/nSzzY
>
> Then, I compared memory footprint of one of our applications (300,000 LOC) and saw it about 6% less than on vanilla python 3.2.2 (660 MB of reserved process memory compared to 702 MB; Linux Gentoo 64bit) The application is written in heavy OOP style (for instance, ~1000 classes are generated by our ORM on the fly, and there are approximately the same amount of hand-written ones) so I hoped for a much bigger saving.
>
> As for the patch itself I found one use-case, where python with the patch behaves differently::
>
>     class Foo:
>         def __init__(self, msg):
>             self.msg = msg
>
>     f = Foo('123')
>
>     class _str(str):
>         pass
>
>     print(f.msg)
>     print(getattr(f, _str('msg')))
>
> The above snippet works perfectly on vanilla py3.2, but fails on the patched one (even on 3.3 compiled from your 'cpython_new_dict' branch) I'm not sure that it's a valid code, though. If not, then we need to fix some python internals to add exact type check in 'getattr', in the 'operator.getattr', etc. And if it is - your patch needs to be fixed. In any case, I propose to add the above code to the python test-suite, with either expecting a result or an exception.

Your code is valid, the bug is in my code. I've fixed and updated the repository. More tests to be added later.

Cheers,
Mark.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] A new dictionary implementation
Hello Mark,

First, I've back-ported your patch on python 3.2.2 (which was relatively easy). Almost all tests pass, and those that don't are always failing on my machine if I remember. The patch can be found here: http://goo.gl/nSzzY

Then, I compared memory footprint of one of our applications (300,000 LOC) and saw it about 6% less than on vanilla python 3.2.2 (660 MB of reserved process memory compared to 702 MB; Linux Gentoo 64bit) The application is written in heavy OOP style (for instance, ~1000 classes are generated by our ORM on the fly, and there are approximately the same amount of hand-written ones) so I hoped for a much bigger saving.

As for the patch itself I found one use-case, where python with the patch behaves differently::

    class Foo:
        def __init__(self, msg):
            self.msg = msg

    f = Foo('123')

    class _str(str):
        pass

    print(f.msg)
    print(getattr(f, _str('msg')))

The above snippet works perfectly on vanilla py3.2, but fails on the patched one (even on 3.3 compiled from your 'cpython_new_dict' branch) I'm not sure that it's a valid code, though. If not, then we need to fix some python internals to add exact type check in 'getattr', in the 'operator.getattr', etc. And if it is - your patch needs to be fixed. In any case, I propose to add the above code to the python test-suite, with either expecting a result or an exception.

Cheers,
Yury

On 2012-02-15, at 12:58 PM, Mark Shannon wrote:
> Any opinions on my new dictionary implementation?
>
> I'm happy to take silence on the PEP as tacit approval,
> but the code definitely needs reviewing.
>
> Issue:
> http://bugs.python.org/issue13903
>
> PEP:
> https://bitbucket.org/markshannon/cpython_new_dict/src/6c4d5d9dfc6d/pep-new-dict.txt
>
> Repository
> https://bitbucket.org/markshannon/cpython_new_dict
>
> Cheers,
> Mark.
Re: [Python-Dev] A new dictionary implementation
On Mon, 13 Feb 2012 12:31:38 + Mark Shannon wrote:
> Note that the json benchmark is unstable and should be ignored.

Can you elaborate? If it's unstable it should be fixed, not ignored :)

Also, there are two different mako results in your message, which one is the right one?

Thanks

Antoine.
Re: [Python-Dev] A new dictionary implementation
Any opinions on my new dictionary implementation?

I'm happy to take silence on the PEP as tacit approval, but the code definitely needs reviewing.

Issue:
http://bugs.python.org/issue13903

PEP:
https://bitbucket.org/markshannon/cpython_new_dict/src/6c4d5d9dfc6d/pep-new-dict.txt

Repository
https://bitbucket.org/markshannon/cpython_new_dict

Cheers,
Mark.
Re: [Python-Dev] A new dictionary implementation
francis wrote:
> Hi Mark,
>> Bah... typo in assert statement.
>> My fault for not testing the debug build (release build worked fine).
>> Both builds working now.
>
> Yeah, it is working now and passes all tests on my machine too.
>
> I've tried to run the test suite but I'm getting a SyntaxError (maybe you know why; it's just the first time that I've tried the tool):
>
> =
> ci@random:~/prog/cpython/benchmarks$ python perf.py -r -b apps python ../cpython_new_dict/python
> Running 2to3...
> INFO:root:Running ../cpython_new_dict/python lib/2to3/2to3 -f all lib/2to3_data
> Traceback (most recent call last):
>   File "perf.py", line 2236, in <module>
>     main(sys.argv[1:])
>   File "perf.py", line 2192, in main
>     options)))
>   File "perf.py", line 1279, in BM_2to3
>     return SimpleBenchmark(Measure2to3, *args, **kwargs)
>   File "perf.py", line 706, in SimpleBenchmark
>     *args, **kwargs)
>   File "perf.py", line 1275, in Measure2to3
>     return MeasureCommand(command, trials, env, options.track_memory)
>   File "perf.py", line 1223, in MeasureCommand
>     CallAndCaptureOutput(command, env=env)
>   File "perf.py", line 1053, in CallAndCaptureOutput
>     raise RuntimeError(u"Benchmark died: " + unicode(stderr, 'ascii'))
> RuntimeError: Benchmark died: Traceback (most recent call last):
>   File "lib/2to3/2to3", line 3, in <module>
>     from lib2to3.main import main
>   File "/home/ci/prog/cpython/benchmarks/lib/2to3/lib2to3/main.py", line 47
>     except os.error, err:
>                    ^
> SyntaxError: invalid syntax
> =
>
> And the baseline is: Python 2.7.2+ (but it also gives me a SyntaxError running on python3 default (e50db1b7ad7b))
>
> What am I doing wrong?
> (from its doc: “This project is intended to be an authoritative source of benchmarks for all Python implementations.”)

You need to convert the benchmarks to Python3 using 2to3. Instructions are in the make_perf3.sh file. You may need to manually fix up the output as well :(

Cheers,
Mark.
Re: [Python-Dev] A new dictionary implementation
Hi Mark,
> Bah... typo in assert statement.
> My fault for not testing the debug build (release build worked fine).
> Both builds working now.

Yeah, it is working now and passes all tests on my machine too.

I've tried to run the test suite but I'm getting a SyntaxError (maybe you know why; it's just the first time that I've tried the tool):

=
ci@random:~/prog/cpython/benchmarks$ python perf.py -r -b apps python ../cpython_new_dict/python
Running 2to3...
INFO:root:Running ../cpython_new_dict/python lib/2to3/2to3 -f all lib/2to3_data
Traceback (most recent call last):
  File "perf.py", line 2236, in <module>
    main(sys.argv[1:])
  File "perf.py", line 2192, in main
    options)))
  File "perf.py", line 1279, in BM_2to3
    return SimpleBenchmark(Measure2to3, *args, **kwargs)
  File "perf.py", line 706, in SimpleBenchmark
    *args, **kwargs)
  File "perf.py", line 1275, in Measure2to3
    return MeasureCommand(command, trials, env, options.track_memory)
  File "perf.py", line 1223, in MeasureCommand
    CallAndCaptureOutput(command, env=env)
  File "perf.py", line 1053, in CallAndCaptureOutput
    raise RuntimeError(u"Benchmark died: " + unicode(stderr, 'ascii'))
RuntimeError: Benchmark died: Traceback (most recent call last):
  File "lib/2to3/2to3", line 3, in <module>
    from lib2to3.main import main
  File "/home/ci/prog/cpython/benchmarks/lib/2to3/lib2to3/main.py", line 47
    except os.error, err:
                   ^
SyntaxError: invalid syntax
=

And the baseline is: Python 2.7.2+ (but it also gives me a SyntaxError running on python3 default (e50db1b7ad7b))

What am I doing wrong?
(from its doc: “This project is intended to be an authoritative source of benchmarks for all Python implementations.”)

Thanks in advance!

francis
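The SyntaxError in the traceback is what a Python 3 interpreter reports for the Python 2-only `except X, name:` form. A minimal sketch of the spelling that both Python 2.6+ and Python 3 accept follows; the helper name `try_remove` and the path are invented for illustration and are not taken from the benchmark suite.

```python
import os


def try_remove(path):
    """Remove path; return the error message if that fails, else None."""
    try:
        os.remove(path)
    except os.error as err:   # Python 2 only: "except os.error, err:"
        return str(err)
    return None


# The path is deliberately one that should not exist.
print(try_remove("/nonexistent/hypothetical-file"))
```

Running 2to3 over the sources performs this rewrite mechanically, which is why the benchmarks have to be converted before a Python 3 binary can run them.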
Re: [Python-Dev] A new dictionary implementation
On 08/02/2012 15:16, Mark Shannon wrote:
> Hi,
>
> Version 2 is now available.
>
> Version 2 makes as few changes to tunable constants as possible, and generally does not change iteration order (so repr() is unchanged). All tests pass (the only changes to tests are for sys.getsizeof() ).
>
> Repository: https://bitbucket.org/markshannon/cpython_new_dict
> Issue: http://bugs.python.org/issue13903
>
> Performance changes are basically zero for non-OO code. Average -0.5% speed change on 2n3 benchmarks, a few benchmarks show a small reduction in memory use. (see notes below)
>
> GCbench uses 47% less memory and is 12% faster.
> 2to3, which seems to be the only "realistic" benchmark that runs on Py3, shows no change in speed and uses 10% less memory.

In your first version 2to3 used 28% less memory. Do you know why it's worse in this version?

Michael

> All benchmarks and tests performed on old, slow 32bit machine with linux. Do please try it on your machine(s).
>
> If accepted, the new dict implementation will allow a useful optimisation of the LOAD_GLOBAL (and possibly LOAD_ATTR) bytecode: By testing to see if the (immutable) keys-table is the expected table, the value can be accessed directly by index, rather than by name.
>
> Cheers,
> Mark.
>
> Notes:
> All benchmarks from http://hg.python.org/benchmarks/ using the -m flag to get memory usage data.
>
> I've ignored the json benchmarks, which show unstable behaviour on my machine. Tiny changes to the dict being serialized or to the random seed can change the relative speed of my implementation vs CPython from -25% to +10%.

--
http://www.voidspace.org.uk/

May you do good and not evil
May you find forgiveness for yourself and forgive others
May you share freely, never taking more than you give.
-- the sqlite blessing http://www.sqlite.org/different.html
Re: [Python-Dev] A new dictionary implementation
francis wrote:
> Hi Mark,
> I've just cloned:
>> Repository: https://bitbucket.org/markshannon/cpython_new_dict
>> Do please try it on your machine(s).
> that's a:
> Linux random 3.1.0-1-amd64 #1 SMP Tue Jan 10 05:01:58 UTC 2012 x86_64 GNU/Linux
>
> and I'm getting:
>
> gcc -pthread -c -Wno-unused-result -g -O0 -Wall -Wstrict-prototypes -I. -I./Include -DPy_BUILD_CORE -o Objects/dictobject.o Objects/dictobject.c
> gcc -pthread -c -Wno-unused-result -g -O0 -Wall -Wstrict-prototypes -I. -I./Include -DPy_BUILD_CORE -o Objects/memoryobject.o Objects/memoryobject.c
> Objects/dictobject.c: In function ‘dict_popitem’:
> Objects/dictobject.c:2208:5: error: ‘PyDictKeyEntry’ has no member named ‘me_value’
> make: *** [Objects/dictobject.o] Error 1
> make: *** Waiting for unfinished jobs

Bah... typo in assert statement.
My fault for not testing the debug build (release build worked fine).
Both builds working now.

Cheers,
Mark.
Re: [Python-Dev] A new dictionary implementation
Hi Mark,
I've just cloned:
> Repository: https://bitbucket.org/markshannon/cpython_new_dict
> Do please try it on your machine(s).
that's a:
Linux random 3.1.0-1-amd64 #1 SMP Tue Jan 10 05:01:58 UTC 2012 x86_64 GNU/Linux

and I'm getting:

gcc -pthread -c -Wno-unused-result -g -O0 -Wall -Wstrict-prototypes -I. -I./Include -DPy_BUILD_CORE -o Objects/dictobject.o Objects/dictobject.c
gcc -pthread -c -Wno-unused-result -g -O0 -Wall -Wstrict-prototypes -I. -I./Include -DPy_BUILD_CORE -o Objects/memoryobject.o Objects/memoryobject.c
Objects/dictobject.c: In function ‘dict_popitem’:
Objects/dictobject.c:2208:5: error: ‘PyDictKeyEntry’ has no member named ‘me_value’
make: *** [Objects/dictobject.o] Error 1
make: *** Waiting for unfinished jobs

Cheers

francis
Re: [Python-Dev] A new dictionary implementation
Just more info: changeset is: 74843:20702d1acf17

Cheers,
francis
Re: [Python-Dev] A new dictionary implementation
Hi,

Version 2 is now available.

Version 2 makes as few changes to tunable constants as possible, and generally does not change iteration order (so repr() is unchanged). All tests pass (the only changes to tests are for sys.getsizeof() ).

Repository: https://bitbucket.org/markshannon/cpython_new_dict
Issue: http://bugs.python.org/issue13903

Performance changes are basically zero for non-OO code. Average -0.5% speed change on 2n3 benchmarks, a few benchmarks show a small reduction in memory use. (see notes below)

GCbench uses 47% less memory and is 12% faster.
2to3, which seems to be the only "realistic" benchmark that runs on Py3, shows no change in speed and uses 10% less memory.

All benchmarks and tests performed on old, slow 32bit machine with linux. Do please try it on your machine(s).

If accepted, the new dict implementation will allow a useful optimisation of the LOAD_GLOBAL (and possibly LOAD_ATTR) bytecode: By testing to see if the (immutable) keys-table is the expected table, the value can be accessed directly by index, rather than by name.

Cheers,
Mark.

Notes:
All benchmarks from http://hg.python.org/benchmarks/ using the -m flag to get memory usage data.

I've ignored the json benchmarks, which show unstable behaviour on my machine. Tiny changes to the dict being serialized or to the random seed can change the relative speed of my implementation vs CPython from -25% to +10%.
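The LOAD_GLOBAL idea mentioned in the announcement can be illustrated with a toy model in pure Python. Everything here (`KeysTable`, `SplitDict`, `CachedLoad`) is a hypothetical name invented for the sketch; the real change would live in C inside the interpreter loop, and nothing below is taken from the patch.

```python
class KeysTable:
    """Immutable mapping from names to slots in a values array (toy model)."""

    def __init__(self, names):
        self._index = {name: i for i, name in enumerate(names)}

    def index_of(self, name):
        return self._index[name]


class SplitDict:
    """A dict split into a (possibly shared) keys table and its own values."""

    def __init__(self, keys, values):
        self.keys = keys
        self.values = list(values)


class CachedLoad:
    """One load site that remembers (keys table, index) for a single name."""

    def __init__(self, name):
        self.name = name
        self.expected_keys = None
        self.index = None
        self.hits = 0

    def load(self, d):
        if d.keys is self.expected_keys:
            # Fast path: same keys table as last time, so the slot is known
            # and no hashing or name comparison is needed.
            self.hits += 1
            return d.values[self.index]
        # Slow path: look the name up, then remember where it was found.
        self.index = d.keys.index_of(self.name)
        self.expected_keys = d.keys
        return d.values[self.index]


keys = KeysTable(["x", "y"])
d1 = SplitDict(keys, [1, 2])
d2 = SplitDict(keys, [10, 20])
site = CachedLoad("y")
print(site.load(d1), site.load(d2), site.hits)
```

The identity test `d.keys is self.expected_keys` is only safe because the keys table is assumed immutable, which is the property the announcement relies on.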
Re: [Python-Dev] A new dictionary implementation
Just a quick update.

I've been analysing and profiling the behaviour of my new dict and messing about with various implementation options.

I've settled on a new implementation. It's the same basic idea, but with better locality of reference for unshared keys.

Guido asked:
> Another question: a common pattern is to use (immutable) class
> variables as default values for instance variables, and only set the
> instance variables once they need to be different. Does such a class
> benefit from your improvement?

For those instances which keep the default, yes. Otherwise the answer is, as Martin pointed out, it could, yes, provided that adding a new key does not force a resize. Although it is a bit arbitrary when a resize occurs. The new version will incorporate this behaviour.

Expect version 2 soon.

Cheers,
Mark.
Re: [Python-Dev] A new dictionary implementation
On 02.02.2012 12:30, Chris Withers wrote:
> On 01/02/2012 17:50, Guido van Rossum wrote:
>> Another question: a common pattern is to use (immutable) class
>> variables as default values for instance variables, and only set the
>> instance variables once they need to be different. Does such a class
>> benefit from your improvement?
>
> A less common pattern, but which still needs to work, is where a mutable
> class variable is deliberately used to store state across all instances of a
> class...

This is really *just* a dictionary implementation. It doesn't affect any of the lookup procedures. If you trust that the dictionary semantics on its own isn't changed (which I believe is the case, except for key order), none of the dict applications will change.

Regards,
Martin
Re: [Python-Dev] A new dictionary implementation
On Wed, 1 Feb 2012 09:50:55 -0800 Guido van Rossum wrote:
> On Wed, Feb 1, 2012 at 9:13 AM, Hans Mulder wrote:
> > On 30/01/12 00:30:14, Steven D'Aprano wrote:
> >>
> >> Mark Shannon wrote:
> >>>
> >>> Antoine Pitrou wrote:
> >
> > [..]
> >
> >>> Antoine is right. It is a reorganisation of the dict, plus a couple of
> >>> changes to typeobject.c and object.c to ensure that instance
> >>> dictionaries do indeed share keys arrays.
> >>
> >> I don't quite follow how that could work.
> >>
> >> If I have this:
> >>
> >> class C:
> >>     pass
> >>
> >> a = C()
> >> b = C()
> >>
> >> a.spam = 1
> >> b.ham = 2
> >>
> >> how can a.__dict__ and b.__dict__ share key arrays? I've tried reading
> >> the source, but I'm afraid I don't understand it well enough to make
> >> sense of it.
> >
> > They can't.
> >
> > But then, your class is atypical. Usually, classes initialize all the
> > attributes of their instances in the __init__ method, perhaps like so:
> >
> > class D:
> >     def __init__(self, ham=None, spam=None):
> >         self.ham = ham
> >         self.spam = spam
> >
> > As long as you follow the common practice of not adding any attributes
> > after the object has been initialized, your instances can share their
> > keys array. Mark's patch will do that.
> >
> > You'll still be allowed to have different attributes per instance, but
> > if you do that, then the patch doesn't buy you much.
>
> Hey, I like this! It's a subtle encouragement for developers to
> initialize all their instance variables in their __init__ or __new__
> method, with a (modest) performance improvement for a carrot. (Though
> I have to admit I have no idea how you do it. Wouldn't the set of dict
> keys be different while __init__ is in the middle of setting the
> instance variables?)
>
> Another question: a common pattern is to use (immutable) class
> variables as default values for instance variables, and only set the
> instance variables once they need to be different. Does such a class
> benefit from your improvement?

I'm not sure who "you" is in your e-mail, but AFAICT Mark's patch doesn't special-case __init__ or __new__. Any attribute setting on an instance uses the shared keys array on the instance's type. "Missing" attributes on an instance are simply NULL pointers in the instance's values array.

(I've suggested that the keys array be bounded in size, to avoid pathological cases where someone (ab)uses instances as fancy dicts and puts lots of random data in them)

Regards

Antoine.
Re: [Python-Dev] A new dictionary implementation
On 02/02/2012 11:30, Chris Withers wrote:
> On 01/02/2012 17:50, Guido van Rossum wrote:
>> Another question: a common pattern is to use (immutable) class
>> variables as default values for instance variables, and only set the
>> instance variables once they need to be different. Does such a class
>> benefit from your improvement?
>
> A less common pattern, but which still needs to work, is where a mutable
> class variable is deliberately used to store state across all instances of a
> class...

Given that Mark's patch passes the Python test suite I'm sure basic patterns like this *work*, the question is which of them take advantage of the improved memory efficiency. In the case you mention I don't think it's an issue at all, because the class level attribute doesn't (generally) appear in instance dicts.

What's also common is where the class holds a *default* value for instances, which may be overridden by an instance attribute on *some* instances.

All the best,

Michael Foord

> Chris

--
http://www.voidspace.org.uk/

May you do good and not evil
May you find forgiveness for yourself and forgive others
May you share freely, never taking more than you give.
-- the sqlite blessing http://www.sqlite.org/different.html
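Both patterns discussed in this message can be checked directly in Python. The sketch below uses made-up class names (`Registry`, `Connection`); it shows only standard attribute semantics and says nothing about how any particular dict implementation stores the data.

```python
class Registry:                 # hypothetical example class
    seen = []                   # mutable class variable, shared on purpose

    def add(self, item):
        self.seen.append(item)  # mutates the class-level list


class Connection:               # hypothetical example class
    timeout = 30                # immutable class-level default


r1, r2 = Registry(), Registry()
r1.add("a")
r2.add("b")
print(Registry.seen)            # state is shared across instances
print(vars(r1), vars(r2))       # the instance dicts stay empty

c1, c2 = Connection(), Connection()
c2.timeout = 5                  # only c2 overrides the default
print(vars(c1), vars(c2))       # only c2 has 'timeout' in its own dict
print(c1.timeout, c2.timeout)
```

In the first pattern nothing is ever stored in the instance dicts, so key sharing is not involved; in the second, only the instances that override the default gain a key.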
Re: [Python-Dev] A new dictionary implementation
On 01/02/2012 17:50, Guido van Rossum wrote:
> Another question: a common pattern is to use (immutable) class
> variables as default values for instance variables, and only set the
> instance variables once they need to be different. Does such a class
> benefit from your improvement?

A less common pattern, but which still needs to work, is where a mutable class variable is deliberately used to store state across all instances of a class...

Chris

--
Simplistix - Content Management, Batch Processing & Python Consulting
- http://www.simplistix.co.uk
Re: [Python-Dev] A new dictionary implementation
> Hey, I like this! It's a subtle encouragement for developers to
> initialize all their instance variables in their __init__ or __new__
> method, with a (modest) performance improvement for a carrot. (Though
> I have to admit I have no idea how you do it. Wouldn't the set of dict
> keys be different while __init__ is in the middle of setting the
> instance variables?)

The "type's attribute set" will be a superset of the instance's, for a shared key set. Initializing the first instance grows the key set, which is put into the type. Subsequent instances start out with the key set as a candidate, and have all values set to NULL in the dict values set. As long as you are only setting attributes that are in the shared key set, the values just get set. When it encounters a key not in the shared key set, the dict dissociates itself from the shared key set.

> Another question: a common pattern is to use (immutable) class
> variables as default values for instance variables, and only set the
> instance variables once they need to be different. Does such a class
> benefit from your improvement?

It depends. IIUC, if the first instance happens to get this attribute set, it ends up in the shared key set, and subsequent instances may have a NULL value for the key.

I'm unsure how *exactly* the key set gets frozen. You cannot allow resizing the key set once it is shared, as you would have to find all instances with the same key set and resize their values. It would be possible (IIUC) to add more keys to the shared key set if that doesn't cause a resize, but I'm not sure whether the patch does that.

Regards,
Martin
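The behaviour Martin describes can be modelled in a few lines of Python. This is a toy sketch under stated assumptions, not the patch: `SharedKeys`, `ToyInstanceDict` and the explicit `frozen` flag are invented names, and the sketch simply assumes the key set is frozen once the first instance has been initialised, which is exactly the detail Martin says he is unsure about.

```python
_MISSING = object()  # stands in for a NULL pointer in the values array


class SharedKeys:
    """Key set stored on the type; list position = slot in values arrays."""

    def __init__(self):
        self.names = []
        self.frozen = False  # assumption: set after the first instance


class ToyInstanceDict:
    def __init__(self, shared):
        self.shared = shared   # becomes None once dissociated
        self.values = [_MISSING] * len(shared.names)
        self.own = None        # ordinary private dict after dissociation

    def is_shared(self):
        return self.shared is not None

    def items(self):
        if self.shared is None:
            return list(self.own.items())
        return [(n, v) for n, v in zip(self.shared.names, self.values)
                if v is not _MISSING]

    def __getitem__(self, name):
        return dict(self.items())[name]   # raises KeyError if unset

    def __setitem__(self, name, value):
        if self.shared is None:
            self.own[name] = value
        elif name in self.shared.names:
            # Key is in the shared set: just fill the slot.
            self.values[self.shared.names.index(name)] = value
        elif not self.shared.frozen:
            # First instance: grow the key set that lives on the type.
            self.shared.names.append(name)
            self.values.append(value)
        else:
            # Unknown key after freezing: dissociate from the shared set.
            self.own = dict(self.items())
            self.own[name] = value
            self.shared = None
            self.values = None


shared = SharedKeys()
first = ToyInstanceDict(shared)
first["ham"] = 1
first["spam"] = 2
shared.frozen = True            # assumed freezing point

second = ToyInstanceDict(shared)
second["ham"] = 3               # in the shared set, stays shared
second["eggs"] = 4              # not in the shared set, dissociates
print(first.is_shared(), second.is_shared())
print(sorted(second.items()))
```

A real implementation would also have to decide whether keys may still be added after sharing when no resize is needed; the toy model sidesteps that question by freezing unconditionally.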
Re: [Python-Dev] A new dictionary implementation
Guido van Rossum writes:
> Hey, I like this! It's a subtle encouragement for developers to
> initialize all their instance variables in their __init__ or __new__
> method, with a (modest) performance improvement for a carrot. (Though
> I have to admit I have no idea how you do it. Wouldn't the set of dict
> keys be different while __init__ is in the middle of setting the
> instance variables?)
>
> Another question: a common pattern is to use (immutable) class
> variables as default values for instance variables, and only set the
> instance variables once they need to be different. Does such a class
> benefit from your improvement?
>
> > -- HansM

While I absolutely cannot speak to this implementation, traditionally this type of approach is referred to as maps, and was pioneered in SELF, originally presented at OOPSLA '89: http://dl.acm.org/citation.cfm?id=74884 . PyPy also uses these maps to back its objects, although from what I've read the implementation looks nothing like the proposed one for CPython. You can read about that here: http://bit.ly/zwlOkV , and if you're really excited about this you can read our implementation here: https://bitbucket.org/pypy/pypy/src/default/pypy/objspace/std/mapdict.py .

Alex
Re: [Python-Dev] A new dictionary implementation
On Wed, Feb 1, 2012 at 9:13 AM, Hans Mulder wrote:
> On 30/01/12 00:30:14, Steven D'Aprano wrote:
>>
>> Mark Shannon wrote:
>>>
>>> Antoine Pitrou wrote:
>
> [..]
>
>>> Antoine is right. It is a reorganisation of the dict, plus a couple of
>>> changes to typeobject.c and object.c to ensure that instance
>>> dictionaries do indeed share keys arrays.
>>
>> I don't quite follow how that could work.
>>
>> If I have this:
>>
>> class C:
>>     pass
>>
>> a = C()
>> b = C()
>>
>> a.spam = 1
>> b.ham = 2
>>
>> how can a.__dict__ and b.__dict__ share key arrays? I've tried reading
>> the source, but I'm afraid I don't understand it well enough to make
>> sense of it.
>
> They can't.
>
> But then, your class is atypical. Usually, classes initialize all the
> attributes of their instances in the __init__ method, perhaps like so:
>
> class D:
>     def __init__(self, ham=None, spam=None):
>         self.ham = ham
>         self.spam = spam
>
> As long as you follow the common practice of not adding any attributes
> after the object has been initialized, your instances can share their
> keys array. Mark's patch will do that.
>
> You'll still be allowed to have different attributes per instance, but
> if you do that, then the patch doesn't buy you much.

Hey, I like this! It's a subtle encouragement for developers to initialize all their instance variables in their __init__ or __new__ method, with a (modest) performance improvement for a carrot. (Though I have to admit I have no idea how you do it. Wouldn't the set of dict keys be different while __init__ is in the middle of setting the instance variables?)

Another question: a common pattern is to use (immutable) class variables as default values for instance variables, and only set the instance variables once they need to be different. Does such a class benefit from your improvement?

> -- HansM

--
--Guido van Rossum (python.org/~guido)
Re: [Python-Dev] A new dictionary implementation
On 30/01/12 00:30:14, Steven D'Aprano wrote:
> Mark Shannon wrote:
>> Antoine Pitrou wrote:

[..]

>> Antoine is right. It is a reorganisation of the dict, plus a couple of
>> changes to typeobject.c and object.c to ensure that instance
>> dictionaries do indeed share keys arrays.
>
> I don't quite follow how that could work.
>
> If I have this:
>
> class C:
>     pass
>
> a = C()
> b = C()
>
> a.spam = 1
> b.ham = 2
>
> how can a.__dict__ and b.__dict__ share key arrays? I've tried reading
> the source, but I'm afraid I don't understand it well enough to make
> sense of it.

They can't.

But then, your class is atypical. Usually, classes initialize all the attributes of their instances in the __init__ method, perhaps like so:

class D:
    def __init__(self, ham=None, spam=None):
        self.ham = ham
        self.spam = spam

As long as you follow the common practice of not adding any attributes after the object has been initialized, your instances can share their keys array. Mark's patch will do that.

You'll still be allowed to have different attributes per instance, but if you do that, then the patch doesn't buy you much.

-- HansM
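The difference between the two classes in this message can be made visible from Python itself. The sketch below only inspects the instance dicts; whether the interpreter actually shares a keys array, and what sizes `sys.getsizeof` reports, depends on the implementation and version, so no particular numbers are claimed.

```python
import sys


class C:
    pass


class D:
    def __init__(self, ham=None, spam=None):
        self.ham = ham
        self.spam = spam


# Steven's case: each instance ends up with a different key set.
a, b = C(), C()
a.spam = 1
b.ham = 2
print(sorted(vars(a)), sorted(vars(b)))

# Hans's case: every instance has the same keys, set in __init__.
d1, d2 = D(1, 2), D(3, 4)
print(list(vars(d1)) == list(vars(d2)))

# Sizes are implementation- and version-specific; compare an instance
# dict with an ordinary dict holding the same items on your own build.
print(sys.getsizeof(vars(d1)), sys.getsizeof(dict(vars(d1))))
```

Only instances like `d1` and `d2` are candidates for sharing, because their dicts agree on the full set of keys.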
Re: [Python-Dev] A new dictionary implementation
> I still have gdb 6.something, would you mail me the full output please, so I can see what the problem is.

It's done, let me know if you need more output.

Cheers,

francis
Re: [Python-Dev] A new dictionary implementation
Mark Shannon wrote:
> Antoine Pitrou wrote:
>> On Sun, 29 Jan 2012 09:56:11 -0500 Benjamin Peterson wrote:
>>> 2012/1/29 Mark Shannon :
>>>> Hi,
>>>>
>>>> Now that issue 13703 has been largely settled, I want to propose my new dictionary implementation again. It is a little more polished than before.
>>>
>>> If you're serious about changing the dictionary implementation, I think you should write a PEP. It should explain the new dicts advantages (and disadvantages?) and give comprehensive benchmark numbers. Something along the lines of http://www.python.org/dev/peps/pep-3128/ I should think.
>>
>> "New dictionary implementation" is a misnomer here. Mark's patch merely allows to share the keys array between several dictionaries. The lookup algorithm remains exactly the same as far as I've read. It's actually much less invasive than e.g. Martin's AVL trees-for-hash-collisions proposal.
>
> Antoine is right. It is a reorganisation of the dict, plus a couple of changes to typeobject.c and object.c to ensure that instance dictionaries do indeed share keys arrays.

I don't quite follow how that could work.

If I have this:

class C:
    pass

a = C()
b = C()

a.spam = 1
b.ham = 2

how can a.__dict__ and b.__dict__ share key arrays? I've tried reading the source, but I'm afraid I don't understand it well enough to make sense of it.

--
Steven
Re: [Python-Dev] A new dictionary implementation
Matt Joiner wrote:
> Mark,
>
> Good luck with getting this in, I'm also hopeful about coroutines, maybe after pushing your dict optimization your coroutine implementation will get more consideration.

Shush, don't say the C word or you'll put people off ;)

I'm actually not that fussed about the coroutine implementation. With "yield from" generators have all the power of asymmetric coroutines. I think my coroutine implementation is a neater way to do things, but it is not worth the fuss.

Anyway, I'm working on my next crazy experiment :)

Cheers,
Mark.
Re: [Python-Dev] A new dictionary implementation
>> Please clarify the status of that code: are you actually proposing
>> 6a21f3b35e20 for inclusion into Python as-is? If so, please post it
>> as a patch to the tracker, as it will need to be reviewed (possibly
>> with requests for further changes).
>
> I thought it already was a patch. What do I need to do to make it a patch?

I missed your announcement of issue13903; all is fine here.

> Where do I find it?

http://www.python.org/psf/contrib/contrib-form-python/

Thanks,
Martin
Re: [Python-Dev] A new dictionary implementation
Martin v. Löwis wrote:
>> Now that issue 13703 has been largely settled, I want to propose my new dictionary implementation again. It is a little more polished than before.
> Please clarify the status of that code: are you actually proposing 6a21f3b35e20 for inclusion into Python as-is? If so, please post it as a patch to the tracker, as it will need to be reviewed (possibly with requests for further changes).

I thought it already was a patch. What do I need to do to make it a patch?

> If not, it would be good if you could give a list of things that need to be done before you consider submission to Python.

A few tests that rely on dict ordering should probably be fixed first. I'll submit bug reports for those.

> Also, please submit a contrib form if you haven't done so.

Where do I find it?

Cheers, Mark.
Re: [Python-Dev] A new dictionary implementation
francis wrote:
> On 01/29/2012 11:31 AM, Mark Shannon wrote:
>> It passes all the tests. (I had to change a couple that relied on dict repr() ordering)
>
> Hi Mark, I've cloned the repo and built it, then tried ./python -m test. I got some errors:
> First in general: 340 tests OK. 2 tests failed: test_dis test_gdb
> [snip]
> then test_dis:
> [snip]
> ==
> FAIL: test_code_info (test.test_dis.CodeInfoTests)
> --
> [snip]
> ==
> FAIL: test_show_code (test.test_dis.CodeInfoTests)
> --
> [snip]

These are known failures; the tests are at fault as they rely on dict ordering. However, they should have been commented out. They probably crept back in again when I pulled the latest version of cpython. I'll fix them now.

> [snip]
> * For test gdb: Lots of output.
> Ran 42 tests in 11.361s
> FAILED (failures=28)
> test test_gdb failed
> 1 test failed: test_gdb
> [109989 refs]

I still have gdb 6.something. Would you mail me the full output please, so I can see what the problem is?

Cheers, Mark.
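To illustrate what "rely on dict ordering" means for the two test_dis failures, here is a small sketch that compares the "Free variables" section of the expected and actual output as sets instead of as ordered text. The helper free_variable_names is hypothetical and is not the fix that went into test_dis; the two strings are abbreviated from the failure output reported in this thread.

```python
import re


def free_variable_names(info: str) -> set[str]:
    """Collect the names listed under 'Free variables:' in code_info-style text."""
    _, _, tail = info.partition("Free variables:\n")
    return set(re.findall(r"^\s*\d+: (\w+)$", tail, flags=re.MULTILINE))


# Abbreviated from the reported output: same names, different order.
expected = "Free variables:\n 0: e\n 1: d\n 2: f\n 3: y\n 4: x\n 5: z"
actual = "Free variables:\n 0: y\n 1: e\n 2: d\n 3: f\n 4: x\n 5: z"

same_text = expected == actual
same_names = free_variable_names(expected) == free_variable_names(actual)
```

A test written against the set of names passes regardless of the order in which the dictionary happens to yield them, whereas the literal text comparison does not.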
Re: [Python-Dev] A new dictionary implementation
> Now that issue 13703 has been largely settled,
> I want to propose my new dictionary implementation again.
> It is a little more polished than before.

Please clarify the status of that code: are you actually proposing 6a21f3b35e20 for inclusion into Python as-is? If so, please post it as a patch to the tracker, as it will need to be reviewed (possibly with requests for further changes).

If not, it would be good if you could give a list of things that need to be done before you consider submission to Python.

Also, please submit a contrib form if you haven't done so.

Regards, Martin
Re: [Python-Dev] A new dictionary implementation
On 01/29/2012 11:31 AM, Mark Shannon wrote:
> It passes all the tests. (I had to change a couple that relied on dict repr() ordering)

Hi Mark, I've cloned the repo and built it, then tried ./python -m test. I got some errors:

First in general:

340 tests OK.
2 tests failed: test_dis test_gdb
4 tests altered the execution environment: test_multiprocessing test_packaging test_site test_strlit
18 tests skipped: test_curses test_devpoll test_kqueue test_lzma test_msilib test_ossaudiodev test_smtpnet test_socketserver test_startfile test_timeout test_tk test_ttk_guionly test_urllib2net test_urllibnet test_winreg test_winsound test_xmlrpc_net test_zipfile64
1 skip unexpected on linux: test_lzma
[1348560 refs]

then test_dis:

== CPython 3.3.0a0 (default:f15cf35c9922, Jan 29 2012, 18:12:19) [GCC 4.6.2]
== Linux-3.1.0-1-amd64-x86_64-with-debian-wheezy-sid little-endian
== /home/ci/prog/cpython/hotpy_new_dict/build/test_python_14470
Testing with flags: sys.flags(debug=0, inspect=0, interactive=0, optimize=0, dont_write_bytecode=0, no_user_site=0, no_site=0, ignore_environment=0, verbose=0, bytes_warning=0, quiet=0)
[1/1] test_dis
test_big_linenos (test.test_dis.DisTests) ... ok
test_boundaries (test.test_dis.DisTests) ... ok
test_bug_1333982 (test.test_dis.DisTests) ... ok
test_bug_708901 (test.test_dis.DisTests) ... ok
test_dis (test.test_dis.DisTests) ... ok
test_dis_none (test.test_dis.DisTests) ... ok
test_dis_object (test.test_dis.DisTests) ... ok
test_dis_traceback (test.test_dis.DisTests) ... ok
test_disassemble_bytes (test.test_dis.DisTests) ... ok
test_disassemble_method (test.test_dis.DisTests) ... ok
test_disassemble_method_bytes (test.test_dis.DisTests) ... ok
test_disassemble_str (test.test_dis.DisTests) ... ok
test_opmap (test.test_dis.DisTests) ... ok
test_opname (test.test_dis.DisTests) ... ok
test_code_info (test.test_dis.CodeInfoTests) ... FAIL
test_code_info_object (test.test_dis.CodeInfoTests) ... ok
test_pretty_flags_no_flags (test.test_dis.CodeInfoTests) ... ok
test_show_code (test.test_dis.CodeInfoTests) ... FAIL

==
FAIL: test_code_info (test.test_dis.CodeInfoTests)
--
Traceback (most recent call last):
  File "/home/ci/prog/cpython/hotpy_new_dict/Lib/test/test_dis.py", line 439, in test_code_info
    self.assertRegex(dis.code_info(x), expected)
AssertionError: Regex didn't match: 'Name: f\nFilename: (.*)\nArgument count:1\nKw-only arguments: 0\nNumber of locals: 1\nStack size:8\nFlags: OPTIMIZED, NEWLOCALS, NESTED\nConstants:\n 0: None\nNames:\n 0: print\nVariable names:\n 0: c\nFree variables:\n 0: e\n 1: d\n 2: f\n 3: y\n 4: x\n 5: z' not found in 'Name: f\nFilename: /home/ci/prog/cpython/hotpy_new_dict/Lib/test/test_dis.py\nArgument count:1\nKw-only arguments: 0\nNumber of locals: 1\nStack size:8\nFlags: OPTIMIZED, NEWLOCALS, NESTED\nConstants:\n 0: None\nNames:\n 0: print\nVariable names:\n 0: c\nFree variables:\n 0: y\n 1: e\n 2: d\n 3: f\n 4: x\n 5: z'

==
FAIL: test_show_code (test.test_dis.CodeInfoTests)
--
Traceback (most recent call last):
  File "/home/ci/prog/cpython/hotpy_new_dict/Lib/test/test_dis.py", line 446, in test_show_code
    self.assertRegex(output.getvalue(), expected+"\n")
AssertionError: Regex didn't match: 'Name: f\nFilename: (.*)\nArgument count:1\nKw-only arguments: 0\nNumber of locals: 1\nStack size:8\nFlags: OPTIMIZED, NEWLOCALS, NESTED\nConstants:\n 0: None\nNames:\n 0: print\nVariable names:\n 0: c\nFree variables:\n 0: e\n 1: d\n 2: f\n 3: y\n 4: x\n 5: z\n' not found in 'Name: f\nFilename: /home/ci/prog/cpython/hotpy_new_dict/Lib/test/test_dis.py\nArgument count:1\nKw-only arguments: 0\nNumber of locals: 1\nStack size:8\nFlags: OPTIMIZED, NEWLOCALS, NESTED\nConstants:\n 0: None\nNames:\n 0: print\nVariable names:\n 0: c\nFree variables:\n 0: y\n 1: e\n 2: d\n 3: f\n 4: x\n 5: z\n'

--
Ran 18 tests in 0.070s
FAILED (failures=2)
test test_dis failed
1 test failed: test_dis
[111919 refs]

* For test gdb: Lots of output.

Ran 42 tests in 11.361s
FAILED (failures=28)
test test_gdb failed
1 test failed: test_gdb
[109989 refs]
Re: [Python-Dev] A new dictionary implementation
Antoine Pitrou wrote:
> On Sun, 29 Jan 2012 09:56:11 -0500 Benjamin Peterson wrote:
>> 2012/1/29 Mark Shannon :
>>> Hi,
>>> Now that issue 13703 has been largely settled, I want to propose my new dictionary implementation again. It is a little more polished than before.
>> If you're serious about changing the dictionary implementation, I think you should write a PEP. It should explain the new dict's advantages (and disadvantages?) and give comprehensive benchmark numbers. Something along the lines of http://www.python.org/dev/peps/pep-3128/ I should think.
> "New dictionary implementation" is a misnomer here. Mark's patch merely allows sharing the keys array between several dictionaries. The lookup algorithm remains exactly the same as far as I've read. It's actually much less invasive than e.g. Martin's AVL trees-for-hash-collisions proposal.

Antoine is right. It is a reorganisation of the dict, plus a couple of changes to typeobject.c and object.c to ensure that instance dictionaries do indeed share keys arrays. The lookup algorithm remains the same (it works well).

Cheers, Mark
Re: [Python-Dev] A new dictionary implementation
Antoine Pitrou wrote:
> Hi,
> On Sun, 29 Jan 2012 10:31:48 + Mark Shannon wrote:
>> Now that issue 13703 has been largely settled, I want to propose my new dictionary implementation again. It is a little more polished than before.
>> https://bitbucket.org/markshannon/hotpy_new_dict
> I briefly took a look at your code yesterday and it looked generally reasonable to me. It would be nice to open an issue on http://bugs.python.org so that we can review it there (just fill the "repository" field and use the "create patch" button).

Done: http://bugs.python.org/issue13903

Cheers, Mark
Re: [Python-Dev] A new dictionary implementation
2012/1/29 Antoine Pitrou :
> On Sun, 29 Jan 2012 09:56:11 -0500
> Benjamin Peterson wrote:
>
>> 2012/1/29 Mark Shannon :
>> > Hi,
>> >
>> > Now that issue 13703 has been largely settled,
>> > I want to propose my new dictionary implementation again.
>> > It is a little more polished than before.
>>
>> If you're serious about changing the dictionary implementation, I
>> think you should write a PEP. It should explain the new dict's
>> advantages (and disadvantages?) and give comprehensive benchmark
>> numbers. Something along the lines of
>> http://www.python.org/dev/peps/pep-3128/ I should think.
>
> "New dictionary implementation" is a misnomer here. Mark's patch merely
> allows sharing the keys array between several dictionaries. The lookup
> algorithm remains exactly the same as far as I've read. It's actually
> much less invasive than e.g. Martin's AVL trees-for-hash-collisions
> proposal.

Ah, okay. So, the subject makes it sound scarier than it is. :)

--
Regards, Benjamin
Re: [Python-Dev] A new dictionary implementation
On Sun, 29 Jan 2012 09:56:11 -0500 Benjamin Peterson wrote:
> 2012/1/29 Mark Shannon :
> > Hi,
> >
> > Now that issue 13703 has been largely settled,
> > I want to propose my new dictionary implementation again.
> > It is a little more polished than before.
>
> If you're serious about changing the dictionary implementation, I
> think you should write a PEP. It should explain the new dict's
> advantages (and disadvantages?) and give comprehensive benchmark
> numbers. Something along the lines of
> http://www.python.org/dev/peps/pep-3128/ I should think.

"New dictionary implementation" is a misnomer here. Mark's patch merely allows sharing the keys array between several dictionaries. The lookup algorithm remains exactly the same as far as I've read. It's actually much less invasive than e.g. Martin's AVL trees-for-hash-collisions proposal.

Regards

Antoine.
Re: [Python-Dev] A new dictionary implementation
2012/1/29 Mark Shannon :
> Hi,
>
> Now that issue 13703 has been largely settled,
> I want to propose my new dictionary implementation again.
> It is a little more polished than before.

If you're serious about changing the dictionary implementation, I think you should write a PEP. It should explain the new dict's advantages (and disadvantages?) and give comprehensive benchmark numbers. Something along the lines of http://www.python.org/dev/peps/pep-3128/ I should think.

--
Regards, Benjamin
Re: [Python-Dev] A new dictionary implementation
Hi,

On Sun, 29 Jan 2012 10:31:48 + Mark Shannon wrote:
>
> Now that issue 13703 has been largely settled,
> I want to propose my new dictionary implementation again.
> It is a little more polished than before.
>
> https://bitbucket.org/markshannon/hotpy_new_dict

I briefly took a look at your code yesterday and it looked generally reasonable to me. It would be nice to open an issue on http://bugs.python.org so that we can review it there (just fill the "repository" field and use the "create patch" button).

Regards

Antoine.