Re: [Python-Dev] A new dictionary implementation

2012-02-17 Thread Mark Shannon

On 15/02/12 21:09, Yury Selivanov wrote:

Hello Mark,

First, I've back-ported your patch on python 3.2.2 (which was relatively
easy).  Almost all tests pass, and those that don't are always failing on
my machine if I remember.  The patch can be found here: http://goo.gl/nSzzY

Then, I compared the memory footprint of one of our applications (300,000 LOC)
and saw that it was about 6% lower than on vanilla python 3.2.2 (660 MB of
reserved process memory compared to 702 MB; Linux Gentoo 64bit).  The
application is written in heavy OOP style (for instance, ~1000 classes are
generated by our ORM on the fly, and there are approximately the same number
of hand-written ones) so I hoped for a much bigger saving.

As for the patch itself I found one use-case, where python with the patch
behaves differently::

   class Foo:
       def __init__(self, msg):
           self.msg = msg

   f = Foo('123')

   class _str(str):
       pass

   print(f.msg)
   print(getattr(f, _str('msg')))

The above snippet works perfectly on vanilla py3.2, but fails on the patched
one (even on 3.3 compiled from your 'cpython_new_dict' branch).  I'm not sure
that it's valid code, though.  If not, then we need to fix some python
internals to add an exact type check in 'getattr', in the 'operator.getattr', etc.
And if it is, your patch needs to be fixed.  In any case, I propose to add
the above code to the python test-suite, expecting either a result or an
exception.


Your code is valid, the bug is in my code.
I've fixed and updated the repository.
More tests to be added later.

Cheers,
Mark.

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] A new dictionary implementation

2012-02-15 Thread Mark Shannon

Any opinions on my new dictionary implementation?

I'm happy to take silence on the PEP as tacit approval,
but the code definitely needs reviewing.

Issue:
http://bugs.python.org/issue13903

PEP:
https://bitbucket.org/markshannon/cpython_new_dict/src/6c4d5d9dfc6d/pep-new-dict.txt

Repository
https://bitbucket.org/markshannon/cpython_new_dict

Cheers,
Mark.


Re: [Python-Dev] A new dictionary implementation

2012-02-15 Thread Antoine Pitrou
On Mon, 13 Feb 2012 12:31:38 +
Mark Shannon m...@hotpy.org wrote:
 Note that the json benchmark is unstable and should be ignored.

Can you elaborate? If it's unstable it should be fixed, not ignored :)

Also, there are two different mako results in your message, which one
is the right one?

Thanks

Antoine.




Re: [Python-Dev] A new dictionary implementation

2012-02-15 Thread Yury Selivanov
Hello Mark,

First, I've back-ported your patch on python 3.2.2 (which was relatively
easy).  Almost all tests pass, and those that don't are always failing on
my machine if I remember.  The patch can be found here: http://goo.gl/nSzzY

Then, I compared the memory footprint of one of our applications (300,000 LOC)
and saw that it was about 6% lower than on vanilla python 3.2.2 (660 MB of
reserved process memory compared to 702 MB; Linux Gentoo 64bit).  The
application is written in heavy OOP style (for instance, ~1000 classes are
generated by our ORM on the fly, and there are approximately the same number
of hand-written ones) so I hoped for a much bigger saving.

As for the patch itself I found one use-case, where python with the patch
behaves differently::

  class Foo:
      def __init__(self, msg):
          self.msg = msg

  f = Foo('123')

  class _str(str):
      pass

  print(f.msg)
  print(getattr(f, _str('msg')))

The above snippet works perfectly on vanilla py3.2, but fails on the patched
one (even on 3.3 compiled from your 'cpython_new_dict' branch).  I'm not sure
that it's valid code, though.  If not, then we need to fix some python
internals to add an exact type check in 'getattr', in the 'operator.getattr',
etc.
And if it is, your patch needs to be fixed.  In any case, I propose to add
the above code to the python test-suite, expecting either a result or an
exception.
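
As a rough illustration of what such a regression check could look like, here
is a minimal sketch.  It assumes the snippet above is valid code (which is how
vanilla CPython treats it); the names StrSubclass and
lookup_with_subclass_name are made up for this example and are not part of
any patch or of the test suite.

```python
class Foo:
    def __init__(self, msg):
        self.msg = msg


class StrSubclass(str):
    """A trivial str subclass, playing the role of _str above."""


def lookup_with_subclass_name(obj, name):
    # Wrap the attribute name in a str subclass before the lookup,
    # so the instance dict is probed with a key that is not an exact str.
    return getattr(obj, StrSubclass(name))


f = Foo('123')

# Both lookups should find the same attribute.
assert f.msg == '123'
assert lookup_with_subclass_name(f, 'msg') == '123'

# The instance dict itself can also be indexed with a str-subclass key,
# because the key hashes and compares equal to the plain 'msg' string.
assert f.__dict__[StrSubclass('msg')] == '123'
```

A check along these lines passes on an unpatched interpreter; on a build where
the lookup takes a fast path for exact strings only, it would be expected to
fail in the way described above.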

Cheers,
Yury

On 2012-02-15, at 12:58 PM, Mark Shannon wrote:

 Any opinions on my new dictionary implementation?
 
 I'm happy to take silence on the PEP as tacit approval,
 but the code definitely needs reviewing.
 
 Issue:
 http://bugs.python.org/issue13903
 
 PEP:
 https://bitbucket.org/markshannon/cpython_new_dict/src/6c4d5d9dfc6d/pep-new-dict.txt
 
 Repository
 https://bitbucket.org/markshannon/cpython_new_dict
 
 Cheers,
 Mark.


Re: [Python-Dev] A new dictionary implementation

2012-02-09 Thread Mark Shannon

francis wrote:

Hi Mark,
I've just cloned :


Repository: https://bitbucket.org/markshannon/cpython_new_dict



Do please try it on your machine(s).

that's a:
Linux random 3.1.0-1-amd64 #1 SMP Tue Jan 10 05:01:58 UTC 2012 x86_64 
GNU/Linux



and I'm getting:

gcc -pthread -c -Wno-unused-result -g -O0 -Wall -Wstrict-prototypes -I. 
-I./Include -DPy_BUILD_CORE -o Objects/dictobject.o Objects/dictobject.c
gcc -pthread -c -Wno-unused-result -g -O0 -Wall -Wstrict-prototypes -I. 
-I./Include -DPy_BUILD_CORE -o Objects/memoryobject.o 
Objects/memoryobject.c

Objects/dictobject.c: In function ‘dict_popitem’:
Objects/dictobject.c:2208:5: error: ‘PyDictKeyEntry’ has no member named 
‘me_value’

make: *** [Objects/dictobject.o] Error 1
make: *** Waiting for unfinished jobs


Bah... typo in assert statement.
My fault for not testing the debug build (release build worked fine).
Both builds working now.

Cheers,
Mark.


Re: [Python-Dev] A new dictionary implementation

2012-02-09 Thread Michael Foord

On 08/02/2012 15:16, Mark Shannon wrote:

Hi,

Version 2 is now available.

Version 2 makes as few changes to tunable constants as possible, and 
generally does not change iteration order (so repr() is unchanged).

All tests pass (the only changes to tests are for sys.getsizeof() ).

Repository: https://bitbucket.org/markshannon/cpython_new_dict
Issue http://bugs.python.org/issue13903

Performance changes are basically zero for non-OO code.
Average -0.5% speed change on 2n3 benchmarks, a few benchmarks show
a small reduction in memory use. (see notes below)

GCbench uses 47% less memory and is 12% faster.
2to3, which seems to be the only realistic benchmark that runs on Py3,
shows no change in speed and uses 10% less memory.


In your first version 2to3 used 28% less memory. Do you know why it's 
worse in this version?


Michael



All benchmarks and tests were performed on an old, slow 32-bit machine
running Linux.
Do please try it on your machine(s).

If accepted, the new dict implementation will allow a useful 
optimisation of the LOAD_GLOBAL (and possibly LOAD_ATTR) bytecode:

By testing whether the (immutable) keys table is the expected table,
the value can be accessed directly by index, rather than by name.

Cheers,
Mark.


Notes:
All benchmarks from http://hg.python.org/benchmarks/
using the -m flag to get memory usage data.

I've ignored the json benchmark, which shows unstable behaviour
on my machine.
Tiny changes to the dict being serialized or to the random seed can 
change the relative speed of my implementation vs CPython from -25% to 
+10%.






--
http://www.voidspace.org.uk/

May you do good and not evil
May you find forgiveness for yourself and forgive others
May you share freely, never taking more than you give.
-- the sqlite blessing http://www.sqlite.org/different.html



Re: [Python-Dev] A new dictionary implementation

2012-02-09 Thread francis

Hi Mark,

Bah... typo in assert statement.
My fault for not testing the debug build (release build worked fine).
Both builds working now.

Yeah, now it is working and passes all tests on my machine too.

I've tried to run the benchmark suite but I'm getting a SyntaxError
(maybe you know why; it's the first time that I've tried the tool):


=
ci@random:~/prog/cpython/benchmarks$ python perf.py -r -b apps python 
../cpython_new_dict/python

Running 2to3...
INFO:root:Running ../cpython_new_dict/python lib/2to3/2to3 -f all 
lib/2to3_data

Traceback (most recent call last):
  File "perf.py", line 2236, in <module>
    main(sys.argv[1:])
  File "perf.py", line 2192, in main
    options)))
  File "perf.py", line 1279, in BM_2to3
    return SimpleBenchmark(Measure2to3, *args, **kwargs)
  File "perf.py", line 706, in SimpleBenchmark
    *args, **kwargs)
  File "perf.py", line 1275, in Measure2to3
    return MeasureCommand(command, trials, env, options.track_memory)
  File "perf.py", line 1223, in MeasureCommand
    CallAndCaptureOutput(command, env=env)
  File "perf.py", line 1053, in CallAndCaptureOutput
    raise RuntimeError(u"Benchmark died: " + unicode(stderr, 'ascii'))
RuntimeError: Benchmark died: Traceback (most recent call last):
  File "lib/2to3/2to3", line 3, in <module>
    from lib2to3.main import main
  File "/home/ci/prog/cpython/benchmarks/lib/2to3/lib2to3/main.py", line 47
    except os.error, err:
                   ^
SyntaxError: invalid syntax
=

And the baseline is Python 2.7.2+ (but it also gives me a SyntaxError
running on python3 default (e50db1b7ad7b)).

What am I doing wrong? (From its doc: “This project is intended to be an
authoritative source of benchmarks for all Python implementations.”)

Thanks in advance !

francis



Re: [Python-Dev] A new dictionary implementation

2012-02-09 Thread Mark Shannon

francis wrote:

Hi Mark,

Bah... typo in assert statement.
My fault for not testing the debug build (release build worked fine).
Both builds working now.

Yeah, now it is working and passes all tests on my machine too.

I've tried to run the benchmark suite but I'm getting a SyntaxError
(maybe you know why; it's the first time that I've tried the tool):


=
ci@random:~/prog/cpython/benchmarks$ python perf.py -r -b apps python 
../cpython_new_dict/python

Running 2to3...
INFO:root:Running ../cpython_new_dict/python lib/2to3/2to3 -f all 
lib/2to3_data

Traceback (most recent call last):
  File "perf.py", line 2236, in <module>
    main(sys.argv[1:])
  File "perf.py", line 2192, in main
    options)))
  File "perf.py", line 1279, in BM_2to3
    return SimpleBenchmark(Measure2to3, *args, **kwargs)
  File "perf.py", line 706, in SimpleBenchmark
    *args, **kwargs)
  File "perf.py", line 1275, in Measure2to3
    return MeasureCommand(command, trials, env, options.track_memory)
  File "perf.py", line 1223, in MeasureCommand
    CallAndCaptureOutput(command, env=env)
  File "perf.py", line 1053, in CallAndCaptureOutput
    raise RuntimeError(u"Benchmark died: " + unicode(stderr, 'ascii'))
RuntimeError: Benchmark died: Traceback (most recent call last):
  File "lib/2to3/2to3", line 3, in <module>
    from lib2to3.main import main
  File "/home/ci/prog/cpython/benchmarks/lib/2to3/lib2to3/main.py", line 47
    except os.error, err:
                   ^
SyntaxError: invalid syntax
=

And the baseline is Python 2.7.2+ (but it also gives me a SyntaxError
running on python3 default (e50db1b7ad7b)).

What am I doing wrong? (From its doc: “This project is intended to be an
authoritative source of benchmarks for all Python implementations.”)


You need to convert the benchmarks to Python 3 using 2to3. Instructions
are in the make_perf3.sh file. You may need to manually fix up the
output as well :(
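
For context, the SyntaxError in the traceback comes from the Python 2-only
"except os.error, err:" spelling, which Python 3 rejects.  The sketch below
shows the Python 3 form of such a handler; remove_if_present is a
hypothetical helper written for this illustration, not code from the
benchmark suite or from lib2to3.

```python
import errno
import os


def remove_if_present(path):
    """Hypothetical helper: delete a file, tolerating its absence."""
    try:
        os.remove(path)
    except OSError as err:  # Python 2 spelling: "except os.error, err:"
        # Only swallow "no such file or directory"; re-raise anything else.
        if err.errno != errno.ENOENT:
            raise
        return False
    return True
```

On current Python 3 versions os.error is simply an alias of OSError, so
catching OSError covers the same cases as the original handler.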


Cheers,
Mark.


Re: [Python-Dev] A new dictionary implementation

2012-02-08 Thread Mark Shannon

Hi,

Version 2 is now available.

Version 2 makes as few changes to tunable constants as possible, and 
generally does not change iteration order (so repr() is unchanged).

All tests pass (the only changes to tests are for sys.getsizeof() ).

Repository: https://bitbucket.org/markshannon/cpython_new_dict
Issue http://bugs.python.org/issue13903

Performance changes are basically zero for non-OO code.
Average -0.5% speed change on 2n3 benchmarks, a few benchmarks show
a small reduction in memory use. (see notes below)

GCbench uses 47% less memory and is 12% faster.
2to3, which seems to be the only realistic benchmark that runs on Py3,
shows no change in speed and uses 10% less memory.

All benchmarks and tests were performed on an old, slow 32-bit machine
running Linux.
Do please try it on your machine(s).

If accepted, the new dict implementation will allow a useful 
optimisation of the LOAD_GLOBAL (and possibly LOAD_ATTR) bytecode:

By testing whether the (immutable) keys table is the expected table,
the value can be accessed directly by index, rather than by name.
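
The idea can be modelled in a few lines of pure Python.  This is only a toy
sketch of the guard-then-index technique, assuming a keys table that is
compared by identity; KeysTable, Namespace and CachedLoad are invented names
and do not correspond to anything in the patch or in the bytecode
interpreter.

```python
class KeysTable:
    """Toy stand-in for a shared, immutable keys table (hypothetical)."""

    def __init__(self, names):
        self.names = tuple(names)
        self.index = {name: i for i, name in enumerate(self.names)}


class Namespace:
    """Toy namespace: a keys table plus a parallel list of values."""

    def __init__(self, keys, values):
        self.keys = keys
        self.values = list(values)


class CachedLoad:
    """Remembers (keys table, slot) from the first lookup of one name."""

    def __init__(self, name):
        self.name = name
        self.expected_keys = None
        self.slot = None
        self.fast_hits = 0

    def load(self, ns):
        if ns.keys is self.expected_keys:
            # Guard passed: same keys table, so the cached slot is valid.
            self.fast_hits += 1
            return ns.values[self.slot]
        # Slow path: look the name up and remember where it was found.
        self.expected_keys = ns.keys
        self.slot = ns.keys.index[self.name]
        return ns.values[self.slot]


keys = KeysTable(['x', 'y'])
ns = Namespace(keys, [10, 20])
load_y = CachedLoad('y')
load_y.load(ns)  # first call takes the slow path and fills the cache
load_y.load(ns)  # second call only checks identity and indexes
```

The guard is a single identity comparison, which is why the keys table has to
be immutable: if it could change in place, a matching table would no longer
guarantee that the cached index still refers to the same name.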

Cheers,
Mark.


Notes:
All benchmarks from http://hg.python.org/benchmarks/
using the -m flag to get memory usage data.

I've ignored the json benchmark, which shows unstable behaviour
on my machine.
Tiny changes to the dict being serialized or to the random seed can 
change the relative speed of my implementation vs CPython from -25% to +10%.



Re: [Python-Dev] A new dictionary implementation

2012-02-08 Thread francis

Just more info: changeset is: 74843:20702d1acf17

Cheers,

francis



Re: [Python-Dev] A new dictionary implementation

2012-02-08 Thread francis

Hi Mark,
I've just cloned :


Repository: https://bitbucket.org/markshannon/cpython_new_dict



Do please try it on your machine(s).

that's a:
Linux random 3.1.0-1-amd64 #1 SMP Tue Jan 10 05:01:58 UTC 2012 x86_64 
GNU/Linux



and I'm getting:

gcc -pthread -c -Wno-unused-result -g -O0 -Wall -Wstrict-prototypes -I. 
-I./Include -DPy_BUILD_CORE -o Objects/dictobject.o Objects/dictobject.c
gcc -pthread -c -Wno-unused-result -g -O0 -Wall -Wstrict-prototypes -I. 
-I./Include -DPy_BUILD_CORE -o Objects/memoryobject.o Objects/memoryobject.c

Objects/dictobject.c: In function ‘dict_popitem’:
Objects/dictobject.c:2208:5: error: ‘PyDictKeyEntry’ has no member named 
‘me_value’

make: *** [Objects/dictobject.o] Error 1
make: *** Waiting for unfinished jobs

Cheers

francis





Re: [Python-Dev] A new dictionary implementation

2012-02-02 Thread Chris Withers

On 01/02/2012 17:50, Guido van Rossum wrote:

Another question: a common pattern is to use (immutable) class
variables as default values for instance variables, and only set the
instance variables once they need to be different. Does such a class
benefit from your improvement?


A less common pattern, but which still needs to work, is where a mutable
class variable is deliberately used to store state across all instances
of a class...


Chris

--
Simplistix - Content Management, Batch Processing & Python Consulting
- http://www.simplistix.co.uk


Re: [Python-Dev] A new dictionary implementation

2012-02-02 Thread Michael Foord

On 02/02/2012 11:30, Chris Withers wrote:

On 01/02/2012 17:50, Guido van Rossum wrote:

Another question: a common pattern is to use (immutable) class
variables as default values for instance variables, and only set the
instance variables once they need to be different. Does such a class
benefit from your improvement?


A less common pattern, but which still needs to work, is where a
mutable class variable is deliberately used to store state across all
instances of a class...


Given that Mark's patch passes the Python test suite I'm sure basic 
patterns like this *work*, the question is which of them take advantage 
of the improved memory efficiency. In the case you mention I don't think 
it's an issue at all, because the class level attribute doesn't 
(generally) appear in instance dicts.


What's also common is where the class holds a *default* value for 
instances, which may be overridden by an instance attribute on *some* 
instances.
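
A small sketch of the default-value pattern may help here.  The Connection
class and its timeout attribute are invented for illustration; the point is
only that the class-level default stays out of an instance's dict until that
instance overrides it.

```python
class Connection:
    timeout = 30  # class-level default, stored once on the class

    def set_timeout(self, seconds):
        self.timeout = seconds  # creates an instance attribute


default = Connection()
custom = Connection()
custom.set_timeout(5)

# The default lives on the class, so it is absent from the instance dict.
print('timeout' in vars(default))  # False
print(default.timeout)             # 30

# Overriding it adds a key to this one instance's dict only.
print('timeout' in vars(custom))   # True
print(custom.timeout)              # 5
```

So instances that keep the default never gain the extra key, while the ones
that override it do, which is the case the memory question is about.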


All the best,

Michael Foord


Chris




--
http://www.voidspace.org.uk/

May you do good and not evil
May you find forgiveness for yourself and forgive others
May you share freely, never taking more than you give.
-- the sqlite blessing http://www.sqlite.org/different.html



Re: [Python-Dev] A new dictionary implementation

2012-02-02 Thread Antoine Pitrou
On Wed, 1 Feb 2012 09:50:55 -0800
Guido van Rossum gu...@python.org wrote:
 On Wed, Feb 1, 2012 at 9:13 AM, Hans Mulder han...@xs4all.nl wrote:
  On 30/01/12 00:30:14, Steven D'Aprano wrote:
 
  Mark Shannon wrote:
 
  Antoine Pitrou wrote:
 
  [..]
 
  Antoine is right. It is a reorganisation of the dict, plus a couple of
  changes to typeobject.c and object.c to ensure that instance
  dictionaries do indeed share keys arrays.
 
 
 
  I don't quite follow how that could work.
 
  If I have this:
 
  class C:
      pass
 
  a = C()
  b = C()
 
  a.spam = 1
  b.ham = 2
 
 
  how can a.__dict__ and b.__dict__ share key arrays? I've tried reading
  the source, but I'm afraid I don't understand it well enough to make
  sense of it.
 
 
  They can't.
 
  But then, your class is atypical.  Usually, classes initialize all the
  attributes of their instances in the __init__ method, perhaps like so:
 
  class D:
     def __init__(self, ham=None, spam=None):
         self.ham = ham
         self.spam = spam
 
  As long as you follow the common practice of not adding any attributes
  after the object has been initialized, your instances can share their
  keys array.  Mark's patch will do that.
 
  You'll still be allowed to have different attributes per instance, but
  if you do that, then the patch doesn't buy you much.
 
 Hey, I like this! It's a subtle encouragement for developers to
 initialize all their instance variables in their __init__ or __new__
 method, with a (modest) performance improvement for a carrot. (Though
 I have to admit I have no idea how you do it. Wouldn't the set of dict
 keys be different while __init__ is in the middle of setting the
 instance variables?)
 
 Another question: a common pattern is to use (immutable) class
 variables as default values for instance variables, and only set the
 instance variables once they need to be different. Does such a class
 benefit from your improvement?

I'm not sure who "you" is in your e-mail, but AFAICT Mark's patch
doesn't special-case __init__ or __new__. Any attribute setting on an
instance uses the shared keys array on the instance's type. Missing
attributes on an instance are simply NULL pointers in the instance's
values array.

(I've suggested that the keys array be bounded in size, to avoid
pathological cases where someone (ab)uses instances as fancy dicts and
puts lots of random data in them)

Regards

Antoine.




Re: [Python-Dev] A new dictionary implementation

2012-02-02 Thread Martin v. Löwis
Am 02.02.2012 12:30, schrieb Chris Withers:
 On 01/02/2012 17:50, Guido van Rossum wrote:
 Another question: a common pattern is to use (immutable) class
 variables as default values for instance variables, and only set the
 instance variables once they need to be different. Does such a class
 benefit from your improvement?
 
 A less common pattern, but which still needs to work, is where a mutable
 class variable is deliberately used to store state across all instances
 of a class...

This is really *just* a dictionary implementation. It doesn't affect any
of the lookup procedures. If you trust that the dictionary semantics on
its own isn't changed (which I believe is the case, except for key
order), none of the dict applications will change.
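
For reference, the shared-state pattern Chris mentions looks roughly like the
sketch below.  Registry and its seen list are made-up names; the example only
relies on ordinary attribute lookup falling back to the class, which a change
confined to the dict implementation would not touch.

```python
class Registry:
    seen = []  # one list object, reachable from every instance via the class

    def record(self, item):
        # Mutates the class-level list in place; no instance attribute
        # is created, so the instance dict stays empty.
        self.seen.append(item)


a = Registry()
b = Registry()
a.record('x')

# b observes the item recorded through a, because both resolve
# 'seen' to the same list on the class.
assert b.seen is a.seen
assert 'x' in b.seen
assert vars(a) == {}
```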

Regards,
Martin


Re: [Python-Dev] A new dictionary implementation

2012-02-02 Thread Mark Shannon

Just a quick update.

I've been analysing and profiling the behaviour of my new dict and messing
about with various implementation options.


I've settled on a new implementation.
It's the same basic idea, but with better locality of reference for
unshared keys.


Guido asked:

 Another question: a common pattern is to use (immutable) class
 variables as default values for instance variables, and only set the
 instance variables once they need to be different. Does such a class
 benefit from your improvement?

For those instances which keep the default, yes.
Otherwise, as Martin pointed out, it could,
provided that adding a new key does not force a resize,
although exactly when a resize occurs is a bit arbitrary.
The new version will incorporate this behaviour.

Expect version 2 soon.

Cheers,
Mark.


Re: [Python-Dev] A new dictionary implementation

2012-02-01 Thread Hans Mulder

On 30/01/12 00:30:14, Steven D'Aprano wrote:

Mark Shannon wrote:

Antoine Pitrou wrote:

[..]

Antoine is right. It is a reorganisation of the dict, plus a couple of
changes to typeobject.c and object.c to ensure that instance
dictionaries do indeed share keys arrays.



I don't quite follow how that could work.

If I have this:

class C:
    pass

a = C()
b = C()

a.spam = 1
b.ham = 2


how can a.__dict__ and b.__dict__ share key arrays? I've tried reading
the source, but I'm afraid I don't understand it well enough to make
sense of it.


They can't.

But then, your class is atypical.  Usually, classes initialize all the
attributes of their instances in the __init__ method, perhaps like so:

class D:
    def __init__(self, ham=None, spam=None):
        self.ham = ham
        self.spam = spam

As long as you follow the common practice of not adding any attributes
after the object has been initialized, your instances can share their
keys array.  Mark's patch will do that.

You'll still be allowed to have different attributes per instance, but
if you do that, then the patch doesn't buy you much.

-- HansM






Re: [Python-Dev] A new dictionary implementation

2012-02-01 Thread Guido van Rossum
On Wed, Feb 1, 2012 at 9:13 AM, Hans Mulder han...@xs4all.nl wrote:
 On 30/01/12 00:30:14, Steven D'Aprano wrote:

 Mark Shannon wrote:

 Antoine Pitrou wrote:

 [..]

 Antoine is right. It is a reorganisation of the dict, plus a couple of
 changes to typeobject.c and object.c to ensure that instance
 dictionaries do indeed share keys arrays.



 I don't quite follow how that could work.

 If I have this:

 class C:
     pass

 a = C()
 b = C()

 a.spam = 1
 b.ham = 2


 how can a.__dict__ and b.__dict__ share key arrays? I've tried reading
 the source, but I'm afraid I don't understand it well enough to make
 sense of it.


 They can't.

 But then, your class is atypical.  Usually, classes initialize all the
 attributes of their instances in the __init__ method, perhaps like so:

 class D:
    def __init__(self, ham=None, spam=None):
        self.ham = ham
        self.spam = spam

 As long as you follow the common practice of not adding any attributes
 after the object has been initialized, your instances can share their
 keys array.  Mark's patch will do that.

 You'll still be allowed to have different attributes per instance, but
 if you do that, then the patch doesn't buy you much.

Hey, I like this! It's a subtle encouragement for developers to
initialize all their instance variables in their __init__ or __new__
method, with a (modest) performance improvement for a carrot. (Though
I have to admit I have no idea how you do it. Wouldn't the set of dict
keys be different while __init__ is in the middle of setting the
instance variables?)

Another question: a common pattern is to use (immutable) class
variables as default values for instance variables, and only set the
instance variables once they need to be different. Does such a class
benefit from your improvement?

 -- HansM

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] A new dictionary implementation

2012-02-01 Thread Alex
Guido van Rossum guido at python.org writes:

 Hey, I like this! It's a subtle encouragement for developers to
 initialize all their instance variables in their __init__ or __new__
 method, with a (modest) performance improvement for a carrot. (Though
 I have to admit I have no idea how you do it. Wouldn't the set of dict
 keys be different while __init__ is in the middle of setting the
 instance variables?)
 
 Another question: a common pattern is to use (immutable) class
 variables as default values for instance variables, and only set the
 instance variables once they need to be different. Does such a class
 benefit from your improvement?
 
  -- HansM
 


While I absolutely cannot speak to this implementation, traditionally this type
of approach is referred to as maps, and was pioneered in SELF, originally
presented at OOPSLA '89: http://dl.acm.org/citation.cfm?id=74884 .  PyPy also
uses these maps to back its objects, although from what I've read the
implementation looks nothing like the proposed one for CPython. You can read
about that here: http://bit.ly/zwlOkV , and if you're really excited about this
you can read our implementation here:
https://bitbucket.org/pypy/pypy/src/default/pypy/objspace/std/mapdict.py .

Alex




Re: [Python-Dev] A new dictionary implementation

2012-02-01 Thread martin

Hey, I like this! It's a subtle encouragement for developers to
initialize all their instance variables in their __init__ or __new__
method, with a (modest) performance improvement for a carrot. (Though
I have to admit I have no idea how you do it. Wouldn't the set of dict
keys be different while __init__ is in the middle of setting the
instance variables?)


The type's attribute set will be a superset of the instance's, for
a shared key set. Initializing the first instance grows the key set,
which is put into the type. Subsequent instances start out with the
key set as a candidate, and have all values set to NULL in the dict
values set. As long as you are only setting attributes that are in the
shared key set, the values just get set. When it encounters a key not
in the shared key set, the dict dissociates itself from the shared key
set.


Another question: a common pattern is to use (immutable) class
variables as default values for instance variables, and only set the
instance variables once they need to be different. Does such a class
benefit from your improvement?


It depends. IIUC, if the first instance happens to get this attribute
set, it ends up in the shared key set, and subsequent instances may have
a NULL value for the key.

I'm unsure how *exactly* the key set gets frozen. You cannot allow resizing
the key set once it is shared, as you would have to find all instances with
the same key set and resize their values. It would be possible (IIUC) to
add more keys to the shared key set if that doesn't cause a resize, but I'm
not sure whether the patch does that.
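
The behaviour described above can be modelled in pure Python.  This is a toy
sketch of the description only, not of the patch: SharedKeys, InstanceDict
and the may_grow flag are invented here, and the rule that only the first
instance may grow the key set is a simplifying assumption standing in for
whatever the real freezing rule is.

```python
_NULL = object()  # stands in for a NULL pointer in the values array


class SharedKeys:
    """Toy shared key set, kept on the type in the description above."""

    def __init__(self):
        self.names = []  # a name's position is its slot index


class InstanceDict:
    """Toy instance dict: shared keys plus a private values array."""

    def __init__(self, shared):
        self.shared = shared
        self.values = [_NULL] * len(shared.names)
        self.private = None  # becomes a plain dict once dissociated

    def is_shared(self):
        return self.private is None

    def set(self, name, value, may_grow=False):
        if self.private is not None:
            self.private[name] = value
        elif name in self.shared.names:
            self._store(self.shared.names.index(name), value)
        elif may_grow:
            # Assumption: only the first instance may grow the key set.
            self.shared.names.append(name)
            self._store(len(self.shared.names) - 1, value)
        else:
            # Unknown key: copy the live slots out and stop sharing.
            self.private = {
                n: v for n, v in zip(self.shared.names, self.values)
                if v is not _NULL
            }
            self.private[name] = value
            self.shared = None

    def _store(self, slot, value):
        # Pad with NULLs in case the key set grew after we were created.
        self.values.extend([_NULL] * (slot + 1 - len(self.values)))
        self.values[slot] = value

    def get(self, name):
        if self.private is not None:
            return self.private[name]
        if name in self.shared.names:
            slot = self.shared.names.index(name)
            if slot < len(self.values) and self.values[slot] is not _NULL:
                return self.values[slot]
        raise KeyError(name)


shared = SharedKeys()
first = InstanceDict(shared)
first.set('ham', 1, may_grow=True)   # first instance grows the key set
first.set('spam', 2, may_grow=True)

second = InstanceDict(shared)        # starts with all slots NULL
second.set('spam', 5)                # known key: just fills a slot
second.set('eggs', 7)                # unknown key: second dissociates
```

After the last line, first still uses the shared key set while second holds
an ordinary private dict, so only instances that stay within the shared keys
keep the memory saving.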

Regards,
Martin



[Python-Dev] A new dictionary implementation

2012-01-29 Thread Mark Shannon

Hi,

Now that issue 13703 has been largely settled,
I want to propose my new dictionary implementation again.
It is a little more polished than before.

https://bitbucket.org/markshannon/hotpy_new_dict

Object-oriented benchmarks use considerably less memory and are
sometimes faster (by a small amount).
(I've only benchmarked on my old 32bit machine)

E.g.  2to3      No speed change   -28% memory
      GCbench   +10% speed        -47% memory

Other benchmarks show little or no change in behaviour,
mainly minor memory savings.

If an application is OO and uses lots of memory
the new dict will save a lot of memory and maybe boost performance.
Other applications will be largely unaffected.

It passes all the tests.
(I had to change a couple that relied on dict repr() ordering)

Cheers,
Mark.


Re: [Python-Dev] A new dictionary implementation

2012-01-29 Thread Antoine Pitrou

Hi,

On Sun, 29 Jan 2012 10:31:48 +
Mark Shannon m...@hotpy.org wrote:
 
 Now that issue 13703 has been largely settled,
 I want to propose my new dictionary implementation again.
 It is a little more polished than before.
 
 https://bitbucket.org/markshannon/hotpy_new_dict

I briefly took a look at your code yesterday and it looked generally
reasonable to me. It would be nice to open an issue on
http://bugs.python.org so that we can review it there (just fill in the
repository field and use the "create patch" button).

Regards

Antoine.




Re: [Python-Dev] A new dictionary implementation

2012-01-29 Thread Benjamin Peterson
2012/1/29 Mark Shannon m...@hotpy.org:
> Hi,
>
> Now that issue 13703 has been largely settled,
> I want to propose my new dictionary implementation again.
> It is a little more polished than before.

If you're serious about changing the dictionary implementation, I
think you should write a PEP. It should explain the new dict's
advantages (and disadvantages?) and give comprehensive benchmark
numbers. Something along the lines of
http://www.python.org/dev/peps/pep-3128/ I should think.


-- 
Regards,
Benjamin


Re: [Python-Dev] A new dictionary implementation

2012-01-29 Thread Antoine Pitrou
On Sun, 29 Jan 2012 09:56:11 -0500
Benjamin Peterson benja...@python.org wrote:

> 2012/1/29 Mark Shannon m...@hotpy.org:
>> Hi,
>>
>> Now that issue 13703 has been largely settled,
>> I want to propose my new dictionary implementation again.
>> It is a little more polished than before.
>
> If you're serious about changing the dictionary implementation, I
> think you should write a PEP. It should explain the new dict's
> advantages (and disadvantages?) and give comprehensive benchmark
> numbers. Something along the lines of
> http://www.python.org/dev/peps/pep-3128/ I should think.

"New dictionary implementation" is a misnomer here. Mark's patch merely
allows the keys array to be shared between several dictionaries. The lookup
algorithm remains exactly the same as far as I've read. It's actually
much less invasive than e.g. Martin's AVL trees-for-hash-collisions
proposal.

Regards

Antoine.





Re: [Python-Dev] A new dictionary implementation

2012-01-29 Thread Benjamin Peterson
2012/1/29 Antoine Pitrou solip...@pitrou.net:
> On Sun, 29 Jan 2012 09:56:11 -0500
> Benjamin Peterson benja...@python.org wrote:
>
>> 2012/1/29 Mark Shannon m...@hotpy.org:
>>> Hi,
>>>
>>> Now that issue 13703 has been largely settled,
>>> I want to propose my new dictionary implementation again.
>>> It is a little more polished than before.
>>
>> If you're serious about changing the dictionary implementation, I
>> think you should write a PEP. It should explain the new dict's
>> advantages (and disadvantages?) and give comprehensive benchmark
>> numbers. Something along the lines of
>> http://www.python.org/dev/peps/pep-3128/ I should think.
>
> "New dictionary implementation" is a misnomer here. Mark's patch merely
> allows the keys array to be shared between several dictionaries. The lookup
> algorithm remains exactly the same as far as I've read. It's actually
> much less invasive than e.g. Martin's AVL trees-for-hash-collisions
> proposal.

Ah, okay. So, the subject makes it sound scarier than it is. :)



-- 
Regards,
Benjamin


Re: [Python-Dev] A new dictionary implementation

2012-01-29 Thread Mark Shannon

Antoine Pitrou wrote:

> Hi,
>
> On Sun, 29 Jan 2012 10:31:48 +
> Mark Shannon m...@hotpy.org wrote:
>
>> Now that issue 13703 has been largely settled,
>> I want to propose my new dictionary implementation again.
>> It is a little more polished than before.
>>
>> https://bitbucket.org/markshannon/hotpy_new_dict
>
> I briefly took a look at your code yesterday and it looked generally
> reasonable to me. It would be nice to open an issue on
> http://bugs.python.org so that we can review it there (just fill the
> repository field and use the create patch button).


Done:  http://bugs.python.org/issue13903

Cheers,
Mark


Re: [Python-Dev] A new dictionary implementation

2012-01-29 Thread Mark Shannon

Antoine Pitrou wrote:

> On Sun, 29 Jan 2012 09:56:11 -0500
> Benjamin Peterson benja...@python.org wrote:
>
>> 2012/1/29 Mark Shannon m...@hotpy.org:
>>
>>> Hi,
>>>
>>> Now that issue 13703 has been largely settled,
>>> I want to propose my new dictionary implementation again.
>>> It is a little more polished than before.
>>
>> If you're serious about changing the dictionary implementation, I
>> think you should write a PEP. It should explain the new dict's
>> advantages (and disadvantages?) and give comprehensive benchmark
>> numbers. Something along the lines of
>> http://www.python.org/dev/peps/pep-3128/ I should think.
>
> "New dictionary implementation" is a misnomer here. Mark's patch merely
> allows the keys array to be shared between several dictionaries. The lookup
> algorithm remains exactly the same as far as I've read. It's actually
> much less invasive than e.g. Martin's AVL trees-for-hash-collisions
> proposal.



Antoine is right. It is a reorganisation of the dict, plus a couple of 
changes to typeobject.c and object.c to ensure that instance 
dictionaries do indeed share keys arrays.

The lookup algorithm remains the same (it works well).
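
A toy model of the key-sharing idea, in Python rather than C, may help picture it. This is only a sketch under assumptions: `SharedKeys` and `SplitDict` are hypothetical names, the key-to-slot mapping uses an ordinary dict for brevity, and none of this mirrors the actual data layout in the patch.

```python
_MISSING = object()  # marks a slot that holds no value in this instance


class SharedKeys:
    """Toy stand-in for a keys array shared by all instances of a class."""

    def __init__(self):
        self.index = {}  # key -> slot number in each values array

    def slot_for(self, key, create=False):
        if key not in self.index and create:
            self.index[key] = len(self.index)
        return self.index.get(key)


class SplitDict:
    """Toy dict that stores only values; the keys live in a SharedKeys."""

    def __init__(self, shared):
        self.shared = shared
        self.values = []

    def __setitem__(self, key, value):
        slot = self.shared.slot_for(key, create=True)
        # Grow this instance's values array on demand.
        while len(self.values) <= slot:
            self.values.append(_MISSING)
        self.values[slot] = value

    def __getitem__(self, key):
        slot = self.shared.slot_for(key)
        if (slot is None or slot >= len(self.values)
                or self.values[slot] is _MISSING):
            raise KeyError(key)
        return self.values[slot]

    def keys(self):
        return [k for k, s in self.shared.index.items()
                if s < len(self.values) and self.values[s] is not _MISSING]


shared = SharedKeys()
a, b = SplitDict(shared), SplitDict(shared)
a["x"] = 1
a["y"] = 2
b["x"] = 10
b["y"] = 20
```

Here `a` and `b` each hold a two-element values list, while the keys "x" and "y" are stored once in `shared`. That is where the memory saving for many instances of one class would come from in this model.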

Cheers,
Mark


Re: [Python-Dev] A new dictionary implementation

2012-01-29 Thread francis

On 01/29/2012 11:31 AM, Mark Shannon wrote:

> It passes all the tests.
> (I had to change a couple that relied on dict repr() ordering)


Hi Mark,
I've cloned the repo, built it, then tried ./python -m test. I
got some errors:


First in general:
340 tests OK.
2 tests failed:
test_dis test_gdb
4 tests altered the execution environment:
test_multiprocessing test_packaging test_site test_strlit
18 tests skipped:
test_curses test_devpoll test_kqueue test_lzma test_msilib
test_ossaudiodev test_smtpnet test_socketserver test_startfile
test_timeout test_tk test_ttk_guionly test_urllib2net
test_urllibnet test_winreg test_winsound test_xmlrpc_net
test_zipfile64
1 skip unexpected on linux:
test_lzma
[1348560 refs]


then test_dis:

== CPython 3.3.0a0 (default:f15cf35c9922, Jan 29 2012, 18:12:19) [GCC 4.6.2]
==   Linux-3.1.0-1-amd64-x86_64-with-debian-wheezy-sid little-endian
==   /home/ci/prog/cpython/hotpy_new_dict/build/test_python_14470
Testing with flags: sys.flags(debug=0, inspect=0, interactive=0, 
optimize=0, dont_write_bytecode=0, no_user_site=0, no_site=0, 
ignore_environment=0, verbose=0, bytes_warning=0, quiet=0)

[1/1] test_dis
test_big_linenos (test.test_dis.DisTests) ... ok
test_boundaries (test.test_dis.DisTests) ... ok
test_bug_1333982 (test.test_dis.DisTests) ... ok
test_bug_708901 (test.test_dis.DisTests) ... ok
test_dis (test.test_dis.DisTests) ... ok
test_dis_none (test.test_dis.DisTests) ... ok
test_dis_object (test.test_dis.DisTests) ... ok
test_dis_traceback (test.test_dis.DisTests) ... ok
test_disassemble_bytes (test.test_dis.DisTests) ... ok
test_disassemble_method (test.test_dis.DisTests) ... ok
test_disassemble_method_bytes (test.test_dis.DisTests) ... ok
test_disassemble_str (test.test_dis.DisTests) ... ok
test_opmap (test.test_dis.DisTests) ... ok
test_opname (test.test_dis.DisTests) ... ok
test_code_info (test.test_dis.CodeInfoTests) ... FAIL
test_code_info_object (test.test_dis.CodeInfoTests) ... ok
test_pretty_flags_no_flags (test.test_dis.CodeInfoTests) ... ok
test_show_code (test.test_dis.CodeInfoTests) ... FAIL

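======================================================================
FAIL: test_code_info (test.test_dis.CodeInfoTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/ci/prog/cpython/hotpy_new_dict/Lib/test/test_dis.py", line 439, in test_code_info
    self.assertRegex(dis.code_info(x), expected)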
AssertionError: Regex didn't match: 'Name:  
f\nFilename:  (.*)\nArgument count:1\nKw-only arguments: 
0\nNumber of locals:  1\nStack size:8\nFlags: 
OPTIMIZED, NEWLOCALS, NESTED\nConstants:\n   0: None\nNames:\n   0: 
print\nVariable names:\n   0: c\nFree variables:\n   0: e\n   1: d\n   
2: f\n   3: y\n   4: x\n   5: z' not found in 'Name:  
f\nFilename:  
/home/ci/prog/cpython/hotpy_new_dict/Lib/test/test_dis.py\nArgument 
count:1\nKw-only arguments: 0\nNumber of locals:  1\nStack 
size:8\nFlags: OPTIMIZED, NEWLOCALS, 
NESTED\nConstants:\n   0: None\nNames:\n   0: print\nVariable names:\n   
0: c\nFree variables:\n   0: y\n   1: e\n   2: d\n   3: f\n   4: x\n   5: z'


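======================================================================
FAIL: test_show_code (test.test_dis.CodeInfoTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/ci/prog/cpython/hotpy_new_dict/Lib/test/test_dis.py", line 446, in test_show_code
    self.assertRegex(output.getvalue(), expected+"\n")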
AssertionError: Regex didn't match: 'Name:  
f\nFilename:  (.*)\nArgument count:1\nKw-only arguments: 
0\nNumber of locals:  1\nStack size:8\nFlags: 
OPTIMIZED, NEWLOCALS, NESTED\nConstants:\n   0: None\nNames:\n   0: 
print\nVariable names:\n   0: c\nFree variables:\n   0: e\n   1: d\n   
2: f\n   3: y\n   4: x\n   5: z\n' not found in 'Name:  
f\nFilename:  
/home/ci/prog/cpython/hotpy_new_dict/Lib/test/test_dis.py\nArgument 
count:1\nKw-only arguments: 0\nNumber of locals:  1\nStack 
size:8\nFlags: OPTIMIZED, NEWLOCALS, 
NESTED\nConstants:\n   0: None\nNames:\n   0: print\nVariable names:\n   
0: c\nFree variables:\n   0: y\n   1: e\n   2: d\n   3: f\n   4: x\n   
5: z\n'


----------------------------------------------------------------------
Ran 18 tests in 0.070s

FAILED (failures=2)
test test_dis failed
1 test failed:
test_dis
[111919 refs]

*
For test gdb:

Lots of output .

Ran 42 tests in 11.361s

FAILED (failures=28)
test test_gdb failed
1 test failed:
test_gdb
[109989 refs]



Re: [Python-Dev] A new dictionary implementation

2012-01-29 Thread Martin v. Löwis
> Now that issue 13703 has been largely settled,
> I want to propose my new dictionary implementation again.
> It is a little more polished than before.

Please clarify the status of that code: are you actually proposing
6a21f3b35e20 for inclusion into Python as-is? If so, please post it
as a patch to the tracker, as it will need to be reviewed (possibly
with requests for further changes).

If not, it would be good if you could give a list of things that need to
be done before you consider submission to Python.

Also, please submit a contrib form if you haven't done so.

Regards,
Martin


Re: [Python-Dev] A new dictionary implementation

2012-01-29 Thread Mark Shannon

francis wrote:

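> On 01/29/2012 11:31 AM, Mark Shannon wrote:
>
>> It passes all the tests.
>> (I had to change a couple that relied on dict repr() ordering)
>
> Hi Mark,
> I've cloned the repo, built it, then tried ./python -m test. I
> got some errors:
>
> First in general:
> 340 tests OK.
> 2 tests failed:
> test_dis test_gdb

[snip]

> then test_dis:

[snip]

> ======================================================================
> FAIL: test_code_info (test.test_dis.CodeInfoTests)
> ----------------------------------------------------------------------

[snip]

> ======================================================================
> FAIL: test_show_code (test.test_dis.CodeInfoTests)
> ----------------------------------------------------------------------

[snip]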

These are known failures; the tests are at fault, as they rely on dict
ordering. However, they should have been commented out. They probably crept
back in when I pulled the latest version of cpython -- I'll fix them now.
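
As a sketch of how such a check can be made independent of ordering (the functions `outer`, `inner` and `freevars_match` are hypothetical examples, not the code in test_dis): compare the free variable names as a set instead of matching the order in which they are printed.

```python
def outer(x, y, z):
    def inner(c):
        # x, y and z are free variables of inner.
        return c + x + y + z
    return inner


code = outer(1, 2, 3).__code__


def freevars_match(code, expected_names):
    # Order-insensitive comparison of the free variable names.
    return set(code.co_freevars) == set(expected_names)
```

A call such as `freevars_match(code, ["z", "y", "x"])` then succeeds whatever order the names are stored in.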


[snip]


> *
> For test gdb:
>
> Lots of output .
>
> Ran 42 tests in 11.361s
>
> FAILED (failures=28)
> test test_gdb failed
> 1 test failed:
> test_gdb
> [109989 refs]


I still have gdb 6.something;
would you mail me the full output please,
so I can see what the problem is.

Cheers,
Mark.


Re: [Python-Dev] A new dictionary implementation

2012-01-29 Thread Mark Shannon

Martin v. Löwis wrote:

>> Now that issue 13703 has been largely settled,
>> I want to propose my new dictionary implementation again.
>> It is a little more polished than before.
>
> Please clarify the status of that code: are you actually proposing
> 6a21f3b35e20 for inclusion into Python as-is? If so, please post it
> as a patch to the tracker, as it will need to be reviewed (possibly
> with requests for further changes).


I thought it already was a patch. What do I need to do to make it a patch?



> If not, it would be good if you could give a list of things that need to
> be done before you consider submission to Python.


A few tests that rely on dict ordering should probably be fixed first.
I'll submit bug reports for those.



> Also, please submit a contrib form if you haven't done so.


Where do I find it?

Cheers,
Mark.


Re: [Python-Dev] A new dictionary implementation

2012-01-29 Thread Martin v. Löwis
>> Please clarify the status of that code: are you actually proposing
>> 6a21f3b35e20 for inclusion into Python as-is? If so, please post it
>> as a patch to the tracker, as it will need to be reviewed (possibly
>> with requests for further changes).
>
> I thought it already was a patch. What do I need to do to make it a patch?

I missed your announcement of issue13903; all is fine here.

> Where do I find it?

http://www.python.org/psf/contrib/contrib-form-python/

Thanks,
Martin


Re: [Python-Dev] A new dictionary implementation

2012-01-29 Thread Mark Shannon

Matt Joiner wrote:
> Mark, Good luck with getting this in, I'm also hopeful about coroutines,
> maybe after pushing your dict optimization your coroutine implementation
> will get more consideration.


Shush, don't say the C word or you'll put people off ;)

I'm actually not that fussed about the coroutine implementation.
With yield from, generators have all the power of asymmetric coroutines.
I think my coroutine implementation is a neater way to do things,
but it is not worth the fuss.
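
A minimal sketch of what this means in practice (the generator names `inner` and `outer` are hypothetical): with `yield from`, values sent to the outer generator are delivered to the inner one, and the inner generator's return value comes back as the value of the `yield from` expression.

```python
def inner():
    # Values sent to the outer generator are delivered here.
    received = yield "ready"
    total = 0
    while received is not None:
        total += received
        received = yield total
    return total  # becomes the value of the `yield from` expression


def outer(results):
    # Delegation: next()/send() on outer are forwarded to inner.
    final = yield from inner()
    results.append(final)


results = []
gen = outer(results)
first = next(gen)                       # runs inner up to its first yield
running = [gen.send(1), gen.send(2)]    # running totals yielded by inner
try:
    gen.send(None)                      # ends inner, then outer finishes
except StopIteration:
    pass
```

The caller only ever talks to `gen`, yet control passes back and forth with `inner`, which is the asymmetric coroutine pattern.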

Anyway, I'm working on my next crazy experiment :)

Cheers,
Mark.


Re: [Python-Dev] A new dictionary implementation

2012-01-29 Thread Steven D'Aprano

Mark Shannon wrote:

> Antoine Pitrou wrote:
>
>> On Sun, 29 Jan 2012 09:56:11 -0500
>> Benjamin Peterson benja...@python.org wrote:
>>
>>> 2012/1/29 Mark Shannon m...@hotpy.org:
>>>
>>>> Hi,
>>>>
>>>> Now that issue 13703 has been largely settled,
>>>> I want to propose my new dictionary implementation again.
>>>> It is a little more polished than before.
>>>
>>> If you're serious about changing the dictionary implementation, I
>>> think you should write a PEP. It should explain the new dict's
>>> advantages (and disadvantages?) and give comprehensive benchmark
>>> numbers. Something along the lines of
>>> http://www.python.org/dev/peps/pep-3128/ I should think.
>>
>> "New dictionary implementation" is a misnomer here. Mark's patch merely
>> allows the keys array to be shared between several dictionaries. The lookup
>> algorithm remains exactly the same as far as I've read. It's actually
>> much less invasive than e.g. Martin's AVL trees-for-hash-collisions
>> proposal.
>
> Antoine is right. It is a reorganisation of the dict, plus a couple of
> changes to typeobject.c and object.c to ensure that instance
> dictionaries do indeed share keys arrays.



I don't quite follow how that could work.

If I have this:

class C:
    pass

a = C()
b = C()

a.spam = 1
b.ham = 2


how can a.__dict__ and b.__dict__ share key arrays? I've tried reading the 
source, but I'm afraid I don't understand it well enough to make sense of it.





--
Steven



Re: [Python-Dev] A new dictionary implementation

2012-01-29 Thread francis



> I still have gdb 6.something;
> would you mail me the full output please,
> so I can see what the problem is.

It's done, let me know if you need more output.

Cheers,
francis
