Re: List size versus list allocation

2010-05-02 Thread Steven D'Aprano
On Sun, 02 May 2010 18:07:31 +0200, Christian Heimes wrote:

>> I'm interested in gathering some statistics on this, and to do so I
>> need a way of measuring the list's logical size versus its actual size.
>> The first is easy: it's just len(list). Is there some way of getting
>> the allocated size of the list?
> 
> With Python 2.6 and newer you can use sys.getsizeof() to get the actual
> size of the list object:


Thanks Christian and Peter for your answers. That's very helpful.


-- 
Steven
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: List size versus list allocation

2010-05-02 Thread Peter Otten
Christian Heimes wrote:

 def slots(lst):
> ... return (sys.getsizeof(lst) - sys.getsizeof([])) /

D'oh!

Peter
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: List size versus list allocation

2010-05-02 Thread Christian Heimes

I'm interested in gathering some statistics on this, and to do so I need
a way of measuring the list's logical size versus its actual size. The
first is easy: it's just len(list). Is there some way of getting the
allocated size of the list?


With Python 2.6 and newer you can use sys.getsizeof() to get the actual 
size of the list object:



import sys, struct
sys.getsizeof([])

72

a = [1,2,3]
b = []
b.append(1)
b.append(2)
b.append(3)
sys.getsizeof(a)

96

sys.getsizeof(b)

104

def slots(lst):
... return (sys.getsizeof(lst) - sys.getsizeof([])) / 
struct.calcsize("P")

...

slots(b)

4

slots(a)

3

Christian

--
http://mail.python.org/mailman/listinfo/python-list


Re: List size versus list allocation

2010-05-02 Thread Peter Otten
Steven D'Aprano wrote:

> Python lists are over-allocated: whenever they need to be resized, they
> are made a little bit larger than necessary so that appends will be fast.
> 
> See:
> 
> http://code.python.org/hg/trunk/file/e9d930f8b8ff/Objects/listobject.c
> 
> I'm interested in gathering some statistics on this, and to do so I need
> a way of measuring the list's logical size versus its actual size. The
> first is easy: it's just len(list). Is there some way of getting the
> allocated size of the list?

With some brute force guesswork...

>>> a = [None] * 42
>>> for i in range(10):
... print i, ctypes.c_ulong.from_address(id(a)+i*8)
...
0 c_ulong(1L)
1 c_ulong(8488896L)
2 c_ulong(42L)
3 c_ulong(37432768L)
4 c_ulong(42L)
5 c_ulong(1L)
6 c_ulong(8510496L)
7 c_ulong(32L)
8 c_ulong(18446744073709551615L)
9 c_ulong(8935143189711421440L)
>>> a = []
>>> for i in range(42): a.append(None)
...
>>> for i in range(10):
... print i, ctypes.c_ulong.from_address(id(a)+i*8)
...
0 c_ulong(1L)
1 c_ulong(8488896L)
2 c_ulong(42L)
3 c_ulong(40136160L)
4 c_ulong(46L)
5 c_ulong(0L)
6 c_ulong(0L)
7 c_ulong(0L)
8 c_ulong(0L)
9 c_ulong(0L)

... you can see that the size sits at offset 4*sizeof(long) on my 64-bit 
Python. With that information we can take a look at a list's overallocation 
strategy:

>>> a = []
>>> for i in range(20):
... print i, ctypes.c_ulong.from_address(id(a)+4*8)
... a.append(None)
...
0 c_ulong(0L)
1 c_ulong(4L)
2 c_ulong(4L)
3 c_ulong(4L)
4 c_ulong(4L)
5 c_ulong(8L)
6 c_ulong(8L)
7 c_ulong(8L)
8 c_ulong(8L)
9 c_ulong(16L)
10 c_ulong(16L)
11 c_ulong(16L)
12 c_ulong(16L)
13 c_ulong(16L)
14 c_ulong(16L)
15 c_ulong(16L)
16 c_ulong(16L)
17 c_ulong(25L)
18 c_ulong(25L)
19 c_ulong(25L)

Peter
-- 
http://mail.python.org/mailman/listinfo/python-list