On 08/05/13 03:10, Bjorn Madsen wrote:
import sys
L=[x for x in range(10000)]
sys.getsizeof(L)
43816
L={x for x in range(10000)}
sys.getsizeof(L)
262260
?


Both sets and lists may be over-allocated. I expect that lists created with a 
list comprehension may not be over-allocated, but if you append a single item 
to it, Python *may* resize it and quadruple or double the size. If you don't 
over-allocate, then *every* append becomes slow:

[a, b, c, d] append e:

copy the list, plus one extra slot:

[a, b, c, d]
[a, b, c, d, UNUSED]

insert e:

[a, b, c, d]
[a, b, c, d, e]

delete the original:

[a, b, c, d, e]


Instead of adding just one extra slot, Python quadruples (for small lists) or 
doubles the size, so that *nearly always* your list has plenty of extra slots 
to append into instantly. This makes appending an almost-constant time 
operation, on average, regardless of how big your list is, and big list will 
never use more than twice the amount of memory needed. That's a good trade off 
of space versus time (memory versus speed).


Sets, and dicts, are also over-allocated, because of the nature of how hash tables work. 
They also trade off space for time. The hash table is a big array of slots, some of which 
are flagged as "empty", some of which have an item in them:

[EMPTY, EMPTY, b, EMPTY, d, c, EMPTY, EMPTY, EMPTY, a, EMPTY, e]

When you do a lookup on (say) c, Python calculates a hash of c to get the index 
to look. Say it gets index 5, it looks at index 5 and sees c is there. This 
means it only needs to look in one slot instead of up to 12.

The more empty slots, the better the hash table will perform. If you would like 
more details on how hash tables work, you can google it, or just ask here, but 
the short answer is that in order to get extremely fast almost constant-time 
insertion, deletion and lookup of items in dicts and sets, they trade off some 
memory for speed.


--
Steven
_______________________________________________
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor

Reply via email to