Heh. I wasn't intending to be nasty, but this program makes our arena
recycling look _much_ worse than memcrunch.py does. It cycles through
phases. In each phase, it first creates a large randomish number of
objects, then deletes half of all objects in existence. Except that
every 10th phase, it deletes 90% instead. It's written to go through
100 phases, but I killed it after 10 because it was obviously going to
keep on growing without bound.
Note 1: to do anything deterministic with obmalloc stats these days
appears to require setting the envar PYTHONHASHSEED to 0 before
running (else stats vary even by the time you get to an interactive
prompt).
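A minimal sketch of what Note 1 relies on, assuming only the stdlib: launch a fresh interpreter with PYTHONHASHSEED=0 in its environment and confirm that string hashes repeat from run to run. The helper name hash_in_child is made up for illustration; nothing here touches obmalloc itself.

```python
import os
import subprocess
import sys

def hash_in_child(seed):
    """Run a fresh interpreter with PYTHONHASHSEED=seed; return hash('obmalloc')."""
    env = dict(os.environ, PYTHONHASHSEED=seed)
    result = subprocess.run(
        [sys.executable, "-c", "print(hash('obmalloc'))"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout)

# With the seed pinned to 0, two separate processes agree on the hash.
first = hash_in_child("0")
second = hash_in_child("0")
print(first == second)
```

The same environment dict could be passed when launching the stress program itself, so that its malloc stats start from the same state each time.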
Note 2: there are 3 heavily used size classes here, for ints,
2-tuples, and class instances, of byte sizes 32, 64, and 96 on 64-bit
boxes, under my PR and under released 3.7.3.
First with my branch, after phase 10 finishes building objects:
    phase 10 adding 9953410
    phase 10 has 16743920 objects
    # arenas allocated total = 3,114
    # arenas reclaimed = 0
    # arenas highwater mark = 3,114
    # arenas allocated current = 3,114
    3114 arenas * 1048576 bytes/arena = 3,265,265,664
    # bytes in allocated blocks = 3,216,332,784
No arenas have ever been reclaimed, but space utilization is excellent
(about 98.5% of arena bytes are being used by objects).
Then after phase 10 deletes 90% of everything still alive:
    phase 10 deleting factor 90% 15069528
    phase 10 done deleting
    # arenas allocated total = 3,114
    # arenas reclaimed = 0
    # arenas highwater mark = 3,114
    # arenas allocated current = 3,114
    3114 arenas * 1048576 bytes/arena = 3,265,265,664
    # bytes in allocated blocks = 323,111,488
Still no arenas have been released, and space utilization is horrid.
A bit less than 10% of allocated space is being used for objects.
Now under 3.7.3. First when phase 10 is done building:
    phase 10 adding 9953410
    phase 10 has 16743920 objects
    # arenas allocated total = 14,485
    # arenas reclaimed = 2,020
    # arenas highwater mark = 12,465
    # arenas allocated current = 12,465
    12465 arenas * 262144 bytes/arena = 3,267,624,960
    # bytes in allocated blocks = 3,216,219,656
Space utilization is again excellent. A significant number of arenas
were reclaimed - but usefully? Let's see how things turn out after
phase 10 ends deleting 90% of the objects:
    phase 10 deleting factor 90% 15069528
    phase 10 done deleting
    # arenas allocated total = 14,485
    # arenas reclaimed = 2,020
    # arenas highwater mark = 12,465
    # arenas allocated current = 12,465
    12465 arenas * 262144 bytes/arena = 3,267,624,960
    # bytes in allocated blocks = 322,998,360
Didn't manage to reclaim anything! Space utilization is again
horrid, and it's actually consuming a few more arena bytes than when
running under the PR.
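For anyone who wants to check the utilization claims, here is a small sketch that recomputes them from the figures quoted above. The numbers are copied from the stats output; the dict layout and the utilization() helper are just illustrative names.

```python
# (arena count, bytes per arena, bytes in allocated blocks), from the stats above
snapshots = {
    "PR, phase 10 built": (3114, 1048576, 3_216_332_784),
    "PR, phase 10 deleted": (3114, 1048576, 323_111_488),
    "3.7.3, phase 10 built": (12465, 262144, 3_216_219_656),
    "3.7.3, phase 10 deleted": (12465, 262144, 322_998_360),
}

def utilization(arenas, arena_size, block_bytes):
    """Fraction of arena bytes occupied by allocated blocks."""
    return block_bytes / (arenas * arena_size)

for label, (arenas, arena_size, block_bytes) in snapshots.items():
    total = arenas * arena_size
    frac = utilization(arenas, arena_size, block_bytes)
    print(f"{label}: {total:,} arena bytes, {frac:.1%} in use")
```

Both builds come out a bit above 98% after building and a bit under 10% after deleting, and the 3.7.3 arena total is the larger of the two.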
Which is just more of what I've been seeing over & over: 3.7.3 and
the PR both do a fine job of recycling arenas, or a horrid job,
depending on the program.
For excellent recycling, change this program to use a dict instead of
a set. So
    data = {}
at the start, fill it with
    data[serial] = Stuff()
and change
    data.pop()
to use .popitem().
The difference is that set elements still appear in pseudo-random
order, but dicts are in insertion-time order. So data.popitem() loses
the most recently added dict entry, and the program is then just
modeling stack allocation/deallocation.
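A tiny sketch of the stack-like behavior described above, using plain object() values as stand-ins for Stuff instances (an assumption made to keep the example small). dict.popitem() is documented to return entries in LIFO order as of Python 3.7.

```python
data = {}
for serial in range(5):
    data[serial] = object()  # stand-in for Stuff()

# popitem() removes the most recently inserted entry each time.
popped = [data.popitem()[0] for _ in range(3)]
print(popped)      # [4, 3, 2]
print(list(data))  # [0, 1]
```

set.pop(), by contrast, makes no promise about which element it returns, which is why the set version frees objects scattered all over the arenas.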
def doit():
    import random
    from random import randrange
    import sys

    class Stuff:
        # add cruft so it takes 96 bytes under 3.7 and 3.8
        __slots__ = tuple("abcdefg")

        def __hash__(self):
            return hash(id(self))

    LO = 5_000_000
    HI = LO * 2
    data = set()
    serial = 0
    random.seed(42)

    for phase in range(1, 101):
        toadd = randrange(LO, HI)
        print("phase", phase, "adding", toadd)
        for _ in range(toadd):
            data.add((serial, Stuff()))
            serial += 1
        print("phase", phase, "has", len(data), "objects")
        sys._debugmallocstats()

        factor = 0.5 if phase % 10 else 0.9
        todelete = int(len(data) * factor)
        print(f"phase {phase} deleting factor {factor:.0%} {todelete}")
        for _ in range(todelete):
            data.pop()
        print("phase", phase, "done deleting")
        sys._debugmallocstats()

doit()