[issue38373] List overallocation strategy

2020-03-17 Thread Serhiy Storchaka


Change by Serhiy Storchaka :


--
resolution:  -> fixed
stage: patch review -> resolved
status: open -> closed

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue38373] List overallocation strategy

2020-03-17 Thread Serhiy Storchaka


Serhiy Storchaka  added the comment:


New changeset 2fe815edd6778fb9deef8f8044848647659c2eb8 by Serhiy Storchaka in 
branch 'master':
bpo-38373: Change list overallocating strategy. (GH-18952)
https://github.com/python/cpython/commit/2fe815edd6778fb9deef8f8044848647659c2eb8


--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue38373] List overallocation strategy

2020-03-11 Thread Serhiy Storchaka


Change by Serhiy Storchaka :


--
keywords: +patch
pull_requests: +18303
stage:  -> patch review
pull_request: https://github.com/python/cpython/pull/18952

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue38373] List overallocation strategy

2019-10-16 Thread Inada Naoki


Change by Inada Naoki :


--
nosy: +inada.naoki

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue38373] List overallocation strategy

2019-10-04 Thread Tim Peters


Tim Peters  added the comment:

Don't know.  Define "the problem" ;-)  As soon as the allocation is over 512 
bytes (64 pointers), it's punted to the system malloc family.  Before then, do 
a relative handful of relatively small memcpy's really matter?

pymalloc is faster than system mallocs, and probably uses space more 
efficiently.  There are no pure wins here :-)

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue38373] List overallocation strategy

2019-10-04 Thread Raymond Hettinger


Raymond Hettinger  added the comment:

> WRT pymalloc, it will always copy on growing resize in this context

That sounds like an excellent reason NOT to use pymalloc ;-)

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue38373] List overallocation strategy

2019-10-04 Thread Tim Peters


Tim Peters  added the comment:

WRT pymalloc, it will always copy on growing resize in this context.  A 
pymalloc pool is dedicated to blocks of the same size class, so if the size 
class increases (they're 16 bytes apart now), the data must be copied to a 
different pool (dedicated to blocks of the larger size class).

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue38373] List overallocation strategy

2019-10-04 Thread Serhiy Storchaka


Serhiy Storchaka  added the comment:

Here is some data. step is the number of items added at a time, the first row 
is the size, the second row is the currently allocated size, the third row is 
the proposed allocated size.

for step in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10):
sizes0 = range(step, step*31, step)
sizes1 = []
sizes2 = []
old_size = allocated1 = allocated2 = 0
for size in sizes0:
if size > allocated1:
allocated1 = size + (size >> 3) + (3 if size < 9 else 6)
sizes1.append(allocated1)
if size > allocated2:
allocated2 = (size + (size >> 3) + 6) & ~3
if size - old_size > allocated2 - size:
allocated2 = (size + 3) & ~3
sizes2.append(allocated2)
old_size = size
print('\nstep =', step)
print(' '.join(f'{x:3}' for x in sizes0))
print(' '.join(f'{x:3}' for x in sizes1))
print(' '.join(f'{x:3}' for x in sizes2))

step = 1
  1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18  19  20 
 21  22  23  24  25  26  27  28  29  30
  4   4   4   4   8   8   8   8  16  16  16  16  16  16  16  16  25  25  25  25 
 25  25  25  25  25  35  35  35  35  35
  4   4   4   4   8   8   8   8  16  16  16  16  16  16  16  16  24  24  24  24 
 24  24  24  24  32  32  32  32  32  32

step = 2
  2   4   6   8  10  12  14  16  18  20  22  24  26  28  30  32  34  36  38  40 
 42  44  46  48  50  52  54  56  58  60
  5   5   9   9  17  17  17  17  26  26  26  26  26  37  37  37  37  37  48  48 
 48  48  48  48  62  62  62  62  62  62
  8   8   8   8  16  16  16  16  24  24  24  24  32  32  32  32  44  44  44  44 
 44  44  56  56  56  56  56  56  68  68

step = 3
  3   6   9  12  15  18  21  24  27  30  33  36  39  42  45  48  51  54  57  60 
 63  66  69  72  75  78  81  84  87  90
  6   6  16  16  16  26  26  26  36  36  36  36  49  49  49  49  63  63  63  63 
 63  80  80  80  80  80  97  97  97  97
  8   8  16  16  16  24  24  24  36  36  36  36  48  48  48  48  60  60  60  60 
 76  76  76  76  76  92  92  92  92  92

step = 4
  4   8  12  16  20  24  28  32  36  40  44  48  52  56  60  64  68  72  76  80 
 84  88  92  96 100 104 108 112 116 120
  7  12  12  24  24  24  37  37  37  51  51  51  64  64  64  64  82  82  82  82 
100 100 100 100 100 123 123 123 123 123
  8   8  16  16  28  28  28  40  40  40  52  52  52  68  68  68  68  84  84  84 
 84 104 104 104 104 104 124 124 124 124

step = 5
  5  10  15  20  25  30  35  40  45  50  55  60  65  70  75  80  85  90  95 100 
105 110 115 120 125 130 135 140 145 150
  8  17  17  28  28  39  39  51  51  51  67  67  67  84  84  84 101 101 101 101 
124 124 124 124 146 146 146 146 146 174
  8  16  16  28  28  36  36  48  48  60  60  60  76  76  76  96  96  96  96 116 
116 116 116 140 140 140 140 140 168 168

step = 6
  6  12  18  24  30  36  42  48  54  60  66  72  78  84  90  96 102 108 114 120 
126 132 138 144 150 156 162 168 174 180
  9  19  19  33  33  46  46  60  60  60  80  80  80 100 100 100 120 120 120 120 
147 147 147 147 174 174 174 174 174 208
 12  12  24  24  36  36  52  52  64  64  80  80  80 100 100 100 120 120 120 120 
144 144 144 144 172 172 172 172 200 200

step = 7
  7  14  21  28  35  42  49  56  63  70  77  84  91  98 105 112 119 126 133 140 
147 154 161 168 175 182 189 196 203 210
 10  21  21  37  37  53  53  69  69  84  84  84 108 108 108 132 132 132 155 155 
155 155 187 187 187 187 218 218 218 218
  8  16  28  28  44  44  60  60  76  76  92  92  92 116 116 116 136 136 136 160 
160 160 184 184 184 184 216 216 216 216

step = 8
  8  16  24  32  40  48  56  64  72  80  88  96 104 112 120 128 136 144 152 160 
168 176 184 192 200 208 216 224 232 240
 12  24  24  42  42  60  60  78  78  96  96  96 123 123 123 150 150 150 177 177 
177 177 213 213 213 213 249 249 249 249
  8  24  24  40  40  60  60  76  76  96  96  96 120 120 120 148 148 148 176 176 
176 176 212 212 212 212 248 248 248 248

step = 9
  9  18  27  36  45  54  63  72  81  90  99 108 117 126 135 144 153 162 171 180 
189 198 207 216 225 234 243 252 261 270
 16  26  36  36  56  56  76  76  97  97 117 117 117 147 147 147 178 178 178 208 
208 208 208 249 249 249 249 289 289 289
 12  20  36  36  56  56  76  76  96  96 116 116 136 136 136 168 168 168 196 196 
196 228 228 228 228 268 268 268 268 308

step = 10
 10  20  30  40  50  60  70  80  90 100 110 120 130 140 150 160 170 180 190 200 
210 220 230 240 250 260 270 280 290 300
 17  28  39  51  51  73  73  96  96 118 118 141 141 141 174 174 174 208 208 208 
242 242 242 242 287 287 287 287 332 332
 12  20  32  40  60  60  84  84 104 104 128 128 152 152 152 184 184 184 216 216 
216 252 252 252 252 296 296 296 296 340

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 

[issue38373] List overallocation strategy

2019-10-04 Thread Raymond Hettinger


Raymond Hettinger  added the comment:

We should definitely revisit the over-allocation strategy.  I last worked on 
the existing strategy back in 2004.  Since then, the weighting of the 
speed/space trade-off considerations have changed.   

We need to keep the amortized O(1) append() behavior, but possibly we would 
benefit from more over-allocation and fewer resizes.  Also, we should look at 
the interaction with the small object allocator to see if its design is causing 
realloc() to frequently have to move data rather than extending in place.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue38373] List overallocation strategy

2019-10-04 Thread Brandt Bucher


Change by Brandt Bucher :


--
nosy: +brandtbucher

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue38373] List overallocation strategy

2019-10-04 Thread Serhiy Storchaka


New submission from Serhiy Storchaka :

Currently list uses the following formula for allocating an array for items (if 
size != 0):

allocated = size + (size >> 3) + (3 if size < 9 else 6)

If add items by 1 the growth pattern is:  0, 4, 8, 16, 25, 35, 46, 58, 72, 88, 
...

I think this strategy can be improved.

1. There is no much sense in allocating the space for 25 items. The standard 
Python allocator returns addresses aligned to 16 bytes on 64-bit platform. It 
will allocate a space for 26 items. It is more efficient if the starting 
address is a multiple of some power of two, and perhaps larger than the pointer 
size.

So it may be better to allocate a multiple of two or four items. Few first 
sizes are multiple of four, so this is good multiplier. I suggest to use the 
following expression:

allocated = (size + (size >> 3) + 6) & ~3

It adds bits AND, but removes conditional value.

2. It is common enough case if the list is created from a sequence of a known 
size and do not adds items anymore. Or if it is created by concatenating of few 
sequences. In such cases the list can overallocate a space which will never be 
used. For example:

>>> import sys
>>> sys.getsizeof([1, 2, 3])
80
>>> sys.getsizeof([*(1, 2, 3)])
104

List display allocates a space for exactly 3 items. But a list created from a 
tuple allocates a space for 6 items.

Other example. Let extend a list by 10 items at a time.

size allocated
 10 17
 20 28
 30 39
 40 51
 50 51
 60 73
 70 73
 80 96
 90 96
100118

We need to reallocate an array after first four steps. 17 is too small for 20, 
28 is too small for 30, 39 is too small for 40. So overallocating does not 
help, it just spends a space.

My idea, that if we adds several items at a time and need to reallocate an 
array, we check if the overallocated size is enough for adding the same amount 
of items next time. If it is not enough, we do not overallocate.

The final algorithm is:

allocated = (size + (size >> 3) + 6) & ~3
if size - old_size > allocated - size:
allocated = (size + 3) & ~3

It will save a space if extend a list by many items few times.

--
components: Interpreter Core
messages: 353976
nosy: rhettinger, serhiy.storchaka, tim.peters
priority: normal
severity: normal
status: open
title: List overallocation strategy
type: resource usage
versions: Python 3.9

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com