On Thu, May 30, 2013 at 3:16 PM, Steven D'Aprano <st...@pearwood.info> wrote: > > Sure enough, ''.join(list-of-substrings) is measurably faster than > ''.join(iterator-of-substrings).
A tuple or list is used directly. Otherwise join() has to create an iterator and build a new list. This isn't directly related to the discussion on string concatenation. But in relation to the offshoot discussion on lists, I've put together some examples that demonstrate how different was of creating the same list lead to different allocated sizes. First, here's a function to get the allocated length of a list's item array: from ctypes import sizeof, c_void_p, c_ssize_t alloc_offset = sizeof(c_void_p * 2) + sizeof(c_ssize_t * 2) def allocated(alist): addr = id(alist) alloc = c_ssize_t.from_address(addr + alloc_offset) return alloc.value It uses ctypes to peek into the object, but you can also use a list object's __sizeof__() method to calculate the result. First get the array's size in bytes by subtracting the size of an empty list from the size of the list. Then divide by the number of bytes in a pointer: import struct pointer_size = struct.calcsize('P') empty_size = [].__sizeof__() def allocated(alist): size_bytes = alist.__sizeof__() - empty_size return size_bytes // pointer_size Example 0: >>> allocated([0,1,2,3,4,5,6,7,8,9,10,11]) 12 In this case the constants are pushed on the stack, and the interpreter evaluates BUILD_LIST(12), which in CPython calls PyList_New(12). The same applies to using built-in range() in 2.x (not 3.x): >>> allocated(range(12)) 12 Example 1: >>> allocated([i for i in xrange(12)]) 16 This starts at 0 and grows as follows: 1 + 1//8 + 3 = 4 5 + 5//8 + 3 = 8 9 + 9//8 + 6 = 16 Example 2: >>> allocated(list(xrange(12))) 19 This also applies to range() in 3.x. Some iterators have a "__length_hint__" method for guessing the initial size: >>> iter(xrange(12)).__length_hint__() 12 The guess is immediately resized as follows: 12 + 12//8 + 6 = 19 Example 3: >>> allocated(list(i for i in xrange(12))) 12 The initializer here is a generator, compiled from the generator expression. A generator doesn't have a length hint. Instead, the list uses a default guess of 8, which is over-allocated as follows: 8 + 8//8 + 3 = 12 If the generator continues to a 13th item, the list resizes to 13 + 13//8 + 6 = 20: >>> allocated(list(i for i in xrange(13))) 20 _______________________________________________ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor