On Tue, 19 Jan 2010 16:20:42 -0500, Gerald Britton wrote: > That's surprising. I wouldn't implement it that way at all. I'd use a > dynamically-expanding buffer as I suggested. That way you get a single > pass and don't have to calculate anything before you begin. In the best > case, you'd use half the memory (the result just fits in the buffer > after its last expansion and no saved tuple). In the worst case, the > memory use is about the same (you just expanded the buffer using a 2x > expansion rule then you hit the last item).
In the worst case, you waste 50% of the memory allocated. And because strings are immutable (unlike lists and dicts, which also use this approach), you can never use that memory until the string is garbage collected. In the current approach, join produces a temporary sequence, but it doesn't last very long. With your suggested approach, you could end up with a large number of long-lasting strings up to twice the size necessary. Since join is particularly useful for building large strings, this could be a significant memory pessimation. The obvious fix is for join to shrink the buffer once it has finished building the string, but the success of that may be operating system dependent. I don't know -- it sounds like a recipe for memory fragmentation to me. And even if it can be done, reclaiming the memory will take time, potentially more time than the current approach. Still, it's probably worth trying, and seeing if it speeds join up. -- Steven -- http://mail.python.org/mailman/listinfo/python-list