Glenn Linderman wrote:

For example, Python presently has a rather stupid algorithm for string concatenation.

Python the language has syntax and semantics. Python implementations have algorithms that fulfill the defined semantics.

It allocates only the exactly necessary space for the concatenated string. This is a brilliant move, when you realize that strings are immutable, and once allocated can never change, but the operation

for line in mylistofstrings:
   string = string + line

is basically O(N-squared) as a result. The better algorithm would double the size of memory allocated for string each time there is not enough room to add the next line, and that reduces the cost of the algorithm to O(N).

If there is more than one reference to a guaranteed immutable object, such as a string, the 'stupid' algorithm seem necessary to me. In-place modification of a shared immutable would violate semantics.

However, if you do

string = ''
for line in strings:
  string =+ line

so that there is only one reference and you tell the interpreter that you don't mind the old value being updated, then I believe in 2.6, if not before, CPython does overallocation and in-place extension. (I am not sure about s=s+l.) But this is just ref-counted CPython.

Terry Jan Reedy

--
http://mail.python.org/mailman/listinfo/python-list

Reply via email to