My local news feed seems to have lost the early part of this thread, so 
I'm afraid I don't know who I'm quoting here:

> My understanding is that appending to a list and then joining
> this list when done is the fastest technique for string
> concatenation. Is this true?
> 
> The 3 string concatenation techniques I can think of are:
> 
> - append to list, join
> - string 'addition' (s = s + char)
> - cStringIO

There is a fourth technique, and that is to avoid the concatenation in 
the first place.  Start from the common append/join pattern mentioned 
above:

vector = []
for whatever in source:    # 'source' is a stand-in for wherever the pieces come from
    vector.append(whatever)
my_string = ''.join(vector)

But, it sometimes (often?) turns out that you don't really need 
my_string.  It may just be a convenient way to pass the data on to the 
next processing step.  If you can arrange your code so the next step can 
take the vector directly, you can avoid creating my_string at all.
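
As a rough sketch of what "take the vector directly" might look like: 
file-like objects already accept a list of strings through writelines(), 
so the join can simply be skipped.  The names collect_pieces and 
write_pieces are made up for illustration, and io.StringIO stands in for 
a real file:

```python
import io


def collect_pieces(source):
    """Gather string pieces in a list, as in the append pattern."""
    vector = []
    for piece in source:
        vector.append(piece)
    return vector


def write_pieces(out, vector):
    """Hand the list straight to the consumer; no ''.join() is needed."""
    out.writelines(vector)


vector = collect_pieces(["spam", " and ", "eggs"])
buf = io.StringIO()          # stand-in for an open text file
write_pieces(buf, vector)
result = buf.getvalue()
```

The output is the same as writing ''.join(vector), but the combined 
string is never built in your own code.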

For example, if all you're going to do is write the string out to a file 
or network socket, you could use vectored I/O, with something like 
python-writev (http://pypi.python.org/pypi/python-writev/1.1).  If 
you're going to iterate over the string character by character, you 
could write an iterator which does that without the intermediate copy.  
Something along the lines of:

    def each(self):
        for s in self.vector:
            for c in s:
                yield c

Depending on the amount of data you're dealing with, this could be a 
significant improvement over doing the join().
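
To make that fragment something you can actually run, here is a minimal 
sketch that wraps it in a class.  The class name Pieces and the add() 
method are my own invention, only each() comes from the snippet above. 
The each_chained() variant does the same job using 
itertools.chain.from_iterable from the standard library:

```python
from itertools import chain


class Pieces:
    """Hypothetical holder for string pieces that are never joined."""

    def __init__(self):
        self.vector = []

    def add(self, s):
        self.vector.append(s)

    def each(self):
        # Yield one character at a time, piece by piece, without
        # building the concatenated string first.
        for s in self.vector:
            for c in s:
                yield c

    def each_chained(self):
        # Same result, using the standard library to flatten the pieces.
        return chain.from_iterable(self.vector)


pieces = Pieces()
pieces.add("ab")
pieces.add("cd")
chars = list(pieces.each())
```

Both versions are lazy, so nothing beyond the original pieces is held 
in memory while you iterate.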
-- 
http://mail.python.org/mailman/listinfo/python-list
