Re: [Tutor] Iterate Suggestion

Steven D'Aprano Sat, 14 Apr 2012 19:42:51 -0700

Bod Soutar wrote:

How about something like this


mylist =  ['serverA', 'serverB', 'serverC', 'serverD','serverE', 'serverF',
'serverG']
tempstr = ""
count = 0

for item in mylist:
    count += 1
    if count == 3:
        tempstr += (item + "\n")
        count = 0
    else:
        tempstr += (item + " ")

print tempstr

Warning: this is a good way to write HORRIBLY slow code that can take many minutes or even hours to generate output. Even worse, the slowdown occurs inconsistently, making it really hard to debug.

The right way to join many strings into one is with the join method: accumulate the substrings into a list, and then join them in one go:


' '.join(list_of_words)
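
As a rough sketch, the loop above could be rewritten along these lines. The function name format_in_rows and its per_row parameter are made up for illustration and are not from the original post. Note that this version leaves no trailing space or newline, so its output differs slightly from the original loop's.

```python
def format_in_rows(items, per_row=3):
    """Join items with spaces, starting a new line after every per_row items.

    Hypothetical helper, for illustration only.
    """
    lines = []
    for start in range(0, len(items), per_row):
        # Slice out one row's worth of items and join it in a single step.
        lines.append(' '.join(items[start:start + per_row]))
    # Join the finished rows once, instead of growing a string piece by piece.
    return '\n'.join(lines)


mylist = ['serverA', 'serverB', 'serverC', 'serverD', 'serverE', 'serverF',
          'serverG']
print(format_in_rows(mylist))
```

Each row is built with one join, and the rows are combined with one more, so no string is ever extended a piece at a time.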

The problem with your code is that you are doing repeated string concatenation, which is slow. If you understand Big Oh notation, repeated string concatenation is O(n**2), which means that (roughly speaking) if you increase the amount of data by a factor of a hundred, the time taken will increase by a factor of ten thousand.
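
To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes the worst case, where every concatenation copies the whole string built so far; chars_copied is a name invented for this illustration, and it counts copied characters rather than measuring real time.

```python
def chars_copied(n):
    """Characters copied when building an n-character string one character
    at a time, assuming each concatenation writes out the entire new string."""
    total = 0
    for length in range(1, n + 1):
        total += length  # the new string of this length is written in full
    return total  # same value as n * (n + 1) // 2


small = chars_copied(1000)
large = chars_copied(100000)  # 100 times as much data
print(large / float(small))   # ratio of work done: close to ten thousand
```

Under that assumption, a hundred times more data means close to ten thousand times more copying, which is where the quadratic behaviour comes from.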


You can read more about why this happens here:

http://www.joelonsoftware.com/articles/fog0000000319.html

CPython (the implementation you are using) has a clever optimization that *sometimes* can speed up this situation, which is why you may never have noticed how slow it gets. But other implementations such as Jython and IronPython do not, and so your code will be pathologically slow on these implementations.

Worse, the clever optimization is easily defeated. On some operating systems or memory schemes, it can fail and become horribly slow -- and debugging it is a real pain because others will report no slowdown.

A few years ago, a similar situation was reported in the urllib or urllib2 module in the standard library. Thanks to the clever optimization, most people never noticed, but one user reported that Python was taking twenty or thirty minutes to download a file that Internet Explorer and wget would download in five or ten seconds. At first nobody believed him, because they couldn't replicate the bug. Then they thought it was a network issue. Eventually this fellow persevered and tracked the bug down to repeated string concatenation in the standard library. The inventor of Python, Guido van Rossum, described it as "embarrassing".


You can search the Python-Dev mailing list archives for this.

Here's an example of how slow repeated string concatenation can be, with and without the clever optimization:



py> from timeit import Timer
py> t = Timer('for i in range(500): s = s + "x"', 's = ""')
py> t.timeit(300)  # repeat the test 300 times
0.038927078247070312

That's not too bad: about 0.04 seconds to do 150 thousand string concatenations. But see what happens when I defeat the optimizer with a small change to the code:



py> t = Timer('for i in range(500): s = "x" + s', 's = ""')
py> t.timeit(300)
5.8992829322814941

That's 152 times slower.
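
For comparison, a sketch along these lines could time the join approach against the same number of pieces. The helper names and the value of n are invented here, and no timings are quoted because they vary with the machine and the Python implementation.

```python
from timeit import Timer


def build_by_prepending(n):
    """Build an n-character string by repeated concatenation."""
    s = ""
    for _ in range(n):
        s = "x" + s  # the pattern that defeated the optimization above
    return s


def build_by_join(n):
    """Build the same string by collecting pieces and joining them once."""
    pieces = []
    for _ in range(n):
        pieces.append("x")
    return "".join(pieces)


n = 10000  # arbitrary size chosen for this illustration
for func in (build_by_prepending, build_by_join):
    t = Timer(lambda: func(n))
    print("%s: %.4f seconds" % (func.__name__, t.timeit(5)))
```

Both functions return the same string, so any difference in the reported times comes from how the string is assembled.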



--
Steven

