On Sun, 14 Apr 2013 17:44:28 -0700, Mark Janssen wrote: > On Sun, Apr 14, 2013 at 5:29 PM, Steven D'Aprano > <steve+comp.lang.pyt...@pearwood.info> wrote: >> On Sun, 14 Apr 2013 12:06:12 -0700, Mark Janssen wrote: >> >>> cleaned='' >>> for c in myStringNumber: >>> if c != ',': >>> cleaned+=c >>> int(cleaned) >> >> ....due to being an O(N**2) algorithm. > > What on earth makes you think that is an O(n**2) algorithm and not O(n)?
Strings are immutable. Consider building up a single string from four substrings: s = '' s += 'fe' s += 'fi' s += 'fo' s += 'fum' Python *might* optimize the first concatenation, '' + 'fe', to just reuse 'fe', (but it might not). Let's assume it does, so that no copying is needed. Then it gets to the second concatenation, and now it has to copy characters, because strings are immutable and cannot be modified in place. Showing the *running* total of characters copied: 'fe' + 'fi' => 'fefi' # four characters copied 'fefi' + 'fo' => 'fefifo' # 4 + 6 = ten characters copied 'fefifo' + 'fum' => 'fefifofum' # 10 + 9 = nineteen characters copied Notice how each intermediate substring gets copied repeatedly? In order to build up a string of length 9, we've had to copy at least 19 characters. With only four substrings, it's not terribly obvious how badly this performs. So let's add some more substrings, and see how the running total increases: 'fefifofum' + 'foo' => 'fefifofumfoo' # 19 + 12 = 31 'fefifofumfoo' + 'bar' => 'fefifofumfoobar' # 31 + 15 = 46 'fefifofumfoobar' + 'baz' => 'fefifofumfoobarbaz' # 46 + 18 = 64 'fefifofumfoobarbaz' + 'spam' => 'fefifofumfoobarbazspam' # 64 + 22 = 86 To build up a string of length 22, we've had to copy, and re-copy, and re- re-copy, 86 characters in total. And the string gets bigger, the inefficiency gets worse. Each substring (except the very last one) gets copied multiple times; the number of times it gets copied is proportional to the number of substrings. If the substrings are individual characters, then each character is copied a number of times proportional to the number of characters N; since there are N characters, each being copied (proportional to) N times, that makes N*N or N**2. -- Steven -- http://mail.python.org/mailman/listinfo/python-list