On 2007-06-14, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> Hi all,
>
> I am running Python 2.5 on Feisty Ubuntu. I came across some code that
> is substantially slower when in a method than in a function.
>
> ################# START SOURCE #############
> # The function
>
> def readgenome(filehandle):
>     s = ''
>     for line in filehandle.xreadlines():
>         if '>' in line:
>             continue
>         s += line.strip()
>     return s
>
> # The method in a class
> class bar:
>     def readgenome(self, filehandle):
>         self.s = ''
>         for line in filehandle.xreadlines():
>             if '>' in line:
>                 continue
>             self.s += line.strip()
>
> ################# END SOURCE ##############
> When running the function and the method on a 20,000 line text file, I
> get the following:
>
>>>> cProfile.run("bar.readgenome(open('cb_foo'))")
>          20004 function calls in 10.214 CPU seconds
>
>    Ordered by: standard name
>
>    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
>         1    0.000    0.000   10.214   10.214 <string>:1(<module>)
>         1   10.205   10.205   10.214   10.214 reader.py:11(readgenome)
>         1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
>     19999    0.009    0.000    0.009    0.000 {method 'strip' of 'str' objects}
>         1    0.000    0.000    0.000    0.000 {method 'xreadlines' of 'file' objects}
>         1    0.000    0.000    0.000    0.000 {open}
>
>
>>>> cProfile.run("z=r.readgenome(open('cb_foo'))")
>          20004 function calls in 0.041 CPU seconds
>
>    Ordered by: standard name
>
>    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
>         1    0.000    0.000    0.041    0.041 <string>:1(<module>)
>         1    0.035    0.035    0.041    0.041 reader.py:2(readgenome)
>         1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
>     19999    0.007    0.000    0.007    0.000 {method 'strip' of 'str' objects}
>         1    0.000    0.000    0.000    0.000 {method 'xreadlines' of 'file' objects}
>         1    0.000    0.000    0.000    0.000 {open}
>
>
> The method takes > 10 seconds, the function call 0.041 seconds!
>
> Yes, I know that I wrote the underlying code rather
> inefficiently, and I can streamline it with a single
> file.read() call instead of an xreadlines() + strip loop.
> Still, the differences in performance are rather staggering!
> Any comments?

It is likely the repeated self.s += that's slowing it down in
comparison to the non-method version. The attribute lookup itself
costs a little, but the bigger factor is probably that CPython can
extend a string in place only when nothing else refers to it. That
holds for a local variable, but not for self.s, which the instance
also references, so each += copies the whole string built so far.

Try the following simple optimization, using a local variable
instead of an attribute to build up the result.

# The method in a class
class bar:
    def readgenome(self, filehandle):
        s = ''
        for line in filehandle.xreadlines():
            if '>' in line:
                continue
            s += line.strip()
        self.s = s

To further speed things up, think about using the str.join idiom
instead of repeated +=, and using a generator expression instead
of an explicit loop.

# The method in a class
class bar:
    def readgenome(self, filehandle):
        # Skip '>' header lines, as the original loop does.
        self.s = ''.join(line.strip() for line in filehandle
                         if '>' not in line)

-- 
Neil Cerutti
-- 
http://mail.python.org/mailman/listinfo/python-list
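
The three variants discussed above (accumulating on the attribute, accumulating in a local, and str.join) can be compared side by side with a small timing sketch. Everything here is an assumption made for illustration: it is written for Python 3 rather than the 2.5 used in the thread, the input is synthetic FASTA-like text held in an io.StringIO instead of the cb_foo file, and the names make_fasta, AttrReader, LocalReader, JoinReader and run are hypothetical. Absolute timings will differ by interpreter version and machine, so only the relative ordering is of interest.

```python
import io
import timeit


def make_fasta(n_lines=2000, width=60):
    """Build synthetic FASTA-like text with a '>' header every 100 lines."""
    rows = []
    for i in range(n_lines):
        if i % 100 == 0:
            rows.append(">record %d" % i)
        else:
            rows.append("ACGT" * (width // 4))
    return "\n".join(rows) + "\n"


class AttrReader:
    """Accumulates directly on the attribute, as in the original method."""

    def readgenome(self, filehandle):
        self.s = ''
        for line in filehandle:
            if '>' in line:
                continue
            self.s += line.strip()


class LocalReader:
    """Accumulates in a local variable and assigns the attribute once."""

    def readgenome(self, filehandle):
        s = ''
        for line in filehandle:
            if '>' in line:
                continue
            s += line.strip()
        self.s = s


class JoinReader:
    """Builds the result with str.join over a filtering generator."""

    def readgenome(self, filehandle):
        self.s = ''.join(line.strip() for line in filehandle
                         if '>' not in line)


def run(reader_cls, text):
    """Run one reader over an in-memory file and return the joined string."""
    reader = reader_cls()
    reader.readgenome(io.StringIO(text))
    return reader.s


if __name__ == "__main__":
    text = make_fasta()
    for cls in (AttrReader, LocalReader, JoinReader):
        seconds = timeit.timeit(lambda: run(cls, text), number=5)
        print("%-12s %.4f s" % (cls.__name__, seconds))
```

All three readers should produce the same string, so the comparison is only about how the string gets built. Growing the input (for example make_fasta(20000)) should make any quadratic behaviour in the attribute version more visible.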