On Mon, 04 Jan 2010 23:35:02 +0100, wiso wrote:

> I'm trying the fileinput module, and I like it, but I don't understand
> why it's so slow...

Because it is written for convenience, not speed. From the source code:

"Performance: this module is unfortunately one of the slower ways of
processing large numbers of input lines."

> look:
>
> from time import time
> from fileinput import FileInput
>
> file = ['r1_200907.log', 'r1_200908.log', 'r1_200909.log',
>         'r1_200910.log', 'r1_200911.log']
>
> def f1():
>     n = 0
>     for f in file:
>         print "new file: %s" % f
>         ff = open(f)
>         for line in ff:
>             n += 1
>         ff.close()
>     return n
>
> def f2():
>     f = FileInput(file)
>     for line in f:
>         if f.isfirstline(): print "new file: %s" % f.filename()
>     return f.lineno()
>
> def f3(): # f2 simpler
>     f = FileInput(file)
>     for line in f:
>         pass
>     return f.lineno()
>
>
> t = time(); f1(); print time()-t # 1.0
> t = time(); f2(); print time()-t # 7.0 !!!
> t = time(); f3(); print time()-t # 5.5
>
> I'm using text files, there are 2563150 lines in total.

The extra second and a half in f2() is probably due to the time it takes
to call f.isfirstline() 2563150 times.

-- 
Steven
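
P.S. If the goal is to know which file each line came from without paying for a method call on every line, a plain generator may be enough. The following is only a minimal sketch, not how fileinput is implemented: `iter_lines` and `count_lines` are hypothetical names made up for illustration, and it has not been timed against the numbers above. Note that an empty file yields no lines, so it is never reported as "started".

```python
def iter_lines(paths):
    """Yield (path, line_number_within_file, line) for every line of every file."""
    for path in paths:
        with open(path) as handle:
            # enumerate() counts from 1, so number == 1 marks the first line
            # of a file, much like FileInput.isfirstline().
            for number, line in enumerate(handle, 1):
                yield path, number, line


def count_lines(paths):
    """Return (total line count, list of files that produced at least one line)."""
    total = 0
    started = []
    for path, number, line in iter_lines(paths):
        if number == 1:
            started.append(path)
        total += 1
    return total, started
```

Calling `count_lines(file)` with the list from the quoted script would return the overall line count plus the names of the non-empty files in the order they were read.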