On Mon, 04 Jan 2010 19:35:02 -0300, wiso <gtu2...@alice.it> wrote:
I'm trying the fileinput module, and I like it, but I don't understand
why it's so slow... look:

from time import time
from fileinput import FileInput

file = ['r1_200907.log', 'r1_200908.log', 'r1_200909.log',
        'r1_200910.log', 'r1_200911.log']

def f1():
    n = 0
    for f in file:
        print "new file: %s" % f
        ff = open(f)
        for line in ff:
            n += 1
        ff.close()
    return n

def f2():
    f = FileInput(file)
    for line in f:
        if f.isfirstline():
            print "new file: %s" % f.filename()
    return f.lineno()

def f3(): # f2 simpler
    f = FileInput(file)
    for line in f:
        pass
    return f.lineno()

Yes, the fileinput module is A LOT slower than normal file processing.
You may use itertools.chain instead:

import itertools

def f4():
    # Chain the open files lazily; each file is opened only when reached.
    f = itertools.chain.from_iterable(open(fn) for fn in file)
    n = 0
    for line in f:
        n += 1
    return n

With f4() I get timings similar to f1() above.
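
The timing code itself was not posted. Here is a minimal sketch of how the comparison could be reproduced, assuming small temporary files in place of the r1_*.log files. The helper names (make_files, count_plain, count_fileinput, count_chain, best_time) are made up for this example, and the numbers it prints will depend on the machine and the file sizes:

```python
import fileinput
import itertools
import os
import tempfile
import time


def make_files(count=3, lines_per_file=1000):
    """Create small temporary files that stand in for the real logs."""
    paths = []
    for i in range(count):
        fd, path = tempfile.mkstemp(suffix=".log")
        with os.fdopen(fd, "w") as out:
            for j in range(lines_per_file):
                out.write("file %d line %d\n" % (i, j))
        paths.append(path)
    return paths


def count_plain(paths):
    # Same idea as f1(): open each file in turn and count its lines.
    n = 0
    for path in paths:
        with open(path) as ff:
            for line in ff:
                n += 1
    return n


def count_fileinput(paths):
    # Same idea as f3(): let FileInput walk all the files.
    f = fileinput.FileInput(paths)
    for line in f:
        pass
    n = f.lineno()  # cumulative line number of the last line read
    f.close()
    return n


def count_chain(paths):
    # Same idea as f4(): chain the open files together lazily.
    n = 0
    for line in itertools.chain.from_iterable(open(p) for p in paths):
        n += 1
    return n


def best_time(func, paths, repeat=3):
    """Return (result, best wall-clock seconds) over a few runs."""
    best = None
    result = None
    for _ in range(repeat):
        start = time.time()
        result = func(paths)
        elapsed = time.time() - start
        if best is None or elapsed < best:
            best = elapsed
    return result, best


if __name__ == "__main__":
    paths = make_files()
    try:
        for func in (count_plain, count_fileinput, count_chain):
            n, seconds = best_time(func, paths)
            print("%-16s %d lines, %.4f s" % (func.__name__, n, seconds))
    finally:
        for path in paths:
            os.remove(path)
```

Taking the best of a few runs reduces the influence of disk caching on the first pass; the standard timeit module would be another option.
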
Known major issues of this "poor man's" implementation:
- no lineno/filelineno/isfirstline attributes
- close() is implicit
- only for reading; inplace and backup don't work
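
The first two issues can be worked around with a small generator. This is only a sketch, not a drop-in replacement for FileInput: iter_lines and f5 are hypothetical names, and in-place editing is still not covered.

```python
def iter_lines(filenames):
    """Yield (filename, file_lineno, total_lineno, line) for each line.

    Hypothetical helper: it reads each file in turn with plain open(),
    keeps the counters that itertools.chain does not provide, and closes
    every file explicitly once it has been read to the end.
    """
    total = 0
    for filename in filenames:
        ff = open(filename)
        try:
            for file_lineno, line in enumerate(ff, 1):
                total += 1
                yield filename, file_lineno, total, line
        finally:
            ff.close()


def f5(filenames):
    # Equivalent of f2(): report each new file and return the line count.
    n = 0
    for filename, file_lineno, n, line in iter_lines(filenames):
        if file_lineno == 1:  # plays the role of isfirstline()
            print("new file: %s" % filename)
    return n
```

Here file_lineno stands in for filelineno(), the running total for lineno(), and the filename is passed along with every line. I have not timed this variant against f1() or f4().
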
--
Gabriel Genellina