On Mon, 04 Jan 2010 19:35:02 -0300, wiso <gtu2...@alice.it> wrote:

I'm trying the fileinput module, and I like it, but I don't understand why
it's so slow... look:

from time import time
from fileinput import FileInput

file = ['r1_200907.log', 'r1_200908.log', 'r1_200909.log', 'r1_200910.log',
'r1_200911.log']

def f1():
  n = 0
  for f in file:
    print "new file: %s" % f
    ff = open(f)
    for line in ff:
      n += 1
    ff.close()
  return n

def f2():
  f = FileInput(file)
  for line in f:
    if f.isfirstline(): print "new file: %s" % f.filename()
  return f.lineno()

def f3(): # f2 simpler
  f = FileInput(file)
  for line in f:
    pass
  return f.lineno()

Yes, the fileinput module is A LOT slower than normal file processing.
You may use itertools.chain instead:

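import itertools

def f4():
  # chain the open file objects into one line iterator;
  # each file is opened lazily, when the previous one is exhausted
  f = itertools.chain.from_iterable(open(fn) for fn in file)
  n = 0
  for line in f:
    n += 1
  return n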

With f4() I get timings similar to f1() above.
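
If you want to reproduce the comparison yourself, something like the following should do. It is only a sketch: the sample files are throwaway stand-ins created with tempfile (not the r1_*.log files), the names make_sample_files and count_* are made up for illustration, and the numbers will depend on your machine and Python version.

```python
import fileinput
import itertools
import os
import tempfile
import timeit

def make_sample_files(count=3, lines_per_file=1000):
    """Create throwaway log files (hypothetical stand-ins for r1_*.log)."""
    directory = tempfile.mkdtemp()
    names = []
    for i in range(count):
        name = os.path.join(directory, "sample_%d.log" % i)
        with open(name, "w") as out:
            for j in range(lines_per_file):
                out.write("line %d\n" % j)
        names.append(name)
    return names

def count_plain(names):
    # same idea as f1(): open each file in turn and iterate over it
    n = 0
    for name in names:
        with open(name) as ff:
            for line in ff:
                n += 1
    return n

def count_fileinput(names):
    # same idea as f3(): let FileInput walk through all the files
    f = fileinput.FileInput(names)
    for line in f:
        pass
    n = f.lineno()  # read the counter before close() resets it
    f.close()
    return n

def count_chain(names):
    # same idea as f4(): chain the file objects into one iterator
    f = itertools.chain.from_iterable(open(name) for name in names)
    n = 0
    for line in f:
        n += 1
    return n

names = make_sample_files()
for func in (count_plain, count_fileinput, count_chain):
    elapsed = timeit.timeit(lambda: func(names), number=10)
    print("%s: %.4f s" % (func.__name__, elapsed))
```

All three functions should return the same count; only the elapsed times differ.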

Known major issues of this "poor man's" implementation:

- no lineno/filelineno/isfirstline methods
- close() is implicit (each file is closed only when its file object is
  finalized)
- only for reading; inplace and backup don't work
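
The first two issues can be worked around with a small generator, if you need them. This is only a sketch, not a replacement for fileinput: read_lines is a made-up name, the demo files are throwaway stand-ins created with tempfile, and inplace/backup are still not covered.

```python
import os
import tempfile

def read_lines(filenames):
    """Yield (filename, filelineno, lineno, line) for each line, in order.

    The with block closes each file as soon as it is exhausted.
    """
    lineno = 0
    for filename in filenames:
        with open(filename) as ff:
            for filelineno, line in enumerate(ff, 1):
                lineno += 1
                yield filename, filelineno, lineno, line

# Demo on two throwaway files (hypothetical stand-ins for the r1_*.log files).
directory = tempfile.mkdtemp()
demo_files = []
for i, text in enumerate(["a\nb\n", "c\n"]):
    name = os.path.join(directory, "demo_%d.log" % i)
    with open(name, "w") as out:
        out.write(text)
    demo_files.append(name)

for filename, filelineno, lineno, line in read_lines(demo_files):
    if filelineno == 1:  # plays the role of isfirstline()
        print("new file: %s" % filename)
```

Here filelineno == 1 stands in for isfirstline(), and the last lineno seen is the total line count. I have not timed this variant; the extra tuple per line will cost something compared to f4().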

--
Gabriel Genellina

--
http://mail.python.org/mailman/listinfo/python-list
