Alexandre Ferrieux <[EMAIL PROTECTED]> wrote:

> On Jul 23, 10:33 am, Duncan Booth <[EMAIL PROTECTED]>
> wrote:
>>
>> The extra buffering means that iterating over a file is about 3 times
>> faster than repeatedly calling readline.
>>
>>     while 1:
>>         line = f.readline()
>>         if not line:
>>             break
>>
>>     for line in f:
>>         pass
>>
>
> Surely you'll notice that the comparison is spoilt by the fact that
> the readline version needs an interpreted test each turn around.
> A more interesting test would be the C-implemented iterator, just
> calling fgets() (the thin layer policy) without extra 8k-blocking.
>

No, I believe the comparison is perfectly fair. You need the extra test for
the readline version whatever you do, and you don't need it for the
iterator.
If you insist, you can add an identical 'if not line: break' into the
iterator version as well: it adds another 10% onto the iterator runtime,
which is still nearly a factor of 3 faster than the readline version, but
then you aren't comparing equivalent code.

Alternatively you can knock a chunk off the time for the readline loop by
writing it as:

    while f.readline(): pass

or even:

    read = f.readline
    while read(): pass

which gets it down from 10.3 to 9.0 seconds. It's 'fair' in your book since
it avoids all the extra interpreter overhead of attribute lookup and a
separate test, but it does make it a touch hard to do anything useful with
the actual data.

Whatever, the iterator makes the code both cleaner and faster. It is at the
expense of not being suitable for interactive sessions, or in some cases
pipes, but for those situations you can continue to use readline and the
extra overhead in runtime is unlikely to be noticeable.
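
For anyone who wants to repeat the comparison, here is a minimal sketch of
the two loops as a timing script. It is not the script behind the figures
quoted above: the helper names (make_test_file, count_with_readline,
count_with_iterator, time_call), the file contents and the line count are
all made up for illustration, and it is written for Python 3, where file
buffering works differently, so the ratio you measure may well differ from
the factor of 3 reported in this thread.

```python
import os
import tempfile
import time


def make_test_file(num_lines=200_000):
    """Write a temporary text file of short lines and return its path.

    The size and contents are arbitrary choices for this sketch.
    """
    fd, path = tempfile.mkstemp(suffix=".txt", text=True)
    with os.fdopen(fd, "w") as f:
        for i in range(num_lines):
            f.write("line %d\n" % i)
    return path


def count_with_readline(path):
    """Count lines by calling readline() until it returns an empty string."""
    n = 0
    with open(path) as f:
        while True:
            line = f.readline()
            if not line:
                break
            n += 1
    return n


def count_with_iterator(path):
    """Count lines by iterating over the file object directly."""
    n = 0
    with open(path) as f:
        for line in f:
            n += 1
    return n


def time_call(func, path):
    """Return (result, elapsed seconds) for a single call of func(path)."""
    start = time.perf_counter()
    result = func(path)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    path = make_test_file()
    try:
        for func in (count_with_readline, count_with_iterator):
            n, elapsed = time_call(func, path)
            print("%-22s %d lines in %.3f s" % (func.__name__, n, elapsed))
    finally:
        os.remove(path)
```

Both functions do the same work per line (incrementing a counter), so any
difference in the timings comes from the loop mechanics rather than from
what is done with the data. A single run is noisy; repeat it a few times
before drawing conclusions.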