Stephen Thorne wrote:

On Thu, 27 Jan 2005 00:02:45 -0700, Steven Bethard
<[EMAIL PROTECTED]> wrote:

By using the iterator instead of readlines, I read only one line from
the file into memory at once, instead of all of them.  This may or may
not matter depending on the size of your files, but using iterators is
generally more scalable, though of course it's not always possible.
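The three approaches under discussion can be sketched like this, in modern Python 3 spelling (`next(it)` rather than the thread-era `iter.next()`); the demo file and its contents are invented for illustration:

```python
import os, tempfile

# Build a small throwaway file to read back three ways
# (the contents are made up for this sketch).
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "w") as f:
    for i in range(1000):
        f.write("line %d\n" % i)

# Option 1: readlines() -- the whole file becomes a list of strings at once.
with open(path) as f:
    lines = f.readlines()

# Option 2: iterate the file object -- one line in memory at a time.
count_iter = 0
with open(path) as f:
    for line in f:
        count_iter += 1

# Option 3: an explicit iterator (iter.next() in the Python 2 of this
# thread, next(it) today).
count_next = 0
with open(path) as f:
    it = iter(f)
    while True:
        try:
            next(it)
        except StopIteration:
            break
        count_next += 1

os.remove(path)
print(len(lines), count_iter, count_next)  # all three see the same lines
```

All three produce the same lines; the difference is only in how many of them are alive in memory at once.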

I just did a teensy test. All three options used exactly the same amount of total memory.

I would presume that, for a small file, the entire contents of the file will be sucked into the read buffer implemented by the underlying C file library. An iterator will only really save memory consumption when the file size is greater than that buffer's size.
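One way to watch that buffering in action is to count how often the buffered layer actually asks the raw stream for data. This is a toy sketch, not anything from the thread: `CountingRaw` and its data are invented here, standing in for a real file's OS-level reads.

```python
import io

# A raw stream that counts how many times the buffered layer above it
# asks the "OS" for data (a stand-in for a real file's raw layer).
class CountingRaw(io.RawIOBase):
    def __init__(self, data):
        self._inner = io.BytesIO(data)
        self.reads = 0

    def readable(self):
        return True

    def readinto(self, b):
        self.reads += 1
        return self._inner.readinto(b)

data = b"hello world\n" * 10  # ~120 bytes, far below the default buffer
raw = CountingRaw(data)
text = io.TextIOWrapper(
    io.BufferedReader(raw, buffer_size=io.DEFAULT_BUFFER_SIZE))

n_lines = sum(1 for _ in text)  # iterate line by line
print(io.DEFAULT_BUFFER_SIZE, raw.reads)
```

Because the whole "file" fits in the buffer, iterating it line by line still costs only a handful of raw reads; line-at-a-time iteration above the buffer does not mean byte-at-a-time I/O below it.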


Actually, now that I think of it, there's probably another copy of the data at Python level. For readlines(), that copy is the list object itself. For iter and iter.next(), it's in the iterator's read-ahead buffer. So perhaps memory savings will occur when *that* buffer size is exceeded. It's also quite possible that both buffers are the same size...
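A hedged way to check that claim on a file larger than the buffers is `tracemalloc` (Python 3); the file size and line contents below are arbitrary:

```python
import os, tempfile, tracemalloc

# An arbitrary ~3.5 MB demo file, comfortably bigger than any stdio buffer.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "w") as f:
    for i in range(100_000):
        f.write("some reasonably long line of text %d\n" % i)

def peak_bytes(read):
    """Peak traced Python allocation while `read` consumes the file."""
    tracemalloc.start()
    with open(path) as f:
        read(f)
    peak = tracemalloc.get_traced_memory()[1]
    tracemalloc.stop()
    return peak

peak_all = peak_bytes(lambda f: f.readlines())      # whole list at once
peak_one = peak_bytes(lambda f: sum(1 for _ in f))  # one line at a time

os.remove(path)
print(peak_all, peak_one)
```

Once the file outgrows the buffers, `readlines()` has to hold every line object simultaneously, so its peak should dwarf the iterator's; on a file that fits in the buffers, as in the teensy test above, the two are indistinguishable.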

Anyhow, I'm sure that the fact that they show the same memory usage in your test is a reflection of buffering. The next question is, which provides the most *conceptual* simplicity? (The answer to that one, I think, depends on how your brain happens to see things...)

Jeff Shannon
Technician/Programmer
Credit International

--
http://mail.python.org/mailman/listinfo/python-list
