In article <4e592852$0$29965$c3e8da3$54964...@news.astraweb.com>, Steven D'Aprano <steve+comp.lang.pyt...@pearwood.info> wrote:
> open("file.txt")   # opens the file
> .read()            # reads the contents of the file
> .split("\n\n")     # splits the text on double-newlines.

The biggest problem with this code is that read() slurps the entire file
into a string.  That's fine for moderately sized files, but will fail (or
at least be grossly inefficient) for very large files.

It's always annoyed me a little that while it's easy to iterate over the
lines of a file, it's more complicated to iterate over a file character
by character.  You could write your own generator to do that:

    for c in getchar(open("file.txt")):
        whatever

    def getchar(f):
        for line in f:
            for c in line:
                yield c

but that's annoyingly verbose (and probably not hugely efficient).

Of course, the next problem for the specific problem at hand is that even
with an iterator over the characters of a file, split() only works on
strings.  It would be nice to have a version of split which took an
iterable and returned an iterator over the split components.  Maybe there
is such a thing and I'm just missing it?
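For what it's worth, a split along those lines could be sketched roughly as follows. This is only an illustration, not an existing library function: the names read_chunks and isplit are made up for this example, and the 8192-character block size is an arbitrary assumption. It reads the file in blocks rather than character by character, and holds back the trailing piece of each block in case a separator straddles two blocks.

```python
import io
from functools import partial


def read_chunks(f, size=8192):
    """Yield successive blocks of text from an open file until EOF.

    Hypothetical helper; `size` is an arbitrary block size.
    """
    # iter(callable, sentinel) keeps calling f.read(size) until it returns "".
    return iter(partial(f.read, size), "")


def isplit(chunks, sep):
    """Lazily split an iterable of strings on sep, yielding each piece.

    Hypothetical helper; `chunks` may be blocks, lines, or single characters.
    """
    if not sep:
        raise ValueError("empty separator")
    buf = ""
    for chunk in chunks:
        buf += chunk
        parts = buf.split(sep)
        # The last part may be incomplete, or may end in half a separator,
        # so keep it in the buffer until more input arrives.
        buf = parts.pop()
        for part in parts:
            yield part
    # Whatever is left after the input is exhausted is the final piece.
    yield buf


if __name__ == "__main__":
    # Stand-in for open("file.txt"); any text-mode file object should work.
    f = io.StringIO("first paragraph\n\nsecond paragraph\n\nthird")
    for para in isplit(read_chunks(f), "\n\n"):
        print(repr(para))
```

Because isplit accepts any iterable of strings, it can also be fed the output of a character generator such as getchar(f), although reading in blocks is likely to be cheaper. Memory use is bounded by the longest single piece plus one block, rather than by the size of the whole file.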