Lee Sander wrote:
> I wanted to also say that this file is really huge, so I cannot
> just do a read() and then split on ">" to get a record
> thanks
> lee

Below is the easy solution. To get even better performance, or if '>'
is not always at the start of the line, you would have to implement the
buffering that is done by readline() yourself (see _fileobject in
socket.py in the standard lib for an example).

    def chunkreader(f):
        name = None
        lines = []
        while True:
            line = f.readline()
            if not line:
                break  # end of file
            if line[0] == '>':
                # A new header line: emit the record collected so far.
                if name is not None:
                    yield name, lines
                name = line[1:].rstrip()
                lines = []
            else:
                lines.append(line)
        # Emit the last record, if there was one.
        if name is not None:
            yield name, lines

    if __name__ == '__main__':
        from StringIO import StringIO

        s = \
    """> name1
    line1
    line2
    line3
    > name2
    line 4
    line 5
    line 6"""
        f = StringIO(s)
        for name, lines in chunkreader(f):
            print '***', name
            print ''.join(lines)

    $ python test.py
    ***  name1
    line1
    line2
    line3

    ***  name2
    line 4
    line 5
    line 6

-- 
Regards,
Tijs
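
P.S. For the case where '>' can appear anywhere rather than only at the start of a line, doing the buffering by hand could look roughly like the sketch below. It is written for Python 3, and the names iter_records, sep and blocksize are made up for illustration; none of them come from the standard library. It reads fixed-size blocks, so the whole file is never held in memory at once, and it assumes a single-character separator and discards any text before the first one.

```python
from io import StringIO


def iter_records(f, sep=">", blocksize=64 * 1024):
    """Yield the text following each `sep`, reading `f` block by block.

    Illustrative sketch only: `f` is assumed to be a text-mode file object.
    """
    buf = ""
    seen_sep = False  # becomes True once the first separator has been passed
    while True:
        block = f.read(blocksize)
        if not block:
            break  # end of file
        buf += block
        parts = buf.split(sep)
        # The last piece may be cut off mid-record; keep it for the next round.
        buf = parts.pop()
        for part in parts:
            if seen_sep:
                yield part
            # The very first piece is whatever came before the first separator.
            seen_sep = True
    if seen_sep:
        # Whatever is left after the final separator is the last record.
        yield buf


if __name__ == "__main__":
    data = "> name1\nline1\nline2\n> name2\nline 4\n"
    for record in iter_records(StringIO(data), blocksize=5):
        # Split each record into its header line and the remaining body.
        name, _, body = record.partition("\n")
        print("***", name.strip())
        print(body, end="")
```

A very long record makes the buffer grow and get re-split on every block, so treat this as a starting point rather than a tuned implementation.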