The existing groupby() itertool works great when every element in a group has the same key, but it is not so handy when groups are determined by boundary conditions.
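To make the distinction concrete, here is a small sketch (the sample data is made up for illustration). In the first case each element carries its own key, so groupby() works directly. In the second case the groups are defined only by where a marker element appears, and using the marker test as the key merely separates markers from non-markers.

```python
from itertools import groupby

# Case 1: every element carries its group key (here, its first letter).
words = ['apple', 'avocado', 'banana', 'blueberry', 'cherry']
by_letter = [(k, list(g)) for k, g in groupby(words, key=lambda w: w[0])]
print(by_letter)

# Case 2: groups are delimited by a marker element ('X'), not a shared key.
# Keying on the marker test alternates between marker and non-marker runs;
# it does not attach each marker to the run it starts (or ends).
stream = 'X1X23X456X'
naive = [''.join(g) for k, g in groupby(stream, key='X'.__eq__)]
print(naive)
```

The second result comes out as alternating runs ('X', '1', 'X', '23', ...) rather than 'X1', 'X23', 'X456', 'X', which is the gap a boundary-aware key function has to fill.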
For edge-triggered events, we need to convert a boundary-event predicate to a groupby-style key function. The code below encapsulates that process in a new itertool called split_on().

Would love for you guys to experiment with it for a bit and confirm that you find it useful. Suggestions are welcome.

Raymond

-----------------------------------------

from itertools import groupby, islice

def split_on(iterable, event, start=True):
    'Split iterable on event boundaries (either start events or stop events).'
    # split_on('X1X23X456X', 'X'.__eq__, True)  --> X1 X23 X456 X
    # split_on('X1X23X456X', 'X'.__eq__, False) --> X 1X 23X 456X
    def transition_counter(x, start=start, cnt=[0]):
        # cnt is a fresh one-element list for each split_on() call; it
        # counts the boundary events seen so far and serves as the group key.
        before = cnt[0]
        if event(x):
            cnt[0] += 1
        after = cnt[0]
        # Start events open a new group; stop events close the current one.
        return after if start else before
    return (g for k, g in groupby(iterable, transition_counter))

if __name__ == '__main__':
    for start in True, False:
        for g in split_on('X1X23X456X', 'X'.__eq__, start):
            print(list(g))
        print()

    from pprint import pprint
    boundary = '--===============2615450625767277916==\n'
    email = open('email.txt')
    for mime_section in split_on(email, boundary.__eq__):
        # Skip the first line of each section (the boundary line itself).
        pprint(list(islice(mime_section, 1, None)))
        print('= = ' * 30)
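One thing to keep in mind when experimenting: the groups are groupby() sub-iterators, so they share the underlying iterator, and each group needs to be consumed before moving on to the next one. Below is a minimal sketch of an eager wrapper, assuming it is acceptable to hold each group in memory. The name split_on_lists is a hypothetical helper for illustration, not part of the proposal, and split_on is repeated only so the snippet runs on its own.

```python
from itertools import groupby

def split_on(iterable, event, start=True):
    # Same logic as the proposal above, repeated for a standalone snippet.
    def transition_counter(x, start=start, cnt=[0]):
        before = cnt[0]
        if event(x):
            cnt[0] += 1
        after = cnt[0]
        return after if start else before
    return (g for k, g in groupby(iterable, transition_counter))

def split_on_lists(iterable, event, start=True):
    # Hypothetical helper: materialize each group as soon as it is produced,
    # so the results can be stored, indexed, or revisited later.
    return [list(g) for g in split_on(iterable, event, start)]

starts = split_on_lists('X1X23X456X', 'X'.__eq__, True)
stops = split_on_lists('X1X23X456X', 'X'.__eq__, False)
print([''.join(g) for g in starts])
print([''.join(g) for g in stops])
```

The lazy version is still the better fit for large inputs such as files; the eager wrapper is only a convenience for small inputs.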