"larry.mart...@gmail.com" <larry.mart...@gmail.com> writes: > Thanks for the reply Paul. I had not heard of itertools. It sounds > like just what I need for this. But I am having 1 issue - how do you > know how many items are in each group?

Simplest is:

    from itertools import groupby

    for key, group in groupby(xs, lambda x: (x[-1], x[4], x[5])):
        gs = list(group)   # convert the group iterator to a list
        n = len(gs)        # this is the number of elements in the group

There is some theoretical inelegance in that it requires each group to
fit in memory, but you weren't really going to have billions of files
with the same basename.

If you're not used to iterators and itertools, note there are some
subtleties to using groupby to iterate over files, because an iterator
actually has state. It bumps a pointer and maybe consumes some input
every time you advance it. In a situation like the above, you've got
some nested iterators (the groupby iterator generating groups, and the
individual group iterators that come out of the groupby) that wrap the
same file handle, so bad confusion can result if you advance both
iterators without being careful (one can consume file input that you
thought would go to another). This isn't as bad as it sounds once you
get used to it, but it can be a source of frustration at first.

BTW, if you just want to count the elements of an iterator (while
consuming it),

    n = sum(1 for x in xs)

counts the elements of xs without having to expand it into an
in-memory list.

Itertools really makes Python feel a lot more expressive and clean,
despite little kinks like the above.
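
To make the shared-state point concrete, here is a minimal sketch. The
sample `rows` list and the helper names `group_sizes` and
`stale_group_sizes` are made up for illustration and are not from the
original thread. It also assumes the input is already sorted by the
grouping key, since groupby only merges consecutive items. The first
function counts each group while it is still the current one; the
second saves the group iterators and reads them only after groupby has
moved on, by which point the earlier groups have nothing left to yield.

```python
from itertools import groupby

# Hypothetical sample data: (name, size) records, already sorted by name.
rows = [("a", 1), ("a", 2), ("b", 3), ("c", 4), ("c", 5), ("c", 6)]


def group_sizes(records):
    """Count the items in each group without building a list per group."""
    return [
        (key, sum(1 for _ in group))  # consume the group while it is current
        for key, group in groupby(records, key=lambda r: r[0])
    ]


def stale_group_sizes(records):
    """Pitfall: save the group iterators and only read them afterwards."""
    # Advancing groupby to the next key invalidates the previous group
    # iterator, because both draw from the same underlying source.
    saved = [(key, group) for key, group in groupby(records, key=lambda r: r[0])]
    return [(key, len(list(group))) for key, group in saved]


sizes = group_sizes(rows)
stale = stale_group_sizes(rows)

print(sizes)  # [('a', 2), ('b', 1), ('c', 3)]
print(stale)  # the keys are intact, but the first group now reports 0 items
```

If you need the groups after the loop has finished, convert each one
with list(group) inside the loop body, before groupby advances.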