On Sat, Feb 06, 2010 at 14:42 -0500, Terry Reedy wrote:
> On 2/6/2010 2:09 PM, Wolodja Wentland wrote:

> >I think you can use the itertools.groupby(L, lambda el: el[1]) to group
> >elements in your *sorted* list L by the value el[1] (i.e. the
> >identifier) and then iterate through these groups until you find the
> >desired number of instances grouped by the same identifier.

> This will generally not return the same result. It depends on
> whether OP wants *any* item appearing at least 5 times or whether
> the order is significant and the OP literally wants the first.

Order is preserved by itertools.groupby - Have a look:

>>> from itertools import groupby
>>> instances = [(1, 'b'), (2, 'b'), (3, 'a'), (4, 'c'), (5, 'c'), (6, 'c'),
...              (7, 'b'), (8, 'b')]
>>> grouped_by_identifier = groupby(instances, lambda el: el[1])
>>> grouped_by_identifier = ((identifier, list(group)) for identifier, group in
...                          grouped_by_identifier)
>>> k_instances = (group for identifier, group in grouped_by_identifier if
...                len(group) == 2)
>>> for group in k_instances:
...     print(group)
...
[(1, 'b'), (2, 'b')]
[(7, 'b'), (8, 'b')]

So the first element yielded by the k_instances generator will be the
first group of elements from the original list whose identifier appears
exactly k times in a row.

> Sorting the entire list may also take a *lot* longer.

Than what? Am I missing something? Is the "*sorted*" the culprit? If yes
-> Just forget it as it is not relevant.

-- 
  .''`.     Wolodja Wentland    <wentl...@cl.uni-heidelberg.de>
 : :'  :
 `. `'`     4096R/CAF14EFC
   `-       081C B7CD FF04 2BA9 94EA  36B2 8B7F 7D30 CAF1 4EFC
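The point about "*sorted*" can be made concrete with a small sketch. It
is only an illustration, not code from the thread: the helper name
runs_of_length and the use of operator.itemgetter are assumptions made
here. It shows that groupby only merges *adjacent* items with equal
keys, so sorting by identifier first changes which runs exist.

```python
from itertools import groupby
from operator import itemgetter


def runs_of_length(instances, k, key=itemgetter(1)):
    """Yield each run of exactly k consecutive items sharing the same key.

    Hypothetical helper for illustration; groupby only groups adjacent
    items, so the input order decides which runs are found.
    """
    for _identifier, group in groupby(instances, key=key):
        run = list(group)
        if len(run) == k:
            yield run


instances = [(1, 'b'), (2, 'b'), (3, 'a'), (4, 'c'), (5, 'c'), (6, 'c'),
             (7, 'b'), (8, 'b')]

# Original order: 'b' shows up as two separate runs of length 2,
# and the first of them is the first match.
first_run = next(runs_of_length(instances, 2), None)

# Sorted by identifier: the four 'b' items merge into one run of
# length 4, so no run of length 2 is left.
sorted_runs = list(runs_of_length(sorted(instances, key=itemgetter(1)), 2))
```

Under these assumptions first_run is the pair starting at position 1,
while sorted_runs is empty, which is why the sorting step should be
dropped if the original order matters.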
-- http://mail.python.org/mailman/listinfo/python-list