On Nov 14, 1:16 pm, Florian Brucker <[EMAIL PROTECTED]> wrote: > Hi everybody! > > Given a dictionary, I want to create a clustered version of it, > collecting keys that have the same value: > > >>> d = {'a':1, 'b':2, 'c':1, 'd':1, 'e':2, 'f':3} > >>> cluster(d) > {1:['a', 'c', 'd'], 2:['b', 'e'], 3:['f']} > > That is, generate a new dict which holds for each value of the old dict > a list of the keys of the old dict that have that very value. > > Another requirement is that it should also work on lists, in that case > with indices instead of keys. We may assume that all values in the > original dict/list can be used as dict keys. > > Right now I'm doing it like this: > > def cluster(d): > try: > # is d a dict? > values = unique(d.values()) > keys = d.keys() > except AttributeError: > # assume d is a list > values = unique(d) > keys = range(len(d)) > > clusters = {} > for value in values: > clusters[value] = filter(lambda v: d[v] == value, keys) > > return clusters > > where unique() is from O'Reilly's Python Cookbook and returns a list > containing each item of the given list exactly once. > > Now I'm pretty new to Python and chances are rather high that this could > be done prettier and/or more efficient. Therefore any comment on the > above code is greatly appreciated. > > Regards, > Florian
Your implementation is pretty inefficient as it iterates over the whole dictionary as many times as there are distinct values in it. Here is a simpler solution. from collections import defaultdict def cluster(d): clusters = defaultdict(list) for key, val in d.iteritems(): clusters[val].append(key) return clusters Or without defaultdict: def cluster(d): clusters = {} for key, val in d.iteritems(): clusters.setdefault(val, []).append(key) return clusters -- Arnaud -- http://mail.python.org/mailman/listinfo/python-list