Hi! ...I have found a good enough solution, although it only works if the number of patterns (clusters) is not very big: def classify(f): THERESHOLD=0.1
patterns={} for l in enumerate(f): found = False for p,c in patterns.items(): if dist(l,p) < THERESHOLD: found=True patterns[p] = c +1 if not found: patterns[l] = 1 return patterns This algorithm is O(n*np*m^2). Where n is the number of logs, np the number of patterns, and m is the log length (i.e. m^2 is the distance cost). So it's way better O(n^2*m^2) and I can run it for some hours to get back the results. I wonder if there is a single threaded/process clustering algorithm than runs in O(n)? Cheers, marc -- http://mail.python.org/mailman/listinfo/python-list