I have coded a robust (Theil-Sen) regression routine that takes two lists of numbers, x and y, as inputs and returns robust estimates of the slope and intercept of the best-fit straight line.
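For context, the textbook Theil-Sen estimator takes the slope to be the median of the slopes between all pairs of points, and the intercept to be the median of y - slope * x. The sketch below only illustrates that idea; it is not the routine described above. The function name theil_sen is hypothetical, and the code assumes Python 3.4 or later for the statistics module.

```python
from itertools import combinations
from statistics import median


def theil_sen(x, y):
    """Hypothetical helper: textbook Theil-Sen fit, returns (slope, intercept)."""
    # Slopes between every pair of points that have distinct x values;
    # pairs sharing an x value are skipped because their slope is undefined.
    slopes = [
        (yb - ya) / (xb - xa)
        for (xa, ya), (xb, yb) in combinations(zip(x, y), 2)
        if xa != xb
    ]
    if not slopes:
        raise ValueError("need at least two points with distinct x values")
    slope = median(slopes)
    # Intercept: median of the offsets y - slope * x over all points.
    intercept = median(yi - slope * xi for xi, yi in zip(x, y))
    return slope, intercept
```

With x = [0, 1, 2, 3, 4] and y = [1, 3, 5, 7, 100], this returns (2.0, 1.0): the single outlier at x = 4 does not move the fit. Note that the pairwise approach is quadratic in the number of points.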
In a pre-processing phase, I create two new lists, x1 and y1: x1 holds only the unique values in x, and for each unique value in x1, y1 holds the median of all the y values paired with that value of x. My code follows, and it seems a bit clumsy. Is there a cleaner way to do it?

By the way, I'd be more than happy to share the code for the entire algorithm; just let me know and I will post it here.

Thanks in advance,

Thomas Philips

    d = {}  # map each unique x to the list of its y values
    for xx, yy in zip(x, y):
        if xx in d:
            d[xx].append(yy)
        else:
            d[xx] = [yy]

    x1 = []  # unique values of x
    y1 = []  # median(y) for each unique value of x
    for xx, yy in d.items():
        x1.append(xx)
        l = len(yy)
        if l == 1:
            y1.append(yy[0])
        else:
            yy.sort()
            # even count: mean of the two middle values; odd count: middle value
            y1.append((yy[l//2 - 1] + yy[l//2]) / 2.0 if l % 2 == 0 else yy[l//2])
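One possibly cleaner way to write the pre-processing step described above is sketched here. It assumes Python 3.4 or later, where statistics.median is available; collections.defaultdict removes the explicit membership test. The function name collapse_duplicates is hypothetical, and unlike the code above it returns x1 in sorted order rather than in dictionary order.

```python
from collections import defaultdict
from statistics import median


def collapse_duplicates(x, y):
    """Hypothetical helper: one (x, median of y) pair per distinct x value."""
    groups = defaultdict(list)  # x value -> list of the y values seen with it
    for xx, yy in zip(x, y):
        groups[xx].append(yy)
    x1 = sorted(groups)  # unique x values, in ascending order
    y1 = [median(groups[xx]) for xx in x1]  # median(y) for each unique x
    return x1, y1
```

For example, collapse_duplicates([1, 2, 1, 3, 2, 1], [10, 20, 30, 40, 50, 20]) returns ([1, 2, 3], [20, 35.0, 40]). As in the hand-written version, an even-sized group yields the mean of its two middle values and an odd-sized group yields its middle value unchanged.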