Hello
I wrote a function that, given a list of numbers, finds clusters of
values by proximity and returns a reduced list containing the centers of
these clusters. However, I find it rather unclear. I would appreciate
any comments on how Pythonic my function is, and any suggestions for
improving its readability.
The function is:
def aglomerate(x_lst, delta=1.e-5):
    clusters = []  # list of pairs [center, number of clustered values]
    for x in x_lst:
        close_to = [abs(x - y) < delta for y, _ in clusters]
        if any(close_to):
            # x is close to a cluster
            index = close_to.index(True)
            center, n = clusters[index]
            # update the cluster center including the new value,
            # and increment dimension of cluster
            clusters[index] = (n * center + x)/(n+1), n+1
        else:
            # x does not belong to any cluster, create a new one
            clusters.append([x, 1])
    # return list with centers
    return [center for center, _ in clusters]
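The update of the cluster center in the function above is a running
mean: given the mean of n values, it produces the mean of n+1 values
without storing the values themselves. A minimal sketch checking that
formula in isolation (the helper name running_mean and the sample values
are only for illustration and are not part of the function above):

```python
def running_mean(center, n, x):
    """Return the mean of n values whose mean is `center`, after adding x."""
    return (n * center + x) / (n + 1)


# Illustrative sample values, chosen arbitrarily.
values = [1.0, 1.1, 0.9, 1.05]

# Start from the first value and fold in the others one at a time.
center, n = values[0], 1
for x in values[1:]:
    center = running_mean(center, n, x)
    n += 1

# The incrementally updated center agrees with the plain average,
# up to floating-point rounding.
plain_mean = sum(values) / len(values)
assert abs(center - plain_mean) < 1e-12
```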
Examples:
1. No clusters in x_lst:
In [52]: aglomerate([1., 2., 3., 4.])
Out[52]: [1.0, 2.0, 3.0, 4.0]
2. Some elements in x_lst are equal:
In [53]: aglomerate([1., 2., 1., 3.])
Out[53]: [1.0, 2.0, 3.0]
3. Some elements in x_lst should be clustered:
In [54]: aglomerate([1., 2., 1.1, 3.], delta=0.2)
Out[54]: [1.05, 2.0, 3.0]
So, the function seems to work as it should, but can it be made more
readable?
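As a point of comparison, one possible restructuring is sketched below.
It is only a sketch, not a definitive version: the names
agglomerate_sketch and find_cluster are hypothetical, and it assumes the
same behaviour as the function above (a value joins the first cluster
whose center is closer than delta). It moves the search into a small
helper so that the main loop reads as "find a cluster, then update or
create".

```python
def find_cluster(clusters, x, delta):
    """Return the index of the first cluster whose center is within
    delta of x, or None if there is no such cluster."""
    for index, (center, _) in enumerate(clusters):
        if abs(x - center) < delta:
            return index
    return None


def agglomerate_sketch(x_lst, delta=1.e-5):
    """Return the centers of clusters of nearby values in x_lst."""
    clusters = []  # list of [center, number of clustered values]
    for x in x_lst:
        index = find_cluster(clusters, x, delta)
        if index is None:
            # x does not belong to any cluster, create a new one
            clusters.append([x, 1])
        else:
            # fold x into the running mean of the matching cluster
            center, n = clusters[index]
            clusters[index] = [(n * center + x) / (n + 1), n + 1]
    return [center for center, _ in clusters]


# The three examples from this message, as checks.
assert agglomerate_sketch([1., 2., 3., 4.]) == [1.0, 2.0, 3.0, 4.0]
assert agglomerate_sketch([1., 2., 1., 3.]) == [1.0, 2.0, 3.0]
result = agglomerate_sketch([1., 2., 1.1, 3.], delta=0.2)
assert abs(result[0] - 1.05) < 1e-12 and result[1:] == [2.0, 3.0]
```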
Thanks for any help.
Ze
_______________________________________________
Tutor maillist - Tutor@python.org