Hello
I wrote a function that, given a list of numbers, groups values lying within a distance delta of an existing cluster center and returns a reduced list containing the centers (running means) of these clusters. However, I find the code rather unclear. I would appreciate any comments on how pythonic my function is, and suggestions to improve its readability.
The function is:

def aglomerate(x_lst, delta=1.e-5):
    clusters = [] #list of pairs [center, number of clustered values]
    for x in x_lst:
        close_to = [abs(x - y) < delta for y,_ in clusters]
        if any(close_to):
            # x is close to a cluster
            index = close_to.index(True)
            center, n = clusters[index]
            #update the cluster center including the new value,
            #and increment dimension of cluster
            clusters[index] = (n * center + x)/(n+1), n+1
        else:
            # x does not belong to any cluster, create a new one
            clusters.append((x, 1))
    # return list with centers
    return [center for center, _ in clusters]

Examples:
1. No clusters in x_lst:
In [52]: aglomerate([1., 2., 3., 4.])
Out[52]: [1.0, 2.0, 3.0, 4.0]

2. Some elements in x_lst are equal:
In [53]: aglomerate([1., 2., 1., 3.])
Out[53]: [1.0, 2.0, 3.0]

3. Some elements in x_lst should be clustered:
In [54]: aglomerate([1., 2., 1.1, 3.], delta=0.2)
Out[54]: [1.05, 2.0, 3.0]

So, the function seems to work as it should, but can it be made more readable?
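
As a starting point for discussion, here is a minimal sketch of one possible restructuring. It is not the original function: the name agglomerate_sketch and the parallel lists centers and counts are assumptions introduced only for illustration. It is meant to keep the same behaviour (a value joins the first cluster whose center is closer than delta, and that center is updated as a running mean) while avoiding the intermediate list of booleans.

```python
def agglomerate_sketch(values, delta=1.e-5):
    """Return cluster centers (hypothetical rewrite, same logic as aglomerate)."""
    centers = []  # running mean of each cluster
    counts = []   # number of values merged into each cluster
    for x in values:
        # index of the first cluster whose center is within delta of x, or None
        match = next(
            (i for i, c in enumerate(centers) if abs(x - c) < delta),
            None,
        )
        if match is None:
            # x does not belong to any cluster, start a new one
            centers.append(x)
            counts.append(1)
        else:
            # fold x into the running mean of the matched cluster
            n = counts[match]
            centers[match] = (n * centers[match] + x) / (n + 1)
            counts[match] = n + 1
    return centers


if __name__ == "__main__":
    print(agglomerate_sketch([1., 2., 1.1, 3.], delta=0.2))
```

Whether two parallel lists read better than a list of (center, n) pairs is a matter of taste; the main change in this sketch is replacing the any()/index() pair with a single next() lookup.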

Thanks for any help.
Ze
_______________________________________________
Tutor maillist  -  Tutor@python.org