I just noticed a recent paper on arXiv <http://arxiv.org/abs/1109.2378>
about faster hierarchical clustering. The author has come up with
algorithms that reduce the asymptotic complexity of many variants from O(N^3)
to O(N^2). Even better, he has written a C++ implementation with a
Python/NumPy interface
<http://math.stanford.edu/~muellner/fastcluster.html> that is drop-in
compatible with scipy.cluster.hierarchy.
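To make "drop-in compatible" concrete, here is a rough sketch of what swapping it in might look like. It assumes the package is importable as `fastcluster` and that its `linkage` function takes the same arguments as `scipy.cluster.hierarchy.linkage`. The toy data and variable names are made up for illustration, and the code falls back to SciPy if fastcluster is not installed.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster

# Assumption: fastcluster exposes a linkage() with SciPy's signature.
# If it is not installed, fall back to the SciPy implementation.
try:
    from fastcluster import linkage
except ImportError:
    from scipy.cluster.hierarchy import linkage

# Toy data (illustrative only): two well-separated 2-D blobs of 20 points.
rng = np.random.RandomState(0)
X = np.vstack([
    rng.normal(0.0, 0.1, size=(20, 2)),
    rng.normal(5.0, 0.1, size=(20, 2)),
])

# Build the linkage matrix; each of its N-1 rows describes one merge.
Z = linkage(X, method="average", metric="euclidean")

# The result can be passed to the usual SciPy helpers,
# here cutting the tree into two flat clusters.
labels = fcluster(Z, t=2, criterion="maxclust")
```

Since only the import changes, the rest of a SciPy-based pipeline (fcluster, dendrogram, and so on) should keep working on the returned linkage matrix.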
I would take a crack at incorporating this into scikit-learn myself, but I
don't know the first thing about how to create Python interfaces to C++
code. But maybe one of you is interested.
Conrad