I just noticed this recent paper <http://arxiv.org/abs/1109.2378> on arXiv
about faster hierarchical clustering.  The author has come up with
algorithms to reduce the asymptotic complexity of many variants from O(N^3)
to O(N^2).  Even better, he has written a C++ implementation with a
Python/NumPy interface
<http://math.stanford.edu/~muellner/fastcluster.html> that is drop-in
compatible with scipy.cluster.hierarchy.
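
For anyone wondering where the cubic cost comes from, here is a minimal
sketch of the textbook naive approach for the single-linkage variant.  To be
clear, this is not the paper's algorithm and not fastcluster's code; the
function name and the toy data are made up for illustration.  Each round
scans every pair of clusters (O(N^2) distance evaluations) and there are
N-1 rounds, hence O(N^3) overall.

```python
import math
from itertools import combinations


def naive_single_linkage(points):
    """Naive agglomerative single-linkage clustering (hypothetical helper).

    Returns a list of merges as (cluster_a, cluster_b, distance) tuples,
    where each cluster is a tuple of indices into `points`.
    """
    # Start with every point in its own cluster.
    clusters = [(i,) for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        best = None
        # Scan all cluster pairs; the single-linkage distance is the
        # smallest point-to-point distance between the two clusters.
        for a, b in combinations(range(len(clusters)), 2):
            d = min(
                math.dist(points[i], points[j])
                for i in clusters[a]
                for j in clusters[b]
            )
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        merges.append((clusters[a], clusters[b], d))
        # Replace the two closest clusters with their union.
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return merges


if __name__ == "__main__":
    # Toy data: two well-separated pairs of points.
    toy = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.5)]
    for left, right, dist in naive_single_linkage(toy):
        print(left, right, dist)
```

The repeated full scan is the part that the faster algorithms avoid; the
sketch above makes no attempt to do so.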

I would take a crack at incorporating this into scikit-learn myself, but I
don't know the first thing about how to create python interfaces to C++
code.  But maybe one of you is interested.

Conrad
_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general