Hi everyone,
I recently had the opportunity to work with Apache Mahout while developing a document clustering component for a CMS. My current research interests include the use of neural networks for text clustering. With Mahout 0.9 moving towards the use of neural networks (MLP Classifier), I thought it would be interesting to have a clustering module for Mahout based on the Kohonen network (Self-Organizing Map, SOM), probably the most widely used unsupervised neural network.

Many variants of the algorithm have been developed based on the idea of SOM [1]. The Growing Self-Organizing Map (GSOM) is one such variant; it addresses some of the limitations of SOM and is an ideal candidate for hierarchical clustering [2].

I also went through the related JIRA issues (MAHOUT-64 <https://issues.apache.org/jira/browse/MAHOUT-64>, MAHOUT-1344 <https://issues.apache.org/jira/browse/MAHOUT-1344>).

If possible, I would like to contribute to developing Self-Organizing Maps for Mahout, probably starting with the online version of SOM. Any comments or opinions on this matter are highly appreciated.

[1] R.D. Lawrence, G.S. Almasi, H.E. Rushmeier, "A Scalable Parallel Algorithm for Self-Organizing Maps with Applications to Sparse Data Mining Problems"
[2] Toby Smith, Damminda Alahakoon, "Growing Self-Organizing Map for Online Continuous Clustering"

Thanks,
Chalitha
