Great to see you on here, Anthony! Hey the-rest-of-you! You should definitely check out his lsa4solr code: it's mostly very nice, thin Clojure wrappers around the DistributedRowMatrix and related code, which let you use Clojure's REPL to play interactively with the matrix (which in itself is awesome, beyond the integration with Lucene or Solr).
We should make a new Maven submodule for Clojure contributions, and can make some of Anthony's code the base of it. Are any of us Clojure users who could help Anthony get his git repo (it's already Apache-licensed) integrated into our build framework, and make a patch so we can start working with it?

  -jake

(sorry for giving you the wrong list address at first!)

On Fri, Apr 16, 2010 at 7:22 AM, Anthony <lsa4s...@gmail.com> wrote:

> All,
>
> I have begun work on an integration of Apache Solr and Mahout,
> http://github.com/algoriffic/lsa4solr which is related to #MAHOUT-343
> (https://issues.apache.org/jira/browse/MAHOUT-343). The
> implementation is in Clojure and interfaces with both the
> DistributedLanczosSolver and the distributed k-means clustering
> algorithm from Mahout. I am about to begin implementing a
> hierarchical clustering algorithm so that the number of clusters does
> not need to be specified in advance. Has anyone done anything like
> this in Mahout yet? Also, I'd be happy to contribute the code to
> Mahout if anyone is interested.
>
> Thanks,
> Anthony
>
> On Fri, Apr 16, 2010 at 9:50 AM, Jake Mannix <jman...@apache.org> wrote:
> > Hey Anthony,
> > We would love to have hierarchical clustering in Mahout, in Clojure or
> > pure java. Come on over to the mahout-dev mailing list, and/or file
> > a JIRA ticket, and join the fun, we'd love to work with you (and over
> > on mahout-dev, you'll get even more positive feedback).
> > If you'd rather, and aren't as familiar with the whole Apache process,
> > I can file a JIRA ticket for you, and you can just comment there and
> > start the conversation that way.
> > Do you subscribe to the mahout-...@apache.org / mahout-user@
> > mailing lists? They're not too high traffic.
> > -jake
>