I have a friend right now who needs a JRuby integration layer. It is about how code is glued together. Java is the lingua franca, but there are lots of sites that are using Groovy (in the finance world especially), Ruby (in the web 2.0 world) or Clojure (in the too-cool crowd).
So I definitely see a place for small compatibility layers to make Mahout easily accessible from these languages. Having a REPL is a very common pattern for tuning learning algorithms. If that REPL is the same as your glue layer, so much the better. Right now, our only REPL option is shell command lines for a few top-level functions. It would be sooo much better if we had tighter integration in a real scripting language.

Aside from my support for nice scripting layers, I think it might be better to take Anthony's Clojure support, but have the LSA/Solr/Mahout integration be a consumer of Mahout rather than a part of Mahout. Obviously, since he would be an early "customer", it would behoove us to work well with him to help him succeed, but I don't see a big advantage to integrating his main code until HE starts seeing some adoption and we see some independent pull.

Mahout can't be such a big tent that we include every project or piece of code that uses anything from Mahout. At some point, it is more useful to draw a line.

On Fri, Apr 16, 2010 at 11:56 AM, Sean Owen <sro...@gmail.com> wrote:

> On Fri, Apr 16, 2010 at 7:39 PM, Jake Mannix <jake.man...@gmail.com>
> wrote:
> > I will start playing around with Anthony's github-based stuff, and
> > see where a patch can be made. The question is where it would
> > go? It's a fully functioning project already over on its own.
> >
> I suppose that's my question too -- what is being fixed by a move?
>
> The point about integrating with the ML community by having a
> 'LISP-speaking' module, to be friendlier, is a good one. It does call
> into question the Mahout identity -- is it for tinkering with in a lab
> to explore new algorithms (for which Clojure/LISP makes sense)? or is
> it for engineers and production systems at scale -- where Hadoop/Java
> is the lingua franca? Yeah, this is not just another language, but for
> a somewhat different audience.