There are several clustering implementations done and in progress, some variants of naive Bayesian classification are happening, and somebody is working on logistic regression (and hopefully generalized linear modeling). Another guy is doing some nice thinking about evolutionary algorithms. I think somebody is working on SVM, but I haven't heard much about that lately.
Some things that have not been jumped on include:

a) tree classifiers, notably random forests and some sort of boosted decision tree
b) general co-occurrence analysis
c) better parallel matrix operations
d) latent variable techniques such as LDA, MDCA, non-negative matrix factorization and LSI (although there has been some discussion of this recently)

As far as I know, (a) and (b) are wide open. I would expect that the folks working on different parts of existing efforts would welcome some additional oomph, so I wouldn't let that stop you.

On Fri, May 30, 2008 at 11:05 AM, Yuri Niyazov <[EMAIL PROTECTED]> wrote:

> Hi everyone,
>
> I reviewed the mailing list archives, and the Stanford NIPS paper. It is
> unclear which algorithms have already been "claimed" for development, so to
> speak, by the selected GSoC participants. I am not part of GSoC, and I would
> like to write and contribute at least one full-blown (both uniprocess & MR)
> algorithm implementation to Mahout, but I don't want to write something that
> someone else has plans to do in the near future. A summary or an appropriate
> pointer to such a summary would be appreciated.
>
> Thanks,
> YN.

--
ted