FWIW, Jimmy Lin's book has a chapter on MapReduce-based EM algorithms (http://www.umiacs.umd.edu/~jimmylin/book.html)

On Mon, Nov 8, 2010 at 8:01 AM, Sebastian Schelter <s...@apache.org> wrote:
> I'm moving a twitter conversation to the mailing list so that it doesn't
> vanish in the short-lived microblogging sphere.
>
> To summarize, @alansaid is looking for an implementation of the EM-algorithm
> as described here:
> https://cwiki.apache.org/confluence/display/MAHOUT/Expectation+Maximization.
> I could only point him to an unsuccessful implementation of PLSI tried at
> https://issues.apache.org/jira/browse/MAHOUT-106. While this one worked for
> tiny examples, it clearly didn't scale and it had some parts of the
> algorithm wrong IMHO. @sbourke tweeted about using it besides scalability
> issues but I would clearly discourage anyone from doing this.
>
> Nevertheless if Alan manages to make this work and scale I think it would
> make a very nice contribution to Mahout. I guess we'd be willing to help, so
> Alan, if you need support, just ask on d...@. There's also a mahout hackathon
> planned in Berlin, maybe that would be a good opportunity to work
> collaboratively on that implementation.
>
> --sebastian