Indeed there is. And Prasenjit is being properly modest by not pointing out that this was due to his efforts.
This is a great example of how terse a language like Pig can make many problems that involve a bunch of counting. Most EM-like algorithms fit into this category, including k-means, HMM fitting, Dirichlet process mixture modeling, and lots of others.

The problem in my mind is that it is difficult to tie all of the little scripts together coherently. Prasenjit did this using Python, but there is still no cohesive whole to the resulting program, even if the result is much smaller and probably easier to understand than a large Java program.

On Tue, Jun 16, 2009 at 11:07 PM, prasenjit mukherjee <[email protected]> wrote:

> Well, there is a PLSI implementation using Pig (over Hadoop) as a Mahout
> patch: https://issues.apache.org/jira/browse/MAHOUT-106
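
P.S. To make the "bunch of counting" point concrete, here is a rough sketch of one k-means iteration written as assign, group, aggregate, which is the shape that maps naturally onto a GROUP BY in a language like Pig. This is plain Python and is not taken from the MAHOUT-106 patch; the function name `kmeans_step` and the choice to leave empty clusters unchanged are my own assumptions for illustration.

```python
from collections import defaultdict


def kmeans_step(points, centroids):
    """One k-means iteration as 'assign, group, aggregate' (illustrative sketch)."""
    dim = len(centroids[0])
    sums = defaultdict(lambda: [0.0] * dim)  # per-cluster coordinate sums
    counts = defaultdict(int)                # per-cluster point counts

    for p in points:
        # Assignment step: index of the nearest centroid (squared Euclidean distance).
        k = min(
            range(len(centroids)),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])),
        )
        # Accumulate the sufficient statistics for cluster k: a sum and a count.
        counts[k] += 1
        sums[k] = [s + x for s, x in zip(sums[k], p)]

    # Update step: new centroid = sum / count.
    # Assumption: a cluster that received no points keeps its old centroid.
    return [
        [s / counts[k] for s in sums[k]] if counts[k] else list(centroids[k])
        for k in range(len(centroids))
    ]
```

Everything after the assignment is just sums and counts keyed by cluster, which is why a handful of small scripts can cover each iteration. The part that stays awkward is the outer loop that feeds the new centroids back in, and that is the glue code I was complaining about.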
