Hi,
   I have implemented T. Hofmann's PLSI, based on the EM algorithm, in Pig.
The E/M logic takes roughly 30-35 lines of Pig Latin statements.
The implementation is available in Mahout as part of the following patch:
https://issues.apache.org/jira/browse/MAHOUT-106.

Though the code works fine, I would appreciate any feedback on the
scalability of the Pig implementation, as it uses some joins/cogroups to
compute the estimated probabilities p(s|z) and p(z|u).
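
For context, one EM iteration for these two probabilities can be sketched in
plain Python as below. This is only an illustrative in-memory sketch, not the
Pig code in the patch. The function names init_params and plsi_em_step, the
dictionary layout, and the reading of u, s and z as user, item and latent
class are all assumptions made for the example.

```python
import random
from collections import defaultdict


def init_params(counts, num_z, seed=0):
    """Random, normalised starting values for p(s|z) and p(z|u) (hypothetical helper)."""
    rng = random.Random(seed)
    users = sorted({u for u, _ in counts})
    items = sorted({s for _, s in counts})
    p_s_z, p_z_u = {}, {}
    for z in range(num_z):
        raw = {s: rng.random() + 0.1 for s in items}
        total = sum(raw.values())
        for s, v in raw.items():
            p_s_z[(s, z)] = v / total  # sums to 1 over s for each z
    for u in users:
        raw = [rng.random() + 0.1 for _ in range(num_z)]
        total = sum(raw)
        for z, v in enumerate(raw):
            p_z_u[(z, u)] = v / total  # sums to 1 over z for each u
    return p_s_z, p_z_u


def plsi_em_step(counts, p_s_z, p_z_u, num_z):
    """One EM iteration over counts = {(u, s): n} (assumed input layout)."""
    s_z_num = defaultdict(float)  # numerator of p(s|z), keyed by (s, z)
    z_tot = defaultdict(float)    # denominator of p(s|z), keyed by z
    z_u_num = defaultdict(float)  # numerator of p(z|u), keyed by (z, u)
    u_tot = defaultdict(float)    # denominator of p(z|u), keyed by u
    for (u, s), n in counts.items():
        # E-step: posterior q(z|u,s) is proportional to p(s|z) * p(z|u).
        joint = [p_s_z[(s, z)] * p_z_u[(z, u)] for z in range(num_z)]
        norm = sum(joint)
        if norm == 0.0:
            continue
        # M-step accumulation: weight each posterior by the observed count.
        for z in range(num_z):
            w = n * joint[z] / norm
            s_z_num[(s, z)] += w
            z_tot[z] += w
            z_u_num[(z, u)] += w
            u_tot[u] += w
    new_p_s_z = {(s, z): (v / z_tot[z] if z_tot[z] else 0.0)
                 for (s, z), v in s_z_num.items()}
    new_p_z_u = {(z, u): (v / u_tot[u] if u_tot[u] else 0.0)
                 for (z, u), v in z_u_num.items()}
    return new_p_s_z, new_p_z_u
```

In this sketch the per-(u, s) lookup of both parameter tables and the
per-z and per-u totals are the steps that need data brought together by key,
which is presumably where joins and grouping come in on a distributed setup.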

-Prasen
