a) Upgrade to the latest Mahout version. Please move away from 0.7; a lot of lint has been cleaned up since then.
b) It seems you are running the old LDA algorithm, which was replaced by CVB in later versions. Try running your corpus through CVB once you upgrade to a later version of Mahout. I don't think you need Storm/Spark for that.

On Friday, March 7, 2014 12:21 PM, vineet yadav <vineet.yadav.i...@gmail.com> wrote:

Hi Ted,
It is Mahout 0.7.
Thanks
Vineet Yadav

On Thu, Mar 6, 2014 at 11:58 PM, Ted Dunning <ted.dunn...@gmail.com> wrote:

> Which version are you using?
>
> On Thu, Mar 6, 2014 at 5:47 AM, vineet yadav <vineet.yadav.i...@gmail.com> wrote:
>
> > Hi,
> > I am using the Mahout LDA algorithm for topic modeling on a huge number of
> > documents (500k or more). Mahout is taking a lot of time, so I am looking at
> > other alternatives. I found the link
> > (http://www.oracle.com/technetwork/articles/java/micro-1925135.html),
> > where Storm is used with Mallet for real-time topic modeling. I want to know if
> > anyone has tried Storm or Spark with Mahout to speed up the process.
> >
> > Thanks
> > Vineet Yadav