Hi Peyman, 

Good to hear from you. Not sure if anyone has responded to you yet, but the
answer to your question is that I am not aware of any benchmarking that was
done for Mahout's CVB implementation. Others, please jump in here if you think
otherwise.

What has changed in LDA from 0.7 - 0.9?  


- 0.7 had LDA with Gibbs sampling and LDA with CVB.
- 0.8 deprecated LDA with Gibbs sampling.
- 0.9 purged LDA with Gibbs sampling.




On Sunday, March 9, 2014 11:46 AM, Peyman Faratin <peymanfara...@gmail.com> 
wrote:
 
Hi 

Is there any benchmarking that shows the limits of CVB, and what has changed
in LDA from 0.7 to 0.8 to 0.9 to address the convergence speed? I would like to
use CVB on a 150k+ corpus but have come across a number of threads that mention
slow convergence. Knowing what has changed in recent versions to address this
issue would help me decide whether to use Mahout or not (Y!LDA being the other
option).

Thank you


On Mar 7, 2014, at 12:36 PM, Suneel Marthi <suneel_mar...@yahoo.com> wrote:

> a) Upgrade to the latest Mahout version. Please move away from 0.7; a lot of
> lint has been cleaned up since then.
> 
> b) It seems you are running the old LDA algorithm that was replaced by CVB in
> later versions. Try running your corpus through CVB once you upgrade to a
> later version of Mahout. I don't think you need Storm/Spark for that.
> 
> 
> 
> 
> 
> 
> 
> On Friday, March 7, 2014 12:21 PM, vineet yadav <vineet.yadav.i...@gmail.com> 
> wrote:
> 
> Hi Ted,
> It is Mahout 0.7.
> 
> Thanks
> Vineet Yadav
> 
> 
> On Thu, Mar 6, 2014 at 11:58 PM, Ted Dunning <ted.dunn...@gmail.com> wrote:
> 
>> Which version are you using?
>> 
>> 
>> On Thu, Mar 6, 2014 at 5:47 AM, vineet yadav <vineet.yadav.i...@gmail.com>
>> wrote:
>> 
>>> Hi,
>>> I am using the Mahout LDA algorithm for topic modeling on a large number of
>>> documents (500k or more). Mahout is taking a lot of time, so I am looking at
>>> other alternatives. I found an article
>>> (http://www.oracle.com/technetwork/articles/java/micro-1925135.html)
>>> where Storm is used with Mallet for real-time topic modeling. I want to know
>>> if anyone has tried Storm or Spark with Mahout to speed up the process.
>>> 
>>> Thanks
>>> Vineet Yadav
>>> 
