Hi Saikat,

The differences are that MLlib offers a different set of algorithms (e.g., you won't find cooccurrence analysis or stochastic SVD there) and that its codebase consists of hand-tuned, Spark-specific implementations.

Mahout, on the other hand, allows algorithms to be implemented in an engine-agnostic, declarative way. This allows for automatic optimization of our algorithms as well as for running the same code on multiple backends (there has been interest from H2O as well as Apache Flink in integrating with our DSL).
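
As a rough illustration of what "engine-agnostic and declarative" can mean, here is a toy Python sketch. It is not Mahout's DSL or API: every name in it (Matrix, Transpose, Times, AtA, optimize, LocalBackend, RowWiseBackend) is made up for this example. The algorithm is written once as an expression, an optimizer rewrites A' * A into a fused operator, and two different backends execute the same plan.

```python
# Toy illustration only; all names here are hypothetical, not Mahout's API.
from dataclasses import dataclass


@dataclass(frozen=True)
class Matrix:
    rows: tuple  # tuple of row tuples


@dataclass(frozen=True)
class Transpose:
    child: object


@dataclass(frozen=True)
class Times:
    left: object
    right: object


@dataclass(frozen=True)
class AtA:
    """Fused operator for A' * A, produced by the optimizer."""
    child: object


def optimize(expr):
    """Rewrite Times(Transpose(X), X) into the fused AtA(X) operator."""
    if isinstance(expr, Times):
        left, right = optimize(expr.left), optimize(expr.right)
        if isinstance(left, Transpose) and left.child == right:
            return AtA(right)
        return Times(left, right)
    if isinstance(expr, Transpose):
        return Transpose(optimize(expr.child))
    return expr


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


class LocalBackend:
    """Evaluates a plan directly with in-memory nested lists."""

    def run(self, expr):
        if isinstance(expr, Matrix):
            return [list(r) for r in expr.rows]
        if isinstance(expr, Transpose):
            return transpose(self.run(expr.child))
        if isinstance(expr, Times):
            return matmul(self.run(expr.left), self.run(expr.right))
        if isinstance(expr, AtA):
            a = self.run(expr.child)
            return matmul(transpose(a), a)
        raise TypeError(f"unsupported expression: {expr!r}")


class RowWiseBackend(LocalBackend):
    """Evaluates AtA as a sum of per-row outer products, roughly how a
    row-partitioned engine could combine partial results."""

    def run(self, expr):
        if isinstance(expr, AtA):
            a = self.run(expr.child)
            n = len(a[0])
            acc = [[0] * n for _ in range(n)]
            for row in a:
                for i in range(n):
                    for j in range(n):
                        acc[i][j] += row[i] * row[j]
            return acc
        return super().run(expr)


# The "algorithm" is stated once, independent of any backend.
A = Matrix(((1, 2), (3, 4)))
plan = optimize(Times(Transpose(A), A))
local_result = LocalBackend().run(plan)
rowwise_result = RowWiseBackend().run(plan)
```

The point of the sketch is only the separation of concerns: the expression says what to compute, the optimizer picks a better physical operator, and each backend decides how to execute it.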

--sebastian

On 06/01/2014 01:41 AM, Saikat Kanjilal wrote:
Actually, the subject of my email should say Spark->MLlib versus Mahout->Spark :)

From: sxk1...@hotmail.com
To: dev@mahout.apache.org
Subject: mlib versus spark
Date: Sat, 31 May 2014 16:38:13 -0700

OK, I'll admit I'm not seeing what the obvious differences are. I'm a bit confused when I think of Mahout using Spark: since Spark already includes an embedded machine learning library (MLlib), what would be the impetus to use Mahout instead? It seems like you should be able to write or add algorithms to MLlib and use Spark. Has someone from Mahout looked at MLlib to see whether there is a strong use case for using one versus the other?
http://spark.apache.org/mllib/

