Re: ALS mllib.recommendation vs ml.recommendation

Bryan Cutler Tue, 15 Dec 2015 10:51:07 -0800

Hi Roberto,

1. How do they differ in terms of performance?
They both use alternating least squares matrix factorization. The main
difference is that ml.recommendation.ALS takes DataFrames as input, which
have built-in optimizations and should give better performance.

2.  Am I correct to assume ml.recommendation.ALS (unlike mllib) does not
support key-value RDDs? If so, what is the reason?
mllib.recommendation.ALS expects an RDD of Rating objects as input, while
ml.recommendation.ALS expects a DataFrame with user, item and rating
columns.  I'm not sure if that is what you mean by key-value RDDs.
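
If by key-value RDDs you mean pairs shaped like (user, (item, rating)),
here is a rough sketch of how the same data could be fed to either API.
The sample data, the flatten helper and the two train_* function names are
made up for illustration, and the rank and iteration values are arbitrary.
I am assuming you already have a SparkContext (sc) and a SQLContext or
SparkSession (sql_ctx); the column names are passed explicitly rather than
relying on defaults.

```python
# Hypothetical sample data: key-value pairs of the form (user, (item, rating)).
KEY_VALUE_RATINGS = [
    (0, (10, 4.0)),
    (0, (11, 1.0)),
    (1, (10, 5.0)),
    (1, (12, 2.0)),
]


def flatten(pair):
    """Turn (user, (item, rating)) into a flat (user, item, rating) tuple."""
    user, (item, rating) = pair
    return (int(user), int(item), float(rating))


def train_with_mllib(sc, pairs, rank=5, iterations=5):
    """RDD-based API: build an RDD of Rating objects and call ALS.train."""
    # Imported lazily so the helpers above work without Spark installed.
    from pyspark.mllib.recommendation import ALS, Rating

    ratings = sc.parallelize(pairs).map(flatten).map(lambda t: Rating(*t))
    return ALS.train(ratings, rank, iterations)


def train_with_ml(sql_ctx, pairs, rank=5, iterations=5):
    """DataFrame-based API: build a DataFrame and fit an ALS estimator."""
    from pyspark.ml.recommendation import ALS

    rows = [flatten(p) for p in pairs]
    df = sql_ctx.createDataFrame(rows, ["user", "item", "rating"])
    als = ALS(rank=rank, maxIter=iterations,
              userCol="user", itemCol="item", ratingCol="rating")
    return als.fit(df)
```

In both cases the key-value structure has to be flattened first; after
that the only real difference is whether the rows end up in an RDD of
Rating or in a DataFrame.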

On Mon, Dec 14, 2015 at 3:22 PM, Roberto Pagliari <roberto.pagli...@asos.com
> wrote:

> Currently, there are two implementations of ALS available:
> ml.recommendation.ALS
> <http://spark.apache.org/docs/latest/api/python/pyspark.ml.html#module-pyspark.ml.recommendation>
>  and mllib.recommendation.ALS
> <http://spark.apache.org/docs/latest/api/python/pyspark.mllib.html#module-pyspark.mllib.recommendation>
>
>
>
>    1. How do they differ in terms of performance?
>    2. Am I correct to assume ml.recommendation.ALS (unlike mllib) does
>    not support key-value RDDs? If so, what is the reason?
>
>
>
> Thank you,
>
>
