Re: ML ALS API

Nick Pentreath Tue, 08 Mar 2016 01:08:11 -0800

Hi Maciej

Yes, that *train* method is intended to be public, but it is marked as
*DeveloperApi*, which means that backward compatibility is not necessarily
guaranteed, and that method may change. Having said that, even APIs marked
as DeveloperApi do tend to be relatively stable.

As the comment mentions:

 * :: DeveloperApi ::
 * An implementation of ALS that supports *generic ID types*, specialized for Int and Long. This is
 * exposed as a developer API for users who do need other ID types. But it is not recommended
 * because it increases the shuffle size and memory requirement during training.

This *train* method is intended for the use case where user and item ids
are not the default Int (e.g. String). As you can see it returns the factor
RDDs directly, as opposed to an ALSModel instance, so overall it is a
little less user-friendly.
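
For illustration, here is a minimal sketch of calling *train* with String ids. The user and item names, the hyperparameter values, the local master and the helper name trainFactors are all made up for the example, and since this is a DeveloperApi the parameter list should be checked against the Spark version you are running.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.ml.recommendation.ALS
import org.apache.spark.ml.recommendation.ALS.Rating
import org.apache.spark.rdd.RDD

object StringIdAlsSketch {

  // Hypothetical helper: trains on a tiny hand-written data set with String ids.
  def trainFactors(sc: SparkContext)
      : (RDD[(String, Array[Float])], RDD[(String, Array[Float])]) = {
    // Rating is generic in the id type; the rating value itself is a Float.
    val ratings: RDD[Rating[String]] = sc.parallelize(Seq(
      Rating("alice", "item-a", 5.0f),
      Rating("alice", "item-b", 1.0f),
      Rating("bob", "item-a", 4.0f),
      Rating("bob", "item-c", 2.0f),
      Rating("carol", "item-b", 3.0f),
      Rating("carol", "item-c", 5.0f)
    ))

    // Returns (userFactors, itemFactors) as RDDs, not an ALSModel.
    // The values below are illustrative, not recommended settings.
    ALS.train(ratings, rank = 2, maxIter = 5, regParam = 0.1, seed = 42L)
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("als-string-ids")
    val sc = new SparkContext(conf)
    try {
      val (userFactors, itemFactors) = trainFactors(sc)
      userFactors.collect().foreach { case (id, f) =>
        println(s"user $id -> ${f.mkString(", ")}")
      }
      itemFactors.collect().foreach { case (id, f) =>
        println(s"item $id -> ${f.mkString(", ")}")
      }
    } finally {
      sc.stop()
    }
  }
}
```

Because you get raw factor RDDs back, prediction is up to you, for example by taking the dot product of a user's factor vector and an item's factor vector.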

The *Float* ratings are to save space and make ALS more efficient overall.
That will not change in 2.0+ (especially since the precision of ratings is
not very important).
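
As a rough illustration of the space/precision trade-off (plain Scala, no Spark needed; the object name and the sample rating values are made up for the example): a Float takes half the bytes of a Double, and typical rating values lose little or nothing when narrowed.

```scala
object FloatRatingSketch {

  // Absolute error introduced by storing a Double rating as a Float.
  def roundTripError(rating: Double): Double =
    math.abs(rating.toFloat.toDouble - rating)

  def main(args: Array[String]): Unit = {
    // Size of one stored rating value in bytes.
    println(s"Float bytes:  ${java.lang.Float.BYTES}")
    println(s"Double bytes: ${java.lang.Double.BYTES}")

    // Half-star ratings are exactly representable as Float;
    // other values pick up only a tiny rounding error.
    Seq(0.5, 3.5, 4.5, 3.7).foreach { r =>
      println(s"rating $r -> error ${roundTripError(r)}")
    }
  }
}
```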

Hope that helps.

On Tue, 8 Mar 2016 at 08:20 Maciej Szymkiewicz <mszymkiew...@gmail.com>
wrote:

> Can I ask for clarification regarding ml.recommendation.ALS:
>
> - is the train method
> (
> https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/recommendation/ALS.scala#L598
> )
> intended to be public?
> - Rating class
> (
> https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/recommendation/ALS.scala#L436)
> is using float instead of double like its MLlib counterpart. Is it going to
> be the default encoding in 2.0+?
>
> --
> Best,
> Maciej Szymkiewicz
>
>
>
