Repository: spark
Updated Branches:
  refs/heads/master d2dcd6792 -> 7620aed82


[SPARK-20011][ML][DOCS] Clarify documentation for ALS 'rank' parameter

## What changes were proposed in this pull request?

Changes to the API documentation and the collaborative filtering documentation
page to clarify the inconsistent descriptions of the ALS `rank` parameter.

 - [DOCS] was previously: "rank is the number of latent factors in the model."
 - [API] was previously:  "rank - number of features to use"

This change describes rank in both places consistently as:

 - "Number of features to use (also referred to as the number of latent factors)"
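
As a rough illustration of why the two descriptions are equivalent, the sketch below uses plain Python rather than the Spark API. The `predict` helper and the factor values are hypothetical, made up for this example: each user and each product gets a feature vector of length `rank`, and a predicted rating is the dot product of the two.

```python
# Illustrative sketch only: plain Python, not the Spark ALS implementation.
# `rank` is the length of every user and product feature vector.

def predict(user_factors, product_factors, user, product):
    """Return the dot product of the user's and the product's feature vectors."""
    u = user_factors[user]
    p = product_factors[product]
    return sum(a * b for a, b in zip(u, p))

rank = 2  # number of features (latent factors) per user and per product

# Hypothetical, hand-picked factor values; a real model would learn these.
user_factors = {0: [1.0, 0.5], 1: [0.0, 2.0]}
product_factors = {10: [2.0, 2.0], 11: [4.0, 0.0]}

# Every feature vector has exactly `rank` entries.
assert all(len(v) == rank for v in user_factors.values())
assert all(len(v) == rank for v in product_factors.values())

rating = predict(user_factors, product_factors, 0, 10)
print(rating)
```

A larger `rank` gives each vector more entries, so the model can capture more structure at the cost of more computation and a higher risk of overfitting.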

Author: Chris Snow <chris.snowuk.ibm.com>

Author: christopher snow <chsnow...@gmail.com>

Closes #17345 from snowch/SPARK-20011.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/7620aed8
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/7620aed8
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/7620aed8

Branch: refs/heads/master
Commit: 7620aed828d8baefc425b54684a83c81f1507b02
Parents: d2dcd67
Author: christopher snow <chsnow...@gmail.com>
Authored: Tue Mar 21 13:23:59 2017 +0000
Committer: Sean Owen <so...@cloudera.com>
Committed: Tue Mar 21 13:23:59 2017 +0000

----------------------------------------------------------------------
 docs/mllib-collaborative-filtering.md               |  2 +-
 .../org/apache/spark/mllib/recommendation/ALS.scala | 16 ++++++++--------
 python/pyspark/mllib/recommendation.py              |  4 ++--
 3 files changed, 11 insertions(+), 11 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/7620aed8/docs/mllib-collaborative-filtering.md
----------------------------------------------------------------------
diff --git a/docs/mllib-collaborative-filtering.md b/docs/mllib-collaborative-filtering.md
index 0f891a0..d1bb6d6 100644
--- a/docs/mllib-collaborative-filtering.md
+++ b/docs/mllib-collaborative-filtering.md
@@ -20,7 +20,7 @@ algorithm to learn these latent factors. The implementation in `spark.mllib` has
 following parameters:
 
 * *numBlocks* is the number of blocks used to parallelize computation (set to -1 to auto-configure).
-* *rank* is the number of latent factors in the model.
+* *rank* is the number of features to use (also referred to as the number of latent factors).
 * *iterations* is the number of iterations of ALS to run. ALS typically converges to a reasonable
   solution in 20 iterations or less.
 * *lambda* specifies the regularization parameter in ALS.

http://git-wip-us.apache.org/repos/asf/spark/blob/7620aed8/mllib/src/main/scala/org/apache/spark/mllib/recommendation/ALS.scala
----------------------------------------------------------------------
diff --git a/mllib/src/main/scala/org/apache/spark/mllib/recommendation/ALS.scala b/mllib/src/main/scala/org/apache/spark/mllib/recommendation/ALS.scala
index 76b1bc1..1428822 100644
--- a/mllib/src/main/scala/org/apache/spark/mllib/recommendation/ALS.scala
+++ b/mllib/src/main/scala/org/apache/spark/mllib/recommendation/ALS.scala
@@ -301,7 +301,7 @@ object ALS {
    * level of parallelism.
    *
    * @param ratings    RDD of [[Rating]] objects with userID, productID, and rating
-   * @param rank       number of features to use
+   * @param rank       number of features to use (also referred to as the number of latent factors)
    * @param iterations number of iterations of ALS
    * @param lambda     regularization parameter
    * @param blocks     level of parallelism to split computation into
@@ -326,7 +326,7 @@ object ALS {
    * level of parallelism.
    *
    * @param ratings    RDD of [[Rating]] objects with userID, productID, and rating
-   * @param rank       number of features to use
+   * @param rank       number of features to use (also referred to as the number of latent factors)
    * @param iterations number of iterations of ALS
    * @param lambda     regularization parameter
    * @param blocks     level of parallelism to split computation into
@@ -349,7 +349,7 @@ object ALS {
    * parallelism automatically based on the number of partitions in `ratings`.
    *
    * @param ratings    RDD of [[Rating]] objects with userID, productID, and rating
-   * @param rank       number of features to use
+   * @param rank       number of features to use (also referred to as the number of latent factors)
    * @param iterations number of iterations of ALS
    * @param lambda     regularization parameter
    */
@@ -366,7 +366,7 @@ object ALS {
    * parallelism automatically based on the number of partitions in `ratings`.
    *
    * @param ratings    RDD of [[Rating]] objects with userID, productID, and rating
-   * @param rank       number of features to use
+   * @param rank       number of features to use (also referred to as the number of latent factors)
    * @param iterations number of iterations of ALS
    */
   @Since("0.8.0")
@@ -383,7 +383,7 @@ object ALS {
    * a level of parallelism given by `blocks`.
    *
    * @param ratings    RDD of (userID, productID, rating) pairs
-   * @param rank       number of features to use
+   * @param rank       number of features to use (also referred to as the number of latent factors)
    * @param iterations number of iterations of ALS
    * @param lambda     regularization parameter
    * @param blocks     level of parallelism to split computation into
@@ -410,7 +410,7 @@ object ALS {
    * iteratively with a configurable level of parallelism.
    *
    * @param ratings    RDD of [[Rating]] objects with userID, productID, and rating
-   * @param rank       number of features to use
+   * @param rank       number of features to use (also referred to as the number of latent factors)
    * @param iterations number of iterations of ALS
    * @param lambda     regularization parameter
    * @param blocks     level of parallelism to split computation into
@@ -436,7 +436,7 @@ object ALS {
    * partitions in `ratings`.
    *
    * @param ratings    RDD of [[Rating]] objects with userID, productID, and rating
-   * @param rank       number of features to use
+   * @param rank       number of features to use (also referred to as the number of latent factors)
    * @param iterations number of iterations of ALS
    * @param lambda     regularization parameter
    * @param alpha      confidence parameter
@@ -455,7 +455,7 @@ object ALS {
    * partitions in `ratings`.
    *
    * @param ratings    RDD of [[Rating]] objects with userID, productID, and rating
-   * @param rank       number of features to use
+   * @param rank       number of features to use (also referred to as the number of latent factors)
    * @param iterations number of iterations of ALS
    */
   @Since("0.8.1")

http://git-wip-us.apache.org/repos/asf/spark/blob/7620aed8/python/pyspark/mllib/recommendation.py
----------------------------------------------------------------------
diff --git a/python/pyspark/mllib/recommendation.py b/python/pyspark/mllib/recommendation.py
index 732300e..8118288 100644
--- a/python/pyspark/mllib/recommendation.py
+++ b/python/pyspark/mllib/recommendation.py
@@ -249,7 +249,7 @@ class ALS(object):
         :param ratings:
           RDD of `Rating` or (userID, productID, rating) tuple.
         :param rank:
-          Rank of the feature matrices computed (number of features).
+          Number of features to use (also referred to as the number of latent factors).
         :param iterations:
           Number of iterations of ALS.
           (default: 5)
@@ -287,7 +287,7 @@ class ALS(object):
         :param ratings:
           RDD of `Rating` or (userID, productID, rating) tuple.
         :param rank:
-          Rank of the feature matrices computed (number of features).
+          Number of features to use (also referred to as the number of latent factors).
         :param iterations:
           Number of iterations of ALS.
           (default: 5)


