spark git commit: [SPARK-18849][ML][SPARKR][DOC] vignettes final check update

shivaram Wed, 14 Dec 2016 21:52:52 -0800

Repository: spark
Updated Branches:
  refs/heads/master ec0eae486 -> 7d858bc5c



[SPARK-18849][ML][SPARKR][DOC] vignettes final check update

## What changes were proposed in this pull request?

doc cleanup

## How was this patch tested?

~~vignettes is not building for me. I'm going to kick off a full clean build 
and try again and attach output here for review.~~
Output html here: https://felixcheung.github.io/sparkr-vignettes.html

Author: Felix Cheung <felixcheun...@hotmail.com>

Closes #16286 from felixcheung/rvignettespass.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/7d858bc5
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/7d858bc5
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/7d858bc5

Branch: refs/heads/master
Commit: 7d858bc5ce870a28a559f4e81dcfc54cbd128cb7
Parents: ec0eae4
Author: Felix Cheung <felixcheun...@hotmail.com>
Authored: Wed Dec 14 21:51:52 2016 -0800
Committer: Shivaram Venkataraman <shiva...@cs.berkeley.edu>
Committed: Wed Dec 14 21:51:52 2016 -0800

----------------------------------------------------------------------
 R/pkg/vignettes/sparkr-vignettes.Rmd | 38 ++++++++++---------------------
 1 file changed, 12 insertions(+), 26 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/7d858bc5/R/pkg/vignettes/sparkr-vignettes.Rmd
----------------------------------------------------------------------
diff --git a/R/pkg/vignettes/sparkr-vignettes.Rmd b/R/pkg/vignettes/sparkr-vignettes.Rmd
index 8f39922..fa2656c 100644
--- a/R/pkg/vignettes/sparkr-vignettes.Rmd
+++ b/R/pkg/vignettes/sparkr-vignettes.Rmd
@@ -447,33 +447,31 @@ head(teenagers)
 
 SparkR supports the following machine learning models and algorithms.
 
-* Generalized Linear Model (GLM)
+* Accelerated Failure Time (AFT) Survival Model
 
-* Random Forest
+* Collaborative Filtering with Alternating Least Squares (ALS)
+
+* Gaussian Mixture Model (GMM)
+
+* Generalized Linear Model (GLM)
 
 * Gradient-Boosted Trees (GBT)
 
-* Naive Bayes Model
+* Isotonic Regression Model
 
 * $k$-means Clustering
 
-* Accelerated Failure Time (AFT) Survival Model
-
-* Gaussian Mixture Model (GMM)
+* Kolmogorov-Smirnov Test
 
 * Latent Dirichlet Allocation (LDA)
 
-* Multilayer Perceptron Model
-
-* Collaborative Filtering with Alternating Least Squares (ALS)
-
-* Isotonic Regression Model
-
 * Logistic Regression Model
 
-* Kolmogorov-Smirnov Test
+* Multilayer Perceptron Model
 
-More will be added in the future.
+* Naive Bayes Model
+
+* Random Forest
 
 ### R Formula
 
@@ -601,8 +599,6 @@ head(aftPredictions)
 
 #### Gaussian Mixture Model
 
-(Added in 2.1.0)
-
 `spark.gaussianMixture` fits multivariate [Gaussian Mixture Model](https://en.wikipedia.org/wiki/Mixture_model#Multivariate_Gaussian_mixture_model) (GMM) against a `SparkDataFrame`. [Expectation-Maximization](https://en.wikipedia.org/wiki/Expectation%E2%80%93maximization_algorithm) (EM) is used to approximate the maximum likelihood estimator (MLE) of the model.
 
 We use a simulated example to demostrate the usage.
@@ -620,8 +616,6 @@ head(select(gmmFitted, "V1", "V2", "prediction"))
 
 #### Latent Dirichlet Allocation
 
-(Added in 2.1.0)
-
 `spark.lda` fits a [Latent Dirichlet Allocation](https://en.wikipedia.org/wiki/Latent_Dirichlet_allocation) model on a `SparkDataFrame`. It is often used in topic modeling in which topics are inferred from a collection of text documents. LDA can be thought of as a clustering algorithm as follows:
 
 * Topics correspond to cluster centers, and documents correspond to examples (rows) in a dataset.
@@ -676,8 +670,6 @@ perplexity
 
 #### Multilayer Perceptron
 
-(Added in 2.1.0)
-
 Multilayer perceptron classifier (MLPC) is a classifier based on the [feedforward artificial neural network](https://en.wikipedia.org/wiki/Feedforward_neural_network). MLPC consists of multiple layers of nodes. Each layer is fully connected to the next layer in the network. Nodes in the input layer represent the input data. All other nodes map inputs to outputs by a linear combination of the inputs with the node’s weights $w$ and bias $b$ and applying an activation function. This can be written in matrix form for MLPC with $K+1$ layers as follows:
 $$
 y(x)=f_K(\ldots f_2(w_2^T f_1(w_1^T x + b_1) + b_2) \ldots + b_K).
@@ -726,8 +718,6 @@ head(select(predictions, predictions$prediction))
 
 #### Collaborative Filtering
 
-(Added in 2.1.0)
-
 `spark.als` learns latent factors in [collaborative filtering](https://en.wikipedia.org/wiki/Recommender_system#Collaborative_filtering) via [alternating least squares](http://dl.acm.org/citation.cfm?id=1608614).
 
 There are multiple options that can be configured in `spark.als`, including `rank`, `reg`, `nonnegative`. For a complete list, refer to the help file.
@@ -757,8 +747,6 @@ head(predicted)
 
 #### Isotonic Regression Model
 
-(Added in 2.1.0)
-
 `spark.isoreg` fits an [Isotonic Regression](https://en.wikipedia.org/wiki/Isotonic_regression) model against a `SparkDataFrame`. It solves a weighted univariate a regression problem under a complete order constraint. Specifically, given a set of real observed responses $y_1, \ldots, y_n$, corresponding real features $x_1, \ldots, x_n$, and optionally positive weights $w_1, \ldots, w_n$, we want to find a monotone (piecewise linear) function $f$ to  minimize
 $$
 \ell(f) = \sum_{i=1}^n w_i (y_i - f(x_i))^2.
@@ -802,8 +790,6 @@ head(predict(isoregModel, newDF))
 
 #### Logistic Regression Model
 
-(Added in 2.1.0)
-
 [Logistic regression](https://en.wikipedia.org/wiki/Logistic_regression) is a widely-used model when the response is categorical. It can be seen as a special case of the [Generalized Linear Predictive Model](https://en.wikipedia.org/wiki/Generalized_linear_model).
 We provide `spark.logit` on top of `spark.glm` to support logistic regression with advanced hyper-parameters.
 It supports both binary and multiclass classification with elastic-net regularization and feature standardization, similar to `glmnet`.


